Power management - Task 023
Fit to programme
This task has been identified by the working groups as part of the agenda behind WP 1.1.
The task number is 023.
Summary
As HPC systems become more power-hungry, and as grid provision becomes more variable, it is increasingly important to be able to moderate HPC power demand in response to supply, electricity cost and grid carbon intensity.
This project invites proposals to investigate simple methods for dynamic power capping, to consider their performance impacts, and to study the degree of control achievable from a system perspective. Options to dynamically control power based on inputs from external sources (e.g. carbon intensity) should be developed.
The findings should be written up and presented at conferences and workshops.
Approach + methodology
Multiple techniques to control node power consumption within large-scale computing systems should be documented and explored, including OS-based scheduling, batch system management and BIOS control. The change in power consumption when running various codes, as well as the total energy to complete a workflow, should be considered.
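As an illustration of the "total energy to complete a workflow" measure, the sketch below integrates sampled node power over time using the trapezoidal rule. It is a minimal example only: the function name energy_joules and the sample values are hypothetical, and a real study would take its samples from whatever power telemetry the target system provides.

```python
from typing import Sequence


def energy_joules(times_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power (W) over time (s) with the trapezoidal rule.

    Returns the energy in joules. Illustrative helper, not part of any library.
    """
    if len(times_s) != len(power_w):
        raise ValueError("times and power samples must have the same length")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive samples
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Hypothetical samples: the same workflow run uncapped and with a power cap.
uncapped = energy_joules([0, 10, 20], [400, 400, 400])          # 20 s at 400 W
capped = energy_joules([0, 10, 20, 30], [250, 250, 250, 250])   # 30 s at 250 W
print(f"uncapped: {uncapped:.0f} J, capped: {capped:.0f} J")
```

In this made-up case the capped run takes longer but uses less energy overall, which is the kind of trade-off between time to completion and energy to completion that the study should quantify for real codes.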
Additionally, a study of whole system consumption and embodied carbon should be included in the proposals to ensure that a holistic approach is taken.
A number of community test codes should be explored, investigating how dynamic power control affects overall performance and “time-to-science”. Impacts on performance should be categorised, and recipes for whole-system approaches should be developed allowing adjustment of system power consumption whilst in production.
Inputs from external sources should be captured, and algorithms developed to control power in response to these.
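One simple form such an algorithm could take is a linear mapping from grid carbon intensity to a per-node power cap, clamped between a minimum and a maximum. The sketch below is an assumption-laden illustration: the function name, the thresholds and the cap values are all hypothetical, and it deliberately leaves out both how the intensity figure is obtained and how the resulting cap is applied to the hardware.

```python
def power_cap_watts(
    intensity_g_per_kwh: float,
    low_intensity: float = 100.0,   # hypothetical: at or below this, no capping
    high_intensity: float = 300.0,  # hypothetical: at or above this, full capping
    min_cap_w: float = 200.0,       # hypothetical lowest cap per node
    max_cap_w: float = 500.0,       # hypothetical uncapped node power
) -> float:
    """Map grid carbon intensity (gCO2/kWh) to a node power cap in watts.

    The cap falls linearly from max_cap_w to min_cap_w as intensity rises
    from low_intensity to high_intensity, and is clamped outside that range.
    """
    if high_intensity <= low_intensity:
        raise ValueError("high_intensity must exceed low_intensity")
    if min_cap_w > max_cap_w:
        raise ValueError("min_cap_w must not exceed max_cap_w")
    if intensity_g_per_kwh <= low_intensity:
        return max_cap_w
    if intensity_g_per_kwh >= high_intensity:
        return min_cap_w
    fraction = (intensity_g_per_kwh - low_intensity) / (high_intensity - low_intensity)
    return max_cap_w - fraction * (max_cap_w - min_cap_w)


# Hypothetical readings across a day
for reading in (50.0, 200.0, 400.0):
    print(reading, power_cap_watts(reading))
```

The same shape of function would work for grid pricing as the input. A production controller would probably also need smoothing or hysteresis so that caps do not change on every small fluctuation of the input signal.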
The findings will be written up and presented on a website and at conferences.
This project has several achievable goals which should not present any risk to delivery. There are also stretch goals, including investigation of additional codes, interaction with real HPC systems, and further investigation of different power control methods. These should be delivered if achievable, and the risks of attempting them should be considered.
This project facilitates several learning opportunities for the RTPs involved in the final projects, including investigation into power control techniques, algorithms, and impact of external factors.
Outputs
Documentation for the RTP community on the SHAREing website and presentation at workshops and events should be provided.
A working prototype system capable of reducing power demand in response to carbon intensity or grid pricing should be delivered if feasible, and it is hoped that there will be scope to place such a system into production, given necessary approvals.