Develop power management solution for DICE desktops

Project ID: 
94
Current stage: 
Manager: 
Unit: 
What: 

Description: The project will implement a component to manage sleep on DICE machines. The component shall cause the machine to enter and exit an ACPI sleep state at appropriate times.

Deliverables: An LCFG sleep component. A mechanism for other components to interact with the sleep component where necessary. Documentation for system administrators. Documentation for users.

Why: 

Customer: The School of Informatics, although other parts of the University using LCFG to manage Linux may also derive benefit.

Case statement: Project 34, "Investigate power management options for DICE desktops", concluded that significant financial savings should be possible by implementing an automated LCFG management system for ACPI sleep on DICE desktops. Delivering such a system is one of the Managed Platform Unit's top priorities. A significant reduction in energy use will mean direct financial savings for the School of Informatics, and reducing energy consumption is one of the University's key goals.

When: 

Status:

Timescales:

Priority: The MPU has assessed this as being one of the highest priority projects on its list.

Time: Several months for development and several more for testing.

How: 

Proposal: The project will implement a component to manage sleep on DICE machines. The component shall cause the machine to enter and exit an ACPI sleep state at appropriate times.

It shall be possible to specify that the machine will wake during the night to perform nightly system admin functions such as running the boot component.

It shall be possible to specify that a machine in a Condor pool will wake periodically to check for Condor jobs to run.
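
As an illustration of the two wake-up requirements above, the following minimal sketch picks the next wake-up time as the earliest of a nightly run and a periodic Condor check. The function names and parameters (`next_nightly_wake`, `next_wake`, `nightly`, `condor_interval`) are hypothetical and are not taken from the component itself.

```python
from datetime import datetime, time, timedelta


def next_nightly_wake(now, nightly):
    """Return the next occurrence of the nightly wake time strictly after `now`."""
    candidate = datetime.combine(now.date(), nightly)
    if candidate <= now:
        # Tonight's slot has already passed, so use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


def next_wake(now, nightly=None, condor_interval=None):
    """Return the earliest configured wake-up time, or None if none is configured.

    nightly         -- hypothetical time of day for nightly admin tasks
    condor_interval -- hypothetical gap between periodic Condor checks
    """
    candidates = []
    if nightly is not None:
        candidates.append(next_nightly_wake(now, nightly))
    if condor_interval is not None:
        candidates.append(now + condor_interval)
    return min(candidates) if candidates else None


if __name__ == "__main__":
    now = datetime(2009, 3, 3, 22, 0)
    print(next_wake(now, nightly=time(2, 30), condor_interval=timedelta(hours=6)))
```

On Linux, a time computed this way could then be handed to the hardware clock's wake alarm before the machine sleeps. Whether the component does so in this manner is an assumption.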

The sleep component shall be as generic as possible. DICE's particular needs, such as running the boot component at night or making periodic Condor checks, shall be added through a mechanism that keeps them separable from the basic sleep component, for example a hook mechanism.
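
One way to picture such a hook mechanism is a small registry in which site-specific actions are attached to named events, leaving the core sleep logic unaware of them. This is only a sketch under assumed names: the class `SleepHooks` and the event names `pre_sleep` and `post_wake` are hypothetical.

```python
class SleepHooks:
    """Registry of site-specific actions, kept separate from the core sleep logic."""

    # Hypothetical event names; a real component may define different ones.
    EVENTS = ("pre_sleep", "post_wake")

    def __init__(self):
        self._hooks = {event: [] for event in self.EVENTS}

    def register(self, event, func):
        """Attach `func` (a callable taking no arguments) to `event`."""
        if event not in self._hooks:
            raise ValueError("unknown event: %s" % event)
        self._hooks[event].append(func)

    def run(self, event):
        """Call every hook registered for `event`, in registration order."""
        return [func() for func in self._hooks[event]]


if __name__ == "__main__":
    hooks = SleepHooks()
    # A DICE-specific action, added without touching the generic code.
    hooks.register("post_wake", lambda: "check Condor queue")
    print(hooks.run("post_wake"))
```

With this shape, a site that does not use Condor or the boot component simply registers no hooks for them.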

It shall be possible to use resources to influence the sleep component's decisions on when or whether it is appropriate to initiate or terminate sleep.

The project will aim to ensure that cron jobs are run at appropriate times.

Systems which currently depend on regular cron jobs to maintain a proper configuration will still maintain a proper configuration on a machine using the sleep component.

A machine's user shall be able to initiate sleep manually using existing facilities such as the Gnome Power Manager menu.

A machine's user shall be able to terminate sleep manually by means of the usual method, that is by pressing the machine's power button.

The component will initially be targeted at current versions of Linux. However, some care will be taken to avoid introducing barriers to adoption on other operating systems.

It will be necessary to test a range of available hardware to find out where the sleep component will work.

Resources: Initially one recent-model DICE desktop machine will be needed. In later stages it will be necessary to have access to a variety of models of DICE desktop in order to test hardware compatibility. Iain Rae has agreed to act as a consultant to help integrate the sleep and condor components.

Plan:

  • DESIGN
    • list necessary resources
    • think of structure
    • communicate with Informatics COs
    • communicate with LCFG community
    • modify this plan as appropriate
  • IMPLEMENTATION (with alpha testing throughout)
    • component can sleep the machine
    • component can analyse whether or not a machine is idle - initial thoughts:
      • no cron job is currently running
      • the boot component is not doing its nightly run
      • there are no condor jobs running or imminent
      • all users have been idle for at least a given number of minutes
      • the load average is below a certain level, indicating idleness
    • component can decide when to wake the machine up
    • component can set a wake-up time then sleep the machine
    • component wakes the machine at an appropriate time to run the boot component
    • component wakes the machine at an appropriate time to run the boot component, then runs the boot component (and runs it at no other time).
    • component wakes the machine intermittently and the machine checks with the condor queue for jobs to be run
    • periodic cron jobs are run correctly: first, where possible they run while the machine is awake rather than being scheduled during sleep; second, they run no more often than they should (for instance, a daily cron job runs at most once per day).
    • the sleep component sets the wake-up time as soon as it can so that a user can manually sleep the machine without disrupting anything.
    • Wake-up hooks are in place where needed - for instance for amd on fc6.
  • HARDWARE TESTING
    • Test software on a variety of models, look for differences and problems
  • BETA TESTING
    • Gather some willing beta-testers
    • Document the state of the system
    • Document questions to be asked
    • Get some kind of beta-testers community going
    • Distribute the software
    • Gather reports, fix bugs, refine software, test again.
  • DEPLOYMENT
    • Establish final DICE settings, with headers
    • Document DICE settings and headers
    • Establish suitable LCFG defaults
    • Document the component and its default settings
    • Contribute to the Support FAQ
    • Add a DICE Sleep section to relevant web sites e.g. www.inf/systems
    • Roll out system cautiously
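
The idleness criteria listed under IMPLEMENTATION could be combined as a set of independent veto tests, any one of which prevents sleep. The sketch below assumes this structure. The names `load_is_low` and `may_sleep` are hypothetical, and only the load-average test is implemented, using the standard `os.getloadavg()` call.

```python
import os


def load_is_low(threshold, loadavg=None):
    """True if the one-minute load average is below `threshold`.

    `loadavg` may be supplied for testing; otherwise the live value is read.
    """
    if loadavg is None:
        loadavg = os.getloadavg()
    return loadavg[0] < threshold


def may_sleep(checks):
    """Run each (name, test) pair; a test returns True when sleep is acceptable.

    Returns (ok, vetoes), where `vetoes` lists the names of the failing tests.
    """
    vetoes = [name for name, test in checks if not test()]
    return (not vetoes, vetoes)


if __name__ == "__main__":
    checks = [
        ("load", lambda: load_is_low(0.5)),
        # Placeholder tests: cron, the boot component, Condor and user
        # idleness would each need their own real implementation.
        ("users_idle", lambda: True),
    ]
    print(may_sleep(checks))
```

Returning the list of vetoes, rather than a bare yes or no, makes it easier to report why a machine stayed awake.
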
Other: 

Dependencies: None.

Risks: If it is not integrated correctly with the rest of the DICE infrastructure, an automated sleep system could prevent its host desktop machine from playing its part in providing DICE services. For instance, LDAP information will have to be kept up to date even when a machine sleeps for most of the day and night.

Milestones

Proposed date | Achieved date | Name | Description
2008-10-02 | 2008-10-03 | cron | The component can analyse the timing of cron jobs and wake a machine appropriately. Expected work time: 4 days in 2 weeks.
2008-09-22 | 2008-10-03 | sleep | The component can send a machine to sleep. Expected work time: 2 days in 1 week.
2008-09-23 | 2008-10-03 | wake | The component sets the wake-up time for the machine. Expected work time: 4 days in 2 weeks.
2009-03-03 | 2008-12-12 | condor | Condor and sleep component cooperate. Expected time 2 days in 1 week.
2009-03-03 | 2008-12-12 | idle | The component analyses the idleness of a machine and decides whether or not to sleep the machine. Expected work time: 12 days in 6 weeks.
2009-03-03 | 2008-12-12 | lock | Implement a locking mechanism for LCFG tasks which shouldn't be interrupted by sleep. Expected work time 4 days in 2 weeks.
2009-06-02 | 2009-06-06 | hardware | Test on various varieties of hardware used for desktop machines. Expected work time 12 days in 6 weeks.
2009-08-31 | 2009-07-01 | beta | Organise beta testing involving a variety of people. Expected time 20 days in 2 months.
2009-09-11 | 2009-09-11 | deploy | Deploy the finished solution on the HPs in the student labs.
2009-03-30 | 2009-05-06 | tidy | Rationalise the existing DICE cron jobs, eliminating unnecessary ones and reducing necessary ones as far as possible. Expected time 8 days in 4 weeks.
2009-03-04 | 2009-03-06 | min awake | (Formerly part of "idle" milestone.) Test for minimum awake time before sleep.
2009-03-04 | 2009-03-06 | min sleep | (Formerly part of "idle" milestone.) Test for minimum sleep time before sleep.
2009-03-12 | 2009-03-13 | ignore cron | (Formerly part of "idle" milestone.) Facility to optionally ignore specified LCFG and vendor cron jobs when performing cron jobs sleep assessment.
2009-03-06 | 2009-03-13 | extra tests | (Formerly part of "idle" milestone.) Provide facility for added sysman-provided sleep veto test scripts.
2009-09-02 | 2009-05-01 | condor test | (Formerly part of "condor" milestone.) Test for satisfactory interaction of condor and sleep components.
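
The "lock" milestone calls for a way to stop sleep from interrupting certain LCFG tasks. One possible shape for such a mechanism, shown here purely as an assumption, is an advisory file lock: tasks hold a shared lock while they run, and the sleep component refuses to sleep if it cannot briefly take an exclusive lock. The function names and the lock-file path are hypothetical; `fcntl.flock` is the standard Unix call.

```python
import fcntl
from contextlib import contextmanager


@contextmanager
def no_sleep_lock(path):
    """Hold a shared lock on `path` while an uninterruptible task runs."""
    with open(path, "a") as handle:
        fcntl.flock(handle, fcntl.LOCK_SH)
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def sleep_is_blocked(path):
    """True if some task currently holds the lock on `path`."""
    with open(path, "a") as handle:
        try:
            # Non-blocking attempt: fails at once if any shared lock is held.
            fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return True
        fcntl.flock(handle, fcntl.LOCK_UN)
        return False


if __name__ == "__main__":
    lock_path = "/tmp/example-sleep.lock"  # hypothetical location
    with no_sleep_lock(lock_path):
        print(sleep_is_blocked(lock_path))
    print(sleep_is_blocked(lock_path))
```

Because the lock is shared, several tasks can hold it at once, and sleep stays blocked until the last one finishes.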