Gridengine Configuration Review

Project ID: 28
Current stage: 
Manager: 
Unit: 
What: 

Description: Review the gridengine configuration used on each of the gridengine clusters and develop a new policy for scheduling jobs.

Gridengine employs a scheduler to manage the movement of jobs from queues to the cluster nodes. The scheduler runs periodically: it assigns each queued job a priority, works out which jobs can be started on the cluster, and then goes back to sleep. The cycle repeats at an interval defined by the cluster administrator, 15 seconds by default.
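
For reference, on SGE 6.x this interval is the schedule_interval parameter in the scheduler configuration; a minimal sketch of inspecting and changing it with qconf (parameter name as in sge_sched_conf(5), the one-minute value purely illustrative):

  # Show the current scheduler configuration; schedule_interval
  # defaults to 0:0:15, i.e. a scheduling run every 15 seconds.
  qconf -ssconf

  # Open the scheduler configuration in $EDITOR; slowing the cycle
  # to one minute would mean setting:
  #   schedule_interval    0:1:0
  qconf -msconf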

The scheduler uses a combination of three policies based on who the submitter is, the job's resource requirements and past usage. Currently the scheduler is using the default settings, which try to maximise the number of jobs through the cluster without giving preference to the submitter, resource requirements or past usage.
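
In SGE 6.x terms these are the functional (submitter), urgency (resource requirements) and share-tree (past usage) policies, folded into a single pending-job priority as a weighted sum. A sketch, with the default weights as documented in sge_priority(5) (treat the exact numbers as illustrative):

  # Pending jobs are ordered by the normalised weighted sum:
  #
  #   prio = weight_priority * npprio   (POSIX priority from qsub -p)
  #        + weight_urgency  * nurg     (resource-requirement urgency)
  #        + weight_ticket   * ntckts   (functional/share-tree/override tickets)
  #
  # The weights live in the scheduler configuration:
  qconf -ssconf | grep -E '^weight_(priority|urgency|ticket) '
  #   weight_priority    1.000000
  #   weight_urgency     0.100000
  #   weight_ticket      0.010000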

The built-in scheduler component can also be replaced or enhanced by a third-party scheduler such as Maui [http://www.clusterresources.com/pages/products/maui-cluster-scheduler.php].

In addition to the scheduler, there are a number of other ways of controlling how and where jobs are scheduled, including defining multiple preemptive queues, defining time-based queues and tracking resource usage. These would also be considered by this project.
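
By way of illustration (attribute names from queue_conf(5) and sge_calendar_conf(5); the queue names, thresholds and times are made up), preemptive and time-based queues look roughly like this:

  # Preemption via subordinate queues: in the configuration of the
  # high-priority queue (qconf -mq high.q), suspend low.q on a host
  # once 4 of high.q's slots there are in use:
  #   subordinate_list    low.q=4

  # Time-based queues via calendars: define a calendar that switches
  # a queue off during working hours...
  qconf -acal night
  #   week    mon-fri=6-20=off
  # ...then reference it from the queue configuration:
  #   calendar    night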

Deliverables: An upgraded gridengine component capable of configuring the scheduler as required and supporting the full range of group/project functionality. A policy document clearly stating the criteria under which jobs are scheduled and how priorities are calculated.

Why: 

Customer: The gridengine user base: research staff and MSc/PhD students.

Case statement: Scheduling on the clusters currently appears as something of a black box to the users. Often it is not clear to them why their jobs are queued whilst others, perhaps submitted later, are started. Whilst anyone can work out the current configuration by running qstat and consulting the gridengine manuals, it would be better to have a School policy, and the reasoning behind it, written down somewhere in black and white.

At the very least we need to document the current configuration and provide users with some examples of how to monitor the scheduling process.
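
Such examples would presumably be built from the stock monitoring commands (note that qstat -j only reports its "scheduling info:" lines if schedd_job_info is enabled in the scheduler configuration):

  qstat -u '*'       # all pending and running jobs, for every user
  qstat -g c         # per-queue summary of used and available slots
  qstat -ext         # ticket breakdown for each job
  qstat -j <job_id>  # full details for one job, including the
                     # "scheduling info:" lines explaining why a
                     # pending job has not yet been started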

The current configuration is open to inadvertent or deliberate abuse. Users submitting jobs via a misconfigured script on the head node can quite easily flood the cluster queue with hundreds, thousands or tens of thousands of jobs, effectively preventing other users from using the cluster. Jobs which require the whole cluster, or large parts of it, can be significantly delayed.
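
One obvious mitigation, sketched here with an arbitrary limit of 500, is the per-user job cap in the global cluster configuration (sge_conf(5)):

  # qconf -mconf global opens the global configuration; setting e.g.
  #   max_u_jobs    500    # default 0 = unlimited
  # caps how many jobs a single user may have in the system at once,
  # while the related max_jobs caps the total across all users.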

Finally, it is also possible that support staff, research groups or people on individual projects using the cluster may wish to track accounting information or buy resources on the cluster. For example, a research group may buy additional nodes or Matlab licences for their priority or exclusive use. Currently the component doesn't support configuring groups or projects, and it is not possible to allocate resources or track usage on anything other than a per-user basis.
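
For comparison, stock gridengine does provide project and access-list objects with per-project accounting, which gives an idea of what the upgraded component would need to manage (the project and list names here are hypothetical):

  qconf -aprj                    # add a project, e.g. one named "vision"
  qconf -au alice visionusers    # add user alice to an access list
  qsub -P vision job.sh          # submit a job under that project
  qacct -P vision                # summarise past usage for the project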

When: 

Status:

Timescales: Given that the implementation is partly driven by the review process, it is not possible to predict a full timescale.

Review capabilities and current usage by the end of January.

Meet with the current user base and review options by the end of February.

Implem

Priority: There are issues with the current configuration that are causing problems for a small number of users; as more people use the clusters this is likely to get worse.

The component will have to be able to manage user, group and project type objects in order to support any of the more sophisticated scheduling policies. Adding this functionality should go ahead ASAP; it could be done alongside the changes needed as part of the cluster upgrade to FC5.

Time: This is not fully quantifiable at this stage as the later work depends on the outcome of the review. Some of the work could be done in parallel with the review and with the cluster upgrades to FC5; this would take 1 FTE week.

The review process is likely to take 3-4 FTE weeks of effort.

How: 

Proposal: Review current cluster usage and user requirements and rework the gridengine component to match the outcome. Clearly and explicitly document how the scheduling process works for each cluster and the reasoning behind the policy chosen.

Resources: Neil will conduct a peer review of the policy documents generated and of the implementation.

Plan:

1. Review the capabilities of the existing scheduler and identify any other possible replacements.

2. Review current usage patterns for each cluster and produce a list of "typical" jobs.

3. Poll the users on requirements.

4. Produce paper on possible configurations.

5. Conduct a meeting of stakeholders to thrash out an acceptable scheduling policy (or policies).

6. Rewrite the gridengine component to handle the new scheduling configuration.

7. Test the new component on the test cluster.

8. Produce user and support documentation.

9. Deploy the new system on an initial cluster for beta testing; run for ~1 month.

10. Deploy on the remaining clusters over a period of ~1 week.

Other: 

Dependencies: This project has no dependencies.

Risks: None foreseen.

URL: http://www.dice.inf.ed.ac.uk/units/research_and_teaching/projects/griden...

Milestones

Proposed date  Achieved date  Name                   Description
2007-03-01     2007-02-20     Capabilities review    Review the capabilities of the existing scheduler and identify any other possible replacements
2007-05-19     2007-05-04     Review usage patterns  Review current usage patterns for each cluster and produce a list of "typical" jobs
2007-08-04                    User requirements      Poll the users on requirements
2007-07-24     2007-05-24     Discussion paper       Produce paper on possible configurations
2007-08-20                    Policy meeting         Conduct a meeting of stakeholders to thrash out an acceptable scheduling policy (or policies)
2007-08-29                    Implementation         Rewrite gridengine component to handle the new scheduling configuration
2007-09-06                    Alpha testing          Test new component on test cluster
2007-09-10                    Docs                   Produce user and support documentation
2007-09-25                    Beta test              Deploy new system on initial cluster for beta testing
2007-10-06                    Deployment             Deploy on remaining clusters