LCFG SL5 port (inf level)

Project ID: 74
Current stage: 
Manager: 
Unit: 
What: 

Description:

Develop Inf level support for Scientific Linux 5 (SL5) targeting the desktop and server environments on the i386 and x86_64 CPU architectures. This work will form the basis of a College of Science and Engineering SL5 platform for both desktops and servers. ECDF runs Scientific Linux. It is highly likely that it will form the basis for a DICE server platform.

Deliverables:

This project will deliver an Inf level managed version of SL5 for the i386 and x86_64 CPU architectures. It will use DICE authentication/authorization and name services in client mode (no local services). AFS home directories will be supported.

The project will also create an SL5 platform at the LCFG level that groups outside Informatics can use.

Why: 

Customer:

The primary customers are the Schools of Physics and Informatics. Other potential customers are IS (central labs) and the Schools of Geosciences and Engineering.

Case statement: The school policy is that the DICE desktop platform is upgraded to the latest underlying OS version on a yearly basis. Although there is no equivalent policy for servers, past practice has been to upgrade servers on the same basis, largely because of the short life of the Fedora platform and consequent concerns about the availability of security patches.

However, this annual server upgrade is proving very costly, both in the effort of porting services and upgrading servers and in the disruption to services. Moving to a more "stable", long-life platform such as Scientific Linux for servers should allow us to move to a more manageable biennial upgrade cycle.

Other schools in CSE have expressed a strong interest in having a standard LCFG-managed SL5 platform for servers, and in some cases desktops. Eddie, the ECDF compute cluster, is based on Scientific Linux (currently V4, but shortly V5); there is an obvious usability win in having a compatible platform for our researchers.

When: 

Status:

24/07/2007
New proposal

Timescales: These timescales have not yet been discussed amongst the project team.

  • Managed Inf level desktop available by Friday 17th August.
  • Installable Inf level desktop available by Friday 7th September.

Priority:

Time:

How: 

Proposal:

Post-project report:

The SL5 LCFG Port: some post-project reflections

Cooperation

The project was shared between Chris Cooke (Informatics) and Panos Kritikakos (EPCC). We cooperated pretty well.

We had temporary use of some EPCC office space for the project, but only one desk was available to us (the other desk in the room was still used by someone else), so we didn't use it a lot; we did most of the project sitting at our own desks in our own offices.

The two facilities which did greatly help cooperation and communication were email and the project diary. Both helped us immensely. Email you're no doubt familiar with, but I think it's worth saying some more about the project diary.

The Project Diary

Not long after we started the project I realised that my memory just wasn't going to be up to keeping track of the detail of what had been done and what hadn't been done. More importantly, I also wasn't going to be able to remember all the problems we had encountered, which ones had been solved and how, and which ones still remained. So we started a project diary page on the wiki. It's still there at http://wiki.lcfg.org/bin/view/LCFG/SL5Diary.

The key to understanding the diary was that as it developed it became not just a place for noting what we'd done each day - that would have been useful but dull - but also a place for noting down all the problems we encountered, especially the ones we hadn't yet solved. The practice was: encounter a significant problem; describe its symptoms on the wiki; then investigate further and/or seek help. If we didn't know exactly what the problem was, or if we didn't understand something, we said so in the diary.

It had several benefits:

  • trying to give a clear description of the problem forced us to confront the facts of the problem and write down what could be proved or demonstrated, rather than just a lot of suppositions which could be inaccurate.
  • this could itself sometimes lead us to a solution.
  • if not, we didn't have to revisit the problem the next day/week to remind ourselves of the state of play; we could start where we left off.
  • we encouraged a few colleagues to read the diary now and then. Often they would see us struggling with something that they knew the answer to, and they'd get in touch and give us an explanation or the solution or an idea which led to it.
  • when facing an apparent brick wall, the act of describing the apparently insoluble problem at least made you feel as if you were doing something useful; the sudden burst of optimism this engendered would also sometimes help the search for a solution.
  • we ended up with a very detailed account of all of our work; so that for example an area could be revisited weeks or months later to find out what had been done, what hadn't been done and what our thinking had been at the time.
  • it gave a great sense of progress. Even documenting one's total incomprehension of a situation felt like solid progress, and often was.
  • it helped keep colleagues apprised of our thinking, so they could spot where we were going off track and give us fresh ideas. In other words, even problems that we weren't aware that we had were spotted and corrected.
  • it was fun!

Granting Permissions

Since he wasn't an Informatics CO, Panos had to be granted permission to access various restricted resources, for instance the LCFG subversion and CVS repositories. Finding the right permissions to give Panos, so that he could get the project work done without being given unnecessary permissions, was a process of trial and error. The following is what we found in the end to be a workable compromise. The only regular task which this set of permissions did not allow Panos to perform was to certify a new installation using a Kerberos admin key; near the end of the project when we were testing automated installations I regularly had to trot along to his room to type in a password.

Most of the permissions needed to use our repositories and LCFG infrastructure are controlled via capabilities in our authorisation system, which automatically gives Informatics users most of the permissions they need. I got fed up with giving Panos roles one by one as we found he needed them, each containing only one or two capabilities, only to find later that he needed yet another; so I created a catch-all role called "portinglcfg" to hold them all. All I had to do from then on was add more capabilities to that role as necessary. At the time of writing it granted the following capabilities:

  • "linuxman" for rpmsubmit
  • "group/cvs_dice" and "group/cvs_dice_locks" for the CVS repository
  • "@lcfgsvn" for the LCFG subversion repository
  • "rfe/lcfg/write", "rfe/lcfg/create" and "rfe/lcfg/edit" for editing lcfg files.
  • "om/all" and "om/test" for running components
  • "lcfg/inventory/write" for luck
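
The catch-all role can be pictured as a named set of capabilities. The following is only an illustrative Python sketch: the ROLES mapping and the has_capability helper are hypothetical and are not the API of the Informatics authorisation system. Only the role name and the capability strings are taken from the list above; the "kadmin/certify" string is an invented name used to show a capability that the role does not grant.

```python
# Illustrative model only: a role is a named set of capability strings.
# ROLES and has_capability are hypothetical names, not a real API.
ROLES = {
    "portinglcfg": {
        "linuxman",                 # rpmsubmit
        "group/cvs_dice",           # CVS repository
        "group/cvs_dice_locks",
        "@lcfgsvn",                 # LCFG subversion repository
        "rfe/lcfg/write",           # editing LCFG files
        "rfe/lcfg/create",
        "rfe/lcfg/edit",
        "om/all",                   # running components
        "om/test",
        "lcfg/inventory/write",
    },
}


def has_capability(user_roles, capability):
    """Return True if any of the user's roles grants the capability."""
    return any(capability in ROLES.get(role, set()) for role in user_roles)


# A newly discovered requirement is met by extending the one role,
# rather than by granting the user yet another role.
print(has_capability(["portinglcfg"], "@lcfgsvn"))        # True
print(has_capability(["portinglcfg"], "kadmin/certify"))  # False (invented name)
```

Modelling the role as a single set mirrors the approach described above: one role to maintain, with capabilities added to it as the need arises.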

Shortcomings of the Project Plan

The project plan was more or less the same one that had been used for, and refined after, several previous LCFG ports (fc6_64, fc6, fc5, etc.). As an aide memoire it was excellent. As an explanation of the rationale of each stage of the project it was somewhat cryptic! When embarking on most stages of the project we had to go to Stephen (a veteran of these LCFG port projects) for an explanation of what was needed and how to go about it. (We should say at this point that he was very patient and extremely helpful all the way through the project, as were Alastair Scobie and Kenny MacDonald who also helped very enthusiastically and capably. Thanks!) I'm not sure whether or not the project plan is the best place for these more detailed explanations, but they're certainly needed somewhere. As with everything else that puzzled or troubled us during the project, our attempts to understand brief instructions in the project plan are documented in the project diary.

Things to add to the project plan for future LCFG ports

  • An rpmsubmit stage before the current stage 5:
    • get an NFS client working
    • install AMD packages
    • copy a DICE config file and tweak for remote LDAP server
    • lcfg-buildtools
    • rpmsubmit
  • A getupdates stage:
    • Port getupdates script
    • mirror updates
    • ensure that updates are applying.

Resources:

In terms of staff effort, all timescales are calculated on the assumption that the people involved are experienced in porting LCFG/DICE to new versions of Fedora. In practice, this project will be resourced by COs with no such experience, so the actual timescales will often be longer than specified.

In terms of physical resources, the current FC6 platform is using about 70GB of file space for RPMs and SRPMs. SL5 doesn't have a large "extras" bucket and won't have all our teaching applications; on the other hand it will have a longer life and therefore the "updates" bucket will be larger in the long run. A guesstimate of required space is 30GB. Note that the master RPMs/SRPMs partition will need to grow to accommodate SL5.

Plan:

  1. Install SL5 (1 day)
    • Standard SL5 desktop machine
    • Get onto the Informatics network
    • Authentication with Kerberos
    • Directory services from LDAP
  2. RPM repositories (0.5 day)
    • Create repository directory structure
    • Populate base, updates, extras
  3. Package lists (0.5 day)
    • Create lists for SL5 base, updates, postship
    • Create empty lists for lcfg components
  4. Essential headers (0.5 day)
    • Create any essential headers for each platform
    • Add basics to lcfg/defaults/profile.h and lcfg/defaults/updaterpms.h
  5. Auto-build and run tests for all LCFG components (2 days). Also auto-build:
    • openafs client support - makes porting a lot easier
    • openssh with our patches
  6. Create basic development platform (3 days)
    • Develop Inf level to create a basic profile with most components removed
    • lcfg-buildtools
    • lcfg-utils
    • lcfg-ngeneric
    • lcfg-client
    • lcfg-file
    • lcfg-inventory
    • lcfg-logserver
    • lcfg-authorize
    • lcfg-om
    • lcfg-updaterpms
    • lcfg-amd (for rpmsubmit)
    • rpmsubmit
  7. Components necessary to keep a machine LCFG managed (2 days)
    • lcfg-auth
    • lcfg-boot
    • lcfg-cron
    • lcfg-etcservices
    • lcfg-init
    • lcfg-lcfginit
    • lcfg-nsu
    • lcfg-pam
    • lcfg-syslog
    • lcfg-tcpwrappers
  8. Components for auth/authz, directory services and DNS in client mode (2 days)
    • lcfg-dns
    • lcfg-kerberos
    • lcfg-nsswitch
    • lcfg-ntp
    • lcfg-openldap
    • lcfg-openssh
  9. X support. (1 day)
    • lcfg-gdm
    • lcfg-xfree
  10. Other components, mainly just auto-build and install. (1 day)
    • lcfg-alias
    • lcfg-mailng
    • lcfg-mailcap
    • lcfg-prelink
    • lcfg-rpmcache
    • lcfg-xinetd
  11. Installation systems (4 days)
    • lcfg-fstab
    • lcfg-grub
    • lcfg-hardware
    • lcfg-install
    • lcfg-kernel
    • lcfg-network
    • Create installroot and installbase package lists
    • Build, install and test lcfg-buildinstallroot
    • Set up PXE NFS root, installer, etc.
  12. Port MPU managed resources to the DICE level. (3 days)
  13. Document new platforms (2 days)
  14. Back port lcfg-buildtools to all other supported platforms
  15. Add SL5 to the list of supported platforms on the LCFG website.
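
As a quick check on the plan above, the per-stage estimates can be totalled. This is a minimal sketch; the figures are copied from the numbered stages, and stages 14 and 15 are left out because the plan gives no estimate for them. The name ESTIMATES_DAYS is chosen here for illustration only.

```python
# Day estimates copied from the numbered plan (stage number -> days).
# Stages 14 and 15 carry no estimate in the plan, so they are omitted.
ESTIMATES_DAYS = {
    1: 1, 2: 0.5, 3: 0.5, 4: 0.5, 5: 2, 6: 3, 7: 2,
    8: 2, 9: 1, 10: 1, 11: 4, 12: 3, 13: 2,
}

total_days = sum(ESTIMATES_DAYS.values())
print(f"Estimated effort for stages 1-13: {total_days} days")  # 22.5 days
```

As the Resources section notes, these estimates assume experienced porters, so the real effort is likely to be higher.
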
Other: 

Dependencies:

All LCFG components will be auto-built to aid speedy development. We are dependent on the relevant units to test the operation and configuration of the software and to sign off each component. This should not block the ongoing development work of this project, but it will be required before completion.

Risks: There is a risk of SL5 being unstable in unexpected ways. Also, we are assuming in this plan that there will be no porting issues due to substantial changes in important sub-systems since FC6.

There is also the longer-term risk that Scientific Linux's viability could be affected by any shift in Red Hat's policy towards distributions based on rebuilt RHEL SRPMs.

URL: http://wiki.lcfg.org/bin/view/LCFG/SL5Diary

Milestones

Proposed date  Achieved date  Name  Description
2007-07-12     2007-07-12     01    Install SL5
2007-07-18     2007-07-18     02    RPM repositories
2007-07-18     2007-07-18     03    Package lists
2007-07-20     2007-07-20     04    Essential headers
2007-08-09     2007-07-30     05    Auto-build and run tests for all LCFG components.
2007-08-08     2007-08-10     06    Create basic development platform
2007-08-16     2007-08-20     07    Components necessary to keep a machine LCFG managed
2007-08-17     2007-08-28     08    Components for auth/authz, directory services and dns in client mode.
2007-08-21     2007-08-31     09    X support
2007-08-27     2007-09-05     10    Other components, mainly just auto-build and install.
2007-09-19     2007-09-21     11    Installation systems
2007-09-25     2007-10-04     12    Port MPU managed resources to the DICE level.
2007-09-26     2007-10-12     13    Document new platforms.
2007-08-03     2007-10-12     14    Back port lcfg-buildtools to all other supported platforms
2007-10-01     2007-10-12     15    Add SL5 to the list of supported platforms on the LCFG website.
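
The gap between proposed and achieved dates can be worked out directly from the milestone data. The sketch below is illustrative only: the dates are copied from the milestones listed above, while the names MILESTONES, slip_days and slips are chosen here for the example.

```python
from datetime import date

# (milestone name, proposed date, achieved date), copied from the milestones.
MILESTONES = [
    ("01", "2007-07-12", "2007-07-12"),
    ("02", "2007-07-18", "2007-07-18"),
    ("03", "2007-07-18", "2007-07-18"),
    ("04", "2007-07-20", "2007-07-20"),
    ("05", "2007-08-09", "2007-07-30"),
    ("06", "2007-08-08", "2007-08-10"),
    ("07", "2007-08-16", "2007-08-20"),
    ("08", "2007-08-17", "2007-08-28"),
    ("09", "2007-08-21", "2007-08-31"),
    ("10", "2007-08-27", "2007-09-05"),
    ("11", "2007-09-19", "2007-09-21"),
    ("12", "2007-09-25", "2007-10-04"),
    ("13", "2007-09-26", "2007-10-12"),
    ("14", "2007-08-03", "2007-10-12"),
    ("15", "2007-10-01", "2007-10-12"),
]


def slip_days(proposed, achieved):
    """Calendar days between the proposed and achieved dates (negative = early)."""
    return (date.fromisoformat(achieved) - date.fromisoformat(proposed)).days


slips = {name: slip_days(p, a) for name, p, a in MILESTONES}
for name in sorted(slips):
    print(f"{name}: {slips[name]:+d} days")
```

On these figures milestone 05 was achieved 10 days ahead of its proposed date, and milestone 14 shows the largest slip at 70 days.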