Automated upstream package repository mirroring

Project ID: 
Current stage: 

Description: We currently maintain our local package repositories using the locally developed getupdates script. It is very basic and simply uses FTP to synchronise with individual directories on remote sites. The main problems are that it has to be run manually and that it does not give a complete copy of a remote site; for instance, we do not have local copies of SRPMs. It would be much more efficient to keep complete copies of remote sites and to do so in an entirely automated manner.



Customer: Informatics

Case statement: The current script for synchronising with remote sites is fragile. It has to be run manually, which is inefficient, and it tends to be run only when Stephen is around and remembers. We do not have local copies of everything for a platform. This is a particular issue for SRPMs: we would like to keep copies of all of them in case, for example, the upstream provider later drops a package that we still want and need to patch. We also lack local copies of debuginfo packages and of other types of updates (i.e. non-security), which might be useful.

Another issue is the way we currently lay out our package "buckets" in the local RPM repository. We only have base and updates directories for sl5, whereas upstream there are separate repositories for each minor release of sl5. This forces us to run our local scripts rather than just using something like rsync to mirror an entire site.
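As a rough illustration of what mirroring an entire site with rsync could look like once the local layout matches upstream, here is a minimal Python sketch that builds and runs an rsync command. The function names, the mirror URL and the destination path are hypothetical and not part of the existing getupdates script; the rsync options shown are one plausible choice rather than a decided configuration.

```python
import subprocess


def build_rsync_command(source_url, dest_dir, dry_run=False):
    """Build an rsync command that mirrors source_url into dest_dir.

    All names here are illustrative assumptions. Options used:
      -a              archive mode (recursive, preserves times, links, perms)
      --delete-after  remove local files that vanished upstream, after transfer
      --partial       keep partially transferred files so retries can resume
      --timeout=600   give up if the connection stalls for ten minutes
    """
    cmd = ["rsync", "-a", "--delete-after", "--partial", "--timeout=600"]
    if dry_run:
        # Report what would change without touching the local mirror.
        cmd.append("--dry-run")
    # Trailing slashes make rsync copy the directory contents, not the
    # directory itself, so the local tree lines up with the remote one.
    cmd.append(source_url.rstrip("/") + "/")
    cmd.append(dest_dir.rstrip("/") + "/")
    return cmd


def mirror(source_url, dest_dir, dry_run=False):
    """Run the mirror and return rsync's exit status (0 means success)."""
    result = subprocess.run(build_rsync_command(source_url, dest_dir, dry_run))
    return result.returncode


if __name__ == "__main__":
    # Hypothetical upstream URL and local path, for illustration only.
    status = mirror("rsync://mirror.example.org/sl/", "/srv/mirror/sl", dry_run=True)
    print("rsync exit status:", status)
```

A script along these lines could be run from cron, which would remove the dependence on someone remembering to run the synchronisation by hand. Note that --delete-after would also remove packages dropped upstream, so it would need reconsidering if dropped SRPMs are to be retained.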

Beyond this, we currently have to process the list of package updates manually to generate LCFG package lists for base, updates, postship, kernel and xen. Much of this work could be automated by converting to a simple rules-based system. There is the potential to considerably improve the handling of LCFG package lists so that we can keep up to date with repositories such as epel as well as the main platform. This would give us more timely security updates and bug fixes.
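To give a flavour of what a simple rules-based system might look like, the sketch below assigns package names to LCFG package lists using an ordered list of glob patterns, where the first matching rule wins. The patterns, the default list and the function names are assumptions made purely for illustration; the real rules would have to be worked out as part of the project.

```python
from fnmatch import fnmatchcase

# Ordered (pattern, package list) rules; the first match wins.
# These patterns are hypothetical examples, not an agreed rule set.
RULES = [
    ("kernel*", "kernel"),
    ("kmod-*", "kernel"),
    ("xen*", "xen"),
]

# Assumed fallback list for anything no rule claims.
DEFAULT_LIST = "updates"


def categorise(package_name, rules=RULES, default=DEFAULT_LIST):
    """Return the name of the package list that package_name belongs in."""
    for pattern, target in rules:
        # fnmatchcase gives case-sensitive shell-style matching on any OS.
        if fnmatchcase(package_name, pattern):
            return target
    return default


def group_by_list(package_names, rules=RULES, default=DEFAULT_LIST):
    """Group package names into a dict keyed by package list name."""
    groups = {}
    for name in package_names:
        groups.setdefault(categorise(name, rules, default), []).append(name)
    return groups


if __name__ == "__main__":
    new_packages = ["kernel-devel", "xen-libs", "bash", "kmod-foo"]
    for list_name, names in sorted(group_by_list(new_packages).items()):
        print(list_name, names)
```

Keeping the rules as plain data means they could later be moved into a configuration file and extended to cover additional repositories such as epel without changing the code.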







Proposal: The project would split into roughly 3 parts:

  1. Develop a solution for automatically synchronising with remote sites
  2. Switch to using the new repository layout (e.g. with updaterpms.)
  3. Create scripts which can find and categorise new packages

The most important parts of this project are 1 and 2. Part 3 would be nice to have from an efficiency point of view but is not essential.

Resources: 2 weeks of Stephen's time. Much more if someone else does the work.






Proposed date | Achieved date | Name | Description