A local mirror is useful for a site that has many Ubuntu servers and workstations. Apt-cacher is a good way to do this without a massive installation; however, it does not allow much control over what is downloaded or when.
By maintaining an internal mirror, a site can control when updates are grabbed from the Internet. There are two good ways (that I know of) to maintain an internal mirror without using a proxy or apt-cacher.
- Use Debmirror - Debmirror will allow you to select which parts of the archive (dapper, edgy, edgy-security, etc.) you want to mirror. This is useful because it allows a finer degree of control and works well with limited hard drive space.
- Use rsync - The great thing about rsync is that it will grab EVERY file in those archives. The problem is that, because it does not discriminate, you get the full mirror... all 600-some gigs of it. As more versions come online, you will ultimately have more to store. As of this writing, my mirror contains hoary, breezy, dapper, edgy, and the packages for the new feisty.
This document will cover the second option and get you up to speed.
A local mirror might be suggested for sites with 50 or more clients. Using a mirror in these cases reduces not only the bandwidth these clients consume on your end of the link, but also the load on the Official Mirrors.
- An Ubuntu Server with ample storage (800+ Gigs)
- rsync - http://launchpad.net/rsync
- mailx (optional)
Now, on to the instructions
Allocate the necessary space
At this writing, the Ubuntu Archive takes up about 680 GB of space. The amount of space taken up will probably increase over time, so allocating double the amount of space required should allow for adequate growth without needing a hardware upgrade. In addition, using a large hard drive will help prevent file fragmentation. Using a partition 800 GB or larger is recommended.
Picking a file system can be tricky and is a highly debated subject. XFS has been very successful for the author of this document.
Format your partition, pick a mount point, and add the setup to your fstab.
For the purpose of this document, we will assume that the mount point is /media/mirror and that our archives will be in the folder /media/mirror/ubuntu.
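As a rough sketch of the step above, the following prints an fstab entry for an XFS partition mounted at /media/mirror. The device name /dev/sdb1 is purely an assumption for illustration; substitute the partition you actually set aside. The commands that would format and mount the disk are left as comments because they are destructive and must be run as root.

```shell
#!/bin/sh
device=/dev/sdb1            # hypothetical partition, not from this document
mount_point=/media/mirror   # mount point assumed throughout this document
fs_type=xfs

# One fstab entry: device, mount point, type, options, dump, fsck pass
fstab_line="$device $mount_point $fs_type defaults 0 0"
echo "$fstab_line"

# Once the values are right, as root:
#   mkfs.xfs "$device"
#   mkdir -p "$mount_point"
#   echo "$fstab_line" >> /etc/fstab
#   mount "$mount_point"
```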
Download the necessary packages
- rsync is required to build and maintain our mirror.
- mailx is only required if you want to have your script email you when something goes wrong.
# apt-get install rsync mailx
Starting the first download
Because we are downloading 600+ Gigabytes from the Internet this download will take a VERY long time. If you can let the download run at full speed for the full gauntlet, more power to you. However, you will probably need to limit the speed of your download so it can download at a reduced speed for the next few days (weeks?).
Make the ubuntu directory
# mkdir /media/mirror/ubuntu
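Before starting a transfer of this size, it may be worth confirming that the destination really has the space available. A minimal sketch, assuming the 800 GB figure recommended earlier; the function name free_gb is made up for this example.

```shell
#!/bin/sh
# Print the free space, in whole GiB, on the file system holding $1.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

required_gb=800
target=/media/mirror/ubuntu
[ -d "$target" ] || target=/    # fall back so the sketch runs anywhere

available_gb=$(free_gb "$target")
if [ "$available_gb" -ge "$required_gb" ]; then
    echo "OK: $available_gb GB free on $target"
else
    echo "WARNING: only $available_gb GB free on $target, want $required_gb GB"
fi
```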
Now that we have our destination, start the rsync download. We will limit it to 128 KiloBytes per second to keep our sanity during working hours.
# rsync -a --bwlimit=128 rsync://archive.ubuntu.com/ubuntu /media/mirror/ubuntu
If you want to see the progress of the download, you can add the --progress flag to the command.
NOTE: Even with the --progress flag turned on, rsync may take a very long time before giving any output. This is normal behavior for a mirror this large.
NOTE: The --bwlimit value is given in kilobytes per second, not kilobits. Keep the factor of eight in mind when comparing it with your link speed.
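To get a feel for what a given limit means in practice, the arithmetic can be sketched as below. It assumes the 680 GB archive size quoted in this document and the 128 KB/s limit from the command above, and it ignores protocol overhead, so treat the result as a rough lower bound.

```shell
#!/bin/sh
archive_gb=680        # archive size quoted in this document
limit_kb_per_s=128    # the --bwlimit value used above

archive_kb=$(( archive_gb * 1024 * 1024 ))
seconds=$(( archive_kb / limit_kb_per_s ))
days=$(( seconds / 86400 ))
link_kbit_per_s=$(( limit_kb_per_s * 8 ))   # bytes to bits

echo "Approximate transfer time: $days days"
echo "Share of the link used: $link_kbit_per_s kbit/s"
```

With these figures the estimate comes out at roughly 64 days, so it may be worth raising the limit outside working hours.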
Building the script to keep it up to date
While we are waiting for the initial update, we can build our script to keep the mirror up to date.
Caution: Do not initialize your script or set your cron job until the initial download is done!
Open an editor (vi, pico, or gedit) and save the following code as /usr/local/bin/ubuntu-mirror-sync.sh
#!/bin/bash
## Mirror Synchronization Script /usr/local/bin/ubuntu-mirror-sync.sh
## Version 1.01 Updated 13 Feb 2007 by Peter Noble

## Point our log file to somewhere and setup our admin email
log=/var/log/mirrorsync.log

adminmail=email@example.com
# Set to 0 if you do not want to receive email
sendemail=1

# Subject is the subject of our email
subject="Ubuntu Mirror Sync Finished"

## Setup the server to mirror
remote=rsync://archive.ubuntu.com/ubuntu

## Setup the local directory / Our mirror
local=/media/mirror/ubuntu

## Initialize some other variables
complete="false"
failures=0
status=1
pid=$$

echo "`date +%x-%R` - $pid - Started Ubuntu Mirror Sync" >> $log
while [[ "$complete" != "true" ]]; do

    if [[ $failures -gt 0 ]]; then
        ## Sleep for 5 minutes for sanity's sake
        ## The most common reason for a failure at this point
        ## is that the rsync server is handling too many concurrent connections.
        sleep 5m
    fi

    if [[ $1 == "debug" ]]; then
        echo "Working on attempt number $failures"
        rsync -a --delete-after --progress $remote $local
        status=$?
    else
        rsync -a --delete-after $remote $local >> $log
        status=$?
    fi

    if [[ $status -ne "0" ]]; then
        complete="false"
        (( failures += 1 ))
    else
        echo "`date +%x-%R` - $pid - Finished Ubuntu Mirror Sync" >> $log

        # Send the email
        if [[ -x /usr/bin/mail && "$sendemail" -eq "1" ]]; then
mail -s "$subject" "$adminmail" <<OUTMAIL
Summary of Ubuntu Mirror Synchronization
PID: $pid
Failures: $failures
Finish Time: `date`

Sincerely,
$HOSTNAME
OUTMAIL
        fi
        complete="true"
    fi
done

exit 0
Once your initial download completes, make the script executable (chmod +x /usr/local/bin/ubuntu-mirror-sync.sh) and run it to pick up any last-minute updates.
NOTE: The script runs rsync with the --delete-after option, which removes obsolete files only after the transfer has finished, so that users updating WHILE a synchronization is in progress will not experience any problems.
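The effect of --delete-after can be tried safely on a pair of throwaway directories before pointing rsync at the real mirror. The sketch below is only an illustration: the directory and file names are invented, and it skips the transfer if rsync is not installed.

```shell
#!/bin/sh
work=$(mktemp -d)
mkdir -p "$work/upstream/pool" "$work/mirror/pool"
echo "new package" > "$work/upstream/pool/new.deb"     # exists only upstream
echo "old package" > "$work/mirror/pool/stale.deb"     # removed upstream

if command -v rsync >/dev/null 2>&1; then
    # Copy everything first; stale files are deleted once the transfer ends.
    rsync -a --delete-after "$work/upstream/" "$work/mirror/"
    ls "$work/mirror/pool"
else
    echo "rsync is not installed; nothing was synchronized"
fi
```

After the run, the throwaway mirror should contain new.deb and no longer contain stale.deb.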
Use Cron to keep the mirror up to date
Official Ubuntu Mirrors are recommended to update every 6 hours. In the case of your site, this may be neither necessary nor desired. Chances are, you want to update during hours when your bandwidth will be under-utilized: only once every day in the middle of the night.
# crontab -e
Insert the following line, observing your particular time requirements, into the cron table. This example sets the system to update every night at 9:15 PM.
15 21 * * * /usr/local/bin/ubuntu-mirror-sync.sh > /dev/null 2> /dev/null
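Because the script keeps retrying until rsync succeeds, a scheduled run could in principle start while an earlier one is still going. One possible guard, sketched here with an atomic mkdir lock, could be placed at the top of the script. The lock path and the function names are assumptions made for this example, not part of the original script.

```shell
#!/bin/sh
# Hypothetical lock location; a real deployment might prefer /var/lock.
lockdir="${TMPDIR:-/tmp}/ubuntu-mirror-sync.lock"

acquire_lock() {
    # mkdir is atomic: it fails if the directory already exists.
    mkdir "$lockdir" 2>/dev/null
}

release_lock() {
    rmdir "$lockdir"
}

if acquire_lock; then
    echo "lock acquired, safe to start the sync"
    # the rsync loop would run here
    release_lock
else
    echo "another sync appears to be running, exiting"
fi
```

If a run is killed before it releases the lock, the directory has to be removed by hand before the next sync can start.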
Publish the mirror on the Apache server
This is the easy part. Assuming that you have Apache configured to follow symbolic links, all you need to do is add a symbolic link to your mirror!
# cd /var/www/
# ln -s /media/mirror/ubuntu
You can test to see if this was successful by using a web browser to visit the site. Go to http://ubuntumirror.mydomain/ubuntu
You should see some directories named "dists", "indices", "pool", "project", and a file named "ls-lR.gz".
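The same check can be made from the command line against the local directory. This is a minimal sketch; the function name check_mirror_layout is invented for the example, and it only tests that the entries listed above exist, not that their contents are complete.

```shell
#!/bin/sh
# Return 0 if the expected top-level entries exist under $1.
check_mirror_layout() {
    root=$1
    missing=0
    for name in dists indices pool project; do
        if [ ! -d "$root/$name" ]; then
            echo "missing directory: $root/$name"
            missing=1
        fi
    done
    if [ ! -f "$root/ls-lR.gz" ]; then
        echo "missing file: $root/ls-lR.gz"
        missing=1
    fi
    return "$missing"
}

if check_mirror_layout /media/mirror/ubuntu; then
    echo "mirror layout looks complete"
else
    echo "mirror layout is incomplete"
fi
```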
Update Your Clients
Now that you have your very own Ubuntu Mirror, you need to point all of your workstations and servers to this mirror for their updates. This mirror will be good for main, universe, multiverse, and restricted.
Replace the server name for the Ubuntu Archives with your local mirror. The existing server will likely be something like us.archive.ubuntu.com
If your server is called ubuntumirror.mydomain then your /etc/apt/sources.list file should look something like this
deb http://ubuntumirror.mydomain/ubuntu/ feisty main restricted
deb-src http://ubuntumirror.mydomain/ubuntu/ feisty main restricted
deb http://ubuntumirror.mydomain/ubuntu/ feisty-updates main restricted
deb-src http://ubuntumirror.mydomain/ubuntu/ feisty-updates main restricted
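On more than a handful of clients, editing the file by hand gets tedious. The replacement can be sketched with sed as below. The old host name us.archive.ubuntu.com is an assumption about what the client currently uses, and the function name rewrite_sources is made up for this example. The sketch works on a temporary sample file rather than the real /etc/apt/sources.list.

```shell
#!/bin/sh
# Replace one archive host with another, keeping a .bak copy of the file.
# $1 = sources.list path, $2 = old host, $3 = mirror host
rewrite_sources() {
    # Dots in the host are left unescaped, which is good enough for a sketch.
    sed -i.bak "s|http://$2/|http://$3/|g" "$1"
}

sample=$(mktemp)
printf '%s\n' \
    'deb http://us.archive.ubuntu.com/ubuntu/ feisty main restricted' \
    'deb-src http://us.archive.ubuntu.com/ubuntu/ feisty main restricted' \
    > "$sample"

rewrite_sources "$sample" us.archive.ubuntu.com ubuntumirror.mydomain
cat "$sample"
```

On a real client, the first argument would be /etc/apt/sources.list and the command would need to run as root.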
You can test your mirror by running
# apt-get update
You should see some output referencing your server, similar to this:
Get:1 http://ubuntumirror.mydomain feisty Release.gpg [191B]
Ign http://ubuntumirror.mydomain feisty/main Translation-en_US
Ign http://ubuntumirror.mydomain feisty/restricted Translation-en_US
Get:2 http://ubuntumirror.mydomain feisty-updates Release.gpg [191B]
Ign http://ubuntumirror.mydomain feisty-updates/main Translation-en_US
Ign http://ubuntumirror.mydomain feisty-updates/restricted Translation-en_US
Extra Credit: Become an Official Mirror
If you are at a site with bandwidth to spare, you may want to consider becoming an official Ubuntu Mirror.
Visit https://launchpad.net/ubuntu/+archivemirrors to view the existing mirrors and register your own.