There are many applications available to back up Ubuntu, each with its own strengths and weaknesses. Some are made for enterprise environments where many computers must be backed up quickly and efficiently. Others are made for home environments and come with a simple wizard-driven GUI. Still others are used from the command line.
Besides their interfaces, these applications also differ in function. For example, a particular application may or may not be able to back up to DVDs, CDs, disk drives, FTP sites, SMB drives, and other media. As varied as these programs are, so are the needs of the individuals who use them.
Installation and Setup
- Duplicity can be installed by searching for it by name in Synaptic, marking it for installation, and then simply applying the change. Alternatively, it can be installed by typing the following command in a terminal window:
sudo apt-get install duplicity
- Additionally, you may need to install the NcFTP package to be able to use Duplicity's FTP backup. Simply run the following command:
sudo apt-get install ncftp
- The next step is to prepare an off-site location to receive the backup files. For this Howto we will be using a remote FTP server as the off-site backup file storage location. Although any remote FTP server will do, a web host account makes an excellent off-site location to store files. An FTP account can be easily set up by using the web host account control panel (perhaps CPanel, depending on the hosting provider).
If you decide to use a web host account, it's a good idea to make a separate FTP account rather than using the main account credentials. This will help to keep things separate and is a good security measure as well.
- Once the FTP account is set up, it should be tested using an FTP client such as gFTP from the machine that will be running Duplicity. This will ensure that a connection to the remote site is possible and that the FTP site has been set up correctly.
- It's a good idea at this time to create a directory structure for your backups on the remote FTP server. In this Howto, we will name each remote directory after the directory that we intend to back up in Ubuntu. In other words, when backing up the /etc directory, create a /etc directory off the root of the backup account to house that particular backup set.
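The naming convention from the last step can be captured in a small helper. This is only a sketch: the function name remote_url_for is made up for illustration, and FtpUserID and ftp.domain.com are the same placeholders used throughout this Howto.

```shell
# Placeholders from this Howto; replace them with your own account details.
FTP_USER="FtpUserID"
FTP_HOST="ftp.domain.com"

# Hypothetical helper: print the remote URL that mirrors a local directory,
# so that /etc is stored under /etc on the FTP account.
remote_url_for() {
    local_dir=${1%/}    # drop one trailing slash, if any
    printf 'ftp://%s@%s%s\n' "$FTP_USER" "$FTP_HOST" "$local_dir"
}

remote_url_for /etc
```

Keeping the mapping in one place makes it harder to send a backup of one directory into the remote directory of another.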
Once Duplicity is installed and an FTP account is ready to receive the backup files, then it's time to make a small script to test it out. Simply open up gedit. Copy and paste the following lines of code into the new document. Then save the file as backup.sh. You can then make the file executable and not readable by others using the command chmod 700 backup.sh in a terminal window or by right-clicking the file in Nautilus, clicking on Properties from the menu and then changing the permissions on the Permissions tab.
export PASSPHRASE=SomeLongGeneratedHardToCrackKey
export FTP_PASSWORD=WhateverPasswordYouSetUp
duplicity /etc ftp://FtpUserID@ftp.domain.com/etc
unset PASSPHRASE
unset FTP_PASSWORD
The above script will cause Duplicity to back up the /etc directory into compressed volumes encrypted with the specified pass phrase, and then upload the backup files to the FTP account that was set up for it, using the username and password specified.
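Because the script holds both the pass phrase and the FTP password, it is worth confirming that the permissions really are restrictive before relying on it. The check below is a minimal sketch; the helper name is_private is an assumption for illustration, and it only inspects the mode string printed by ls.

```shell
# Hypothetical helper: succeed only if group and others have no permissions
# on the file, which is what chmod 700 (or 600) produces.
is_private() {
    # ls -ld prints a mode string such as -rwx------; characters 5-10 are
    # the group and other permission bits.
    bits=$(ls -ld -- "$1" | cut -c5-10)
    [ "$bits" = "------" ]
}

# Warn if a backup.sh in the current directory is open to other users.
if [ -e backup.sh ] && ! is_private backup.sh; then
    echo "backup.sh is accessible to other users; run: chmod 700 backup.sh" >&2
fi
```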
After modifying the sample, run it by typing ./backup.sh from a terminal window.
Here's an example of the output the script should return:
--------------[ Backup Statistics ]--------------
StartTime 1133074801.81 (Sun Nov 27 01:00:01 2005)
EndTime 1133074927.82 (Sun Nov 27 01:02:07 2005)
ElapsedTime 126.01 (2 minutes 6.01 seconds)
SourceFiles 3446
SourceFileSize 27195497 (25.9 MB)
NewFiles 3446
NewFileSize 27195497 (25.9 MB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 3446
RawDeltaSize 27018423 (25.8 MB)
TotalDestinationSizeChange 6865063 (6.55 MB)
Errors 0
-------------------------------------------------
On the FTP server you should find a few files similar to the following list:
duplicity-full-signatures.2005-11-27T01:00:01-05:00.sigtar.gpg
duplicity-full.2005-11-27T01:00:01-05:00.manifest.gpg
duplicity-full.2005-11-27T01:00:01-05:00.vol1.difftar.gpg
duplicity-full.2005-11-27T01:00:01-05:00.vol2.difftar.gpg
I'm going to make some assumptions about what the files contain. But before I do, you should be aware of the Duplicity man page at http://www.nongnu.org/duplicity/duplicity.1.html, which contains more detail than what I am giving you here.
- The signatures file contains signatures of each file that is backed up so that Duplicity can figure out which part of a file has changed. With that information it can upload only the changed part to complete a new backup set.
- The manifest file contains a listing of all the files in the backup set and a SHA1 hash of each file, probably so Duplicity can tell very quickly whether a file has been changed or not since the last backup.
- The volume files (vol1 and vol2) contain the actual file data. It appears that Duplicity volumes are at most 5MB. That's helpful during restores because the entire backup set doesn't need to be downloaded to retrieve a single file. Duplicity will only download the volume containing that file.
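To make the three roles concrete, the sketch below sorts a remote file name into one of them. The function name backup_file_kind is hypothetical, and the patterns are taken only from the file names listed above; other Duplicity versions may name their files differently.

```shell
# Hypothetical helper: name the role of a remote backup file from its name.
# The patterns follow the sample listing in this Howto.
backup_file_kind() {
    case $1 in
        duplicity-*signatures.*.sigtar*) echo signatures ;;
        duplicity-*.manifest*)           echo manifest ;;
        duplicity-*.vol[0-9]*.difftar*)  echo volume ;;
        *)                               echo unknown ;;
    esac
}

backup_file_kind "duplicity-full.2005-11-27T01:00:01-05:00.vol1.difftar.gpg"
```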
Depending on the parameters and their order, the duplicity command performs different functions. For example, an archive can be verified to see whether a complete backup was made and which files, if any, have changed since the last backup. The code below is an example of how to verify the archive taken by the backup.sh script. Save it as verify.sh, and remember to remove read permissions for other users:
export PASSPHRASE=SomeLongGeneratedHardToCrackKey
export FTP_PASSWORD=WhateverPasswordYouSetUp
duplicity verify ftp://FtpUserID@ftp.domain.com/etc /etc
unset PASSPHRASE
unset FTP_PASSWORD
Here is the output:
Verify complete: 3503 files compared, 2 differences found.
As the output shows, two files have already changed since the backup last ran. But which two? To find out, the verbosity level of the command must be increased from the default of level 3 to level 4. Change the verify.sh script given above by adding -v4. Here's what the new command looks like:
duplicity verify -v4 ftp://FtpUserID@ftp.domain.com/etc /etc
Here is the output again:
Difference found: File . has mtime Fri Dec 2 05:58:42 2005, expected Wed Nov 30 11:42:01 2005
Difference found: File resolv.conf has mtime Fri Dec 2 05:58:42 2005, expected Mon Nov 28 18:58:28 2005
Verify complete: 3503 files compared, 2 differences found.
It looks like two files have different modification times than the ones recorded in the backup set: resolv.conf and the backed-up directory itself (/etc), designated with a single dot.
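When the verify runs unattended, it can be handy to pull just the number of differences out of the summary line. The following is a sketch that assumes the summary has the same wording as the output shown above; the function name verify_differences is made up for this example.

```shell
# Hypothetical helper: read verify output on stdin and print the number of
# differences reported on the "Verify complete" line.
verify_differences() {
    sed -n 's/^Verify complete: .* compared, \([0-9][0-9]*\) difference.*/\1/p'
}

# Sample line copied from the output in this Howto.
printf '%s\n' 'Verify complete: 3503 files compared, 2 differences found.' | verify_differences
```

In verify.sh, the output of the duplicity verify command could be piped through the same function.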
List Archived Files
It's sometimes handy to check which files are in the latest backup set. That can be done with the following script:
export PASSPHRASE=SomeLongGeneratedHardToCrackKey
export FTP_PASSWORD=WhateverPasswordYouSetUp
duplicity list-current-files ftp://FtpUserID@ftp.domain.com/etc
unset PASSPHRASE
unset FTP_PASSWORD
When run, a long list of files is returned. Here is a section of the output:
...
Sat Nov 12 19:58:23 2005 rcS.d/S70xorg-common
Sat Nov 12 20:06:57 2005 rcS.d/S75sudo
Sat Nov 12 20:35:43 2005 readahead
Mon Nov 28 18:57:23 2005 readahead/readahead
Mon Nov 28 18:57:23 2005 readahead/readahead.new
Tue Jan 25 15:03:18 2005 reportbug.conf
Mon Nov 28 18:58:28 2005 resolv.conf
Sat Nov 12 19:55:51 2005 resolvconf
Sat Nov 12 20:06:57 2005 resolvconf/update-libc.d
Wed Aug 17 21:17:26 2005 resolvconf/update-libc.d/fetchmail
Mon Jun 27 07:16:38 2005 rmt
Mon Sep 19 06:41:04 2005 rpc
Sat Nov 19 13:08:21 2005 samba
Thu Jul 21 13:31:14 2005 samba/gdbcommands
Sat Nov 19 13:08:21 2005 samba/smb.conf
Sat Nov 19 13:05:32 2005 samba/smb.conf~
Sat Nov 12 20:00:22 2005 sane.d
Tue Sep 27 09:14:55 2005 sane.d/abaton.conf
Tue Sep 27 09:14:55 2005 sane.d/agfafocus.conf
...
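Since the listing is long, it usually gets filtered. The sketch below strips the timestamp so that only the paths remain, which makes it easy to search for one file. It assumes every entry begins with a timestamp shaped like the ones above; the function name listed_paths is hypothetical.

```shell
# Hypothetical helper: read a file listing on stdin and print only the paths.
# Lines that do not start with a timestamp such as
# "Sat Nov 12 19:58:23 2005" are skipped.
listed_paths() {
    sed -n 's/^[A-Z][a-z][a-z] [A-Z][a-z][a-z]  *[0-9][0-9]* [0-9:][0-9:]* [0-9][0-9]* //p'
}

# Sample lines copied from the listing in this Howto.
printf '%s\n' \
    'Mon Nov 28 18:58:28 2005 resolv.conf' \
    'Sat Nov 19 13:08:21 2005 samba/smb.conf' | listed_paths
```

Piping the result through grep then answers the question of whether a given file is in the latest backup set.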
At some point, it may be necessary to restore a file. That task can be accomplished quickly and easily by making minor modifications to the following script, saved here as restore.sh:
export PASSPHRASE=SomeLongGeneratedHardToCrackKey
export FTP_PASSWORD=WhateverPasswordYouSetUp
duplicity --file-to-restore apt/sources.list ftp://FtpUserID@ftp.domain.com/etc /home/user/sources.list
unset PASSPHRASE
unset FTP_PASSWORD
One time, I ruined my sources.list file that is used by the apt program and Synaptic to obtain packages and upgrades for Ubuntu. In order for the programs to work again, I needed to be able to retrieve a previous copy of it. The command above restores sources.list from the last backup found on my web host FTP account to my home directory.
Notice a couple of things concerning the duplicity command in the restore.sh script:
- The path to the file that is to be restored is relative to the directory on which the backup set is based. So in the command above, apt/sources.list plus the directory on which we based our backup (/etc) equals /etc/apt/sources.list. It would not work to put /etc/apt/sources.list as the source path because the backup will not recognize /etc as a valid path.
- Duplicity will not overwrite an existing file. Here's the output if a change is made to the script above to restore the file to /etc/apt/sources.list:
Restore destination directory /etc/apt/sources.list already exists. Will not overwrite.
Also note that by default there is no output to this command if it completes successfully. It will simply place the file in the path you specify and exit. However, the verbosity level can be changed as we did in the verify.sh script to see more information.
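The relative path rule is easy to get wrong by hand, so it can help to derive the --file-to-restore argument from the absolute path. This is a minimal sketch; the function name restore_path is an assumption for illustration.

```shell
# Hypothetical helper: turn an absolute path into the path relative to the
# directory the backup set is based on, as --file-to-restore expects.
restore_path() {
    root=${1%/}     # backup root, e.g. /etc
    file=$2         # absolute path, e.g. /etc/apt/sources.list
    case $file in
        "$root"/*)
            rel=${file#"$root"/}
            printf '%s\n' "$rel"
            ;;
        *)
            echo "$file is not under $root" >&2
            return 1
            ;;
    esac
}

restore_path /etc /etc/apt/sources.list
```

The function refuses paths outside the backup root instead of passing them on, which mirrors the fact that the backup set does not recognize them.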
There is another restore scenario that happens to me now and then. I make a change to a file, let's say sources.list again. A few days go by and I find something that I want to install, but Synaptic gives me errors about the sources. I've messed something up. But because I have run the backup.sh script several times in the last few days, the bad sources.list file is now in the latest backup set. How do I recover a good copy?
Item 3 of the requirements was to be able to back up and restore multiple versions of files, and Duplicity allows just that. I would take a look at the last modified date of the sources.list file. It shows it was modified 3 days ago, so I need a copy from before that change. That can be done by adding the restore time parameter -t 3D, which asks Duplicity for the file as it existed three days ago. If that copy turns out to contain the bad edit already, a larger value such as -t 4D reaches further back. Here's what the edited restore.sh script looks like:
export PASSPHRASE=SomeLongGeneratedHardToCrackKey
export FTP_PASSWORD=WhateverPasswordYouSetUp
duplicity -t 3D --file-to-restore apt/sources.list ftp://FtpUserID@ftp.domain.com/etc /home/user/sources.list
unset PASSPHRASE
unset FTP_PASSWORD
That's all there is to it. If you change the file daily and don't know exactly which day's version you want, but you think it was between 4 and 6 days ago, then why not retrieve all the versions in that range? Edit your restore.sh file with multiple duplicity commands like so:
export PASSPHRASE=SomeLongGeneratedHardToCrackKey
export FTP_PASSWORD=WhateverPasswordYouSetUp
duplicity -t4D --file-to-restore apt/sources.list ftp://FtpUserID@ftp.domain.com/etc /home/user/sources.list.t4D
duplicity -t5D --file-to-restore apt/sources.list ftp://FtpUserID@ftp.domain.com/etc /home/user/sources.list.t5D
duplicity -t6D --file-to-restore apt/sources.list ftp://FtpUserID@ftp.domain.com/etc /home/user/sources.list.t6D
unset PASSPHRASE
unset FTP_PASSWORD
There are several other ways listed in the Duplicity man page to designate which backup set you want to restore from. Check it out for more information.
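For a longer range of days, the repeated restore commands can be generated with a loop instead of typed out. The sketch below only prints the commands so they can be reviewed before being pasted into restore.sh; the function name print_restore_range is hypothetical, and the output suffix follows the .t4D naming used above.

```shell
# Hypothetical helper: print one restore command per day in the given range.
# Arguments: first day, last day, relative path, backup URL, destination.
print_restore_range() {
    first=$1; last=$2; rel=$3; url=$4; dest=$5
    day=$first
    while [ "$day" -le "$last" ]; do
        echo "duplicity -t${day}D --file-to-restore $rel $url $dest.t${day}D"
        day=$((day + 1))
    done
}

print_restore_range 4 6 apt/sources.list ftp://FtpUserID@ftp.domain.com/etc /home/user/sources.list
```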
The only item left from the requirements that has not yet been solved is to automate Duplicity, so there is no need to bother with it manually.
When I first started working with Duplicity, I simply ran a backup.sh file similar to the one above from the root crontab using the following entry:
0 0 * * * /root/scripts/etc/backup.sh >>/var/log/duplicity/etc.log
Notice that the Duplicity results are going to a log file so that it can be checked to see if the backups are completing successfully each night.
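A log is easier to read when each run is marked with a time and an exit status. The wrapper below is one possible sketch, not part of Duplicity: the name run_logged is made up, and it works for any command, which also means error messages end up in the same log file.

```shell
# Hypothetical wrapper: run a command, append its output and a timestamped
# status line to a log file, and pass the command's exit status through.
run_logged() {
    log=$1
    shift
    echo "$(date '+%Y-%m-%d %H:%M:%S') starting: $*" >>"$log"
    status=0
    "$@" >>"$log" 2>&1 || status=$?
    echo "$(date '+%Y-%m-%d %H:%M:%S') exit status: $status" >>"$log"
    return "$status"
}

# Example use inside backup.sh (not executed here):
#   run_logged /var/log/duplicity/etc.log duplicity /etc ftp://FtpUserID@ftp.domain.com/etc
```

Only the command line is written to the log, so the pass phrase and FTP password, which are passed through environment variables, stay out of it.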
After some time, I noticed that restoring files gets slow when there are many incremental backups. After all, the base version of the file that I want to restore only exists in the first full backup, and the incremental backups only hold the changes to that file. So Duplicity has to download not only the volume that holds the base file but also all the incremental backup files holding the changes that must be applied to produce the file I asked for. Some management is therefore necessary to keep the backup system from getting out of control.
I now do two forms of management: I make full backups on the first day of each month, and I delete old backups after a year (which might be adjusted later to something less, depending on space).
Deleting old backups
You can invoke duplicity followed by remove-older-than and a time constraint. The first example script (below) removes backups older than one year:
duplicity remove-older-than 1Y --force ftp://FtpUserID@ftp.domain.com/etc
It is not possible to do a backup (full or incremental) and delete old files with the same command.
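Since backup and cleanup are separate commands, a driver script has to run them in sequence. The sketch below prunes only when the backup itself succeeded, so a failed run never removes older backup sets. The function name backup_then_prune and the DUPLICITY override variable are assumptions made for this example.

```shell
# DUPLICITY can be overridden (for example with "echo") to preview the
# commands without running them.
DUPLICITY=${DUPLICITY:-duplicity}

# Hypothetical helper: back up first, then prune only if the backup worked.
# Arguments: source directory, backup URL, age limit such as 1Y.
backup_then_prune() {
    src=$1; url=$2; keep=$3
    "$DUPLICITY" "$src" "$url" || return 1
    "$DUPLICITY" remove-older-than "$keep" --force "$url"
}

# Example use (not executed here):
#   backup_then_prune /etc ftp://FtpUserID@ftp.domain.com/etc 1Y
```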
Switching backup mode
Suppose we want to run a full backup once a month and incremental backups the rest of the time. The second example script (below) shows the simplest way to do this, with the switch '--full-if-older-than 1M'. This switch makes Duplicity perform a full backup whenever the last full backup is more than a month old, so a full backup still happens about once a month even when the run on the first day of a month fails.
Alternatively, suppose one's determination of when to do full backups is more complex. Then one should do the appropriate calculations in the driver script. The first example script (below) calculates the first day of the month using bash.
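As an illustration of a more complex rule, the sketch below chooses a full backup on the first Sunday of the month and an incremental backup otherwise. The policy and the function name backup_mode are purely hypothetical; the point is only that the decision lives in the driver script.

```shell
# Hypothetical policy: full backup on the first Sunday of the month,
# incremental otherwise. Arguments: day of month (date +%d) and
# day of week (date +%u, where 7 is Sunday).
backup_mode() {
    dom=${1#0}      # strip a leading zero, so 01 becomes 1
    dow=$2
    if [ "$dow" -eq 7 ] && [ "$dom" -le 7 ]; then
        echo full
    else
        echo incremental
    fi
}

backup_mode "$(date +%d)" "$(date +%u)"
```

The result can then select between the duplicity full command and the plain duplicity command, in the same way the first example script branches on the day of the month.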
First example script
This script also backs up the GPass password file that I use. Notice that I have Duplicity set to not encrypt the GPass files. Not only is it unnecessary to encrypt those files, since GPass has already encrypted them, but it's especially important not to encrypt them again, since I keep the long generated pass phrase that encrypts the other backup sets in the GPass password file. If Duplicity encrypted the GPass file as well, it would really be a problem if my computer were destroyed and I wanted to restore some of my files elsewhere: to get the pass phrase that I keep in GPass, I would have to decrypt the GPass files with that very same pass phrase.
#
# Script created on 12-1-2005
#
# This script was created to make Duplicity backups.
# Full backups are made on the 1st day of each month.
# Then incremental backups are made on the other days.
#

# Loading the day of the month in a variable.
date=`date +%d`

# Setting the pass phrase to encrypt the backup files.
export PASSPHRASE='SomeLongGeneratedHardToCrackKey'
export PASSPHRASE

# Setting the password for the FTP account that the
# backup files will be transferred to.
FTP_PASSWORD='WhateverPasswordYouSetUp'
export FTP_PASSWORD

# Check to see if we're at the first of the month.
# If we are on the 1st day of the month, then run
# a full backup. If not, then run an incremental
# backup.
if [ $date = 01 ]
then
  duplicity full --no-encryption /home/user/.gpass ftp://FtpUserID@ftp.domain.com/passwords >>/var/log/duplicity/passwords.log
  duplicity full /media/data/backup ftp://FtpUserID@ftp.domain.com/personal >>/var/log/duplicity/personal.log
  duplicity full /etc ftp://FtpUserID@ftp.domain.com/etc >>/var/log/duplicity/etc.log
else
  duplicity --no-encryption /home/user/.gpass ftp://FtpUserID@ftp.domain.com/passwords >>/var/log/duplicity/passwords.log
  duplicity /media/data/backup ftp://FtpUserID@ftp.domain.com/personal >>/var/log/duplicity/personal.log
  duplicity /etc ftp://FtpUserID@ftp.domain.com/etc >>/var/log/duplicity/etc.log
fi

# Check http://www.nongnu.org/duplicity/duplicity.1.html
# for all the options available for Duplicity.

# Deleting old backups
duplicity remove-older-than 1Y --force ftp://FtpUserID@ftp.domain.com/passwords >>/var/log/duplicity/passwords.log
duplicity remove-older-than 1Y --force ftp://FtpUserID@ftp.domain.com/personal >>/var/log/duplicity/personal.log
duplicity remove-older-than 1Y --force ftp://FtpUserID@ftp.domain.com/etc >>/var/log/duplicity/etc.log

# Unsetting the confidential variables so they are
# gone for sure.
unset PASSPHRASE
unset FTP_PASSWORD

exit 0
Second example script
export PASSPHRASE=(insert your value here)
export FTP_PASSWORD=(insert your value here)

# doing a monthly full backup (1M)
duplicity --full-if-older-than 1M /etc ftp://(insert your FTP server here)/etc
# exclude /var/tmp from the backup
duplicity --full-if-older-than 1M --exclude /var/tmp /var ftp://(insert your FTP server here)/var
duplicity --full-if-older-than 1M /root ftp://(insert your FTP server here)/root

# cleaning the remote backup space: deleting backups older than 6 months
# (6M; an alternative would be 1Y for 1 year, etc.)
duplicity remove-older-than 6M --force ftp://(insert your FTP server here)/etc
duplicity remove-older-than 6M --force ftp://(insert your FTP server here)/var
duplicity remove-older-than 6M --force ftp://(insert your FTP server here)/root

unset PASSPHRASE
unset FTP_PASSWORD
If you want to learn more about Duplicity, you can find more information at the Duplicity home page:
If you want to learn an alternate way to use Duplicity, you can take a look at the article here: