DataRecovery

Contents

CAUTION
Guidelines
Lost Partition
Imaging a damaged device, filesystem or drive
1. Software choices
2. Ran out of space while imaging the drive?
Extract filesystem from recovered image
1. Mounting partitions on the image
Extract individual files from recovered image
Ntfsprogs
Sleuth Kit and Autopsy
1. Autopsy
2. Sleuthkit
Cleaning up
Prevention
Other links

Deleted or lost files can sometimes be recovered from failed or formatted drives and partitions, CD-ROMs and memory cards using the free/libre software available in the Ubuntu repositories. The data is recoverable because the information is not immediately removed from the disk. Follow these steps to recover lost data.

CAUTION

You should NOT write to the failed device, as doing so can worsen a hardware failure and, in the case of lost files, overwrite existing data.

Shut down the affected machine as soon as possible, and restart it from a LiveCD or LiveUSB. Be certain that the live CD does not automatically mount any partition or swap space.

Guidelines

The following software will passively try to recover your data from failed or failing hardware. If your data is not replaceable and the following applications do not work, do not attempt to write to the failed device; seek professional advice instead.

If your device is damaged, it is advisable to image the device and work on the image file for data recovery. (See below.) If hardware failure is not the problem, you can recover data directly from the device.

To recover data from a failed device, you will need another device of equal or greater storage onto which to save your data. If you need to make an image of the failed device, you will need yet another quantity of space. You should run these tools from another OS which resides on another disk or a "live" CD.
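Before starting, it can help to confirm that the destination really has enough free space. The following is a minimal sketch, assuming the size of the source in bytes is already known (for a block device, sudo blockdev --getsize64 /dev/sda reports it). The helper name has_room is hypothetical and not part of any package.

```shell
# has_room NEEDED_BYTES DIR
# Succeeds if the filesystem holding DIR has at least NEEDED_BYTES available.
# (Hypothetical helper for illustration.)
has_room() {
    needed_bytes=$1
    dir=$2
    # df -Pk prints one POSIX-format line per filesystem; column 4 is the
    # space available in 1024-byte blocks.
    avail_kib=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
    # Compare in awk so that very large values do not overflow shell arithmetic.
    awk -v need="$needed_bytes" -v kib="$avail_kib" \
        'BEGIN { exit !(kib * 1024 >= need) }'
}

# Example: is there room for a 1 MiB image under /tmp?
if has_room 1048576 /tmp; then
    echo "enough space"
else
    echo "not enough space"
fi
```

Remember that imaging a device and then extracting files from the image needs room for both the image and the recovered files.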

An Ubuntu Desktop CD will work fine. If you do not have a lot of RAM, or do not have an internet connection on the failed computer, you can use SystemRescueCd, a live CD data recovery toolkit. It includes most of the software mentioned on this page.

[More info needed here: Is this page covering two separate scenarios: one for lost partitions, and one for damaged or deleted files? Or are the techniques described for damaged or deleted files ALSO needed if a partition has been inadvertently lost? If they are separate scenarios, it would be good to say that here.]

Lost Partition

If you made a mistake while partitioning and the partition no longer appears in the partition table, so long as you have not written data in that space, all your data is still there.

GNU Parted

Run Parted from the command line to recover your partition.

When changing the partition table on your hard drive, you must ensure that no partition on the disk is mounted. This includes swap space. The easiest way to accomplish this is to run the live CD. Parted is installed on the base Ubuntu system. Once at the desktop, open a terminal and run:

sudo swapoff -a

Next run parted and tell it to use the device in question. For example, if your /dev/sda drive is the drive from which you want to recover, run:

sudo parted /dev/sda

Then, use the rescue option:

rescue START END

where START is the area of the disk where you believe the partition began and END is where you believe it ended. If parted finds a potential partition, it will ask you if you want to add it to the partition table.

Testdisk

Alternatively, the testdisk application may recover your partition. Use any method to install the testdisk package.

Run testdisk and it will scan your computer for media and offer you a menu-driven way to recover your partition.

sudo testdisk

Gpart

Another program that can scan drives and re-create a partition table based on "guesses" is Gpart. Use any method to install the package gpart.

To scan the first hard disk using default settings, type

sudo gpart /dev/sda

or

sudo gpart /dev/hda

depending on your Ubuntu version.

You can restore the "guessed" partition table, only after checking it very carefully (you're strongly advised to write to another device instead), using

sudo gpart -W /dev/sda /dev/sda

Imaging a damaged device, filesystem or drive

Software choices

There are two different programs for making an image of a damaged device, in preparation for rescuing files. Confusingly, their names are almost identical:

GNU ddrescue (packaged as gddrescue, though once installed the command is "ddrescue")
- This is the one you want. This documentation currently only applies to GNU ddrescue.
dd_rescue (packaged as ddrescue)
- This is an older, slower program that needs to be run in combination with a helper script (dd_rhelp) to do the same thing as the GNU version.

From /usr/share/doc/gddrescue/README

GNU ddrescue is a data recovery tool. It copies data from one file or block device (hard disc, cdrom, etc) to another, trying hard to rescue data in case of read errors.

Ddrescue does not truncate the output file if not asked to. So, every time you run it on the same output file, using a logfile, it tries to fill in the gaps.

The basic operation of ddrescue is fully automatic. That is, you don't have to wait for an error, stop the program, read the log, run it in reverse mode, etc.

If you use the logfile feature of ddrescue, the data is rescued very efficiently (only the needed blocks are read). Also you can interrupt the rescue at any time and resume it later at the same point.

Automatic merging of backups: If you have two or more damaged copies of a file, cdrom, etc, and run ddrescue on all of them, one at a time, with the same output file, you will probably obtain a complete and error-free file. This is so because the probability of having damaged areas at the same places on different input files is very low. Using the logfile, only the needed blocks are read from the second and successive copies.

If the filesystem you are imaging is larger than 4 GiB, you will not be able to store the image on an MS-DOS (VFAT) filesystem, the kind usually found on USB drives, because a single file on such filesystems cannot reach 4 GiB. Use ext3 or another filesystem that can handle such file sizes.
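The limit can be checked before imaging starts. This is a small sketch, assuming the size of the source in bytes is known (for example from sudo blockdev --getsize64 /dev/sda). The helper name fits_on_fat is made up for illustration. The largest file a FAT32 filesystem can hold is 4 GiB minus one byte, that is 4294967295 bytes.

```shell
# Largest file size FAT32 can store: 2^32 - 1 bytes.
FAT32_MAX_FILE_BYTES=4294967295

# fits_on_fat SIZE_BYTES
# Succeeds if an image of SIZE_BYTES could be stored as one file on FAT32.
# (Hypothetical helper for illustration.)
fits_on_fat() {
    awk -v size="$1" -v max="$FAT32_MAX_FILE_BYTES" \
        'BEGIN { exit !(size <= max) }'
}

for size in 2000000000 4294967295 4294967296; do
    if fits_on_fat "$size"; then
        echo "$size bytes: fits in one file on FAT32"
    else
        echo "$size bytes: too large for FAT32, use ext3 or similar"
    fi
done
```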

Use any method to install the following package:

gddrescue

Run GNU ddrescue like this:

ddrescue [options] infile outfile [logfile]

So, if /dev/sda is unreadable, you will need to acquire another disk (or other media) onto which to save the output image. You will need to have more room on the new media than on the failed disk.

sudo ddrescue -r 3 /dev/sda /media/usbdrive/image /media/usbdrive/logfile

Run successive passes like this:

sudo ddrescue -r 3 -C /dev/sda /media/usbdrive/image /media/usbdrive/logfile

and GNU ddrescue will use the log file to read only the gaps left by errors. In both cases, the -r option sets the number of times ddrescue retries a read when it encounters an error (-1 = infinity).

From Forensics Wiki:

First you copy as much data as possible, without retrying or splitting sectors:

ddrescue --no-split /dev/hda1 imagefile logfile

Now let it retry previous errors 3 times, using uncached reads:

ddrescue --direct --max-retries=3 /dev/hda1 imagefile logfile

If that fails you can try again but retrimmed, so it tries to reread full sectors:

ddrescue --direct --retrim --max-retries=3 /dev/hda1 imagefile logfile

Other examples:

These two examples are taken directly from the ddrescue info pages.

Example 1: Rescue an ext2 partition in /dev/hda2 to /dev/hdb2

ddrescue -r3 /dev/hda2 /dev/hdb2 logfile
e2fsck -v -f /dev/hdb2
mount -t ext2 -o ro /dev/hdb2 /mnt

Example 2: Rescue a CD-ROM in /dev/cdrom

ddrescue -b 2048 /dev/cdrom cdimage logfile

write cdimage to a blank CD-ROM

Ran out of space while imaging the drive?

Using GNU ddrescue with a log file, you can continue imaging to another drive and then span the images. In this example, you have imaged some of the drive to a file on one drive, and the rest of the drive to a file on another drive. Here is how you put the pieces together:

sudo losetup /dev/loop1 /media/Drive1/image
sudo losetup /dev/loop2 /media/Drive2/image
sudo mdadm -B /dev/md0 -l linear -n 2 /dev/loop1 /dev/loop2

Your complete image will be found at /dev/md0.

And then to take the array down:

sudo mdadm -S /dev/md0
sudo losetup -d /dev/loop1
sudo losetup -d /dev/loop2

A separate disk space concern is that /var/log/kern.log and /var/log/syslog can become extremely large due to frequent I/O errors. For a severely problematic drive, it may be necessary to replace these logs with links to /dev/null while ddrescue is running, to prevent the / filesystem from becoming full.

Extract filesystem from recovered image

Now that the drive has been imaged, you can recover the filesystem from the image. If the filesystem is not recoverable, you can try to recover individual files.

Mounting partitions on the image

If you imaged the whole drive, you can mount the individual partitions on the image by using the "offset" option when mounting a loop filesystem. The Sleuth Kit can help.

Use any method to install the following package:

sleuthkit

mmls can show you the partitions found within an image:

$ mmls file -b
DOS Partition Table
Offset Sector: 0
Units are in 512-byte sectors

     Slot    Start        End          Length       Size    Description
00:  -----   0000000000   0000000000   0000000001   0512B   Primary Table (#0)
01:  -----   0000000001   0000000031   0000000031   0015K   Unallocated
02:  00:01   0000000032   0001646591   0001646560   0803M   DOS FAT16 (0x06)
03:  00:00   0001646592   0002013183   0000366592   0179M   DOS FAT16 (0x06)

This shows several partitions. In this example, we want to mount the DOS partition starting at sector 32. To calculate the offset in bytes, multiply by the 512-byte sector size:

$ bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'. 
32 * 512
16384
quit

Mount the partition:

sudo mount -o loop,offset=16384 file mnt

(32 multiplied by 512 byte blocks = 16384)

For mounting a typical NTFS partition created by Windows use:

sudo mount -t ntfs -o ro,force,loop,offset=32256 file mnt

(63 multiplied by 512 byte blocks = 32256)
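The same arithmetic can be done by the shell itself. This is a sketch, assuming mmls was run with -b so that its output has the columns shown in the listing earlier in this section, and assuming 512-byte sectors as mmls reports. The function names sector_to_bytes and partition_offsets are hypothetical.

```shell
# sector_to_bytes START_SECTOR [SECTOR_SIZE]
# Convert a starting sector into the byte offset that mount expects.
# Pass the sector without leading zeros: the shell reads a leading 0 as octal.
sector_to_bytes() {
    echo $(( $1 * ${2:-512} ))
}

# partition_offsets
# Read "mmls -b" output on stdin and print "SLOT BYTE_OFFSET DESCRIPTION"
# for each allocated partition (rows whose Slot column looks like 00:01).
partition_offsets() {
    awk '$2 ~ /^[0-9]+:[0-9]+$/ {
        desc = $7
        for (i = 8; i <= NF; i++) desc = desc " " $i
        # $3 + 0 drops the leading zeros; %.0f avoids integer limits in printf.
        printf "%s %.0f %s\n", $2, ($3 + 0) * 512, desc
    }'
}

sector_to_bytes 32    # prints 16384
sector_to_bytes 63    # prints 32256
```

With an image named file, mmls -b file | partition_offsets would list one candidate offset= value per partition.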

Extract individual files from recovered image

Foremost

Foremost is a command-line tool which can recover files from a number of filesystems, including fat, ext3 and NTFS. It can be installed and run from the live cd.

Boot from the live CD, enable the universe repository, and use any method to install the following package:

foremost

Foremost can recover files from an image of the drive, or from the drive directly. If the drive has suffered hardware problems, use GNU ddrescue to image the drive first.

Assuming the lost files are on hda, you need to create a writable directory on another drive where you can put the recovered files. Let's say you have a big external USB drive (sdb):

sudo mkdir -p /recovery
sudo mount /dev/sdb1 /recovery
sudo mkdir /recovery/foremost

And then run foremost:

sudo foremost -i /dev/hda -o /recovery/foremost

To run foremost on an image, just substitute the filename for the device:

sudo foremost -i image -o /recovery/foremost

The recovered files will then be owned by root. Change their ownership so that you can use them:

sudo chown -R youruser:youruser /recovery/foremost

Use the -w switch to obtain only an audit of recoverable files:

sudo foremost -w -i /dev/hda -o /recovery/foremost

To recover only specific file types, use the -t switch:

sudo foremost -t jpg -i /dev/hda -o /recovery/foremost

Available types:

Filetype	Comment
jpg	Support for the JFIF and Exif formats including implementations used in modern digital cameras.
gif
png
bmp	Support for windows bmp format.
avi
exe	Support for Windows PE binaries, will extract DLL and EXE files along with their compile times.
mpg	Support for most MPEG files (must begin with 0x000001BA)
wav
riff	This will extract AVI and RIFF since they use the same file format (RIFF). Note: faster than running each separately.
wmv	Note: may also extract .wma files as they have a similar format.
mov
pdf
ole	This will grab any file using the OLE file structure. This includes PowerPoint, Word, Excel, Access, and StarWriter
doc	Note it is more efficient to run OLE as you get more bang for your buck. If you wish to ignore all other ole files then use this.
zip	Note: it will extract .jar files as well because they use a similar format. OpenOffice docs are just zipped XML files, so they are extracted as well. These include SXW, SXC, SXI, and SX? for undetermined OpenOffice files.
rar
htm
cpp	C source code detection, note this is primitive and may generate documents other than C code.
all	Run all pre-defined extraction methods. [Default if no -t is specified]

Scalpel

Scalpel is a fast file carver that reads a database of header and footer definitions and extracts matching files from a set of image files or raw device files. It is similar to foremost and may have some improvements.

By default, all file types in the database (/etc/scalpel/scalpel.conf) are commented out. To specify which file types you want to carve, edit the file and uncomment the line for each of those types.

sudo scalpel FILE -o Directory

Where FILE is the image file (or device) and Directory is the output directory.
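Uncommenting can also be scripted. This is a sketch under a stated assumption: that each rule in the configuration sits on a line starting with "#", then whitespace, then the extension (check your own copy of scalpel.conf, since that layout is assumed here). It is meant to be used on a private copy so that /etc/scalpel/scalpel.conf stays untouched. The function name enable_scalpel_type is hypothetical.

```shell
# enable_scalpel_type CONF EXT
# In the file CONF, uncomment every rule line whose first column is EXT,
# i.e. lines of the assumed form "#<whitespace>EXT<whitespace>...".
# (Hypothetical helper for illustration.)
enable_scalpel_type() {
    conf=$1
    ext=$2
    tmp=$(mktemp) || return 1
    # Drop only the leading "#"; the whitespace after it is kept.
    sed -E "s/^#([[:space:]]+${ext}[[:space:]])/\1/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Typical use (not run here):
#   cp /etc/scalpel/scalpel.conf ~/scalpel.conf
#   enable_scalpel_type ~/scalpel.conf jpg
#   sudo scalpel -c ~/scalpel.conf -o Directory FILE
```

The -c option tells scalpel to read the edited copy instead of the system-wide file.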

Magic Rescue

Magic Rescue is another program that scans for files using "magic bytes" to identify their presence and type. It can be extended to many file types using "recipes". Install the package magicrescue using any method.

Note that most of the provided recipes need other software installed to work, so open the desired recipes in /usr/share/magicrescue/recipes/ using a text editor and read the comments contained.

If you want to recover (for example) gzip files and PNG images from a partition named /dev/sda1, you can run

mkdir ~/output
sudo magicrescue -r gzip -r png -d ~/output /dev/sda1

This will write all recovered files in a directory output inside your home directory.

Photorec

Photorec is file data recovery software designed to recover lost pictures from digital camera memory or even hard disks. It has been extended to also search for headers of file types other than audio and video. It searches for 80 different types of files. Photorec is part of the Testdisk package. Use any method to install the following package:

testdisk

To run Photorec on an image file, do:

sudo photorec imagefilename

To recover files directly from a device, run photorec without any arguments and you will be given a menu of available devices.

sudo photorec

See this link for a detailed description of how to use Photorec.

recoverjpeg

This program is dedicated to identifying and recovering JPEG pictures. You can install the package recoverjpeg using any method, and then run (assuming /dev/sda1 is the partition you want to recover from)

sudo recoverjpeg /dev/sda1

Recovered files will be saved in the current working directory, with names following the pattern image*.jpg.

Ntfsprogs

NtfsUndelete can recover deleted files from an NTFS filesystem. The Windows and LiveCD versions have a very nice, intuitive GUI; the Linux version is probably more powerful but does not have a GUI front end at the moment.

Briefly, it has three modes:

"Scan" searches for deleted files and finds information about them.
"Undelete" recovers the files you select (see the note below).
"Copy" saves a portion of the MFT to a file. The ntfsundelete manual describes this mode as being for wizards only.

The best simple guide I have found so far (24-11-2010) is http://www.howtogeek.com/howto/13706/recover-deleted-files-on-an-ntfs-hard-drive-from-a-ubuntu-live-cd/

When undeleting, choose which files to undelete and where to undelete them to. By default this appears to be the desktop of the OS you are booted into, whether that is a LiveCD or a different partition or drive. On a LiveCD or LiveUSB you will need to move the files onto a USB stick or a safe partition before rebooting, as the desktop is not preserved on LiveCDs unless you are using a "persistent image".

To search

ntfsundelete /dev/sda2

To undelete

ntfsundelete /dev/sda2 -u -i 3689 -o work.doc -d ~/output

This will write the recovered file, named work.doc, to a directory named output inside your home directory.

For better information on using ntfsundelete please see the separate page NtfsUndelete, particularly the External Links there.

Sleuth Kit and Autopsy

(Obtained the following from http://www.sleuthkit.org/autopsy/desc.php)

The Autopsy Forensic Browser is a graphical interface to the command line digital investigation analysis tools in The Sleuth Kit. Together, they can analyze Windows and UNIX disks and file systems (NTFS, FAT, UFS1/2, Ext2/3).

The Sleuth Kit and Autopsy are both Open Source and run on UNIX platforms. As Autopsy is HTML-based, you can connect to the Autopsy server from any platform using an HTML browser. Autopsy provides a "File Manager"-like interface and shows details about deleted data and file system structures.

Autopsy

Autopsy can be run from the "live" CD, but you must specify an address to which you can connect remotely. You must also specify an external disk on which it can save the extracted information.

For example, assuming you have an external disk mounted to /media/disk with an autopsy folder on it and your IP address is 192.168.0.1, you can run:

sudo autopsy -d /media/disk/autopsy 192.168.0.1

Sleuthkit

dls - Extract unallocated (deleted) blocks from a disk or disk image. Sleuth Kit 3 and later ship this tool under the name blkls.

Example:

dls inputimage > outputimage

Use any data carving tool to search the output image for files.

fls - List file and directory names in a forensic image. fls lists the files and directory names in the image, including recently deleted files, for the directory with the given inode. Deleted entries are marked with an asterisk. If you have imaged your filesystem to a file named "loopfile", you can list the contents by running:

fls loopfile -r -f fat -i raw
r/r 3: test (Volume Label Entry)
r/r * 5: sample.docx
r/r * 7: sample.pptx
r/r * 9: sample.xlsx

icat - Copy a file by inode. icat opens the named image(s) and copies the file with the specified inode number to standard output.

Example:

fls has shown you the inode number of some files on an image. To recover a file by using the inode number, run:

icat -r -f fat -i raw loopfile 5 > sample.docx
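The fls and icat steps can be chained so that every deleted file is copied out in one pass. This is a minimal sketch, assuming a raw image of a FAT filesystem as in the examples in this section and fls output in the format shown there. The function names list_deleted and recover_deleted are hypothetical.

```shell
# list_deleted
# Read fls output on stdin and print "INODE<TAB>NAME" for each deleted
# regular file (entries flagged with "*").
list_deleted() {
    awk '{
        sub(/^[+ ]+/, "")                 # drop the "+" depth markers of fls -r
        if ($2 != "*" || $1 !~ /^r\//) next
        inode = $3
        sub(/:$/, "", inode)              # "5:" -> "5"
        name = $0
        sub(/^[^:]*:[ \t]*/, "", name)    # keep only the text after the inode
        print inode "\t" name
    }'
}

# recover_deleted IMAGE OUTDIR
# Copy every deleted file that fls reports in IMAGE into OUTDIR using icat.
recover_deleted() {
    image=$1
    outdir=$2
    mkdir -p "$outdir"
    tab=$(printf '\t')
    fls -r -f fat -i raw "$image" | list_deleted |
    while IFS=$tab read -r inode name; do
        # Prefix the inode so files with the same name do not overwrite each other.
        icat -r -f fat -i raw "$image" "$inode" > "$outdir/$inode-$(basename "$name")"
    done
}

# Typical use (not run here): recover_deleted loopfile /media/disk/recovered
```

OUTDIR should be on a different device from the one being recovered.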

sorter - Sort files in an image into categories based on file type. Sorter is a Perl script that analyzes a file system to organize the allocated and unallocated files by file type.

Example: This will sort all the files found in /dev/sdc1 and put image files in a directory named "out":

sudo sorter -h -s -i raw -f fat -d out -C /usr/share/sleuthkit/windows.sort /dev/sdc1

Here is a description of a script that will pull all files from an image using fls and icat:

http://forums.gentoo.org/viewtopic-t-365703.html

Another, similar script which attempts to "rebuild" the filesystem directory structure plus file content:

http://matt.matzi.org.uk/2008/07/03/reconstructing-heavily-damaged-hard-drives/

Cleaning up

From: How to recover lost files after you accidentally wipe your hard drive

Sort certain types of files:

mkdir -p recovery/VID recovery/JPG

find recovery/ -name "*.avi" -exec mv {} recovery/VID/ \;

find recovery/ -name "*.mpg" -exec mv {} recovery/VID/ \;

find recovery/ -name "*.jpg" -exec mv {} recovery/JPG/ \;

Eliminate small photos:

mkdir -p recovery/SMALL

find recovery/JPG/ -name "*.jpg" -size -1024k -exec mv {} recovery/SMALL/ \;

Rename jpegs according to exif data:

find recovery/JPG/ -name "*.jpg" -exec jhead -nf%Y%m%d-%H%M%S {} \;

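Then, remove duplicates. When two files share a timestamp, jhead appends a letter to the new name, so files whose names end in "a" are likely duplicates:

mkdir -p recovery/JPG/DUPS
find recovery/JPG/ -maxdepth 1 -name "*a.jpg" -exec mv {} recovery/JPG/DUPS/ \;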
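Two photos with the same timestamp are not necessarily identical, so it can be worth confirming duplicates by content as well. This is a minimal sketch using md5sum, assuming file names without newlines or backslashes (md5sum escapes those in its output). The function name list_duplicates is hypothetical.

```shell
# list_duplicates DIR
# Print the path of every file under DIR whose content matches another file
# that sorts before it, so one copy of each set is left unlisted.
# (Hypothetical helper for illustration.)
list_duplicates() {
    find "$1" -type f -exec md5sum {} + |
        sort |
        # md5sum prints "<hash><2 separator chars><path>"; print the path
        # for the second and later occurrences of a hash.
        awk 'seen[$1]++ { print substr($0, length($1) + 3) }'
}

# Typical use (not run here): move the extra copies aside for review.
#   mkdir -p recovery/DUPS
#   list_duplicates recovery/JPG | while IFS= read -r f; do
#       mv "$f" recovery/DUPS/
#   done
```

Moving the extra copies aside instead of deleting them leaves room to check the result by eye.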

Copy files with matching strings:

cd recovery
mkdir -p ../copy/
grep -l "enter the string of text here" *.doc | xargs -I {} cp {} ../copy/

Prevention

The best way to avoid data loss is by performing regular backups. See: BackupYourSystem
