Hardware and RAID configuration:
8 x 1.5 TB drives configured as RAID 6.
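As a quick check of what this array can hold, the usable capacity of a RAID 6 set can be estimated with the sketch below. RAID 6 reserves the equivalent of two drives for parity. The function name is hypothetical, and the figure ignores file system and formatting overhead.

```python
def raid6_usable_tb(drive_count, drive_size_tb):
    """Estimate usable RAID 6 capacity: two drives' worth of space holds parity."""
    if drive_count < 4:
        raise ValueError("RAID 6 needs at least 4 drives")
    return (drive_count - 2) * drive_size_tb


# The array in this case: 8 drives of 1.5 TB each.
print(raid6_usable_tb(8, 1.5))  # 9.0
```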
Problem:
The storage system failed to boot properly. A backup had been run the day before the crash,
but most of the backup data was corrupt. There were 3 LUNs
and about 3-4 virtual machines.
The customer needs only a small set of data whose backup is either corrupt or not available.
Diagnosis:
- The RAID configuration is quickly determined using data patterns. The RAID seems OK;
all drives store the most current data.
- The host file system is Ext4.
There are 4 LUNs, which appear as 4 separate iSCSI files in the Ext4 volume:
- One LUN is empty.
- One has another Ext4 volume inside and stores only trivial data.
- Inside each of the two remaining LUNs, there is a VMFS
volume which hosts a few virtual machines.
- Further analysis shows that none of the virtual disks found contains the desired data.
This leads to the assumption that either 1) the target virtual disk (VMDK
file) was deleted or 2) its metadata was corrupt.
- Fortunately, the metadata of the desired data is found in the trace, so recovery
by file carving seems possible.
Solution:
- Eliminate all virtualization layers: using a special filter, the VMFS volume storing
the lost VMDK file is now seen as a simple virtual partition on a single virtual drive.
- Compute the segment profile: File Scavenger® is configured to run another trace.
It compiles the profile for all VMFS data blocks. Blocks already claimed by existing files are eliminated.
- Match and sort the segments: this is the main part of the file carving method.
A separate program picks all potential blocks and gives each of them points
depending on the likelihood of matching. Then, it assembles contiguous blocks
into segments. It also fills the gaps by using data inference methods. Missing
segments are filled with zeroes. This process is fine-tuned for each project, because
the criteria for matching may differ.
- Rebuild the lost VMDK runs: at this point, approximately 87% of the lost VMDK file is located.
The offsets are converted back to regular file runs, which File Scavenger® can parse
correctly.
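The matching, assembling and run-rebuilding steps above can be sketched roughly as follows. This is a minimal illustration, not the program used in this job. All names are hypothetical. The points for each candidate block are taken as given input, since the matching criteria are tuned for each project, and the data inference step for filling gaps is left out. Gaps are simply marked as zero-filled.

```python
from dataclasses import dataclass
from typing import Optional

BLOCK_SIZE = 1 << 20  # 1 MB, the VMFS block size mentioned in this write-up


@dataclass
class Run:
    file_block: int              # first block index inside the rebuilt file
    volume_block: Optional[int]  # first block index on the volume; None = zero fill
    count: int                   # number of consecutive blocks in this run


def best_matches(candidates, claimed, min_points):
    """Keep, for each file block, the highest-scoring unclaimed volume block.

    candidates: iterable of (file_block, volume_block, points) tuples.
    claimed: set of volume blocks already owned by existing files.
    min_points: assumed project-specific threshold below which a match is dropped.
    """
    best = {}
    for file_block, volume_block, points in candidates:
        if volume_block in claimed or points < min_points:
            continue
        if file_block not in best or points > best[file_block][1]:
            best[file_block] = (volume_block, points)
    return {fb: vb for fb, (vb, _) in best.items()}


def build_runs(matches, total_file_blocks):
    """Merge contiguous blocks into runs; unmatched blocks become zero-filled runs."""
    runs = []
    for fb in range(total_file_blocks):
        vb = matches.get(fb)
        last = runs[-1] if runs else None
        if last is not None and vb is None and last.volume_block is None:
            last.count += 1  # extend a zero-filled gap
        elif (last is not None and vb is not None
              and last.volume_block is not None
              and last.volume_block + last.count == vb):
            last.count += 1  # extend a contiguous data segment
        else:
            runs.append(Run(fb, vb, 1))
    return runs


def recovered_fraction(runs):
    """Share of the file's blocks that were located on the volume."""
    total = sum(r.count for r in runs)
    found = sum(r.count for r in runs if r.volume_block is not None)
    return found / total if total else 0.0


# Made-up example: an 8-block file, volume block 77 is claimed by another file.
candidates = [(0, 100, 9), (1, 101, 8), (1, 500, 3), (2, 102, 7),
              (2, 77, 10), (3, 900, 2), (4, 40, 9), (5, 41, 9)]
runs = build_runs(best_matches(candidates, claimed={77}, min_points=5), 8)
for r in runs:
    print(r)
print(recovered_fraction(runs))  # 0.625
```

Multiplying a run's block indices and count by BLOCK_SIZE gives the byte offsets and length of the corresponding file run.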
Result:
- The lost VMDK is restored and manually verified. Most data is OK, but the
desired data set is mostly corrupt.
- The whole process is checked once again. No errors are found. One possible explanation
is that the files were already corrupt before the RAID failure; then the VMFS volume
crashed and caused the VMDK file to disappear. In that case, we successfully restored an already-corrupt VMDK file.
- The customer rejects the result and the deposit is refunded.
Conclusion:
Despite the failure, this job once again proves the power of file carving.
It works well when the underlying file system is either NTFS or Ext3/4, and the virtual disk’s
allocation block (or segment) size is fairly large (for example, 1 MB in the VMFS volume). Smaller
segment sizes still work, but the results will not be as good.