Have you ever come across mentions of a “file carving” technique used to recover information from formatted, corrupted or inaccessible disks? File carving (or data carving) originated as a technique used by forensic specialists to recover evidence a suspect attempted to destroy. In the context of data recovery tools, the term is almost never used correctly. In this article, we’ll discuss the data carving technique and explain how it differs from the more commonly available signature search analysis.
What Is Data Carving?
Data carving is a technique used by forensic investigators to recover fragmented files from hard disks with missing, damaged or corrupted file system data. The technique is based on a variation of signature search, an algorithm commonly used in many commercially available data recovery tools such as The Undelete or HDD Recovery Pro. With signature search, a data recovery tool can scan the entire surface of the hard disk looking for characteristic signatures of known file types, detecting and analyzing their headers in order to determine the exact file length.
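To make the idea of a signature scan concrete, here is a minimal Python sketch. It is an illustration only and does not describe how The Undelete or HDD Recovery Pro are implemented. The SIGNATURES table and the find_signatures function are names made up for this example, and the disk image is assumed to be already loaded into memory as a bytes object.

```python
# Hypothetical signature table for this example: file type -> the "magic"
# bytes that files of this type start with.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "pdf": b"%PDF-",
}


def find_signatures(image: bytes):
    """Return a sorted list of (offset, file_type) for every signature hit."""
    hits = []
    for file_type, magic in SIGNATURES.items():
        start = 0
        while True:
            pos = image.find(magic, start)
            if pos == -1:
                break
            hits.append((pos, file_type))
            start = pos + 1  # keep searching after this hit
    return sorted(hits)
```

A real tool would read the disk in chunks rather than holding it in memory, and would usually check signatures only at sector or cluster boundaries to cut down on false hits.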
While the idea of file carving is similar to that of signature search algorithms, the carving approach is fundamentally different in that it relies heavily on heuristics to reassemble the files being recovered from multiple fragments. In the absence of the actual file system, assembling a file from multiple fragments scattered all over the disk is far from trivial. The file carving process makes extensive use of information obtained from known file structures, analyzing the content of the file and applying heuristic algorithms in order to reconstruct a working copy of a file from multiple fragments.
Signature Search vs. File Carving
Now, signature search algorithms also analyze information obtained from known file structures in order to recover files. What’s the difference?
Signature search algorithms analyze only limited information about a file, such as its header, in order to determine the length of the file. Signature search assumes that the file was stored in a single contiguous chunk, which is not always the case because of disk fragmentation.
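The contiguity assumption is easy to see in code. The sketch below uses the BMP format, whose header stores the total file size as a 4-byte little-endian integer right after the "BM" signature. The function name recover_bmp_contiguous is hypothetical, and the code is a simplified illustration rather than the logic of any particular product.

```python
import struct


def recover_bmp_contiguous(image: bytes, offset: int) -> bytes:
    """Recover a BMP file at `offset`, assuming it is stored contiguously."""
    if image[offset:offset + 2] != b"BM":
        raise ValueError("no BMP signature at this offset")
    # The BMP header keeps the total file size at bytes 2..5 (little-endian).
    (size,) = struct.unpack_from("<I", image, offset + 2)
    # Assumption: the next `size` bytes all belong to this file.
    return image[offset:offset + size]
```

If the file happens to be fragmented, the slice simply picks up whatever foreign data sits between the fragments, and the recovered file comes out damaged. Nothing in this approach can detect or repair that.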
File carving algorithms were specifically designed to take disk fragmentation into account. The file carving technique takes into account all types of information contained in files, including file headers and the actual file content. The technique makes extensive use of heuristics regarding the typical structures of known file formats as well as how file systems handle fragmented data. By using information obtained from multiple sources, the file carving process can guess which fragments belong together, effectively reconstructing fragmented files from multiple pieces.
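As a rough illustration of how file content can guide reassembly, here is a sketch of one simple strategy, sometimes called bifragment gap carving in the forensic literature. It assumes the file was split into exactly two fragments, that the blocks holding its beginning and end are already known, and that a candidate can be validated from its internal structure. The example uses PNG because every PNG chunk carries a CRC. The names is_valid_png and carve_bifragment are made up for this sketch, and real carvers combine many more heuristics than this.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def is_valid_png(data: bytes) -> bool:
    """Structural check only: verify every chunk CRC up to the IEND chunk."""
    if not data.startswith(PNG_MAGIC):
        return False
    pos = len(PNG_MAGIC)
    while pos + 12 <= len(data):
        (length,) = struct.unpack_from(">I", data, pos)
        chunk_type = data[pos + 4:pos + 8]
        end = pos + 8 + length  # end of chunk data, start of the CRC field
        if end + 4 > len(data):
            return False
        (crc,) = struct.unpack_from(">I", data, end)
        if zlib.crc32(data[pos + 4:end]) != crc:
            return False
        if chunk_type == b"IEND":
            return True  # bytes after IEND (block slack) are ignored
        pos = end + 4
    return False


def carve_bifragment(blocks, first, last):
    """Rebuild a file that starts in blocks[first] and ends in blocks[last],
    assuming at most one run of foreign blocks lies in between."""
    span = list(range(first, last + 1))
    contiguous = b"".join(blocks[i] for i in span)
    if is_valid_png(contiguous):
        return contiguous
    # Try every possible gap, smallest first, always keeping the first
    # and the last block of the span.
    for gap_len in range(1, len(span) - 1):
        for gap_start in range(1, len(span) - gap_len):
            kept = span[:gap_start] + span[gap_start + gap_len:]
            candidate = b"".join(blocks[i] for i in kept)
            if is_valid_png(candidate):
                return candidate
    return None
```

The design choice here is to let the file format itself decide which guess is right: a wrong guess about the gap almost always breaks a chunk CRC. The price is a search that grows quadratically with the distance between the first and last block, and the approach does not cover files split into three or more fragments.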
