File Carving Explained: Recovering Deleted Data from Raw Bytes

Figure: File carving recovers deleted data by scanning raw bytes for file headers and footers.

Introduction

When a file is deleted, it rarely disappears. Most of the time it is still sitting on the drive, waiting to be either recovered or quietly overwritten. File carving is the technique investigators use to reach that data: to pull deleted photos, documents, and fragments out of unallocated space when the file system no longer points to them. This article explains file carving from the ground up: how deletion really works, what a carver does with raw bytes, why you must always work from a forensic image, and how to fix every recovered artifact in time using cryptographic hashes. If you are new to the field, our beginner's guide to evidence acquisition basics is a useful companion to this piece.

How Deletion Actually Works

Deleting a file feels final, but the data usually remains intact. On most storage media, the file system keeps two things: the contents of a file in data blocks, and a directory record that points to those blocks and gives the file a name. When you delete, the file system typically does the cheap thing — it drops the pointer and marks the blocks as free for reuse. The bytes themselves are not wiped. The file's contents stay physically on the disk, now living in unallocated space, until some later write happens to land on the same blocks and overwrite them. This gap between "deleted" and "overwritten" is exactly the window that recovery exploits, and it is why prompt, write-protected handling of a drive matters so much.
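
To make the pointer-versus-data distinction concrete, here is a minimal sketch of a toy file system in Python. The class name `ToyFileSystem`, its block size, and its allocation strategy are all assumptions made for illustration; real file systems are far more complex, but the delete path follows the same idea of dropping the pointer and leaving the bytes alone.

```python
# Toy model for illustration only; not how any real file system is implemented.
class ToyFileSystem:
    def __init__(self, block_count, block_size=8):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(block_count)]  # raw storage
        self.free = set(range(block_count))  # block numbers available for reuse
        self.directory = {}                  # file name -> list of block numbers

    def write(self, name, data):
        needed = -(-len(data) // self.block_size)  # ceiling division
        chosen = sorted(self.free)[:needed]
        if len(chosen) < needed:
            raise OSError("not enough free blocks")
        for i, block_no in enumerate(chosen):
            chunk = data[i * self.block_size:(i + 1) * self.block_size]
            self.blocks[block_no] = chunk.ljust(self.block_size, b"\x00")
            self.free.discard(block_no)
        self.directory[name] = chosen

    def delete(self, name):
        # The cheap path: drop the pointer and mark the blocks as free.
        # The block contents are deliberately left untouched.
        self.free.update(self.directory.pop(name))

    def raw(self):
        return b"".join(self.blocks)


fs = ToyFileSystem(block_count=4)
fs.write("note.txt", b"secret plan")
fs.delete("note.txt")
still_there = b"secret plan" in fs.raw()   # deleted, but not yet overwritten

fs.write("new.txt", b"X" * 16)             # a later write reuses the freed blocks
gone_now = b"secret plan" not in fs.raw()
```

In this toy run the deleted contents remain readable in the raw blocks until the second write lands on the same blocks, which mirrors the window between "deleted" and "overwritten" described above.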

What File Carving Does

Ordinary recovery tools read the file system to find files. Carving throws that map away. A file carver scans the raw bytes of a disk image from end to end, looking for the tell-tale signatures of known file types: the headers that mark where a file of a given type begins and the footers that mark where it ends. A JPEG, for example, starts with the bytes FF D8 FF and ends with FF D9; a PDF begins with the text %PDF, and a ZIP archive with the bytes 50 4B 03 04. When the carver finds a known header, it reads forward until it hits the matching footer (or a sensible size limit) and reconstructs the file from the bytes in between. Crucially, it does all of this without relying on the file system, so it can recover data whose directory entries are long gone, whose file system is corrupted, or that never had a directory entry at all.
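
The header-to-footer scan can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any particular carving tool: the function name `carve_jpegs`, the 20 MiB default size limit, and the in-memory `bytes` input are assumptions chosen to keep the example short. The JPEG start-of-image (FF D8) and end-of-image (FF D9) markers are the standard ones.

```python
JPEG_HEADER = b"\xff\xd8\xff"  # start-of-image marker plus the first byte of the next marker
JPEG_FOOTER = b"\xff\xd9"      # end-of-image marker


def carve_jpegs(image: bytes, max_size: int = 20 * 1024 * 1024):
    """Return (offset, data) for each header-to-footer span found in image."""
    carved = []
    pos = 0
    while True:
        start = image.find(JPEG_HEADER, pos)
        if start == -1:
            break
        # Search for the footer only within the size limit.
        end = image.find(JPEG_FOOTER, start + len(JPEG_HEADER), start + max_size)
        if end == -1:
            pos = start + 1  # no footer in range: skip this header and keep scanning
            continue
        end += len(JPEG_FOOTER)
        carved.append((start, image[start:end]))
        pos = end
    return carved


# Synthetic "disk image": two fake JPEG-like blobs surrounded by filler bytes.
fake_jpeg = JPEG_HEADER + b"\xe0" + b"pixel data" + JPEG_FOOTER
disk = b"\x00" * 100 + fake_jpeg + b"\x11" * 50 + fake_jpeg + b"\x00" * 20

results = carve_jpegs(disk)
offsets = [offset for offset, _ in results]
```

Stopping at the first footer is a deliberate simplification. A real JPEG can contain an embedded thumbnail with its own end-of-image marker, so production carvers typically parse the file structure rather than trusting the first match, and they read large images in chunks or through memory mapping instead of loading them whole.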

Always Carve from a Forensic Image, Never the Live Disk

This is the rule that separates defensible recovery from accidental destruction. The deleted data you want lives in free space — and the live operating system treats free space as fair game to write to. Browse the drive, run software, or even just plug it in unprotected, and you risk overwriting the very bytes you are trying to save. The correct workflow is: attach the source through a write blocker, create a bit-for-bit forensic image of the whole device, verify that image against the original by hash, and then carve only from a working copy of that image. The original is preserved untouched, and the recovery is repeatable. To understand why the imaging hash is the linchpin of the whole process, see the role of hashing in digital forensics.
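
The verify-then-copy part of that workflow might look like the sketch below. It assumes the image has already been acquired through a write blocker and that a SHA-256 hash was recorded at acquisition; the function names `sha256_of` and `make_working_copy` and the file name `evidence.img` are hypothetical, and the temporary directory stands in for real evidence storage.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(image_path, acquisition_hash, work_dir):
    """Verify the image against the recorded hash, then copy it for carving."""
    if sha256_of(image_path) != acquisition_hash:
        raise ValueError("image does not match the acquisition hash")
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    working = work_dir / Path(image_path).name
    shutil.copyfile(image_path, working)
    if sha256_of(working) != acquisition_hash:
        raise ValueError("working copy does not match the verified image")
    return working


# Demonstration with a stand-in image file (hypothetical name and contents).
with tempfile.TemporaryDirectory() as tmp:
    image = Path(tmp) / "evidence.img"
    image.write_bytes(b"\x00" * 4096 + b"raw sectors")
    recorded = sha256_of(image)  # in practice, recorded at acquisition time
    working = make_working_copy(image, recorded, Path(tmp) / "work")
    copy_matches = sha256_of(working) == recorded
```

Hashing the working copy a second time costs an extra read, but it confirms that the copy you carve from is byte-for-byte the image you verified.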

Hash Every Recovered Artifact to Fix It in Time

A single carving pass can spit out hundreds of recovered files. Each one is a piece of potential evidence, and each one needs its own integrity anchor. The moment an artifact is extracted, compute a cryptographic hash over it. That hash is a fingerprint that fixes the file's exact state at the moment of recovery: if anyone later asks whether a recovered document was altered, you recompute the hash and compare. e-Dex computes MD5, SHA-1, SHA-256, SHA-512 and BLAKE3 for every file and can bundle the results into an integrity certificate, fully offline, so your recovered artifacts carry verifiable proof from the instant they are carved.
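
As a rough illustration of per-artifact hashing, the sketch below uses Python's standard `hashlib` module. It is not e-Dex's implementation: the function names `fingerprint` and `is_unchanged` and the record layout are assumptions, and BLAKE3 is left out because it is not part of the standard library.

```python
import hashlib
from datetime import datetime, timezone

# BLAKE3 is omitted here: it requires a third-party package.
ALGORITHMS = ("md5", "sha1", "sha256", "sha512")


def fingerprint(name: str, data: bytes) -> dict:
    """Record the size, time of hashing, and digests of one carved artifact."""
    record = {
        "artifact": name,
        "size": len(data),
        "hashed_at": datetime.now(timezone.utc).isoformat(),
    }
    for algorithm in ALGORITHMS:
        record[algorithm] = hashlib.new(algorithm, data).hexdigest()
    return record


def is_unchanged(record: dict, data: bytes) -> bool:
    """Recompute every digest and compare it with the stored value."""
    return all(hashlib.new(a, data).hexdigest() == record[a] for a in ALGORITHMS)


original = b"\xff\xd8\xff\xe0 carved bytes \xff\xd9"
record = fingerprint("carved_000100.jpg", original)  # hypothetical artifact name

intact = is_unchanged(record, original)
altered = is_unchanged(record, original + b"\x00")   # a single extra byte
```

Here `intact` is true and `altered` is false: any change to the bytes, however small, produces different digests, which is what lets the stored record settle later questions about tampering.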

The Limits of File Carving

Carving is powerful, but it is not magic, and a careful examiner states its limits plainly. Fragmentation is the biggest one: a carver that reads straight from header to footer assumes the file's bytes are contiguous. When a file is scattered across non-adjacent blocks, the carver may stitch the wrong data together, recover a truncated file, or miss it entirely. False positives are another: a byte pattern that merely resembles a known header can produce a "file" that is partial or corrupt. And nothing recovers overwritten blocks — once the bytes are gone, they are gone — while encrypted or compressed regions usually carve as unreadable noise. Every carved artifact therefore needs validation before it is treated as reliable evidence.
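
One way to validate a carved artifact is to check its internal structure rather than just its first and last bytes. The sketch below uses PNG as the example because every PNG chunk carries its own CRC-32, which makes a self-contained check easy; the function names are hypothetical, and passing this check shows only that the chunk structure is consistent, not that the image decodes or is evidentially sound.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunks_are_valid(data: bytes) -> bool:
    """True if data is a PNG signature followed by CRC-correct chunks ending in IEND."""
    if not data.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        (length,) = struct.unpack(">I", data[pos:pos + 4])
        chunk_type = data[pos + 4:pos + 8]
        body_end = pos + 8 + length
        if body_end + 4 > len(data):
            return False  # chunk runs past the carved bytes (truncated)
        (stored_crc,) = struct.unpack(">I", data[body_end:body_end + 4])
        if zlib.crc32(data[pos + 4:body_end]) != stored_crc:
            return False  # corrupted or wrongly stitched data
        if chunk_type == b"IEND":
            return True
        pos = body_end + 4
    return False


def make_chunk(chunk_type: bytes, body: bytes) -> bytes:
    """Build a chunk for the synthetic test data below."""
    crc = zlib.crc32(chunk_type + body)
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


# Synthetic chunk structure, not a decodable image.
good = (PNG_SIGNATURE + make_chunk(b"IHDR", bytes(13))
        + make_chunk(b"IDAT", b"pixels") + make_chunk(b"IEND", b""))
damaged = good.replace(b"pixels", b"pixelz")  # simulates one overwritten byte
truncated = good[:-8]                         # simulates a carve that stopped early

checks = [png_chunks_are_valid(x) for x in (good, damaged, truncated)]
```

A fragmented file that was stitched together from the wrong blocks would typically fail the same CRC check, which is why structural validation catches several of the failure modes described above at once.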

Evidentiary Handling of Carved Data

Recovered data only helps if its handling can withstand scrutiny. Document the whole journey: the source device and how it was imaged, the verified image hash, the carving tool and settings used, and a per-file hash for every artifact you extract. Keep the original image read-only, work only on copies, and record who did what and when so the recovery sits inside a clean digital forensics chain of custody. A carved file backed by a documented image hash and a per-artifact integrity certificate is far more defensible than a loose file pulled off a drive with no provenance.
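
The documentation items listed above could be captured in a simple structured record. The following is a hypothetical layout, not a standard or an e-Dex format: the class name `CarvingRecord`, every field name, and all the values shown are placeholders for illustration.

```python
import json
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone


@dataclass
class CarvingRecord:
    source_device: str
    imaging_method: str
    image_sha256: str
    carving_tool: str
    carving_settings: dict
    artifacts: list = field(default_factory=list)  # one entry per carved file
    actions: list = field(default_factory=list)    # who did what, and when

    def log(self, examiner: str, action: str) -> None:
        self.actions.append({
            "examiner": examiner,
            "action": action,
            "at": datetime.now(timezone.utc).isoformat(),
        })


# All values below are placeholders.
record = CarvingRecord(
    source_device="example USB drive (placeholder)",
    imaging_method="bit-for-bit image via write blocker",
    image_sha256="<hash recorded at acquisition>",
    carving_tool="example-carver (placeholder)",
    carving_settings={"types": ["jpeg"], "max_size": 20 * 1024 * 1024},
)
record.artifacts.append(
    {"name": "carved_000100.jpg", "offset": 100, "sha256": "<per-file hash>"}
)
record.log("examiner A", "carved JPEG signatures from working copy")

report = json.dumps(asdict(record), indent=2)
```

Serializing the record to JSON keeps it readable by people and by tools, and the resulting file can itself be hashed so the documentation is as verifiable as the artifacts it describes.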

Frequently Asked Questions

What is file carving in digital forensics?
File carving is a recovery technique that reconstructs files directly from raw bytes on storage media, without relying on the file system. Instead of reading directory entries and pointers, a carver scans the disk image for known file signatures — the headers and footers that mark where a file type begins and ends — and reassembles the data in between. Because it ignores the file system, carving can recover files even when their directory records have been deleted, the file system is damaged, or the data sits in unallocated or slack space.

Can deleted files really be recovered?
Often, yes. On most storage media, deleting a file does not erase its contents. The file system simply drops the pointer to the data and marks those blocks as available, while the actual bytes remain until something else overwrites them. Until that overwrite happens, the data is still physically present in unallocated space and can be recovered by carving. The longer a drive is used after deletion, the more likely the blocks are reused, so prompt, write-protected handling matters.

Why must you carve from a forensic image instead of the live disk?
Working on the original live disk risks overwriting the very deleted data you are trying to recover, and any write alters the evidence. The correct practice is to create a bit-for-bit forensic image of the source using a write blocker, verify the image hash against the source, and then carve only from a working copy of that image. This preserves the original untouched and keeps the recovery repeatable and defensible.

Why should recovered artifacts be hashed?
Carving can produce hundreds of recovered files. Computing a cryptographic hash for each one the moment it is extracted fixes its state in time: the hash is a fingerprint that proves the artifact has not changed since recovery. If anyone later questions whether a recovered file was edited, you can recompute the hash and compare. e-Dex computes MD5, SHA-1, SHA-256, SHA-512 and BLAKE3 per file and produces an integrity certificate offline.

What are the limits of file carving?
Carving works best on contiguous, unfragmented files. When a file is fragmented across non-adjacent blocks, a header-to-footer carver may stitch the wrong data together or stop early. Signature matching can also produce false positives, recovering partial or corrupt files that merely look like a known type. Overwritten blocks are unrecoverable, and encrypted or compressed data usually carves as unreadable. Carving is powerful but not guaranteed; results need validation.

Conclusion

File carving turns "the file is deleted" into "the file may still be recoverable", because deletion drops a pointer, not the data. By scanning raw bytes for known headers and footers, carving rebuilds files the file system has forgotten, and by always working from a verified forensic image and hashing every recovered artifact, you keep that recovery defensible. Build your workflow on a solid integrity foundation with e-Dex, the free, offline Digital Evidence Integrity Suite, and prove your recovered data is exactly what it should be.