Commit 547f8b6

Merge pull request #1235 from HackTricks-wiki/research_update_src_generic-methodologies-and-resources_basic-forensic-methodology_partitions-file-systems-carving_file-data-carving-recovery-tools_20250804_014946
Research Update Enhanced src/generic-methodologies-and-resou...
2 parents f24f2f3 + 993201c commit 547f8b6

1 file changed

Lines changed: 74 additions & 14 deletions

File tree

src/generic-methodologies-and-resources/basic-forensic-methodology/partitions-file-systems-carving/file-data-carving-recovery-tools.md

@@ -10,27 +10,38 @@ More tools in [https://github.com/Claudio-C/awesome-datarecovery](https://github
1010

1111
The most common tool used in forensics to extract files from images is [**Autopsy**](https://www.autopsy.com/download/). Download it, install it and make it ingest the file to find "hidden" files. Note that Autopsy is built to support disk images and other kinds of images, but not simple files.
1212

13+
> **2024-2025 update** – Version **4.21** (released February 2025) added a rebuilt **carving module based on SleuthKit v4.13** that is noticeably quicker when dealing with multi-terabyte images and supports parallel extraction on multi-core systems.¹ A small CLI wrapper (`autopsycli ingest <case> <image>`) was also introduced, making it possible to script carving inside CI/CD or large-scale lab environments.
14+
15+
```bash
16+
# Create a case and ingest an evidence image from the CLI (Autopsy ≥4.21)
17+
autopsycli case --create MyCase --base /cases
18+
# ingest with the default ingest profile (includes data-carve module)
19+
autopsycli ingest MyCase /evidence/disk01.E01 --threads 8
20+
```
21+
1322
### Binwalk <a href="#binwalk" id="binwalk"></a>
1423
1524
**Binwalk** is a tool for analyzing binary files to find embedded content. It's installable via `apt` and its source is on [GitHub](https://github.com/ReFirmLabs/binwalk).
1625
1726
**Useful commands**:
1827
1928
```bash
20-
sudo apt install binwalk #Insllation
21-
binwalk file #Displays the embedded data in the given file
22-
binwalk -e file #Displays and extracts some files from the given file
23-
binwalk --dd ".*" file #Displays and extracts all files from the given file
29+
sudo apt install binwalk # Installation
30+
binwalk firmware.bin # Display embedded data
31+
binwalk -e firmware.bin # Extract recognised objects (safe-default)
32+
binwalk --dd ".*" firmware.bin      # Extract *everything* (use with care)
2433
```
2534
35+
⚠️ **Security note** – Versions **≤2.3.3** are affected by a **Path Traversal** vulnerability (CVE-2022-4510). Upgrade (or isolate with a container/non-privileged UID) before carving untrusted samples.
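As a rough pre-flight check, the version threshold from the security note can be encoded in a small shell helper. This is only a sketch: `binwalk_needs_upgrade` is a hypothetical name, it relies on GNU `sort -V` for version ordering, and it expects the version string to be supplied by hand, since how a given build reports its version is not covered here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds (exit 0) when the supplied version string
# sorts at or below 2.3.3, the last release the note above flags as affected.
binwalk_needs_upgrade() {
    local version=${1#v}          # tolerate a leading "v", e.g. "v2.3.2"
    local last_affected=2.3.3
    # sort -V orders version strings; if the highest of the pair is still
    # 2.3.3, the supplied version is not newer than the affected range.
    [ "$(printf '%s\n%s\n' "$version" "$last_affected" | sort -V | tail -n1)" = "$last_affected" ]
}
```

For example, `binwalk_needs_upgrade 2.3.2 && echo "upgrade or sandbox first"` would print the warning, while a `2.3.4` argument would not.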
36+
2637
### Foremost
2738
2839
Another common tool to find hidden files is **foremost**. You can find the configuration file of foremost in `/etc/foremost.conf`. If you just want to search for some specific files uncomment them. If you don't uncomment anything foremost will search for its default configured file types.
2940
3041
```bash
3142
sudo apt-get install foremost
3243
foremost -v -i file.img -o output
33-
#Discovered files will appear inside the folder "output"
44+
# Discovered files will appear inside the folder "output"
3445
```
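Signature-based carvers can recover the same content more than once, so a de-duplication pass over the output folder is often worthwhile. The sketch below is not part of foremost; `list_duplicate_carves` is a hypothetical helper that assumes GNU coreutils (`sha256sum`) and file names without embedded newlines.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every carved file whose SHA-256 digest was
# already seen, i.e. the second and later copies of identical content.
list_duplicate_carves() {
    local dir=$1
    find "$dir" -type f -exec sha256sum {} + |
        sort |                                   # group identical digests together
        awk 'seen[$1]++ { print substr($0, 67) }' # 64 hex chars + 2 spaces precede the path
}
```

Running `list_duplicate_carves output` after the foremost command above would list the redundant copies, which can then be reviewed before deletion.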
3546
3647
### **Scalpel**
@@ -42,26 +53,62 @@ sudo apt-get install scalpel
4253
scalpel file.img -o output
4354
```
4455
45-
### Bulk Extractor
56+
### Bulk Extractor 2.x
4657
47-
This tool comes inside kali but you can find it here: [https://github.com/simsong/bulk_extractor](https://github.com/simsong/bulk_extractor)
58+
This tool ships with Kali, but you can also find it here: <https://github.com/simsong/bulk_extractor>
4859
49-
This tool can scan an image and will **extract pcaps** inside it, **network information (URLs, domains, IPs, MACs, mails)** and more **files**. You only have to do:
60+
Bulk Extractor can scan an evidence image and carve **pcap fragments**, **network artefacts (URLs, domains, IPs, MACs, e-mails)** and many other objects **in parallel using multiple scanners**.
5061
51-
```
52-
bulk_extractor memory.img -o out_folder
62+
```bash
63+
# Build from source – v2.1.1 (April 2024) requires cmake ≥3.16
64+
git clone https://github.com/simsong/bulk_extractor.git && cd bulk_extractor
65+
mkdir build && cd build && cmake .. && make -j$(nproc) && sudo make install
66+
67+
# Run every scanner, carve JPEGs aggressively and generate a bodyfile
68+
bulk_extractor -o out_folder -S jpeg_carve_mode=2 -S write_bodyfile=y /evidence/disk.img
5369
```
5470
55-
Navigate through **all the information** that the tool has gathered (passwords?), **analyse** the **packets** (read[ **Pcaps analysis**](../pcap-inspection/index.html)), search for **weird domains** (domains related to **malware** or **non-existent**).
71+
Useful post-processing scripts (`bulk_diff`, `bulk_extractor_reader.py`) can de-duplicate artefacts between two images or convert results to JSON for SIEM ingestion.
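A quick way to triage the output is to rank the most frequent entries in a feature file. The sketch below assumes the usual feature-file layout (tab-separated offset, feature, context, with `#` comment lines) and uses a hypothetical helper name, `top_features`; the path `out_folder/domain.txt` in the usage note is likewise an assumption about which scanners were enabled.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print "count<TAB>feature" for the most frequent
# features in a bulk_extractor feature file (default: top 10).
top_features() {
    local feature_file=$1 limit=${2:-10}
    grep -v '^#' "$feature_file" |   # drop banner/comment lines
        cut -f2 |                    # keep only the feature column
        sort | uniq -c |             # count identical features
        sort -k1,1nr -k2 |           # highest count first, then alphabetical
        awk '{ count = $1; sub(/^ *[0-9]+ /, ""); print count "\t" $0 }' |
        head -n "$limit"
}
```

For instance, `top_features out_folder/domain.txt 20` would list the twenty most common domains, which is a reasonable starting point when hunting for unusual or malware-related hosts.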
5672
5773
### PhotoRec
5874
59-
You can find it in [https://www.cgsecurity.org/wiki/TestDisk_Download](https://www.cgsecurity.org/wiki/TestDisk_Download)
75+
You can find it in <https://www.cgsecurity.org/wiki/TestDisk_Download>
6076
6177
It comes with GUI and CLI versions. You can select the **file-types** you want PhotoRec to search for.
6278
6379
![](<../../../images/image (242).png>)
6480
81+
### ddrescue + ddrescueview (imaging failing drives)
82+
83+
When a physical drive is unstable, it is best practice to **image it first** and only run carving tools against the image. `ddrescue` (GNU project) focuses on reliably copying bad disks while keeping a log of unreadable sectors.
84+
85+
```bash
86+
sudo apt install gddrescue ddrescueview # On Debian-based systems
87+
# First pass – try to get as much data as possible without retries
88+
sudo ddrescue -f -n /dev/sdX suspect.img suspect.log
89+
# Second pass – aggressive, 3 retries on the remaining bad areas
90+
sudo ddrescue -d -r3 /dev/sdX suspect.img suspect.log
91+
92+
# Visualise the status map (green=good, red=bad)
93+
ddrescueview suspect.log
94+
```
95+
96+
The **`-c`/`--cluster-size`** option, which sets how many sectors are copied per read, can be tuned to speed up imaging of high-capacity SSDs where traditional sector sizes no longer align with flash blocks.
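The log (mapfile) written by the commands above can also be summarised without a GUI. The following is a minimal sketch, assuming the documented mapfile layout (`#` comments, one current-position line, then `pos size status` rows in hexadecimal, where `+` marks rescued areas and `-` marks bad sectors); `summarise_mapfile` is a hypothetical helper name, not a ddrescue command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: total the bytes per state in a ddrescue mapfile.
summarise_mapfile() {
    local mapfile=$1 pos size status extra
    local seen_status=0 rescued=0 bad=0 pending=0
    while read -r pos size status extra; do
        [[ -z $pos || $pos == \#* ]] && continue   # skip blank and comment lines
        if (( ! seen_status )); then               # first data line is the
            seen_status=1                          # current position/status line
            continue
        fi
        case $status in
            +) rescued=$(( rescued + size )) ;;    # finished (rescued) block
            -) bad=$(( bad + size )) ;;            # bad-sector block
            *) pending=$(( pending + size )) ;;    # non-tried/non-trimmed/non-scraped
        esac
    done < "$mapfile"
    printf 'rescued=%d bad=%d pending=%d\n' "$rescued" "$bad" "$pending"
}
```

Calling `summarise_mapfile suspect.log` between passes gives a quick indication of whether another retry pass is likely to be worth the wear on the failing drive.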
97+
98+
### Extundelete / Ext4magic (EXT 3/4 undelete)
99+
100+
If the source file system is Linux EXT-based you may be able to recover recently deleted files **without full carving**. Both tools work directly on a read-only image:
101+
102+
```bash
103+
# Attempt journal-based undelete (metadata must still be present)
104+
extundelete disk.img --restore-all
105+
106+
# Fallback to full directory scan; supports extents and inline data
107+
ext4magic disk.img -M -f '*.jpg' -d ./recovered
108+
```
109+
110+
> 🛈 If the file system was mounted after deletion, the data blocks may have already been reused – in that case proper carving (Foremost/Scalpel) is still required.
111+
65112
### binvis
66113
67114
Check the [code](https://code.google.com/archive/p/binvis/) and the [web page tool](https://binvis.io/#/).
@@ -87,12 +134,25 @@ Searches for AES keys by searching for their key schedules. Able to find 128. 19
87134
88135
Download [here](https://sourceforge.net/projects/findaes/).
89136
137+
### YARA-X (triaging carved artefacts)
138+
139+
[YARA-X](https://github.com/VirusTotal/yara-x) is a Rust rewrite of YARA released in 2024. It is **10-30× faster** than classic YARA and can be used to classify thousands of carved objects very quickly:
140+
141+
```bash
142+
# Scan every carved object produced by bulk_extractor
143+
yr scan --recursive --threads 8 --print-meta rules/index.yar out_folder/
144+
```
145+
146+
The speed‐up makes it realistic to **auto-tag** all carved files in large-scale investigations.
147+
90148
## Complementary tools
91149
92-
You can use [**viu** ](https://github.com/atanunq/viu)to see images from the terminal.\
150+
You can use [**viu**](https://github.com/atanunq/viu) to see images from the terminal.\
93151
You can use the linux command line tool **pdftotext** to transform a pdf into text and read it.
94152
95-
{{#include ../../../banners/hacktricks-training.md}}
96153
97154
155+
## References
98156
157+
1. Autopsy 4.21 release notes – <https://github.com/sleuthkit/autopsy/releases/tag/autopsy-4.21>
158+
{{#include ../../../banners/hacktricks-training.md}}
