test(jsonl): add jsonl_parser schema coverage and harden null usage#22
test(jsonl): add jsonl_parser schema coverage and harden null usage#22
Conversation
Add tests/test_jsonl_parser.py covering _parse_tool_result dispatch, content helpers, user/assistant/system paths, file activity, parse_session integration, quick_session_info, and malformed entries. Treat non-dict message.usage in _process_assistant as empty to avoid crashes on null usage.
|
Caution Review failedFailed to post review comments 📝 WalkthroughWalkthroughAdds a large pytest suite for utils/jsonl_parser and hardens message/usage access by introducing a safe ChangesParser Test Coverage & Defensive Guard
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
utils/jsonl_parser.py (1)
140-142:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGuard
messagebefore dereferencing in both user/assistant processors.If an entry contains
"message": null, both processors call.get(...)onNoneand abortparse_sessionfor a recoverable malformed line. Apply the same dict-guard pattern here too.💡 Suggested fix
def _process_user(entry: dict, messages: list, metadata: dict): @@ - msg = entry.get("message", {}) + msg = entry.get("message", {}) + if not isinstance(msg, dict): + msg = {} content = msg.get("content", []) @@ def _process_assistant(entry: dict, messages: list, metadata: dict): @@ - msg = entry.get("message", {}) + msg = entry.get("message", {}) + if not isinstance(msg, dict): + msg = {} model = msg.get("model", "")Also applies to: 173-174
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@utils/jsonl_parser.py` around lines 140 - 142, The processors in utils/jsonl_parser.py currently assume entry["message"] is a dict and call entry.get("message", {}) without guarding against message being None; update both user and assistant processing blocks to first coerce message to a dict (e.g., msg = entry.get("message") or {} ) before calling msg.get("content", []) and passing to _extract_text, and apply the same pattern where similar code exists (the other processor and the parse_session usage) so a null "message" no longer raises when dereferenced.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tests/test_jsonl_parser.py`:
- Around line 718-730: Add two regression tests in tests/test_jsonl_parser.py
alongside test_missing_usage_dict_no_crash: one that passes a record with
"message": null and another that passes a record with a non-dict usage (e.g.,
"usage": "invalid" or a number) to ensure parse_session handles these malformed
shapes without crashing and still sets metadata.total_input_tokens to 0;
reference the existing test function name test_missing_usage_dict_no_crash and
the parser entrypoint parse_session to locate where to add the new tests and
mirror the try/finally cleanup pattern used there.
---
Outside diff comments:
In `@utils/jsonl_parser.py`:
- Around line 140-142: The processors in utils/jsonl_parser.py currently assume
entry["message"] is a dict and call entry.get("message", {}) without guarding
against message being None; update both user and assistant processing blocks to
first coerce message to a dict (e.g., msg = entry.get("message") or {} ) before
calling msg.get("content", []) and passing to _extract_text, and apply the same
pattern where similar code exists (the other processor and the parse_session
usage) so a null "message" no longer raises when dereferenced.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 4b102107-1dd1-4277-936f-9231732d0eb4
📒 Files selected for processing (2)
tests/test_jsonl_parser.pyutils/jsonl_parser.py
Add tests/test_jsonl_parser.py for tool-result dispatch, helpers, parse_session, quick_session_info, and malformed entries (null/non-dict message and usage). Introduce _entry_message() so only dict messages are read; keep usage as a dict before token accounting. Apply the same normalization in quick_session_info. Use full-string == assertions in TestStripSystemTags; drop unused pytest import; add regression for string-shaped assistant message.
Closes #21
Summary
Adds comprehensive pytest coverage for
utils/jsonl_parser.py(CCC1) and hardens assistant parsing whenmessage.usageis JSONnullor otherwise not a plain object.Motivation
jsonl_parser.pyis the structural core for untrusted, schema-evolving Claude Code JSONL. Direct tests lock in_parse_tool_resultdispatch, helper behavior, and integration paths so silent regressions from schema drift are caught early.Changes
tests/test_jsonl_parser.pycovering:_parse_tool_resultfor all major result shapes and fallbacks_normalize_content,_extract_text,_extract_images_infer_title,_strip_system_tags_process_user,_process_assistant,_process_system,_track_file_activityparse_sessionintegration (empty files, unknown types, sidechains, snapshots, wall time, malformed lines)quick_session_infofor small vs large sessions_process_assistanttreats non-dictusageas{}sousage: nulldoes not crash.Testing