test(jsonl): add jsonl_parser schema coverage and harden null usage by clean6378-max-it · Pull Request #22 · cppalliance/claude-code-chat-browser

clean6378-max-it · 2026-05-05T23:36:20Z

Closes #21

Summary

Adds comprehensive pytest coverage for utils/jsonl_parser.py (CCC1) and hardens assistant parsing when message.usage is JSON null or otherwise not a plain object.

Motivation

jsonl_parser.py is the structural core for untrusted, schema-evolving Claude Code JSONL. Direct tests lock in _parse_tool_result dispatch, helper behavior, and integration paths so silent regressions from schema drift are caught early.

Changes

New: tests/test_jsonl_parser.py covering:
- _parse_tool_result for all major result shapes and fallbacks
- _normalize_content, _extract_text, _extract_images
- _infer_title, _strip_system_tags
- _process_user, _process_assistant, _process_system, _track_file_activity
- parse_session integration (empty files, unknown types, sidechains, snapshots, wall time, malformed lines)
- quick_session_info for small vs large sessions
Fix: _process_assistant treats non-dict usage as {} so usage: null does not crash.

Testing

pytest tests/test_jsonl_parser.py
pytest tests/
Notes
Underscore imports are confined to utils.jsonl_parser, consistent with existing parser tests.
CI automation for pytest is tracked separately (CCC2).

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **Tests**
  * Added a comprehensive test suite for JSONL parsing: tool results, content normalization, title inference, message handling, file activity, session parsing, and extensive edge-case coverage (missing keys, nested results, images, sidechains, compaction, malformed/partial entries).

* **Bug Fixes**
  * Improved parsing robustness to safely handle missing or unexpected message/usage data, preventing errors during session extraction.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Add tests/test_jsonl_parser.py covering _parse_tool_result dispatch, content helpers, user/assistant/system paths, file activity, parse_session integration, quick_session_info, and malformed entries. Treat non-dict message.usage in _process_assistant as empty to avoid crashes on null usage.

coderabbitai · 2026-05-05T23:36:33Z

Caution

Review failed

Failed to post review comments

📝 Walkthrough

Walkthrough

Adds a large pytest suite for utils/jsonl_parser and hardens message/usage access by introducing a safe _entry_message(entry) helper and ensuring usage is a dict in assistant processing and quick_session_info first-pass extraction.

Changes

Parser Test Coverage & Defensive Guard

Layer / File(s)	Summary
Data Shape / Helper `utils/jsonl_parser.py`	Add `_entry_message(entry)` to safely extract `entry["message"]` as a dict; replace direct `entry.get("message", {})` usages.
Core Guard `utils/jsonl_parser.py`	In `_process_assistant`, ensure `usage` is a dict (reset to `{}` when not) before accessing token/cache fields. Update `quick_session_info` first-pass to use `_entry_message`.
Test Helpers `tests/test_jsonl_parser.py` (`_fresh_metadata`, `_write_jsonl`)	Add deterministic metadata fixture and temp-jsonl writer used by tests.
Unit Tests — Parse Tool Result `tests/test_jsonl_parser.py` (parse tool/result tests)	Comprehensive tests covering bash, file edits/read/write, glob, grep, web search/fetch, task variants, todo_write, user_input, plan, unknown shapes, non-dict/null inputs, and slug preservation.
Unit Tests — Content Helpers `tests/test_jsonl_parser.py` (`_normalize_content`, `_extract_text`, `_extract_images`)	Tests for string/list/dict/mixed inputs, nested tool_result images, base64 image extraction, and skipping non-image items.
Unit Tests — Title & System Tag Helpers `tests/test_jsonl_parser.py` (`_infer_title`, `_strip_system_tags`)	Tests for title inference (first user text, 100-char truncation, Untitled fallback) and system tag stripping behavior.
Unit Tests — User Processing `tests/test_jsonl_parser.py` (`_process_user` tests)	Tests for first-entry metadata capture, handling missing `message`, and toolUseResult image extraction.
Unit Tests — Assistant & System Processing `tests/test_jsonl_parser.py` (`_process_assistant`, `_process_system` tests)	Covers content normalization, synthetic-model exclusion, thinking-block joining, tool_use counting, api_error flagging, stop_reason/service_tier accumulation, ephemeral token sums, compaction boundary handling and missing compactMetadata safety.
Unit Tests — File Activity & Integration `tests/test_jsonl_parser.py` (`_track_file_activity`, `parse_session` tests)	File read/write/edit tracking, bash/web logging, end-to-end parse_session scenarios (empty skeleton, unknown type ignored, sidechains, file-history snapshots, entry counts, wall-time, invalid JSON lines, missing keys, cleanup).
Unit Tests — Quick Session Info & Malformed Entries `tests/test_jsonl_parser.py` (`quick_session_info`, malformed tests)	Tests for small/large-file tail-read timestamp, no-user-entry Untitled behavior, and partial/invalid entry cases to validate robustness.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibble lines of jsonl with delight,
Guards snug the messages through the night.
Tests sprout clover where edge-cases play,
Parsers hum softly at break of day.
Hop on — the session folds up right.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 4.40% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'test(jsonl): add jsonl_parser schema coverage and harden null usage' accurately describes both main changes: comprehensive test coverage for jsonl_parser and hardening of null/non-dict usage handling.
Linked Issues check	✅ Passed	The PR comprehensively addresses all coding requirements from issue `#21`: tests cover all 14 _parse_tool_result dispatch arms, helper functions, system/user/assistant processing, file activity tracking, integration scenarios, and malformed entries; null usage hardening is implemented in _process_assistant.
Out of Scope Changes check	✅ Passed	All changes are directly scoped to issue `#21` objectives: test suite covers specified parser functions and scenarios, and utility changes harden message/usage handling as required. No unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch test/ccc1-jsonl-parser-schema

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

utils/jsonl_parser.py (1)

140-142: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard message before dereferencing in both user/assistant processors.

If an entry contains "message": null, both processors call .get(...) on None and abort parse_session for a recoverable malformed line. Apply the same dict-guard pattern here too.

💡 Suggested fix

 def _process_user(entry: dict, messages: list, metadata: dict):
@@
-    msg = entry.get("message", {})
+    msg = entry.get("message", {})
+    if not isinstance(msg, dict):
+        msg = {}
     content = msg.get("content", [])
@@
 def _process_assistant(entry: dict, messages: list, metadata: dict):
@@
-    msg = entry.get("message", {})
+    msg = entry.get("message", {})
+    if not isinstance(msg, dict):
+        msg = {}
     model = msg.get("model", "")

Also applies to: 173-174

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@utils/jsonl_parser.py` around lines 140 - 142, The processors in
utils/jsonl_parser.py currently assume entry["message"] is a dict and call
entry.get("message", {}) without guarding against message being None; update
both user and assistant processing blocks to first coerce message to a dict
(e.g., msg = entry.get("message") or {} ) before calling msg.get("content", [])
and passing to _extract_text, and apply the same pattern where similar code
exists (the other processor and the parse_session usage) so a null "message" no
longer raises when dereferenced.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_jsonl_parser.py`:
- Around line 718-730: Add two regression tests in tests/test_jsonl_parser.py
alongside test_missing_usage_dict_no_crash: one that passes a record with
"message": null and another that passes a record with a non-dict usage (e.g.,
"usage": "invalid" or a number) to ensure parse_session handles these malformed
shapes without crashing and still sets metadata.total_input_tokens to 0;
reference the existing test function name test_missing_usage_dict_no_crash and
the parser entrypoint parse_session to locate where to add the new tests and
mirror the try/finally cleanup pattern used there.

---

Outside diff comments:
In `@utils/jsonl_parser.py`:
- Around line 140-142: The processors in utils/jsonl_parser.py currently assume
entry["message"] is a dict and call entry.get("message", {}) without guarding
against message being None; update both user and assistant processing blocks to
first coerce message to a dict (e.g., msg = entry.get("message") or {} ) before
calling msg.get("content", []) and passing to _extract_text, and apply the same
pattern where similar code exists (the other processor and the parse_session
usage) so a null "message" no longer raises when dereferenced.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 4b102107-1dd1-4277-936f-9231732d0eb4

📥 Commits

Reviewing files that changed from the base of the PR and between 3b72afd and e6df529.

📒 Files selected for processing (2)

tests/test_jsonl_parser.py
utils/jsonl_parser.py

Add tests/test_jsonl_parser.py for tool-result dispatch, helpers, parse_session, quick_session_info, and malformed entries (null/non-dict message and usage). Introduce _entry_message() so only dict messages are read; keep usage as a dict before token accounting. Apply the same normalization in quick_session_info. Use full-string == assertions in TestStripSystemTags; drop unused pytest import; add regression for string-shaped assistant message.

coderabbitai Bot reviewed May 5, 2026

View reviewed changes

Comment thread tests/test_jsonl_parser.py

fix(jsonl): coerce null message before parsing JSONL entries

3f6db99

clean6378-max-it requested a review from timon0305 May 6, 2026 15:50

timon0305 requested changes May 6, 2026

View reviewed changes

Comment thread tests/test_jsonl_parser.py Outdated

clean6378-max-it requested a review from timon0305 May 6, 2026 17:43

timon0305 approved these changes May 6, 2026

View reviewed changes

timon0305 requested a review from wpak-ai May 6, 2026 17:53

wpak-ai approved these changes May 6, 2026

View reviewed changes

wpak-ai merged commit 09de881 into master May 6, 2026
2 checks passed

wpak-ai deleted the test/ccc1-jsonl-parser-schema branch May 6, 2026 17:55

This was referenced May 6, 2026

Issue — claude-code-chat-browser — Refactor _parse_tool_result dispatch table (5 pt) #29

Open

refactor(jsonl-parser): replace _parse_tool_result if/elif with dispa… #31

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test(jsonl): add jsonl_parser schema coverage and harden null usage#22

test(jsonl): add jsonl_parser schema coverage and harden null usage#22
wpak-ai merged 3 commits intomasterfrom
test/ccc1-jsonl-parser-schema

clean6378-max-it commented May 5, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

coderabbitai Bot commented May 5, 2026 •

edited

Loading

Review failed

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

clean6378-max-it commented May 5, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Motivation

Changes

Testing

Uh oh!

coderabbitai Bot commented May 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review failed

Walkthrough

Changes

Estimated code review effort

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

clean6378-max-it commented May 5, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 5, 2026 •

edited

Loading