roadmap: #203 filed — AutoCompactionEvent summary-only, no SSE event emitted mid-turn when auto-compaction fires (Jobdori cycle #136 )

roadmap: #202 filed — sanitize_tool_message_pairing silent drop, no tool_message_dropped event (Jobdori cycle #135 )
roadmap: #201 filed — parse_tool_arguments silent fallback, no tool_arg_parse_error event (Jobdori cycle #134 )
2026-06-28 13:28:39 -04:00 · 2026-04-25 07:48:22 +09:00 · 2026-04-25 06:06:32 +09:00 · 2026-04-25 05:03:54 +09:00 · 2026-04-25 04:03:40 +09:00
1 changed files with 144 additions and 0 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -12275,3 +12275,147 @@ claw status --session clawcode-human  # → no approval-blocked indicator
 **Status:** Open. No code changed. Merge-wait mode. Filed from DOGFOOD_FINDINGS.md evidence (gaebal-gajae).

 🪨
+
+---
+
+## Pinpoint #200 — SCHEMAS.md / classifier comments self-documenting drift: declarative claims diverge from implementation with no derive-from-source enforcement (Q *YeonGyu Kim, cycle #304)
+
+**Observed:** SCHEMAS.md action-field counts and classifier comments claim coverage over implementation-enumerable facts (e.g. event kinds, action verbs, field lists). Over time these declarative claims silently diverge from actual code — pattern already observed at #170 (classifier 4-verb sweep) and #172 (action-field count drift). No test enforces that SCHEMAS.md reflects current implementation; document can go stale without any CI signal.
+
+**Gap:**
+- SCHEMAS.md field/kind enumerations are hand-maintained with no automated sync
+- Classifier comments referencing "N action verbs" or "M event types" have no derive-from-source test
+- Divergence is invisible until a human audits both doc and code manually
+- Pattern recurs: #170 found 4-verb classifier claim vs actual 6-verb set; #172 found action-field count mismatch
+
+**Repro:**
+```
+# Check SCHEMAS.md event kind list vs actual runtime event kinds
+grep -E 'kind:' SCHEMAS.md | wc -l
+grep -rE 'kind:.*=' src/ | grep -v test | wc -l
+# Counts diverge with no CI gate
+```
+
+**Expected:** A derive-from-source test (e.g. `test_schemas_md_event_kinds_complete`) that parses SCHEMAS.md claimed enumerations and asserts they match implementation-enumerable facts. Fails loudly on drift.
+
+**Fix sketch:**
+1. Add `tests/test_schemas_doc_parity.py` — parse SCHEMAS.md event-kind list, compare to runtime-emitted kinds
+2. Add similar check for classifier action-verb claims vs actual classifier verb set
+3. Gate on CI so SCHEMAS.md updates are forced when implementation changes
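The parity check in the fix sketch could look roughly like the following. This is a minimal sketch in Rust rather than the Python test file named above, and it rests on assumptions: that SCHEMAS.md lists each event kind on a line containing `kind: <name>` (mirroring the `grep` in the repro), and that the implemented kinds can be enumerated as a slice of names. `documented_kinds`, `kind_drift`, and `Drift` are hypothetical names, not existing code.

```rust
use std::collections::BTreeSet;

/// Collect event-kind names claimed by the documentation text.
/// ASSUMPTION: each documented kind sits on a line containing `kind: <name>`,
/// optionally wrapped in quotes or backticks.
pub fn documented_kinds(doc: &str) -> BTreeSet<String> {
    doc.lines()
        .filter_map(|line| line.split_once("kind:"))
        .map(|(_, rest)| {
            rest.trim_start_matches(|c: char| c.is_whitespace() || c == '"' || c == '`')
                .chars()
                .take_while(|c| c.is_ascii_alphanumeric() || *c == '_')
                .collect::<String>()
        })
        .filter(|name| !name.is_empty())
        .collect()
}

/// Result of comparing documented kinds with implemented kinds (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub struct Drift {
    /// Implemented in code but missing from the document.
    pub undocumented: Vec<String>,
    /// Listed in the document but no longer implemented.
    pub stale: Vec<String>,
}

/// Compare the document against the implementation-enumerable kind list.
pub fn kind_drift(doc: &str, implemented: &[&str]) -> Drift {
    let documented = documented_kinds(doc);
    let implemented: BTreeSet<String> = implemented.iter().map(|s| s.to_string()).collect();
    Drift {
        undocumented: implemented.difference(&documented).cloned().collect(),
        stale: documented.difference(&implemented).cloned().collect(),
    }
}
```

A CI test would read SCHEMAS.md, obtain the implemented kinds from the source of truth (for example an enum's variant list), and assert that both `undocumented` and `stale` are empty, so drift in either direction fails the build.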
+
+**Status:** Open. No code changed. Merge-wait mode. Filed from Q *YeonGyu Kim cycle #304 observation.
+
+🪨
+
+---
+
+## Pinpoint #201 — `parse_tool_arguments` silent fallback: malformed JSON tool args wrapped as `{"raw": ...}`, no structured error event emitted (Jobdori, cycle #134)
+
+**Observed:** In `rust/crates/api/src/providers/openai_compat.rs`, `parse_tool_arguments()` (line ~1223) silently converts malformed JSON tool arguments to `json!({ "raw": arguments })` with no error event, no log entry, and no structured signal. Downstream consumers receive what appears to be a valid parsed object with a `raw` field — the parse failure is completely invisible.
+
+**Gap:**
+- A tool call with malformed arguments (e.g. `arguments: "not json"`) is silently normalized to `{"raw": "not json"}`
+- No `tool_arg_parse_error` event or `parse_error` field is emitted
+- Classifier and orchestrator see an apparently-valid tool args object — they cannot distinguish "arguments parsed cleanly" from "arguments were garbage and got wrapped"
+- Error surfaces only when downstream logic tries to access expected keys and gets `None` — attribution is lost by then
+
+**Repro:**
+```
+# Simulate a provider returning malformed tool call arguments
+# claw receives chunk with: tool_call.function.arguments = "not valid json {"
+# parse_tool_arguments("not valid json {") → Ok({"raw": "not valid json {"})
+# No error logged, no event emitted, session continues as if parse succeeded
+```
+
+**Expected:** 
+- `parse_tool_arguments` returns a typed result: either the parsed object or a `ToolArgParseError { raw: String, error: String }`
+- On parse failure, emit structured event: `{ "kind": "tool_arg_parse_error", "tool_index": N, "raw": "...", "parse_error": "..." }`
+- Session status reflects parse failure; `claw doctor` can surface it
+- Downstream code can distinguish clean parse from fallback wrap
+
+**Fix sketch:**
+1. Change return type of `parse_tool_arguments` to `Result<Value, ToolArgParseError>`
+2. On error path, emit `tool_arg_parse_error` event before returning fallback
+3. Include a `__parse_error` field alongside `raw` in the fallback value so downstream can detect it: `json!({ "raw": arguments, "__parse_error": err.to_string() })`
+4. Add to classifier's recognized error taxonomy
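The typed-result shape from steps 1 and 2 might be sketched as follows. To stay self-contained, this sketch takes the JSON parser as a closure instead of depending on `serde_json`; in the real function the closure would presumably be the existing `serde_json` parse call. The `tool_index` parameter, the `events` sink, and the generic signature are assumptions made for illustration, not the current API.

```rust
use std::fmt::Display;

/// Structured record of a failed tool-argument parse (hypothetical shape,
/// following the `tool_arg_parse_error` event fields in the Expected list).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ToolArgParseError {
    pub tool_index: usize,
    pub raw: String,
    pub error: String,
}

/// Parse tool arguments and surface failures instead of hiding them.
/// ASSUMPTION: `parse` stands in for the real JSON parser, and `events`
/// stands in for whatever event channel the runtime actually uses.
pub fn parse_tool_arguments<T, E, P>(
    tool_index: usize,
    arguments: &str,
    parse: P,
    events: &mut Vec<ToolArgParseError>,
) -> Result<T, ToolArgParseError>
where
    E: Display,
    P: FnOnce(&str) -> Result<T, E>,
{
    parse(arguments).map_err(|err| {
        let failure = ToolArgParseError {
            tool_index,
            raw: arguments.to_string(),
            error: err.to_string(),
        };
        // Record the failure before returning so attribution is not lost.
        events.push(failure.clone());
        failure
    })
}
```

With this shape the caller decides whether to fall back to a `{"raw": ...}` wrapper, and the decision is visible: a clean parse yields `Ok`, while a fallback always has a matching entry in the event sink.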
+
+**Status:** Open. No code changed. Filed 2026-04-25 05:02 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: b780c80.
+
+🪨
+
+---
+
+## Pinpoint #202 — `sanitize_tool_message_pairing` silent drop: orphaned tool messages removed with no event, no log, no diagnostic visibility (Jobdori, cycle #135)
+
+**Observed:** In `rust/crates/api/src/providers/openai_compat.rs`, `sanitize_tool_message_pairing()` (called at line ~868) silently drops any `role:"tool"` message whose `tool_call_id` has no matching preceding `assistant` turn with a `tool_calls[].id`. The drop is intentional (prevents 400s from OpenAI-compat backends) but produces zero structured signal: no event, no log entry, no field in the request envelope indicating N messages were removed.
+
+**Gap:**
+- A session with compaction, editing, or resume can arrive at the request boundary with orphaned tool messages
+- These are quietly dropped; the provider receives a request with fewer messages than the session history claims
+- No `tool_message_dropped` event or `history_sanitized` field is emitted
+- `claw doctor`, the event log, and downstream observers cannot distinguish "all tool messages sent" from "N tool messages silently omitted"
+- Debugging mismatch between session history and what the provider actually received requires source-level tracing
+
+**Repro:**
+```
+# Craft a session where a tool result has no matching assistant tool_calls entry
+# (e.g. resume after compaction that dropped the assistant turn but kept the result)
+# sanitize_tool_message_pairing() drops the orphan silently
+# Event log shows no drop event
+# Provider receives history minus the orphaned message; caller sees no indication
+```
+
+**Expected:**
+- On any drop, emit structured event: `{ "kind": "tool_message_dropped", "tool_call_id": "...", "count": N, "reason": "no_paired_assistant_turn" }`
+- Optionally: include `{ "history_sanitized": { "dropped_tool_messages": N } }` in request metadata
+- `claw doctor` can surface sessions where tool message sanitization occurred
+- Clawability: agents replaying or resuming sessions can detect the gap and re-issue the tool call or warn
+
+**Fix sketch:**
+1. Change `sanitize_tool_message_pairing` to return `(Vec<Value>, Vec<DroppedToolMessage>)` or emit events via a callback/channel
+2. At call site (line ~868), if any drops occurred, emit `tool_message_dropped` event(s) before sending request
+3. Add `dropped_tool_messages` count to request diagnostic envelope if non-zero
+4. Add to classifier's recognized event taxonomy
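Step 1 of this fix sketch, returning the dropped messages alongside the kept ones, could be sketched like this. The `Message` enum is a simplified stand-in for the real `Value`-based message representation, and `DroppedToolMessage` is a hypothetical struct whose `reason` string is taken from the Expected event payload above.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a chat message (ASSUMPTION: the real code works on JSON values).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Message {
    Assistant { tool_call_ids: Vec<String> },
    Tool { tool_call_id: String },
    Other,
}

/// Diagnostic record for one dropped tool message (hypothetical type).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DroppedToolMessage {
    pub tool_call_id: String,
    pub reason: &'static str,
}

/// Drop tool messages with no preceding assistant tool call, and report each drop.
pub fn sanitize_tool_message_pairing(
    messages: Vec<Message>,
) -> (Vec<Message>, Vec<DroppedToolMessage>) {
    let mut seen_ids: HashSet<String> = HashSet::new();
    let mut kept = Vec::with_capacity(messages.len());
    let mut dropped = Vec::new();

    for message in messages {
        match &message {
            // Remember every tool call id announced by an assistant turn.
            Message::Assistant { tool_call_ids } => {
                seen_ids.extend(tool_call_ids.iter().cloned());
            }
            // A tool result without a previously seen id is an orphan.
            Message::Tool { tool_call_id } if !seen_ids.contains(tool_call_id) => {
                dropped.push(DroppedToolMessage {
                    tool_call_id: tool_call_id.clone(),
                    reason: "no_paired_assistant_turn",
                });
                continue;
            }
            _ => {}
        }
        kept.push(message);
    }
    (kept, dropped)
}
```

The call site can then emit one `tool_message_dropped` event per entry in the second tuple element, or a single event carrying `dropped.len()` as the count, before sending the request.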
+
+**Status:** Open. No code changed. Filed 2026-04-25 06:09 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: 5e0228d.
+
+🪨
+
+---
+
+## Pinpoint #203 — `AutoCompactionEvent` is summary-only: no streaming SSE event emitted when auto-compaction fires mid-turn (Jobdori, cycle #136)
+
+**Observed:** In `rust/crates/runtime/src/conversation.rs`, `maybe_auto_compact()` (line ~555) fires compaction between turns and returns an `AutoCompactionEvent { removed_message_count }`. This event is attached to `TurnSummary.auto_compaction` and only surfaces in the post-turn struct returned by `run_turn()`. It is not emitted as a streaming SSE event at the moment compaction occurs.
+
+**Gap:**
+- A claw monitoring the SSE stream during a long multi-turn session cannot detect that compaction fired until the final `TurnSummary` JSON arrives (or, in JSON output mode, until the CLI prints the final response envelope)
+- Between compaction firing and the final summary, the session history has already been truncated — any mid-turn state the claw was tracking against the old history is now stale
+- No `session_compacted` or `auto_compaction` SSE event exists; the classifier's event taxonomy has no entry for it
+- `claw doctor` cannot surface "this session has been auto-compacted N times" or "compaction removed M messages in the last turn"
+- Claws relying on replay-by-session-history for context reconstruction silently receive a shorter history with no notification
+
+**Repro:**
+```
+# Run a session long enough to trigger auto-compaction
+# Monitor the SSE stream during the turn
+# Observe: no event with kind=session_compacted or similar appears in the stream
+# The only signal is the post-turn auto_compaction field in the JSON summary
+# If running in interactive TUI mode, the only signal is the printed compaction notice
+```
+
+**Expected:**
+- When `maybe_auto_compact()` removes messages, emit a streaming SSE event immediately: `{ "kind": "session_compacted", "removed_message_count": N, "retained_message_count": M, "trigger": "auto" }`
+- `claw doctor` surfaces sessions with auto-compaction history and message counts
+- Classifier recognizes `session_compacted` as a first-class event kind
+- `/compact` manual command similarly emits this event (currently only prints a user-facing string)
+
+**Fix sketch:**
+1. Add `session_compacted` to the `StreamEvent` enum (or as a diagnostic event channel alongside it)
+2. In `maybe_auto_compact()`, after compaction, push `session_compacted` event through the event channel before continuing the turn
+3. Expose counts in the event, matching the Expected payload: `{ "kind": "session_compacted", "removed_message_count": N, "retained_message_count": M, "trigger": "auto" }`
+4. Wire manual `/compact` command to emit the same event
+5. Add to classifier event taxonomy and `claw doctor` output
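Steps 1 to 3 can be illustrated with a small channel-based sketch. Everything here is a stand-in under stated assumptions: the `StreamEvent` enum below is not the real one, the history is a plain `Vec<String>`, compaction is modeled as dropping the oldest messages above a hypothetical `max_messages` threshold (the real trigger and summarization logic are not shown in this entry), and `std::sync::mpsc` substitutes for whatever event channel the runtime uses.

```rust
use std::sync::mpsc::Sender;

/// Stand-in for the streaming event enum (ASSUMPTION: the real `StreamEvent` differs).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StreamEvent {
    SessionCompacted {
        removed_message_count: usize,
        retained_message_count: usize,
        trigger: &'static str,
    },
}

/// Compact the history if it exceeds `max_messages`, emitting an event immediately.
/// Returns the number of removed messages, or `None` when nothing was compacted.
pub fn maybe_auto_compact(
    history: &mut Vec<String>,
    max_messages: usize,
    events: &Sender<StreamEvent>,
) -> Option<usize> {
    if history.len() <= max_messages {
        return None;
    }
    let removed = history.len() - max_messages;
    // Stand-in for real compaction: drop the oldest messages.
    history.drain(..removed);

    // Emit at the moment of compaction. A disconnected receiver must not
    // abort the turn, so a send error is deliberately ignored.
    let _ = events.send(StreamEvent::SessionCompacted {
        removed_message_count: removed,
        retained_message_count: history.len(),
        trigger: "auto",
    });
    Some(removed)
}
```

A manual `/compact` path could reuse the same event with a different `trigger` value, which keeps a single event kind for the classifier and `claw doctor` to recognize.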
+
+**Status:** Open. No code changed. Filed 2026-04-25 07:47 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: 0730183.
+
+🪨
Author	SHA1	Message	Date
YeonGyu-Kim	604bf389b6	roadmap: #203 filed — AutoCompactionEvent summary-only, no SSE event emitted mid-turn when auto-compaction fires (Jobdori cycle #136 )	2026-04-25 07:48:22 +09:00
YeonGyu-Kim	0730183f35	roadmap: #202 filed — sanitize_tool_message_pairing silent drop, no tool_message_dropped event (Jobdori cycle #135 )	2026-04-25 06:06:32 +09:00
YeonGyu-Kim	5e0228dce0	roadmap: #201 filed — parse_tool_arguments silent fallback, no tool_arg_parse_error event (Jobdori cycle #134 )	2026-04-25 05:03:54 +09:00
YeonGyu-Kim	b780c808d1	roadmap: #200 filed — SCHEMAS.md self-documenting drift, no derive-from-source enforcement (Q *YeonGyu Kim cycle #304 )	2026-04-25 04:03:40 +09:00