mirror of https://github.com/instructkr/claw-code.git (synced 2026-06-28 13:28:39 -04:00)
Compare commits: 6948b20d74...604bf389b6 (4 commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | 604bf389b6 | | |
| | 0730183f35 | | |
| | 5e0228dce0 | | |
| | b780c808d1 | | |
144
ROADMAP.md
@@ -12275,3 +12275,147 @@ claw status --session clawcode-human # → no approval-blocked indicator
**Status:** Open. No code changed. Merge-wait mode. Filed from DOGFOOD_FINDINGS.md evidence (gaebal-gajae).

🪨

---

## Pinpoint #200 — SCHEMAS.md / classifier comments self-documenting drift: declarative claims diverge from implementation with no derive-from-source enforcement (Q *YeonGyu Kim, cycle #304)

**Observed:** SCHEMAS.md action-field counts and classifier comments make declarative claims about implementation-enumerable facts (e.g. event kinds, action verbs, field lists). Over time these claims silently diverge from the actual code, a pattern already observed at #170 (classifier 4-verb sweep) and #172 (action-field count drift). No test enforces that SCHEMAS.md reflects the current implementation, so the document can go stale without any CI signal.

**Gap:**
- SCHEMAS.md field/kind enumerations are hand-maintained with no automated sync
- Classifier comments referencing "N action verbs" or "M event types" have no derive-from-source test
- Divergence is invisible until a human manually audits both doc and code
- Pattern recurs: #170 found a 4-verb classifier claim vs. the actual 6-verb set; #172 found an action-field count mismatch

**Repro:**
```
# Check SCHEMAS.md event kind list vs actual runtime event kinds
grep -E 'kind:' SCHEMAS.md | wc -l
grep -rE 'kind:.*=' src/ | grep -v test | wc -l
# Counts diverge with no CI gate
```

**Expected:** A derive-from-source test (e.g. `test_schemas_md_event_kinds_complete`) that parses the enumerations SCHEMAS.md claims and asserts they match implementation-enumerable facts. It fails loudly on drift.

**Fix sketch:**
1. Add `tests/test_schemas_doc_parity.py`: parse the SCHEMAS.md event-kind list and compare it to the runtime-emitted kinds
2. Add a similar check for classifier action-verb claims vs. the actual classifier verb set
3. Gate on CI so SCHEMAS.md updates are forced when the implementation changes

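
A minimal sketch of what such a parity check might look like. Everything concrete here is an assumption for illustration: the way SCHEMAS.md spells event kinds, the regex, the hard-coded `RUNTIME_EVENT_KINDS` set, and the `legacy_event` name are all hypothetical. A real test would derive the runtime set from the implementation and read SCHEMAS.md from disk.

```python
import re

# Hypothetical stand-in for the kinds the runtime can emit. A real test would
# derive this set from the implementation instead of hard-coding it.
RUNTIME_EVENT_KINDS = {
    "tool_arg_parse_error",
    "tool_message_dropped",
    "session_compacted",
}

# Hypothetical SCHEMAS.md excerpt (format assumed); a real test would read the file.
SCHEMAS_MD_SAMPLE = """
## Event kinds
- `{ "kind": "tool_arg_parse_error" }`
- `{ "kind": "tool_message_dropped" }`
- `{ "kind": "legacy_event" }`
"""

KIND_PATTERN = re.compile(r'"kind":\s*"([a-z_]+)"')


def documented_kinds(markdown):
    """Collect every event kind the document claims to cover."""
    return set(KIND_PATTERN.findall(markdown))


def parity_report(markdown, runtime_kinds):
    """Return the drift in both directions as sorted lists."""
    documented = documented_kinds(markdown)
    return {
        "undocumented": sorted(runtime_kinds - documented),
        "stale": sorted(documented - runtime_kinds),
    }


def assert_doc_parity(markdown, runtime_kinds):
    """Fail loudly, naming the offending kinds, when doc and code disagree."""
    report = parity_report(markdown, runtime_kinds)
    assert not report["undocumented"] and not report["stale"], report
```

Reporting both directions (kinds missing from the doc and kinds the doc still lists after removal) keeps the failure message actionable. The same shape could plausibly be reused for the classifier verb-count check in step 2.
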
**Status:** Open. No code changed. Merge-wait mode. Filed from Q *YeonGyu Kim cycle #304 observation.

🪨

---

## Pinpoint #201 — `parse_tool_arguments` silent fallback: malformed JSON tool args wrapped as `{"raw": ...}`, no structured error event emitted (Jobdori, cycle #134)

**Observed:** In `rust/crates/api/src/providers/openai_compat.rs`, `parse_tool_arguments()` (line ~1223) silently converts malformed JSON tool arguments to `json!({ "raw": arguments })` with no error event, no log entry, and no structured signal. Downstream consumers receive what appears to be a valid parsed object with a `raw` field, so the parse failure is completely invisible.

**Gap:**
- A tool call with malformed arguments (e.g. `arguments: "not json"`) is silently normalized to `{"raw": "not json"}`
- No `tool_arg_parse_error` event or `parse_error` field is emitted
- Classifier and orchestrator see an apparently valid tool args object and cannot distinguish "arguments parsed cleanly" from "arguments were garbage and got wrapped"
- The error surfaces only when downstream logic tries to access expected keys and gets `None`; attribution is lost by then

**Repro:**
```
# Simulate a provider returning malformed tool call arguments
# claw receives chunk with: tool_call.function.arguments = "not valid json {"
# parse_tool_arguments("not valid json {") → Ok({"raw": "not valid json {"})
# No error logged, no event emitted, session continues as if parse succeeded
```

**Expected:**
- `parse_tool_arguments` returns a typed result: either the parsed object or a `ToolArgParseError { raw: String, error: String }`
- On parse failure, emit structured event: `{ "kind": "tool_arg_parse_error", "tool_index": N, "raw": "...", "parse_error": "..." }`
- Session status reflects parse failure; `claw doctor` can surface it
- Downstream code can distinguish clean parse from fallback wrap

**Fix sketch:**
1. Change return type of `parse_tool_arguments` to `Result<Value, ToolArgParseError>`
2. On error path, emit `tool_arg_parse_error` event before returning fallback
3. Include a `__parse_error` field alongside `raw` in the fallback value so downstream can detect it: `json!({ "raw": arguments, "__parse_error": err.to_string() })`
4. Add to classifier's recognized error taxonomy

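
The control flow in the fix sketch can be illustrated with a small language-neutral sketch, written here in Python instead of Rust. It is not the code in `openai_compat.rs`. The `emit` callback, the `tool_index` parameter, and the `(value, error)` return pair are assumptions standing in for whatever event channel and `Result` type the real change would use.

```python
import json
from dataclasses import dataclass


@dataclass
class ToolArgParseError:
    """Typed failure record, mirroring the shape proposed under Expected."""
    raw: str
    error: str


def parse_tool_arguments(arguments, tool_index, emit):
    """Parse tool-call arguments and return (value, error).

    On success, error is None. On failure, a structured event is emitted
    first, then a marked fallback value is returned alongside the typed error.
    """
    try:
        return json.loads(arguments), None
    except json.JSONDecodeError as err:
        failure = ToolArgParseError(raw=arguments, error=str(err))
        # Hypothetical event sink; the real emitter is not specified in the source.
        emit({
            "kind": "tool_arg_parse_error",
            "tool_index": tool_index,
            "raw": failure.raw,
            "parse_error": failure.error,
        })
        # The fallback keeps `raw` but adds a marker so the wrap is detectable.
        fallback = {"raw": failure.raw, "__parse_error": failure.error}
        return fallback, failure
```

A caller might collect events in a list while prototyping:

```python
events = []
value, error = parse_tool_arguments("not valid json {", 0, events.append)
# error is a ToolArgParseError, value carries "__parse_error", events has one entry
```

Returning both the fallback and the typed error is one way to reconcile steps 1 and 2 of the fix sketch, which otherwise pull in different directions (a `Result` error versus a returned fallback value).
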
**Status:** Open. No code changed. Filed 2026-04-25 05:02 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: b780c80.

🪨

---

## Pinpoint #202 — `sanitize_tool_message_pairing` silent drop: orphaned tool messages removed with no event, no log, no diagnostic visibility (Jobdori, cycle #135)

**Observed:** In `rust/crates/api/src/providers/openai_compat.rs`, `sanitize_tool_message_pairing()` (called at line ~868) silently drops any `role:"tool"` message whose `tool_call_id` has no matching preceding `assistant` turn with a `tool_calls[].id`. The drop is intentional (it prevents 400s from OpenAI-compat backends) but produces zero structured signal: no event, no log entry, and no field in the request envelope indicating that N messages were removed.

**Gap:**
- A session with compaction, editing, or resume can arrive at the request boundary with orphaned tool messages
- These are quietly dropped; the provider receives a request with fewer messages than the session history claims
- No `tool_message_dropped` event or `history_sanitized` field is emitted
- `claw doctor`, the event log, and downstream observers cannot distinguish "all tool messages sent" from "N tool messages silently omitted"
- Debugging a mismatch between session history and what the provider actually received requires source-level tracing

**Repro:**
```
# Craft a session where a tool result has no matching assistant tool_calls entry
# (e.g. resume after compaction that dropped the assistant turn but kept the result)
# sanitize_tool_message_pairing() drops the orphan silently
# Event log shows no drop event
# Provider receives history minus the orphaned message; caller sees no indication
```

**Expected:**
- On any drop, emit structured event: `{ "kind": "tool_message_dropped", "tool_call_id": "...", "count": N, "reason": "no_paired_assistant_turn" }`
- Optionally: include `{ "history_sanitized": { "dropped_tool_messages": N } }` in request metadata
- `claw doctor` can surface sessions where tool message sanitization occurred
- Clawability: agents replaying or resuming sessions can detect the gap and re-issue the tool call or warn

**Fix sketch:**
1. Change `sanitize_tool_message_pairing` to return `(Vec<Value>, Vec<DroppedToolMessage>)` or emit events via a callback/channel
2. At the call site (line ~868), if any drops occurred, emit `tool_message_dropped` event(s) before sending the request
3. Add `dropped_tool_messages` count to the request diagnostic envelope if non-zero
4. Add to classifier's recognized event taxonomy

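
The "return the drops alongside the kept messages" variant of step 1 might look roughly like this. It is a Python sketch of the idea, not the Rust function. The pairing rule used here (any earlier assistant turn that declared the id) is an assumption, as is reading `count` as the total dropped for the request. `drop_events` is a hypothetical helper name.

```python
def sanitize_tool_message_pairing(messages):
    """Split messages into (kept, dropped).

    A role:"tool" message is dropped when no earlier assistant turn declared
    its tool_call_id in tool_calls[].id. Assumed pairing rule; the real
    function may be stricter.
    """
    declared_ids = set()
    kept, dropped = [], []
    for message in messages:
        if message.get("role") == "assistant":
            for call in message.get("tool_calls") or []:
                declared_ids.add(call["id"])
        if message.get("role") == "tool" and message.get("tool_call_id") not in declared_ids:
            dropped.append(message)
            continue
        kept.append(message)
    return kept, dropped


def drop_events(dropped):
    """Build one tool_message_dropped event per orphan (hypothetical helper)."""
    return [
        {
            "kind": "tool_message_dropped",
            "tool_call_id": message.get("tool_call_id"),
            "count": len(dropped),  # assumed meaning: total dropped in this request
            "reason": "no_paired_assistant_turn",
        }
        for message in dropped
    ]
```

With this shape the call site can decide what to do with the drops (emit events, add a `dropped_tool_messages` count to the diagnostic envelope, or both) while the sanitizer itself stays free of side effects.
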
**Status:** Open. No code changed. Filed 2026-04-25 06:09 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: 5e0228d.

🪨

---

## Pinpoint #203 — `AutoCompactionEvent` is summary-only: no streaming SSE event emitted when auto-compaction fires mid-turn (Jobdori, cycle #136)

**Observed:** In `rust/crates/runtime/src/conversation.rs`, `maybe_auto_compact()` (line ~555) fires compaction between turns and returns an `AutoCompactionEvent { removed_message_count }`. This event is attached to `TurnSummary.auto_compaction` and only surfaces in the post-turn struct returned by `run_turn()`. It is not emitted as a streaming SSE event at the moment compaction occurs.

**Gap:**
- A claw monitoring the SSE stream during a long multi-turn session cannot detect that compaction fired until the final `TurnSummary` JSON arrives (or, in JSON output mode, until the CLI prints the final response envelope)
- Between compaction firing and the final summary, the session history has already been truncated, so any mid-turn state the claw was tracking against the old history is now stale
- No `session_compacted` or `auto_compaction` SSE event exists; the classifier's event taxonomy has no entry for it
- `claw doctor` cannot surface "this session has been auto-compacted N times" or "compaction removed M messages in the last turn"
- Claws relying on replay-by-session-history for context reconstruction silently receive a shorter history with no notification

**Repro:**
```
# Run a session long enough to trigger auto-compaction
# Monitor the SSE stream during the turn
# Observe: no event with kind=session_compacted or similar appears in the stream
# The only signal is the post-turn auto_compaction field in the JSON summary
# If running in interactive TUI mode, the only signal is the printed compaction notice
```

**Expected:**
- When `maybe_auto_compact()` removes messages, emit a streaming SSE event immediately: `{ "kind": "session_compacted", "removed_message_count": N, "retained_message_count": M, "trigger": "auto" }`
- `claw doctor` surfaces sessions with auto-compaction history and message counts
- Classifier recognizes `session_compacted` as a first-class event kind
- `/compact` manual command similarly emits this event (currently only prints a user-facing string)

**Fix sketch:**
1. Add `session_compacted` to the `StreamEvent` enum (or as a diagnostic event channel alongside it)
2. In `maybe_auto_compact()`, after compaction, push a `session_compacted` event through the event channel before continuing the turn
3. Expose count in the event: `{ kind: "session_compacted", removed_message_count: N }`
4. Wire manual `/compact` command to emit the same event
5. Add to classifier event taxonomy and `claw doctor` output

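
Steps 2 to 4 can be pictured with a short Python sketch in which the automatic and manual paths share one emitting helper. It makes no claim about `conversation.rs`: the message-count threshold, the `keep_last` policy, the `compact` helper, and the `emit` callback are all assumptions chosen to keep the example self-contained.

```python
def compact(messages, keep_last, trigger, emit):
    """Drop all but the last keep_last messages and emit an event immediately.

    Hypothetical helper shared by the automatic and manual paths so that both
    produce the same session_compacted event shape.
    """
    removed = max(len(messages) - keep_last, 0)
    if removed == 0:
        return messages  # nothing removed, so no event
    retained = messages[removed:]
    emit({
        "kind": "session_compacted",
        "removed_message_count": removed,
        "retained_message_count": len(retained),
        "trigger": trigger,
    })
    return retained


def maybe_auto_compact(messages, threshold, keep_last, emit):
    """Compact only once the history grows past threshold (assumed policy)."""
    if len(messages) <= threshold:
        return messages
    return compact(messages, keep_last, trigger="auto", emit=emit)
```

Emitting inside the shared helper, at the moment the history changes, is what lets a stream consumer invalidate stale state before the post-turn summary arrives. A manual `/compact` handler would call `compact(..., trigger="manual", emit=...)` directly.
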
**Status:** Open. No code changed. Filed 2026-04-25 07:47 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: 0730183.

🪨