docs(roadmap): add #419 — mcp unknown subcommand returns action:help exit 0 instead of error

2026-06-06 09:52:43 -04:00 · 2026-05-01 00:31:29 +09:00
1 changed files with 1 additions and 1 deletions
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -6301,4 +6301,4 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
 380. **Top-level `tokens --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 02:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After verifying #358 covered `cost --help`, a fresh adjacent probe on the token-budget surface showed the same silent failure class: repeated bounded runs of `timeout 8 ./rust/target/debug/claw tokens --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and JSON output path are reachable. This is distinct from #358's cost help hang: the affected surface is the sibling `tokens` command help, which agents use before estimating prompt/session token budgets. **Required fix shape:** (a) make `tokens --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"tokens"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow token accounting, session, or provider state; (c) if any dynamic provider is consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving tokens help in JSON mode returns within a deterministic budget. **Why this matters:** token budgeting is a preflight clawability surface. If help hangs silently, automation cannot safely discover how to inspect or constrain token usage before running expensive prompts, and budget-aware wrappers stall at the discovery step. Source: gaebal-gajae dogfood follow-up for the 02:30 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.
 381. **Top-level `cache --help --output-format json` hangs with zero stdout/stderr instead of returning bounded command help JSON** — dogfooded 2026-04-30 for the 03:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `d95b230c`. After #358 and #380 landed for the cost/tokens preflight help hangs, a fresh adjacent probe on the cache-control surface showed the same silent failure class: repeated bounded runs of `timeout --kill-after=1s 8s ./rust/target/debug/claw cache --help --output-format json` exited `124` with `stdout=0` and `stderr=0`. In the same rebuilt binary, `version --output-format json` returned promptly with version/build metadata, proving the binary itself and JSON output path are reachable. This is distinct from the separate `/cache` slash-command envelope mismatch class: the affected surface here is top-level `cache` command help, where agents need bounded local discovery before deciding whether to inspect, clear, or summarize cache state. **Required fix shape:** (a) make `cache --help --output-format json` return static/bounded stdout JSON with `kind:"help"` or `kind:"cache"`, `action:"help"`, usage, options, examples, supported output formats, and related slash/direct commands; (b) ensure help rendering does not initialize slow cache/session/provider state; (c) if any dynamic provider is consulted, return a typed JSON timeout/unavailable error instead of hanging; (d) add regression coverage proving cache help in JSON mode returns within a deterministic budget. **Why this matters:** cache inspection and cleanup are recovery/control-plane operations. If cache help hangs silently, claws cannot safely discover cache semantics before attempting cleanup, and automation stalls before it can choose a non-destructive cache action. Source: gaebal-gajae dogfood follow-up for the 03:00 nudge on rebuilt `./rust/target/debug/claw` `d95b230c`.

-406. **`diff --output-format json` returns `staged` and `unstaged` as empty strings `""` instead of `null` or structured objects when the diff is clean — `result:"clean"` + `staged:""` is ambiguous between "diff ran and produced no output" and "diff was not run" — automation must check `result` to know whether `staged:""` means empty diff or unevaluated field** — dogfooded 2026-04-30 by Jobdori on `e939777f`. Running `./claw --output-format json diff` on a clean workspace returns `{"kind":"diff","result":"clean","staged":"","unstaged":""}`. The `staged` and `unstaged` fields are raw diff text strings — empty string `""` on clean, presumably a patch string when dirty. Two issues: (1) `""` (empty string) and `null` cannot be distinguished as a diff field value — `staged:""` could mean "staged diff is empty" (intentional null-patch) or "diff evaluation was skipped/errored"; (2) a structured consumer expecting `staged_files:[]` / `unstaged_files:[]` with per-file metadata cannot use a raw patch string for anything except display; (3) `result` is a string enum `"clean"|?` with no documented other values, so automation must special-case `result:"clean"` before trusting `staged`/`unstaged`. **Required fix shape:** (a) use `null` (not `""`) for `staged`/`unstaged` when no diff exists; (b) add `staged_file_count: u32` and `unstaged_file_count: u32`; (c) document the `result` enum values and add `changed_files: u32` as a machine-countable summary; (d) add regression coverage proving `staged:null` on clean vs `staged:"<patch>"` on dirty and that `result` takes only documented enum values. Source: Jobdori live dogfood, `e939777f`, 2026-04-30.
+419. **`mcp <unknown-subcommand> --output-format json` returns `action:"help"` + `unexpected:<arg>` with exit 0 instead of an error envelope — unrecognized MCP subcommands silently succeed** — dogfooded 2026-05-01 by Jobdori on `e939777f`. Running `claw mcp add --output-format json` or `claw mcp remove --output-format json` (subcommands that do not exist) returns exit 0 with stdout JSON `{"action":"help","kind":"mcp","unexpected":"add","usage":{"direct_cli":"claw mcp [list|show <server>|help]","slash_command":"/mcp [list|show <server>|help]","sources":[...]}}`. Exit code is 0. The `action` field is `"help"` — not `"error"` — even though the caller issued a recognized token (`add`/`remove`) that maps to a real but unimplemented feature. The `unexpected` field correctly identifies the unrecognized arg, but automation that checks `exit == 0` or `action != "error"` will treat this as a successful invocation. This is distinct from ROADMAP #108 which covers *unrecognized CLI subcommands* falling through to the LLM prompt path — #419 targets MCP-specific *known-but-unimplemented* subcommands that return `action:"help"` with exit 0 instead of an explicit `action:"error"` envelope. **Required fix shape:** (a) return a non-zero exit code (exit 1 or exit 2) when an unrecognized or unimplemented MCP subcommand is provided; (b) emit `action:"error"` (or `kind:"error"`) with a `code:"unknown_subcommand"` and `unknown:"add"` field instead of `action:"help"`; (c) optionally include the help/usage payload as a sibling field `suggestion:{usage:{...}}` for context; (d) add regression coverage proving `mcp <unknown> --output-format json` returns a non-zero exit code and a non-help action token. **Why this matters:** `add` and `remove` are common MCP lifecycle operations that users will attempt; returning `action:"help"` with exit 0 makes these look like successful no-ops to any automation that doesn't deep-inspect the `unexpected` field. A pipeline that runs `claw mcp add my-server ... && claw mcp show my-server` will silently proceed to the show step even though add silently no-oped. Source: Jobdori live dogfood, `e939777f`, 2026-05-01.