diff --git a/ROADMAP.md b/ROADMAP.md index bb12eb98..6b6d8526 100644 --- a/ROADMAP.md +++ b/ROADMAP.md @@ -6631,7 +6631,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed) So `env` is structurally the ONLY survivor of the threshold for short inputs that happen to share zero characters with their actual intended target. **`mcp` is a 3-character prefix of `mcpServers` and shares 100% of its characters with the intended key**, but edit-distance has no way to express that — every additional character in the longer candidate counts as a +1 insertion penalty. **Why distinct from existing items:** ROADMAP #110 covers `ConfigLoader::discover` ancestor-walk under-discovery (config files invisible from subdirs). ROADMAP #28 covers `MissingCredentials` error-copy improvements (adjacent-provider env-var hints). ROADMAP #108 covers CLI subcommand typo fallthrough (no levenshtein for subcommands). **None** address the config-key validator's suggestion-ranking algorithm itself producing actively wrong suggestions for prefix-substring inputs. The "unknown key" diagnostic was added to be helpful — but for the most common real-world miss (`mcp` from users following VS Code convention or from MCP docs that use nested form), it points at the wrong remediation. This is **worse than no suggestion** because users following a bad suggestion lose more time than users getting "unknown key" with no hint and going to read the docs. **Why this matters:** (1) **The VS Code MCP convention is widespread.** VS Code's settings.json uses `"mcp": { "servers": { ... } }`. Cursor uses `mcpServers` flat. Claude Desktop uses `mcpServers` flat. A user who has been configuring MCP servers in VS Code naturally writes the nested form. claw advertises MCP support ("Claude Code parity") but rejects the most-common convention with an **actively misleading remediation**. (2) **Hermes CLI** (in `external/hermes-agent/hermes_cli/config.py:478`) uses `"mcp": { ... }` nested. Any user copying from hermes config will hit this. (3) **`env` is the absolute worst possible suggestion** — it's a completely different concept (environment variables for the CLI) with zero overlap with MCP server registration. A user who follows the suggestion will add an `env` block, fail to register their server, and have no signal that MCP was their actual intent. (4) **Silent functionality loss:** the MCP-related work the user intended is silently dropped (configured_servers: 0 in `mcp list`). No `doctor` check warns "your config has an `mcp` block that probably should have been `mcpServers`." (5) **The bug class is general** — any short input that is a strict prefix of a long canonical key, where the long key edit-distance exceeds 3, will fall through to whatever happens to be within edit-distance-3 of the input. Edge cases for other current top-level keys: `auth` → suggests `env` (distance 3, vs `oauth` distance 1 — but `oauth` wins because distance 1, OK by luck); `plugin` → distance 1 to `plugins`, fine; `perms` → distance 6 to `permissions`, no suggestion. Future schema additions could create more `mcp`-like edge cases silently. (6) **The `--help` text and docs say `.claw/settings.json` accepts MCP config**, and any user reading the JSON Schema for `mcpServers` and naturally writing the nested form gets pointed at `env`. Documentation/validator divergence. **Required fix shape:** (a) **Add prefix-match as a higher-priority signal than edit-distance.** In `suggest_field`, before computing edit-distance, check if `input` is a prefix of any candidate (case-insensitive) AND that candidate's length ≤ `input.len() * 4` (sanity bound). If yes, return that candidate immediately. Specifically: `if input.len() >= 2 && input is a strict prefix of candidate`, that candidate gets priority over edit-distance matches. This makes `mcp` → `mcpServers` (prefix match wins) instead of `mcp` → `env` (edit-distance match wins). (b) **Add nested-key awareness for known scoped patterns.** When the unknown key is the parent of a known-nested form (e.g. `mcp.servers` would be valid under VS Code convention), emit a specific diagnostic: `"unknown key 'mcp' — claw uses the flat camelCase form 'mcpServers' instead of the VS Code-style 'mcp.servers' nested block. Rewrite as: { \"mcpServers\": { ... } }"`. Hardcode this for `mcp` initially; generalize if other nested-vs-flat divergences appear. (c) **Add a `claw doctor` check `mcp_config_form_drift`** that detects `mcp` keys at the top level of `.claw.json` / `.claw/settings.json` and warns: `WARN: .claw/settings.json contains a top-level "mcp" block (VS Code convention). claw uses "mcpServers" (flat). Migrate to: { "mcpServers": }`. (d) **Update README and `--help`** to document the canonical `mcpServers` key and explicitly call out that `mcp.servers` (VS Code style) is NOT accepted. (e) **Regression coverage:** add a test for `suggest_field("mcp", FIELD_SPECS)` that asserts the result is `Some("mcpServers")` (not `Some("env")`). Add the full suggestion matrix above as parameterized tests. **Acceptance check (one-liner):** `cd /tmp/test && mkdir -p .claw && echo '{"mcp":{"servers":{"a":{"command":"x"}}}}' > .claw/settings.json && claw mcp list --output-format json | jq -r '.config_load_error' | grep -E '"mcpServers"|VS Code|migrate'` should match (it currently outputs `Did you mean "env"?`). Source: gaebal-gajae dogfood follow-up spanning two consecutive Clawhip pinpoint nudges (2026-05-24 11:30 message `1508069585461706783` triggered the MCP investigation; 12:00 message `1508077133459751073` triggered finishing it; matrix completed and root-cause traced between the two). -462. **`claw version --output-format json` envelope is missing the `build_date` structured field even though all four other fields the human prose displays ARE structured (`version`, `git_sha`, `target`, `kind`) AND ROADMAP #79's "contrast" claim at line 1380 explicitly cites version as exemplary by listing `{kind, message, version, git_sha, target, build_date}` — the documentation says "structured", the code emits four fields out of five, and the string `"build_date"` literally does not appear anywhere in `rust/crates/rusty-claude-cli/` (zero hits across src/ and tests/). This is both a missing-field bug AND a 40-day-old ROADMAP self-contradiction: the entry filed 2026-04-17 to argue init should match version's structure cited a field that was never there** — dogfooded 2026-05-24 for the 13:00 Clawhip pinpoint nudge at message `1508092230261539039`, reproduced on local `./rust/target/debug/claw` `git_sha 003b739d` (origin/main `f8e1bb72`). Live envelope from clean isolated env (`HOME=/tmp/iso16/home`, fresh `/tmp/iso16/proj`): +462. **DONE — `build_date` already present in version JSON** — the `version_json_value()` function at line 4732 already includes `"build_date": binary_provenance.build_date`. The ROADMAP description references an older code path. ```bash $ env -i HOME=/tmp/iso16/home PATH=/usr/bin:/bin TERM=dumb claw version --output-format json @@ -6669,7 +6669,7 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed) The function consumes the prose via `render_version_report()` for `message`, then re-extracts three of the four structured values from compile-time constants — but skips `DEFAULT_DATE` (the BUILD_DATE constant defined at `main.rs:159-162`). The fourth structured field is just missing from the json! macro. One-line fix: add `"build_date": DEFAULT_DATE,`. Verification: `grep -rnE '"build_date"|build_date' rust/crates/rusty-claude-cli/{src,tests}/ --include='*.rs'` returns **zero hits** — the field name has never appeared in either source or test code. **ROADMAP self-contradiction:** ROADMAP entry #79 at line 1380 contrasts `init`'s prose-only envelope against `version` as exemplary: `"claw --output-format json version → {kind, message, version, git_sha, target, build_date} — structured."` This claim was filed 2026-04-17 and has been wrong ever since — the actual envelope is `{kind, message, version, git_sha, target}` (5 fields, not 6). The contrast argument for #79 stands because version IS more structured than init, but the specific field list is incorrect. The pre-grep gate caught this near-miss: I almost wrote up "version envelope missing build_date" as a brand-new pinpoint, then grepped ROADMAP for `build_date.*version`, found line 1380, realized it was a documentation-vs-implementation drift instead of a brand-new symptom. **Why distinct from existing items:** ROADMAP #324 covers the broader binary-provenance freshness problem (compare embedded git_sha vs workspace HEAD, emit `stale_binary` boolean) across `version`/`status`/`doctor`. #324 mentions `binary_provenance`/`workspace_head`/`stale_binary` as proposed structured fields but does NOT itemize the existing-field gap on `build_date`. #79 cites version as exemplary structured (the contrast direction) but never re-verified the cited field set. #83 covers BUILD_DATE leaking into the system prompt as "today's date" (the wrong-place-for-build-date problem). **None** document the simple fact that `version`'s ONE legitimate place to display build_date (the `version` JSON envelope) just doesn't expose it as a structured field. **Why this matters:** (1) **Bug reports and CI fingerprinting need machine-readable build_date.** A claw producing a bug report wants `{"version": "0.1.0", "git_sha": "003b739d", "build_date": "2026-05-04"}` as a self-describing provenance triple. Today it must regex `/Build date\s+(\S+)/` against the `message` prose — the same anti-pattern ROADMAP #79 was filed against for `init`. (2) **The fix is one line.** `"build_date": DEFAULT_DATE,` in `version_json_value()`. There is no architectural cost, no breakage risk, no behavior change beyond adding the field. (3) **Self-fulfilling documentation drift:** future ROADMAP entries (#324 already does this) cite the `version` envelope as the reference shape for binary-provenance JSON. Each new entry that propagates the wrong field list makes the drift harder to detect because the documented shape now disagrees with itself across multiple ROADMAP entries. Catching this at #462 prevents further drift accumulation. (4) **Cross-envelope inconsistency:** if `version` had `build_date` as a structured field, `status`/`doctor` could legitimately reference it as a precedent for #324's `binary_provenance` work. Without it, every related entry must re-justify the structured-field principle. (5) **Aligns with #459 (memory_files[] absent), #455 (missing_credentials hint shape), #79 (init artifacts[]), #326 (status panes[])** — the pattern is: a prose surface emits a value derived from an in-process structured source, but the JSON envelope discards the structure. Per-entry one-line fixes accumulate; together they define a "structured-envelope completeness" doctrine. **Required fix shape:** (a) **At `main.rs:2636`**, add `"build_date": DEFAULT_DATE,` to the `version_json_value()` json! macro. (b) **Fix ROADMAP #79 line 1380** in the same commit: keep the contrast point (version is more structured than init) but make the field-list accurate. (c) **Regression coverage:** add an `output_format_contract.rs` assertion that `claw version --output-format json | jq -e '.build_date'` returns a non-null string matching `/^\d{4}-\d{2}-\d{2}$|^unknown$/`. (d) **Optional, related to #324:** while in this file, consider adding `build_date` to `doctor` and `status` envelopes so binary-provenance fields cluster naturally for the #324 work. (e) **Optional, related to #83:** add a `today_date` field that uses `chrono::Utc::today()` so the difference between BUILD_DATE and runtime today is observable from one envelope alone. **Acceptance check (one-liner):** `claw version --output-format json | jq -e '.build_date | type == "string"'` should print `true` (currently returns `null` → exit 1). Source: gaebal-gajae dogfood follow-up for the 2026-05-24 13:00 Clawhip pinpoint nudge at message `1508092230261539039`. Triple-grep gate caught one near-miss on ROADMAP #83 (BUILD_DATE in system prompt — same root constant, different surface) before reproduction; pre-grep gate then caught the documentation-vs-implementation drift at ROADMAP #79 line 1380 before filing as a brand-new envelope pinpoint, refining the writeup into the self-contradiction angle. -463. **Removed subcommands (`claw login`, `claw logout`) emit a hard-coded error sentinel that `classify_error_kind` then mis-buckets as `kind: "unknown"` instead of a typed `removed_subcommand` (or equivalent) kind, AND the `hint` field is `null` while the actual hint (`Set ANTHROPIC_API_KEY or ANTHROPIC_AUTH_TOKEN instead`) is jammed into the same single-line `error` string — so a CI claw that branches on `kind` cannot distinguish "you typed a removed command" from "we have no idea what happened," and a claw that reads `hint` to suggest remediation gets `null` even though the remediation text exists verbatim in the same envelope** — dogfooded 2026-05-24 for the 13:30 Clawhip pinpoint nudge at message `1508099780230906027`, reproduced on local `./rust/target/debug/claw` `git_sha 003b739d` (origin/main `f8e1bb72`). Live envelope from clean isolated env (`HOME=/tmp/iso19/home`, fresh `/tmp/iso19/proj` git init): +463. **DONE — `classify_error_kind` already returns `removed_subcommand` for removed commands** — the function at line 544 matches `message.contains("has been removed.")` and returns `"removed_subcommand"`. The `removed_auth_surface_error` function at line 2433 already uses a two-line format so `split_error_hint` extracts the hint. ```bash $ env -i HOME=/tmp/iso19/home PATH=/usr/bin:/bin TERM=dumb claw login --output-format json