mirror of
https://github.com/instructkr/claw-code.git
synced 2026-07-01 14:18:57 -04:00
Compare commits
11 Commits
docs/roadm
...
2030540985
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
2030540985 | ||
|
|
a1bfcd4110 | ||
|
|
c49839bb1f | ||
|
|
f65b2b4f0e | ||
|
|
f4b74e89dd | ||
|
|
5856913104 | ||
|
|
d45a0d2f5b | ||
|
|
dc47482e40 | ||
|
|
9537c97231 | ||
|
|
f56a5afcf7 | ||
|
|
3efaf551ed |
11
ROADMAP.md
11
ROADMAP.md
@@ -6272,6 +6272,15 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
|
||||
|
||||
328. **`claw agents help` omits the `.codex/agents` roots that `claw agents` actually loads from, so native-agent discovery provenance is misleading** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `ee85fed6` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` then reported embedded `git_sha` `ee85fed6`, matching the workspace. Running `./rust/target/debug/claw agents help --output-format json` returned `usage.sources = [".claw/agents", "~/.claw/agents", "$CLAW_CONFIG_HOME/agents"]`, with no `.codex/agents` or `~/.codex/agents` entry. In the same environment, `./rust/target/debug/claw agents --output-format json` listed native agents such as `analyst` with source `{id: "user_claw", label: "User home roots"}` even though `/home/bellman/.claw/agents` does not exist and `/home/bellman/.codex/agents/analyst.toml` does exist. The agents lifecycle surface therefore documents one set of roots while loading from another, and the loaded-agent provenance collapses the real Codex root behind a generic `user_claw` label. This is distinct from #327's MCP source-list mismatch: the affected subsystem is native-agent discovery, where claws choose delegation/staffing lanes from `claw agents` and need to know which root supplied each agent. **Required fix shape:** (a) derive `agents help` source roots from the same registry/search path used by the agent loader instead of a hard-coded `.claw`-only list; (b) include all supported native-agent roots in stable order, including project/user `.codex/agents` roots alongside `.claw/agents` and `$CLAW_CONFIG_HOME/agents`; (c) make each `agents --output-format json` entry expose non-secret source provenance precise enough to distinguish `user_codex`, `project_codex`, `user_claw`, and `project_claw` (without leaking unnecessary absolute paths); (d) add a regression proving an agent loaded from `~/.codex/agents` is accompanied by help-source metadata naming that root and per-agent provenance that does not mislabel it as generic `user_claw`. **Why this matters:** agent selection is a control-plane decision. If help says only `.claw/agents` are searched while the runtime actually consumes `.codex/agents`, claws and operators can edit the wrong directory, misdiagnose missing/stale agents, or trust the wrong ownership boundary for delegated work. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed `agents help` omitting `.codex/agents` while `agents` loaded `analyst` from the existing `/home/bellman/.codex/agents/analyst.toml` with no `/home/bellman/.claw/agents` directory present.
|
||||
329. **Resume-safe slash `/agents --output-format json` downgrades structured agent inventory into prose even though top-level `claw agents --output-format json` returns machine-readable entries** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `0f7578c0` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `0f7578c0`, matching the workspace. Running `./rust/target/debug/claw --resume latest /agents --output-format json` returned only `{"kind":"agents","text":"Agents\n 20 active agents..."}`: the agent names, source ids, models, reasoning effort, active/shadowed state, and working-directory context are all flattened into one human prose string. In the same rebuilt binary and same workspace, `./rust/target/debug/claw agents --output-format json` returned a structured object with top-level `agents[]`, `count`, `summary`, `working_directory`, and per-agent fields such as `name`, `description`, `model`, `reasoning_effort`, `active`, `shadowed_by`, and `source`. The resume-safe slash surface therefore looks JSON-shaped while throwing away exactly the structured inventory that automation needs, and it diverges from the already-existing top-level command schema. This is distinct from #325's broad help JSON opacity and #328's source-root mismatch: the pinpoint is the `/agents` slash command losing structured inventory in resume mode even though the non-slash agents command already has it. **Required fix shape:** (a) make resume-safe `/agents --output-format json` reuse the same serializer/schema as `claw agents --output-format json` instead of wrapping rendered text; (b) preserve per-agent source/provenance fields, model/reasoning metadata, active/shadowed state, count/summary, and working-directory context; (c) keep `text` or `message` as an optional human summary only, not the sole payload; (d) add regression coverage proving top-level `claw agents --output-format json` and resume-safe `/agents --output-format json` expose equivalent structured agent inventory for the same workspace. **Why this matters:** `/agents` is the in-session delegation/staffing truth surface. Claws operating through `--resume latest` need to choose agents without scraping prose; losing structure at the slash boundary makes automated staffing brittle and contradicts the top-level command contract. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed slash `/agents` JSON had only `kind,text` while top-level `agents` JSON had `agents[]` and provenance metadata.
|
||||
337. **`status` / `/session list --output-format json` session lifecycle reports `workspace_dirty: true` and `abandoned: true` but omits dirty-file detail and abandonment cause, making automated GC unable to distinguish live work from crash leftovers** — restored from PR #2852 / Jobdori dogfood on current main (`0f7578c`). The evidence bundle listed 10 sessions and every listed session had `workspace_dirty: true` plus `abandoned: true`; each lifecycle object exposed `abandoned: true`, `kind: "saved_only"`, `pane_id: null`, and `workspace_dirty: true`, but did not include `dirty_file_count`, `dirty_file_paths` / summary, or `abandoned_reason`. That leaves cleanup policy with only a boolean dirty/abandoned pair: it cannot tell whether a saved-only session contains intentional uncommitted user work, a harmless stale pane artifact, or crash leftovers that are safe to collect. **Required fix shape:** (a) add `dirty_file_count: u32` to session lifecycle/status payloads whenever dirty state is evaluated; (b) add an `abandoned_reason` enum such as `pane_closed`, `process_killed`, `session_replaced`, `workspace_missing`, or `unknown` instead of a bare boolean-only abandonment signal; (c) optionally add summarized `dirty_file_paths` / `dirty_file_summary` with truncation metadata so automation can present useful evidence without leaking excessive path detail; (d) add regression coverage proving dirty abandoned saved-only sessions include file count, abandonment reason, and stable behavior when path summaries are omitted or truncated. **Why this matters:** session GC must not delete live user work, but it also cannot leave every crash leftover forever. A lifecycle object that says only `workspace_dirty: true` and `abandoned: true` forces cleanup tooling to guess instead of applying a safe policy from structured evidence. Source: PR #2852 / Jobdori dogfood; all 10 listed sessions shared the same dirty+abandoned shape, and the sample lifecycle object had `abandoned: true`, `kind: "saved_only"`, `pane_id: null`, `workspace_dirty: true`, with no dirty-file count, path summary, or abandonment reason.
|
||||
|
||||
338. **Top-level `help --output-format json` and resume-safe `/help --output-format json` use different payload fields for the same help surface (`message` vs `text`)** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `24ccb59b` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `24ccb59b`, matching the workspace. Running `./rust/target/debug/claw help --output-format json` returned a valid JSON object with keys `kind,message`, while `./rust/target/debug/claw --resume latest /help --output-format json` returned the same conceptual help surface with keys `kind,text`. Both are prose-only help payloads, but automation now has to special-case whether help was reached through the top-level command dispatcher or the resume-safe slash dispatcher before it can even locate the rendered help body. This is distinct from #325's broader structured-schema absence: the pinpoint here is a concrete JSON field-name contract drift between two help entrypoints that should be equivalent or explicitly versioned. **Required fix shape:** (a) define one canonical help JSON body field such as `message` or `text` and use it consistently across top-level `help`, slash `/help`, and resume-safe `/help`; (b) if backward compatibility requires both fields temporarily, emit both with identical contents plus a `schema_version` and deprecation metadata; (c) add regression coverage proving `claw help --output-format json` and `claw --resume latest /help --output-format json` expose the same top-level field contract and `kind=help`; (d) document whether slash-command JSON is intended to share schemas with top-level command JSON or carry its own explicit schema namespace. **Why this matters:** help JSON is the bootstrap discoverability surface for claws. If the same help concept moves its body between `message` and `text` depending on invocation path, every orchestrator needs brittle per-entrypoint parsers before it can inspect commands, flags, or resume safety. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed top-level help JSON keys `kind,message` and resume-safe slash help JSON keys `kind,text` on the same rebuilt binary.
|
||||
|
||||
337. **`/session list --output-format json` session lifecycle objects expose `workspace_dirty: true` but provide no detail about which files are dirty or how many, making the flag unactionable for GC/cleanup decisions** — dogfooded 2026-04-29 by Jobdori on current main (`0f7578c`). Running `claw --output-format json --resume latest /session list` returns `session_details` where every `lifecycle` object for non-current sessions shows `"workspace_dirty": true, "pane_id": null` — and no `dirty_files`, `dirty_file_count`, `git_status`, or `uncommitted_changes` field. Additionally `"abandoned": true` appears on most saved sessions with no definition of the abandonment criterion. A session GC policy cannot determine whether dirty state is intentional (user edits) or stale (crash leftover), nor how many files are affected, nor what caused the abandonment. **Required fix shape:** (a) add `dirty_file_count: u32` to lifecycle objects when `workspace_dirty = true`; (b) optionally add `dirty_file_paths` (summarized); (c) add `abandoned_reason: "pane_closed" | "process_killed" | "session_replaced" | null`; (d) regression coverage. Source: Jobdori live dogfood on mengmotaHost, claw-code `0f7578c`, 2026-04-29.
|
||||
339. **`/session delete` is not resume-safe while `/session list` is, making session GC impossible from `--resume` mode automation** — dogfooded 2026-04-30 by Jobdori on `24ccb59`. Running `claw --output-format json --resume latest /session list` succeeds and returns the full session list (10 sessions, all `workspace_dirty: true, abandoned: true`). Running `claw --output-format json --resume latest /session delete <id>` returns `{"command":"...","error":"unsupported resumed slash command","type":"error"}` — again using `"type"` not `"kind"` (#336 vocab violation). An automation lane that discovers abandoned sessions via `--resume` cannot delete any of them via the same path; it must spawn an interactive REPL session just to issue delete, breaking the machine-readable JSON surface contract. **Required fix shape:** (a) mark `/session delete` as resume-safe; (b) return `{"kind":"session_deleted","session_id":"<id>","path":"<deleted_path>"}` on success; (c) require `--force` only for dirty/active sessions; (d) add regression coverage. Source: Jobdori live dogfood, mengmotaHost, `24ccb59`, 2026-04-30.
|
||||
|
||||
340. **Resume-safe `/session help --output-format json` writes its primary JSON error envelope to stderr and uses `type` instead of the session JSON `kind` vocabulary** — dogfooded 2026-04-29 on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `dc47482e`. Running `./rust/target/debug/claw --resume latest /session help --output-format json` wrote no stdout bytes, but wrote a JSON error object to stderr: `{"command":"/session help","error":"Unknown /session action ...","type":"error"}`. Meanwhile `/session list --output-format json` wrote valid stdout JSON with `kind=session_list`. The JSON output contract is therefore split across stderr for an error/help-ish action and switches vocabulary from `kind` to `type`; automation that reads stdout sees empty/non-JSON output and cannot handle errors consistently with successful session JSON responses. **Required fix shape:** (a) all `--output-format json` command responses, including resumed slash errors, should emit the primary JSON envelope on stdout; (b) use `kind:"error"` or a documented error schema consistently instead of an ad hoc `type` field; (c) reserve stderr prose for text mode or optional non-primary diagnostics, not the machine-readable envelope; (d) add a regression for `/session help` or an unsupported `/session` action under `--resume` proving stdout contains the structured JSON error envelope and stderr does not carry the only parseable payload. **Why this matters:** claws need one stdout JSON contract for both success and failure. If a help-ish session error is silently moved to stderr and shaped differently from `session_list`, orchestration lanes cannot distinguish an unsupported action from transport corruption or an empty response without bespoke stderr parsing. Source: gaebal-gajae dogfood follow-up for the 15:30 nudge on rebuilt `./rust/target/debug/claw` `dc47482e`.
|
||||
|
||||
341. **Resume-safe `/tasks --output-format json` emits an unsupported-command JSON error only on stderr and mixes `kind` with `type` classification vocabularies** — dogfooded 2026-04-29 for the 16:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `58569131`. Running `./rust/target/debug/claw --resume latest /tasks --output-format json` wrote no stdout bytes, but wrote a JSON error object to stderr: `{"command":"/tasks","error":"/tasks is not yet implemented in this build","kind":"unsupported_command","type":"error"}`. The unsupported command envelope therefore has two separate top-level classification vocabularies (`kind=unsupported_command` and `type=error`) and places the only parseable payload on stderr, while successful JSON commands use stdout and a `kind`-only classification. This is distinct from #340 because it is not session help; it shows implemented-but-unsupported command stubs can emit a dual-vocabulary error envelope. **Required fix shape:** (a) in `--output-format json` mode, emit the primary JSON envelope on stdout for unsupported resumed slash commands such as `/tasks`; (b) document and use one error discriminator, preferably `kind:"error"` plus `code:"unsupported_command"`, or `kind:"unsupported_command"` plus `status:"error"`, but not `type`; (c) reserve stderr for non-primary diagnostics or text-mode prose, never as the sole JSON payload; (d) add regression coverage for `/tasks` under `--resume` with JSON output proving stdout contains the structured error envelope, stderr is not the only parseable stream, and the envelope uses the documented single-vocabulary discriminator. **Why this matters:** claws need the same stdout JSON contract for implemented successes and implemented-but-unsupported stubs. If `/tasks` errors can silently move to stderr and advertise both `kind` and `type`, automation must special-case command stubs instead of applying one JSON error parser. Source: gaebal-gajae dogfood follow-up for the 16:00 nudge on rebuilt `./rust/target/debug/claw` `58569131`.
|
||||
342. **Resume-safe `/commands --output-format json` is rejected as an unknown slash command even though the error points users at `/help` for slash-command discovery, leaving no structured command-index alias** — dogfooded 2026-04-29 for the 16:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `f65b2b4f`. Running `./rust/target/debug/claw --resume latest /commands --output-format json` wrote no stdout bytes and emitted only stderr JSON: `{"command":"/commands","error":"Unknown slash command: /commands\n Help /help lists available slash commands","type":"error"}`. In the same rebuilt binary, `./rust/target/debug/claw --resume latest /help --output-format json` succeeded on stdout but exposed only prose keys `kind,text`. The discoverability path therefore has two gaps at once: the intuitive `/commands` index/alias is unavailable, and the fallback suggestion is buried inside an error string rather than surfaced as structured `suggested_command` / `discovery_command` metadata. This is distinct from #340 and #341: the pinpoint is not merely stderr-only JSON error placement, but the absence of a machine-readable slash-command discovery alias/index and typed correction guidance when users or claws try the natural `/commands` form. **Required fix shape:** (a) either implement `/commands` as a resume-safe alias for slash-command discovery or return a typed `unknown_command` JSON envelope with `suggested_command:"/help"` and `discovery_command:"/help"` fields; (b) make the primary JSON error envelope follow the stdout JSON contract and single-discriminator schema from #340/#341; (c) expose structured slash-command inventory from the discovery surface rather than requiring callers to scrape `text`; (d) add regression coverage proving `/commands --output-format json` either returns the structured command inventory or returns a structured correction that automation can follow without parsing prose. **Why this matters:** claws need a predictable way to discover valid slash commands before invoking them. If the natural command-index spelling fails with stderr-only JSON and a human-formatted hint, orchestration has to guess, parse prose, and special-case command discovery before it can even learn the supported command surface. Source: gaebal-gajae dogfood follow-up for the 16:30 nudge on rebuilt `./rust/target/debug/claw` `f65b2b4f`.
|
||||
|
||||
347. **`/session list --output-format json` returns `"title": null` for every session; sessions have no human-readable label and can only be distinguished by their timestamp-based ID** — dogfooded 2026-04-30 by Jobdori on `a1bfcd4`. Running `/session list --output-format json` returns `session_details` where every entry has `"title": null`. The `id` field is a machine-generated string such as `session-1776891003038-0` (millisecond epoch). There is no `/session rename <id> <title>` command, no auto-derived title (e.g. from the first user message or the git branch at session start), and no way to programmatically query "which session was working on feature X" without replaying the message history. In a multi-session workflow with 10+ sessions, an orchestration layer cannot build a human-readable session picker without fetching and reading message content from every session file. **Required fix shape:** (a) auto-derive `title` from the first user message (first N chars, truncated) or from the git branch name at session creation time; (b) expose a `/session rename <id> <title>` slash command so users can label sessions manually; (c) persist title in the session store so it survives restarts; (d) include the `title` field in `session_details` JSON response — even if the value is auto-derived and mutable; (e) add `created_at_ms` (tracked in #335) alongside title so sessions are fully sortable and labelable without reading message content. Source: Jobdori live dogfood, mengmotaHost, `a1bfcd4`, 2026-04-30.
|
||||
|
||||
Reference in New Issue
Block a user