mirror of
https://github.com/instructkr/claw-code.git
synced 2026-07-01 06:09:04 -04:00
Compare commits
27 Commits
docs/roadm
...
2030540985
| Author | SHA1 | Date | |
|---|---|---|---|
|
|
2030540985 | ||
|
|
a1bfcd4110 | ||
|
|
c49839bb1f | ||
|
|
f65b2b4f0e | ||
|
|
f4b74e89dd | ||
|
|
5856913104 | ||
|
|
d45a0d2f5b | ||
|
|
dc47482e40 | ||
|
|
9537c97231 | ||
|
|
f56a5afcf7 | ||
|
|
3efaf551ed | ||
|
|
30c9b438ef | ||
|
|
587bb18572 | ||
|
|
24ccb59bd2 | ||
|
|
0e8e75ef75 | ||
|
|
0f7578c064 | ||
|
|
213d406cbf | ||
|
|
ee85fed6ca | ||
|
|
3a34d83749 | ||
|
|
981aff7c8b | ||
|
|
c94940effa | ||
|
|
b90875fa8e | ||
|
|
2567cbcc78 | ||
|
|
d607ff3674 | ||
|
|
cdf6282965 | ||
|
|
e7074f47ee | ||
|
|
9468383b67 |
26
ROADMAP.md
26
ROADMAP.md
@@ -6258,3 +6258,29 @@ Original filing (2026-04-18): the session emitted `SessionStart hook (completed)
|
||||
248. **Non-interactive prompt mode can exceed caller timeouts with no in-band startup/API phase event or partial status artifact** — dogfooded 2026-04-29 from live tmux session `claw-code-issue-247-human-fresh-run` after the owner explicitly asked gaebal-gajae to make a fresh session and use `claw-code` directly. The actual `./rust/target/debug/claw` binary was launched via `clawhip tmux new` on current main. `claw doctor --output-format json` and `claw status --output-format json` both succeeded and reported auth/config/workspace ok, but minimal non-interactive prompt calls (`timeout 120 ./rust/target/debug/claw --output-format json --dangerously-skip-permissions "echo hello"` and `timeout 120 ./rust/target/debug/claw --output-format json prompt "Reply with just the word hello"`) both timed out from the outer harness after roughly 150s with only `Command exceeded timeout` visible. There was no machine-readable `api_request_started`, `waiting_for_first_token`, provider/model/base-url identity, retry count, or partial status file/event that would let clawhip distinguish slow provider, network stall, auth/OAuth drift, stream parser hang, or prompt-mode bug. **Required fix shape:** (a) emit structured non-interactive lifecycle events for `startup_ok`, `api_request_started`, `first_byte/first_token`, retry/backoff, and terminal `timeout_or_stall` states; (b) include provider/model/base URL source and auth source category without leaking secrets; (c) support a CLI/request timeout flag or env override that returns a typed JSON error before the outer orchestrator kills the process; (d) write/emit a final partial status artifact on timeout so lane monitors do not have to infer state from a dead process. **Why this matters:** non-interactive prompt mode is the automation path; if it can hang past the caller's timeout while doctor/status are green, claws lose the ability to tell whether startup, auth, transport, provider latency, or stream consumption failed. Source: live session `claw-code-issue-247-human-fresh-run` on 2026-04-29.
|
||||
|
||||
249. **`/issue` advertises GitHub issue creation but never reaches a GitHub/OAuth/auth preflight or creation path, and the non-interactive error suggests unusable resume forms** — dogfooded 2026-04-29 on current main `8e22f757` while chasing the remaining Phase-0 GitHub OAuth blocker. The visible help advertises `/issue [context]` as “Draft or create a GitHub issue from the conversation,” but the actual implementation path only renders a local `Issue` report (`format_issue_report`) and does not invoke `gh`, GitHub API, OAuth, token discovery, browser auth, or even a dry-run/auth-preflight surface. Direct non-interactive use (`./rust/target/debug/claw '/issue dogfood test'`) returns `slash command /issue dogfood test is interactive-only` and suggests `claw --resume SESSION.jsonl /issue ...` / `claw --resume latest /issue ...` “when the command is marked [resume]”, while `/help` does not mark `/issue` as resume-safe and resume dispatch rejects interactive-only commands. That leaves operators with a GitHub-labeled command whose real behavior is neither issue creation nor a clear GitHub OAuth blocker. **Required fix shape:** (a) split the contract explicitly: either rename/copy to “draft issue text” or implement a real `create` path with GitHub auth preflight; (b) surface a machine-readable GitHub auth state (`gh_cli_authenticated`, `github_token_present`, `oauth_required`, `creation_unavailable`) before any issue-create attempt; (c) make the direct-mode error avoid suggesting resume forms for commands not marked resume-safe; (d) add regression coverage proving `/issue` help, direct-mode rejection, resume support flags, and creation/draft behavior agree. **Why this matters:** Phase-0 GitHub OAuth verification cannot complete if the only GitHub issue surface stops at local prose while still advertising creation. Claws need to know whether they are missing GitHub auth, using a draft-only helper, or hitting an unimplemented creation path. Source: gaebal-gajae dogfood cycle in `#clawcode-building-in-public` on 2026-04-29.
|
||||
322. **Config deprecation warnings are emitted to stderr even under `--output-format json`, making JSON output unparseable from combined stdout+stderr capture** — dogfooded 2026-04-29 by Jobdori on current main (`8e22f75`). Running `cargo run --bin claw -- doctor --output-format json 2>&1 | python3 -c "import sys,json; json.loads(sys.stdin.read())"` fails with `Expecting value: line 1 column 1 (char 0)` because a `warning: /path/settings.json: field "enabledPlugins" is deprecated. Use "plugins.enabled" instead` line is emitted to stderr before the JSON body begins. When a caller captures combined output (the common automation pattern: `2>&1`, subprocess `STDOUT | STDERR`, PTY capture, or tmux pane scrape) the warning prefix breaks JSON parse for every downstream consumer. Root cause: `rust/crates/runtime/src/config.rs` line ~300 calls `eprintln!("warning: {warning}")` unconditionally during `ClawSettings::load_merged()` regardless of active output format. **Required fix shape:** (a) thread the active `CliOutputFormat` through the config loading path and suppress or defer human-readable warning strings when `json` mode is active; (b) instead, collect deprecation diagnostics and inject them into the JSON output as a top-level `"warnings": [...]` array (same field already used by `doctor`); (c) ensure the JSON body is always the first bytes on stdout and all prose warnings stay on stderr or are suppressed in json mode; (d) add regression coverage proving `claw <any-cmd> --output-format json` stdout is valid JSON regardless of config deprecation state. **Why this matters:** `--output-format json` is the automation/claw contract; if config warnings can silently corrupt the JSON stream, every orchestration layer that captures combined output gets broken parse-on-warning with no stable fallback. Source: Jobdori live dogfood on mengmotaHost, claw-code main `8e22f75`, 2026-04-29.
|
||||
|
||||
323. **`status --output-format json` reports `session.session = "live-repl"` while simultaneously reporting `session_lifecycle.kind = "saved_only"` — contradictory session identity in a single status snapshot** — dogfooded 2026-04-29 by Jobdori on current main (`804d96b`). Running `claw status --output-format json` from an active REPL-style invocation produced `"session": "live-repl"` in the `workspace` block and `"session_lifecycle": {"kind": "saved_only", "pane_id": null, ...}` in the same object. Those two fields carry contradictory claims: `"live-repl"` asserts there is an active interactive session, while `"saved_only"` asserts there is no live tmux pane hosting the session — the session exists only as a saved artifact. A downstream claw reading this snapshot cannot tell which claim to trust: is this a running session whose pane is undetectable, or a saved-only session that the `session` field is misclassifying? Root cause: `"live-repl"` is a fallback sentinel emitted by `main.rs:6070` when `context.session_path` is `None`, while `session_lifecycle` is computed independently by `classify_session_lifecycle_for()` from tmux pane discovery; the two fields share no common source and can diverge. **Required fix shape:** (a) derive both `session.session` and `session_lifecycle.kind` from the same lifecycle classification result so they cannot diverge; (b) replace the `"live-repl"` free-form sentinel with a structured `session_kind` field (`live_repl`, `saved`, `resume`, etc.) that carries the same type vocabulary as `session_lifecycle.kind`; (c) when `session_lifecycle.kind = "saved_only"`, never emit `"session": "live-repl"` (or vice versa); (d) add a regression test proving `status --output-format json` never emits `session.kind = "live_repl"` and `session_lifecycle.kind = "saved_only"` simultaneously. **Why this matters:** `status --output-format json` is the machine-readable truth surface for session state; if two fields in the same snapshot contradict each other, every lane, monitor, and orchestrator has to pick a winner instead of reading a coherent state. Source: Jobdori live dogfood on mengmotaHost, claw-code `804d96b`, 2026-04-29.
|
||||
|
||||
324. **Stale local debug binaries can impersonate the current workspace because version/status/doctor do not compare embedded build provenance to repo HEAD** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `e7074f47` after PR #2838. The working tree was at `e7074f47`, but running `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `1f901988`. `status` and `doctor` remained green and exposed no warning that the executable under test was stale relative to the workspace HEAD, nor any structured build-provenance freshness signal that downstream claws could use to decide whether the observed behavior came from the checked-out code or an older debug artifact. This is a repo-identity opacity gap: the JSON truth surfaces can look authoritative while actually describing a different binary lineage than the source tree being dogfooded. **Required fix shape:** (a) compare the embedded build `git_sha` / build date with the current workspace git HEAD and dirty state when the binary can discover a containing worktree; (b) expose redaction-safe structured fields in `version --output-format json`, `status --output-format json`, and `doctor --output-format json`, including `binary_provenance`, `workspace_head`, and `stale_binary` (with enough reason/detail to distinguish clean match, dirty workspace, unknown workspace, and definite stale SHA mismatch); (c) warn in human/text mode when executing a stale local debug binary such as `./rust/target/debug/claw` so dogfooders do not trust old behavior as current-main evidence; (d) avoid leaking secrets or absolute sensitive paths beyond the existing workspace-identification policy; (e) add regression/fixture coverage for matching HEAD, dirty workspace, no-worktree/unknown provenance, and stale embedded SHA cases. **Why this matters:** status/doctor/version are supposed to be the machine-readable basis for dogfood truth. If a stale binary can report a different `git_sha` than the checked-out repo without any freshness warning, claws can file or verify bugs against the wrong code and waste cycles chasing already-fixed or not-yet-built behavior. Source: gaebal-gajae dogfood follow-up from current main `e7074f47` after PR #2838; observed `./rust/target/debug/claw version --output-format json` reporting `git_sha` `1f901988` with no stale-binary-vs-workspace-HEAD warning.
|
||||
|
||||
325. **`help --output-format json` returns valid JSON but hides the actual help schema inside one prose `message` string** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `d607ff36`. Running `./rust/target/debug/claw help --output-format json` produces parseable JSON, but the object only exposes top-level keys like `kind` and `message`; all command names, global flags, slash-command metadata, aliases, resume-safety, output-format support, auth/preflight notes, and descriptions are flattened into one human-oriented prose blob. That technically satisfies “valid JSON” while still forcing automation to scrape the same help text humans read, making `/issue`, `/help`, and resume-safety contracts opaque to claws. **Required fix shape:** (a) keep `message` as the compact human-rendered help summary, but add a documented structured schema with `schema` / `schema_version` fields; (b) expose first-class arrays/objects such as `commands[]`, `options[]`, and `slash_commands[]` with stable fields including `name`, `aliases`, `description`, `args`, `output_formats_supported`, `resume_safe`, `interactive_only`, and `creates_external_side_effects`; (c) include auth and creation preflight metadata where relevant, especially for GitHub/issue flows (`auth_preflight`, `creation_unavailable`, `gh_cli_authenticated`, `github_token_present`, or equivalent non-secret state); (d) make `/issue`, `/help`, aliases, and resume-dispatch safety machine-readable from the JSON payload instead of recoverable only by parsing prose markers; (e) add regression coverage proving `help --output-format json` is valid JSON and that `/issue`, `/help`, resume-safe vs interactive-only slash commands, aliases, descriptions, supported output formats, and side-effect/auth-preflight fields are present and internally consistent. **Why this matters:** help JSON is the discoverability surface automation uses before invoking commands. If it is just prose wrapped in JSON, claws cannot safely decide whether a command can run non-interactively, resume from a saved session, create external GitHub side effects, or requires auth/preflight without brittle text scraping. Source: gaebal-gajae dogfood follow-up from current main `d607ff36`; observed `./rust/target/debug/claw help --output-format json` returning valid JSON with only `{kind,message}` at the top level while the actionable command schema remained buried in `message`.
|
||||
|
||||
326. **`status --output-format json` underreports active workspace pane inventory when one tmux session has multiple panes/processes in the same project** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `b90875fa` while responding to the claw-code dogfood nudge. The active OMX session `claw-code-issue-326-dogfood-pinpoint` was running in `/mnt/offloading/Workspace/claw-code` with two panes: `%9384` (`cmd=node`, active pane) and `%9385` (`cmd=node`, inactive sidecar pane). `tmux list-panes -a -F '#{session_name}:#{window_index}.#{pane_index} #{pane_id} pid=#{pane_pid} cmd=#{pane_current_command} cwd=#{pane_current_path} active=#{pane_active}'` showed both panes in the same session/workspace, but `./rust/target/debug/claw status --output-format json` collapsed the workspace lifecycle to a single object: `session_lifecycle.kind = "running_process"`, `pane_id = "%9384"`, `pane_command = "node"`, with no `panes[]`, process count, sidecar/secondary-pane inventory, or ambiguity marker. A downstream claw reading only status JSON would believe there is exactly one live process for that workspace even though the control plane has multiple panes in the same task session. **Required fix shape:** (a) expose a structured active-session inventory in `status --output-format json`, including `panes[]` or `processes[]` with pane id, command, cwd, active flag, and session/window identity for all matching workspace panes; (b) keep the compact `session_lifecycle` summary, but add an explicit `pane_count` / `has_sidecar_panes` / `inventory_truncated` signal so summaries cannot masquerade as complete truth; (c) define how to classify primary vs sidecar/inactive panes without losing them, and make the chosen primary pane provenance visible; (d) add regression coverage for a tmux session with two panes in one workspace proving status JSON reports both panes or marks the inventory as partial. **Why this matters:** status JSON is the machine-readable lane truth surface. If it reports only the primary pane while hiding secondary panes, clawhip and other claws can miss sidecar workers, blocked helpers, stale subprocesses, or duplicated control-plane processes and make bad restart/cleanup/routing decisions from an undercounted session snapshot. Source: gaebal-gajae dogfood session `claw-code-issue-326-dogfood-pinpoint`; observed `claw status --output-format json` returning only `%9384` while `tmux list-panes` showed `%9384` and `%9385` in the same claw-code workspace.
|
||||
|
||||
327. **`claw mcp help` omits `.claw.json` from its documented config sources even though `claw mcp` still loads MCP servers from `.claw.json`** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `981aff7c` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json` so `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `981aff7c` matching the workspace. Running `./rust/target/debug/claw mcp --help` printed `Sources .claw/settings.json, .claw/settings.local.json`, and `./rust/target/debug/claw mcp help --output-format json` returned `"sources": [".claw/settings.json", ".claw/settings.local.json"]`. In the same rebuilt binary, a temp workspace containing only a project `.claw.json` with `{"mcpServers":{"demo":{"command":"/bin/echo","args":["hi"]}}}` made `./rust/target/debug/claw mcp --output-format json` report `configured_servers: 1` and `servers[0].name: "demo"`. The MCP lifecycle surface therefore tells users and claws that `.claw.json` is not a source while actively accepting it as one. This is distinct from #322's JSON warning corruption, #323/#326's status lifecycle contradictions, #324's stale-binary provenance gap, and #325's top-level help schema flattening: the pinpoint is a concrete MCP subcommand source-of-truth mismatch in both text and JSON help. **Required fix shape:** (a) derive the `mcp help` source list from the same `ConfigLoader::discover`/settings-source registry that `mcp list` actually uses instead of hard-coding a partial list; (b) include all supported MCP config sources in stable order, including legacy/project `.claw.json`, user `~/.claw/settings.json`, project `.claw/settings.json`, and local `.claw/settings.local.json` as applicable; (c) add source metadata to `mcp --output-format json` entries so each server can be attributed to the file/layer that provided it; (d) add a regression proving a server loaded from `.claw.json` is accompanied by help/JSON source metadata that names `.claw.json`, and that help stays in sync when config source discovery changes. **Why this matters:** MCP setup is already a high-friction lifecycle path; if the command that diagnoses MCP servers omits a still-supported source, operators can move or delete the wrong config file, and automation cannot tell whether `.claw.json` support is intentional compatibility or accidental legacy behavior. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using the rebuilt actual `./rust/target/debug/claw`; temp-workspace proof showed `.claw.json` loads one MCP server while `mcp help` documents only `.claw/settings*.json` sources.
|
||||
|
||||
328. **`claw agents help` omits the `.codex/agents` roots that `claw agents` actually loads from, so native-agent discovery provenance is misleading** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `ee85fed6` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` then reported embedded `git_sha` `ee85fed6`, matching the workspace. Running `./rust/target/debug/claw agents help --output-format json` returned `usage.sources = [".claw/agents", "~/.claw/agents", "$CLAW_CONFIG_HOME/agents"]`, with no `.codex/agents` or `~/.codex/agents` entry. In the same environment, `./rust/target/debug/claw agents --output-format json` listed native agents such as `analyst` with source `{id: "user_claw", label: "User home roots"}` even though `/home/bellman/.claw/agents` does not exist and `/home/bellman/.codex/agents/analyst.toml` does exist. The agents lifecycle surface therefore documents one set of roots while loading from another, and the loaded-agent provenance collapses the real Codex root behind a generic `user_claw` label. This is distinct from #327's MCP source-list mismatch: the affected subsystem is native-agent discovery, where claws choose delegation/staffing lanes from `claw agents` and need to know which root supplied each agent. **Required fix shape:** (a) derive `agents help` source roots from the same registry/search path used by the agent loader instead of a hard-coded `.claw`-only list; (b) include all supported native-agent roots in stable order, including project/user `.codex/agents` roots alongside `.claw/agents` and `$CLAW_CONFIG_HOME/agents`; (c) make each `agents --output-format json` entry expose non-secret source provenance precise enough to distinguish `user_codex`, `project_codex`, `user_claw`, and `project_claw` (without leaking unnecessary absolute paths); (d) add a regression proving an agent loaded from `~/.codex/agents` is accompanied by help-source metadata naming that root and per-agent provenance that does not mislabel it as generic `user_claw`. **Why this matters:** agent selection is a control-plane decision. If help says only `.claw/agents` are searched while the runtime actually consumes `.codex/agents`, claws and operators can edit the wrong directory, misdiagnose missing/stale agents, or trust the wrong ownership boundary for delegated work. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed `agents help` omitting `.codex/agents` while `agents` loaded `analyst` from the existing `/home/bellman/.codex/agents/analyst.toml` with no `/home/bellman/.claw/agents` directory present.
|
||||
329. **Resume-safe slash `/agents --output-format json` downgrades structured agent inventory into prose even though top-level `claw agents --output-format json` returns machine-readable entries** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `0f7578c0` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `0f7578c0`, matching the workspace. Running `./rust/target/debug/claw --resume latest /agents --output-format json` returned only `{"kind":"agents","text":"Agents\n 20 active agents..."}`: the agent names, source ids, models, reasoning effort, active/shadowed state, and working-directory context are all flattened into one human prose string. In the same rebuilt binary and same workspace, `./rust/target/debug/claw agents --output-format json` returned a structured object with top-level `agents[]`, `count`, `summary`, `working_directory`, and per-agent fields such as `name`, `description`, `model`, `reasoning_effort`, `active`, `shadowed_by`, and `source`. The resume-safe slash surface therefore looks JSON-shaped while throwing away exactly the structured inventory that automation needs, and it diverges from the already-existing top-level command schema. This is distinct from #325's broad help JSON opacity and #328's source-root mismatch: the pinpoint is the `/agents` slash command losing structured inventory in resume mode even though the non-slash agents command already has it. **Required fix shape:** (a) make resume-safe `/agents --output-format json` reuse the same serializer/schema as `claw agents --output-format json` instead of wrapping rendered text; (b) preserve per-agent source/provenance fields, model/reasoning metadata, active/shadowed state, count/summary, and working-directory context; (c) keep `text` or `message` as an optional human summary only, not the sole payload; (d) add regression coverage proving top-level `claw agents --output-format json` and resume-safe `/agents --output-format json` expose equivalent structured agent inventory for the same workspace. **Why this matters:** `/agents` is the in-session delegation/staffing truth surface. Claws operating through `--resume latest` need to choose agents without scraping prose; losing structure at the slash boundary makes automated staffing brittle and contradicts the top-level command contract. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed slash `/agents` JSON had only `kind,text` while top-level `agents` JSON had `agents[]` and provenance metadata.
|
||||
337. **`status` / `/session list --output-format json` session lifecycle reports `workspace_dirty: true` and `abandoned: true` but omits dirty-file detail and abandonment cause, making automated GC unable to distinguish live work from crash leftovers** — restored from PR #2852 / Jobdori dogfood on current main (`0f7578c`). The evidence bundle listed 10 sessions and every listed session had `workspace_dirty: true` plus `abandoned: true`; each lifecycle object exposed `abandoned: true`, `kind: "saved_only"`, `pane_id: null`, and `workspace_dirty: true`, but did not include `dirty_file_count`, `dirty_file_paths` / summary, or `abandoned_reason`. That leaves cleanup policy with only a boolean dirty/abandoned pair: it cannot tell whether a saved-only session contains intentional uncommitted user work, a harmless stale pane artifact, or crash leftovers that are safe to collect. **Required fix shape:** (a) add `dirty_file_count: u32` to session lifecycle/status payloads whenever dirty state is evaluated; (b) add an `abandoned_reason` enum such as `pane_closed`, `process_killed`, `session_replaced`, `workspace_missing`, or `unknown` instead of a bare boolean-only abandonment signal; (c) optionally add summarized `dirty_file_paths` / `dirty_file_summary` with truncation metadata so automation can present useful evidence without leaking excessive path detail; (d) add regression coverage proving dirty abandoned saved-only sessions include file count, abandonment reason, and stable behavior when path summaries are omitted or truncated. **Why this matters:** session GC must not delete live user work, but it also cannot leave every crash leftover forever. A lifecycle object that says only `workspace_dirty: true` and `abandoned: true` forces cleanup tooling to guess instead of applying a safe policy from structured evidence. Source: PR #2852 / Jobdori dogfood; all 10 listed sessions shared the same dirty+abandoned shape, and the sample lifecycle object had `abandoned: true`, `kind: "saved_only"`, `pane_id: null`, `workspace_dirty: true`, with no dirty-file count, path summary, or abandonment reason.
|
||||
|
||||
338. **Top-level `help --output-format json` and resume-safe `/help --output-format json` use different payload fields for the same help surface (`message` vs `text`)** — dogfooded 2026-04-29 on current `origin/main` / workspace HEAD `24ccb59b` after rebuilding the actual debug binary with `cargo run --manifest-path rust/Cargo.toml --bin claw -- version --output-format json`; `./rust/target/debug/claw version --output-format json` reported embedded `git_sha` `24ccb59b`, matching the workspace. Running `./rust/target/debug/claw help --output-format json` returned a valid JSON object with keys `kind,message`, while `./rust/target/debug/claw --resume latest /help --output-format json` returned the same conceptual help surface with keys `kind,text`. Both are prose-only help payloads, but automation now has to special-case whether help was reached through the top-level command dispatcher or the resume-safe slash dispatcher before it can even locate the rendered help body. This is distinct from #325's broader structured-schema absence: the pinpoint here is a concrete JSON field-name contract drift between two help entrypoints that should be equivalent or explicitly versioned. **Required fix shape:** (a) define one canonical help JSON body field such as `message` or `text` and use it consistently across top-level `help`, slash `/help`, and resume-safe `/help`; (b) if backward compatibility requires both fields temporarily, emit both with identical contents plus a `schema_version` and deprecation metadata; (c) add regression coverage proving `claw help --output-format json` and `claw --resume latest /help --output-format json` expose the same top-level field contract and `kind=help`; (d) document whether slash-command JSON is intended to share schemas with top-level command JSON or carry its own explicit schema namespace. **Why this matters:** help JSON is the bootstrap discoverability surface for claws. If the same help concept moves its body between `message` and `text` depending on invocation path, every orchestrator needs brittle per-entrypoint parsers before it can inspect commands, flags, or resume safety. Source: gaebal-gajae dogfood in `/home/bellman/Workspace/claw-code` on 2026-04-29 using rebuilt `./rust/target/debug/claw`; proof commands showed top-level help JSON keys `kind,message` and resume-safe slash help JSON keys `kind,text` on the same rebuilt binary.
|
||||
|
||||
339. **`/session delete` is not resume-safe while `/session list` is, making session GC impossible from `--resume` mode automation** — dogfooded 2026-04-30 by Jobdori on `24ccb59`. Running `claw --output-format json --resume latest /session list` succeeds and returns the full session list (10 sessions, all `workspace_dirty: true, abandoned: true`). Running `claw --output-format json --resume latest /session delete <id>` returns `{"command":"...","error":"unsupported resumed slash command","type":"error"}` — again using `"type"` not `"kind"` (#336 vocab violation). An automation lane that discovers abandoned sessions via `--resume` cannot delete any of them via the same path; it must spawn an interactive REPL session just to issue delete, breaking the machine-readable JSON surface contract. **Required fix shape:** (a) mark `/session delete` as resume-safe; (b) return `{"kind":"session_deleted","session_id":"<id>","path":"<deleted_path>"}` on success; (c) require `--force` only for dirty/active sessions; (d) add regression coverage. Source: Jobdori live dogfood, mengmotaHost, `24ccb59`, 2026-04-30.
|
||||
|
||||
340. **Resume-safe `/session help --output-format json` writes its primary JSON error envelope to stderr and uses `type` instead of the session JSON `kind` vocabulary** — dogfooded 2026-04-29 on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `dc47482e`. Running `./rust/target/debug/claw --resume latest /session help --output-format json` wrote no stdout bytes, but wrote a JSON error object to stderr: `{"command":"/session help","error":"Unknown /session action ...","type":"error"}`. Meanwhile `/session list --output-format json` wrote valid stdout JSON with `kind=session_list`. The JSON output contract is therefore split across stderr for an error/help-ish action and switches vocabulary from `kind` to `type`; automation that reads stdout sees empty/non-JSON output and cannot handle errors consistently with successful session JSON responses. **Required fix shape:** (a) all `--output-format json` command responses, including resumed slash errors, should emit the primary JSON envelope on stdout; (b) use `kind:"error"` or a documented error schema consistently instead of an ad hoc `type` field; (c) reserve stderr prose for text mode or optional non-primary diagnostics, not the machine-readable envelope; (d) add a regression for `/session help` or an unsupported `/session` action under `--resume` proving stdout contains the structured JSON error envelope and stderr does not carry the only parseable payload. **Why this matters:** claws need one stdout JSON contract for both success and failure. If a help-ish session error is silently moved to stderr and shaped differently from `session_list`, orchestration lanes cannot distinguish an unsupported action from transport corruption or an empty response without bespoke stderr parsing. Source: gaebal-gajae dogfood follow-up for the 15:30 nudge on rebuilt `./rust/target/debug/claw` `dc47482e`.
|
||||
|
||||
341. **Resume-safe `/tasks --output-format json` emits an unsupported-command JSON error only on stderr and mixes `kind` with `type` classification vocabularies** — dogfooded 2026-04-29 for the 16:00 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `58569131`. Running `./rust/target/debug/claw --resume latest /tasks --output-format json` wrote no stdout bytes, but wrote a JSON error object to stderr: `{"command":"/tasks","error":"/tasks is not yet implemented in this build","kind":"unsupported_command","type":"error"}`. The unsupported command envelope therefore has two separate top-level classification vocabularies (`kind=unsupported_command` and `type=error`) and places the only parseable payload on stderr, while successful JSON commands use stdout and a `kind`-only classification. This is distinct from #340 because it is not session help; it shows implemented-but-unsupported command stubs can emit a dual-vocabulary error envelope. **Required fix shape:** (a) in `--output-format json` mode, emit the primary JSON envelope on stdout for unsupported resumed slash commands such as `/tasks`; (b) document and use one error discriminator, preferably `kind:"error"` plus `code:"unsupported_command"`, or `kind:"unsupported_command"` plus `status:"error"`, but not `type`; (c) reserve stderr for non-primary diagnostics or text-mode prose, never as the sole JSON payload; (d) add regression coverage for `/tasks` under `--resume` with JSON output proving stdout contains the structured error envelope, stderr is not the only parseable stream, and the envelope uses the documented single-vocabulary discriminator. **Why this matters:** claws need the same stdout JSON contract for implemented successes and implemented-but-unsupported stubs. If `/tasks` errors can silently move to stderr and advertise both `kind` and `type`, automation must special-case command stubs instead of applying one JSON error parser. Source: gaebal-gajae dogfood follow-up for the 16:00 nudge on rebuilt `./rust/target/debug/claw` `58569131`.
|
||||
342. **Resume-safe `/commands --output-format json` is rejected as an unknown slash command even though the error points users at `/help` for slash-command discovery, leaving no structured command-index alias** — dogfooded 2026-04-29 for the 16:30 nudge on current `origin/main` / rebuilt `./rust/target/debug/claw` with embedded `git_sha` `f65b2b4f`. Running `./rust/target/debug/claw --resume latest /commands --output-format json` wrote no stdout bytes and emitted only stderr JSON: `{"command":"/commands","error":"Unknown slash command: /commands\n Help /help lists available slash commands","type":"error"}`. In the same rebuilt binary, `./rust/target/debug/claw --resume latest /help --output-format json` succeeded on stdout but exposed only prose keys `kind,text`. The discoverability path therefore has two gaps at once: the intuitive `/commands` index/alias is unavailable, and the fallback suggestion is buried inside an error string rather than surfaced as structured `suggested_command` / `discovery_command` metadata. This is distinct from #340 and #341: the pinpoint is not merely stderr-only JSON error placement, but the absence of a machine-readable slash-command discovery alias/index and typed correction guidance when users or claws try the natural `/commands` form. **Required fix shape:** (a) either implement `/commands` as a resume-safe alias for slash-command discovery or return a typed `unknown_command` JSON envelope with `suggested_command:"/help"` and `discovery_command:"/help"` fields; (b) make the primary JSON error envelope follow the stdout JSON contract and single-discriminator schema from #340/#341; (c) expose structured slash-command inventory from the discovery surface rather than requiring callers to scrape `text`; (d) add regression coverage proving `/commands --output-format json` either returns the structured command inventory or returns a structured correction that automation can follow without parsing prose. **Why this matters:** claws need a predictable way to discover valid slash commands before invoking them. If the natural command-index spelling fails with stderr-only JSON and a human-formatted hint, orchestration has to guess, parse prose, and special-case command discovery before it can even learn the supported command surface. Source: gaebal-gajae dogfood follow-up for the 16:30 nudge on rebuilt `./rust/target/debug/claw` `f65b2b4f`.
|
||||
|
||||
347. **`/session list --output-format json` returns `"title": null` for every session; sessions have no human-readable label and can only be distinguished by their timestamp-based ID** — dogfooded 2026-04-30 by Jobdori on `a1bfcd4`. Running `/session list --output-format json` returns `session_details` where every entry has `"title": null`. The `id` field is a machine-generated string such as `session-1776891003038-0` (millisecond epoch). There is no `/session rename <id> <title>` command, no auto-derived title (e.g. from the first user message or the git branch at session start), and no way to programmatically query "which session was working on feature X" without replaying the message history. In a multi-session workflow with 10+ sessions, an orchestration layer cannot build a human-readable session picker without fetching and reading message content from every session file. **Required fix shape:** (a) auto-derive `title` from the first user message (first N chars, truncated) or from the git branch name at session creation time; (b) expose a `/session rename <id> <title>` slash command so users can label sessions manually; (c) persist title in the session store so it survives restarts; (d) include the `title` field in `session_details` JSON response — even if the value is auto-derived and mutable; (e) add `created_at_ms` (tracked in #335) alongside title so sessions are fully sortable and labelable without reading message content. Source: Jobdori live dogfood, mengmotaHost, `a1bfcd4`, 2026-04-30.
|
||||
|
||||
Reference in New Issue
Block a user