Mirror of https://github.com/instructkr/claw-code.git, synced 2026-06-12 03:25:01 -04:00
Compare commits: 07b4fd0300...35cf2476a3 (23 Commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | 35cf2476a3 | | |
| | df0102307b | | |
| | a12b14652d | | |
| | 59fd1253b0 | | |
| | 4423774573 | | |
| | d01ebd345b | | |
| | 23b5d6a0ce | | |
| | eefcfe159f | | |
| | 4621f339a9 | | |
| | 40147e1db3 | | |
| | 2a9e6aa68b | | |
| | c014acb787 | | |
| | 1291d9fbc3 | | |
| | c14a366dfa | | |
| | f84676d534 | | |
| | ee3d325a0b | | |
| | 66935ea0bf | | |
| | e63ede0462 | | |
| | 68c9fd6315 | | |
| | 350e9ee70b | | |
| | cd773056e7 | | |
| | ab5f25950b | | |
| | ec5fd078e3 | | |
36
.github/ISSUE_TEMPLATE/bug_report.md
vendored
Normal file
@@ -0,0 +1,36 @@
---
name: Bug Report
about: Report a bug in claw-code
title: "[bug] "
labels: bug
assignees: ''
---

## Description

<!-- What happened? -->

## Steps to Reproduce

1.
2.
3.

## Expected Behavior

<!-- What should have happened? -->

## Actual Behavior

<!-- What actually happened? Include error messages, logs, screenshots -->

## Environment

- **claw-code version:**
- **OS:**
- **Provider/model:**
- **Rust version (if building from source):**

## Additional Context

<!-- Related pinpoints, sessions, config, etc. -->
27
.github/PULL_REQUEST_TEMPLATE.md
vendored
Normal file
@@ -0,0 +1,27 @@
## Summary

<!-- Brief description of what this PR does -->

## Related Pinpoints / Issues

<!-- Link to ROADMAP.md pinpoints or GitHub issues, e.g., #283, #285 -->

## Changes

<!-- List key changes -->
-

## Testing

<!-- How was this tested? -->
- [ ] `cargo test` passes
- [ ] `cargo fmt --check` passes
- [ ] Manual verification (describe)

## Checklist

- [ ] Code follows project conventions
- [ ] ROADMAP.md updated (if filing/closing pinpoints)
- [ ] CHANGELOG.md updated (if user-facing change)
- [ ] Documentation updated (if applicable)
- [ ] No regressions in existing tests
@@ -4,12 +4,16 @@ All notable changes to claw-code are documented in this file.

 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html) (currently pre-1.0).

-## [Unreleased] — 2026-04-26 to 2026-04-27 (extended dogfood audit cycles, through #427)
+## [Unreleased] — 2026-04-26 to 2026-04-27 (extended dogfood audit cycles, through #433)

 Branch: `feat/jobdori-168c-emission-routing`

 ### Added — Documentation

+- **docs/CONFIGURATION.md** — Configuration reference: env vars, settings.json, provider selection (cycle #429)
+- **CODE_OF_CONDUCT.md** — Contributor Covenant v2.1 (cycle #432)
+- **.github/PULL_REQUEST_TEMPLATE.md** — Standardized PR description template (cycle #430)
+- **.github/ISSUE_TEMPLATE/bug_report.md** — Standard bug report template (cycle #431)
 - **docs/ARCHITECTURE.md** — High-level architecture overview: 9 Rust crates, request flow, subsystem map with pinpoint links (cycle #426)
 - **CHANGELOG.md** — This file (cycle #424)
 - **docs/PINPOINT_FILING_GUIDE.md** — Step-by-step pinpoint filing workflow with #290 worked example (cycle #422)
77
CODE_OF_CONDUCT.md
Normal file
@@ -0,0 +1,77 @@
# Contributor Covenant Code of Conduct

## Our Pledge

We as members, contributors, and leaders pledge to make participation in our community a harassment-free experience for everyone, regardless of age, body size, visible or invisible disability, ethnicity, sex characteristics, gender identity and expression, level of experience, education, socio-economic status, nationality, personal appearance, race, caste, color, religion, or sexual identity and orientation.

We pledge to act and interact in ways that contribute to an open, welcoming, diverse, inclusive, and healthy community.

## Our Standards

Examples of behavior that contributes to a positive environment for our community include:

- Demonstrating empathy and kindness toward other people
- Being respectful of differing opinions, viewpoints, and experiences
- Giving and gracefully accepting constructive feedback
- Accepting responsibility and apologizing to those affected by our mistakes, and learning from the experience
- Focusing on what is best not just for us as individuals, but for the overall community

Examples of unacceptable behavior include:

- The use of sexualized language or imagery, and sexual attention or advances of any kind
- Trolling, insulting or derogatory comments, and personal or political attacks
- Public or private harassment
- Publishing others' private information, such as a physical or email address, without their explicit permission
- Other conduct which could reasonably be considered inappropriate in a professional setting

## Enforcement Responsibilities

Community leaders are responsible for clarifying and enforcing our standards of acceptable behavior and will take appropriate and fair corrective action in response to any behavior that they deem inappropriate, threatening, offensive, or harmful.

Community leaders have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, and will communicate reasons for moderation decisions when appropriate.

## Scope

This Code of Conduct applies within all community spaces, and also applies when an individual is officially representing the community in public spaces. Examples of representing our community include using an official e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event.

## Enforcement

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported to the community leaders responsible for enforcement at GitHub Security Advisories or email to the maintainers listed in SECURITY.md. All complaints will be reviewed and investigated promptly and fairly.

All community leaders are obligated to respect the privacy and security of the reporter of any incident.

## Enforcement Guidelines

Community leaders will follow these Community Impact Guidelines in determining the consequences for any action they deem in violation of this Code of Conduct:

### 1. Correction

Community Impact: Use of inappropriate language or other behavior deemed unprofessional or unwelcome in the community.

Consequence: A private, written warning from community leaders, providing clarity around the nature of the violation and an explanation of why the behavior was inappropriate. A public apology may be requested.

### 2. Warning

Community Impact: A violation through a single incident or series of actions.

Consequence: A warning with consequences for continued behavior. No interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, for a specified period of time. This includes avoiding interactions in community spaces as well as external channels like social media. Violating these terms may lead to a temporary or permanent ban.

### 3. Temporary Ban

Community Impact: A serious violation of community standards, including sustained inappropriate behavior.

Consequence: A temporary ban from any sort of interaction or public communication with the community for a specified period of time. No public or private interaction with the people involved, including unsolicited interaction with those enforcing the Code of Conduct, is allowed during this period. Violating these terms may lead to a permanent ban.

### 4. Permanent Ban

Community Impact: Demonstrating a pattern of violation of community standards, including sustained inappropriate behavior, harassment of an individual, or aggression toward or disparagement of classes of individuals.

Consequence: A permanent ban from any sort of public interaction within the community.

## Attribution

This Code of Conduct is adapted from the [Contributor Covenant](https://www.contributor-covenant.org), version 2.1, available at [https://www.contributor-covenant.org/version/2/1/code_of_conduct.html](https://www.contributor-covenant.org/version/2/1/code_of_conduct.html).

Community Impact Guidelines were inspired by [Mozilla's code of conduct enforcement ladder](https://github.com/mozilla/diversity).

For answers to common questions about this code of conduct, see the FAQ at [https://www.contributor-covenant.org/faq](https://www.contributor-covenant.org/faq). Translations are available at [https://www.contributor-covenant.org/translations](https://www.contributor-covenant.org/translations).
10
PARITY.md
@@ -210,8 +210,8 @@ Repo documentation suite shipped during extended dogfood audit. Status: present/

 | Document | Status | Priority | Notes |
 |----------|--------|----------|-------|
-| CODE_OF_CONDUCT.md | ❌ Missing | Low | Standard for public repos |
-| .github/PULL_REQUEST_TEMPLATE.md | ❌ Missing | Medium | Would standardize PR descriptions |
-| docs/CONFIGURATION.md | ❌ Missing | High | env vars, settings.json, provider config — relates to #283, #285 |
-| docs/API_REFERENCE.md | ❌ Missing | Medium | JSON envelope schema, output format contract |
-| .github/ISSUE_TEMPLATE/bug_report.md | ❌ Missing | Low | Standard bug template (pinpoint.md covers discovery) |
+| CODE_OF_CONDUCT.md | ✅ Present | Low | Contributor Covenant v2.1 |
+| .github/PULL_REQUEST_TEMPLATE.md | ✅ Present | Medium | Standardizes PR descriptions |
+| docs/CONFIGURATION.md | ✅ Present | High | env vars, settings.json, provider config — relates to #283, #285 |
+| docs/API_REFERENCE.md | ✅ Present | Medium | JSON envelope schema, output format contract — #288, #266, #168c |
+| .github/ISSUE_TEMPLATE/bug_report.md | ✅ Present | #431 | Standard bug template with repro steps, environment, context sections |
79
PHASE_A_IMPLEMENTATION.md
Normal file
@@ -0,0 +1,79 @@
# Phase A: Provider Infrastructure (Implementation Kickoff)

**Scope:** Formalize multi-provider routing and declarative config architecture. Critical path for Phases B-F.

**Pinpoints in scope:** #245, #246, #285
**Blocked by:** Phase 0 merge (GitHub OAuth, cargo fmt, clawcode-human approval)
**Estimated effort:** 2-3 cycles
**Target:** Merge-ready immediately post-Phase 0

## #245 — Providers are hard-coded enum; no backend-swap capability

**Acceptance Criteria:**
- [ ] Providers defined as trait (not enum)
- [ ] Factory/registry pattern allows runtime provider selection
- [ ] Existing providers (Anthropic, OpenAI) are re-implemented as trait impls
- [ ] Tests pass for all existing behavior
- [ ] Zero breaking changes to public API

**Implementation sequence:**
1. Define `Provider` trait with core methods (chat completion, streaming, model listing)
2. Implement trait for existing providers
3. Add provider registry/factory
4. Update CLI to accept `--provider` flag
5. Regression tests
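
The first three steps of the sequence above could look roughly like the sketch below. It is an illustration only, not the project's code: the trait is reduced to two methods (chat completion and streaming are left out), and `ProviderRegistry`, `Factory`, `default_registry`, the `make_*` helpers, and the placeholder model names are all hypothetical.

```rust
use std::collections::HashMap;

/// Hypothetical trait; the real method set (chat completion, streaming)
/// is omitted to keep the sketch short.
pub trait Provider {
    fn name(&self) -> &str;
    fn list_models(&self) -> Vec<String>;
}

pub struct Anthropic;
pub struct OpenAi;

impl Provider for Anthropic {
    fn name(&self) -> &str {
        "anthropic"
    }
    fn list_models(&self) -> Vec<String> {
        vec!["placeholder-model-a".to_string()] // placeholder, not a real model id
    }
}

impl Provider for OpenAi {
    fn name(&self) -> &str {
        "openai"
    }
    fn list_models(&self) -> Vec<String> {
        vec!["placeholder-model-b".to_string()] // placeholder, not a real model id
    }
}

/// Constructor signature stored in the registry.
pub type Factory = fn() -> Box<dyn Provider>;

fn make_anthropic() -> Box<dyn Provider> {
    Box::new(Anthropic)
}

fn make_openai() -> Box<dyn Provider> {
    Box::new(OpenAi)
}

/// Maps a provider name (for example the value of `--provider`) to a constructor.
pub struct ProviderRegistry {
    factories: HashMap<String, Factory>,
}

impl ProviderRegistry {
    pub fn new() -> Self {
        Self { factories: HashMap::new() }
    }

    pub fn register(&mut self, name: &str, factory: Factory) {
        self.factories.insert(name.to_string(), factory);
    }

    /// Returns `None` for an unknown provider name.
    pub fn create(&self, name: &str) -> Option<Box<dyn Provider>> {
        self.factories.get(name).map(|make| make())
    }
}

/// Registry pre-populated with the two existing providers.
pub fn default_registry() -> ProviderRegistry {
    let mut registry = ProviderRegistry::new();
    registry.register("anthropic", make_anthropic);
    registry.register("openai", make_openai);
    registry
}
```

With this shape, `default_registry().create("openai")` yields a boxed provider at runtime, and adding a backend means one more `register` call instead of a new enum variant.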

## #246 — Provider selection logic is CLI-parsing only; no config source integration

**Acceptance Criteria:**
- [ ] Provider selection checks: 1) CLI flag, 2) env var, 3) settings.json, 4) default
- [ ] settings.json schema includes `provider` field with subconfig
- [ ] Env vars like `OPENAI_API_KEY` trigger automatic provider selection
- [ ] Conflict resolution documented (CLI > env > config file > default)
- [ ] Config merging tested

**Implementation sequence:**
1. Extend settings.json schema (add provider field, subconfig structure)
2. Implement config-merge logic (priority order)
3. Update `claw doctor` to validate provider config (#293 prerequisite)
4. Integration tests
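
The priority order in the acceptance criteria (CLI > env > config file > default) can be sketched as a single pure function. This is a minimal illustration under assumptions: `resolve_provider` and `Source` are hypothetical names, and the env-derived value is assumed to have been computed already (for example by noticing that `OPENAI_API_KEY` is set).

```rust
/// Which configuration layer supplied the final provider choice.
#[derive(Debug, PartialEq, Eq)]
pub enum Source {
    Cli,
    Env,
    Settings,
    Default,
}

/// Hypothetical resolver: the first layer that supplies a value wins,
/// in the order CLI flag, env var, settings.json, built-in default.
pub fn resolve_provider(
    cli_flag: Option<&str>,
    env_value: Option<&str>,
    settings_value: Option<&str>,
    default: &str,
) -> (String, Source) {
    if let Some(provider) = cli_flag {
        return (provider.to_string(), Source::Cli);
    }
    if let Some(provider) = env_value {
        return (provider.to_string(), Source::Env);
    }
    if let Some(provider) = settings_value {
        return (provider.to_string(), Source::Settings);
    }
    (default.to_string(), Source::Default)
}
```

Returning the `Source` alongside the name keeps the conflict resolution observable, which would let a diagnostic command report where the active provider came from.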

## #285 — No declarative provider fallback; can't swap backends mid-session

**Acceptance Criteria:**
- [ ] `settings.json` supports `providers: [primary, secondary, fallback]` array
- [ ] Streaming failures trigger automatic fallback to next provider
- [ ] Session state is preserved across provider swap
- [ ] User is notified of fallback event
- [ ] `claw doctor --providers` shows fallback chain health

**Implementation sequence:**
1. Extend settings.json schema (providers array)
2. Implement fallback logic in streaming handler
3. Add state-preservation during swap
4. User notification (log + maybe `--verbose` output)
5. Integration tests with dual-provider setup
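
A minimal sketch of the fallback loop in step 2, assuming the provider chain has already been read from the `providers` array. The function name `stream_with_fallback` and the closure-based `attempt` callback are hypothetical; session-state preservation (step 3) is out of scope here, and the notification is reduced to a line on stderr.

```rust
/// Tries each provider in order and returns the first success together with
/// the name of the provider that produced it. If every provider fails, the
/// collected error messages are returned instead.
pub fn stream_with_fallback<F>(
    providers: &[&str],
    mut attempt: F,
) -> Result<(String, String), Vec<String>>
where
    F: FnMut(&str) -> Result<String, String>,
{
    let mut errors = Vec::new();
    for (index, &name) in providers.iter().enumerate() {
        match attempt(name) {
            Ok(output) => return Ok((name.to_string(), output)),
            Err(err) => {
                // Notify the user only when there is another provider to try.
                if let Some(next) = providers.get(index + 1) {
                    eprintln!("provider {} failed ({}); falling back to {}", name, err, next);
                }
                errors.push(format!("{}: {}", name, err));
            }
        }
    }
    Err(errors)
}
```

Keeping the per-provider call behind a closure makes the dual-provider integration test in step 5 straightforward: the test passes a closure that fails for the primary and succeeds for the secondary.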

## Dependency Graph

```
Phase 0 merge ──→ #245 (trait + registry) ──→ #246 (config integration) ──→ #285 (fallback)
│ │
└────────────────────────┘
(parallel possible)
```

## Success Criteria (Phase A complete)

- [ ] All three pinpoints (#245, #246, #285) have passing tests
- [ ] `claw --provider openai` works
- [ ] `claw --provider openai --fallback anthropic` works
- [ ] settings.json with `{ "provider": "openai", ... }` is read correctly
- [ ] `claw doctor --providers` validates all configured backends
- [ ] Zero regression on existing Anthropic-only workflows
- [ ] PR merges with zero cargo fmt warnings
- [ ] clawcode-human approval granted

## Next: Phase B (transport-layer + resilience)

Once Phase A merges, Phase B begins with auto-compaction (#287, #288, #289) and streaming resilience (#223, #225, #229, #230, #232, #283, #287, #288, #289, #290, #291, #292).
@@ -36,7 +36,7 @@ Claw Code is the public Rust implementation of the `claw` CLI agent harness.
 The canonical implementation lives in [`rust/`](./rust), and the current source of truth for this repository is **ultraworkers/claw-code**.

 > [!IMPORTANT]
-> Start with [`USAGE.md`](./USAGE.md) for build, auth, CLI, session, and parity-harness workflows. Make `claw doctor` your first health check after building, use [`rust/README.md`](./rust/README.md) for crate-level details, read [`PARITY.md`](./PARITY.md) for the current Rust-port checkpoint, see [`docs/ARCHITECTURE.md`](./docs/ARCHITECTURE.md) for a high-level crate/subsystem map, and see [`docs/container.md`](./docs/container.md) for the container-first workflow.
+> Start with [`USAGE.md`](./USAGE.md) for build, auth, CLI, session, and parity-harness workflows. Make `claw doctor` your first health check after building, use [`rust/README.md`](./rust/README.md) for crate-level details, read [`PARITY.md`](./PARITY.md) for the current Rust-port checkpoint, see [`docs/ARCHITECTURE.md`](./docs/ARCHITECTURE.md) for a high-level crate/subsystem map, see [`docs/CONFIGURATION.md`](./docs/CONFIGURATION.md) for env vars and settings, and see [`docs/container.md`](./docs/container.md) for the container-first workflow.
 >
 > **ACP / Zed status:** `claw-code` does not ship an ACP/Zed daemon entrypoint yet. Run `claw acp` (or `claw --acp`) for the current status instead of guessing from source layout; `claw acp serve` is currently a discoverability alias only, and real ACP support remains tracked separately in `ROADMAP.md`.

351
ROADMAP.md
@@ -17426,3 +17426,354 @@ Required fix shape: (a) classify `empty_stream` / stream-closed-before-first-pay
- regression tests for tier escalation transitions

**Branch / parity:** local==origin==fork at 7a022b6

---

### #293 — `claw doctor` does not check provider health/reachability

**Exact pinpoint:** `claw doctor` validates local configuration (API keys present, settings.json parseable, etc.) but does NOT ping the configured provider endpoint to verify: (1) network reachability, (2) authentication validity, (3) model availability, (4) rate-limit status. During sustained upstream degradation (e.g., 20+ `500 empty_stream` failures over 9+ hours), users have no diagnostic tool to distinguish "my config is wrong" from "the provider is down."

**Live evidence:**
- gaebal-gajae's session hit `500 empty_stream` 20+ times across 9+ hours (2026-04-26 21:04 KST ~ 2026-04-27 06:33 KST)
- No `claw doctor` check could have detected upstream unavailability
- Users had to manually check status.anthropic.com or guess

**Why distinct:**
- #122b (claw doctor broad-path warning) — fixed a specific warning message, did NOT add provider health checks
- #292 (extreme-sustained-degradation escalation) — covers runtime escalation during conversation, NOT pre-flight diagnostics
- #291 (repeat-failure circuit-breaker) — covers runtime circuit-breaking, NOT diagnostic tooling

**Concrete delta landed:** ROADMAP.md appended with #293.

**Fix shape recorded:**
- `claw doctor --check-providers` flag (opt-in to avoid slow startup)
- Lightweight provider ping: send minimal request (e.g., `models/list` or single-token completion) to each configured provider
- Report: reachable/unreachable, auth-valid/auth-invalid, rate-limited/available
- Integration with #292 escalation: `claw doctor` output could suggest "provider X appears degraded, consider fallback Y"
- Regression test asserting provider-check path exists when flag is passed
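
The report categories in the fix shape above (reachable/unreachable, auth-valid/auth-invalid, rate-limited/available) could be derived from the outcome of the lightweight ping. The sketch below is an assumption about how that mapping might look, not recorded design: `ProviderHealth` and `classify_probe` are hypothetical names, and the only facts relied on are the standard HTTP meanings of 401/403, 429, and the 5xx range.

```rust
/// Outcome categories for one provider probe (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum ProviderHealth {
    Available,
    AuthInvalid,
    RateLimited,
    Degraded,
    Unreachable,
    /// Any status the sketch does not interpret; carried through verbatim.
    Unexpected(u16),
}

/// Classifies a probe result. `None` means no HTTP response was received
/// at all (DNS failure, connection refused, timeout).
pub fn classify_probe(status: Option<u16>) -> ProviderHealth {
    match status {
        None => ProviderHealth::Unreachable,
        Some(200..=299) => ProviderHealth::Available,
        Some(401) | Some(403) => ProviderHealth::AuthInvalid,
        Some(429) => ProviderHealth::RateLimited,
        Some(500..=599) => ProviderHealth::Degraded,
        Some(code) => ProviderHealth::Unexpected(code),
    }
}
```

Under this mapping the `500 empty_stream` failures cited in the live evidence would surface as `Degraded`, which is the signal that separates "the provider is down" from a local configuration problem.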

---

### #294 — First-run onboarding has no guided setup flow

**Exact pinpoint:** A new user who installs claw-code and runs `claw` without any env vars set gets an auth error or cryptic failure with no guidance on what to configure. There is no `claw setup` command, no interactive wizard, no `--init` flag, no first-run detection, and no clear "here is what you need to set up first" message. The user must read documentation (if they find it) to discover the required env var names, provider options, and settings.json location.

**Live evidence:**
- PR claw-code#2810 opened 2026-04-27 04:09 KST: "feat: interactive provider wizard (/setup, claw setup, Ctrl+P)" — independent implementation evidence that this is a recognized pain point
- docs/CONFIGURATION.md (cycle #429) had to document 13+ env vars from source grep — none of these are surfaced to new users at runtime
- `claw doctor` (per #293) validates config but does not GUIDE setup

**Why distinct:**
- #285 (declarative providers config) — covers config SOURCE-OF-TRUTH (env-vs-settings.json), NOT onboarding UX
- #245/#246 (declarative config, backend swap) — covers config structure, NOT first-run flow
- #293 (claw doctor provider health) — covers diagnostic tooling, NOT initial setup guidance
- PR #2810 implementation — implementation in progress, but pinpoint captures the discovery axis and acceptance criteria

**Concrete delta landed:** ROADMAP.md appended with #294; PR #2810 cross-referenced.

**Fix shape recorded:**
- `claw setup` / `claw init` command: interactive wizard (like PR #2810 proposes)
- First-run detection: if no API key configured AND no settings.json, show guided setup prompt
- Minimal setup path: detect provider intent from model flag (e.g., `claw --model claude-*` → prompt for `ANTHROPIC_API_KEY`)
- `claw doctor --setup` mode: not just validation but guided remediation
- Acceptance: `claw` with no config shows actionable setup guidance, not an opaque auth error

### #295 — Long-running worktree has no stale-branch detection or auto-sync warning

**Exact pinpoint:** When working in a long-running claw-code worktree (e.g., during a multi-hour dogfood session), there is no mechanism to detect or warn when the local branch is behind origin. Each new subagent cycle must manually run `git fetch && git pull --rebase` to avoid working on stale code. claw-code has no `claw sync`, no stale-worktree indicator in `claw branches`, no `claw status` that shows upstream divergence, and no pre-session freshness check.

**Live evidence:**
- Extended dogfood audit (cycles #410-#438, 13+ hours) required manual `git pull --rebase` at the start of every subagent cycle
- Without this step, subagents risk operating on stale HEAD, introducing rebase conflicts on push
- Q's 06:32 status: "claw-code worktrees present: `providers`, `batchtool`" — both invisible to `claw lanes` (#30) and neither has staleness detection

**Why distinct:**
- #30 (`claw lanes` stub) — covers session enumeration, NOT branch freshness
- #32 (`claw branches --status`) — covers branch status display, NOT auto-sync or staleness warning
- #38 (`claw new` worktrees invisible) — covers worktree discovery, NOT upstream sync

**Concrete delta landed:** ROADMAP.md appended with #295.

**Fix shape recorded:**
- `claw sync` command: fetch + rebase current branch against origin (single command)
- Pre-session freshness check: `claw` warns if local branch is >N commits behind origin
- `claw status` includes upstream divergence count (like `git status` shows `Your branch is behind by N commits`)
- `claw doctor` checks: worktree staleness as part of health check
- Integration with #38: marker file could include last-sync timestamp

---

### #296 — Tests have implicit brittleness assumptions during high-concurrency dogfood

**Exact pinpoint:** During extended dogfood audit (13+ hours, continuous subagent cycles with git rebasing, worktree syncing, parallel session management), several test categories show potential brittleness: (1) Timing-sensitive tests in `test_run_turn_loop_timeout.py` and `test_run_turn_loop_cancellation.py` use hard-coded wall-clock values (e.g. `timeout_seconds=0.2`, `time.sleep(0.05)`, `assert elapsed < 1.5`) that assume a lightly-loaded machine — under sustained CI or dogfood concurrency these margins can flip; (2) Session state not cleaned via pytest fixtures — tests use `unittest.TestCase` pattern without `addCleanup` or `tearDown` for session files; (3) CLI parity test explicitly `skip`s `delete-session`/`load-session` and `flush-transcript` sub-commands, leaving state-sensitive paths untested; (4) No `cargo test -- --test-threads=1` enforcement for Rust side, meaning parallel test execution could race on shared filesystem state (worktree markers, .claw directories). Tests pass in isolation but brittleness risk is evident under sustained load.

**Live evidence:**
- Extended audit cycles #410-#438 exercised test suite under continuous branching/rebasing/sync stress
- `test_run_turn_loop_cancellation.py:109` — `timeout_seconds=0.2` with `time.sleep(0.05)` mock: 4x margin disappears under load
- `test_run_turn_loop_timeout.py:72` — `assert elapsed < 1.5` with `timeout_seconds=0.3`: assumes 5x scheduling headroom
- `test_cli_parity_audit.py:206,216` — explicit skip comments for state-dependent commands
- No `@pytest.fixture` teardown discovered in any test file — session/file state can leak between runs

**Why distinct:**
- #286 (agent thread no-heartbeat) — covers runtime parallel agent lifecycle, NOT test isolation
- #38 (claw new worktree invisibility) — covers worktree discovery gap, NOT test cleanup
- #287 (auto-compaction timing) — covers production behaviour, NOT test timing assumptions

**Concrete delta landed:** ROADMAP.md appended with #296.

**Fix shape recorded:**
- Replace hard-coded timing margins with `pytest-timeout` + environment-aware multiplier (`CI=1` → 3x margin)
- Add `@pytest.fixture(autouse=True)` session teardown in conftest.py — clean up `.claw/` and session files post-test
- Un-skip `delete-session`/`load-session`/`flush-transcript` with proper tmp_path fixtures
- Rust CI: add `-- --test-threads=1 --nocapture` flag for brittleness detection under dogfood concurrency
- Add `pytest-repeat` run in CI (`--count=3`) to surface non-deterministic failures early

---

## Post-Merge Parity Matrix: claw-code vs. anomalyco/opencode

| Feature | claw-code Status | anomalyco/opencode | claw-code Gap |
|---------|------------------|-------------------|---------------|
| `claw lanes` (session enumeration) | Stub (#30) | Live (full state) | #30: Implement live enumeration |
| `claw branches --status` (divergence) | Unimplemented (#32) | Live (parity display) | #32: Add status output |
| `claw sync` (worktree freshness) | Unimplemented (#295) | Live (auto-sync) | #295: Add sync flow |
| `claw setup` (first-run wizard) | Unimplemented (#294) | Live (interactive) | #294: Add guided onboarding |
| `claw doctor --providers` (health check) | Unimplemented (#293) | Live (provider ping) | #293: Add provider diagnostics |
| Multi-provider routing (declarative) | Design phase (#245/#246/#285) | Live (full impl) | Phase A: Implement declarative config |
| Auto-compaction (context-aware) | Design phase (#287/#288/#289) | Live (tuned algorithm) | Phase B: Implement compaction |
| Streaming error envelope | Design phase (#290/#291/#292) | Live (all edge cases) | Phase B: Implement resilience |
| Tool-result atomic writes | Design phase (#254/#268/#274) | Live (MCP refresh) | Phase C: Implement tool lifecycle |
| Session persistence versioning | Design phase (#278/#279) | Live (migrations) | Phase D: Implement persistence |
| CLI `--max-turns` / `--cwd` | Design phase (#262/#267/#272) | Live (flexible dispatch) | Phase E: Implement dispatch |

### #297 — MCP plugin connection crash has no mid-session graceful recovery

**Exact pinpoint:** When an external MCP server process dies, drops its connection, or times out during an active claw-code session, there is no recovery mechanism: the session hangs or errors with no user-facing guidance, no retry attempt, no graceful degradation to built-in tools, and no notification that tool availability has changed. Users must manually restart the MCP server and re-launch claw-code.

**Live evidence:**
- Clawhip nudge prompt explicitly lists "MCP/plugin lifecycle breakage" as a priority discovery area across all cycles
- Extended dogfood audit (14+ hours) exercised multi-process environments where MCP server stability is a real concern
- No `reconnect`, `retry`, or `mcp.*error` recovery logic found in `rust/crates/` source

**Why distinct:**
- #280 (MCP crate refresh) — covers dependency/API update, NOT runtime connection recovery
- #254 (tool-result atomic writes) — covers result delivery durability, NOT connection lifecycle
- #268 (tool rendering parity) — covers output display, NOT plugin connection management
- #286 (agent background thread lifecycle) — covers agent threads, NOT external MCP process lifecycle

**Concrete delta landed:** ROADMAP.md appended with #297.

**Fix shape recorded:**
- MCP connection health monitor: periodic heartbeat to MCP server (ping/pong or lightweight probe)
- Graceful degradation: on connection loss, mark affected tools as unavailable, notify user, continue with remaining tools
- Retry with backoff: attempt reconnection up to N times before surfacing error
- User notification: `[MCP: <server-name> disconnected — retrying (1/3)]` style status message
- `claw doctor --mcp` checks: validate all configured MCP servers are reachable
- Integration with #293 (claw doctor provider health): unified health-check command
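
The "retry with backoff" item could be sketched as below. This is an illustration under assumptions rather than the recorded design: the doubling schedule, the cap, and the names `backoff_delay` and `reconnect_with_backoff` are all hypothetical, and both the connection attempt and the sleep are injected as closures so that the loop can be exercised without a real MCP server or real waiting.

```rust
use std::time::Duration;

/// Delay before the next attempt: `base` doubled for each attempt already
/// made, never exceeding `cap`. Attempt numbers start at 1.
pub fn backoff_delay(attempt: u32, base: Duration, cap: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt.saturating_sub(1));
    base.checked_mul(factor).map_or(cap, |delay| delay.min(cap))
}

/// Calls `connect` up to `max_attempts` times, sleeping between failures.
/// Returns the 1-based attempt number that succeeded, or an error message
/// once the attempts are exhausted.
pub fn reconnect_with_backoff<C, S>(
    server: &str,
    max_attempts: u32,
    base: Duration,
    cap: Duration,
    mut connect: C,
    mut sleep: S,
) -> Result<u32, String>
where
    C: FnMut() -> bool,
    S: FnMut(Duration),
{
    for attempt in 1..=max_attempts {
        if connect() {
            return Ok(attempt);
        }
        // No point in sleeping after the final failed attempt.
        if attempt < max_attempts {
            sleep(backoff_delay(attempt, base, cap));
        }
    }
    Err(format!("{}: gave up after {} attempts", server, max_attempts))
}
```

In production the `sleep` argument would be `std::thread::sleep` (or an async timer); the `Err` branch is the point at which the affected tools would be marked unavailable and the error surfaced to the user.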

### #298 — Event/log output is unstructured; no machine-readable or queryable format

**Exact pinpoint:** claw-code emits diagnostic output to stderr in human-readable format but provides no: (1) structured log format (JSON-lines/NDJSON), (2) documented log level filtering, (3) session-scoped log capture (parallel sessions interleave), (4) machine-readable event stream for CI/monitoring, (5) `--log-file` flag. Operators running claw-code in automated pipelines cannot parse events without brittle regex.

**Live evidence:**
- Extended dogfood audit (14+ hours, 43 subagent cycles) relied on Discord post-hoc summaries rather than queryable session logs
- Clawhip nudge prompt lists "event/log opacity" as a recurring priority discovery area
- Parallel subagent sessions produced interleaved stderr with no session discriminator

**Why distinct:**
- #292 (extreme-sustained-degradation escalation) — runtime user-facing escalation, NOT log structure
- #290 (stream-init error envelope) — API response envelope, NOT diagnostic logging
- #283 (skip-reason typing) — compaction event typing, NOT general logging

**Fix shape recorded:**
- `RUST_LOG` documentation in CONFIGURATION.md (immediate, low-effort)
- `--log-format json` flag → emit NDJSON to stderr or `--log-file`
- Session discriminator: `session_id` field in each log line
- `claw logs` subcommand: tail/filter session logs (long-term)
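
One NDJSON line carrying the `session_id` discriminator could be produced as in the sketch below. It is deliberately dependency-free and purely illustrative: the field set (`session_id`, `level`, `message`) beyond `session_id` itself is an assumption, `log_line` and `escape_json` are hypothetical helpers, and a real implementation would more likely use a JSON serialization crate.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use the \u00XX form.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds one NDJSON record (a single line, no trailing newline).
pub fn log_line(session_id: &str, level: &str, message: &str) -> String {
    format!(
        "{{\"session_id\":\"{}\",\"level\":\"{}\",\"message\":\"{}\"}}",
        escape_json(session_id),
        escape_json(level),
        escape_json(message)
    )
}
```

Because every record is a self-contained JSON object on its own line, interleaved output from parallel sessions can be separated afterwards by filtering on `session_id`.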

### #299 — `/resume latest` session search is scoped to current workspace only

**Exact pinpoint:** The `/resume latest` command searches for the most recent session only within the current working directory/workspace. If a user switches directories or starts claw-code from a different workspace, `/resume latest` cannot find sessions from other workspaces even though those sessions are stored and theoretically accessible. This creates a surprising UX gap: users who move between projects cannot resume recent sessions without knowing the exact session ID.

**Live evidence:**
- PR ultraworkers/claw-code#2811 opened 2026-04-27 09:36 KST: "fix: /resume latest searches all workspaces" — independent upstream fix confirming this as a real bug
- Extended dogfood audit (14+ hours) involved frequent workspace switches (worktree: providers, batchtool, main) — `/resume` behavior across workspaces was a source of friction

**Why distinct:**
- #30 (`claw lanes` stub) — session enumeration in lane board, NOT resume-scope search
- #278/#279 (session persistence versioning) — session format/migration, NOT search scope
- #295 (stale-branch detection) — worktree hygiene, NOT session discovery across workspaces

**Concrete delta landed:** ROADMAP.md appended with #299; PR #2811 cross-referenced as upstream fix.

**Fix shape recorded:**
- `/resume latest` should search all known workspaces (not just cwd)
- Session store lookup: scan all workspace-scoped stores, return globally most-recent
- `--workspace <path>` flag: opt-in to workspace-scoped search (restore original behavior)
- Upstream: track PR #2811 merge status; port fix if claw-code diverges
|
||||
|
||||
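The lookup described in the fix shape can be sketched as a small selection function. This is an illustration under assumptions, not the code in PR #2811: `SessionRef`, its fields, and `resolve_latest` are hypothetical names, and sessions are assumed to carry a last-updated timestamp in milliseconds.

```rust
/// Hypothetical summary of one stored session.
#[derive(Debug, Clone, PartialEq)]
struct SessionRef {
    workspace: String,
    session_id: String,
    updated_at_ms: u64,
}

/// Pick the session `/resume latest` would open.
/// `None` searches every known workspace store; `Some(path)` restricts the
/// search to one workspace, mirroring the opt-in `--workspace <path>` flag.
fn resolve_latest<'a>(
    sessions: &'a [SessionRef],
    workspace_filter: Option<&str>,
) -> Option<&'a SessionRef> {
    sessions
        .iter()
        .filter(|s| workspace_filter.map_or(true, |ws| s.workspace == ws))
        .max_by_key(|s| s.updated_at_ms)
}
```

Keeping the filter optional means the global search becomes the default while the original workspace-scoped behavior stays reachable through one argument.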
### #300 — Deprecated `enabledPlugins` key in settings.json triggers warnings on every invocation; no migration/auto-upgrade path

**Exact pinpoint:** `~/.claw/settings.json` ships with `enabledPlugins` as the key for plugin configuration. The config validator at `rust/crates/runtime/src/config_validate.rs:319-320` correctly identifies this as deprecated (replacement: `plugins.enabled`) and emits a deprecation warning — but three warnings fire on every `claw` invocation (startup, config load, REPL init phases). There is no `claw config migrate` subcommand, no in-place auto-rewrite, and no first-run migration prompt. Users who installed via the standard path land in a permanently-noisy state with no self-service resolution path.

**Live evidence:**
- `claw status` and `claw doctor` both emit: `warning: /Users/yeongyu/.claw/settings.json: field "enabledPlugins" is deprecated (line 2). Use "plugins.enabled" instead` — three times per invocation
- `~/.claw/settings.json` content confirmed: `{"enabledPlugins": {"example-bundled@bundled": false}}`
- Config validator code: `rust/crates/runtime/src/config_validate.rs:319-320` — field mapped, replacement documented, warning emitted
- Zero `config migrate` / `claw migrate` / `claw upgrade-config` surface in `claw --help` or `SlashCommandSpec`

**Why distinct:**
- #293 (claw doctor provider health) — runtime health checks, NOT config schema migration
- #285 (declarative-providers/models/websearch missing) — missing config fields, NOT deprecated key migration
- #284 (ultraplan empty-shell) — slash command, NOT config lifecycle

**Fix shape recorded:**
- `claw config migrate` subcommand: reads settings.json, rewrites `enabledPlugins` → `plugins.enabled`, atomic write
- Or: auto-migration on startup with `[migrated settings.json: enabledPlugins → plugins.enabled]` one-time notice
- Install-time: generate `settings.json` with `plugins.enabled` key (not deprecated form) from the start
- Acceptance: zero deprecation warnings on fresh install; `claw doctor` green on config status

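The key rewrite at the heart of the migration can be sketched on an in-memory document. Everything here is an assumption made for illustration: the tiny `Json` enum stands in for a real JSON value type, `migrate_enabled_plugins` is a hypothetical name, and `plugins.enabled` is read as an `enabled` key nested inside a `plugins` object. File I/O and the atomic write are left out.

```rust
use std::collections::BTreeMap;

/// Minimal stand-in for a JSON value; just enough for this example.
#[derive(Debug, Clone, PartialEq)]
enum Json {
    Bool(bool),
    Object(BTreeMap<String, Json>),
}

/// Move a top-level `enabledPlugins` value to `plugins.enabled`.
/// Returns true when the document changed, so the caller can print a
/// one-time notice and persist the result.
fn migrate_enabled_plugins(root: &mut BTreeMap<String, Json>) -> bool {
    // Refuse to migrate if `plugins` exists but is not an object.
    if matches!(root.get("plugins"), Some(Json::Bool(_))) {
        return false;
    }
    let Some(legacy) = root.remove("enabledPlugins") else {
        return false; // nothing to migrate
    };
    let plugins = root
        .entry("plugins".to_string())
        .or_insert_with(|| Json::Object(BTreeMap::new()));
    if let Json::Object(map) = plugins {
        // Keep an explicit new-style value if both keys were present.
        map.entry("enabled".to_string()).or_insert(legacy);
    }
    true
}
```

Running the function a second time returns `false`, which is what makes a one-time notice possible.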
**Blocker:** None

**Source:** Dogfood cycle #435 (2026-04-27) — discovered via live `claw status` / `claw doctor` invocation from scratch dir `/tmp/cdQ`

### #300 — Prompt misdelivery: ambiguous user commands don't route intelligently to correct tool

**Exact pinpoint:** When multiple tools are available (e.g., MCP tools + built-in exec + browser), ambiguous user commands (e.g., "run this", "check that") are routed using tool declaration order or first-match heuristics. There is no: (1) semantic matching of user intent to tool capability, (2) user-facing disambiguation ("Did you mean exec.sh or shell? Use `/exec <cmd>` or `/shell <cmd>`"), (3) clarification request when tool invocation fails, (4) context about why a tool failed (auth error? wrong args? timeout?).

**Live evidence:**
- Clawhip nudge prompt explicitly lists "prompt misdelivery" as a priority discovery category across all cycles
- Extended dogfood audit (14+ hours, 43 subagent cycles) involved varied tool invocations (git, cargo, grep, bash) where tool routing was implicit
- No semantic intent-matching logic found in source

**Why distinct:**
- #254 (tool-result atomic writes) — covers result delivery durability, NOT command routing
- #268 (tool rendering parity) — covers tool output display, NOT input routing/dispatch
- #286 (agent background thread lifecycle) — covers parallel execution, NOT routing logic

**Concrete delta landed:** ROADMAP.md appended with #300.

**Fix shape recorded:**
- Semantic tool dispatcher: match user command (e.g., "run", "exec", "shell") to registered tools
- Ambiguity resolution: `/tool-name: <command>` prefix syntax for explicit routing
- Tool failure context: propagate error reason to user ("exec failed: command not found" vs. "timeout")
- Clarification UX: when ambiguous, ask user to specify tool (like shell completions for `/`)
- Integration with #286 (agent lifecycle): tool routing aware of parallel execution context

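The explicit-routing prefix from the fix shape is easy to prototype. The sketch below assumes the syntax is exactly `/tool-name: <command>` with a single colon separator; `parse_explicit_route` and the set of characters allowed in a tool name are assumptions, not existing claw-code behavior.

```rust
/// Split `/tool-name: <command>` into `(tool, command)`.
/// Returns `None` for input that is not an explicit route, so the caller can
/// fall back to normal dispatch or ask the user to disambiguate.
fn parse_explicit_route(input: &str) -> Option<(&str, &str)> {
    let rest = input.trim_start().strip_prefix('/')?;
    let (tool, command) = rest.split_once(':')?;
    let tool = tool.trim();
    let command = command.trim();
    // Hypothetical naming rule: ASCII letters, digits, '-', '_' and '.'.
    let valid_name = !tool.is_empty()
        && tool
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_' || c == '.');
    if valid_name && !command.is_empty() {
        Some((tool, command))
    } else {
        None
    }
}
```

Only the first colon splits the input, so commands that themselves contain colons pass through unchanged.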
### #301 — No pre-built binary distribution or install script for new users/contributors

**Exact pinpoint:** Installing claw-code requires cloning the repository and building from source (`cargo build --release`), which on a 9-crate workspace takes 5-10+ minutes on a cold build. There is no: (1) GitHub Releases page with pre-built binaries for major platforms (macOS arm64/x86_64, Linux x86_64/arm64, Windows), (2) install script (`curl -fsSL | sh` pattern), (3) Homebrew formula (`brew install claw-code`), (4) Docker image (`docker pull ultraworkers/claw-code`), (5) `cargo install claw-code` via crates.io. This is the first and highest-friction step for any new contributor or user.

**Live evidence:**
- CONTRIBUTING.md (cycle #411) documents `cargo build` as the only build path
- ARCHITECTURE.md (cycle #426) identifies 9 Rust crates in the workspace — cold build is substantial
- No GitHub Releases page, no `install.sh`, no `Dockerfile` found in repo root
- Clawhip nudge prompt lists "startup friction" as a priority discovery category

**Why distinct:**
- #294 (first-run onboarding wizard) — covers in-app guided setup, NOT binary distribution
- #293 (claw doctor health check) — covers diagnostic tooling, NOT installation
- CONTRIBUTING.md — documents build, but cannot solve the cold-build friction itself

**Concrete delta landed:** ROADMAP.md appended with #301.

**Fix shape recorded:**
- GitHub Actions release workflow: `cargo build --release` on tag push, upload artifacts per platform
- `install.sh`: detect platform, download correct binary from GitHub Releases, chmod +x, add to PATH
- Homebrew tap: `brew install ultraworkers/tap/claw-code`
- Docker: multi-stage Dockerfile (builder + minimal runtime image)
- `cargo install`: publish to crates.io as `claw-code` once API stabilizes
- Quick start in README: one-liner install command at top

### #302 — `usage` block in `claw status --output-format json` always reports zero; no live context-window budget signal

**Exact pinpoint:** `claw --output-format json status` returns a `usage` block with all fields zeroed (`cumulative_input: 0`, `cumulative_output: 0`, `estimated_tokens: 0`, etc.) regardless of session state. There is no way for a downstream tool (CI orchestrator, wrapper script, UI) to programmatically read current token consumption or estimate remaining context budget without starting an actual conversation turn. The sandbox also silently falls back to `filesystem_active: true` with `supported: false` — the JSON carries no machine-readable reason why namespace isolation is unavailable other than the prose `fallback_reason` field.

**Live evidence:**
- `claw --output-format json status` (HEAD d01ebd3, scratch dir `/tmp/cdR`, 2026-04-27 10:06 KST):
  ```json
  "usage": {
    "cumulative_input": 0,
    "cumulative_output": 0,
    "cumulative_total": 0,
    "estimated_tokens": 0,
    "latest_total": 0,
    "messages": 0,
    "turns": 0
  }
  ```
- Three duplicate deprecation warnings emitted to stderr before JSON (confirms #300 is still live)
- `sandbox.supported: false` + `sandbox.filesystem_active: true` — contradictory state with no structured error code

**Why distinct:**
- #298 (unstructured event/log output) — covers log stream, NOT the status JSON schema
- #293 (claw doctor health check) — covers diagnostic subcommand, NOT status JSON fields
- #300 (deprecated config key / no migration) — same session, separate gap (config lifecycle vs. status reporting)

**Concrete delta landed:** ROADMAP.md appended with #302.

**Fix shape recorded:**
- `usage` fields: populate from session store on disk (last known values) even when no active REPL session; emit `"session_active": false` flag to signal cold-read vs. live
- Add `context_limit` field: model's max context window (from model registry) — enables budget math externally
- Sandbox status: replace prose `fallback_reason` with structured `fallback_code` enum (`namespace_unavailable`, `non_linux`, `insufficient_caps`, etc.)
- Acceptance: `claw status --output-format json | jq .usage.context_limit` returns non-zero on any platform; sandbox state has machine-readable `fallback_code`

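Two of the proposed fields lend themselves to a short sketch: the structured `fallback_code` and the budget math that `context_limit` would enable. The type and function names below are hypothetical; the three enum tokens are the ones listed in the fix shape, and the source's "etc." means the real set may be larger.

```rust
/// Machine-readable reason why sandbox isolation fell back.
/// Only the three tokens named in the fix shape are modeled here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SandboxFallbackCode {
    NamespaceUnavailable,
    NonLinux,
    InsufficientCaps,
}

impl SandboxFallbackCode {
    /// snake_case token as it would appear in the status JSON.
    fn as_str(self) -> &'static str {
        match self {
            Self::NamespaceUnavailable => "namespace_unavailable",
            Self::NonLinux => "non_linux",
            Self::InsufficientCaps => "insufficient_caps",
        }
    }
}

/// Remaining context budget a downstream tool could derive from
/// `usage.context_limit` and `usage.estimated_tokens`; never underflows.
fn remaining_context(context_limit: u64, estimated_tokens: u64) -> u64 {
    context_limit.saturating_sub(estimated_tokens)
}
```

With both values in the status JSON, a wrapper script could decide when to compact or rotate without starting a conversation turn.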
**Blocker:** None

**Source:** Dogfood cycle #436 (2026-04-27 10:06 KST) — discovered via live `claw --output-format json status` from scratch dir `/tmp/cdR`, HEAD d01ebd3

### #303 — Session rotation silently deletes oldest rotated log when MAX_ROTATED_FILES (3) is exceeded; `--resume` loses conversation history without warning

**Exact pinpoint:** `session.rs` defines `MAX_ROTATED_FILES = 3` and `ROTATE_AFTER_BYTES = 256 KiB`. After every `append_to_file` call, `rotate_session_file_if_needed` renames the active file to `{stem}.rot-{timestamp_ms}.jsonl` when it exceeds 256 KiB, then `cleanup_rotated_logs` calls `fs::remove_file` on the oldest rotated file(s) to keep at most 3 rotated files. No diagnostic, warning, event, or UI signal is emitted before or after the deletion. A long session that exceeds roughly 1 MiB of JSONL (the active file plus 3 rotated files at 256 KiB each) has its earliest messages permanently deleted on disk. Because `--resume` replays conversation from on-disk JSONL, older turns are silently lost from the resumed context.

**Live evidence (static trace — no running binary required):**
- `rust/crates/runtime/src/session.rs:13-14`: `ROTATE_AFTER_BYTES = 256 * 1024`, `MAX_ROTATED_FILES = 3`
- `session.rs:1105-1138` (`cleanup_rotated_logs`): sorts rotated paths, computes `remove_count = rotated_paths.len().saturating_sub(MAX_ROTATED_FILES)`, then `fs::remove_file(stale_path)` for each — no log, no event, no metric
- `session.rs:207,209`: `rotate_session_file_if_needed` + `cleanup_rotated_logs` called on every `append_to_file` invocation — deletion can trigger mid-session
- No caller in any crate emits a structured event or user-visible warning after `cleanup_rotated_logs` returns

**Why distinct:**
- #298 (unstructured event/log output) — covers the log stream format, NOT silent data deletion
- #302 (status JSON usage always zero) — covers usage reporting, NOT session file lifecycle
- #114 (filed 2026-04-18, post-/clear divergence) — covers phantom session IDs, NOT rotation data loss

**Fix shape recorded:**
- Emit a structured `session.log_rotated` event from `cleanup_rotated_logs` with `{ deleted_files: N, oldest_deleted_ms: T, session_id }` — consumers (TUI, CLI stderr, telemetry) can react
- Add `--history-limit` config key (default: 3) exposed in `settings.json` so power users can raise limit before hitting the cap
- On `--resume`, check if gap between last retained message and first message in current file exceeds a threshold; surface warning: "Session history truncated: N rotation files pruned, oldest available message: <timestamp>"
- Acceptance: `claw --output-format json status` includes `session.log_files_retained` and `session.log_files_deleted` counters

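One way to make the pruning observable is to separate the decision from the deletion, so the caller has something to report. The sketch below reuses the two constants and the `saturating_sub` computation quoted in the static trace; `RotationReport`, `plan_cleanup`, and `approx_retained_bytes` are hypothetical names, and it assumes rotated file names sort oldest-first (true while the millisecond timestamps have equal digit counts). It performs no filesystem calls.

```rust
const ROTATE_AFTER_BYTES: u64 = 256 * 1024;
const MAX_ROTATED_FILES: usize = 3;

/// What a cleanup pass would delete and keep; enough to populate a
/// `session.log_rotated` event or the proposed status counters.
#[derive(Debug, PartialEq)]
struct RotationReport {
    deleted_files: Vec<String>,
    retained_files: Vec<String>,
}

/// Decide which rotated logs to prune without touching the filesystem.
fn plan_cleanup(mut rotated_paths: Vec<String>) -> RotationReport {
    rotated_paths.sort(); // assumption: lexicographic order == age order
    let remove_count = rotated_paths.len().saturating_sub(MAX_ROTATED_FILES);
    // Everything from `remove_count` onward is kept; the prefix is stale.
    let retained_files = rotated_paths.split_off(remove_count);
    RotationReport {
        deleted_files: rotated_paths,
        retained_files,
    }
}

/// Approximate history retained on disk: the active file plus the rotated
/// files, each about `ROTATE_AFTER_BYTES` in size.
fn approx_retained_bytes() -> u64 {
    (MAX_ROTATED_FILES as u64 + 1) * ROTATE_AFTER_BYTES
}
```

A caller would delete `deleted_files`, then emit the report; when `deleted_files` is empty no event is needed.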
**Blocker:** None

**Source:** Dogfood cycle #437 (2026-04-27 10:16 KST) — discovered via static trace of `rust/crates/runtime/src/session.rs` (HEAD 4423774), scratch dir `/tmp/cdS`

---

### #305 — `/compact` has no dry-run or preview mode before irreversible compaction

**Exact pinpoint:** When a user runs `/compact`, the operation executes immediately with no preview of: (1) which messages/turns will be removed, (2) how much context will be freed, (3) what summary will replace the removed turns, (4) estimated new token count post-compaction. Compaction is effectively irreversible within a session — there is no `/compact --dry-run` or `/compact --preview` that shows the plan before committing. Users discover the compaction result only after it has already modified session state.

**Live evidence:**
- Extended dogfood audit (14+ hours) ran auto-compaction on long sessions with no ability to inspect the compaction plan
- Q's pinpoint #303 (silent log rotation) exposes adjacent risk: session content can be irreversibly lost without user awareness
- No `dry_run`, `preview`, or `plan` flag found in compact-related source

**Why distinct:**
- #283 (skip-reason typing) — covers why compaction was skipped, NOT preview before execution
- #287 (auto-compaction chunk-aware budgeting) — covers algorithm accuracy, NOT user preview
- #288 (auto-compaction preflight check) — covers pre-session check, NOT interactive preview
- Q's #303 (silent log rotation) — covers log file deletion, NOT in-session compaction preview

**Concrete delta landed:** ROADMAP.md appended with #305 (renumbered from #302).

**Fix shape recorded:**
- `/compact --dry-run`: show compaction plan (turns to remove, context freed, summary preview) without executing
- `/compact --preview`: interactive confirmation ("Remove 15 turns, free 4,200 tokens? [y/N]")
- Compaction summary displayed post-execution: "Compacted 15 turns → 1 summary block. Context: 45k → 28k tokens"
- Undo window: brief window to `/compact --undo` before session state is fully committed

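A dry-run mode mostly needs a plan that can be computed without mutating session state. The sketch below is an assumption-heavy illustration, not the logic in `compact.rs`: it supposes compaction replaces the oldest turns with one summary block and keeps the most recent `keep_recent` turns, and `CompactionPlan`, `plan_compaction`, and `render_preview` are hypothetical names.

```rust
/// Result of a dry run: what would change, without changing it.
#[derive(Debug, PartialEq)]
struct CompactionPlan {
    turns_to_remove: usize,
    tokens_before: u64,
    tokens_after: u64,
}

/// Compute a plan from per-turn token estimates (oldest first).
/// `summary_tokens` is the estimated size of the replacement summary.
fn plan_compaction(turn_tokens: &[u64], keep_recent: usize, summary_tokens: u64) -> CompactionPlan {
    let turns_to_remove = turn_tokens.len().saturating_sub(keep_recent);
    let tokens_before: u64 = turn_tokens.iter().sum();
    let removed: u64 = turn_tokens[..turns_to_remove].iter().sum();
    let tokens_after = if turns_to_remove == 0 {
        tokens_before // nothing to compact, no summary is added
    } else {
        tokens_before - removed + summary_tokens
    };
    CompactionPlan {
        turns_to_remove,
        tokens_before,
        tokens_after,
    }
}

/// Confirmation prompt in the style quoted in the fix shape.
fn render_preview(plan: &CompactionPlan) -> String {
    format!(
        "Remove {} turns, free {} tokens? [y/N]",
        plan.turns_to_remove,
        plan.tokens_before.saturating_sub(plan.tokens_after)
    )
}
```

The same plan value could back `--dry-run` (print and exit), `--preview` (print and wait for confirmation), and the post-execution summary line.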
**Blocker:** None

**Source:** Dogfood cycle #447 (2026-04-27 10:31 KST) — discovered via grep trace of `rust/crates/runtime/src/compact.rs` + `mock-anthropic-service` (HEAD d01ebd3), branch `feat/jobdori-168c-emission-routing`

**Note:** This entry was initially filed with number #302 (collision with the `status JSON usage` pinpoint above). Renumbered to #305 at 2026-04-27 10:59 KST merge sync (dogfood #448).

14
USAGE.md
@@ -693,3 +693,17 @@ Current Rust crates:
- `rusty-claude-cli`
- `telemetry`
- `tools`

## Documentation

- [ARCHITECTURE.md](docs/ARCHITECTURE.md) — System overview, crate layout, request flow
- [CONFIGURATION.md](docs/CONFIGURATION.md) — Env vars, settings.json, provider config
- [SUPPORTED_PROVIDERS.md](docs/SUPPORTED_PROVIDERS.md) — Provider/model matrix
- [API_REFERENCE.md](docs/API_REFERENCE.md) — JSON output envelope, error format
- [TROUBLESHOOTING.md](TROUBLESHOOTING.md) — Common failure modes and mitigation
- [ROADMAP.md](ROADMAP.md) — Pinpoint-driven development roadmap
- [CONTRIBUTING.md](CONTRIBUTING.md) — How to contribute, pinpoint format
- [PINPOINT_FILING_GUIDE.md](docs/PINPOINT_FILING_GUIDE.md) — Step-by-step pinpoint workflow
- [CHANGELOG.md](CHANGELOG.md) — Recent changes
- [SECURITY.md](SECURITY.md) — Responsible disclosure
- [CODE_OF_CONDUCT.md](CODE_OF_CONDUCT.md) — Community standards

174
docs/API_REFERENCE.md
Normal file
@@ -0,0 +1,174 @@
# API Reference — JSON Output Envelope Contract

This document describes the machine-readable JSON output emitted by `claw` when
`--output-format json` is passed. All JSON envelopes are written to **stdout**.
Stderr is reserved for non-contractual diagnostics only (see pinpoint #168c).

---

## Output Format Flag

```
claw [command] --output-format json
claw [command] --output-format text   # default
```

When `json` is active, **all** output (success and error) is emitted as a single
JSON object on stdout. Consumers must not parse stderr for errors.

---

## Success Envelope — `claw -p <prompt>`

Full non-compact run (default):

```json
{
  "message": "<final assistant text>",
  "model": "claude-opus-4-5",
  "iterations": 3,
  "auto_compaction": null,
  "tool_uses": [...],
  "tool_results": [...],
  "prompt_cache_events": [...],
  "usage": {
    "input_tokens": 1234,
    "output_tokens": 567,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  },
  "estimated_cost": "$0.0123"
}
```

Compact run (`--compact`):

```json
{
  "message": "<final assistant text>",
  "compact": true,
  "model": "claude-opus-4-5",
  "usage": {
    "input_tokens": 1234,
    "output_tokens": 567,
    "cache_creation_input_tokens": 0,
    "cache_read_input_tokens": 0
  }
}
```

### Field Reference

| Field | Type | Description |
|---|---|---|
| `message` | string | Final assistant reply text |
| `model` | string | Model identifier used for the turn |
| `iterations` | integer | Number of tool-use / re-prompt iterations |
| `compact` | boolean | Present and `true` when `--compact` mode was active |
| `auto_compaction` | object\|null | Non-null when auto-compaction fired (see below) |
| `tool_uses` | array | Tool calls made during the turn (TODO: verify schema) |
| `tool_results` | array | Results returned to the model (TODO: verify schema) |
| `prompt_cache_events` | array | Cache-hit/miss events (TODO: verify schema) |
| `usage.input_tokens` | integer | Input tokens billed |
| `usage.output_tokens` | integer | Output tokens billed |
| `usage.cache_creation_input_tokens` | integer | Tokens written to prompt cache |
| `usage.cache_read_input_tokens` | integer | Tokens served from prompt cache |
| `estimated_cost` | string | Human-readable USD cost estimate (e.g. `"$0.0123"`) |

#### `auto_compaction` sub-object

```json
{
  "removed_messages": 12,
  "notice": "Auto-compacted: removed 12 messages to free context."
}
```

---

## Error Envelope

When a command fails under `--output-format json`, an error envelope is written
to **stdout** (pinpoint #168c / #288):

```json
{
  "type": "error",
  "error": "<short human-readable reason>",
  "kind": "<snake_case error kind token>",
  "hint": "<optional actionable hint>"
}
```

### Error Envelope Fields

| Field | Type | Description |
|---|---|---|
| `type` | string | Always `"error"` |
| `error` | string | Short prose description of the failure |
| `kind` | string | Machine-readable snake_case token (see §Error Kinds) |
| `hint` | string\|null | Optional remediation hint |

### Error Kinds (selected)

`kind` values are classified by `classify_error_kind()`. Common tokens include:

- `not_yet_implemented` — command stub not yet shipped
- `config_error` — configuration file parse / validation failure
- `auth_error` — API key or credential problem
- `permission_denied` — tool-use permission denied
- `model_error` — upstream model API error

See pinpoint #266 (typed-error-kind) for the full taxonomy.

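On the consumer side, the `kind` token can be mapped onto a typed value so callers branch on variants rather than strings. This is a sketch for a downstream tool, not claw-code's own `classify_error_kind()`: the `ErrorKind` enum and its methods are hypothetical, it models only the five tokens listed above, and it keeps any other token verbatim because the full taxonomy lives in pinpoint #266.

```rust
/// Typed view of the `kind` field in the error envelope.
#[derive(Debug, Clone, PartialEq, Eq)]
enum ErrorKind {
    NotYetImplemented,
    ConfigError,
    AuthError,
    PermissionDenied,
    ModelError,
    /// Any token this sketch does not model, preserved as received.
    Other(String),
}

impl ErrorKind {
    /// Parse the snake_case token from the envelope.
    fn from_token(token: &str) -> Self {
        match token {
            "not_yet_implemented" => Self::NotYetImplemented,
            "config_error" => Self::ConfigError,
            "auth_error" => Self::AuthError,
            "permission_denied" => Self::PermissionDenied,
            "model_error" => Self::ModelError,
            other => Self::Other(other.to_string()),
        }
    }

    /// Convert back to the wire token.
    fn as_token(&self) -> &str {
        match self {
            Self::NotYetImplemented => "not_yet_implemented",
            Self::ConfigError => "config_error",
            Self::AuthError => "auth_error",
            Self::PermissionDenied => "permission_denied",
            Self::ModelError => "model_error",
            Self::Other(token) => token.as_str(),
        }
    }
}
```

Preserving unknown tokens keeps the consumer forward-compatible when new kinds are added.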
---

## Streaming Behavior

`claw` always uses streaming internally (HTTP chunked transfer to the Anthropic
API) but the **JSON output envelope is emitted once**, after the turn completes.
There is no per-token or per-chunk JSON stream exposed to the caller.

In REPL / interactive mode (`claw` with no `-p`) the JSON format applies only to
structured sub-commands, not to the interactive session itself.

---

## Status Snapshot (`claw status`)

```json
{
  "kind": "status",
  "status": "ok",
  "config_load_error": null,
  "model": "claude-opus-4-5",
  "model_source": "config",
  "model_raw": null,
  "permission_mode": "default",
  "usage": {
    "messages": 42,
    "turns": 10,
    "latest_total": 5678,
    "cumulative_input": 12345,
    "cumulative_output": 4567,
    "cumulative_total": 16912,
    "estimated_tokens": 16912
  },
  "workspace": {
    "cwd": "/Users/you/project",
    "project_root": "/Users/you/project",
    "git_branch": "main",
    "git_state": "clean",
    "changed_files": 0
  }
}
```

---

## Related Pinpoints

- **#288** — error-envelope stdout emission contract
- **#266** — typed-error-kind taxonomy
- **#168c** — `--output-format json` routes error envelopes to stdout
- **#247** — JSON envelope field preservation (hint / help text)

96
docs/CONFIGURATION.md
Normal file
@@ -0,0 +1,96 @@
# Configuration

claw-code configuration reference. For provider details, see [SUPPORTED_PROVIDERS.md](./SUPPORTED_PROVIDERS.md). For architecture, see [ARCHITECTURE.md](./ARCHITECTURE.md).

## Configuration Sources

claw-code reads configuration from multiple sources (in priority order):

1. **CLI flags** — highest priority (e.g., `--model`, `--max-turns`, `--cwd`)
2. **Environment variables** — `ANTHROPIC_*`, `OPENAI_*`, `XAI_*`, `DASHSCOPE_*`, `CLAW_*`, etc.
3. **settings.json** — `.claw/settings.json` in the project directory, or `~/.claw/settings.json` as a user-level default
4. **Hardcoded defaults** — lowest priority

> **Known issue (#283):** Auto-compaction threshold (`CLAUDE_CODE_AUTO_COMPACT_INPUT_TOKENS`) is env-var-only; no `settings.json` key exists yet.
> **Known issue (#282):** env-vs-config consolidation is incomplete; some settings only work in one source.

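The priority order above amounts to a first-present-wins chain. The sketch below shows that rule for a single string setting; `resolve_setting` is a hypothetical helper, and per the known issues not every real setting is available from every source.

```rust
/// Resolve one setting using the documented precedence:
/// CLI flag, then environment variable, then settings.json, then default.
fn resolve_setting(
    cli: Option<&str>,
    env: Option<&str>,
    settings: Option<&str>,
    default: &str,
) -> String {
    cli.or(env).or(settings).unwrap_or(default).to_string()
}
```

For example, a model passed with `--model` wins over `ANTHROPIC_MODEL`, which in turn wins over the `model` key in `settings.json`.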
## Environment Variables

### Provider Authentication

| Variable | Provider | Notes |
|----------|----------|-------|
| `ANTHROPIC_API_KEY` | Anthropic (Claude models) | Primary credential for Claude |
| `ANTHROPIC_AUTH_TOKEN` | Anthropic | Alternative to `ANTHROPIC_API_KEY` |
| `ANTHROPIC_BASE_URL` | Anthropic | Custom endpoint (e.g., proxy) |
| `OPENAI_API_KEY` | OpenAI-compatible | Required for `gpt-*` / `openai/` models |
| `OPENAI_BASE_URL` | OpenAI-compatible | Custom endpoint (OpenRouter, Ollama, etc.) |
| `XAI_API_KEY` | xAI (Grok models) | Required for `grok-*` models |
| `XAI_BASE_URL` | xAI | Custom endpoint |
| `DASHSCOPE_API_KEY` | DashScope (Qwen/Kimi models) | Required for `qwen-*` / `kimi-*` models |
| `DASHSCOPE_BASE_URL` | DashScope | Custom endpoint |

### Model Selection

| Variable | Default | Description |
|----------|---------|-------------|
| `ANTHROPIC_MODEL` | `claude-sonnet-4-6` | Default model when `--model` flag is not passed |

### Runtime Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `CLAUDE_CODE_AUTO_COMPACT_INPUT_TOKENS` | provider-specific | Auto-compaction trigger threshold (see #283) |
| `CLAW_CONFIG_HOME` | `~/.claw` | Override config directory location |
| `CLAWD_WEB_SEARCH_BASE_URL` | (built-in) | Custom base URL for web search tool |
| `CLAWD_TODO_STORE` | `~/.claw/todos` | Override todo storage path |
| `CLAWD_AGENT_STORE` | `~/.claw/agents` | Override agent store path |
| `RUST_LOG` | `info` | Log verbosity (`trace`/`debug`/`info`/`warn`/`error`) |

**Related paths also respected:** `CODEX_HOME`, `CLAUDE_CONFIG_DIR` (legacy compatibility).

## settings.json

Located at `.claw/settings.json` (project-local) or `~/.claw/settings.json` (user-level). Project-local takes precedence over user-level.

Example:

```json
{
  "model": "claude-sonnet-4-6"
}
```

`claw /config` shows the merged, resolved configuration from all sources.

> **Known gap (#285):** No declarative `providers` or `models` block in `settings.json`. Provider selection is currently model-prefix-based via a hardcoded `MODEL_REGISTRY`. See [SUPPORTED_PROVIDERS.md](./SUPPORTED_PROVIDERS.md) for the full provider/model matrix.

## Provider Selection

Provider is auto-selected from model name prefix or the `openai/` namespace prefix:

| Model pattern | Provider | Auth env |
|--------------|----------|----------|
| `claude-*` | Anthropic | `ANTHROPIC_API_KEY` / `ANTHROPIC_AUTH_TOKEN` |
| `gpt-*`, `openai/*` | OpenAI-compatible | `OPENAI_API_KEY` |
| `grok-*` | xAI | `XAI_API_KEY` |
| `qwen-*`, `kimi-*` | DashScope | `DASHSCOPE_API_KEY` |

When `OPENAI_BASE_URL` is set, the OpenAI-compatible provider is preferred for unrecognised model names — useful for Ollama or OpenRouter.

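The table and the `OPENAI_BASE_URL` rule can be read as a prefix match with one fallback. The sketch below mirrors that reading; it is not the hardcoded `MODEL_REGISTRY`, the `Provider` enum and `select_provider` are hypothetical names, and returning `None` for an unrecognised model when `OPENAI_BASE_URL` is unset is an assumption, since this document does not say what happens in that case.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Provider {
    Anthropic,
    OpenAiCompatible,
    Xai,
    DashScope,
}

/// Select a provider from the model name, following the pattern table.
/// `openai_base_url_set` reflects whether `OPENAI_BASE_URL` is present.
fn select_provider(model: &str, openai_base_url_set: bool) -> Option<Provider> {
    if model.starts_with("claude-") {
        Some(Provider::Anthropic)
    } else if model.starts_with("gpt-") || model.starts_with("openai/") {
        Some(Provider::OpenAiCompatible)
    } else if model.starts_with("grok-") {
        Some(Provider::Xai)
    } else if model.starts_with("qwen-") || model.starts_with("kimi-") {
        Some(Provider::DashScope)
    } else if openai_base_url_set {
        // Unrecognised name: prefer the OpenAI-compatible endpoint.
        Some(Provider::OpenAiCompatible)
    } else {
        None // assumption: behavior here is not documented
    }
}
```

This is why a local model name served through Ollama or OpenRouter resolves once `OPENAI_BASE_URL` points at that endpoint.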
## Session Storage

Sessions are stored in `~/.claw/sessions/<session-id>/` (or under `CLAW_CONFIG_HOME`). Each session contains:

- Conversation history (messages)
- Session metadata (model, created_at, etc.)
- Tool execution state

See pinpoints #278 (version-comparison) and #279 (unknown-field policy) for known session persistence caveats.

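The storage location described above can be derived with a few path joins. This is a sketch under assumptions: `session_dir` is a hypothetical helper, it treats `CLAW_CONFIG_HOME` as a full replacement for `~/.claw`, and it takes the home directory as an argument rather than looking it up.

```rust
use std::path::{Path, PathBuf};

/// Build the directory for one session.
/// `config_home_override` models the value of `CLAW_CONFIG_HOME`, if set.
fn session_dir(config_home_override: Option<&Path>, home_dir: &Path, session_id: &str) -> PathBuf {
    let config_home = match config_home_override {
        Some(path) => path.to_path_buf(),
        None => home_dir.join(".claw"),
    };
    config_home.join("sessions").join(session_id)
}
```

Passing both inputs explicitly keeps the function free of environment access, which makes it straightforward to test.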
## Related Documents

- [SUPPORTED_PROVIDERS.md](./SUPPORTED_PROVIDERS.md) — Provider/model matrix and auth details
- [ARCHITECTURE.md](./ARCHITECTURE.md) — Crate layout and request flow
- [TROUBLESHOOTING.md](../TROUBLESHOOTING.md) — Failure mitigation
- [ROADMAP.md](../ROADMAP.md) — Pinpoints by cluster