MCP server config now expands ${VAR} environment variable references
and ~/ home directory prefix in command, args, and url fields. Previously
these values were passed verbatim to execve/URL-parse, causing silent
"No such file or directory" failures for standard config patterns.
Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
deep_merge_objects now concatenates arrays when both layers provide
the same key. Previously permissions.allow, hooks.PreToolUse, etc.
from earlier config layers (e.g. ~/.claw/settings.json) were silently
discarded when a later layer (e.g. project .claw/settings.json) set
the same key. Now arrays are merged additively across layers.
Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
Hook config now supports the Claude Code structured hook format with
partial validation. Invalid hook entries are recorded in invalid_hooks
while valid siblings are retained, following the same pattern as MCP
partial validation (#440).
Key changes:
- RuntimeInvalidHookConfig now includes typed kind field (invalid_hooks_config
or unknown_hook_event) for machine-readable error classification
- Hook parsing collects all invalid entries instead of halting at first error
- Unknown hook event names recorded as invalid without rejecting valid hooks
- Legacy bare-string hooks still load with deprecation warnings
- Claude Code documented format loads without error (matcher + nested hooks)
- config/status/doctor JSON surfaces hook_validation metadata
- classify_error_kind maps hook errors to invalid_hooks_config
Generated with https://github.com/Yeachan-Heo/gajae-code
Co-authored-by: Gajae Code <dev@gajae-code.com>
Cherry-picked from PR #2816 onto current upstream/main, resolving
conflicts from PR #3015's merge (which added retry_after to ApiError
but some construction sites were missing it).
Commits preserved:
- ade85398: API timeout config, Retry-After header, configurable retry
- TimeoutConfig in HTTP client builder (connect 30s, request 5min)
- CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Retry-After header parsing on 429 responses
- ApiTimeoutConfig in runtime config (settings.json)
- 8a883430: retry 400 responses with transient gateway error bodies
- Detects known gateway phrases in 400 response bodies
- Marks them as retryable instead of hard-failing
- ed91a61e: add 'no parseable body' to CONTEXT_WINDOW_ERROR_MARKERS
- Some providers return 400 with 'no parseable body' for oversized
requests instead of a proper context_length_exceeded error
Commits skipped (already in upstream via PR #3015):
- 453ab642: optional id field (already merged)
- baa8d1ba: HTML detection in streaming (already merged)
- 33d2f789: JSON error detection in streaming (already merged)
8 files changed, 299 insertions, 80 deletions
- Add TimeoutConfig to HTTP client builder with connect_timeout (30s)
and request_timeout (5min) defaults, configurable via
CLAW_API_CONNECT_TIMEOUT and CLAW_API_REQUEST_TIMEOUT env vars
- Add with_timeout() builder to both AnthropicClient and
OpenAiCompatClient for per-client timeout configuration
- Parse Retry-After header on 429 responses and use it to override
exponential backoff delay when present
- Add ApiTimeoutConfig to runtime config with apiTimeout settings
in ~/.claw/settings.json (connectTimeout, requestTimeout, maxRetries)
- Add retry_after field to ApiError::Api for propagating rate limit
backoff hints through the retry pipeline
Add SUPPRESS_CONFIG_WARNINGS_STDERR AtomicBool flag in runtime/config.rs
and expose suppress_config_warnings_for_json_mode() via runtime crate.
In main.rs, scan raw argv for --output-format json before parse_args
and activate the flag so no settings-load warnings reach stderr on any
JSON-mode surface (status, sandbox, system-prompt, mcp list, skills list,
agents list, --resume /config*, etc.).
Text-mode surfaces are unaffected; prose deprecation warnings continue
to appear on stderr.
All 572+ tests pass (one pre-existing worker_boot failure unrelated).
JSON config output already carries collected config diagnostics in warnings[], so prose stderr emission must be reserved for text/local paths. Lazy permission-mode default resolution prevents an earlier config load from leaking the same deprecation before the JSON renderer runs.\n\nConstraint: ROADMAP #815 requires text mode to keep human stderr warnings while JSON config/list suppresses duplicate app-level config prose.\nRejected: Filtering all stderr in JSON mode | would hide cargo/compiler or unrelated diagnostics outside the app config warning path.\nConfidence: high\nScope-risk: narrow\nDirective: Keep load_collecting_warnings side-effect-free; use load() for human stderr emission.\nTested: cargo fmt; cargo test -p rusty-claude-cli --test output_format_contract config_json_reports_deprecations_structurally_without_stderr_duplicate_815; cargo test -p rusty-claude-cli --test output_format_contract; manual target/debug/claw JSON config fixture.\nNot-tested: cargo clippy -p rusty-claude-cli --all-targets -- -D warnings is blocked by pre-existing runtime dead_code/trident warnings.
- Remove retry_after: None from ApiError::Api structs in openai_compat.rs (field was removed)
- Remove SlashCommand::Team parse arm (variant was removed from enum)
- Add config_load_error_kind: None to doctor path StatusContext initializer
- Add Thinking arm to all ContentBlock match blocks in trident.rs
- Remove cargo fmt drift across commands, config, compact, tools, trident
Constraint: G003 worker outputs added config and startup evidence fields that must compile under focused runtime validation before leader push.
Rejected: pushing auto-checkpoints without leader validation | integrated tests initially failed to compile due missing imports and stale StartupEvidenceBundle fixtures.
Confidence: high
Scope-risk: narrow
Directive: When extending StartupEvidenceBundle, update all in-crate fixtures in the same change.
Tested: git diff --check; cargo fmt --manifest-path rust/Cargo.toml --all -- --check; cargo test --manifest-path rust/Cargo.toml -p runtime trusted_roots -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime startup -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p runtime worker_boot -- --nocapture; cargo test --manifest-path rust/Cargo.toml -p tools path_scope -- --nocapture; cargo check --manifest-path rust/Cargo.toml --workspace
Not-tested: full cargo test --workspace remains deferred during active G003 team work.
Co-authored-by: OmX <omx@oh-my-codex.dev>
## Problem
`runtime::config::tests::validates_unknown_top_level_keys_with_line_and_field_name`
intermittently fails during `cargo test --workspace` (witnessed during
#147 and #148 workspace runs) but passes deterministically in isolation.
Example failure from workspace run:
test result: FAILED. 464 passed; 1 failed
## Root cause
`runtime/src/config.rs::tests::temp_dir()` used nanosecond timestamp
alone for namespace isolation:
std::env::temp_dir().join(format!("runtime-config-{nanos}"))
Under parallel test execution on fast machines with coarse clock
resolution, two tests start within the same nanosecond bucket and
collide on the same path. One test's `fs::remove_dir_all(root)` then
races another's in-flight `fs::create_dir_all()`.
Other crates already solved this pattern:
- plugins::tests::temp_dir(label) — label-parameterized
- runtime::git_context::tests::temp_dir(label) — label-parameterized
runtime/src/config.rs was missed.
## Fix
Added process id + monotonically-incrementing atomic counter to the
namespace, making every callsite provably unique regardless of clock
resolution or scheduling:
static COUNTER: AtomicU64 = AtomicU64::new(0);
let pid = std::process::id();
let seq = COUNTER.fetch_add(1, Ordering::Relaxed);
std::env::temp_dir().join(format!("runtime-config-{pid}-{nanos}-{seq}"))
Chose counter+pid over the label-parameterized pattern to avoid
touching all 20 callsites in the same commit (mechanical noise with
no added safety — counter alone is sufficient).
## Verification
Before: one failure per workspace run (config test flake).
After: 5 consecutive `cargo test --workspace` runs — zero config
test failures. Only pre-existing `resume_latest` flake remains
(orthogonal, unrelated to this change).
for i in 1 2 3 4 5; do cargo test --workspace; done
# All 5 runs: config tests green. Only resume_latest flake appears.
cargo test -p runtime
# 465 passed; 0 failed
## ROADMAP.md
Added Pinpoint #149 documenting the gap, root cause, and fix.
Closes ROADMAP #149.
Three dead-code warnings eliminated from cargo check:
1. KNOWN_TOP_LEVEL_KEYS / DEPRECATED_TOP_LEVEL_KEYS in config.rs
- Superseded by config_validate::TOP_LEVEL_FIELDS and DEPRECATED_FIELDS
- Were out of date (missing aliases, providerFallbacks, trustedRoots)
- Removed
2. read_git_recent_commits in prompt.rs
- Private function, never called anywhere in the codebase
- Removed
3. workspace_sessions_dir in session.rs
- Public API scaffolded for session isolation (#41)
- Genuinely useful for external consumers (clawhip enumerating sessions)
- Added 2 tests: deterministic path for same CWD, different path for different CWDs
- Annotated with #[allow(dead_code)] since it is external-facing API
cargo check --workspace: 0 warnings remaining
430 runtime tests passing, 0 failing
Closes the startup-friction gap filed in ROADMAP (dd97c49).
WorkerCreate required trusted_roots on every call with no config-level
default. Any batch script that omitted the field stalled all workers at
TrustRequired with no auto-recovery path.
Changes:
- RuntimeFeatureConfig: add trusted_roots: Vec<String> field
- ConfigLoader: wire parse_optional_trusted_roots() for 'trustedRoots' key
- RuntimeConfig / RuntimeFeatureConfig: expose trusted_roots() accessor
- config_validate: add trustedRoots to TOP_LEVEL_FIELDS schema (StringArray)
- Tests: parses_trusted_roots_from_settings + trusted_roots_default_is_empty_when_unset
Callers can now set trusted_roots in .claw/settings.json:
{ "trustedRoots": ["/tmp/worktrees"] }
WorkerRegistry::spawn_worker() callers should merge config.trusted_roots()
with any per-call overrides (wiring left for follow-up).
Two real schema gaps found via dogfood (cargo test -p runtime):
1. aliases and providerFallbacks not in TOP_LEVEL_FIELDS
- Both are valid config keys parsed by config.rs
- Validator was rejecting them as unknown keys
- 2 tests failing: parses_user_defined_model_aliases,
parses_provider_fallbacks_chain
2. Deprecated keys were being flagged as unknown before the deprecated
check ran (unknown-key check runs first in validate_object_keys)
- Added early-exit for deprecated keys in unknown-key loop
- Keeps deprecated→warning behavior for permissionMode/enabledPlugins
which still appear in valid legacy configs
3. Config integration tests had assertions on format strings that never
matched the actual validator output (path:3: vs path: ... (line N))
- Updated assertions to check for path + line + field name as
independent substrings instead of a format that was never produced
426 tests passing, 0 failing.
This adds crate-level and type-level Rustdoc to the runtime crate's core exported types so downstream crates and contributors can understand the session, prompt, permission, OAuth, usage, and tool I/O primitives without spelunking every implementation file.
Constraint: The docs pass needed to stay focused on public runtime types without changing behavior
Rejected: Add blanket docs to every public item in one sweep | larger churn than needed for a targeted docs pass
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: When exporting new runtime primitives from lib.rs, add a short Rustdoc summary in the defining module at the same time
Tested: cargo build --workspace; cargo test --workspace
Not-tested: rustdoc HTML rendering beyond doc-test coverage
Validate hook arrays in each config file before deep-merging so malformed entries fail with source-path context instead of surfacing later as a merged hook parse error.
Constraint: Runtime hook config currently supports only string command arrays
Rejected: Add hook-specific schema logic inside deep_merge_objects | keeps generic merge helper decoupled from config semantics
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep hook validation source-aware before generic config merges so file-specific errors remain diagnosable
Tested: cargo build --workspace; cargo test --workspace
Not-tested: live claw --help against a malformed external user config
The worker boot registry now exposes the requested lifecycle states, emits structured trust and prompt-delivery events, and recovers from shell or wrong-target prompt delivery by replaying the last prompt. Supporting fixes keep MCP remote config parsing backwards-compatible and make CLI argument parsing less dependent on ambient config and cwd state so the workspace stays green under full parallel test runs.
Constraint: Worker prompts must not be dispatched before a confirmed ready_for_prompt handshake
Constraint: Prompt misdelivery recovery must stay minimal and avoid new dependencies
Rejected: Keep prompt_accepted and blocked as public lifecycle states | user requested the narrower explicit state set
Rejected: Treat url-only MCP server configs as invalid | existing CLI/runtime tests still rely on that shorthand
Confidence: high
Scope-risk: moderate
Reversibility: clean
Directive: Preserve prompt_in_flight semantics when extending worker boot; misdelivery detection depends on it
Tested: cargo build --workspace; cargo test --workspace
Not-tested: Live tmux worker delivery against a real external coding agent pane
This threads typed hook settings through runtime config, adds a shell-based hook runner, and executes PreToolUse/PostToolUse around each tool call in the conversation loop. The CLI now rebuilds runtimes with settings-derived hook configuration so user-defined Claude hook commands actually run before and after tools.
Constraint: Hook behavior needed to match Claude-style settings.json hooks without broad plugin/MCP parity work in this change
Rejected: Delay hook loading to the tool executor layer | would miss denied tool calls and duplicate runtime policy plumbing
Confidence: medium
Scope-risk: moderate
Reversibility: clean
Directive: Keep hook execution in the runtime loop so permission decisions and tool results remain wrapped by the same conversation semantics
Tested: cargo test; cargo build --release
Not-tested: Real user hook scripts outside the test harness; broader plugin/skills parity