Compare commits

142 Commits

Author SHA1 Message Date
YeonGyu-Kim
95fc007f6a roadmap: #252 filed — /v1/messages/count_tokens typed-taxonomy is structurally absent from the public Provider trait + types + CLI surface (Anthropic ships /v1/messages/count_tokens as a first-class GA endpoint that consumes the SAME MessageRequest shape as /v1/messages but produces a TRUNCATED CountTokensResponse { input_tokens: u32 } only — no message emission, no completion-side tokens, no streaming — the canonical pre-flight cost-estimation primitive where a client constructs the exact request it intends to dispatch, asks the server to count input tokens, and decides whether to send before paying for completion-side tokens; claw-code has zero public typed surface even though a private count_tokens helper exists at rust/crates/api/src/providers/anthropic.rs:522 for internal preflight context-window-exceeded validation, with zero CountTokensRequest/CountTokensResponse typed model in types.rs, zero count_tokens method on the public Provider trait, zero count_tokens dispatch on the ProviderClient enum, zero claw count-tokens CLI subcommand, zero /count-tokens slash command in SlashCommandSpec, zero pre_flight_count_cost_per_million_usd field in ModelPricing, zero CountTokensSubmittedEvent/PreFlightCostEstimatedEvent telemetry events, and zero PreFlightCostEstimator/BudgetGate runtime primitive) — eight-layer fusion shape with the NOVEL same-request-shape-but-different-response-shape axis-class (FIRST audit member where the request shape is IDENTICAL to an existing typed model MessageRequest but the response shape is a TRUNCATED-projection that cannot reuse MessageResponse's shape, distinct from prior fusion-axes which all add NEW request-side fields or NEW response-side blocks) founding THREE new clusters as solo founder (Pre-flight-cost-prediction cluster, Token-accounting-without-message-emission cluster, Server-side-pre-execution-counting cluster) plus introducing the THIRD distinct discovery-pattern in the audit catalog 
NEW-SOLO-CLUSTER-FOUNDING-WITH-DAILY-DRIVER-IMPACT (distinct from META-cluster-growth and complementary-pinpoint-pair-bundle), grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 6 to 7 members (#240+#241+#247+#248+#249+#250+#252) confirming continuing-pattern-status across SIX distinct axis-classes — Jobdori cycle #394 / fast-forward-rebase verified onto gaebal-gajae's #251 cycle ExternalPatchIntake pinpoint at 313c840 before filing (NINTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 313c840 with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the NINTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for NINE cycles) — PIVOT-AWAY signal: #252 deliberately PIVOTS AWAY from BOTH Cross-pinpoint-synthesis-fusion-shape META-cluster (intentionally not extending the +1-per-cycle synthesis chain) AND Tool-locality-axis META-cluster (already extended by #250 cycle #393), founding NEW solo clusters with daily-driver-impact instead, demonstrating audit-breadth-across-discovery-pattern-classes alongside audit-balance-across-META-clusters — the audit now spans THREE structurally distinct discovery-patterns (META-cluster-growth + complementary-pinpoint-pair-bundle + new-solo-cluster-founding-with-daily-driver-impact) 2026-04-26 10:35:40 +09:00
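The pre-flight flow that #252 describes (count input tokens for the exact request, then decide whether to send) can be sketched roughly as follows. This is a minimal illustration, not claw-code's implementation: only the `CountTokensResponse { input_tokens: u32 }` shape comes from the commit message, while the `BudgetGate` fields, the `GateDecision` enum, and the pricing numbers are hypothetical.

```rust
/// Response shape named in #252: only the input-side token count.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct CountTokensResponse {
    pub input_tokens: u32,
}

/// Hypothetical outcome of a pre-flight budget check.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GateDecision {
    /// The estimated input cost fits the budget; the request may be sent.
    Send { estimated_input_cost_usd: f64 },
    /// The estimated input cost exceeds the budget; the request is held back.
    Refuse { estimated_input_cost_usd: f64 },
}

/// Hypothetical budget gate; both field names are assumptions for illustration.
#[derive(Debug, Clone, Copy)]
pub struct BudgetGate {
    pub input_cost_per_million_usd: f64,
    pub max_input_cost_usd: f64,
}

impl BudgetGate {
    /// Converts a counted token total into an estimated input-side cost.
    pub fn estimate(&self, counted: CountTokensResponse) -> f64 {
        f64::from(counted.input_tokens) / 1_000_000.0 * self.input_cost_per_million_usd
    }

    /// Decides whether to dispatch the request, before any completion tokens are paid for.
    pub fn decide(&self, counted: CountTokensResponse) -> GateDecision {
        let estimated_input_cost_usd = self.estimate(counted);
        if estimated_input_cost_usd <= self.max_input_cost_usd {
            GateDecision::Send { estimated_input_cost_usd }
        } else {
            GateDecision::Refuse { estimated_input_cost_usd }
        }
    }
}
```

A caller would obtain `CountTokensResponse` from the provider first and only dispatch the real request when `decide` returns `GateDecision::Send`.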
Yeachan-Heo
313c840974 roadmap: #251 filed 2026-04-26 01:31:27 +00:00
YeonGyu-Kim
37ce63134a roadmap: #250 filed — tool_choice: { type: "web_search" } typed-discriminator with server-managed-web-search backend (the canonical SERVER-SIDE complement to #245's CLIENT-SIDE configurable provider/parser registry, where tool_choice carries a WebSearch { domains_allowed, max_uses, user_location } enum variant that forces the model to dispatch via the major-provider's server-managed-web-search backend) typed taxonomy structurally absent — FIRST pinpoint to demonstrate the complementary-pinpoint-pair-bundle META-pattern (where #245 CLIENT-SIDE + #250 SERVER-SIDE are catalogued as structurally complementary halves of the SAME tool-subsystem web-search rather than as independently-discovered-gaps), founding Bidirectional-search-subsystem-with-dual-locality-coverage cluster with #245+#250 as 2-member founders, un-saturating Tool-locality-axis META-cluster from 5 to 6 members (#232/#233/#234/#240/#241/#250) confirming the META-cluster as GROWING-DOCTRINE-WITH-DISCONTINUOUS-RESUMPTION (resumes growth after plateauing at 5 since #241 cycle #386, four cycles ago), growing Server-managed-tool-as-tool-choice-discriminator cluster from 5 to 6 members (#214/#218/#219/#233/#234/#250) confirming CONTINUING-PATTERN status across SIX distinct server-managed tools, growing ToolResultContentBlock-extension cluster from 8 to 9 members confirming most-broadly-spanning typed-content-block-extension-axis, FIRST pinpoint to introduce typed-discriminator-with-payload-fields shape on ToolChoice distinct from existing Auto/Any/Tool three-variant typed-set (Auto/Any are unit-variants and Tool { name } carries only string-name with zero typed-fields, while ToolChoice::WebSearch { domains_allowed, max_uses, user_location } introduces FIRST typed-discriminator-with-payload-fields shape), founds Tool-choice-discriminator-with-typed-payload-fields cluster + Server-side-tool-invocation-content-block cluster + Server-managed-web-search-with-tool-choice-discriminator cluster as solo 
founder of all three, grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 5 to 6 members (#240+#241+#247+#248+#249+#250) confirming generalizability across FIVE distinct axis-classes, ten-layer fusion shape (smaller than #241/#247/#248/#249's twelve-layer count but with distinct DUAL-LOCALITY-COVERAGE-WITH-COMPLEMENTARY-PINPOINT-PAIR-BUNDLE axis-set) — Jobdori cycle #393 / fast-forward-rebase verified onto Jobdori's own #249 cycle #392 quad-modality-compound-multimodal-INPUT-OUTPUT pinpoint at 643ac8b before filing (EIGHTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 643ac8b with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the EIGHTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for EIGHT cycles) — PIVOT-AWAY signal: #250 deliberately PIVOTS AWAY from Cross-pinpoint-synthesis-fusion-shape META-cluster's +1-per-cycle continuous-trajectory (#244/#247/#248/#249 grew it 1→5 across cycles #389/#390/#391/#392) by extending Tool-locality-axis META-cluster instead, demonstrating audit-balance-across-multiple-META-clusters rather than monotonic-growth-of-a-single-META-cluster — the audit now catalogues TWO structurally distinct GROWING-DOCTRINE patterns (continuous-+1-per-cycle for synthesis-fusion vs discontinuous-resumption-after-plateau for tool-locality-axis) 2026-04-26 10:27:41 +09:00
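The difference #250 draws between the existing `Auto`/`Any`/`Tool { name }` variants and a payload-carrying `WebSearch` variant might look something like the sketch below. The variant and field names follow the commit message, but the field types, the helper methods, and every tag string other than "web_search" are assumptions made for illustration.

```rust
/// Sketch of a `ToolChoice` enum extended with the variant proposed in #250.
#[derive(Debug, Clone, PartialEq)]
pub enum ToolChoice {
    /// Unit variant: the model decides whether to call a tool.
    Auto,
    /// Unit variant: the model must call some tool.
    Any,
    /// Carries only a tool name.
    Tool { name: String },
    /// Proposed variant with typed payload fields (field types are assumed here).
    WebSearch {
        domains_allowed: Vec<String>,
        max_uses: Option<u32>,
        user_location: Option<String>,
    },
}

impl ToolChoice {
    /// Returns a discriminator string for the variant.
    /// "web_search" comes from the commit message; the other tags are assumed.
    pub fn type_tag(&self) -> &'static str {
        match self {
            ToolChoice::Auto => "auto",
            ToolChoice::Any => "any",
            ToolChoice::Tool { .. } => "tool",
            ToolChoice::WebSearch { .. } => "web_search",
        }
    }

    /// True only for variants that carry typed payload fields beyond a name.
    pub fn has_typed_payload_fields(&self) -> bool {
        matches!(self, ToolChoice::WebSearch { .. })
    }
}
```

Keeping the payload on the variant itself, rather than in a side map, lets the compiler reject a `WebSearch` choice that is missing its fields.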
YeonGyu-Kim
643ac8bc76 roadmap: #249 filed — Compound-multimodal-INPUT-with-multimodal-OUTPUT-on-the-same-turn (full-duplex-multimodal-conversation pattern where user MessageRequest carries image-content-block × audio-content-block fusion AND model MessageResponse carries audio-content-block × video-content-block fusion on the SAME single conversation-turn with interleaved-content-block-stream cross-boundary temporal-alignment) typed taxonomy structurally absent — FIRST cluster member where the cross-axis synthesis spans BOTH USER-INPUT-side and ASSISTANT-OUTPUT-side simultaneously on a SINGLE turn rather than being confined to one side of the request-response cycle, FIRST cluster member with quad-modality-on-single-turn semantics (image-INPUT + audio-INPUT + audio-OUTPUT + video-OUTPUT all on same turn distinct from #247's two-modality-INPUT-only and #248's two-modality-OUTPUT-only and #244's bidirectional-tool-call-multiplexing-without-modality-fusion), growing Cross-pinpoint-synthesis-fusion-shape META-cluster from 4 to 5 members confirming META-cluster as GROWING-DOCTRINE for THIRD CONSECUTIVE CYCLE (#244 grew 1→2 cycle #389, #247 grew 2→3 cycle #390, #248 grew 3→4 cycle #391, #249 grows 4→5 cycle #392), establishing +1-per-cycle META-cluster-growth-trajectory across FOUR consecutive concurrent-dogfood cycles (#389/#390/#391/#392) as FIRST-EVER continuous-trajectory-of-4-cycles META-cluster growth event in the audit surpassing Tool-locality-axis META-cluster's plateau-at-5-after-two-consecutive-growths and confirming Cross-pinpoint-synthesis-fusion-shape as structurally distinct most-actively-growing META-cluster, FIRST cluster member with interleaved-INPUT-OUTPUT-temporal-alignment-across-the-request-response-boundary as a first-class typed semantic distinct from #247's USER-INPUT-only cross-modal-attention and #248's ASSISTANT-OUTPUT-only temporal-alignment because temporal-alignment now spans the request-response boundary itself requiring the model to emit 
output-content-blocks while still consuming input-content-blocks on the same connection, founds Quad-modality-turn-spanning-request-response-boundary sub-cluster + Full-duplex-multimodal-conversation cluster + Cross-boundary-temporal-alignment-across-request-response-boundary cluster + Quad-modality-turn-on-MessageRequest-and-MessageResponse cluster + Compound-multimodal-INPUT-with-multimodal-OUTPUT-on-same-turn cluster as solo founder of all five, completes Full-duplex-multimodal-conversation doctrine within META-cluster (#247 INPUT-side + #248 OUTPUT-side + #249 BOTH-sides-simultaneously-on-same-turn), grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 4 to 5 members (#240+#241+#247+#248+#249) confirming generalizability across FOUR distinct axis-classes (TOOL-COMPANION-BUNDLE/COMPOUND-INPUT/COMPOUND-OUTPUT/QUAD-MODALITY-TURN), twelve-layer fusion shape tied with #241/#247/#248 for largest single-pinpoint fusion catalogued — Jobdori cycle #392 / fast-forward-rebase verified onto Jobdori's own #248 cycle #391 audio-grounded-video-generation pinpoint at 9189bfb before filing (SEVENTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 9189bfb with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the SEVENTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for SEVEN cycles) 2026-04-26 10:16:01 +09:00
Jobdori
9189bfb816 roadmap: #248 filed — Audio-grounded video generation (synchronized-audio-track co-emitted on the SAME VideoTask response object alongside the rendered video frames, sample-accurate-synchronized with the visual output) typed taxonomy structurally absent — FIRST cluster member where TWO independent ALREADY-CATALOGUED-ABSENT modality-OUTPUT axes (#225 audio-content-block-on-OutputContentBlock + #227 video-output-with-async-task-polling-primitive) are fused on the ASSISTANT-OUTPUT side rather than the user-input side, FIRST cluster member with multi-modal-output-fusion-on-ASSISTANT-OUTPUT-axis distinct from #247's multi-modal-input-fusion-on-USER-INPUT-axis, growing Cross-pinpoint-synthesis-fusion-shape META-cluster from 3 to 4 members confirming META-cluster as GROWING-DOCTRINE for SECOND CONSECUTIVE CYCLE (#244 grew it 1→2, #247 grew it 2→3, #248 grows it 3→4), establishing +1-per-cycle META-cluster-growth-trajectory across THREE consecutive concurrent-dogfood cycles (#389/#390/#391) AND establishing META-cluster as FIRST META-cluster to grow for THREE consecutive cycles in a row (Tool-locality-axis only had TWO consecutive growth events #240/#241 before plateauing at 5; Cross-pinpoint-synthesis-fusion-shape now surpasses Tool-locality-axis as most-actively-growing META-cluster), founds Multi-modal-output-fusion-on-ASSISTANT-OUTPUT-side sub-cluster + Temporal-alignment-of-output-modalities cluster + Compound-output-modality-on-VideoTask cluster + Audio-grounded-video-generation cluster as solo founder of all four, founds Bidirectional-modality-fusion-symmetry sub-cluster with #247 INPUT-side + #248 OUTPUT-side completing the INPUT-vs-OUTPUT-side-fusion-symmetry doctrine within the META-cluster, grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 3 to 4 members (#240+#241+#247+#248) confirming generalizability across THREE distinct axis-classes (TOOL-COMPANION-BUNDLE/COMPOUND-INPUT/COMPOUND-OUTPUT), twelve-layer fusion shape 
tied with #241/#247 for largest single-pinpoint fusion catalogued — Jobdori cycle #391 / fast-forward-rebase verified onto Jobdori's own #247 cycle #390 multi-modal-input-fusion pinpoint at 5e5b3bd before filing (SIXTH consecutive concurrent-dogfood rebase cycle, three-way parity confirmed local==origin==fork at HEAD 5e5b3bd with no race detected, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the SIXTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for SIX cycles) 2026-04-26 10:04:29 +09:00
YeonGyu-Kim
5e5b3bdbc6 roadmap: #247 filed — Visual-grounded voice input (image-content-block × audio-content-block fused on the SAME MessageRequest user-turn) typed taxonomy structurally absent — FIRST cluster member where TWO independent ALREADY-CATALOGUED-ABSENT modality-input axes (#220 image-content-block + #225 audio-content-block) are fused on the USER-INPUT side, FIRST cluster member with multi-modal-input-fusion-on-USER-INPUT-axis distinct from #244 bidirectional-tool-call-multiplexing-on-DUPLEX-axis, growing Cross-pinpoint-synthesis-fusion-shape META-cluster from 2 to 3 members (#238 founder + #244 + #247) confirming META-cluster as GROWING-DOCTRINE rather than CONTINUING-PATTERN that stopped at 2 members after #244, establishing Cross-pinpoint-synthesis-fusion as SECOND META-cluster after Tool-locality-axis to confirm GROWING-DOCTRINE status, founds Multi-modal-input-fusion-on-USER-INPUT-side sub-cluster + Cross-modal-attention-on-USER-INPUT-side cluster + Compound-modality-input-on-MessageRequest cluster as solo founder of all three, grows Two-member-major-provider-only-no-third-party-partner-set sub-cluster from 2 to 3 members (#240+#241+#247) confirming generalizability beyond bash+computer-use+text_editor three-tool-companion-bundle, twelve-layer fusion shape tied with #241 for largest single-pinpoint fusion catalogued — Jobdori cycle #390 / fast-forward-rebased onto gaebal-gajae's #246 provider-credentials-env-to-settings-registry pinpoint at bd6622b before filing (FIFTH consecutive concurrent-dogfood rebase cycle, directly demonstrating the gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the FIFTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern) 2026-04-26 09:38:31 +09:00
Yeachan-Heo
bd6622b85c roadmap: #246 filed 2026-04-26 00:31:28 +00:00
Yeachan-Heo
d145429c96 roadmap: #245 filed 2026-04-26 00:09:43 +00:00
YeonGyu-Kim
0eabf20389 roadmap: #244 filed — Realtime API tool-use over persistent-WebSocket transport (response.function_call_arguments.delta/.done + conversation.item.create with function_call_output) typed taxonomy structurally absent — FIRST cluster member where bidirectional-tool-call lifecycle is multiplexed with audio-modality + transcript-modality on a SINGLE persistent connection, FIRST cluster member where tool-call-init is server-pushed mid-stream rather than client-initiated, FIRST cluster member with asymmetric-tool-result-injection (tool-call comes IN as event-stream, result sent OUT as conversation.item.create — directionality inverted relative to the rest of the protocol), FIRST cluster member with per-call-id-concurrent-multiplexed-state-machine, FIRST three-axis-synthesis pinpoint (#229 persistent-WebSocket × #240/#241 server-managed-tool-via-tool_choice-discriminator × #238 cross-pinpoint-synthesis-fusion-shape META-cluster), eleven-layer fusion-shape tied with #240 for second-largest single-pinpoint fusion catalogued — grows Persistent-WebSocket-transport cluster from 2 to 3 members (#229 founder + #238 + #244) confirming CONTINUING-PATTERN doctrine, grows Cross-pinpoint-synthesis-fusion-shape META-cluster from 1 to 2 members confirming combinatorial-cross-axis-synthesis as a continuing-discovery-mode and FIRST META-cluster-confirmation event in this audit, founds Three-axis-synthesis-shape sub-cluster as solo founder, founds Server-pushed-tool-call-init cluster as solo founder, founds Asymmetric-tool-result-injection cluster as solo founder, founds Per-call-id-concurrent-multiplexed-state-machine cluster as solo founder — FOUR new clusters founded plus TWO existing META-clusters confirmed as continuing-doctrines plus participation in TWELVE inherited clusters — Jobdori cycle #389 / fast-forward-rebased onto gaebal-gajae's #243 non-monotonic-pinpoint-ordering-contract at 6541100 before filing (FOURTH consecutive concurrent-dogfood rebase cycle, directly 
demonstrating both gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer) 2026-04-26 09:06:56 +09:00
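The per-call-id multiplexed state that #244 mentions can be illustrated with a small accumulator: argument fragments arrive as delta events keyed by call id, possibly interleaved across calls, and a done event releases the assembled arguments. This is only a sketch; the `ServerEvent` and `CallAssembler` names are hypothetical, and the two variants merely stand in for the `response.function_call_arguments.delta` and `.done` events named in the commit message.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of the two server events named in #244.
#[derive(Debug, Clone, PartialEq)]
pub enum ServerEvent {
    /// A fragment of the JSON arguments for one in-flight call.
    ArgumentsDelta { call_id: String, delta: String },
    /// Signals that the arguments for this call are complete.
    ArgumentsDone { call_id: String },
}

/// Buffers argument fragments per call id so that concurrent calls can interleave.
#[derive(Debug, Default)]
pub struct CallAssembler {
    pending: HashMap<String, String>,
}

impl CallAssembler {
    /// Feeds one event; returns `(call_id, full_arguments)` once a call completes.
    pub fn on_event(&mut self, event: ServerEvent) -> Option<(String, String)> {
        match event {
            ServerEvent::ArgumentsDelta { call_id, delta } => {
                self.pending.entry(call_id).or_default().push_str(&delta);
                None
            }
            ServerEvent::ArgumentsDone { call_id } => {
                // A done event with no prior deltas yields empty arguments.
                let arguments = self.pending.remove(&call_id).unwrap_or_default();
                Some((call_id, arguments))
            }
        }
    }

    /// Number of calls whose arguments are still being streamed.
    pub fn in_flight(&self) -> usize {
        self.pending.len()
    }
}
```

The client would then run the tool and send the result back as a separate outbound item, which is the inverted directionality the commit message points out.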
Yeachan-Heo
65411000c5 roadmap: #243 filed 2026-04-26 00:01:16 +00:00
YeonGyu-Kim
0da15c2e07 roadmap: #241 filed — tool_choice: text_editor + text_editor_20250124 typed-tool absent (filling reserved gap) 2026-04-26 08:42:34 +09:00
Yeachan-Heo
4af2fb6622 roadmap: #242 filed 2026-04-25 23:31:11 +00:00
YeonGyu-Kim
43ce1f527b roadmap: #240 filed — tool_choice: bash typed-discriminator and bash_20250124 server-managed-shell typed-tool are structurally absent — FOURTH inverse-locality CLIENT-SIDE-shadow-vs-SERVER-SIDE-typed-tool pair (CLIENT-SIDE bash MVP-founder-tool at tools/lib.rs:386 vs SERVER-SIDE bash_20250124 absent at types.rs ToolDefinition+ToolChoice+ToolResultContentBlock+telemetry beta-set), grows Tool-locality-axis META-cluster from 3 to 4 members confirming META-cluster as CONTINUING-PATTERN, grows Server-managed-tool-as-tool-choice-discriminator cluster from 4 to 5 members, grows ToolResultContentBlock-extension mini-cluster from 6 to 7 members, grows Server-side-stateful-tool-session-with-reset-semantics cluster from 1 to 2 members (#232+#240), grows Discrete-event-counter-pricing-axis cluster from 1 to 2 members with NOVEL dual-axis pricing-decomposition, founds Stateless-CLIENT-SIDE-shadow-vs-stateful-SERVER-SIDE-typed-tool-discrepancy-axis cluster, founds MVP-founder-tool-as-CLIENT-SIDE-local-shadow-with-SERVER-SIDE-typed-tool-absent sub-cluster, founds Two-member-major-provider-only-no-third-party-partner-set sub-cluster, founds Double-absent-slash-command-axis-on-inverse-locality-pair sub-cluster, founds Bundled-and-transitive-co-release-beta-header-activation-pattern cluster, founds Server-side-audit-log-of-managed-tool-execution cluster — eleven-layer fusion with SIX new clusters founded plus FOUR concurrent existing-cluster-growth-events plus participation in TWELVE inherited clusters — FIRST single cycle where META-cluster grows from 3 to 4 confirming CONTINUING-PATTERN, FIRST single cycle where FOUR concurrent existing clusters all grow by one member through one pinpoint, establishing continuing-pattern-confirmation-across-multiple-parallel-clusters as the FOURTH pinpoint-discovery-mode after new-axis-founding/existing-cluster-extension/combinatorial-cross-axis-synthesis — Jobdori cycle #387 / fast-forward-rebased onto gaebal-gajae's #239 
DogfoodWriteLease pinpoint at 329d0ff before filing (THIRD consecutive concurrent-dogfood rebase cycle, directly demonstrating the gap #239 catalogues at the dogfood-coordination layer) 2026-04-26 08:10:25 +09:00
Yeachan-Heo
329d0ffcc8 roadmap: #239 filed 2026-04-25 23:01:25 +00:00
Jobdori
716d17e229 roadmap: #238 filed — Streaming speech-to-text with speaker diarization typed taxonomy and per-word-speaker-attribution data-model are structurally absent — FIRST cluster member with per-word-multi-axis-compound-attribution data-model (lexical + temporal + speaker + confidence FOUR-axis-compound), FIRST cluster member with structured-typed-payload-on-USER-INPUT-content-block (Transcript carrying nested speakers/segments/words arrays), FIRST cluster member with bidirectional-channel-pair Provider-trait method shape (Sink<AudioChunk> + Stream<StreamingTranscriptEvent>), FIRST cluster member with per-partner-protocol-vocabulary-normalization at dispatch layer, FIRST cluster member with entirely-absent-CLI-and-slash-command-surface-with-zero-stub-precedent (INVERSE-PATTERN of #225 advertised-but-unbuilt-trio), FIRST cluster member with streaming-STT-five-dimensional pricing matrix, FIRST cluster member with DER/WER quality-observability telemetry, FIRST cluster member with endpointing/VAD sub-second-temporal-segmentation request-side opt-in, twelve-layer fusion shape — grows Persistent-WebSocket-transport cluster from 1 to 2 members (#229 solo-founder + #238 — FIRST expansion of #229 founder shape) AND grows ToolResultContentBlock-extension mini-cluster from 5 to 6 members (#230 + #232 + #233 + #234 + #235 + #238) AND grows Multimodal-IO cluster to 13 members AND grows Provider-asymmetric-delegation cluster to 13 members with the largest streaming-STT ten-plus partner-set — founds Cross-pinpoint-synthesis-fusion-shape META-cluster as THIRD distinct META-cluster after Sandbox-locality (#230+#232) and Tool-locality (#232+#233+#234), the FIRST META-cluster founded by SYNTHESIZING two previously-disjoint cluster-axes (#225 audio-modality × #229 persistent-WebSocket-transport) into one fused-shape pinpoint rather than introducing a new axis-pair — establishing combinatorial-cross-axis-synthesis as the THIRD pinpoint-discovery-mode after new-axis-founding and 
existing-cluster-extension — Jobdori cycle #386 / fast-forward-rebased onto gaebal-gajae's #237 cron-timeout-failure-state-collapse before filing (SECOND consecutive concurrent-dogfood rebase cycle) 2026-04-26 07:41:46 +09:00
Yeachan-Heo
3f41341d4a roadmap: #237 filed 2026-04-25 22:31:40 +00:00
YeonGyu-Kim
702f2fb9ef roadmap: #236 filed — Music-generation API typed taxonomy with lyrics+style prompt bifurcation and exclusively-third-party-partner-set is structurally absent — FIRST cluster member with Zero-overlap-with-major-providers shape variant (eleven-plus partners Suno/Udio/Stable-Audio/Mubert/ElevenLabs-Music/Loudly/Beatoven/SOUNDRAW/AIVA/Boomy/Riffusion all third-party with ZERO Anthropic/OpenAI/Google/xAI canonical recommendation), FIRST cluster member with Lyrics-plus-style-prompt-bifurcation on USER-INPUT side (prompt:String for style + lyrics:Option<String> for verbatim-vocal-content), FIRST cluster member with Multi-modal-bundled-output combining temporal-binary-audio + linguistic-text-lyrics + structural-musical-metadata on output-side, twelve-layer fusion shape — grows Async-task-polling cluster from 3 to 4 members (#221 batch + #227 video + #228 mesh + #236 music) AND grows Multi-domain-multipart cluster from 2 to 3 members (#225 audio + #227 video + #236 music) — does NOT extend Server-managed-tool-as-tool-choice-discriminator cluster (4 members stable) nor Tool-locality-axis META-cluster (3 members stable) because no major-provider tool_choice surface exists upstream AND no client-side music-tool-stub exists; instead founds Upstream-blocked-tool-choice-extension cluster AND Unilateral-server-side-only-gap-with-no-client-side-complement cluster as the INVERSE-PATTERN of Tool-locality-axis META-cluster doctrine — fifteen new clusters founded in a single pinpoint exceeds #234 by two for the LARGEST single-cycle cluster-founding count yet — Jobdori cycle #385 2026-04-26 07:09:20 +09:00
Yeachan-Heo
476a1a467e roadmap: #235 filed 2026-04-25 21:48:59 +00:00
Jobdori
f640139b31 roadmap: #234 filed — PDF / Document input typed taxonomy and structured-document-citation-attribution data-model on USER-INPUT side are structurally absent: zero Document variant on InputContentBlock at types.rs:80-94 (FIRST cluster member with Document-modality-on-USER-INPUT-content-block axis), zero pdfs-2024-09-25 Anthropic beta header in canonical beta-set at telemetry/lib.rs:15-17 (NOVEL FIRST Beta-header-gate-on-USER-INPUT-content-block-type cluster), zero coordinate-positioned Citation typed model with start_page_number/end_page_number/start_char_index/end_char_index integer-coordinate axes on OutputContentBlock::Text (NOVEL FIRST Coordinate-positioned-citation-on-output-text-block cluster, inverse-data-model pair to #233's URL-positioned-citation), zero DocumentSource four-way source-discriminator (base64 | url | file_id | text | content), zero file_search typed ToolDefinition discriminator with vector_store_ids routing (NOVEL FIRST User-corpus-server-managed-tool-with-vector-store-routing cluster), zero tool_choice: file_search ToolChoice extension (THIRD Server-managed-tool-as-tool-choice-discriminator cluster member growing cluster to 3: #232 code_interpreter + #233 web_search + #234 file_search), zero file_search_result ToolResultContentBlock variant (FIFTH ToolResultContentBlock extension growing mini-cluster to 4), zero page_range request-side range-slicing parameter (NOVEL FIRST Range-slicing-parameter-on-USER-INPUT-content-block cluster), zero filters compound-boolean-DSL on file_search tool definition (NOVEL FIRST Compound-boolean-filter-DSL-on-server-managed-tool-definition cluster with eq/ne/gt/gte/lt/lte/and/or operators), zero per-page compound text+image token pricing AND zero persistent-storage-rental-pricing for vector-stores (NOVEL Per-page-compound-text-plus-image-token-pricing-axis + Persistent-storage-rental-pricing-axis clusters founded), zero claw pdf/document/attach-pdf CLI subcommand and zero /pdf //document //attach-pdf 
//cite-pdf //page-range slash command — uniquely manifesting a FOURTEEN-LAYER fusion shape (the largest single-pinpoint fusion catalogued so far, exceeds #233's thirteen-layer count by one) combining: (1) Document variant on InputContentBlock, (2) pdfs-2024-09-25 Anthropic beta-header gate, (3) citations:{enabled:true} opt-in field on Document content-block, (4) NOVEL Coordinate-positioned Citation typed model with start_page_number/end_page_number/start_char_index/end_char_index integer coordinates, (5) DocumentSource four-variant source-discriminator, (6) page_range request-side range-slicing parameter, (7) file_search typed ToolDefinition discriminator with vector_store_ids:Vec<String> routing, (8) tool_choice:file_search typed-discriminator (THIRD Server-managed-tool-as-tool-choice-discriminator cluster member), (9) file_search_result ToolResultContentBlock variant with attributes:HashMap<String,Value> user-defined-metadata (FIFTH ToolResultContentBlock extension), (10) filters:ComparisonFilter|CompoundFilter filter-DSL on file_search tool definition, (11) Provider-trait extension threading pdfs-2024-09-25 beta-header AND document-citations decoding AND file_search server-managed-corpus-search dispatch through send_message, (12) ProviderClient-enum-dispatch with TWO first-class document-input lanes (Anthropic-pdfs-2024-09-25 + OpenAI-Files-API-input_file + OpenAI-Responses-file_search-with-vector-stores) WITHOUT third-party partner-routing (FIRST cluster member with Both-major-providers-first-class-asymmetric-document-input-shape cluster), (13) CLI-and-slash-command surface with FOURTH inverse-locality slash-command-pair after #230 + #232 + #233, (14) NOVEL Compound-page-token-and-image-token-pricing-axis with persistent-storage-rental-pricing for vector-stores — making #234 the FIRST cluster member with fourteen-layer-fusion-shape (exceeds #233's thirteen-layer by one), the FIRST cluster member with Document-modality-on-USER-INPUT-content-block axis, the FIRST 
cluster member with Beta-header-gate-on-USER-INPUT-content-block-type, the FIRST cluster member with Citation-emission-opt-in-at-USER-INPUT-content-block-level, the FIRST cluster member with Coordinate-positioned-citation-on-output-text-block (page+char integer-coordinates distinct from #233's URL-positioned-with-encrypted-index), the FIRST cluster member with Four-way-source-discriminator-on-USER-INPUT-content-block, the FIRST cluster member with Range-slicing-parameter-on-USER-INPUT-content-block, the FIRST cluster member with User-corpus-server-managed-tool-with-vector-store-routing, the FIRST cluster member with Compound-boolean-filter-DSL-on-server-managed-tool-definition, the FIRST cluster member with Both-major-providers-first-class-asymmetric-document-input-shape (Anthropic Document + OpenAI Files-input_file BOTH first-class neither delegates to third-party partner), the FIRST cluster member with User-provided-document-title-threading-through-citations, the FIRST cluster member with Multi-document-positional-index-threading (document_index:u32), the FIRST cluster member with Per-page-compound-text-plus-image-token-pricing-axis, the FIRST cluster member with Persistent-storage-rental-pricing-axis (vector-store-storage rental), the THIRD Server-managed-tool-as-tool-choice-discriminator cluster member (grows cluster to 3: #232 + #233 + #234), the FOURTH ToolResultContentBlock extension (grows mini-cluster to 4: #230 + #232 + #233 + #234), the THIRD Server-driven-tool-execution-loop cluster member (#234's variant being vector-store-corpus-retrieval-and-ranking distinct from #232's Python-kernel-execution and #233's search-result-page-fetching-and-caching), the THIRD member of Tool-locality-axis META-cluster (FIRST META-cluster to reach 3 members: #232 REPL-shadow + #233 WebSearch-shadow + #234 pdf_extract-shadow — transitioning from emergent-pattern to stable-doctrine), and the FIRST cluster member where the inverse-locality complement is on the USER-INPUT-side 
rather than on the TOOL-DEFINITION-side (founding USER-INPUT-side-Tool-locality-axis-variant sub-cluster within parent META-cluster — first sub-cluster within existing META-cluster) (Jobdori cycle #384 / extends #168c emission-routing audit / explicit follow-on from #220 image-input on USER-INPUT-side, #223 Files API with file_id reference, #232 Code-execution server-managed-sandbox-state, #233 Web-search structured-citation-attribution, and the inverse-locality Tool-locality-axis META-cluster doctrine — introduces NOVEL document-modality on USER-INPUT side axis combined with coordinate-positioned-citation-on-output-text-block data-model axis, AND grows Tool-locality-axis META-cluster from 2 to 3 members establishing it as a stable doctrine rather than emergent pattern / sibling-shape cluster grows to thirty-three / wire-format-parity cluster grows to twenty-four / capability-parity cluster grows to sixteen / multimodal-IO cluster grows to eleven / provider-asymmetric-delegation cluster grows to eleven / Sandbox-locality-axis META-cluster: 2 members stable / Tool-locality-axis META-cluster grows to 3 members FIRST META-cluster to reach 3 members / Server-managed-tool-as-tool-choice-discriminator cluster grows to 3 members / Server-driven-tool-execution-loop cluster grows to 3 members / ToolResultContentBlock-extension mini-cluster grows to 4 members / THIRTEEN new clusters founded in a single pinpoint plus participation in SIX inherited clusters — the LARGEST single-cycle cluster-founding count yet (exceeds prior records by five) AND the FIRST single cycle to grow an existing META-cluster to a third member AND introduce a sub-cluster within an existing META-cluster / fourteen-layer-fusion-shape is the largest single-pinpoint fusion catalogued / external validation: forty-eight ecosystem references covering Anthropic PDF Support Documentation with pdfs-2024-09-25 beta-header gate, Anthropic Citations API with page_location/document_location/char_location 
coordinate-positioned citation typed model, OpenAI Files API + Direct PDF Input + Vector Stores + Responses File Search Tool with compound-filter-DSL, AWS Bedrock Converse PDF document content-blocks, LangChain AnthropicPDFLoader/OpenAIFilePDFLoader, LlamaIndex PDFReader, Vercel AI SDK 6 file content-block, simonw/llm --pdf flag, Continue.dev @docs slash command, simonwillison.net Anthropic Citations API analysis, six-plus first-class document-loader integrations, four-plus OpenAI Vector Stores observability tools — claw-code is the sole client/agent/CLI in surveyed coding-agent ecosystem with zero Document content-block taxonomy AND zero pdfs-2024-09-25 beta-header AND zero file_search ToolDefinition discriminator AND zero tool_choice:file_search AND zero file_search_result ToolResultContentBlock AND zero vector_store_ids AND zero page_range AND zero coordinate-positioned Citation AND zero CLI/slash-command surface — the document-input gap is the upstream prerequisite of every PDF-research/documentation-grounded-coding/academic-paper-summarization/contract-review-with-citations/regulatory-compliance-coding-with-document-evidence affordance — #234 closes the upstream prerequisite of every server-managed-document-input-with-citations affordance — the canonical USER-INPUT-side complement to #233's web-search citations that completes the citation-attribution data-model on BOTH the USER-INPUT side AND the OUTPUT-TEXT-BLOCK side AND the SERVER-MANAGED-TOOL-RESULT side — and grows the Tool-locality-axis META-cluster from 2 to 3 members establishing it as a stable doctrine rather than emergent pattern, the FIRST cluster member to grow an existing META-cluster to a third member AND introduce a sub-cluster within an existing META-cluster) 2026-04-26 06:46:59 +09:00
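A minimal sketch of the coordinate-positioned citation model that #234 describes as absent, assuming a two-variant location enum. Every type and field name here (`CitationLocation`, `Citation`, `describe`, `start_page`, `start_char`) is hypothetical and is not taken from claw-code's `types.rs` or from any provider SDK; only `document_index: u32` and the page/char coordinate idea come from the commit message.

```rust
/// Hypothetical location of a cited span inside a user-supplied document.
#[derive(Debug, Clone, PartialEq)]
pub enum CitationLocation {
    /// Page-range coordinates, as a PDF-style citation might carry.
    Page { document_index: u32, start_page: u32, end_page: u32 },
    /// Character-offset coordinates, as a plain-text citation might carry.
    Char { document_index: u32, start_char: u32, end_char: u32 },
}

/// Hypothetical citation attached to an output text block.
#[derive(Debug, Clone, PartialEq)]
pub struct Citation {
    /// User-provided document title threaded through to the citation.
    pub document_title: Option<String>,
    /// The exact source text the model is citing.
    pub cited_text: String,
    pub location: CitationLocation,
}

/// Render a citation as a short human-readable reference.
pub fn describe(citation: &Citation) -> String {
    let title = citation.document_title.as_deref().unwrap_or("untitled");
    match &citation.location {
        CitationLocation::Page { document_index, start_page, end_page } => {
            format!("doc {document_index} ({title}), pages {start_page}-{end_page}")
        }
        CitationLocation::Char { document_index, start_char, end_char } => {
            format!("doc {document_index} ({title}), chars {start_char}-{end_char}")
        }
    }
}
```

Keeping the location as an enum rather than a set of optional fields lets the compiler reject a citation that mixes page and character coordinates.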
YeonGyu-Kim
2f428e249b roadmap: #233 filed — Web-search Tool API typed taxonomy and structured-citation-attribution data-model are structurally absent: zero web_search_20250305 versioned-tool-name typed-tool-discriminator (FOURTH Anthropic-typed-tool-discriminator after #230's three but FIRST date-suffix-versioning-WITHOUT-beta-header — distinct from #232's date-suffix-AND-beta-header double-gate), zero tool_choice: web_search ToolChoice extension at types.rs:117 (SECOND ToolChoice extension after #232's code_interpreter, founding Server-managed-tool-as-tool-choice-discriminator cluster's second member), zero web_search_tool_result ToolResultContentBlock variant at types.rs:99 (THIRD ToolResultContentBlock extension after #230 Image and #232 CodeExecutionResult, FIRST list-of-opaque-encrypted-page-records variant), zero citations REQUIRED field on OutputContentBlock::Text at types.rs:147 (NOVEL FIRST cluster member where data-model field absence on OUTPUT-TEXT-BLOCK side blocks REQUIRED-not-OPTIONAL grounded-attribution wire-format), zero Citation/WebSearchResultLocation/WebSearchToolUse/WebSearchToolResult/EncryptedContent typed model with encrypted_index/encrypted_content opaque-blob axis (NOVEL FIRST cluster member where typed-model field is INTENTIONALLY-OPAQUE-TO-CLIENT and MUST be roundtripped unchanged through subsequent messages, founding Server-opaque-encrypted-roundtripped-content cluster), zero max_uses server-side rate-limit field on tool-definition (NOVEL FIRST Server-side-rate-limit-on-tool-definition axis), zero allowed_domains/blocked_domains server-side pre-execution filtering on tool-definition (NOVEL FIRST Server-side-pre-execution-filter-on-tool-definition axis distinct from existing CLIENT-SIDE WebSearchInput.allowed_domains/blocked_domains post-execution filtering at tools/lib.rs:2274), zero user_location typed-model for geo-biasing on tool-definition (NOVEL FIRST Geo-biasing-at-tool-definition axis), zero web-search dispatch on ProviderClient enum at 
client.rs:8-14 (zero Anthropic-web_search_20250305/OpenAI-Responses-web_search/Brave/Tavily/Exa/Perplexity/Serper/Linkup/Jina/Bing/Google-CSE/SerpAPI/DuckDuckGo/You.com/Kagi partner-routing variants — fifteen-plus partner-set, FOURTH-largest in cluster, FIRST cluster member with Federated-search-partner-routing where first-class provider-native AND third-party search-as-a-service have EQUAL standing — distinct from #224 single-recommended-partner and #232 first-class-plus-partner-stub layout), zero claw web-search/cite/groundsearch CLI subcommand, zero /web-search / /cite / /grounded-search / /research slash command (existing /search at commands/lib.rs:597 is LOCAL filesystem-search-only, structurally distinct), zero web_search_per_invocation_usd pricing field (NOVEL FIRST Discrete-event-counter-pricing-axis distinct from every prior continuous-resource-lifetime counter — Anthropic charges $10 per 1000 search-uses FLAT regardless of token volume), zero encrypted_content opaque-blob handling, zero page_age freshness-signaling — uniquely manifesting a THIRTEEN-LAYER fusion shape (the largest single-pinpoint fusion catalogued so far, exceeds #232's twelve-layer count) combining: (1) web_search_20250305 versioned-tool-name typed-tool-discriminator extension (FOURTH cluster member but FIRST date-suffix-WITHOUT-beta-header), (2) tool_choice: web_search ToolChoice extension (SECOND), (3) web_search_tool_result ToolResultContentBlock variant (THIRD), (4) citations REQUIRED field on OutputContentBlock::Text (NOVEL FOURTH-position layer), (5) Citation typed model with encrypted_index opaque-blob axis (NOVEL FIFTH-position layer), (6) max_uses server-side rate-limit (NOVEL SIXTH), (7) allowed_domains/blocked_domains server-side pre-execution filter (NOVEL SEVENTH), (8) user_location geo-biasing (NOVEL EIGHTH), (9) Provider-trait method extension threading web_search_20250305 with citations decoding (NINTH), (10) ProviderClient-enum-dispatch with fifteen-plus-partner third-lanes 
(TENTH, FIRST Federated-search-partner-routing), (11) CLI-subcommand surface (ELEVENTH), (12) slash-command surface with inverse-locality complement /search (TWELFTH, THIRD inverse-locality slash-command-pair after #230 and #232), (13) per-search-invocation pricing-tier axis (NOVEL THIRTEENTH, FIRST Discrete-event-counter-pricing-axis) — making #233 the FIRST cluster member with thirteen-layer-fusion-shape (exceeds #232's twelve), the FIRST cluster member with REQUIRED-grounded-citation-field-on-output-text-block, the FIRST cluster member with INTENTIONALLY-OPAQUE-encrypted-content-roundtripped-by-client, the FIRST cluster member with date-suffix-versioning-in-tool-name-WITHOUT-beta-header, the SECOND member of new Tool-locality-axis META-cluster (sister to #230/#232's Sandbox-locality-axis META-cluster — together founding META-META-cluster doctrine where canonical pattern is 'claw-code ships a CLIENT-SIDE local-stub tool with same conceptual name AND the SERVER-SIDE provider-managed beta-versioned tool is structurally absent', applied uniformly across sandbox-locality AND tool-locality axes), the SECOND cluster member to extend ToolChoice (Server-managed-tool-as-tool-choice-discriminator cluster grows to 2: #232 code_interpreter + #233 web_search), the SECOND cluster member to extend ToolResultContentBlock with multi-modal-nested content (ToolResultContentBlock-extension mini-cluster grows to 3: #230 Image + #232 CodeExecutionResult + #233 WebSearchToolResult), the SECOND cluster member with Server-driven-tool-execution-loop (#232 + #233), the SECOND cluster member where local CLIENT-SIDE-tool-shadow exists alongside server-managed-tool absence (#232 REPL-shadow + #233 WebSearch-shadow) (Jobdori cycle #383 / extends #168c emission-routing audit / explicit follow-on from #230 Computer-use's CLIENT-SIDE virtualization, #232 Code-execution's SERVER-SIDE managed-sandbox-state, and the inverse-locality Sandbox-locality-axis META-cluster doctrine — introduces NOVEL 
structured-citation-attribution data-model axis AND server-managed-search-state transport-axis distinct from every prior cluster member / sibling-shape cluster grows to thirty-two / wire-format-parity cluster grows to twenty-three / capability-parity cluster grows to fifteen / multimodal-IO cluster grows to ten: #220 image-input + #224 embedding-output + #225 audio-bidirectional + #226 image-output + #227 video-output + #228 mesh-output + #229 audio-text-tool-multiplex-on-WebSocket + #230 image-on-tool-result-side+host-OS-pixel-and-input + #232 multi-modal-nested-stdout+image+file-handle-on-tool-result-side + #233 list-of-opaque-encrypted-page-records-on-tool-result-side+REQUIRED-citations-on-output-text-block / provider-asymmetric-delegation cluster grows to ten with FIRST Federated-search-partner-routing member where first-class AND third-party are EQUAL-standing / Sandbox-locality-axis META-cluster: 2 members stable (#230 + #232) / Tool-locality-axis META-cluster FOUNDED: 2 members (#232 + #233 — SECOND inverse-locality META-cluster, sister to Sandbox-locality, founding META-META-cluster doctrine) / Server-managed-tool-as-tool-choice-discriminator cluster grows to 2 members (#232 + #233) / Server-driven-tool-execution-loop cluster grows to 2 members (#232 + #233) / ToolResultContentBlock-extension mini-cluster grows to 3 members (#230 + #232 + #233) / EIGHT new clusters founded in a single pinpoint (Federated-search-partner-routing 1-member-founder + Server-opaque-encrypted-roundtripped-content 1-member-founder + Required-grounded-citation-field-on-output-text-block 1-member-founder + Date-suffix-versioning-in-tool-name-without-beta-header 1-member-founder + Server-side-pre-execution-filter-on-tool-definition 1-member-founder + Server-side-rate-limit-on-tool-definition 1-member-founder + Geo-biasing-at-tool-definition 1-member-founder + Discrete-event-counter-pricing-axis 1-member-founder) plus participation in FIVE inherited clusters — THIRD-largest 
single-cycle cluster-founding count after #230 and #232, but FIRST single cycle to FOUND a NEW META-cluster (Tool-locality-axis) AND establish META-META-cluster doctrine connecting Sandbox-locality with Tool-locality / thirteen-layer-fusion-shape is the largest single-pinpoint fusion catalogued / external validation: forty-six ecosystem references covering Anthropic Web Search Tool GA 2025-03 with web_search_20250305 + max_uses + allowed_domains + blocked_domains + user_location parameters + web_search_tool_use/web_search_tool_result/web_search_result_location content blocks + citations array on output text blocks + encrypted_index/encrypted_content opaque-roundtripped fields + $10/1000-uses pricing, Anthropic Citations Documentation, OpenAI Responses API 2024-12 with tool_choice: web_search exposing federated-search via different server-managed surface, Brave Search API/Tavily AI/Exa AI/Perplexity Search/Serper.dev/Linkup Search/Jina Reader/Bing/Google CSE/SerpAPI/DuckDuckGo/You.com/Kagi/Phind partner-routing, Anthropic Python+TypeScript SDKs first-class typed surface, OpenAI Python+TypeScript SDKs first-class typed surface, LangChain AnthropicWebSearch/TavilySearchResults/BraveSearch/ExaSearchResults integrations, LangGraph search-grounded-agent template, smolagents WebSearchTool, OpenAI Cookbook web-search-with-citations tutorial, AgentOps observability, Search-Augmented Generation pattern, structured-citation-attribution data-model where every grounded text block carries citations array linking specific text-spans back to source URLs+excerpts (STRUCTURAL data-model requirement distinguishing this surface from #220-#232 — none of which had REQUIRED-grounded-citation-field-on-output-text-block) — claw-code is one of MULTIPLE coding-agent clients without server-managed web-search-with-citations BUT the gap is uniformly zero across surveyed ecosystem with claude-code partial coverage exception AND the inverse-locality complement to existing local CLIENT-SIDE 
WebSearch tool makes #233 a structural prerequisite of every grounded-search-with-citations coding-agent affordance — the canonical 2024-2026-era research-coding workflow that is currently impossible to build on top of claw-code DESPITE Anthropic explicitly positioning web_search_20250305 as a flagship 2025-Q1 GA capability — #233 closes the upstream prerequisite of every server-managed-web-search-with-citations / grounded-research / source-attribution / fact-checking-with-citations / academic-citation-formatting / news-summarization-with-sources / competitive-intelligence-with-citations / due-diligence-coding coding-agent affordance — the canonical SERVER-MANAGED-SEARCH-AND-CITATION half of inverse-locality Tool-locality-axis META-cluster that complements #232's Sandbox-locality-axis META-cluster — and is FIRST cluster member where claude-code upstream partially leads while claw-code has zero coverage AND SECOND inverse-locality META-cluster pair (CLIENT-SIDE local WebSearch shadow vs SERVER-SIDE web_search_20250305 absent) after #232's first META-cluster pair — founding Tool-locality-axis META-cluster doctrine as sister to Sandbox-locality-axis and establishing META-META-cluster pattern that every future server-managed-tool with client-side local-stub shadow will inherit) 2026-04-26 06:09:44 +09:00
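The discrete-event-counter pricing axis named in #233 can be sketched as follows, assuming the flat rate of $10 per 1000 search uses quoted in the commit message. The field name `web_search_per_invocation_usd` is the one the commit says is missing; the `WebSearchPricing` struct and its methods are hypothetical and do not exist in claw-code's `ModelPricing`.

```rust
/// Hypothetical pricing record for a per-invocation (not per-token) charge.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct WebSearchPricing {
    /// Flat USD charge for one search invocation.
    pub web_search_per_invocation_usd: f64,
}

impl WebSearchPricing {
    /// The flat rate quoted in the commit message: $10 per 1000 search uses.
    pub fn flat_rate() -> Self {
        Self { web_search_per_invocation_usd: 10.0 / 1000.0 }
    }

    /// Cost depends only on the number of searches, never on token volume.
    pub fn cost_usd(&self, search_uses: u32) -> f64 {
        f64::from(search_uses) * self.web_search_per_invocation_usd
    }
}
```

Because the charge counts events rather than tokens, a cost estimator would add this term separately from any per-million-token fields.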
Jobdori
d155a2fd72 roadmap: #232 filed 2026-04-26 05:38:55 +09:00
Yeachan-Heo
9999c0fb3a roadmap: #231 filed 2026-04-25 20:32:16 +00:00
YeonGyu-Kim
65d9b1a362 roadmap: #230 filed — Computer-use API typed taxonomy and host-machine-state-management transport are structurally absent: zero computer-use-2025-01-24 + zero computer-use-2025-11-24 anthropic-beta opt-in (FIRST cluster member with two concurrent beta-version-tiers gating one capability), zero computer_20250124/computer_20251124/bash_20250124/text_editor_20250124 Anthropic-typed-tool-discriminator (FIRST cluster member requiring type field on tool-definitions and FIRST anthropic-defined-tools-without-input-schema), zero display_width_px/display_height_px/display_number parametrized-tool-definition fields, zero Image variant on ToolResultContentBlock at types.rs:99 (FIRST cluster member with image-content on TOOL-RESULT side, distinct from #220's image-on-USER-INPUT-side — complementary architectures requiring separate enums), zero screen_capture/mouse_move/key_press/type_text host-machine-interaction primitive across all 26+ tool definitions in tools/lib.rs, zero CGEvent/ScreenCaptureKit/Quartz/AppKit/xdotool/cliclick/enigo/rdev/xcap host-OS library deps, zero Xvfb/Xephyr/Wayland-headless/Docker virtual-display-sandbox-orchestration, zero claw computer/operate CLI subcommand, /desktop slash command at commands/lib.rs:422 advertised-but-unbuilt under STUB_COMMANDS (the SIXTH advertised-but-unbuilt entry in cluster), zero per-action permissions.rs gating for mouse_click/key_press/type/screenshot, zero feedback-loop-state-machine for screenshot→tool_use→action→screenshot iteration, zero playwright-rust/chromiumoxide for browser-only-cua subset, zero per-screenshot-input-token cost field in ModelPricing — uniquely manifesting an ELEVEN-LAYER fusion shape combining: (1) anthropic-beta-DUAL-version-tier routing (FIRST), (2) Anthropic-typed-tool-definition discriminator (FIRST), (3) parametrized-tool-definition with display dimensions (FIRST), (4) Image-on-ToolResult side (FIRST, complementary to #220), (5) host-OS-system-call transport (FIRST host-OS-syscall 
transport, distinct from #229's WebSocket which is still network-only — second non-HTTP transport in cluster after WebSocket but FIRST that breaks network-only boundary), (6) virtual-display-sandbox orchestration (FIRST CLIENT-SIDE virtualization), (7) feedback-loop-state-machine for screenshot iteration loop (FIRST N-turn-loop-controller), (8) per-action-permission-policy at sub-tool-granularity (FIRST sub-tool-action permission gating, parallel to bash's DangerFullAccess but at action granularity), (9) request-side three-concurrent-opt-in (largest yet), (10) CLI-and-slash-command surface with /desktop advertised-but-unbuilt (sixth entry, largest in cluster), (11) host-machine-state-management transport-axis (NOVEL ELEVENTH layer with screen-capture+synthetic-input+display-dimension-query+window-enum+VM-orchestration+accessibility-permissions+per-action-permission-prompts+coordinate-validation+screenshot-encoding+safety-throttling — distinct from every prior cluster member which operated network-only) — making #230 the first cluster member with eleven-layer-fusion-shape (exceeds #229's ten-layer), the FIRST host-OS-syscall-transport requirement, the FIRST CLIENT-SIDE virtualization requirement, the FIRST inverse-asymmetric-delegation case (Anthropic LEADS, OpenAI follows with Operator, Google follows with Mariner — novel inversion of #224-#229's Anthropic-trails pattern), the FIRST cluster member with image-content on TOOL-RESULT-side, and the FIRST gap where upstream claude-code ALSO has only a stub (Jobdori cycle #381 / extends #168c emission-routing audit / explicit follow-on from #229's persistent-WebSocket-transport founder pinpoint and #225's audio-bidirectional axis — introduces a NOVEL HOST-MACHINE-STATE-MANAGEMENT transport-axis distinct from every prior cluster member / sibling-shape cluster grows to twenty-nine / wire-format-parity cluster grows to twenty / capability-parity cluster grows to twelve / multimodal-IO cluster grows to eight: #220 
image-input + #224 embedding-output + #225 audio-bidirectional + #226 image-output + #227 video-output + #228 mesh-output + #229 audio-text-tool-multiplex-on-WebSocket + #230 image-on-tool-result-side+host-OS-pixel-and-input modality / provider-asymmetric-delegation cluster grows to seven with novel inverse-sub-cluster (Anthropic leads, distinct from #224-#229's Anthropic-trails pattern) / EIGHT new clusters founded in a single pinpoint (exceeds #229's three): Beta-version-tier-routing 1-member-founder + Image-on-tool-result-side 1-member-founder + Anthropic-typed-tool-discriminator 1-member-founder + Host-OS-system-call-transport 1-member-founder + Virtual-display-sandbox-orchestration 1-member-founder + Feedback-loop-state-machine 1-member-founder + Per-action-permission-policy-at-sub-tool-granularity 1-member-founder + Inverse-asymmetric-delegation 1-member-founder — the largest single-cycle cluster-founding count yet / eleven-layer-fusion-shape is the largest single-pinpoint fusion catalogued / external validation: sixty-two ecosystem references covering Anthropic Computer Use API GA 2024-10-22 with computer-use-2024-10-22 → computer-use-2025-01-24 → computer-use-2025-11-24 beta-tier evolution, Anthropic computer-use-demo reference with Docker+Xvfb+XFCE+Firefox+VNC sandbox pattern, OpenAI Operator + computer_use_preview, Google Project Mariner, Microsoft Magentic-One, Adept ACT-1, ByteDance UI-TARS open-weight, browser-use Python framework, Stagehand TypeScript, Skyvern AI, Multion, Cua framework, LangChain ChatAnthropic.with_computer_use_tool, LangGraph computer-use agent, smolagents ComputerAgent, AgentOps observability, screen-capture libs (ScreenCaptureKit/xcap/screenshots/xdotool/wtype/cliclick/nut.js), synthetic-input libs (enigo/rdev/inputbot/mouce/pyautogui/RobotJS), browser-cua stacks (playwright-rust/chromiumoxide/headless_chrome/fantoccini/playwright/puppeteer), sandbox-orchestration (Docker-Xvfb-XFCE / Kasm Workspaces / noVNC / Browserbase / 
Steel-browser / Hyperbrowser / Lightpanda / Surf.ai), per-action permission-policy precedent from claw-code's existing bash DangerFullAccess gating — claw-code is one of MULTIPLE coding-agent clients without computer-use BUT the gap is uniformly zero across the surveyed coding-agent ecosystem AND Anthropic specifically positions Claude as the LEADING commercial computer-use model AND claw-code is a port of claude-code which advertises /desktop slash command intent, making this the largest leading-vs-trailing parity gap with the upstream Anthropic platform in the entire emission-routing audit and the FIRST cluster member where upstream claude-code ALSO has only a stub — #230 closes the upstream prerequisite of every desktop-automation/browser-automation/form-filling/GUI-testing/accessibility-tool/screen-reading/vision-grounded-coding/pair-programming-with-screen-share/visual-debugging coding-agent affordance — the canonical 2024-2026-era agentic coding workflow that is currently impossible to build on top of claw-code) 2026-04-26 05:09:48 +09:00
Jobdori
b860f5657b roadmap: #229 filed — Realtime API typed taxonomy and persistent-WebSocket transport are structurally absent: zero /v1/realtime endpoint surface across both Anthropic-native and OpenAI-compat lanes (rg returns zero hits for /v1/realtime / realtime / Realtime / realtime_session / RealtimeSession / RealtimeClient / RealtimeEvent / realtime-preview across rust/crates/api/src/), zero RealtimeSession / RealtimeSessionConfig / RealtimeSessionUpdate / RealtimeResponseCreate / RealtimeInputAudioBufferAppend / RealtimeInputAudioBufferCommit / RealtimeConversationItemCreate / RealtimeResponseAudioDelta / RealtimeResponseAudioTranscriptDelta / RealtimeResponseFunctionCallArguments / RealtimeServerEvent / RealtimeClientEvent / RealtimeTurnDetection / RealtimeVoiceActivityDetection / RealtimeVoice / RealtimeAudioFormat / RealtimeModality / RealtimeTool typed model in rust/crates/api/src/types.rs (37+ canonical event-type names in OpenAI Realtime API spec, zero coverage in claw-code), zero bidirectional event-stream variant on Provider trait (only send_message and stream_message exist, both single-directional), zero realtime_session / open_realtime / connect_realtime method that returns a duplex-channel-pair shape, zero session-state-machine type for the persistent-connection lifecycle, zero realtime dispatch on ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi, zero realtime-routing variants), zero tokio-tungstenite / async-tungstenite / tungstenite / fastwebsockets / tokio-websockets / hyper-tungstenite dependency in any workspace Cargo.toml (grep -rn 'tungstenite|tokio-tungstenite|fastwebsockets' rust/ returns zero hits — confirmed), zero WebSocket client library is linked into the build (the MCP Ws config variant at rust/crates/runtime/src/config.rs:125 and rust/crates/runtime/src/mcp_client.rs:13 is data-shape-only and bootstraps via the SDK without a tungstenite-backed transport, leaving the workspace with zero 
outbound persistent-WebSocket-client capability), zero WebRTC client (webrtc-rs / str0m / libwebrtc-bindings) for the alternative Realtime transport, zero claw realtime / claw live / claw voice-chat / claw realtime-session / claw connect-realtime CLI subcommand, zero /realtime / /live / /voice-chat slash command (existing /voice + /listen + /speak commands are STUB_COMMANDS-gated per #225 and synchronous-only with no realtime-session affordance), zero gpt-4o-realtime-preview / gpt-4o-mini-realtime-preview / gemini-2.0-flash-live entries in MODEL_REGISTRY, zero realtime_audio_input_per_million_tokens / realtime_audio_output_per_million_tokens / realtime_text_input_per_million_tokens / realtime_text_output_per_million_tokens / realtime_session_per_minute fields in ModelPricing struct (six-dimensional pricing matrix exceeding #227's five-dimensional video matrix and #228's four-dimensional mesh matrix — the canonical Realtime pricing model is the most-dimensional yet, with audio tokens at roughly 80-100x text tokens and cached-audio-input at 80% discount), zero realtime-model recognition in pricing_for_model substring-matcher (#209+#224+#225+#226+#227+#228 cluster overlap continues), zero session-resumption-token / interruption-handling / barge-in / voice-activity-detection / turn-detection / function-call-during-realtime / tool-use-during-realtime affordance — uniquely manifesting a TEN-LAYER fusion shape (the largest single-pinpoint fusion catalogued so far, exceeding #225/#227's nine-layer count) combining endpoint-URL-set on /v1/realtime?model=<id> WebSocket-upgrade-endpoint shape (single-endpoint-with-37+-event-types-flowing-bidirectionally, distinct from prior multi-endpoint sets) + bidirectional-symmetric-event-pair data-model with every client-event having a matched server-event-pair (FIRST cluster member with bidirectional-symmetric-event-pair-cardinality on a SINGLE endpoint, distinct from #225's bidirectional-audio-on-three-separate-endpoints which is 
request-response synchronous per endpoint) + Provider-trait-method extension with realtime_session returning a duplex (Sender, Receiver) channel-pair (FIRST cluster member where Provider trait return type is NOT Future-of-T or Stream-of-T but duplex-channel-pair, FIRST method requiring session-state-machine type at the trait boundary) + ProviderClient-enum-dispatch-with-realtime-third-lane with explicit RealtimeKind::OpenAi/Google/Azure partner-routing (provider-asymmetric: Anthropic does not offer realtime, OpenAI offers GA gpt-4o-realtime-preview and gpt-4o-mini-realtime-preview since 2024-10-01, Google Gemini Live API offers bidirectional audio+text+video, Azure mirrors OpenAI surface, zero first-class third-party partners because the persistent-WebSocket-with-37-event-type protocol is too high-bar for partner adoption — distinct from #225's six-partner-set audio surface and #227's twelve-partner-set video surface where partners ARE present) + request-side realtime-session-config opt-in (session.update event with voice/input_audio_format/output_audio_format/input_audio_transcription/turn_detection/tools/tool_choice/temperature/max_response_output_tokens/instructions/modalities:[text,audio] fields — the largest request-side opt-in axis-set yet, the union of every prior request-side opt-in across audio+image+video+chat-completion modalities) + CLI-subcommand-surface + slash-command-surface + pricing-tier-with-six-dimensional-compound-cost-model (per-model × per-modality-input × per-modality-output × per-cached-vs-fresh × per-audio-vs-text × per-minute-session-overhead — the largest pricing-tier extension yet, exceeding #227's five-dimensional and #228's four-dimensional matrices) + persistent-WebSocket-connection-transport-axis (NOVEL TENTH layer, distinct from every prior cluster member's HTTP-shaped transport — synchronous-HTTP for #211-#220+#222+#224, SSE-streaming for #213 partial subsets, multipart-form-data-HTTP for #223+#225+#226+#227+#228 binary-upload 
subsets, async-task-polling-HTTP for #221+#227+#228 — the cluster has now exhausted EVERY HTTP-shaped transport, and #229 introduces the FIRST non-HTTP transport, requiring WebSocket-upgrade-request-with-subprotocol-negotiation + bidirectional-frame-multiplexing-with-text+binary-frames + ping/pong-keepalive + graceful-close-with-status-code-and-reason + reconnection-with-resumption-token + per-event-type-JSON-envelope-dispatch-with-37+-event-types-on-a-single-connection + backpressure-handling-on-both-directions + authentication-via-Authorization-header-on-the-upgrade-request-and-per-session-token-rotation — none of which any HTTP-only transport requires) + bidirectional-symmetric-event-pair shape (input_audio_buffer.append → conversation.item.created, response.create → response.audio.delta + response.audio.done + response.audio_transcript.delta + response.audio_transcript.done + response.function_call_arguments.delta + response.function_call_arguments.done + response.done) — making #229 the FIRST cluster member that introduces a non-HTTP transport (persistent-WebSocket), the FIRST cluster member where Provider trait return type must be a duplex-channel-pair, and the FIRST cluster member where session lifecycle exceeds a single request-response cycle (typical Realtime sessions last 1-30+ minutes with state accumulating across the connection) (Jobdori cycle #380 / extends #168c emission-routing audit / explicit follow-on from #225 audio-bidirectional axis and #228 confirmed-structural async-task-polling cluster — introduces a NOVEL TRANSPORT axis distinct from every prior cluster member / sibling-shape cluster grows to twenty-eight / wire-format-parity cluster grows to nineteen / capability-parity cluster grows to eleven / multimodal-IO cluster grows to seven: #220 image-input + #224 embedding-output + #225 audio-bidirectional-on-separate-REST-endpoints + #226 image-output + #227 video-output + #228 mesh-output + #229 
audio-text-tool-multiplex-on-persistent-WebSocket / provider-asymmetric-delegation cluster grows to six / async-task-polling cluster: still 3 members (#229 is push-based not poll-based — it does NOT join async-task-polling cluster, it founds a NEW cluster) / Persistent-WebSocket-transport cluster: 1 member (#229 alone, FOUNDER) / Bidirectional-symmetric-event-pair cluster: 1 member (#229 alone, FOUNDER) / Non-HTTP-transport cluster: 1 member (#229 alone, FOUNDER) — three new clusters founded in a single pinpoint, the first time a single cycle has founded three concurrent novel clusters / ten-layer-fusion-shape-with-persistent-WebSocket-transport-and-bidirectional-symmetric-event-pair is the largest single-pinpoint fusion catalogued. Distinct from prior cluster members; the ten-layer-fusion-shape with persistent-WebSocket-transport and bidirectional-symmetric-event-pair shape is novel and applies to follow-on candidate Real-time-Image-Generation API typed taxonomy (DALL-E live preview, Imagen live preview) and Real-time-Video-Generation streaming (Veo-Live, Sora-Live) — the persistent-WebSocket-transport pattern is now a first-class cluster member, a structural prerequisite that every future endpoint family using persistent connections will inherit / external validation: forty-eight ecosystem references covering OpenAI Realtime API GA 2024-10-01 with /v1/realtime?model=<id> WebSocket endpoint, 37+ canonical event-type names in OpenAI Realtime API spec, two transport options (WebSocket server-side and WebRTC browser-side), two GA realtime models (gpt-4o-realtime-preview and gpt-4o-mini-realtime-preview both with audio modality and tool-use), Google Gemini Live API with bidirectional WebSocket+gRPC streaming, Azure OpenAI Realtime API mirror, OpenAI Python SDK openai.realtime.AsyncRealtimeConnection typed client, OpenAI TypeScript SDK OpenAI.beta.realtime.RealtimeClient typed client, openai-realtime-api-beta reference client (canonical JS implementation), five 
first-class realtime-voice-agent frameworks all built on top of OpenAI Realtime API (Vapi/Retell-AI/LiveKit-Agents/Pipecat/Daily-Bots), Anthropic non-coverage statement (the second post-#224 provider-asymmetric-delegation case after audio), the canonical six-dimensional pricing matrix ($5.00/$20.00 per million text input/output tokens, $40.00/$80.00 per million audio input/output tokens, $2.50 per million cached audio input tokens for gpt-4o-realtime-preview-2024-10-01), coding-agent peer landscape: anomalyco/opencode has zero GA realtime integration (open feature request from 2026-02 only — confirmed via web search 2026-04-26), sst/opencode predecessor zero realtime, charmbracelet/crush zero realtime, continue.dev zero realtime, aider zero realtime, cursor zero realtime, zed zero realtime — the gap is uniformly zero across the surveyed ecosystem and represents the next-frontier capability that every coding-agent will need to add. claw-code is one of MULTIPLE clients without Realtime, but the persistent-WebSocket-transport-axis is the upstream prerequisite of every voice-agent / live-coding-pair-programming / push-to-talk-coding / barge-in-coding-conversation / function-call-during-voice / streaming-tool-use / sub-second-latency-coding-interaction affordance — the canonical 2024-2026-era voice-coding workflow that is currently impossible to build on top of claw-code — #229 closes the upstream prerequisite of every voice-coding affordance and is the first cluster member where transport-axis becomes a structural prerequisite of the dispatch layer) 2026-04-26 04:40:50 +09:00
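The duplex channel-pair return shape that #229 asks of the Provider trait can be illustrated with standard-library channels. This is a sketch under stated assumptions: an in-process thread stands in for the WebSocket server, and the names `ClientEvent`, `ServerEvent`, and `open_mock_realtime_session` are hypothetical. Only the event pairing (input_audio_buffer.append answered by conversation.item.created, response.create answered by audio deltas and response.done) follows the commit message.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

/// Hypothetical subset of client-to-server realtime events.
#[derive(Debug, Clone, PartialEq)]
pub enum ClientEvent {
    InputAudioBufferAppend(Vec<u8>),
    ResponseCreate,
}

/// Hypothetical subset of server-to-client realtime events.
#[derive(Debug, Clone, PartialEq)]
pub enum ServerEvent {
    ConversationItemCreated,
    ResponseAudioDelta(Vec<u8>),
    ResponseDone,
}

/// Open a mock session and return the duplex (Sender, Receiver) pair.
/// A background thread plays the server; a real client would drive a
/// WebSocket connection here instead.
pub fn open_mock_realtime_session() -> (Sender<ClientEvent>, Receiver<ServerEvent>) {
    let (client_tx, client_rx) = channel::<ClientEvent>();
    let (server_tx, server_rx) = channel::<ServerEvent>();
    thread::spawn(move || {
        // The loop ends when the caller drops its Sender (session close).
        for event in client_rx {
            let replies = match event {
                ClientEvent::InputAudioBufferAppend(_) => {
                    vec![ServerEvent::ConversationItemCreated]
                }
                ClientEvent::ResponseCreate => vec![
                    ServerEvent::ResponseAudioDelta(vec![0u8; 4]),
                    ServerEvent::ResponseDone,
                ],
            };
            for reply in replies {
                if server_tx.send(reply).is_err() {
                    return; // the caller dropped its Receiver
                }
            }
        }
    });
    (client_tx, server_rx)
}
```

The session outlives any single request: the caller keeps both halves and may send many events before closing, which is why neither a Future nor a one-way Stream fits the return type.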
YeonGyu-Kim
71131932de roadmap: #228 filed — 3D-asset-generation API typed taxonomy is structurally absent: zero /v1/3d/generations endpoint surface, zero ThreeDGenerationRequest/ThreeDObject/MeshFormat/ThreeDTaskId typed model, zero ThreeDAsset OutputContentBlock variant, zero generate_3d_asset/retrieve_3d_task Provider trait methods, zero ProviderClient dispatch with nine recommended third-party partners (Meshy-AI/Tripo-AI/CSM/Luma-Genie/Stability3D/Point-E/Shap-E/GET3D/One-2-3-45), zero async-task-polling-primitive in runtime (confirms async-task-polling cluster grows to 3: #221+#227+#228 — structural pattern confirmed not anomalous), zero claw 3d/mesh/generate-3d CLI subcommand, zero /3d /mesh slash command, zero mesh_per_asset_cost_usd pricing field — nine-layer-fusion-shape identical to #227 with mesh-modality replacing video-modality (GLB/GLTF/USDZ/OBJ/FBX binary-spatial-geometry output instead of MP4 binary-temporal-media, per-3d-asset pricing instead of per-second-of-video, mesh-polygon-density as quality axis replacing video-fps-and-duration) / Jobdori cycle #379 / sibling-shape cluster grows to 27 / multimodal-IO cluster grows to 6 / provider-asymmetric-delegation cluster grows to 5 / async-task-polling cluster grows to 3 2026-04-26 04:18:34 +09:00
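The async-task-polling primitive that #221, #227, and #228 are said to share could look roughly like the sketch below. It assumes a caller-supplied status check and a bounded number of attempts; `TaskStatus` and `poll_task` are hypothetical names, not claw-code runtime APIs.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical status of a long-running generation task.
#[derive(Debug, Clone, PartialEq)]
pub enum TaskStatus {
    Pending,
    /// Carries a handle to the finished asset, e.g. a download URL.
    Succeeded(String),
    Failed(String),
}

/// Call `check` until the task leaves `Pending` or attempts run out.
/// `check` stands in for a GET on the provider's task-retrieval endpoint.
pub fn poll_task<F>(mut check: F, max_attempts: u32, delay: Duration) -> Result<String, String>
where
    F: FnMut() -> TaskStatus,
{
    for attempt in 1..=max_attempts {
        match check() {
            TaskStatus::Succeeded(asset) => return Ok(asset),
            TaskStatus::Failed(reason) => return Err(reason),
            TaskStatus::Pending => {
                // Sleep between attempts, but not after the last one.
                if attempt < max_attempts {
                    thread::sleep(delay);
                }
            }
        }
    }
    Err(format!("task still pending after {max_attempts} attempts"))
}
```

Taking the status check as a closure keeps the loop independent of modality, so the same helper could serve batch, video, and mesh tasks.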
Jobdori
4ced37897c roadmap: #227 filed — Video-generation API typed taxonomy is structurally absent: zero /v1/videos/generations + zero /v1/videos/edits + zero /v1/videos/extends + zero /v1/videos/{id} polling-and-retrieval endpoint surface across both Anthropic-native and OpenAI-compat lanes, zero VideoGenerationRequest / VideoEditRequest / VideoExtendRequest / VideoGenerationResponse / VideoObject / VideoQuality / VideoResolution / VideoAspectRatio / VideoDuration / VideoOutputFormat / VideoFrameRate / VideoCodec / VideoStyle / VideoSource / VideoMediaType / VideoTaskStatus / VideoTaskId typed model in rust/crates/api/src/types.rs, zero Video variant on OutputContentBlock (4-arm exhaustive: Text/ToolUse/Thinking/RedactedThinking — extending #226's asymmetric-output-only modality axis with new temporal-duration dimension), zero generate_video / edit_video / extend_video / retrieve_video_task methods on Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message + stream_message exist, both per-request synchronous and constrained to text-modality chat/completion taxonomy with zero video-output dispatch surface AND zero async-task polling primitive — the canonical video-generation pattern requires a two-phase request/poll workflow that the Provider trait does not expose because every existing method returns a synchronous response, distinct from #221's batch-dispatch async pattern which uses different polling shape with file-upload prerequisites that don't apply to video-gen), zero video-generation dispatch on ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi, zero Sora/Veo/Pika/Runway/Luma/Mochi/Kling/Hailuo/Replicate/FalAi/BlackForestLabs/StabilityVideo partner-routing variants — twelve-plus-partner-set, the largest partner-set yet in the cluster surpassing #226's eight-plus-partner image-gen set because video-generation is the most-fragmented modality across third-party providers in 2024-2026 with every 
major lab shipping its own video-gen surface in the post-Sora-launch arms race), zero multipart/form-data upload affordance with reqwest::multipart feature flag absent from rust/crates/api/Cargo.toml — multipart needed for /v1/videos/edits and /v1/videos/extends subset (parallel to #226's image-edits subset), zero async-task polling primitive in the runtime — there is no TaskPoller / AsyncTask / TaskStatus / TaskId / poll_task_until_complete machinery anywhere in rust/crates/runtime/ (rg returns zero hits for task_id/task_status/polling/poll_task/async_task/pending_task across rust/), distinguishing video-generation's async-polling pattern from every prior cluster member which is either synchronous (#211 through #226 except #221) or streaming-via-SSE (#221 batch-dispatch is closest, but uses different polling shape with file-upload prerequisites), zero claw video / claw videos / claw generate-video / claw render-video CLI subcommand at rust/crates/rusty-claude-cli/src/main.rs, zero /sora / /veo / /video / /render-video / /generate-video slash command in SlashCommandSpec table (zero video-related entries — video-input doubly absent because no advertised-but-unbuilt commands AND no implemented commands, strict-subset of #226's image-generation gap), zero sora-2 / sora-2-pro / veo-3 / veo-3-fast / runway-gen-4 / luma-dream-machine / pika-2.0 / kling-1.5 / hailuo-i2v-01 / hunyuan-video / mochi-1 / cogvideox-5b / stable-video-diffusion-1.1 entries in MODEL_REGISTRY, zero video_per_second_cost_usd / video_per_megapixel_second_cost_usd / video_input_token_cost_per_million / video_output_token_cost_per_million / video_per_minute_cost_usd fields in ModelPricing struct (rust/crates/runtime/src/usage.rs:9-15 has only four text-token-only fields) — the five-dimensional pricing matrix (model × resolution × fps × duration × extension-vs-generation compound-cost) is the largest pricing-tier extension yet catalogued, exceeding #226's four-dimensional image matrix, zero 
video-gen-model recognition in pricing_for_model substring-matcher (#209+#224+#225+#226 cluster overlap) — uniquely manifesting a nine-layer fusion shape combining #223's transport-plumbing-absence (multipart on edits/extends subset) + #224's provider-asymmetric-delegation (Anthropic does not offer video-gen at all, OpenAI offers GA Sora-2 + Sora-2-pro, Google offers Veo-3 + Veo-3-fast, Runway offers Gen-4 + Gen-4-turbo, plus twelve-plus recommended partners) + #218's request-side response_format/output_format/resolution/fps/duration opt-in (the largest request-side axis-set yet because video-gen has the most parameters in the modality-bearing endpoint family ecosystem) + asymmetric-output-only content-block-taxonomy axis with temporal-duration dimension (extending #226's image-output axis with temporal-fps-and-duration sub-dimensions) + the new async-task-polling-primitive axis (#227's first-of-its-kind contribution to the cluster doctrine, since prior cluster members have either synchronous-response or streaming-via-SSE or batch-via-Files-API-prerequisite or one-shot-multipart coverage, never long-poll-task-id-with-timeout-and-resume — the canonical video-gen pattern requires a two-phase request/poll workflow because video-rendering takes 30-300+ seconds depending on model and duration, exceeding typical HTTP-request-response timeout window) — making #227 the first cluster member where five independent prior shape-axes converge AND introduces a sixth novel shape-axis (async-task-polling-primitive), the largest fusion-shape gap catalogued so far (matching #225's nine-layer count but with different ninth axis — async-task-polling-primitive replacing #225's symmetric-input-output content-blocks, and one axis larger than #226's eight-layer fusion), making #227 the first cluster member where async-task-polling-primitive becomes a structural prerequisite of the dispatch layer (Jobdori cycle #378 / extends #168c emission-routing audit / explicit follow-on candidate from 
#226's eight-layer-fusion-shape-with-asymmetric-output-only-modality-coverage — third-named of the modality-bearing endpoint-family-absence cluster after #225 audio + #226 image-generation, completing the trio with video-generation closing the visual-temporal output modality / sibling-shape cluster grows to twenty-six / wire-format-parity cluster grows to seventeen / capability-parity cluster grows to nine / multimodal-IO cluster grows to five: #220 image-input + #224 embedding-output + #225 audio-bidirectional + #226 image-output + #227 video-output (the first cluster member where output is binary-temporal-media requiring long-poll workflows) / cross-cutting-data-pipeline cluster grows to four / multipart-transport cluster grows to four / provider-asymmetric-delegation cluster grows to four (twelve-plus partners, the largest in the cluster) / nine-layer-fusion-shape-with-async-task-polling-primitive (endpoint-URL-set-of-four [generations+edits+extends+polling] + multipart-on-subset + data-model-with-output-content-block-only-with-temporal-duration-dimension + response_format/output_format/resolution/fps/duration request-side opt-in + Provider-trait-method-set-of-four-with-async-task-polling-and-Unsupported-fallback + ProviderClient-enum-dispatch-with-twelve-plus-partner-third-lanes + CLI-subcommand-surface + pricing-tier-with-five-dimensional-compound-cost-model + async-task-polling-primitive-with-timeout-and-resume) is the largest single-pinpoint fusion catalogued. 
Distinct from prior cluster members; the nine-layer-fusion-shape-with-async-task-polling-primitive is novel and applies to follow-on candidate 3D-asset-generation API typed taxonomy (/v1/3d/generations for Shap-E / Meshy AI / Tripo AI / CSM / Stable Point-Aware-3D — same nine-layer fusion shape but with 3D-mesh-instead-of-video modality, GLB/GLTF/USDZ-binary-output instead of MP4-binary-output, per-3d-asset pricing instead of per-second-of-video — the natural #228 candidate) / external validation: fifty-three ecosystem references covering four first-class video-gen-endpoint specs on OpenAI side (generations + edits + extends + {id}-polling), one Anthropic non-coverage statement, one Google Veo-3 API spec with long-running-operation polling, twelve first-class third-party video-gen providers (Runway/Luma/Pika/Kling/Hailuo/Hunyuan/Mochi/CogVideoX/Stability-Video/BFL-Video/Replicate-Video/Fal-Video), three first-class CLI/SDK implementations of typed video-gen surface (OpenAI Python+TypeScript videos.generate + videos.retrieve, Runway TypeScript SDK, Luma Python SDK), six first-class local-video-gen providers (Stable Video Diffusion / AnimateDiff / Hunyuan-Video weights / Mochi-1 weights / CogVideoX weights / ComfyUI workflows), one community-maintained authoritative benchmark (VBench 16-evaluation-dimensions), nine coding-agent peers with video-gen capability, one canonical Anthropic-recommended partner-set (Sora-2/Veo-3/Runway/Luma per third-party-integration guide), the OpenAI /v1/responses endpoint with video_call tool for conversational video-output decoding via OutputContentBlock::Video, the canonical five-dimensional pricing matrix (per-model × per-resolution × per-fps × per-duration × per-extension-vs-generation), the canonical async-polling workflow with task-id polling at typical 5-second intervals and 5-minute typical-completion-time and 30-minute maximum-completion-time before timeout — claw-code is the sole client/agent/CLI in the surveyed coding-agent 
ecosystem with zero /v1/videos/{generations,edits,extends} integration AND zero Sora-2/Veo-3/Runway/Luma/Pika/Kling/Hailuo/Hunyuan/Mochi/CogVideoX/Stability-Video/BFL-Video partner-routing AND zero /sora / /veo / /video / /render-video / /generate-video slash command AND zero claw video / claw videos / claw generate-video / claw render-video CLI subcommand AND zero OutputContentBlock::Video variant AND zero multipart-form-data transport plumbing for video-edit binary uploads AND zero async-task-polling-primitive at the runtime layer — all seven gaps unique to claw-code in the surveyed ecosystem, the video-generation-API gap is the upstream prerequisite of every visual-temporal-output coding-agent affordance, and the nine-layer-fusion-shape-with-async-task-polling-primitive is novel within the cluster — #227 closes the upstream prerequisite of every visual-temporal-output coding-agent affordance and is the first cluster member where async-task-polling-primitive shape-axis is introduced) 2026-04-26 04:17:24 +09:00
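The two-phase request/poll workflow that #227 says the runtime lacks can be sketched roughly as follows. This is a minimal, std-only illustration and not code from the repository. `TaskStatus` and `poll_task_until_complete` reuse names the commit message lists as missing, while `PollError`, the closure-based `check` parameter, and `demo` are hypothetical. A real implementation would presumably be async and issue HTTP requests against a `/v1/videos/{id}`-style endpoint, polling at something like the 5-second interval and 30-minute ceiling quoted above.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Hypothetical status of a long-running generation task.
#[derive(Debug, Clone, PartialEq)]
enum TaskStatus {
    Pending,
    /// Carries an identifier or URL for the finished asset (assumed shape).
    Completed(String),
    Failed(String),
}

/// Hypothetical terminal errors for the polling loop.
#[derive(Debug, PartialEq)]
enum PollError {
    TimedOut,
    Failed(String),
}

/// Calls `check` repeatedly, sleeping `interval` between calls, until it
/// reports a terminal status or `timeout` has elapsed.
fn poll_task_until_complete<F>(
    mut check: F,
    interval: Duration,
    timeout: Duration,
) -> Result<String, PollError>
where
    F: FnMut() -> TaskStatus,
{
    let deadline = Instant::now() + timeout;
    loop {
        match check() {
            TaskStatus::Completed(output) => return Ok(output),
            TaskStatus::Failed(reason) => return Err(PollError::Failed(reason)),
            TaskStatus::Pending => {}
        }
        if Instant::now() >= deadline {
            return Err(PollError::TimedOut);
        }
        sleep(interval);
    }
}

/// Simulates a task that is pending twice and then completes.
fn demo() -> Result<String, PollError> {
    let mut calls = 0;
    poll_task_until_complete(
        || {
            calls += 1;
            if calls < 3 {
                TaskStatus::Pending
            } else {
                TaskStatus::Completed("video_123".to_string())
            }
        },
        Duration::from_millis(1),
        Duration::from_secs(5),
    )
}
```

Passing the status check as a closure keeps the loop independent of any transport, so the same primitive could in principle serve the #221, #227 and #228 polling cases despite their different endpoints.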
Yeachan-Heo
897055a455 roadmap: #226 filed 2026-04-25 19:03:10 +00:00
YeonGyu-Kim
84a89f7e07 roadmap: #225 filed — Audio API typed taxonomy is structurally absent: zero /v1/audio/transcriptions + zero /v1/audio/translations + zero /v1/audio/speech endpoint surface across both Anthropic-native and OpenAI-compat lanes, zero TranscriptionRequest / SpeechRequest / AudioVoice / AudioFormat / AudioMediaType / AudioSource / Modality / AudioRequestConfig / SpeechResponse / TranscriptionResponse typed model in rust/crates/api/src/types.rs, zero Audio variant on InputContentBlock (3-arm exhaustive: Text/ToolUse/ToolResult), zero Audio variant on OutputContentBlock (4-arm exhaustive: Text/ToolUse/Thinking/RedactedThinking), zero modalities/audio fields on MessageRequest for gpt-4o-audio request-side opt-in, zero transcribe/translate/synthesize_speech methods on Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message + stream_message exist), zero audio dispatch on ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi, zero Whisper/ElevenLabs/Cartesia/Deepgram/AssemblyAI/Speechmatics partner-routing variants), zero multipart/form-data upload affordance with reqwest::multipart feature flag absent from rust/crates/api/Cargo.toml (rg returns zero hits for multipart across rust/), zero claw audio/transcribe/speak/tts/whisper CLI subcommand at rust/crates/rusty-claude-cli/src/main.rs, zero /transcribe/whisper/tts slash command, AND the existing /voice + /listen + /speak slash commands at rust/crates/commands/src/lib.rs:295-301+603-609+610-616 advertise audio-capability summaries but are all gated under STUB_COMMANDS at rust/crates/rusty-claude-cli/src/main.rs:8333+8388+8389 (advertised-but-unbuilt shape ×3, the largest single-pinpoint advertised-but-unbuilt slash-command count catalogued, strict-superset of #220's /image+/screenshot ×2 and #223's /files ×1), zero whisper-1/tts-1/tts-1-hd/gpt-4o-audio-preview/gpt-4o-realtime-preview/gpt-4o-mini-tts/gpt-4o-mini-transcribe entries in MODEL_REGISTRY, 
zero audio_input_per_minute/audio_output_per_minute/tts_per_million_chars/whisper_per_minute fields in ModelPricing struct (rust/crates/runtime/src/usage.rs:9-15 has only four text-token-only fields), zero audio-model recognition in pricing_for_model substring-matcher (#209+#224 cluster overlap) — uniquely manifesting a fusion shape combining #223's transport-plumbing-absence (multipart/form-data) + #224's provider-asymmetric-delegation (Anthropic does not offer audio at all per docs.anthropic.com/audio explicitly recommending AssemblyAI/Deepgram/OpenAI-Whisper, OpenAI offers GA whisper-1+tts-1+tts-1-hd+gpt-4o-audio-preview+gpt-4o-realtime-preview+gpt-4o-mini-tts+gpt-4o-mini-transcribe, Google Gemini Live API offers bidirectional audio modality, six-plus recommended partners ElevenLabs/Cartesia/PlayHT/Deepgram/AssemblyAI/Speechmatics) + #220's advertised-but-unbuilt-slash-commands (×3, the largest count catalogued) + #218's modalities-request-side-absence (gpt-4o-audio-preview's modalities:[text,audio] opt-in) + symmetric-input-output content-block-taxonomy axis (#225's first-of-its-kind contribution to the cluster doctrine since prior members have either input-only [#220] or output-only [#214,#224] or stateless [#221/#222/#223] modality coverage) — making #225 the first cluster member where four independent prior shape-axes converge in a single pinpoint and the largest fusion-shape gap catalogued so far (Jobdori cycle #377 / extends #168c emission-routing audit / explicit follow-on candidate from #224's provider-asymmetric-delegation shape — the first-named of two named candidates: Audio API typed taxonomy (this pinpoint #225) / Image-generation API typed taxonomy (open candidate for #226), Audio chosen because it inherits #223's multipart-transport-plumbing dimension that Image-generation does not — the multipart sibling of #223 that the cycle hint explicitly identifies / sibling-shape cluster grows to twenty-four / wire-format-parity cluster grows to fifteen / 
capability-parity cluster grows to seven / multimodal-IO cluster grows to three: #220 input-only + #224 output-only + #225 full-duplex-bidirectional / advertised-but-unbuilt cluster grows to four / multipart-transport cluster grows to two / provider-asymmetric-delegation cluster grows to two / nine-layer-fusion-shape (endpoint-URL-set-of-three + multipart-form-data-transport-plumbing + data-model-taxonomy-with-input-AND-output-content-blocks + modalities-request-side-opt-in + Provider-trait-method-set-of-three-with-Unsupported-fallback + ProviderClient-enum-dispatch-with-six-partner-third-lanes + advertised-but-unbuilt-slash-commands-×3 + CLI-subcommand-surface + pricing-tier-with-per-minute-and-per-million-chars-and-per-million-audio-tokens-compound-cost-model) is the largest single-pinpoint fusion catalogued / external validation: forty-seven ecosystem references covering three first-class audio-endpoint specs on OpenAI side, one Anthropic non-coverage statement, one Google Gemini Live API spec, six first-class STT providers, six first-class TTS providers, one full-duplex bidirectional-audio endpoint OpenAI /v1/realtime, three first-class CLI/SDK typed-surface implementations, six first-class local-audio-providers, one community-maintained Common Voice benchmark, seven coding-agent peers with audio capability, one canonical Anthropic-recommended three-partner-set / claw-code is the sole client/agent/CLI with zero /v1/audio/{transcriptions,translations,speech} integration AND zero ElevenLabs/Cartesia/Deepgram/AssemblyAI/Speechmatics/Whisper partner-routing AND three advertised-but-unbuilt slash commands AND zero modalities request-side opt-in AND zero Audio content-block taxonomy variant on either input or output side AND zero multipart-form-data transport plumbing for audio uploads — all six gaps unique to claw-code in the surveyed ecosystem) 2026-04-26 03:47:33 +09:00
Jobdori
c01b47036e roadmap: #224 filed — Embeddings API typed taxonomy is structurally absent: zero /v1/embeddings endpoint surface across both Anthropic-native and OpenAI-compat lanes, zero EmbeddingRequest / EmbeddingResponse / EmbeddingObject / EmbeddingUsage / EmbeddingEncoding / EmbeddingInputType / EmbeddingTruncation / EmbeddingOutputDtype / EmbeddingData typed model in rust/crates/api/src/types.rs (rg returns zero hits for embedding/embed/Embedding/EmbeddingRequest/EmbeddingResponse/text-embedding/voyage-/vector/cosine/similarity/dimensions across rust/), zero Vec<f32>/Vec<f64> embedding-vector slot anywhere in the data model, zero create_embeddings method on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist), zero embeddings dispatch on the ProviderClient enum at rust/crates/api/src/client.rs:8-14, zero claw embed / claw embeddings / claw vector CLI subcommand surface, zero /embed / /embeddings slash command in the SlashCommandSpec table, zero embedding_input_tokens_per_million_usd / embedding_dimensions fields in the Pricing struct, zero embedding-model entries in MODEL_REGISTRY (13 chat/completion entries, zero text-embedding-3-small/large/ada-002/voyage-3-large/voyage-code-3/embed-english-v3.0/cohere-embed/nomic-embed/mxbai-embed entries), and the pricing_for_model substring-matcher matches only haiku/opus/sonnet literals so it cannot recognize any embedding-model id (#209 cluster overlap) — manifesting a uniquely provider-asymmetric-delegation shape where Anthropic explicitly does not offer /v1/embeddings on https://api.anthropic.com and instead delegates to Voyage AI as the recommended partner per https://docs.anthropic.com/en/docs/build-with-claude/embeddings while OpenAI offers /v1/embeddings GA since 2022-12-15 (39+ months ago, the literal flagship endpoint of OpenAI's developer platform alongside /v1/chat/completions) — the cross-provider asymmetry is structural and requires a third lane in the 
ProviderClient enum (Voyage variant or supports_embeddings capability flag with EmbeddingError::Unsupported recommendation return shape) that no other endpoint family in this audit has needed — distinct from #221 batch dispatch (uniform on both major providers), #222 models list (uniform on both), and #223 Files API (uniform on both, just different beta header on Anthropic), making #224 the first cluster member where one canonical major provider explicitly does not offer the endpoint and recommends an external partner, requiring multi-provider routing rather than uniform Provider trait dispatch (Jobdori cycle #376 / extends #168c emission-routing audit / explicit follow-on candidate from #221 seven-layer-endpoint-family-absence shape — the second-named of three named candidates: Files API typed taxonomy / Embeddings API typed taxonomy / Models list endpoint typed taxonomy, completing the trio with #222 closing Models list and #223 closing Files API and #224 closing Embeddings / sibling-shape cluster grows to twenty-three: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221/#222/#223/#224 / wire-format-parity cluster grows to fourteen: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221+#222+#223+#224 / capability-parity cluster grows to six: #218+#220+#221+#222+#223+#224 / cross-cutting-data-pipeline cluster: #224 alone but it is the upstream prerequisite of every RAG / semantic-search / re-ranking / hybrid-search / dense-retrieval / classification-via-cosine / clustering / nearest-neighbor / codebase-indexing / context-retrieval-via-similarity use case that 2024-2026-era coding-agent harnesses ship as first-class affordances / seven-layer-endpoint-family-absence-with-provider-asymmetric-delegation shape (endpoint-URL + data-model-taxonomy + Provider-trait-method-with-Unsupported-fallback + ProviderClient-enum-dispatch-with-Voyage-third-lane + CLI-subcommand-surface + slash-command-surface + 
Voyage-AI-partner-routing-with-credential-discovery) is the first single capability absence catalogued where the provider-asymmetric-delegation pattern itself must be modeled at the dispatch layer — distinct from #221 / #222 / #223 seven/eight/seven-layer absences (all uniform-provider-coverage), and the largest provider-routing-asymmetry gap catalogued, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) / seven-layer-endpoint-family-absence (#221) / eight-layer-endpoint-family-absence-with-misleading-alias (#222) / seven-layer-endpoint-family-absence-with-transport-plumbing-absence (#223) members; the seven-layer-endpoint-family-absence-with-provider-asymmetric-delegation shape is novel and applies to follow-on candidates Audio API typed taxonomy (also provider-asymmetric: Anthropic does not offer audio, OpenAI offers GA whisper+tts, recommended-partners include ElevenLabs/Cartesia/PlayHT/Deepgram) and Image-generation API typed taxonomy (also provider-asymmetric: Anthropic does not offer image generation, recommended-partners include Stability AI/Midjourney/Black Forest Labs/Ideogram) / external validation: forty-three ecosystem references covering three first-class embeddings-endpoint specs (OpenAI /v1/embeddings GA 2022-12-15, Voyage AI /v1/embeddings GA 2024-01, Cohere /v1/embed), eleven first-class CLI/SDK implementations (OpenAI Python+TypeScript, Voyage AI Python+TypeScript, Cohere Python+TypeScript, simonw/llm + llm-embed plugin, Vercel AI SDK, LangChain Python+TypeScript), six first-class local-embedding-providers (Ollama, LM Studio, llama.cpp server, llamafile, sentence-transformers, HuggingFace transformers), one community-maintained authoritative benchmark (MTEB 56 tasks), twelve coding-agent peers (continue.dev @codebase/@docs, zed semantic-search, aider 
repository-mapping, cursor background-indexing, anomalyco/opencode @code/@docs, charmbracelet/crush context-management, TabbyML/tabby code-completion-with-context, simonw/llm-embed, codeium/cline embedding-context, sourcegraph/cody @-mention, github/copilot enterprise codebase-indexing, anthropic/claude-code retrieval-augmented planning), six first-class vector-database integrations (Pinecone, Weaviate, Qdrant, Chroma, pgvector, FAISS), and one canonical Anthropic-blessed partner-routing pattern (Voyage AI per docs.anthropic.com/embeddings). claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/embeddings integration AND zero Voyage AI partner-routing AND zero @code/@docs/@codebase retrieval-augmented slash command surface AND zero CLI-level claw embed / claw similar / claw vector subcommand family — all four gaps are unique to claw-code in the surveyed ecosystem (every other coding-agent peer has at least the @-mention codebase-retrieval pattern), the embedding-API gap is the upstream prerequisite of every retrieval-augmented affordance in the runtime, and the provider-asymmetric-delegation shape is novel within the cluster — #224 closes the upstream prerequisite of every RAG / semantic-search / re-ranking / hybrid-search / classification-via-cosine / clustering / nearest-neighbor / codebase-indexing / context-retrieval-via-similarity use case, completes the trio of follow-on candidates from #221 seven-layer-endpoint-family-absence shape (Files API closed by #223, Models list closed by #222, Embeddings API closed by #224), and establishes the provider-asymmetric-delegation pattern as a first-class cluster member — a structural prerequisite that every future endpoint family with provider-asymmetric coverage (Audio API: Anthropic delegates to ElevenLabs/Cartesia, Image-generation API: Anthropic delegates to Imagen/DALL-E/Stability) will inherit. 2026-04-26 03:09:53 +09:00
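The provider-asymmetric-delegation shape described in #224 could look roughly like the sketch below. It assumes a trimmed-down `ProviderClient` enum with a hypothetical `Voyage` third lane and an `EmbeddingError::Unsupported` variant whose field names are invented for illustration. `stub_vector` is a placeholder standing in for a real network call, and none of this is taken from the claw-code source.

```rust
/// Error returned when a provider has no embeddings endpoint.
/// The field names are assumptions made for this sketch.
#[derive(Debug, PartialEq)]
enum EmbeddingError {
    Unsupported {
        provider: &'static str,
        recommended_partner: &'static str,
    },
}

/// Trimmed-down dispatch enum; `Voyage` is the hypothetical third lane.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ProviderClient {
    Anthropic,
    OpenAi,
    Voyage,
}

/// Placeholder for a real embeddings request: returns a one-element
/// vector derived from the text length so the example runs offline.
fn stub_vector(text: &str) -> Vec<f32> {
    vec![text.len() as f32]
}

impl ProviderClient {
    /// Capability flag, so callers can check before dispatching.
    fn supports_embeddings(self) -> bool {
        !matches!(self, ProviderClient::Anthropic)
    }

    /// Dispatches to the provider, or reports the recommended partner
    /// when the provider does not offer embeddings at all.
    fn create_embeddings(self, inputs: &[&str]) -> Result<Vec<Vec<f32>>, EmbeddingError> {
        match self {
            ProviderClient::Anthropic => Err(EmbeddingError::Unsupported {
                provider: "anthropic",
                recommended_partner: "Voyage AI",
            }),
            ProviderClient::OpenAi | ProviderClient::Voyage => {
                Ok(inputs.iter().map(|text| stub_vector(text)).collect())
            }
        }
    }
}
```

Returning the recommended partner inside the error, rather than a bare "unsupported", is one way a CLI could turn the asymmetry into an actionable message. Whether the real fix uses a capability flag, a dedicated variant, or both is left open by the commit message.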
YeonGyu-Kim
ca2085cb95 roadmap: #223 filed — Files API typed taxonomy is structurally absent: zero /v1/files endpoint surface across both Anthropic-native (anthropic-beta: files-api-2025-04-14) and OpenAI-compat lanes, zero FileObject / FileList / FilePurpose / FileStatus / FileUploadRequest / FileContentResponse / FileDeletionResponse typed model in rust/crates/api/src/types.rs (zero hits for files-api-2025-04-14, /v1/files, FileObject, FileList, FilePurpose, file_id, upload_file, MultipartUpload, multipart/form-data across rust/), zero multipart/form-data upload affordance with reqwest::multipart feature flag absent from rust/crates/api/Cargo.toml, zero file_id reference type that #220 image-content-block fix-shape would need to thread through ResolvedAttachment at rust/crates/tools/src/lib.rs:2660-2666 (which carries path/size/is_image triple with no file_id, no bytes, no media_type, no purpose, no upload_status, no expires_at slot), zero file_id reference type that #221 OpenAI batch-input-JSONL upload pathway requires (POST /v1/batches accepts only input_file_id, no inline-JSONL pathway exists), zero upload_file / retrieve_file / list_files / download_file / delete_file methods on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist, both per-request synchronous), zero file-management dispatch on the ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi all closed under per-request sync), zero claw files / claw upload / claw attach CLI subcommand surface at rust/crates/rusty-claude-cli/src/main.rs, zero /upload / /attach / /file-upload slash command in the SlashCommandSpec table at rust/crates/commands/src/lib.rs (the existing /files entry advertises 'List files in the current context window' but is gated under STUB_COMMANDS as a context-window file lister, distinct feature from Files API), zero pending_uploads field in claw status --json output, zero files-api-2025-04-14 in 
the active anthropic-beta header at rust/crates/telemetry/src/lib.rs:451-453 (currently sends claude-code-20250219, prompt-caching-scope-2026-01-05, tools-2026-04-01 only), zero FileSubmittedEvent / FileUploadProgressEvent / FileRetentionExpiredEvent typed events on the runtime telemetry sink, zero reqwest::multipart::Form::new() / reqwest::multipart::Part::stream() / file_part / content_disposition usage anywhere in the codebase (rg returns zero hits) — the canonical file-upload affordance is invisible across every CLI / REPL / slash-command / Provider-trait / ProviderClient-enum / data-model / telemetry-beta-header / multipart-transport surface, blocking the upstream fix-shapes for both #220 (image attachment via persistent file_id, the canonical Anthropic Vision pattern documented at platform.claude.com/docs/en/build-with-claude/files for repeated-image-use efficiency where re-uploading 5MB+ images on every request would otherwise burn bandwidth) and #221 (OpenAI Batch API requires JSONL input upload via POST /v1/files with purpose: 'batch' then references the resulting file_id from POST /v1/batches — the JSONL payload cannot be sent inline; without a Files API the OpenAI batch lane is structurally unreachable even if every other layer of #221 seven-layer fix-shape ships) (Jobdori cycle #375 / extends #168c emission-routing audit / explicit follow-on candidate from #221 seven-layer-endpoint-family-absence shape — the first-named of three named candidates: Files API typed taxonomy / Embeddings API typed taxonomy / Models list endpoint typed taxonomy, completing the trio with #222 closing Models list / sibling-shape cluster grows to twenty-two: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221/#222/#223 / wire-format-parity cluster grows to thirteen: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221+#222+#223 / capability-parity cluster grows to five: #218+#220+#221+#222+#223 / resource-management cluster: #223 
alone but it is the upstream root cause of #220 image-attachment via persistent file_id and #221 OpenAI batch-input-JSONL upload pathway / seven-layer-endpoint-family-absence-with-transport-plumbing-absence shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + anthropic-beta-header-opt-in + CLI-subcommand-surface + multipart-form-data-transport-plumbing) is the first single capability absence catalogued where the transport layer itself must be extended before any higher-level surface can ship, distinct from #221 seven-layer absence (which operated within the existing JSON envelope) and the largest single transport-level gap catalogued, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) / seven-layer-endpoint-family-absence (#221) / eight-layer-endpoint-family-absence-with-misleading-alias (#222) members; the seven-layer-endpoint-family-absence-with-transport-plumbing-absence shape is novel and applies to the follow-on candidate Audio API typed taxonomy (/v1/audio/transcriptions, /v1/audio/speech, /v1/audio/translations, also requiring multipart/form-data uploads) / external validation: Anthropic Files API reference at https://platform.claude.com/docs/en/build-with-claude/files documenting five operations on /v1/files with anthropic-beta: files-api-2025-04-14 opt-in, Anthropic Vision documentation referencing Files API for >5MB images and repeated-image-use efficiency, Anthropic Python SDK client.beta.files.upload first-class typed surface GA-shipped 2025-04-14, Anthropic TypeScript SDK parallel surface, OpenAI Files API reference at platform.openai.com/docs/api-reference/files documenting GA since 2023 with five operations on /v1/files and purpose discriminator (assistants/batch/fine-tune/user_data/vision) and FileStatus
lifecycle (Uploaded/Processed/Error), OpenAI Python SDK client.files.create first-class surface, OpenAI Batch API explicitly requires input_file_id from POST /v1/files with purpose:'batch' (no inline-JSONL pathway), AWS Bedrock model invocation with input/output S3 paths (parallel concept), Azure OpenAI Files reference, Vertex AI Files via Cloud Storage, DeepSeek/Moonshot/Alibaba-DashScope/xAI parallel /v1/files OpenAI-compat shapes, OpenRouter file passthrough, simonw/llm --attachment flag with auto-upload to Files API, Vercel AI SDK 6 experimental_attachments threading file_id reference, LangChain Files integration with FileLoader uploading via Files API, charmbracelet/crush typed file management with provider-aware lifecycle, continue.dev config-file-driven file management with auto-upload, zed-industries/zed bundled-file management with periodic upstream sync, anomalyco/opencode file-upload integration with explicit file_id lifecycle in conversation context, models.dev file-handling capability flags indicating which models support file_id references, OpenTelemetry GenAI semconv gen_ai.input.attachments.count and gen_ai.input.files.count documented attributes, IANA MIME-type registry RFC 4288/4289 for application/json + multipart/form-data + application/pdf + image/png/jpeg/gif/webp, RFC 7578 multipart/form-data specification, reqwest::multipart documentation requiring 'multipart' feature flag on the reqwest dependency. Twenty-eight ecosystem references, two first-class Files API specs (Anthropic beta, OpenAI GA), GA timeline of 12 months on Anthropic beta side and 24+ months on OpenAI side (Files API on OpenAI predates Assistants API and Batch API both of which depend on it as prerequisite), seven first-class CLI/SDK implementations, one transport-layer specification (RFC 7578 multipart/form-data) and one Rust-side prerequisite (reqwest::multipart feature flag). 
claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/files integration AND zero multipart-form-data transport plumbing — both gaps are unique to claw-code in the surveyed ecosystem, the file-management gap is the upstream root cause of two downstream capability gaps already catalogued in this audit (#220 image attachment via persistent file_id, #221 OpenAI batch input-JSONL upload), and the multipart-transport-plumbing-absence shape is novel within the cluster — #223 closes the upstream root cause of two downstream gaps and unblocks file_id-based multimodal input (5MB+ images / PDFs / repeated-image-use efficiency), OpenAI batch-input-JSONL upload (the missing piece of #221 seven-layer batch dispatch fix-shape), Anthropic-style document-block content with source:{type:'file',file_id} for PDFs, and CLI-vs-slash-command-symmetry on file management that the runtime clawability doctrine treats as canonical baseline expectations. 2026-04-26 02:41:23 +09:00
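Note on #223: the multipart/form-data transport gap can be pictured with a std-only sketch of an RFC 7578 request body. `FilePart` and `build_multipart_body` are hypothetical names, not claw-code items, and a real fix would more likely enable the `multipart` feature on the reqwest dependency than hand-roll the framing. The caller is assumed to pick a boundary that never occurs in the payload.

```rust
/// Hypothetical description of one file to upload; not a claw-code type.
pub struct FilePart<'a> {
    pub field_name: &'a str,
    pub file_name: &'a str,
    pub content_type: &'a str,
    pub bytes: &'a [u8],
}

/// Builds a multipart/form-data body (RFC 7578) from plain text fields plus one file.
/// Returns the Content-Type header value and the raw body bytes.
/// Assumption: `boundary` does not appear anywhere inside the field values or file bytes.
pub fn build_multipart_body(
    boundary: &str,
    text_fields: &[(&str, &str)],
    file: &FilePart<'_>,
) -> (String, Vec<u8>) {
    let mut body = Vec::new();

    // Each text field is its own part: delimiter, headers, blank line, value.
    for (name, value) in text_fields {
        body.extend_from_slice(format!("--{boundary}\r\n").as_bytes());
        body.extend_from_slice(
            format!("Content-Disposition: form-data; name=\"{name}\"\r\n\r\n").as_bytes(),
        );
        body.extend_from_slice(value.as_bytes());
        body.extend_from_slice(b"\r\n");
    }

    // The file part additionally carries a filename and its own Content-Type.
    body.extend_from_slice(format!("--{boundary}\r\n").as_bytes());
    body.extend_from_slice(
        format!(
            "Content-Disposition: form-data; name=\"{}\"; filename=\"{}\"\r\n",
            file.field_name, file.file_name
        )
        .as_bytes(),
    );
    body.extend_from_slice(format!("Content-Type: {}\r\n\r\n", file.content_type).as_bytes());
    body.extend_from_slice(file.bytes);
    body.extend_from_slice(b"\r\n");

    // Closing delimiter ends the whole body.
    body.extend_from_slice(format!("--{boundary}--\r\n").as_bytes());

    (format!("multipart/form-data; boundary={boundary}"), body)
}
```

For the OpenAI batch path named above, the text field would be `purpose` with the value `batch` and the file part would carry the input JSONL.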
YeonGyu-Kim
0121f20a09 roadmap: #222 filed — Models list endpoint typed taxonomy is structurally absent: zero GET /v1/models and zero GET /v1/models/{id} surface across rust/crates/api/src/providers/anthropic.rs and rust/crates/api/src/providers/openai_compat.rs (rg returns zero hits for /v1/models, list_models, fetch_models, get_models, available_models, model_catalog, ModelInfo, ModelList, ListModelsResponse, OwnedBy, ModelObject, ModelCatalog across rust/), zero Model / ModelInfo / ModelList / ListModelsResponse typed taxonomy in rust/crates/api/src/types.rs, zero list_models<'a>(&'a self) -> ProviderFuture<'a, ModelList> and zero retrieve_model<'a>(&'a self, model_id: &'a str) -> ProviderFuture<'a, ModelInfo> methods on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist, both per-request), zero list_models dispatch on the ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi, all closed under per-request synchronous dispatch), zero claw models / claw model list / claw list-models CLI subcommand surface at rust/crates/rusty-claude-cli/src/main.rs, zero /models slash command in the SlashCommandSpec table at rust/crates/commands/src/lib.rs, zero validation against an authoritative source on set_model at rust/crates/rusty-claude-cli/src/main.rs:4989-5037 (user can type /model claude-banana-9000 and the runtime accepts it, swaps the active model to that string, and only fails at request time when the upstream provider returns 404 / invalid_model_error), and the existing /providers slash command at rust/crates/commands/src/lib.rs:716-720 is just a literal alias for /doctor at rust/crates/commands/src/lib.rs:1386-1389 despite advertising summary: "List available model providers" (advertised-but-rerouted shape — actively misleading at the UX layer, distinct from #220's advertised-but-unbuilt shape because the parse arm dispatches to a *different* command entirely instead of returning 
a clear unsupported error) — the canonical model-discovery affordance is invisible across every CLI / REPL / slash-command / Provider-trait / ProviderClient-enum / data-model surface, leaving claw-code's local hardcoded 13-entry MODEL_REGISTRY (3 anthropic + 5 grok + 1 kimi + 4 prefix routes for openai/gpt/qwen/kimi at rust/crates/api/src/providers/mod.rs:52-134 and 166-225) and its 6-entry model_token_limit match arm (rust/crates/api/src/providers/mod.rs:277-301 covering claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251213, grok-3, grok-3-mini, kimi-k2.5, kimi-k1.5 — returns None for current production IDs claude-opus-4-7, claude-haiku-4-6, gpt-5.2, o3, o4-mini, kimi-k3, qwen3-max, grok-4, deepseek-reasoner) as the only model-name knowledge the runtime has access to, with no way to refresh it, no way to discover new model IDs that providers publish, no way to validate user-supplied model strings, no way to cross-link to the pricing_for_model cost estimator (#209 substring-matching gap), no way to cross-link to the model_token_limit preflight check (#210 max_tokens shadow-fork gap silently no-ops on unknown models), no way to cross-link to the future is_batch_request flag (#221 batch-dispatch gap requires knowing which models support batch), and USAGE.md:426-440 documents only six model rows out of nine MODEL_REGISTRY entries (kimi alias missing from the documented table, four prefix routes mentioned only in passing prose, zero documentation of /v1/models endpoint usage / zero documentation of model-catalog discovery / zero documentation of "what to do when your provider ships a new model that isn't in claw-code's hardcoded registry") — the canonical model-discovery affordance is **the most universally-available endpoint in the LLM API ecosystem** (older than /v1/chat/completions itself, older than /v1/embeddings, older than /v1/messages, the literal first endpoint after auth on every OpenAI-compat provider since 2020 and on Anthropic since 2024-12-04, 
GA-shipped first-class typed surfaces in every Python/TypeScript SDK in the ecosystem) and claw-code is the **sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/models integration AND a misleading /providers slash command that aliases to /doctor** — both gaps are unique to claw-code in the surveyed ecosystem (Jobdori cycle #374 / extends #168c emission-routing audit / explicit follow-on candidate from #221's seven-layer-endpoint-family-absence shape — the third of three named candidates: Files API typed taxonomy / Embeddings API typed taxonomy / Models list endpoint typed taxonomy, and the most clawability-impacting because it's the upstream root cause of three downstream gaps already catalogued in this audit / sibling-shape cluster grows to twenty-one: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221/#222 / wire-format-parity cluster grows to twelve: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221+#222 / capability-parity cluster grows to four: #218+#220+#221+#222 / discovery-and-validation cluster: #222 alone but it's the upstream root cause of #209's pricing-fallback gap, #210's max_tokens shadow-fork gap, and #221's batch-dispatch gap / eight-layer-endpoint-family-absence-with-misleading-alias shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + CLI-subcommand-surface + slash-command-surface-with-misleading-alias + set_model-validation + downstream-consumers-with-stale-data) is the largest single advertised-vs-actual gap catalogued, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215) / three-dimensional (#216) / classifier-leakage (#217) / four-layer (#218) / false-positive-opt-in (#219) / five-layer-feature-absence (#220) / seven-layer-endpoint-family-absence (#221) members; the advertised-but-rerouted shape is novel — strict-superset of #220's advertised-but-unbuilt because the parse arm 
dispatches to a *different* command instead of returning a clear unsupported error, applies to any future SlashCommandSpec entry where the summary field describes a feature different from what the parse arm dispatches to / external validation: Anthropic Models API reference at https://docs.anthropic.com/en/api/models-list documenting GET /v1/models GA 2024-12-04 with paginated before_id / after_id / limit and ModelInfo { id, type: "model", display_name, created_at } shape, Anthropic retrieve reference at https://docs.anthropic.com/en/api/models documenting GET /v1/models/{model_id} for single-model lookup, OpenAI Models API at https://platform.openai.com/docs/api-reference/models documenting the literal first endpoint after auth with Model { id, object: "model", created, owned_by } and ModelList { object: "list", data: Vec<Model> }, OpenAI Python SDK client.models.list() and client.models.retrieve(model_id) first-class typed surface, Anthropic Python SDK client.models.list() parallel surface GA-shipped 2024-12-04 alongside the API endpoint, Anthropic TypeScript SDK client.models.list(), AWS Bedrock ListFoundationModels API documenting Bedrock-anthropic-relay equivalent with FoundationModelSummary provider+model+modalities+active flag, Azure OpenAI Models reference with deployment-aware catalog, Vertex AI projects.locations.models.list for Vertex-published Anthropic/Gemini/3rd-party models, DeepSeek/Moonshot/Alibaba-DashScope/xAI parallel /v1/models OpenAI-compat shape, OpenRouter Models API at https://openrouter.ai/api/v1/models — the canonical "live model catalog with pricing" reference and the model that anomalyco/opencode-via-models.dev uses for pricing-data freshness, simonw/llm llm models and llm models default <model> first-class CLI subcommand backed by per-plugin model registration with models.dev-equivalent freshness, simonw/llm plugin-registration architecture for ad-hoc model addition, Vercel AI SDK 6 provider.languageModels() and 
provider.embeddingModels() first-class typed catalog APIs, LangChain init_chat_model(model_provider, model_name) reflective discovery via provider-defined catalogs and BaseChatModel.aget_models async catalog query, models.dev (https://models.dev) — community-maintained authoritative model catalog with pricing + capability flags + provider routing, used by anomalyco/opencode for pricing-data freshness with explicit fallback metadata when a model id isn't in the catalog (the canonical "external authoritative source for model metadata" reference), anomalyco/opencode models.dev integration with periodic refresh and explicit { provider: unknown, reason: not_in_pricing_table } fallback metadata, charmbracelet/crush typed catalog with provider+model+input/output-pricing, continue.dev config-file-driven catalog with auto-refresh from provider endpoints, zed-industries/zed bundled JSON catalog with periodic upstream refresh, TabbyML/tabby model catalog via plugin registration, llama.cpp server /v1/models local-model catalog via OpenAI-compat shape, LM Studio /v1/models local-model catalog, Ollama /api/tags and /v1/models local-model catalog with both Ollama-native and OpenAI-compat shapes, llamafile bundled-model catalog, LiteLLM models reference covering 100+ models at proxy level, portkey.ai gateway-level catalog, helicone.ai observability-platform model catalog with per-model usage stats, prompthub.us model-catalog-as-service, OpenTelemetry GenAI semconv gen_ai.request.model and gen_ai.response.model documented as required attributes for spans (every observability backend treats model as a first-class structured signal requiring authoritative-source validation), OpenAPI 3.1 spec for /v1/models at https://github.com/openai/openai-openapi as canonical machine-readable schema, Anthropic API stability versioning at https://docs.anthropic.com/en/api/versioning with anthropic-version header semver-stable since 2023-06-01 and models endpoint stable since 2024-12-04. 
Thirty-two ecosystem references, three first-class models-endpoint specs (Anthropic, OpenAI, OpenRouter), GA timeline of 16 months on Anthropic's side and 6+ years on OpenAI's side, eight first-class CLI/SDK implementations (Anthropic Python+TypeScript, OpenAI Python, simonw/llm, Vercel AI SDK, LangChain, Zed, charmbracelet/crush), seven first-class local-model catalogs (Ollama, LM Studio, llama.cpp server, llamafile, Tabby, Continue.dev, LiteLLM proxy), one community-maintained authoritative pricing source (models.dev) used by the closest peer coding agent. claw-code is the **sole client/agent/CLI in the surveyed coding-agent ecosystem with zero /v1/models integration AND a misleading /providers slash command that aliases to /doctor** — both gaps are unique to claw-code in the surveyed ecosystem, the model-discovery gap is the **upstream root cause** of three downstream cost-and-correctness gaps already catalogued in this audit (#209 / #210 / #221), and the misleading-alias-shape is novel within the cluster — #222 closes the upstream root cause of three downstream gaps and unblocks live-catalog-driven cost-estimation, max-tokens-validation, batch-capability-detection, and CLI-vs-slash-command-symmetry that the runtime's clawability doctrine treats as canonical baseline expectations. 2026-04-26 02:15:43 +09:00
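Note on #222: the `set_model` validation gap (a user typing `/model claude-banana-9000` and only failing at request time) could be closed by checking the requested ID against a fetched catalog. The sketch below assumes the catalog has already been retrieved from GET /v1/models; `ModelValidation` and `validate_model` are hypothetical names, and `ModelInfo` models only two of the fields cited in the commit message.

```rust
/// Hypothetical, reduced mirror of the `ModelInfo { id, type, display_name, created_at }`
/// shape cited in the commit message; only the fields used here are modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelInfo {
    pub id: String,
    pub display_name: String,
}

/// Result of checking a user-supplied model string against a catalog.
#[derive(Debug, PartialEq)]
pub enum ModelValidation<'a> {
    /// The ID is present in the catalog.
    Known(&'a ModelInfo),
    /// The ID is absent; the caller can warn or reject before any request is sent.
    Unknown { requested: String, catalog_size: usize },
}

/// Exact-match lookup of `requested` in `catalog`.
/// Assumption: the catalog was populated from a provider's models endpoint beforehand.
pub fn validate_model<'a>(catalog: &'a [ModelInfo], requested: &str) -> ModelValidation<'a> {
    match catalog.iter().find(|m| m.id == requested) {
        Some(info) => ModelValidation::Known(info),
        None => ModelValidation::Unknown {
            requested: requested.to_string(),
            catalog_size: catalog.len(),
        },
    }
}
```

Exact matching is a deliberate choice here: the commit message calls out substring matching in the pricing estimator (#209) as a gap, so the sketch avoids repeating it.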
YeonGyu-Kim
9acd4f14da roadmap: #221 filed — Message Batches API is structurally absent: zero /v1/messages/batches endpoint, zero /v1/batches endpoint, zero MessageBatch / BatchedRequest / BatchedResult / BatchProcessingStatus / BatchRequestCounts typed taxonomy across rust/crates/api/src/types.rs (zero hits for batches, MessageBatch, BatchedRequest, custom_id, processing_status), zero submit_batch / retrieve_batch / retrieve_batch_results / cancel_batch / list_batches methods on the Provider trait at rust/crates/api/src/providers/mod.rs:17-30 (only send_message and stream_message exist, both per-request synchronous), zero batch dispatch on ProviderClient enum at rust/crates/api/src/client.rs:8-14 (three variants Anthropic/Xai/OpenAi all closed under sync send_message + stream_message), zero BatchSubmittedEvent / BatchInProgressEvent / BatchEndedEvent typed events on the runtime telemetry sink, zero claw batch / claw batches CLI subcommand surface at rust/crates/rusty-claude-cli/src/main.rs, zero /batch slash command in SlashCommandSpec table at rust/crates/commands/src/lib.rs, zero pending_batches field in claw status --json output, zero is_batch_request flag on pricing_for_model cost estimator (so even if Batch API were wired, cost would over-charge by 2x), zero batch_input_tokens_per_million_usd / batch_output_tokens_per_million_usd fields in the Pricing struct — the API has been GA on Anthropic since 2024-10-08 (18 months ago at filing time, with explicit 'anthropic-beta: message-batches-2024-09-24' opt-in header documented) and on OpenAI since 2024-04-15 (24 months ago at filing time), uniformly offers 50% input-and-output token discount, accepts up to 100,000 requests per batch with 256MB total payload (Anthropic) or unlimited via Files API (OpenAI), 24-hour completion SLO; combining with #219's also-missing prompt-caching opt-in (90% input savings) gives a compounded ~95% input-cost asymmetry on bulk ingest scenarios — the single largest cost-reduction lever in the 
entire API parity audit, missing at the endpoint-family level rather than the per-field level (Jobdori cycle #373 / extends #168c emission-routing audit / sibling-shape cluster grows to twenty: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221 / wire-format-parity cluster grows to eleven: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221 / capability-parity cluster grows to three: #218+#220+#221 / cost-parity cluster grows to eight: #204+#207+#209+#210+#213+#216+#219+#221 — #221 compounds with #219 to ~95% bulk-ingest cost asymmetry, the largest cost gap in the cluster / seven-layer-endpoint-family-absence shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + Worker-registry-status-enum + CLI-subcommand-surface + pricing-tier-flag) is the largest single capability absence catalogued, exceeding #220's five-layer-feature-absence / endpoint-family-level absence shape is novel — applies to follow-on candidates 'Files API typed taxonomy is absent' (the OpenAI batch path's prerequisite endpoint, also absent), 'Embeddings API typed taxonomy is absent' (/v1/embeddings cross-cutting), 'Models list endpoint typed taxonomy is absent' (/v1/models / Anthropic Models API) / external validation: Anthropic Message Batches API reference at https://docs.anthropic.com/en/api/messages-batches documenting five operations on /v1/messages/batches + GA 2024-10-08 + 50% discount + 100k-requests-per-batch + 256MB-total-payload + 24-hour-SLO + custom_id correlation field, Anthropic launch announcement at anthropic.com/news/message-batches-api documenting '50% off both input and output tokens' positioning, Anthropic Pricing page documenting Batch API column with 50% across Sonnet 3.5/4/4.5/4.6 + Opus 3/4/4.6 + Haiku 3.5, Anthropic Python SDK client.messages.batches.create(requests=[...]) first-class typed surface, Anthropic TypeScript SDK parallel surface, AWS Bedrock InvokeModelBatch / 
batch-inference docs (Bedrock-anthropic-relay path), OpenAI Batch API reference at platform.openai.com/docs/api-reference/batch documenting GA 2024-04-15 + 50% discount + JSONL-via-Files-API + completion_window:'24h', OpenAI launch announcement at openai.com/index/openai-introduces-batch-api documenting 'process batches asynchronously and receive results within 24 hours at a 50% discount', DeepSeek/Moonshot/Alibaba-DashScope/xAI batch-inference parallel surfaces, OpenRouter batch passthrough, simonw/llm --batch flag, Vercel AI SDK generateBatch + provider-specific batch passthrough, LangChain Runnable.batch() + Runnable.abatch() first-class Python+TypeScript parity, LangSmith batch-aware tracing, llmindset.co.uk independent cost-calculus validation, Medium 'process 10,000 queries without breaking the bank' tutorial, Steve Kinney's Anthropic-Batch-with-Temporal workflow-orchestration article, ai.moda Anthropic-Batch+Caching 95%-compounded-savings analysis (proves #219+#221 together close the largest cost gap), VentureBeat industry-press coverage, Reddit r/ClaudeAI launch thread, zed-industries/zed#19945 (peer ecosystem with same gap), RooCodeInc/Roo-Code#8667 (peer ecosystem with same gap), n8n Anthropic-batch-processing workflow, startground.com batch-deals tracker, silicondata.com 2026-pricing per-model batch breakdown, Hacker News batch-mechanics discussions, OpenTelemetry GenAI semconv gen_ai.request.batch_id + gen_ai.batch.processing_status + gen_ai.batch.request_counts documented attributes, IANA application/x-ndjson + application/jsonl MIME-type registrations / claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero batch-dispatch capability despite the API being GA on both major providers for 18+ months — parity floor against every other CLI/SDK/coding-agent in 2024-2025, the largest single cost-reduction lever in the entire emission-routing audit, and the largest endpoint-family-level capability gap catalogued so far) 
2026-04-26 01:45:20 +09:00
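Note on #221: the cost arithmetic in the message (50% batch discount, 90% cache-read savings from #219, roughly 95% compounded, and a 2x over-charge when the estimator ignores the batch flag) can be checked with a small sketch. It assumes the two discounts multiply, which is how the "compounded" figure reads; the function names and the per-million price used in the usage note are hypothetical.

```rust
/// Batch discount on input and output tokens, as stated in the commit message.
pub const BATCH_DISCOUNT: f64 = 0.50;
/// Cache-read savings on input tokens, as stated for #219.
pub const CACHE_READ_DISCOUNT: f64 = 0.90;

/// Fraction of the list price actually paid for input tokens.
/// Assumption: the batch and cache-read discounts compound multiplicatively.
pub fn input_cost_multiplier(is_batch_request: bool, is_cache_read: bool) -> f64 {
    let mut multiplier = 1.0;
    if is_batch_request {
        multiplier *= 1.0 - BATCH_DISCOUNT;
    }
    if is_cache_read {
        multiplier *= 1.0 - CACHE_READ_DISCOUNT;
    }
    multiplier
}

/// Estimated input cost in USD for `tokens` at `usd_per_million` list price.
pub fn input_cost_usd(
    tokens: u64,
    usd_per_million: f64,
    is_batch_request: bool,
    is_cache_read: bool,
) -> f64 {
    tokens as f64 / 1_000_000.0 * usd_per_million * input_cost_multiplier(is_batch_request, is_cache_read)
}
```

With both flags set the multiplier is 0.5 × 0.1 = 0.05, i.e. the ~95% reduction cited above. An estimator with no batch flag reports twice the real batch cost at any list price, which is the 2x over-charge the message describes.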
YeonGyu-Kim
d46c423c1d roadmap: #220 filed — Image/vision input is structurally impossible across the entire data model: zero image content-block taxonomy variant on InputContentBlock (types.rs:80-94 has only Text/ToolUse/ToolResult — three of three exhaustive variants, zero Image, zero Document, zero MediaType, zero ImageSource, zero base64/file_id slot, zero media_type field anywhere in rust/crates/api/src/), zero parse arm for /image <path> and /screenshot slash commands despite their advertised summaries ("Add an image file to the conversation" at commands/lib.rs:585, "Take a screenshot and add to conversation" at commands/lib.rs:578) being in the canonical SlashCommandSpec table since project inception, both gated under STUB_COMMANDS at main.rs:8381-8382 (UX patch over missing-feature, not missing-feature fix), ResolvedAttachment at tools/lib.rs:2660-2666 carries path/size/is_image triple but no bytes / no base64 / no media_type / no upload affordance / no transport-ready payload despite is_image_path at line 5276 correctly classifying png/jpg/jpeg/gif/webp/bmp/svg extensions and the SendUserMessage/Brief tool surfacing isImage: true in JSON envelope (asserted at line 8969); build_chat_completion_request (openai_compat.rs:845) and translate_message (openai_compat.rs:946) have three-arm exhaustive matches over Text/ToolUse/ToolResult with no Image arm and no {type: "image", source: {type: "base64", media_type, data}} Anthropic-canonical wire shape and no {type: "image_url", image_url: {url: "data:image/...;base64,..."}} OpenAI-compat wire shape; the markdown renderer at render.rs:379-426 handles Tag::Image and TagEnd::Image for *output* rendering (asymmetric capability — model emits image markdown → rendered as colored [image:url] link, user attaches image → silent black hole at API boundary); the runtime's own worker_boot test fixture at worker_boot.rs:1324+:1349 literally hard-codes "Explain this KakaoTalk screenshot for a friend" as the canonical task-classification 
example for worker prompt-mismatch recovery — claw-code uses screenshot analysis as a runtime-classifier signal while having zero capability to actually send a screenshot to the model; TUI-ENHANCEMENT-PLAN.md:57 backlogs the gap as "No image/attachment preview" but the gap is far worse than no preview — there is no transport, no codec, no envelope, no anything from the byte stream to the wire (Jobdori cycle #372 / extends #168c emission-routing audit / sibling-shape cluster grows to nineteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220 / wire-format-parity cluster grows to ten: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220 / capability-parity cluster (strict-superset including user-facing surfacing): #218+#220 / five-layer-structural-absence shape (data-model-variant + slash-command-parse-arm + attachment-metadata-threading + request-builder-translation + OS-integration-helper) is the largest single feature absence yet catalogued, exceeding #218's four-layer; advertised-but-unbuilt shape is novel — UX-layer cousin of #219's false-positive-opt-in shape — applicable to other STUB_COMMAND entries with capability-claim summaries / claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero image-input capability despite Anthropic Vision GA on 2024-03-04 (25 months ago at filing time, default-on for all Claude 3.5+ models with 5MB-per-image / 32MB-per-request / 100-images-per-request limits) and OpenAI Vision GA on 2024-05-13 (23 months ago) and Google Gemini multimodal GA on 2024-02-15 (26 months ago), making this a regression against the upstream claude-code CLI claw-code is porting from / external validation: Anthropic Vision API reference at platform.claude.com/docs/en/build-with-claude/vision documenting the canonical {type, source: {type, media_type, data}} content block, Anthropic Messages API reference, Anthropic Files API beta with file_id reference for repeated-image-use efficiency, 
AWS Bedrock prompt-caching docs with image-block coverage and 20-images-per-request stricter limit and same cachePoint:{} pattern from #219, OpenAI Vision API reference documenting the {type:image_url, image_url:{url}} data-URL shape used by GPT-4o/4o-mini/5-vision/o1-vision/o3-vision/DeepSeek-VL2/Qwen-VL/QwQ-VL/MiniMax-VL/Moonshot kimi-VL, Google Gemini multimodal API documenting {inline_data:{mime_type, data}} shape, anomalyco/opencode#16184 (look_at tool image-file-from-disk handling bug), anomalyco/opencode#15728 (Read tool image-handling bug), anomalyco/opencode#8875 (custom-provider attachment-allowlist gap), anomalyco/opencode#17205 (text-only-model token-burn on image attachment) — all four are integration-quality gaps in opencode while claw-code is missing the capability entirely (~85% vs 0% parity asymmetry, the largest in the cluster), charmbracelet/crush vision-input via terminal paste, simonw/llm --attachment flag, Vercel AI SDK experimental_attachments + image content blocks, LangChain HumanMessage content blocks, LangGraph image-message routing, OpenAI Python and Anthropic Python SDK first-class image-typed messages, anthropic-quickstarts vision examples, claude-code official CLI paste-image and screenshot shortcuts (the upstream this is a regression against), OpenTelemetry GenAI semconv gen_ai.input.attachments and gen_ai.input.images.count multimodal observability attributes, IANA MIME-type registry RFC 4288/4289) 2026-04-26 01:18:43 +09:00
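Note on #220: the two wire shapes quoted in the message (Anthropic's `{type: "image", source: {type: "base64", media_type, data}}` and the OpenAI-compat `image_url` data URL) can be sketched as plain string builders. The function names are hypothetical, the image bytes are assumed to be base64-encoded already, and only the four media types named in this log (png, jpeg, gif, webp) are mapped; bmp and svg, which `is_image_path` also accepts, are treated as unsupported here as an assumption.

```rust
use std::path::Path;

/// Maps a file extension to a media type. Returns None for anything else,
/// including bmp and svg (assumed not accepted by the provider APIs).
pub fn media_type_for_path(path: &str) -> Option<&'static str> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "gif" => Some("image/gif"),
        "webp" => Some("image/webp"),
        _ => None,
    }
}

/// Renders the Anthropic-style image content block quoted in the commit message.
/// Assumption: `data_base64` is already base64 text, so no JSON escaping is needed.
pub fn anthropic_image_block(media_type: &str, data_base64: &str) -> String {
    format!(
        r#"{{"type":"image","source":{{"type":"base64","media_type":"{media_type}","data":"{data_base64}"}}}}"#
    )
}

/// Renders the OpenAI-compat image_url block with a data URL, as quoted in the commit message.
pub fn openai_image_block(media_type: &str, data_base64: &str) -> String {
    format!(
        r#"{{"type":"image_url","image_url":{{"url":"data:{media_type};base64,{data_base64}"}}}}"#
    )
}
```

A typed fix would add an Image variant to `InputContentBlock` and a matching arm in `translate_message`; the string builders above only show what each arm would need to emit.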
YeonGyu-Kim
2858aeccff roadmap: #219 filed — Anthropic prompt-caching opt-in is structurally impossible: cache_control marker has zero codebase footprint (rg returns 0 hits across rust/ src/ docs/ tests/) despite the wire-side beta header 'prompt-caching-scope-2026-01-05' being unconditionally enabled at every Anthropic request (telemetry/lib.rs:16,452,469 + anthropic.rs:1443); five cacheable surfaces are uniformly locked: pub system: Option<String> at types.rs:11 is a flat string with no array form so no system-block cache_control slot exists; InputContentBlock variants Text/ToolUse/ToolResult at types.rs:80-99 have no cache_control field; ToolResultContentBlock variants Text/Json at types.rs:100-103 have no cache_control field; ToolDefinition at types.rs:105-110 has no cache_control field; openai_compat path translate_message at openai_compat.rs:946 and build_chat_completion_request at openai_compat.rs:850 emit flat-string system+content with no cache_control or Bedrock cachePoint translation; ~600 LOC of response-side cache stats infrastructure (prompt_cache.rs PromptCacheStats / PromptCacheRecord / PromptCache trait) accumulates a zero stream because no payload was opted in, and four hardcoded zero-coercion sites (openai_compat.rs:477-478, 489-490, 597-598, 1211-1212) discard upstream cache stats from Bedrock/Vertex/kimi-anthropic-compat/MiniMax-relay even when emitted; integration test at client_integration.rs:88-89 asserts the beta header is sent but no companion test asserts payload contains a cache_control marker because the data structures cannot produce one — a uniquely paradoxical false-positive opt-in shape: wire signal advertises caching intent and data-model structurally precludes it (Jobdori cycle #371 / extends #168c emission-routing audit / sibling-shape cluster grows to eighteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219 / wire-format-parity cluster grows to nine: #211+#212+#213+#214+#215+#216+#217+#218+#219 / 
cost-parity cluster grows to seven: #204+#207+#209+#210+#213+#216+#219 — #219 is the dominant cost-parity miss, ~90% input-token-cost reduction unattainable / cache-parity request/response symmetry pair: #219 (request-side opt-in absent) + #213 (response-side stats absent on openai-compat lane) / five-surface uniform-structural-absence shape: system+tools+tool_choice+messages+tool_result_content all locked, with no extra_body escape hatch since cache_control is a per-block annotation not a top-level field / false-positive-opt-in shape: novel cluster member where wire signal says yes and structure says no / external validation: Anthropic prompt-caching reference at platform.claude.com/docs/en/build-with-claude/prompt-caching documenting cache_control: {type: ephemeral} on system/tools/messages/content blocks with 5-min default TTL and 1-hour optional TTL and 90% cost reduction on cache-read tokens, Anthropic Messages API reference documenting system: Vec<SystemBlock> array form as the cacheable shape, Bedrock prompt-caching docs documenting cachePoint: {} block form for Bedrock-anthropic relay, claudecodecamp.com analysis of how prompt caching actually works in Claude Code, xda-developers article documenting claude-code's cache-token-budget knob proving caching is actively engaged, anomalyco/opencode#5416 #14203 #16848 #17910 #20110 #20265 (cache-related issues and PR for system-prompt-split-for-cache-hit-rate optimization), opencode-anthropic-cache npm package as third-party plugin proving the ecosystem expectation, LangChain anthropicPromptCachingMiddleware as first-class JS wrapper, LiteLLM prompt-caching docs with single-line cache_control pass-through for Anthropic+Bedrock, Vercel AI SDK Anthropic provider providerOptions.anthropic.cacheControl, prompthub.us multi-provider comparison treating opt-in as documented baseline, portkey.ai gateway-level pass-through, mindstudio.ai cost-impact analysis, OpenTelemetry GenAI semconv gen_ai.usage.input_tokens.cached as 
documented attribute — claw is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero cache_control request-side opt-in capability despite shipping the eligibility beta header on every Anthropic request) 2026-04-26 00:40:20 +09:00
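Note on #219: the message's core point is that a flat `system: Option<String>` has no slot for a per-block `cache_control` marker, whereas the array form does. The sketch below shows that array form rendered by hand. `SystemBlock`, `CacheControl`, and `render_system` are hypothetical names for illustration, and a real implementation would presumably derive serde serialization rather than format JSON manually.

```rust
/// Minimal JSON string escaping for the sketch; covers quotes, backslashes and control chars.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Hypothetical per-block cache annotation; only the ephemeral type cited above is modelled.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum CacheControl {
    Ephemeral,
}

/// Hypothetical array-form system block with an optional cache_control slot.
#[derive(Debug, Clone, PartialEq)]
pub struct SystemBlock {
    pub text: String,
    pub cache_control: Option<CacheControl>,
}

impl SystemBlock {
    /// Renders one block; the cache_control key is emitted only when the block opted in.
    pub fn to_json(&self) -> String {
        let text = json_escape(&self.text);
        match self.cache_control {
            Some(CacheControl::Ephemeral) => format!(
                r#"{{"type":"text","text":"{text}","cache_control":{{"type":"ephemeral"}}}}"#
            ),
            None => format!(r#"{{"type":"text","text":"{text}"}}"#),
        }
    }
}

/// Renders the `system` field as a JSON array of blocks instead of a flat string.
pub fn render_system(blocks: &[SystemBlock]) -> String {
    let parts: Vec<String> = blocks.iter().map(SystemBlock::to_json).collect();
    format!("[{}]", parts.join(","))
}
```

With a shape like this, the missing companion test mentioned above becomes writable: it can assert that the rendered payload contains a `cache_control` marker whenever the beta header is sent.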
YeonGyu-Kim
116a95a253 roadmap: #218 filed — MessageRequest has no response_format / output_config / seed / logprobs / top_logprobs / logit_bias / n / metadata fields (types.rs:6-36, thirteen fields, zero hits across rust/ for any of these); build_chat_completion_request (openai_compat.rs:845) writes thirteen optional fields and emits none of these on the wire; AnthropicClient::send_raw_request (anthropic.rs:466) renders same MessageRequest via render_json_body (telemetry/lib.rs:107) with same gaps; ChatMessage (openai_compat.rs:688) has three fields (role, content, tool_calls) and no refusal field despite the streaming-aggregator test at line 1781 explicitly including "refusal": null in test data — silent serde drop; ChunkDelta (openai_compat.rs:735) has same gap; OutputContentBlock (types.rs:147) has four variants (Text, ToolUse, Thinking, RedactedThinking) and no Refusal variant; MessageResponse.stop_reason (types.rs:127) has no slot for Anthropic's 2025-11+ stop_reason='refusal' value; net effect: claw cannot opt into OpenAI strict-schema constrained decoding (response_format json_schema, GA 2024-08), cannot opt into Anthropic GA structured outputs (output_config.format, GA 2025-11-13), cannot opt into legacy JSON mode (response_format json_object), cannot supply seed for reproducible sampling, cannot request logprobs/top_logprobs, cannot bias tokens via logit_bias, cannot request multiple completions via n, and silently discards every refusal string OpenAI emits when constrained decoding rejects a generation — refusals classified as Finished/success with empty content via #217 normalize_finish_reason mapping (Jobdori cycle #370 / extends #168c emission-routing audit / sibling-shape cluster grows to seventeen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218 / wire-format-parity cluster grows to eight: #211+#212+#213+#214+#215+#216+#217+#218 / four-layer-structural-absence shape: request-struct-field + request-builder-write + 
response-struct-field + content-block-taxonomy-variant, largest single-feature absence catalogued / external validation: OpenAI Structured Outputs guide, OpenAI Chat Completions API reference, Anthropic structured-outputs reference (GA 2025-11-13), Anthropic Messages API reference (stop_reason='refusal'), Vercel AI Gateway Anthropic structured outputs, Vercel AI SDK 6 generateObject + Zod, LangChain with_structured_output, simonw/llm --schema flag, charmbracelet/crush, anomalyco/opencode#10456 open feature request citing OpenAI Codex as reference, anomalyco/opencode#5639/#11357/#13618, OpenAI Codex CI/code-review cookbook, OpenRouter structured-outputs docs, OpenAI Python SDK client.beta.chat.completions.parse, OpenTelemetry GenAI semconv gen_ai.request.response_format + gen_ai.response.refusal) 2026-04-26 00:13:01 +09:00
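Note on #218: the silent drop of the `refusal` string can be illustrated with a reduced message type that keeps the field and a classifier that checks it before treating the turn as content. This is a sketch under assumptions: the struct omits `tool_calls`, and `Outcome` and `classify` are hypothetical names rather than claw-code items.

```rust
/// Hypothetical reduced ChatMessage that retains the `refusal` field.
/// The real struct is described above as having role, content and tool_calls only.
#[derive(Debug, Clone, PartialEq)]
pub struct ChatMessage {
    pub role: String,
    pub content: Option<String>,
    pub refusal: Option<String>,
}

/// What the runtime should conclude from one assistant message.
#[derive(Debug, PartialEq)]
pub enum Outcome<'a> {
    /// Normal text content was produced.
    Content(&'a str),
    /// The provider refused; the refusal text is preserved instead of discarded.
    Refused(&'a str),
    /// Neither content nor refusal was present.
    Empty,
}

/// Checks the refusal field first so a refusal is never reported as an empty success.
pub fn classify(msg: &ChatMessage) -> Outcome<'_> {
    if let Some(reason) = msg.refusal.as_deref().filter(|r| !r.is_empty()) {
        return Outcome::Refused(reason);
    }
    match msg.content.as_deref() {
        Some(text) if !text.is_empty() => Outcome::Content(text),
        _ => Outcome::Empty,
    }
}
```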
YeonGyu-Kim
91e290526a roadmap: #217 filed — normalize_finish_reason (openai_compat.rs:1389) is a two-arm match (stop→end_turn, tool_calls→tool_use) with a string-passthrough fallthrough that drops three of five OpenAI-spec finish reasons (length, content_filter, function_call); MessageResponse.stop_reason is Option<String> with no enum constraint; WorkerRegistry::observe_completion (worker_boot.rs:558) classifies failure on finish=='unknown'||finish=='error' only, so OpenAI/DeepSeek/Moonshot truncation (length) and content-policy refusal (content_filter) become WorkerStatus::Finished with success events; the streaming aggregator's tool-call-block-close branch at openai_compat.rs:537 keys on 'tool_calls' literal and never fires for legacy 'function_call' shape (Azure pre-2024-02-15 / DeepSeek pre-2025-08 / SiliconFlow / OpenRouter relays); Anthropic native path produces the canonical taxonomy correctly (Jobdori cycle #369 / extends #168c emission-routing audit / sibling-shape cluster grows to sixteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217 / wire-format-parity cluster grows to seven: #211+#212+#213+#214+#215+#216+#217 / classifier-leakage shape: response-side string mistranslation flows three layers deep into runtime classifier with two-literal-compare coverage / external validation: OpenAI Chat Completions API reference, Anthropic Messages API reference, OpenAI function_call deprecation notice, Azure OpenAI reference, DeepSeek/Moonshot/DashScope refs, anomalyco/opencode#19842, charmbracelet/crush typed enum, simonw/llm Reason enum, Vercel AI SDK FinishReason union, LangChain LengthFinishReasonError/ContentFilterFinishReasonError, semantic-kernel FinishReason enum, openai-python Literal type, OpenTelemetry GenAI gen_ai.response.finish_reasons spec) 2026-04-25 23:39:13 +09:00
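Note on #217: a typed mapping that covers all five OpenAI finish reasons, instead of two arms plus string passthrough, might look like the sketch below. The `StopReason` enum and `is_failure` helper are hypothetical, and mapping `content_filter` to its own variant (rather than to an existing Anthropic stop reason) is an assumption made for illustration.

```rust
/// Hypothetical typed replacement for the `Option<String>` stop_reason described above.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    ToolUse,
    MaxTokens,
    ContentFilter,
    /// Anything unrecognised is kept verbatim so it can still be logged.
    Other(String),
}

/// Maps the five OpenAI-spec finish reasons onto the typed enum.
/// The legacy `function_call` value is treated like `tool_calls`.
pub fn normalize_finish_reason(raw: &str) -> StopReason {
    match raw {
        "stop" => StopReason::EndTurn,
        "tool_calls" | "function_call" => StopReason::ToolUse,
        "length" => StopReason::MaxTokens,
        "content_filter" => StopReason::ContentFilter,
        other => StopReason::Other(other.to_string()),
    }
}

/// Classifier sketch: only a clean end of turn or a tool call counts as success,
/// so truncation and content filtering no longer pass as finished work.
pub fn is_failure(reason: &StopReason) -> bool {
    !matches!(reason, StopReason::EndTurn | StopReason::ToolUse)
}
```

Inverting the check (success is an allow-list, everything else fails) is the design point: the two-literal compare on 'unknown' and 'error' described above lets every new or unmapped reason through as success.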
YeonGyu-Kim
ceb092abd7 roadmap: #216 filed — neither MessageRequest nor MessageResponse has any service_tier field; build_chat_completion_request (openai_compat.rs:845) writes thirteen optional fields (model, max_tokens/max_completion_tokens, messages, stream, stream_options, tools, tool_choice, temperature, top_p, frequency_penalty, presence_penalty, stop, reasoning_effort) and does not write service_tier; AnthropicClient::send_raw_request (anthropic.rs:466) renders the same MessageRequest struct via AnthropicRequestProfile::render_json_body (telemetry/lib.rs:107) which has no field for it either, only a per-client extra_body escape hatch (asymmetric — openai_compat path has zero hits for extra_body); ChatCompletionResponse / ChatCompletionChunk / OpenAiUsage all deserialize four fields each, dropping the upstream-echoed service_tier confirmation and the system_fingerprint reproducibility marker that OpenAI documents as the canonical "what backend served you" signal; claw cannot opt into OpenAI flex (~50% cheaper async batch — developers.openai.com/api/docs/guides/flex-processing), cannot opt into OpenAI priority (~1.5-2x premium SLA latency — developers.openai.com/api/docs/guides/priority-processing), cannot opt into Anthropic priority (auto/standard_only — platform.claude.com/docs/en/api/service-tiers), and cannot detect at the response layer whether a request was flex-served or silently upgraded to priority by a project-level default override (Jobdori cycle #368 / extends #168c emission-routing audit / sibling-shape cluster grows to fifteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216 / wire-format-parity cluster grows to six: #211+#212+#213+#214+#215+#216 / cost-parity cluster grows to six: #204+#207+#209+#210+#213+#216 / three-dimensional-structural-absence shape: request-side write + response-side read + reproducibility marker, distinct from prior request-only #211#212 / response-only #207#213#214 / header-only #215 members / external 
validation: OpenAI flex/priority/scale-tier guides, OpenAI advanced-usage system_fingerprint guide, Anthropic service-tiers reference, OpenTelemetry GenAI semconv gen_ai.openai.request.service_tier + gen_ai.openai.response.service_tier + gen_ai.openai.response.system_fingerprint, anomalyco/opencode#12297, Vercel AI SDK serviceTier provider option, LangChain ChatOpenAI service_tier ctor param, LiteLLM service_tier pass-through, semantic-kernel OpenAIPromptExecutionSettings.ServiceTier, openai-python SDK client.chat.completions.create(service_tier=...) first-class kwarg, MiniMax/DeepSeek Anthropic-compat layer notes, badlogic/pi-mono#1381) 2026-04-25 23:12:25 +09:00
YeonGyu-Kim
2da12117eb roadmap: #215 filed — expect_success reads only request-id/x-request-id headers and discards the rest; both OpenAiCompatClient::send_with_retry and AnthropicClient::send_with_retry sleep on pure exponential backoff (2^(n-1) * initial + jitter) that ignores upstream Retry-After (RFC 7231 §7.1.3, mandated by Anthropic on 429, emitted by OpenAI/DeepSeek/Moonshot/DashScope on 429/503/529); ApiError::Api has no retry_after field, scheduler has no input port for it; on a 60s server-specified cooldown, claw burns 3 retries in <8s against a closed gate then surfaces RetriesExhausted (Jobdori cycle #367 / extends #168c emission-routing audit / sibling-shape cluster grows to fourteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215 / upstream-contract-honoring trio: #211+#213+#215 / wire-format-parity cluster: #211+#212+#213+#214+#215 / external validation: Anthropic rate-limits docs, OpenAI cookbook, DeepSeek rate-limit docs, RFC 7231 §7.1.3, openai-python#957, Vercel AI SDK LanguageModelV1RateLimit.retryAfter, LangChain BaseChatOpenAI, anomalyco/opencode#16993/#16994/#9091/#17583/#11705, charmbracelet/crush, LiteLLM Router.retry_after_strategy) 2026-04-25 22:41:49 +09:00
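A minimal sketch of the direction #215 points at, not the claw-code implementation. The helper names `parse_retry_after_secs`, `exponential_backoff`, and `next_delay` are hypothetical. Only the delta-seconds form of Retry-After is handled, and jitter is left out so the arithmetic stays deterministic.

```rust
use std::time::Duration;

/// Parse the delta-seconds form of a Retry-After header value.
/// The HTTP-date form is not handled in this sketch and yields None.
fn parse_retry_after_secs(value: &str) -> Option<Duration> {
    value.trim().parse::<u64>().ok().map(Duration::from_secs)
}

/// Exponential backoff for the n-th attempt (1-based): 2^(n-1) * initial.
/// Jitter is omitted here (assumption made for the example only).
fn exponential_backoff(attempt: u32, initial: Duration) -> Duration {
    let factor = 2u32.saturating_pow(attempt.saturating_sub(1));
    initial.saturating_mul(factor)
}

/// Wait at least as long as the server asked; otherwise use plain backoff.
fn next_delay(attempt: u32, initial: Duration, retry_after: Option<Duration>) -> Duration {
    let backoff = exponential_backoff(attempt, initial);
    match retry_after {
        Some(server_hint) => backoff.max(server_hint),
        None => backoff,
    }
}
```

With a 60s server cooldown and a 1s initial delay, `next_delay(1, initial, Some(60s))` returns 60s rather than 1s, so the retries are not spent against a closed gate.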
YeonGyu-Kim
959bdf8491 roadmap: #214 filed — ChunkDelta and ChatMessage in openai_compat.rs deserialize only content/tool_calls; delta.reasoning_content (sibling to delta.content, the canonical wire field for DeepSeek deepseek-reasoner / Alibaba Qwen3-Thinking / QwQ / vLLM reasoning-parser backends) is silently discarded at serde-deserialize time before any handler sees it; non-streaming ChatMessage has the same gap; is_reasoning_model classifier already returns true for o1/o3/o4/grok-3-mini/qwen-qwq/qwq/*thinking* and is consulted at line 901 to strip request-side tuning params but never on the response side to opt into reasoning_content extraction; local taxonomy already declares OutputContentBlock::Thinking and ContentBlockDelta::ThinkingDelta and the Anthropic native path correctly emits both with full test coverage at sse.rs:260,288 — the openai-compat translator has the destination types one import away and never bridges to them (Jobdori cycle #366 / extends #168c emission-routing audit / sibling-shape cluster grows to thirteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214 / reasoning-fidelity trio: #207+#211+#214 / wire-format-parity cluster: #211+#212+#213+#214 / external validation: DeepSeek API docs, vLLM reasoning-outputs, anomalyco/opencode#24124, charmbracelet/crush, simonw/llm, Vercel AI SDK, LangChain BaseChatOpenAI, LiteLLM, continue.dev#9245) 2026-04-25 22:16:02 +09:00
YeonGyu-Kim
347102d83b roadmap: #213 filed — OpenAiUsage struct does not deserialize prompt_tokens_details.cached_tokens (OpenAI 2024-10) or prompt_cache_hit_tokens (DeepSeek); openai_compat path hardcodes cache_creation_input_tokens: 0 and cache_read_input_tokens: 0 at four sites; cost estimator computes $0 cache savings for every OpenAI/DeepSeek/Moonshot kimi request even when upstream prompt cache is hitting; Anthropic native path correctly populates same Usage fields from native wire format (Jobdori cycle #365 / extends #168c emission-routing audit / sibling-shape cluster grows to twelve: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213 / cost-parity cluster: #204+#207+#209+#210+#213 / wire-format-parity cluster: #211+#212+#213 / external validation: OpenAI prompt caching docs, DeepSeek pricing docs, anomalyco/opencode#17223/#17121/#17056/#11995, Vercel AI SDK cachedInputTokens, charmbracelet/crush, simonw/llm) 2026-04-25 21:42:54 +09:00
Jobdori
c00981896f roadmap: #212 filed — MessageRequest+ToolChoice cannot express parallel_tool_calls (OpenAI top-level) or disable_parallel_tool_use (Anthropic tool_choice modifier); zero hits across rust/ src/ tests/ docs/; ToolChoice is 3-variant enum with no modifier slot; openai_tool_choice mapper has 3-arm match no parallel path; provider default is parallel-on, claw cannot opt out (Jobdori cycle #364 / extends #168c emission-routing audit / sibling-shape cluster grows to eleven: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212 / wire-format-parity cluster: #211+#212 / external validation: Anthropic docs, OpenAI API reference, LangChain BaseChatOpenAI, anomalyco/opencode, charmbracelet/crush#1061) 2026-04-25 21:10:50 +09:00
YeonGyu Kim
f004f74ffa roadmap: #211 filed — build_chat_completion_request selects max_tokens_key only on wire_model.starts_with("gpt-5"), sending legacy max_tokens to OpenAI o1/o3/o4-mini reasoning models which reject it with unsupported_parameter; is_reasoning_model classifier 90 lines above already knows o-series is reasoning, taxonomy half-applied within 30-line span; no test for any o-series model (Jobdori cycle #363 / extends #168c emission-routing audit / sibling-shape cluster grows to ten: #201/#202/#203/#206/#207/#208/#209/#210/#211 / external validation: charmbracelet/crush#1061, simonw/llm#724, HKUDS/DeepTutor#54) 2026-04-25 20:38:43 +09:00
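A hedged sketch of how the key selection described in #211 could consult the reasoning classifier. The body of `is_reasoning_model` below is a stand-in covering only the o-series prefixes this entry names, and `max_tokens_key` as a free function is an assumption; the real classifier in openai_compat.rs covers more models.

```rust
/// Stand-in for the classifier named in the commit message.
/// Assumption: only the o-series prefixes mentioned in #211 are listed.
fn is_reasoning_model(wire_model: &str) -> bool {
    ["o1", "o3", "o4"]
        .iter()
        .any(|prefix| wire_model.starts_with(prefix))
}

/// Choose the JSON key used for the output-token cap.
/// Reasoning models and gpt-5 take the newer key; everything else keeps the legacy one.
fn max_tokens_key(wire_model: &str) -> &'static str {
    if wire_model.starts_with("gpt-5") || is_reasoning_model(wire_model) {
        "max_completion_tokens"
    } else {
        "max_tokens"
    }
}
```

A test per o-series model name would close the coverage gap the entry notes.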
YeonGyu-Kim
02252a8585 roadmap: #210 filed — rusty-claude-cli shadows api::max_tokens_for_model with stripped 2-branch fork (opus=32k, else=64k); ignores model_token_limit registry, bypasses plugin maxOutputTokens override, silently sends 64_000 for kimi-k2.5 whose registry cap is 16_384 (4x over) (Jobdori cycle #362 / extends #168c emission-routing audit / sibling-shape cluster grows to nine: #201/#202/#203/#206/#207/#208/#209/#210) 2026-04-25 20:06:43 +09:00
YeonGyu-Kim
134e945a01 roadmap: #209 filed — pricing_for_model substring-matches haiku/opus/sonnet only; default_sonnet_tier function name carries Opus pricing constants (15.0/75.0 vs real Sonnet 3.0/15.0); every non-Anthropic model silently falls back producing 5-100x wrong cost estimates with no event signal, only a magic-string suffix on one summary line; rusty-claude-cli session JSON and anthropic.rs telemetry emit cost without pricing_source field (Jobdori cycle #361 / cost-parity cluster closer to #204+#207 / models.dev parity gap vs anomalyco/opencode) 2026-04-25 19:42:37 +09:00
Jobdori
c20d0330c1 roadmap: #208 filed — silent param/field strip on outbound serialization (4 tuning params for reasoning models, is_error for kimi), self-documenting 'silently strip' comments, no event emission, tests assert removal but not visibility (Jobdori cycle #359 / sibling-chain closer to #207 inbound-drop / completes OpenAI-compat boundary audit) 2026-04-25 19:06:56 +09:00
YeonGyu-Kim
ba3a34d6fe roadmap: #207 filed — OpenAiUsage discards prompt_tokens_details.cached_tokens and completion_tokens_details.reasoning_tokens, cache_read_input_tokens hardcoded 0 in 4 sites breaking cost parity with Anthropic path (Jobdori cycle #358 / fix-pair with #204 / anomalyco/opencode #24233 sibling) 2026-04-25 18:34:44 +09:00
YeonGyu-Kim
0e9cff588d roadmap: #206 filed — normalize_finish_reason covers 2/5 OpenAI finish reasons, length/content_filter/function_call unmapped (Jobdori cycle #357)
Pinpoint #206: normalize_finish_reason() in openai_compat.rs only maps
stop→end_turn and tool_calls→tool_use. The 'other => other' pass-through
arm silently leaks length, content_filter, function_call to downstream
consumers expecting Anthropic vocabulary (max_tokens, refusal, tool_use).

Sibling of #201/#202/#203/#204 (silent fallbacks at provider boundary).
No structured event for unmapped values; test coverage locks only the
two-case happy path.
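
One possible shape for a total mapping, sketched under assumptions: the `StopReason` enum, its `Unmapped` variant, and `is_clean_finish` are hypothetical names, and the target vocabulary (max_tokens, refusal, tool_use) is taken from the paragraph above. This is not the code in openai_compat.rs.

```rust
/// Hypothetical typed replacement for a free-form stop_reason string.
#[derive(Debug, PartialEq, Eq)]
enum StopReason {
    EndTurn,
    ToolUse,
    MaxTokens,
    Refusal,
    /// Unrecognized upstream values are kept verbatim so they can be reported.
    Unmapped(String),
}

/// Map an OpenAI-style finish_reason onto the Anthropic-style vocabulary.
fn normalize_finish_reason(raw: &str) -> StopReason {
    match raw {
        "stop" => StopReason::EndTurn,
        "tool_calls" | "function_call" => StopReason::ToolUse,
        "length" => StopReason::MaxTokens,
        "content_filter" => StopReason::Refusal,
        other => StopReason::Unmapped(other.to_string()),
    }
}

/// Only a normal end of turn or a tool call counts as a clean finish;
/// truncation, refusal, and unmapped values do not.
fn is_clean_finish(reason: &StopReason) -> bool {
    matches!(reason, StopReason::EndTurn | StopReason::ToolUse)
}
```

Keeping an explicit `Unmapped` variant means an unknown value becomes visible to callers instead of passing through as an ordinary string.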

Branch: feat/jobdori-168c-emission-routing
HEAD: dba4f28
2026-04-25 18:20:04 +09:00
YeonGyu-Kim
dba4f281f0 roadmap: #205 filed — prunable worktree lifecycle audit trail missing, no creation timestamp, pinpoint ID, or doctor visibility (Q *YeonGyu Kim cycle #137 / Jobdori cycle #351) 2026-04-25 17:16:57 +09:00
YeonGyu-Kim
1c59e869e0 roadmap: #204 filed — TokenUsage omits reasoning_tokens, reasoning models merge into output_tokens breaking cost parity (anomalyco/opencode #24233 parity gap, Jobdori cycle #336) 2026-04-25 12:01:26 +09:00
YeonGyu-Kim
604bf389b6 roadmap: #203 filed — AutoCompactionEvent summary-only, no SSE event emitted mid-turn when auto-compaction fires (Jobdori cycle #136) 2026-04-25 07:48:22 +09:00
YeonGyu-Kim
0730183f35 roadmap: #202 filed — sanitize_tool_message_pairing silent drop, no tool_message_dropped event (Jobdori cycle #135) 2026-04-25 06:06:32 +09:00
YeonGyu-Kim
5e0228dce0 roadmap: #201 filed — parse_tool_arguments silent fallback, no tool_arg_parse_error event (Jobdori cycle #134) 2026-04-25 05:03:54 +09:00
YeonGyu-Kim
b780c808d1 roadmap: #200 filed — SCHEMAS.md self-documenting drift, no derive-from-source enforcement (Q *YeonGyu Kim cycle #304) 2026-04-25 04:03:40 +09:00
YeonGyu-Kim
6948b20d74 roadmap: #199 filed — claw config JSON envelope omits deprecated_keys, merged_keys count-only, no automation path (Jobdori cycle #133) 2026-04-24 19:52:16 +09:00
YeonGyu-Kim
c48c9134d9 roadmap: #198 filed — MCP approval-prompt opacity, no blocked.mcp_approval state, pane-scrape required (gaebal-gajae cycle #135 / Jobdori cycle #248) 2026-04-24 13:31:50 +09:00
YeonGyu-Kim
215318410a roadmap: #197 filed — enabledPlugins deprecation no migration path, warning on every invocation (Jobdori cycle #132) 2026-04-24 09:29:07 +09:00
YeonGyu-Kim
59acc60eb5 roadmap: Doctrine #35 formalized — disk-truth wins over verbal drift during taxonomy disputes (Jobdori cycle #194) 2026-04-24 01:01:34 +09:00
YeonGyu-Kim
3497851259 roadmap: #196 filed — local branch namespace accumulation, no lifecycle cleanup or doctor visibility (Jobdori cycle #131) 2026-04-23 23:34:08 +09:00
YeonGyu-Kim
d93957de35 roadmap: #195 filed — worktree-age opacity, no timestamp or doctor signal (Jobdori cycle #130) 2026-04-23 20:01:55 +09:00
YeonGyu-Kim
86e88c2fcd roadmap: #194 filed — prunable-worktree accumulation, no doctor visibility or auto-prune lifecycle 2026-04-23 14:22:24 +09:00
YeonGyu-Kim
94bd6f13a7 roadmap: Doctrine #33 formalized via cross-claw validation (cycle #129)
Per gaebal-gajae cycle #129 closure ('Applying Doctrine #33 is also correct'),
promoting Doctrine #33 from provisional to formal status.

Statement:
'Merge-wait steady state reports as a vector, not narrative.'

Operational protocol:
- Validate 4-element state vector each cycle:
  ready_branches, prs, repo_drift, external_gate
- If unchanged: vector-only post (5 lines) OR silent ack
- If changed: that change IS the cycle's content
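
The protocol above can be sketched as a plain comparison of two vectors. The field types and the names `MergeWaitVector`, `CyclePost`, and `cycle_post` are assumptions made for illustration; the doctrine only names the four elements.

```rust
/// The 4-element state vector from the protocol; field types are assumed.
#[derive(Debug, Clone, PartialEq, Eq)]
struct MergeWaitVector {
    ready_branches: u32,
    prs: u32,
    repo_drift: bool,
    external_gate: String,
}

/// What a cycle should post: the bare vector, or the elements that moved.
#[derive(Debug, PartialEq, Eq)]
enum CyclePost {
    VectorOnly,
    Changed(Vec<&'static str>),
}

/// Compare the previous and current vectors element by element.
fn cycle_post(prev: &MergeWaitVector, curr: &MergeWaitVector) -> CyclePost {
    let mut changed = Vec::new();
    if prev.ready_branches != curr.ready_branches {
        changed.push("ready_branches");
    }
    if prev.prs != curr.prs {
        changed.push("prs");
    }
    if prev.repo_drift != curr.repo_drift {
        changed.push("repo_drift");
    }
    if prev.external_gate != curr.external_gate {
        changed.push("external_gate");
    }
    if changed.is_empty() {
        CyclePost::VectorOnly
    } else {
        CyclePost::Changed(changed)
    }
}
```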

Anti-pattern prevented:
Duplicate check logs: re-posting the full merge-wait narrative every
cycle when state hasn't moved.

Validation history:
- Cycle #124: gaebal-gajae introduced compression
- Cycle #129: Jobdori first field-test (vector-only post)
- Cycle #129: gaebal-gajae cross-claw validation (same vector,
              same conclusion, both claws converged)

Cross-claw coherence test passed:
- Both claws independently produced same vector values
- Both reached same conclusion (merge-wait holds)
- Both used same response pattern (vector form)

Doctrine #29-#33 progression operationalizes Phase 0 closure +
merge-wait discipline. #33's specific contribution: noise prevention
during legitimate hold states.

Doctrine count: 33 formalized.
Mode integrity: preserved (this is doc-only follow-up, not probe).
2026-04-23 14:02:08 +09:00
YeonGyu-Kim
d1fa484afd roadmap: #193 filed — session/worktree hygiene readability gap (gaebal-gajae framing)
Per gaebal-gajae cycle #123-#125 framing + authorization, filing
operational pinpoint on dogfood methodology layer (not claw-code binary).

Title: 'Session/worktree hygiene debt makes active delivery state
 harder to read than the actual code state.'

Short form: 'branch/worktree proliferation outpaced merge/cleanup
 visibility.'

Gap identified by gaebal-gajae: 4 branch states visually
indistinguishable on same surface:
  1. Ready branch (merge-ready, gated externally)
  2. Blocked branch (abandoned due to architecture/pushback)
  3. Stale abandoned branch (superseded or merged alternately)
  4. Dirty scratch worktree (experimental, status unclear)

Evidence (cycle #123 substance check):
- 147 local branches
- 30+ clawcode/jobdori /tmp artifacts
- Stale bridge logs from 2026-04-20 (3+ days old)

Class: NOT codegen, NOT test, NOT binary — state readability /
hygiene gap in dogfood methodology layer.

Doctrine #29 compliance: Doc-only ROADMAP entry filed during
merge-wait mode on frozen branch. Legitimate filing-without-fixing.
This is the second such case (first: cycle #100 bundle freeze).

Framing family:
- Sibling to §4.44 (runtime failure state opacity)
- §4.45 tackles repo delivery lane state opacity
- Different scope, same structural pattern

Pinpoint accounting:
- Before #193: 82 total, 67 open
- After #193: 83 total, 68 open
- First dogfood methodology pinpoint (vs binary pinpoints)

Priority: Post-Phase-1 (not Phase 1 bundle member).
Remediation proposal: branch state tagging, worktree lifecycle
discipline, ROADMAP <-> branch mapping.

Sources:
- Cycle #120 Jobdori substance check (147 branches surfaced)
- Cycle #123 Jobdori evidence collation (30 worktrees)
- Cycle #124 gaebal-gajae framing refinement (4-state gap)
- Cycle #125 gaebal-gajae authorization + final framing

Filed by gaebal-gajae authorization. No code change. No probe.
Merge-wait mode preserved. Phase 0 branch integrity preserved.
2026-04-23 13:33:34 +09:00
YeonGyu-Kim
eb0356e92c roadmap: Doctrine #32 formalized + cycle #117 final reframe per gaebal-gajae
Per gaebal-gajae cycle #117 closing validation:

Authoritative reframe:
'Cycle #117 is the diagnostic turn that precisely isolated the PR
 creation failure as an organization-level PR authorization barrier
 rather than a branch problem.'

The cycle value was NOT 'PR blocked'.
The cycle value WAS 'boundary of the barrier isolated through experiments'.

Four dimensions experimentally separated:
1. Repository state: healthy (push, tests)
2. Branch readiness: visible on origin
3. Token liveness: valid (own-fork PR succeeded)
4. Org PR authorization: BLOCKED (FORBIDDEN for both claws)

Reviewer-ready compression:
'The branch is pushable and reviewable, but PR creation into
 ultraworkers/claw-code is blocked specifically at the organization
 authorization layer, not by repository state or token liveness.'

Doctrine #32 formalized:
'Merge-wait mode actions must be within the agent's capability
 envelope. When blocked externally, diagnose by boundary separation
 and hand off to the responsible party, not by retry or redefinition.'

Operational protocol:
1. Isolate boundary through experiments (not retry same path)
2. Document separation explicitly (works vs doesn't work)
3. Escalate to responsible party (web UI, org admin, infra)
4. Do NOT retry, conflate, or redefine the failure

Validation: Cycle #117 both-claws blocked, boundary isolated,
escalation path identified.

Cross-claw coherence:
- Cycle #115: 1 claw attempted, 1 succeeded (hypothesis)
- Cycle #117: 2 claws attempted, 2 blocked, IDENTICAL error (confirmed)

Next action path (per gaebal-gajae):
Author/owner intervention via web UI OR org admin OAuth grant.
'This calls for author/owner intervention, not technical exploration.'

Doctrine count: 32 formalized.
Gate status: Blocked pending author intervention.
Mode integrity: Preserved throughout cycle #117.
2026-04-23 12:45:21 +09:00
YeonGyu-Kim
7a1e9854c2 roadmap: Cycle #117 cross-claw PR blocker diagnosis locked
Per cycle #117 cross-claw diagnosis (both claws attempted independently):

Both Jobdori (code-yeongyu) and gaebal-gajae (Yeachan-Heo) hit
identical GraphQL FORBIDDEN error on createPullRequest mutation.

Diagnosis: Organization-wide OAuth app restriction on
ultraworkers/claw-code, not per-identity issue.

Reviewer-ready compression (per gaebal-gajae):
'The branch is now remotely visible and PR-ready, but actual PR
 creation is blocked by GitHub permissions rather than repository
 state.'

Confirmed state:
- Branch on origin: Yes (cycle #115)
- PR creation CLI path: Blocked for both claws
- Manual web UI: Required
- Org admin OAuth grant: Long-term fix

Gate sequence updated:
1. Branch on origin (DONE, cycle #115)
2. PR creation - BLOCKED at OAuth (cycle #116/#117)
3. Manual web UI PR creation (REQUIRED next)
4. Review cycle
5. Merge signal
6. Phase 1 Bundle 1 (#181 + #183)

Doctrine #32 (provisional, pending gaebal-gajae formal acceptance):
'Merge-wait mode actions must be within the agent's capability
 envelope. When blocked externally, diagnose + document + escalate,
 not retry.'

Cross-claw validation: Both claws blocked, same error pattern.
Mode integrity: Preserved throughout both attempts.
Next blocker: External human action (manual web UI or org admin).
2026-04-23 12:44:08 +09:00
YeonGyu-Kim
70bea57de3 roadmap: Doctrine #31 formalized + cycle #115 reframe per gaebal-gajae
Per gaebal-gajae cycle #115 validation pass:

Authoritative reframe:
'Cycle #115 was not an exception to merge-wait mode; it was the first
 turn where merge-wait mode actually did what merge-wait mode is
 supposed to do.'

Reviewer-ready compression:
'The branch was frozen but not yet reviewable because it had never
 been pushed; this cycle converted merge-wait from a declared state
 into a remotely visible one.'

Mode semantic correction:
- Merge-wait mode is NOT 'do nothing'
- Merge-wait mode IS 'block discovery + enable merge-readiness'
- Push to origin = merge-readiness action (fits mode, not violation)

Doctrine #31 (formalized):
'Merge-wait mode requires remote visibility.'
Protocol: git ls-remote origin <branch> must return commit hash.
If empty: push before claiming review-ready.
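
The remote-visibility check can be sketched as a pure function over the text that `git ls-remote origin <branch>` prints (one `<hash>\t<ref>` line per match). The function names are hypothetical, and running git itself is deliberately left out of the example.

```rust
/// Return the commit hash from `git ls-remote` output, if a ref was listed.
/// Accepts 40 or more hex characters so SHA-1 and SHA-256 object names both pass.
fn remote_commit(ls_remote_stdout: &str) -> Option<&str> {
    ls_remote_stdout
        .lines()
        .filter_map(|line| line.split_whitespace().next())
        .find(|hash| hash.len() >= 40 && hash.chars().all(|c| c.is_ascii_hexdigit()))
}

/// Doctrine #31 as a predicate: empty output means push first.
fn may_claim_review_ready(ls_remote_stdout: &str) -> bool {
    remote_commit(ls_remote_stdout).is_some()
}
```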

Self-process pinpoint #193 (formalized):
'Dogfood process hygiene gap — declared review-ready claims lacked
 remote visibility check for 40+ minutes (cycles #109-#114).'
Applies to dogfood methodology, not claw-code binary.

Gate sequence (per gaebal-gajae):
1. Branch on origin (cycle #115, DONE)
2. PR creation (next concrete action)
3. Review cycle
4. Merge signal
5. Phase 1 Bundle 1 kickoff

Doctrine count: 31 total.
2026-04-23 12:34:04 +09:00
YeonGyu-Kim
3bbaefcf3e roadmap: lock 'merge-wait mode' state designation per gaebal-gajae
Per gaebal-gajae cycle #110 state designation:
'Phase 0 is no longer in discovery mode; it is in merge-wait mode
 with Phase 1 already precommitted.'

Mode distinction formalized:
- Discovery mode: probe + file + refine (previous state)
- Merge-wait mode: hold state, await signal (CURRENT)
- Execution mode: land bundles (post-merge state)

Doctrine #30: 'Modes are state, not suggestions.'
Once closure is declared, mode label acts as operational guard.
Future cycles must respect state designation:
  - No new probes (that's discovery)
  - No new pinpoints (branch frozen)
  - No new branches (Phase 0 must merge first)
  - Maintain readiness; respond to signal

Mode history for Phase 0:
  - Cycle #97: Discovery begins
  - Cycle #108: Exhaustion criteria met
  - Cycle #109: Closure declared
  - Cycle #110: Merge-wait mode formally entered

Current state: MERGE-WAIT MODE. Awaiting signal.
2026-04-23 12:00:11 +09:00
YeonGyu-Kim
c0ab7a4d5f roadmap: formal closure of Phase 0 / dogfood cycles per gaebal-gajae
Per gaebal-gajae 11:58 Seoul closure validation.

Authoritative closure statement:
'Phase 0 has finished discovery. Phase 1 should start by landing
 the locked contract foundation bundle, not by opening new
 exploratory cycles.'

All four exhaustion criteria met:
1. Unaudited surfaces: 9 probed (full coverage)
2. Probe hypothesis: Fully validated (multi-flag 3-4, simple 0-1)
3. Phase 1 docs: PHASE_1_KICKOFF.md + review guide + priority queue
4. Branch hygiene: 39 commits, 564 tests, 0 regressions, freeze held

Doctrine #29 (final): 'Discovery termination is itself a deliverable.'
  - Criteria: surfaces probed, hypothesis validated, plan documented,
    branch review-ready
  - Anti-pattern: infinite probe continuation
  - Correct: explicit closure + pivot to execution

Phase 0 / dogfood cycles formally closed. No more probe filings
on this branch. Next work unit is Phase 1 execution, not discovery.

Pending: Phase 0 merge approval → Phase 1 branch creation in
priority order → bundle-by-bundle execution (~10 min per bundle).
2026-04-23 11:58:06 +09:00
YeonGyu-Kim
046bf6cedc roadmap: cycle #109 checkpoint — probe complete, Phase 1 kickoff ready
End-of-dogfood checkpoint at cycle #109:

Deliverables:
- PHASE_1_KICKOFF.md (192 lines, execution plan for 6-bundle priority queue)
- Test verification: 564 tests pass, 0 failures
- Branch clean, freeze held, 38 commits total

Probe hypothesis fully validated:
- Multi-flag verbs: 3-4 classifier gaps each
- Single-issue verbs: 0-1 gaps each

Accounting:
- 82 pinpoints filed (cycles #104-#108)
- 67 genuinely open
- 28 doctrines accumulated

Phase 1 ready:
- All 5 priority bundles gaebal-gajae reviewed
- Bundle sequence locked (foundation → extensions → cleanup)
- Expected execution: 50-60 min for all priorities
- No blockers except Phase 0 merge approval

Next: Execute Phase 1 bundles in priority order once Phase 0 lands.
2026-04-23 11:56:57 +09:00
YeonGyu-Kim
66eeed82ca doc: add Phase 1 kickoff — execution plan for 6-bundle priority queue
Comprehensive Phase 1 strategy document prepared at end of probe cycle #108.

Contents:
- Phase 0 recap (freeze, tests, pinpoints, doctrines)
- What Phase 1 will do (6 bundles + independents, all gaebal-gajae reviewed)
- Concrete next steps (branch names, expected commits/tests per bundle)
- Priority 1: Error envelope contract drift (#181/#183) — foundation
- Priority 2: CLI contract hygiene (#184/#185) — extensions
- Priority 3: Classifier sweep 4-verb (#186/#187/#189/#192) — cleanup
- Priority 4: USAGE.md audit (#180) — doc prerequisite
- Priority 5: Dump-manifests help (#188) — doc-truth probe-flow
- Priority 6+: Independents (#190 design, #191 filesystem, others)
- Hypothesis validation (multi-flag verbs = 3-4 gaps, simple verbs = 0-1)
- Testing strategy + success criteria

All 5 priority bundles are reviewer-blessed (gaebal-gajae validation passes).

Doc-only. No code changes. Freeze held.
2026-04-23 11:56:37 +09:00
YeonGyu-Kim
b139b10499 roadmap(#190, #191, #192): file final pre-phase-1 probe gaps in skills lifecycle
Cycle #108 probe of claw skills install/enable/disable yielded 3 pinpoints:

#190: Design decision needed
  skills install (no args) routes to help (action: help, kind: skills).
  May be intentional (like agents pattern) or design inconsistency.
  Requires verification against agents canonical reference.

#191: Classifier gap (filesystem family extension)
  skills install /bad/path emits kind=unknown.
  Should be kind=filesystem or filesystem_io_error.
  Extends #177/#178/#179 install-surface taxonomy.

#192: Classifier gap (unknown-option family extension)
  skills install --bogus-flag emits kind=unknown.
  Should be kind=cli_parse (like sandbox).
  Now 4 members in unknown-option sub-lineage: #186, #187, #189, #192.

Pinpoint count: 82 filed, 67 genuinely open.
Classifier family: 19 members (+2).

All unaudited surfaces now probed:
  - Cycles #104-#108: plugins, agents, init, bootstrap-plan, system-prompt,
    export, sandbox, dump-manifests, skills
  - Hypothesis fully validated: Multi-flag verbs have 3-4 classifier gaps;
    simple verbs have 0-1 gaps.

Per freeze doctrine, no code changes. Doc-only filing.
2026-04-23 11:39:59 +09:00
YeonGyu-Kim
6e6f99e57e roadmap(#188, #189): lock framings + doc-truth sub-axis + priority refinement
Per gaebal-gajae cycle #107 validation pass. Three refinements and one key outcome:

1. Framings locked (both verb-specific):
   #188: 'dump-manifests --help omits the prerequisite that runtime
          behavior actually requires.'
   #189: 'dump-manifests unknown-option errors still fall through to
          unknown instead of the existing CLI-parse path.'

2. Doc-truthfulness family formally split into 2 sub-axes:
   - Audit-flow (5 members: #76, #79, #82, #172, #180) — reading one
     file vs another declared source of truth
   - Probe-flow (NEW, 1 member: #188) — running verb vs observing
     --help text

3. Priority refinement:
   - #189 → bundled in feat/jobdori-186-189-classifier-sweep (3 verbs)
   - #188 → post-#180 (doc parity sequence: USAGE gap → help-text gap)
   - Full sequence: #180 (audit-flow doc-truth) → #188 (probe-flow doc-truth)

4. Key cycle #107 outcome (per gaebal-gajae):
   'accurately reclassified what looked like a behavior bug as a help-text truthfulness gap'
   This is the reclassification skill that earned the filing.

Doctrine #28: First observation is hypothesis, not filing. Verify
against SCHEMAS/USAGE/--help before classifying axis. Cost: 30-60s
per probe. Benefit: avoid filing not-a-bug pinpoints.

Priority queue now 6 bundles + 3+ independent, all reviewer-blessed.
2026-04-23 11:33:44 +09:00
YeonGyu-Kim
eb957a512c roadmap(#188, #189): file doc-truth and classifier gaps in dump-manifests
Cycle #107 probe of claw dump-manifests yielded 2 pinpoints:

#188: Doc-truthfulness gap (NEW sub-axis)
  claw dump-manifests --help describes usage as optional flags, but
  the verb fails without --manifests-dir or CLAUDE_CODE_UPSTREAM.
  USAGE.md is correct; CLI --help output lies by omission.

  This is the first doc-truth pinpoint from probe flow (vs audit flow).
  New sub-axis: help text vs behavior (prior doc-truth: SCHEMAS/USAGE/README).

#189: Classifier gap (same pattern as #186/#187)
  dump-manifests --bogus-flag falls through to kind=unknown.
  Should be cli_parse (like sandbox).

  Now at 3 verbs in same pattern: system-prompt (#186), export (#187),
  dump-manifests (#189). Rename bundle to feat/jobdori-186-189-classifier-sweep.

Pinpoint count: 79 filed, 65 genuinely open.
Doc-truthfulness family: 6 members (was 5).
Classifier unknown-option sub-lineage: 3 members (was 2).

Per freeze doctrine, no code changes. Doc-only filing.
2026-04-23 11:31:30 +09:00
YeonGyu-Kim
fcb9d18899 roadmap(#187): lock framing + bundle with #186 per gaebal-gajae
Per gaebal-gajae cycle #106 validation pass. Two refinements:

1. #187 framing locked:
   'export unknown-option errors still fall through to unknown,
    unlike the already-canonical sandbox CLI-parse path.'

   Surgical parallel to #186 framing (cycle #105):
   'system-prompt unknown-option errors still fall through to unknown
    instead of the existing CLI-parse classification path.'

   Same pattern: verb + drift + reference path.

2. #186 and #187 bundled into feat/jobdori-186-187-classifier-sweep.
   Rationale: identical fix pattern, identical test pattern, same
   source file, 2x review overhead if separated.

Updated merge priority queue (gaebal-gajae reviewer-blessed):
  1. feat/jobdori-181-error-envelope-contract-drift (#181 + #183)
  2. feat/jobdori-184-cli-contract-hygiene-sweep (#184 + #185)
  3. feat/jobdori-186-187-classifier-sweep (#186 + #187)

Doctrine #27: Same-pattern pinpoints should bundle into one classifier
sweep PR. One-pinpoint = one-branch is not universal; batching
same-pattern fixes halves review/merge overhead.
2026-04-23 11:22:40 +09:00
YeonGyu-Kim
d03f33b119 roadmap(#187): file export classifier gap from cycle #106 probe
Cycle #106 probe of export and sandbox verbs. Found:
- export --bogus-flag: kind=unknown (should be cli_parse)
- sandbox --bogus-flag: kind=cli_parse (canonical correct)

#187 is direct sibling of #186 (system-prompt classifier gap).
Both unknown-option, both should use cli_parse classifier.

Observation: sandbox has no gaps. export has 1 classifier gap.
Suggests classifier coverage improving on newer verbs, not consistent
regression across unaudited surfaces.

Hypothesis (#104) partially validated: unaudited surfaces yield
pinpoints, but not uniformly. Single-issue verbs (sandbox) may be
cleaner than multi-flag verbs (export, init, bootstrap-plan).

Pinpoint count: 77 filed, 63 genuinely open.

Per freeze doctrine, no code changes. Doc-only filing.
2026-04-23 11:21:31 +09:00
YeonGyu-Kim
6bd69d55bc doc(review-guide): embed gaebal-gajae authoritative state framing
Per gaebal-gajae cycle #105 validation pass. One-liner state summary
now appears at top (tone-setter for reviewers) and bottom (reinforced
recap):

  'Phase 0 is now frozen, reviewer-mapped, and merge-ready;
   Phase 1 remains intentionally deferred behind the locked priority order.'

This is the single authoritative sentence that captures branch state.
Use it for PR titles, review summaries, and Phase 1 handoff notes.

Why this framing matters (per gaebal-gajae evaluation):
- 'frozen' signals no scope creep
- 'reviewer-mapped' signals audit trail exists (this guide)
- 'merge-ready' signals gates are passed
- 'intentionally deferred' signals Phase 1 absence is by design, not omission
- 'locked priority order' signals sequencing is validated (cycle #104-#105)

Review guide now doubles as merge-enabler: reviewers parse branch state
in one sentence, then drill into commits as needed.

Doc-only. No code changes. Freeze preserved.
2026-04-23 11:11:50 +09:00
YeonGyu-Kim
e470e614d5 doc: add Phase 0 + dogfood bundle review guide for cycles #104-#105
Pre-merge documentation for reviewers. Summarizes:
- What Phase 0 tasks deliver (JSON envelope contracts, regression locks)
- Why dogfood cycles #99-#105 matter (validated methodology, 15 filed pinpoints)
- Commit-by-commit navigation for the 30-commit frozen bundle
- What lands vs what's deferred
- Integration notes for Phase 1 planning
- Known limitations + follow-ups

This is doc-only, no code changes. Serves as audit trail and reviewer
reference without adding scope to the frozen feature branch.
2026-04-23 11:10:51 +09:00
YeonGyu-Kim
1494a94423 roadmap: lock merge priority for cycles #104-#105 pinpoints
Per gaebal-gajae cycle #105 priority pass.

Locked merge order (minimizes consumer-facing contract disruption):
1. feat/jobdori-181-error-envelope-contract-drift (#181 + #183 bundled)
2. feat/jobdori-184-cli-contract-hygiene-sweep (#184 + #185 bundled)
3. feat/jobdori-186-system-prompt-classifier (#186 standalone)

Rationale: foundation → extensions → cleanup ordering.
- #181 first: canonical error envelope established (1 shape change)
- #184/#185 second: use existing envelope (0 shape changes)
- #186 third: classifier branch add (1 classifier change)
Total: 1 shape + 1 classifier change across 3 merges.

Doctrine #25: Contract-surface-first ordering. Foundation layer before
extending guards before refinement cleanup.

Still-deferred pinpoints explicitly mapped with dependencies:
#173, #174, #177/#178/#179, #180, #182, #175.

Branch now at 30 commits, 227/227 tests.
2026-04-23 11:05:00 +09:00
YeonGyu-Kim
8efcec32d7 roadmap(#184-#186): lineage corrections + reference implementation lock
Per gaebal-gajae cycle #105 review pass. Three corrections:

1. #184/#185 belong to #171 lineage (CLI contract hygiene sub-family),
   NOT a new family. Same enforcement hole pattern on unaudited verbs.

2. #186 locked as member of #169/#170 classifier lineage. Framing:
   'system-prompt unknown-option errors still fall through to unknown
   instead of the existing CLI-parse classification path.'

3. agents is the #183 reference implementation. Fix path reframed from
   'design new contract' to 'align outliers to existing reference'.
   Much smaller scope for feat/jobdori-181-error-envelope-contract-drift.

Canonical reference shape locked:
{action: 'help', kind: <verb>, unexpected: <bad-name>, usage: {...}}

Doctrine #24: Pinpoint lineage continuity. Check existing family
before creating new. Reviewers follow pattern lineages.

Family tree corrected: CLI contract hygiene moved from 'NEW' to
'#171 sub-lineage within classifier family'.
2026-04-23 11:03:52 +09:00
YeonGyu-Kim
1afe145db8 roadmap(#184, #185, #186): file CLI contract hygiene gaps in unaudited verbs
Cycle #105 probe of agents/init/bootstrap-plan/system-prompt verbs
(unaudited per cycle #104 hypothesis) yielded 3 pinpoints:

#184: claw init silently accepts unknown positional arguments. Inconsistent
with #171 CLI contract hygiene pattern.

#185: claw bootstrap-plan silently accepts unknown flags. Same family as
#184, different verb, different surface.

#186: claw system-prompt --<unknown> classified as 'unknown' instead of
'cli_parse'. Classifier family member (#182-style).

Bonus observation (not filed): claw agents bogus-action emits the
canonical mcp-style {action: help, unexpected, usage} shape. This is
the shape that #183 wants as canonical, NOT the plugins-style success
envelope. agents is the reference implementation.

Hypothesis validated: unaudited verb surfaces have 2-3x higher pinpoint
yield. Predicted cycle #104-#105 pattern holds.

Pinpoint count: 76 filed, 62 genuinely open.
2026-04-23 11:01:24 +09:00
YeonGyu-Kim
7b3abfd49a roadmap(#181/#182/#183): lock reviewer-ready framings per gaebal-gajae
Final framing pass for cycle #104 plugin lifecycle pinpoints. Three
one-liner framings captured for reviewer consumption:

#181 (HIGH): 'plugins unknown-subcommand errors currently emit on the
success path instead of the JSON error path.'

#183 (HIGH): 'Invalid subcommand handling is not normalized across
plugins and mcp JSON surfaces.'

#182 (MEDIUM): 'Plugin lifecycle failures still fall through to unknown
instead of canonical error kinds.'

Branch sequencing locked:
1. feat/jobdori-181-error-envelope-contract-drift (bundles #181+#183)
2. feat/jobdori-182-plugin-classifier-alignment (#182, post-merge)

Rationale: #181 is root bug, #183 is sibling symptom, #182 is cleanup
that benefits from clean error envelope landing first.

Branch at 27 commits, 227/227 tests, review-ready.
2026-04-23 10:33:42 +09:00
YeonGyu-Kim
2c004eb884 roadmap(#181-framing, #182-correction): lock framing + correct enum proposal
Per gaebal-gajae cycle #104 framing + severity pass. Three changes:

1. #181 framing locked: 'plugins unknown-subcommand errors are emitted
   through the success envelope instead of the JSON error envelope.'

2. #181 + #183 consolidated into 'error envelope contract drift' family.
   Proposed bundled branch: feat/jobdori-181-error-envelope-contract-drift.

3. #182 scope correction (IMPORTANT): I proposed new kind 'plugin_not_found'
   without verifying SCHEMAS.md enum. Per gaebal-gajae: 'existing contract
   alignment > new enum proposal'.

Corrected mapping:
- plugins install /nonexistent → filesystem (existing enum value)
- plugins enable nonexistent → runtime (safest existing value)
- plugin_not_found proposal deferred pending explicit schema update

Doctrine lesson: enum proposal requires SCHEMAS.md baseline check first.

Severity-ordered merge plan (per gaebal-gajae):
1. #181 (HIGH) - contract bug
2. #183 (HIGH) - contract drift
3. #182 (MEDIUM) - classifier alignment
2026-04-23 10:32:50 +09:00
YeonGyu-Kim
22cc8effbb roadmap(#181, #182, #183): file plugin lifecycle axis pinpoints
Cycle #104 probe of plugin lifecycle axis (claw plugins + mcp subcommands)
yielded 3 related gaps:

#181: plugins bogus-subcommand returns SUCCESS-shaped envelope with
error buried in 'message' text field. Consumer parsing via
type=='error' check treats it as success. Severe.
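
To make the severity concrete, here is a minimal sketch of a consumer that dispatches on `type == "error"`. The `Envelope` alias, both constructor functions, and every field value are hypothetical placeholders. Only the key names (`reload_runtime`, `target`, `message`, and the `{error, hint, kind, type}` envelope) come from this log, and the absence of a `type` key on the plugins envelope is an assumption.

```rust
use std::collections::BTreeMap;

/// Hypothetical flat view of a JSON envelope (string values only, for brevity).
pub type Envelope = BTreeMap<&'static str, &'static str>;

/// The consumer-side check described in #181: only `type == "error"` counts as failure.
pub fn is_error(envelope: &Envelope) -> bool {
    envelope.get("type").copied() == Some("error")
}

/// Placeholder for the success-shaped envelope; values are illustrative, not captured output.
pub fn plugins_bogus_subcommand_envelope() -> Envelope {
    BTreeMap::from([
        ("reload_runtime", "false"),
        ("target", "bogus-subcommand"),
        ("message", "placeholder text that mentions the failure"),
    ])
}

/// Placeholder for the standard error envelope shape {error, hint, kind, type}.
pub fn standard_error_envelope() -> Envelope {
    BTreeMap::from([
        ("error", "placeholder error text"),
        ("hint", "placeholder hint"),
        ("kind", "unknown"),
        ("type", "error"),
    ])
}
```

Under these assumptions `is_error` returns `false` for the plugins envelope even though the command failed, which is the misrouting #181 describes.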

#182: plugins install/enable not-found errors classified as 'unknown'
instead of 'plugin_not_found' or 'not_found'. Classifier family member.

#183: plugins and mcp emit DIFFERENT shapes on unknown subcommand.
plugins has reload_runtime+target+message, mcp has unexpected+usage.
Shape parity gap.

All three filed only per freeze doctrine. Proposed separate branches:
- feat/jobdori-181-unknown-subcommand-error-routing (#181 + #183 bundled)
- feat/jobdori-182-plugin-not-found-classifier (#182 standalone)

Pinpoint count: 73 filed, 59 genuinely open. Typed-error family: 14
members. Emission routing family: 1 new member (#181).
2026-04-23 10:31:09 +09:00
YeonGyu-Kim
a14977a866 roadmap(#180-framing): lock authoritative framing + branch name
Per gaebal-gajae cycle #103 framing pass. Captures narrative choice +
reality divergence in one line.

Framing: 'USAGE.md currently teaches entry modes, but not the actual
standalone command surface exposed by claw --help.'

Locks branch name: feat/jobdori-180-usage-standalone-surface

Next-branch prep steps documented so post-168c-merge execution is
zero-friction.

Three-stage pinpoint discipline validated again: filing (cycle #103
primary) → framing (cycle #103 addendum) → prep (execution checklist).
2026-04-23 10:25:18 +09:00
YeonGyu-Kim
e84424a2d3 roadmap(#180): file USAGE.md verb coverage gap
Cycle #103 doc-truthfulness audit found USAGE.md incomplete.

Actual CLI has 14 standalone verbs (status, doctor, mcp, skills, agents,
export, init, sandbox, system-prompt, bootstrap-plan, dump-manifests,
help, version, acp).

USAGE.md covers only 3 entry modes (claw REPL, claw prompt TEXT,
claw --resume). Other verbs absent or underdocumented.

Example: USAGE.md says 'start claw, then /doctor' but doesn't explain
that 'claw doctor' is also a standalone entry point (no REPL needed).

Fix: Add 'Standalone commands' section to USAGE.md with all 14 verbs
documented. Include regression test (grep USAGE.md for each verb).
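
One possible shape for the proposed regression check, sketched under assumptions: `undocumented_verbs` is a hypothetical helper, and searching for the literal text `claw <verb>` is an assumed matching rule, not something USAGE.md or the test suite is known to use. The verb list is the one given above.

```rust
/// The 14 standalone verbs listed in this commit message.
pub const STANDALONE_VERBS: [&str; 14] = [
    "status", "doctor", "mcp", "skills", "agents", "export", "init",
    "sandbox", "system-prompt", "bootstrap-plan", "dump-manifests",
    "help", "version", "acp",
];

/// Returns the verbs that never appear as `claw <verb>` in the given document text.
/// Hypothetical helper: a plain substring search stands in for `grep`.
pub fn undocumented_verbs<'a>(usage_md: &str, verbs: &[&'a str]) -> Vec<&'a str> {
    verbs
        .iter()
        .copied()
        .filter(|verb| !usage_md.contains(format!("claw {}", verb).as_str()))
        .collect()
}
```

A test built on this would read USAGE.md from disk and assert that the returned list is empty.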

Doc-truthfulness family: #76, #79, #82, #172, #180.

Pinpoint count: 70 filed, 56 genuinely open.
2026-04-23 10:24:24 +09:00
YeonGyu-Kim
de5384c8f0 roadmap(#179): file missing SKILL.md validation gap as separate pinpoint
Per gaebal-gajae cycle #102 refinement. Originally tangled into #177
filing but properly belongs as distinct pinpoint.

Taxonomy:
- #177: nonexistent path → filesystem kind
- #178: export enum drift → filesystem canonical
- #179: missing SKILL.md → parse/validation kind (this filing)

Family renamed per gaebal-gajae: 'resource / install-surface error
taxonomy gap' (was 'filesystem error family'). Better captures that
not all gaps in this cluster are filesystem-rooted.

Proposed branch bundle: feat/jobdori-177-install-surface-taxonomy
covers all three as coordinated taxonomy sweep.

Pinpoint count: 69 filed, 55 genuinely open.
2026-04-23 10:04:03 +09:00
YeonGyu-Kim
93cfdbabeb roadmap(#175): file gaebal-gajae's CI fmt/test signal decoupling framing + resolve numbering collision
#175 numbering collision between:
- gaebal-gajae's CI framing (filed verbally via Discord at ~10:00)
- my filesystem classifier filing (#175 per cycle #102 10:02)

Resolution:
- gaebal-gajae's framing reclaims #175 (higher-level workflow gap)
- My filesystem classifier renumbered to #177
- My export enum naming renumbered to #178

All three pinpoints now filed with correct non-colliding numbers:
- #175: CI fmt/test signal decoupling (gaebal-gajae)
- #177: skills install filesystem classifier (Jobdori, was #175)
- #178: export kind naming consistency (Jobdori, was #176)

Typed-error family membership updated accordingly.
2026-04-23 10:02:52 +09:00
YeonGyu-Kim
efc59ab17e roadmap(#175, #176): file filesystem error classifier gaps
Cycle #102 probe of model/skills/export axis found two related gaps:

#175: skills install filesystem errors classified as 'unknown' instead of
'filesystem' (which is in v1.5 enum).

#176: export uses 'filesystem_io_error' kind but this is NOT in v1.5
declared enum (which only lists 'filesystem'). Inconsistent naming.

Both filed only per freeze doctrine. Proposed bundling as
feat/jobdori-175-filesystem-error-family branch.

Family observation: classifier + enum-naming gaps found simultaneously
in filesystem-error axis. Indicates broader unaudited surface.

Pinpoint count: 68 filed, 54 genuinely-open.
2026-04-23 10:01:06 +09:00
YeonGyu-Kim
635f1145a2 roadmap(#174-framing): lock authoritative framing + branch name
Per gaebal-gajae cycle #101 framing pass. Adds stable framing that
captures scope + root cause + visible effect + surface in one line.

Locks branch name: feat/jobdori-174-resume-trailing-cli-parse

Next-branch prep steps documented so post-168c-merge execution is
zero-friction (classifier branch + regression test pattern already
established by #169/#170/#171).
2026-04-23 09:33:01 +09:00
YeonGyu-Kim
a8fc17cdee roadmap(#174): file --resume trailing args classifier gap
Cycle #101 probe of session-boot axis (prompt misdelivery / resume
lifecycle) found another typed-error classifier gap.

Filed only, not fixed. Per freeze doctrine (cycles #98-#100), no new
code axis added to feat/jobdori-168c-emission-routing.

Pattern: `--resume trailing arguments must be slash commands` classified
as 'unknown' instead of 'cli_parse'. Side effect: #247 hint synthesizer
doesn't trigger, so hint is null.

Same family as #169, #170, #171 (classifier coverage gaps).

Proposed fix: add `--resume trailing arguments` pattern to
classify_error_kind as cli_parse.

Pinpoint count: 66 filed, 52 genuinely-open + #174 new.
2026-04-23 09:31:21 +09:00
YeonGyu-Kim
28102af64a roadmap(#173): file structured-output hint parity gap
Cycle #100 probe of non-classifier axes (event/log opacity) found new
consumer parity gap: JSON mode missing 'hint' field that text mode
provides for config_load_error scenarios.

Filed only, not fixed. Per freeze doctrine (cycles #98-#99), no new axis
added to feat/jobdori-168c-emission-routing. This pinpoint is a Phase 1
scope candidate for a separate branch.

Affects: claw mcp, claw status, claw doctor (JSON mode).
Text mode shows: Hint  `claw doctor` classifies config parse errors...
JSON mode shows: no hint field at all.

Consumer impact: claws parsing JSON output can't programmatically route
errors to recovery paths the way text-mode users can with human guidance.

Family: Consumer parity. Related: #247 (hint synthesizer), #169-#171
(classifier family), #172 (doc-truthfulness).

Proposed fix: add 'hint' field to JSON envelope when config_load_error
is present, with hint taxonomy for typed dispatch.

Pinpoint count: 65 filed, 51 genuinely-open + #173 new.
2026-04-23 09:01:53 +09:00
YeonGyu-Kim
df148f1a3e docs(#99): checkpoint artifact — bundle status and Phase 1 readiness
Cycle #99 (10-min dogfood cycle). No new pinpoint filed. Instead, documented
current branch state via checkpoint artifact.

Branch: feat/jobdori-168c-emission-routing @ 15 commits across 5 axes
- Phase 0 (emission): 4 commits, complete
- Discoverability: 4 commits, complete
- Typed-error: 6 commits, complete
- Doc-truthfulness: 2 commits, complete
- Deferred: #141 (list-sessions --help routing, parser scope)

Tests: 227/227 pass, zero regressions, steady 11-cycle run

Checkpoint summarizes:
1. Work axes breakdown + pinpoint mapping
2. Cycle velocity (11 cycles, ~90 min, 6 pinpoints closed)
3. Branch deliverables (4 consumer-facing value propositions)
4. Readiness assessment (ready for review, awaiting signal)
5. Doctrine observations (probe pivot works, regression guards stick)

No code changes; doc-only. This checkpoint bridges cycles #89-#99 and marks
the branch as review-ready pending coordination signal.
2026-04-23 08:56:59 +09:00
YeonGyu-Kim
3a2dddd1ca roadmap(#172): file + close doc-vs-reality gap — action field inventory count
Cycle #98 probe of non-classifier axes found documentation truthfulness
gap in SCHEMAS.md v1.5 Emission Baseline.

#172 closed by commit ce352f4 (same branch, same cycle).

Part of doc-truthfulness family (#76, #79, #82).

Completes SCHEMAS.md truthfulness trifecta:
- Cycle #91: Baseline documentation (13 verbs)
- Cycle #92: Shape parity guard (10 cases)
- Cycle #98: Phase 1 target count locked (3 verbs, 11 assertions)

Pinpoint count: 64 filed, 51 genuinely-open + #172 closed this cycle.
2026-04-23 08:33:35 +09:00
YeonGyu-Kim
ce352f4750 docs(#172): correct action-field inventory claim (4 → 3 verbs) + regression guard
Pinpoint #172: SCHEMAS.md v1.5 Emission Baseline documentation inaccuracy
discovered during cycle #98 probe.

The Phase 1 normalization targets section claimed:
  "unify where `action` field appears (only in 4 inventory verbs)"

In reality, only 3 inventory verbs have `action`:
  - mcp
  - skills
  - agents

list-sessions uses `command` instead (the documented 1-of-13 deviation
already captured elsewhere in v1.5 baseline).

This is a doc-truthfulness issue (same family as cycles #76, #79, #82).
Active misdocumentation leads downstream consumers to assume 4-verb
coverage when building adapters/dispatchers.

Changes:
1. SCHEMAS.md: 'only in 4 inventory verbs' → 'only in 3 inventory verbs: mcp, skills, agents'
2. Added regression test `v1_5_action_field_appears_only_in_3_inventory_verbs_172`
   - Asserts mcp/skills/agents HAVE action field
   - Asserts help/version/doctor/status/sandbox/system-prompt/bootstrap-plan/list-sessions do NOT have action field
   - Forces SCHEMAS.md + binary to stay synchronized

Test added:
- `v1_5_action_field_appears_only_in_3_inventory_verbs_172` (8 negative cases + 3 positive cases)

Tests: 227/227 pass (+1 from #172).

Related: #155 (doc parity family), #168c (emission baseline).
Doc-truthfulness family: #76, #79, #82, #172.
2026-04-23 08:32:59 +09:00
YeonGyu-Kim
d9b61cc4dc roadmap(#171): file + close classifier gap for unexpected extra arguments
Cycle #97 probing #141 surface found additional classifier gap.
#171 closed by commit fbb0ab4 (same branch, same cycle).

Part of typed-error family (#121, #127, #129, #130, #164, #169, #170, #247).

#141 (list-sessions --help doesn't show help) remains open — requires
separate parser fix for --help-as-distinct-path logic.

Pinpoint count: 63 filed, 51 genuinely-open + #171 classifier closed.
2026-04-23 08:02:28 +09:00
YeonGyu-Kim
fbb0ab4be7 fix(#171): classify unexpected extra arguments errors as cli_parse
Pinpoint #171: typed-error classifier gap discovered during #141 probe cycle #97.

`claw list-sessions --help` emits:
  error: unexpected extra arguments after `claw list-sessions`: --help

This format is used by multiple verbs that reject trailing positional args:
- list-sessions
- plugins (subcommands)
- config (subcommands)
- diff
- load-session

Before fix:
  {"error": "unexpected extra arguments after `claw list-sessions`: --help",
   "hint": null,
   "kind": "unknown",
   "type": "error"}

After fix:
  {"error": "unexpected extra arguments after `claw list-sessions`: --help",
   "hint": "Run `claw --help` for usage.",
   "kind": "cli_parse",
   "type": "error"}

The pattern `unexpected extra arguments after \`claw` is specific enough
that it won't hijack generic prose mentioning "unexpected extra arguments"
in other contexts (sanity test included).

Side benefit: like #169/#170, correctly classified cli_parse errors now
auto-trigger the #247 hint synthesizer.

Related #141 gap not yet closed: `claw list-sessions --help` still errors
instead of showing help (requires separate parser fix to recognize --help
as a distinct path). This classifier fix at least makes the error surface
typed correctly so consumers can distinguish "parse failure" from "unknown"
and potentially retry without the --help flag.

Test added:
- `classify_error_kind_covers_unexpected_extra_args_171` (4 positive cases
  + 1 sanity guard)

Tests: 226/226 pass (+1 from #171).

Typed-error family: #121, #127, #129, #130, #164, #169, #170, #247.
2026-04-23 08:02:12 +09:00
YeonGyu-Kim
5736f364a9 roadmap(#153): file + close pinpoint — binary PATH instructions + verification bridge
Cycle #96 dogfood found practical install-experience gap in USAGE.md.
#153 closed by commit 6212f17 (same branch, same cycle).

Part of discoverability family (#155, help/USAGE parity).

Pinpoint count: 62 filed, 51 genuinely-open + #153 closed this cycle.
2026-04-23 07:52:41 +09:00
YeonGyu-Kim
6212f17c93 docs(#153): add binary PATH installation instructions and verification steps
Pinpoint #153 closure. USAGE.md was missing practical instructions for:
1. Adding the claw binary to PATH (symlink vs export PATH)
2. Verifying the install works (version, doctor, --help)
3. Troubleshooting PATH issues (which, echo $PATH, ls -la)

New subsections:
- "Add binary to PATH" with two common options
- "Verify install" with post-install health checks
- Troubleshooting guide for common failures

Target audience: developers building from source who want to run `claw`
from any directory without typing `./rust/target/debug/claw`.

Discovered during cycle #96 dogfood (10-min reminder cycle).
Tests: 225/225 still pass (doc-only change).
2026-04-23 07:52:16 +09:00
YeonGyu-Kim
0f023665ae roadmap(#170): file + close 4 additional classifier gaps + doc-vs-reality meta-observation
Cycle #95 dogfood probe validated #169 doctrine by finding 4 more gaps.

Meta-observation noted: the #169 comment claimed to cover --permission-mode
bogus, but the actual string pattern differs. Lesson for future classifier
patches: comments must name the EXACT matched substring, not aspirational coverage.

New kind introduced: slash_command_requires_repl (for interactive-only
slash-command misuse).

Pinpoint count: 62 filed, 52 genuinely-open + #170 closed this cycle.
2026-04-23 07:32:32 +09:00
YeonGyu-Kim
1a4d0e4676 fix(#170): classify 4 additional flag-value/slash-command errors as cli_parse / slash_command_requires_repl
Pinpoint #170: Extended typed-error classifier coverage gap discovered during
dogfood probe 2026-04-23 07:30 Seoul (cycle #95).

The #169 comment claimed to cover `--permission-mode bogus` via the
`unsupported value for --` pattern, but the actual `parse_permission_mode_arg`
message format is `unsupported permission mode 'bogus'` (NO `for --` prefix).
Doc-vs-reality lie in the #169 fix itself — fixed here.

Four classifier gaps closed:

1. `unsupported permission mode '<value>'` → cli_parse
   (from: `parse_permission_mode_arg`)
2. `invalid value for --reasoning-effort: '<value>'; must be ...` → cli_parse
   (from: `--reasoning-effort` validator)
3. `model string cannot be empty` → cli_parse
   (from: empty --model rejection)
4. `slash command /<name> is interactive-only. Start \`claw\` ...` →
   slash_command_requires_repl (NEW kind — more specific than cli_parse)

The fourth pattern gets its own kind (`slash_command_requires_repl`) because
it's a command-mode misuse, not a parse error. Downstream consumers can
programmatically offer REPL-launch guidance.
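
The four mappings above can be sketched as a small pattern table. This is an illustration under assumptions, not the actual classifier: the enum is trimmed to three members, `classify_170` is a hypothetical name, and the interactive-only rule matches only the two fragments quoted above because the rest of that message is elided here.

```rust
/// Trimmed-down, hypothetical view of the error kinds involved in #170.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorKind {
    CliParse,
    SlashCommandRequiresRepl,
    Unknown,
}

impl ErrorKind {
    /// Machine-readable labels as they appear in the JSON `kind` field.
    pub fn as_str(self) -> &'static str {
        match self {
            ErrorKind::CliParse => "cli_parse",
            ErrorKind::SlashCommandRequiresRepl => "slash_command_requires_repl",
            ErrorKind::Unknown => "unknown",
        }
    }
}

/// Substrings 1-3 from the list above, all mapped to cli_parse.
const CLI_PARSE_PATTERNS: &[&str] = &[
    "unsupported permission mode",
    "invalid value for --reasoning-effort",
    "model string cannot be empty",
];

/// Sketch of the extended classification; anything unmatched stays `Unknown`.
pub fn classify_170(message: &str) -> ErrorKind {
    // The more specific kind is checked first so it is never shadowed.
    if message.starts_with("slash command /") && message.contains("is interactive-only") {
        return ErrorKind::SlashCommandRequiresRepl;
    }
    if CLI_PARSE_PATTERNS.iter().any(|pattern| message.contains(*pattern)) {
        return ErrorKind::CliParse;
    }
    ErrorKind::Unknown
}
```

Checking the specific kind before the generic patterns is a design choice of this sketch; the commit does not say how the real branches are ordered.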

Side benefit: like #169, the correctly classified cli_parse errors now
auto-trigger the #247 hint synthesizer ("Run `claw --help` for usage.").

Test added:
- `classify_error_kind_covers_flag_value_parse_errors_170_extended`
  (4 positive cases + 2 sanity guards)

Tests: 225/225 pass (+1 from #170).

Typed-error family: #121, #127, #129, #130, #164, #169, #247.

Discovered via systematic probe angle 'error message pattern audit':
grep each error emission for pattern, confirm classifier matches.
2026-04-23 07:32:10 +09:00
YeonGyu-Kim
b8984e515b roadmap(#169): file + close pinpoint — invalid CLI flag values now classify as cli_parse
Documents #169 discovery during dogfood probe 2026-04-23 07:00 Seoul.

Pinpoint #169 closed by commit 834b0a9 (same branch, same cycle).

Part of typed-error family (#121, #127, #129, #130, #164, #247).

Pinpoint count: 61 filed, 52 genuinely-open + 1 closed in this cycle.
2026-04-23 07:04:07 +09:00
YeonGyu-Kim
834b0a91fe fix(#169): classify invalid/missing CLI flag values as cli_parse
Pinpoint #169: typed-error classifier gap discovered during dogfood probe.

`claw --output-format json --output-format xml doctor` was emitting:
  {"error": "unsupported value for --output-format: xml ...",
   "hint": null,
   "kind": "unknown",
   "type": "error"}

After fix:
  {"error": "unsupported value for --output-format: xml ...",
   "hint": "Run `claw --help` for usage.",
   "kind": "cli_parse",
   "type": "error"}

The change adds two new classifier branches to `classify_error_kind`:
1. `unsupported value for --` → cli_parse
2. `missing value for --` → cli_parse

Covers all `CliOutputFormat::parse` / `parse_permission_mode_arg` rejections
and any future flag-value validation messages using the same pattern.

Side benefit: the #247 hint synthesizer ("Run `claw --help` for usage.")
now triggers automatically because the error is now correctly classified
as cli_parse. Consumers get both correct kind AND helpful hint.
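
A minimal sketch of how the two branches and the hint side effect fit together. The function names match those mentioned in this log, but the signatures, the two-member enum, and the `synthesize_hint` helper are assumptions made for illustration, not the code in the repository.

```rust
/// Hypothetical, trimmed-down set of error kinds (the real set has more members).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ErrorKind {
    CliParse,
    Unknown,
}

/// Sketch of the two substring branches named above; the signature is assumed.
pub fn classify_error_kind(message: &str) -> ErrorKind {
    const CLI_PARSE_PATTERNS: [&str; 2] = ["unsupported value for --", "missing value for --"];
    if CLI_PARSE_PATTERNS.iter().any(|pattern| message.contains(*pattern)) {
        ErrorKind::CliParse
    } else {
        ErrorKind::Unknown
    }
}

/// Stand-in for the #247 hint synthesizer: only cli_parse errors receive the usage hint.
pub fn synthesize_hint(kind: ErrorKind) -> Option<&'static str> {
    match kind {
        ErrorKind::CliParse => Some("Run `claw --help` for usage."),
        ErrorKind::Unknown => None,
    }
}
```

Because the hint is derived from the kind, fixing the classification is enough to turn `"hint": null` into the usage hint without touching the hint logic.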

Test added:
- `classify_error_kind_covers_flag_value_parse_errors_169` (4 positive +
  1 sanity case)

Tests: 224/224 pass (+1 from #169).

Discovered during dogfood probe 2026-04-23 07:00 Seoul, cycle #94.

Refs: #169, typed-error family (#121, #127, #129, #130, #164, #247)
2026-04-23 07:03:40 +09:00
YeonGyu-Kim
80f9914353 docs(#155): add missing slash command documentation to USAGE.md
Pinpoint #155: USAGE.md was missing documentation for three interactive
commands that appear in `claw --help`:
- /ultraplan [task]
- /teleport <symbol-or-path>
- /bughunter [scope]

Also adds full documentation for other underdocumented commands:
- /commit, /pr, /issue, /diff, /plugin, /agents

Converts inline sentence list into structured section 'Interactive slash
commands (inside the REPL)' with brief descriptions for each command.

Closes #155 gap: discovered during dogfood probing of help/USAGE parity.

No code changes. Pure documentation update.
2026-04-23 06:50:47 +09:00
YeonGyu-Kim
94f9540333 test(#168c Task 4): add v1.5 emission baseline shape parity guard
Phase 0 Task 4 of the JSON Productization Program: CI shape parity guard.

This test locks the v1.5 emission baseline (documented in SCHEMAS.md § v1.5
Emission Baseline) so any future PR that introduces shape drift in a documented
verb fails this test at PR time.

Complements Task 2 (no-silent guarantee) by asserting SPECIFIC top-level key
sets, not just 'stdout is non-empty valid JSON'. If a verb adds/removes a
top-level field, this test fails with a clear error message pointing to
SCHEMAS.md § v1.5 Emission Baseline for update guidance.

Coverage:
- 8 success-path verbs with locked shape (help, version, doctor, skills,
  agents, system-prompt, bootstrap-plan, list-sessions)
- 2 error-path cases with locked error envelope shape (prompt-no-arg, doctor --foo)

Key enforcement rules:
- Success envelope: exact key set match per verb
- Error envelope: {error, hint, kind, type} (4 keys, all verbs)
- list-sessions deliberately kept as {command, sessions} (Phase 1 target)
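
The exact-key-set rule can be sketched as a set comparison. `check_shape` and `key_set` are hypothetical helpers, and the wording of the failure message is an assumption. The real test extracts keys from the binary's JSON output, which this sketch replaces with plain string slices.

```rust
use std::collections::BTreeSet;

/// Builds an ordered set of top-level keys from a slice of names.
pub fn key_set(keys: &[&str]) -> BTreeSet<String> {
    keys.iter().map(|key| key.to_string()).collect()
}

/// Compares the locked key set against the observed one and reports drift in both directions.
pub fn check_shape(verb: &str, expected: &[&str], actual: &[&str]) -> Result<(), String> {
    let expected = key_set(expected);
    let actual = key_set(actual);
    if expected == actual {
        return Ok(());
    }
    let missing: Vec<String> = expected.difference(&actual).cloned().collect();
    let unexpected: Vec<String> = actual.difference(&expected).cloned().collect();
    Err(format!(
        "shape drift in `{}`: missing {:?}, unexpected {:?}; see SCHEMAS.md § v1.5 Emission Baseline",
        verb, missing, unexpected
    ))
}
```

Comparing sets rather than ordered lists keeps the guard insensitive to key order, so only additions and removals trip it.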

Test design intent:
- Locks CURRENT (possibly imperfect) shape, NOT target shape
- Forces PR authors to update both code + SCHEMAS.md + test together
- Makes Phase 1 shape normalization PRs visible: 'update this test'

Phase 0 now COMPLETE:
- Task 1: Stream routing fix (cycle #89)
- Task 2: No-silent guarantee (cycle #90)
- Task 3: Per-verb emission inventory SCHEMAS.md (cycle #91)
- Task 4: CI shape parity guard (this cycle)

Tests: 18 output_format_contract tests all pass (+1 from Task 4).
v1.5 emission baseline now locked by code + tests + docs.

Refs: #168c, cycle #92, Phase 0 Task 4 (final)
2026-04-23 06:38:18 +09:00
YeonGyu-Kim
e1b0dbf860 docs(#168c Task 3): add v1.5 Emission Baseline per-verb shape catalog to SCHEMAS.md
Phase 0 Task 3 of the JSON Productization Program: per-verb emission inventory.

Documents the actual binary behavior as of v1.5 (post-#168c fix, pre-Phase 1
shape normalization). Reference artifact for consumers building against v1.5,
not a target schema.

Catalog contents:
- 12 verbs using 'kind' field (help, version, doctor, mcp, skills, agents,
  sandbox, status, system-prompt, bootstrap-plan, export, acp)
- 1 verb using 'command' field (list-sessions) — Phase 1 normalization target
- 3 error-only verbs in test env (bootstrap, dump-manifests, state)
- Standard error envelope: {error, hint, kind, type} flat shape
- 9 machine-readable error kinds from classify_error_kind

Emission contract locked by:
- Task 1 (#168c routing fix, cycle #89)
- Task 2 (no-silent guarantee test, cycle #90)
- This catalog (human-readable reference, cycle #91)

Consumer guidance + Phase 1 normalization targets documented.

Phase 0 progress:
- Task 1 Stream routing fix
- Task 2 No-silent guarantee test
- Task 3 Per-verb emission inventory
- Task 4 pending: CI parity test

Refs: #168c, cycle #91, Phase 0 Task 3
2026-04-23 06:36:01 +09:00
YeonGyu-Kim
90c4fd0b66 test(#168c Task 2): add no-silent emission contract guard for 14 verbs
Phase 0 Task 2 of the JSON Productization Program: no-silent guarantee.

The emission contract under --output-format json requires:
1. Success (exit 0) must produce non-empty stdout with valid JSON
2. Failure (exit != 0) must still emit JSON envelope on stdout (#168c)
3. Silent success (exit 0 + empty stdout) is forbidden

This test iterates 12 safe-success verbs + 2 error cases, asserting each
produces valid JSON on stdout. Any verb that regresses to silent emission
or wrong-stream routing will fail this test.
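
The three contract rules can be sketched as a single check over a captured run. `RunOutcome`, `ContractViolation`, and `check_emission` are hypothetical names, and the brace test is only a structural stand-in for real JSON validation, which the actual test would presumably do with a JSON parser.

```rust
/// Hypothetical capture of one `claw ... --output-format json` invocation.
pub struct RunOutcome {
    pub exit_code: i32,
    pub stdout: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum ContractViolation {
    /// Rule 3: exit 0 with empty stdout is forbidden.
    SilentSuccess,
    /// Rule 2: a failing verb must still emit its envelope on stdout.
    MissingEnvelopeOnFailure,
    /// Rules 1 and 2: whatever is on stdout must be a JSON object.
    NotJsonObject,
}

/// Applies the emission contract to a single outcome.
pub fn check_emission(outcome: &RunOutcome) -> Result<(), ContractViolation> {
    let body = outcome.stdout.trim();
    if body.is_empty() {
        return Err(if outcome.exit_code == 0 {
            ContractViolation::SilentSuccess
        } else {
            ContractViolation::MissingEnvelopeOnFailure
        });
    }
    // Structural approximation only: a real guard would parse the body as JSON.
    if !(body.starts_with('{') && body.ends_with('}')) {
        return Err(ContractViolation::NotJsonObject);
    }
    Ok(())
}
```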

Covered verbs:
- Success: help, version, list-sessions, doctor, mcp, skills, agents,
  sandbox, status, system-prompt, bootstrap-plan, acp
- Error: prompt (no arg), doctor --foo

Phase 0 progress:
- Task 1: Stream routing (#168c fix)
- Task 2: No-silent guarantee (this test)
- Task 3: Per-verb emission inventory (SCHEMAS.md)
- Task 4: CI parity test (regression prevention)

Tests: 17 output_format_contract tests all pass (+1 from Task 2).

Refs: #168c, cycle #90, Phase 0 Task 2
2026-04-23 06:31:44 +09:00
YeonGyu-Kim
6870b0f985 fix(#168c): emit error envelopes to stdout under --output-format json
Under --output-format json, error envelopes were emitted to stderr via
eprintln!. This violated the emission contract: stdout should carry the
contractual envelope (success OR error); stderr is reserved for
non-contractual diagnostics.

Cycle #87 controlled matrix audit found bootstrap/dump-manifests/state
exhibited this pattern (exit 1, stdout 0 bytes, stderr N bytes under
--output-format json).

Fix: change eprintln! to println! for the JSON error envelope path in main().
Text mode continues to route errors to stderr (conventional).
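
A sketch of the routing rule, assuming writer-generic signatures so it can be exercised without a real process. `emit_error`, `OutputFormat`, and `escape_json` are hypothetical names. The text-mode line only reproduces the `[error-kind: ...]` prefix mentioned below, with the remainder of its format assumed, and the escaping is deliberately minimal where real code would use a JSON library.

```rust
use std::io::{self, Write};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OutputFormat {
    Text,
    Json,
}

/// Minimal escaping for the sketch: backslashes first, then double quotes.
fn escape_json(raw: &str) -> String {
    raw.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Routes an error to the contractual stream for the selected output format.
pub fn emit_error<O: Write, E: Write>(
    format: OutputFormat,
    kind: &str,
    message: &str,
    stdout: &mut O,
    stderr: &mut E,
) -> io::Result<()> {
    match format {
        // JSON mode: the envelope is contractual output, so it goes to stdout.
        OutputFormat::Json => writeln!(
            stdout,
            "{{\"error\":\"{}\",\"hint\":null,\"kind\":\"{}\",\"type\":\"error\"}}",
            escape_json(message),
            escape_json(kind)
        ),
        // Text mode: errors stay on stderr, as before the fix.
        OutputFormat::Text => writeln!(stderr, "[error-kind: {}] error: {}", kind, message),
    }
}
```

Passing the writers in keeps the routing decision testable with in-memory buffers in place of the process streams.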

Verification:
- bootstrap --output-format json: stdout now carries envelope, exit 1
- dump-manifests --output-format json: stdout now carries envelope, exit 1
- Text mode: errors still on stderr with [error-kind: ...] prefix (no regression)

Tests:
- Updated assert_json_error_envelope helper to read from stdout (was stderr)
- Added error_envelope_emitted_to_stdout_under_output_format_json_168c
  regression test that asserts envelope on stdout + non-JSON on stderr
- All 16 output_format_contract tests pass

Phase 0 Task 1 complete: emission routing fixed across all error-path verbs.
Phase 0 Task 2 (no-silent CI guarantee) remains.

Refs: #168c (cycle #87 filing), cycle #88 emission contract framing
2026-04-23 06:03:31 +09:00
YeonGyu-Kim
3311266b59 roadmap: Phase 0 locked as 'JSON emission baseline stabilization' (cycle #88)
Per gaebal-gajae framing: Phase 0 addresses EMISSION (stream routing + exit code +
no-silent guarantee), not SHAPE (which moves to Phase 1).

Phase 0 subtasks (1.25 days total):
1. Stream routing fix — bootstrap/dump-manifests/state stderr → stdout for JSON
2. No-silent guarantee — CI asserts every verb emits valid JSON or exits non-zero
3. Per-verb emission inventory — authoritative catalog artifact
4. CI parity test — prevent regressions

Phase 1 now owns shape normalization (list-sessions 'command' → 'kind').
Phase 0 owns emission stability; Phase 1 owns shape consistency; Phase 2+ handles envelope wrapping.

#168b formally closed as INVALID (cycle #84 misread; stderr output routing is real
issue, now tracked as #168c).

Revised pinpoint accounting:
- Filed: 60 (audit trail includes #168b as invalid)
- Genuinely-open: 52
- Phase 0 active: #168c + emission CI
- Phase 1 active: #168a
2026-04-23 05:52:27 +09:00
YeonGyu-Kim
cd6e1cea6f roadmap: #168 split into #168a/#168b/#168c after controlled matrix audit (cycle #87)
Controlled matrix (/tmp/cycle87-audit/matrix.json) tested 16 verbs x 2 envs = 32 cases.

Results:
- #168a CONFIRMED: per-command shape divergence real (13 unique shapes across 13 verbs)
- #168b REFUTED: bootstrap does NOT silent-fail. Exit=1 stderr=483 bytes (not silent).
  Cycle #84 misread exit code (claimed 0, actually 1) and missed stderr output.
- #168c NEW: bootstrap/dump-manifests/state write plain stderr under --output-format json

Phase 0 reworded: 'Fix bootstrap silent failure' (inaccurate) → 'Controlled JSON
baseline audit + minimum invariant normalization' (accurate).

Concrete Phase 0 work (1.5 days):
- Normalize list-sessions 'command' → 'kind' (align with 12/13 verbs)
- Normalize stderr output to JSON for bootstrap/dump-manifests/state
- Document v1.5 baseline shape catalog in SCHEMAS.md
- Add shape parity CI test

Controlled revalidation (per gaebal-gajae cycle #87 direction) prevented Phase 0
from being anchored to a refuted bug. #168b is now closed as refuted; #168a and
#168c are the actual Phase 0 targets.
2026-04-23 05:50:52 +09:00
YeonGyu-Kim
f30aa0b239 roadmap: #168b filed - cycle #86 fresh-dogfood contradicts cycle #84 bootstrap claim (revalidation)
2026-04-23 05:48:56 +09:00
YeonGyu-Kim
7f63e22f29 roadmap: promote #164 from locus to 'JSON Productization Program' (cycle #85b)
gaebal-gajae review reframed the work: this is not 'schema drift management'
but a 'JSON productization program' — taking JSON output from bespoke/incoherent
to reliable/contractual as a product.

Promotion trigger: Fresh-dogfood evidence (#168) proved v1.0 was never coherent.
Migration isn't just schema change; it's productizing JSON output.

Program structure:
- Phase 0: Emergency stabilization (fix #168 bootstrap silent failure)
- Phase 1: v1.5 baseline (normalize invariants across all 14 verbs)
- Phase 2: v2.0 opt-in wrapped envelope
- Phase 3: v2.0 default
- Phase 4: v1.0/v1.5 deprecation

Umbrellas 9+ related pinpoints under coordinated program (#164, #167, #168,
#102, #121, #127, #129, #130, #245).

Program doctrine locked:
1. Fresh-dogfood before migration
2. Honest effort estimates
3. Consumer-first design
4. Evidence-driven revision
5. Documentation as product

Next concrete action: Phase 0 — implement #168 bootstrap JSON fix.
Success metric: A claw can write ONE parser for ALL clawable commands.
2026-04-23 05:34:29 +09:00
YeonGyu-Kim
771d2ffd04 locus(#164): add Phase 0 + v1.5 baseline; revised from 2-phase to 4-phase migration (cycle #85)
Fresh-dogfood validation (cycle #84, #168) proved the original locus premise was
underspecified. v1.0 was never a coherent contract — each verb has a bespoke JSON
shape with no coordination, and bootstrap JSON is completely broken (silent
failure, exit 0 no output).

Revised migration plan:
- Phase 0 (NEW): Emergency fix for silent failures (#168 bootstrap JSON)
- Phase 1 (NEW): v1.5 baseline — minimal JSON invariants across all 14 verbs
  - Every command emits valid JSON with --output-format json
  - Every command has top-level 'kind' field for verb ID
  - Every error envelope follows {error, hint, kind, type}
- Phase 2 (renamed from Phase 1): v2.0 wrapped envelope (opt-in)
- Phase 3 (renamed from Phase 2): v2.0 default
- Phase 4 (renamed from Phase 3): v1.0/v1.5 deprecation
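
The Phase 1 invariants above can be sketched as a check over an envelope's top-level keys. This is a minimal illustration, assuming the output has already been parsed as valid JSON and reduced to its set of top-level field names; `check_v15_invariants` and its `is_error` parameter are hypothetical names, not code from the repository.

```rust
use std::collections::BTreeSet;

/// Returns the v1.5 invariant violations for one parsed envelope.
/// `keys` holds the envelope's top-level field names. `is_error` says
/// whether the envelope reports a failure (hypothetical: the caller
/// decides how to detect that).
fn check_v15_invariants(keys: &BTreeSet<&str>, is_error: bool) -> Vec<String> {
    let mut violations = Vec::new();

    // Invariant: every command has a top-level 'kind' field.
    if !keys.contains("kind") {
        violations.push("missing top-level 'kind'".to_string());
    }

    // Invariant: every error envelope follows {error, hint, kind, type}.
    // 'kind' is already covered above, so only the rest is checked here.
    if is_error {
        for required in ["error", "hint", "type"] {
            if !keys.contains(required) {
                violations.push(format!("error envelope missing '{required}'"));
            }
        }
    }

    violations
}
```

Run against the `list-sessions` shape recorded in #168 (`{command, sessions}`), a check like this would report the missing `kind` field, which is the kind of divergence a shape-parity test is meant to catch.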

Rationale:
- Can't migrate from 'incoherent' to 'coherent v2.0' in one jump
- Consumers need a stable intermediate baseline (v1.5) to transition from
- Silent failures must be fixed BEFORE migration (consumers can't detect breakage)

Effort revision: ~9 dev-days (Phase 0: 1 + Phase 1: 3 + Phase 2: 5) vs original
~6 dev-days for direct v1.0→v2.0 (which would have failed).

Doctrine implication: Fresh-dogfood principle (#9, cycle #73) prevented a multi-day
migration from hitting an unsolvable baseline problem. Evidence-backed mid-design
correction.
2026-04-23 05:32:48 +09:00
YeonGyu-Kim
562f19bcff roadmap: #168 filed — JSON envelope shape inconsistent per-command; bootstrap broken (cycle #84)
Fresh dogfood validation (cycle #84) revealed the binary v1.0 envelope is NOT
consistent across commands:

- list-sessions: {command, sessions}
- doctor: {checks, kind, message, ...}
- bootstrap: (no JSON output at all)
- mcp: {action, kind, status, ...}

Each command has a custom JSON shape. Bootstrap's JSON path is completely broken
(exit 0 but no output). This is not 'v1.0 vs v2.0 design difference' — it's
'no consistent v1.0 ever existed'.

This explains why #164 (envelope migration) is blocked on design: the 'v1.0 from'
was never coherent. The real task is not 'migrate v1.0 to v2.0' but 'migrate
incoherent-per-command shapes to coherent-common-envelope'.

Implications for cycles #76–#82: The P0 doc fixes were correct to mark SCHEMAS.md
as 'aspirational' because the binary never had a consistent contract to document.
The deeper issue: each verb renderer was written independently with no envelope
coordination.

Three options proposed:
- A: accept per-command shapes (status quo + documentation)
- B: enforce common wrapper (FIX_LOCUS_164 full approach)
- C: hybrid (document current incoherence, then migrate 3 pilot verbs)

Recommendation: Option C. Documents truth immediately, enables phased migration.

This filing resolves the #164 design blocker: now we understand what we're
migrating from.
2026-04-23 05:31:09 +09:00
YeonGyu-Kim
43bbf43f01 roadmap: #167 filed — text output format has no contract (cycle #83)
SCHEMAS.md locks JSON envelope contract for all 14 clawable commands.
No corresponding contract for text output (--output-format text).

Text output is ad hoc per-command: no documented format, no column ordering
guarantee, no stability contract. Claws parsing text output have no stability guarantee.

Filed as discovery gap from systematic doc audit (cycle #83). Design options:
- Option A: Document text contracts (parallel to JSON) — 4 dev-days
- Option B: Declare text unstable, point to JSON — 1 dev-day (recommended)
- Option C: Defer until post-#164 JSON migration

Related to #164 (JSON migration) and #250 (surface parity audit).
2026-04-23 05:29:45 +09:00
YeonGyu-Kim
8322bb8ec6 roadmap: #166 closed — SCHEMAS.md source misdoc fixed (P0 root cause)
The aspirational SCHEMAS.md doc (v2.0 target) is the source-of-truth document, and it
was the origin of the misdocumentation. Three downstream docs (USAGE, ERROR_HANDLING,
CLAUDE) inherited the false claim that the v1.0 binary emits common fields it doesn't
actually emit.

Fixing SCHEMAS.md at the source eliminates the root cause for all four P0 instances.

Doc-truthfulness P0 family now complete: 4/4 closed, root cause identified + fixed.
All fixes shipped within 6 cycles (#76 audit → #82 execution).
2026-04-23 05:21:22 +09:00
YeonGyu-Kim
4c9a0a9992 docs: SCHEMAS.md — critical P0 fix: mark as target v2.0, not current v1.0 (#166 filed+closed)
SCHEMAS.md was presenting the target v2.0 schema as the current binary contract.
This is the source of truth document, so the misdocumentation propagated to every
downstream doc (USAGE.md, ERROR_HANDLING.md, CLAUDE.md all inherited the false
premise that v1.0 includes timestamp/command/exit_code/etc).

Fixed with:
1. CRITICAL header at top: marks entire doc as v2.0 target, not v1.0 reality
2. 'TARGET v2.0 SCHEMA' headers on Common Fields section
3. Comprehensive Appendix: v1.0 actual shape + migration timeline + v1.0 code example
4. Links to FIX_LOCUS_164.md + ERROR_HANDLING.md for v1.0 reality
5. FAQ: clarifies the version mismatch and when v2.0 ships

This closes the fourth P0 doc-truthfulness instance (4/4 in family):
- #78 USAGE.md: active misdocumentation (fixed #78)
- #79 ERROR_HANDLING.md: copy-paste trap (fixed #79)
- #165 CLAUDE.md: boundary collapse (fixed #81)
- #166 SCHEMAS.md: aspirational source doc (fixed #82)

Pattern is now crystallized: SCHEMAS.md was the aspirational source;
three downstream docs (USAGE, ERROR_HANDLING, CLAUDE) inherited the false v2.0-as-v1.0
claim. Fix the source (SCHEMAS.md), which eliminates the root cause for all four.
2026-04-23 05:21:07 +09:00
YeonGyu-Kim
86db2e0b03 roadmap: #165 closed with evidence (cycle #81, commit 1a03359)
CLAUDE.md Option A implemented. P0 doc-truthfulness family now at 3 closed +
0 open (all 3 fixed within the same dogfood session).

Taxonomy refinement added: P0 doc-truthfulness has three distinct subclasses:
- active misdocumentation (false sentence) — USAGE.md cycle #78
- copy-paste trap (broken example code) — ERROR_HANDLING.md cycle #79
- target/current boundary collapse (v2.0 as v1.0) — CLAUDE.md cycle #81

All three related to #164 (envelope divergence). Root cause consistent across
family; remedies differ per subclass.
2026-04-23 05:11:42 +09:00
YeonGyu-Kim
1a03359bb4 docs: CLAUDE.md — fix target/current boundary collapse (#165 Option A)
CLAUDE.md was documenting the v2.0 target schema as if it were current binary
behavior. This misled validator/harness implementers into assuming the Rust
binary emits timestamp, command, exit_code, output_format, schema_version fields
when it doesn't.

Fixed by explicitly marking the boundary:
1. SCHEMAS.md section: now clearly labels 'target v2.0 design' and lists both
   v1.0 (actual binary) and v2.0 (target) field shapes
2. Clawable commands requirements: now explicitly separates v1.0 (current) and
   v2.0 (post-FIX_LOCUS_164) envelope requirements
3. Added inline migration note pointing to FIX_LOCUS_164.md

This closes #165 as the third P0 doc-truthfulness fix (Option A: preserve current
truth, add v2.0 target as separate labeled section).

P0 doc-truthfulness family pattern (all three related to #164 envelope divergence):
- #78 USAGE.md: active misdocumentation (fixed cycle #78)
- #79 ERROR_HANDLING.md: copy-paste trap (fixed cycle #79)
- #165 CLAUDE.md: target/current boundary collapse (fixed cycle #81)
2026-04-23 05:11:14 +09:00
YeonGyu-Kim
b34f370645 roadmap: #165 filed — CLAUDE.md documents v2.0 schema as current (P0 active misdoc)
CLAUDE.md claims 'Common fields (all envelopes): timestamp, command, exit_code,
output_format, schema_version' but the actual binary v1.0 doesn't emit these.

This is aspirational (v2.0 target from SCHEMAS.md) documented as current behavior
in a file that's supposed to describe the Python reference harness.

Filed as 3rd member of doc-truthfulness P0 family (joins #78, #79).
Both options documented: update CLAUDE.md for v1.0 OR clarify it's v2.0 aspirational.
Recommendation: Option A (keep CLAUDE.md truthful about actual validation).

Part of broader #164 family (envelope schema divergence across all docs).
2026-04-23 05:10:01 +09:00
YeonGyu-Kim
a9e87de905 roadmap: doctrine refinement — doc-truthfulness severity scale (cycle #79)
Formalizes a 4-level severity scale for documentation-vs-implementation divergence:
- P0: Active misdocumentation (consumer code breaks) — immediate fix
- P1: Stale docs (consumer confused) — high priority
- P2: Incomplete docs (friction, eventual success) — medium
- P3: Terminology drift (confusion but survivable) — low

Parallel to diagnostic-strictness scale (cycles #57–#69). Both are
'truth-over-convenience' constraints.

Evidence: cycles #78–#79 found 2 P0 instances in USAGE.md and ERROR_HANDLING.md,
both related to JSON envelope shape. Root cause: SCHEMAS.md is aspirational (v2.0),
binary still emits v1.0, docs needed to be empirical not aspirational.

Going forward: doc audits compare against actual binary, flag P0 violations
immediately, link forward to migration plans (FIX_LOCUS_164.md).
2026-04-23 05:00:55 +09:00
YeonGyu-Kim
0929180ba8 docs: ERROR_HANDLING.md — fix code examples to match v1.0 envelope (flat shape)
The Python code examples were accessing nested error.kind like envelope['error']['kind'],
but v1.0 emits flat envelopes with error as a STRING and kind at top-level.

Updated:
- Table header: now shows actual v1.0 shape {error: "...", kind: "...", type: "error"}
- match statement: switched from envelope.get('error',{}).get('kind') to envelope.get('kind')
- All ClawError raises: changed from envelope['error']['message'] to envelope.get('error','')
  because error field is a STRING in v1.0, not a nested object
- Added inline comments on every error case noting v1.0 vs v2.0 difference
- Appendix: split into v1.0 (actual/current) and v2.0 (target after FIX_LOCUS_164)

The code examples now work correctly against the actual binary.
This was active misdocumentation (P0 severity) — the Python examples would crash
if a consumer tried to use them.
2026-04-23 05:00:33 +09:00
YeonGyu-Kim
98c675b33b docs: USAGE.md — clarify JSON v1.0 envelope shape + migration notice for #164
The JSON output section was misleading — it claimed the binary emits
exit_code, command, timestamp, output_format, schema_version, and nested
error objects. The binary actually emits v1.0 flat shape (kind at top-level,
error as string, no common metadata fields).

Updated section:
- Documents actual v1.0 success and error envelope shapes
- Lists known issues (missing fields, overloaded kind, flat error)
- Shows how to dispatch on v1.0 (check type=='error' before reading kind)
- Warns users NOT to rely on kind alone
- Links to FIX_LOCUS_164.md for migration plan
- Explains Phase 1/2/3 timeline for v2.0 adoption

This is a doc-only fix that makes USAGE.md truthful about the current behavior
while preparing users for the coming schema migration.
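
The dispatch rule described above (check `type == "error"` before reading `kind`) can be sketched as follows. This is an illustration only: `FlatEnvelope`, `Outcome`, and `dispatch` are hypothetical names, and the struct assumes the JSON has already been decoded into optional string fields.

```rust
/// Flat v1.0 envelope as described above: `kind` at top level,
/// `error` as a plain string, `type == "error"` marking failures.
/// Illustrative only; not a type from the repository.
#[derive(Debug, Default)]
struct FlatEnvelope {
    kind: Option<String>,
    error: Option<String>,
    type_field: Option<String>, // the JSON field is named `type`
}

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Success envelope: `kind` identifies the verb.
    Success { verb: String },
    /// Error envelope: `kind` is the error kind, `error` the message.
    Failure { error_kind: String, message: String },
}

/// Reads `type` first, because `kind` means different things in
/// success and error envelopes.
fn dispatch(env: &FlatEnvelope) -> Outcome {
    let kind = env.kind.clone().unwrap_or_default();
    if env.type_field.as_deref() == Some("error") {
        Outcome::Failure {
            error_kind: kind,
            message: env.error.clone().unwrap_or_default(),
        }
    } else {
        Outcome::Success { verb: kind }
    }
}
```

Branching on `type` before `kind` is the point of the sketch: a consumer that matches on `kind` alone cannot tell a verb id from an error kind.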
2026-04-23 04:52:17 +09:00
YeonGyu-Kim
afc792f1a5 docs: add FIX_LOCUS_164.md — JSON envelope contract migration strategy
Cycle #77 deliverable. Escalates #164 from pinpoint to fix-locus cycle.

Documents:
- 100% divergence across all 14 JSON-emitting verbs (not a partial drift)
- Two envelope shapes: current flat vs. documented nested
- Phased migration: dual-mode → default bump → deprecation (3 phases)
- Shared wrapper helper pattern (json_envelope.rs)
- Per-verb migration template (before/after code)
- Error classification remapping table (cli_parse → parse, etc.)
- 6 acceptance criteria + 3 risk categories
- Rollout timeline: Phase 1 ~6 dev-days, v3.0 cutoff at ~8 months

Ready for author review + pilot implementation decision (which 3 verbs lead).
2026-04-23 04:34:57 +09:00
YeonGyu-Kim
5b9097a7ac roadmap: #164 filed — JSON envelope schema-vs-binary divergence
Binary emits different envelope shape than SCHEMAS.md documents:
- Missing: timestamp, command, exit_code, output_format, schema_version
- Wrong placement: kind is top-level, not nested under error
- Extra: type:error field not in schema
- Wrong type: error is string, not object with operation/target/retryable

Additional issue: 'kind' field is semantically overloaded (verb-id in
success envelopes, error-kind in error envelopes) — violates typed contract.

Filed as 7th member of typed-error family (joins #102, #121, #127, #129, #130, #245).
Recommended fix: Option A — update binary to match schema (principled design).
2026-04-23 04:31:53 +09:00
YeonGyu-Kim
69a15bd707 roadmap: cycle #75 finding — rebase-bridge pattern breaks on multi-conflict branches
Attempted cherry-pick of #248 (1 commit) onto main. Encountered 2 conflict zones
in main.rs (test definitions + error classification). Manual regex cleanup left
orphaned diff markers that Rust compiler rejected.

Decision: Rebase-bridge works for 1-conflict branches, but 2+ conflicts in 12K+-line
files require author context. Revised strategy: push main to origin, request branch
authors rebase locally with IDE support, then merge from updated origin branches.

Estimated timeline: 30 min for branch authors to rebase 8 branches in parallel.
2026-04-23 04:26:21 +09:00
YeonGyu-Kim
41c87309f3 roadmap: cycle #74 checkpoint — rebase blocker identified
Fresh dogfood found no new pinpoints. All core verbs working correctly.

Blocker: 8 remaining review-ready branches on origin have conflicts with
cycle #72's 4 merges. Root cause: remote branches predated the merge chain.

Example: feat/jobdori-127-verb-suffix-flags rebase fails on commit 3/3
because cycle #72 added 15+ new LocalHelpTopic variants.

Recommend: coordinate with branch authors to rebase against new main.
Cycle #74 will post integration checkpoint + queue status.
2026-04-23 04:17:54 +09:00
YeonGyu-Kim
a02527826e roadmap: #163 closed as already-fixed — #130e-A (merged cycle #72) handled help --help
Backlog-truthfulness (cycle #60) validated: fresh dogfood on current main confirmed
#163 was closed by cycle #72's help-parity chain merge. Zero duplicate work.

Cleanup: removed /tmp/jobdori-163 worktree and fix/jobdori-163-help-help-selfref branch.
2026-04-23 04:07:37 +09:00
YeonGyu-Kim
a52a361e16 roadmap: cycle #72 — 4 merges landed, 9 branches integrated via MERGE_CHECKLIST runbook 2026-04-23 04:04:57 +09:00
YeonGyu-Kim
d5373ac5d6 merge: fix/jobdori-161-worktree-git-sha — diagnostic-strictness family
Fix: resolve actual HEAD path in git worktrees for correct Git SHA in build metadata.
In worktrees, .git is a pointer file not a directory, so cargo's rerun-if-changed=.git/HEAD never triggers.

Per MERGE_CHECKLIST.md Cluster 2 (P1 Diagnostic-strictness, isolated):
- 25 lines in build.rs only (no crate-level conflicts)
- Verified: build → commit → rebuild → SHA updates correctly

Diagnostic-strictness family member (joins #122/#122b).

Applied: execution artifact runbook. Cycle #72 integration.
2026-04-23 04:04:17 +09:00
YeonGyu-Kim
a6f4e0d8d1 merge: feat/jobdori-130e-surface-help — help-parity cluster + #251 session-dispatch
Contains linear chain of 6 fixes:
- #251: intercept session-management verbs at top-level parser (dc274a0)
- #130b: enrich filesystem I/O errors with operation + path context (d49a75c)
- #130c: accept --help / -h in claw diff arm (83f744a)
- #130d: accept --help / -h in claw config arm, route to help topic (19638a0)
- #130e-A: route help/submit/resume --help to help topics before credential check (0ca0344)
- #130e-B: route plugins/prompt --help to dedicated help topics (9dd7e79)

Per MERGE_CHECKLIST.md:
- Cluster 1 (Typed-error): #251 (session-dispatch)
- Cluster 3 (Help-parity): #130b, #130c, #130d, #130e-A/B

All changes are in rust/crates/rusty-claude-cli/src/main.rs (dispatch/help routing).
No test regressions expected (fixes add new guards, don't modify existing paths).

Applied: execution artifact runbook. Cycle #72 integration.
2026-04-23 04:03:40 +09:00
YeonGyu-Kim
378b9bf533 merge: docs/jobdori-162-usage-verb-parity — document dump-manifests/bootstrap-plan/acp/export
Completes discoverability chain for 4 verbs:
- dump-manifests — upstream manifest export
- bootstrap-plan — startup component graph
- acp — Zed editor integration status (tracking #76)
- export — session transcript export

Per MERGE_CHECKLIST.md Cluster 6 (P3 Doc-truthfulness):
- Diff: +87 lines in USAGE.md (doc-only)
- Zero code risk
- Parity audit: 12/12 verbs documented (was 8/12)

Applied: execution artifact runbook. Cycle #72 integration.
2026-04-23 04:02:47 +09:00
YeonGyu-Kim
66765ea96d merge: docs/parity-update-2026-04-23 — refresh PARITY.md stats for 2026-04-23
Growth since 2026-04-03:
- Rust LOC: 48,599 → 80,789 (+66%)
- Test LOC: 2,568 → 4,533 (+76%)
- Commits: 292 → 979 (+235%)

Per MERGE_CHECKLIST.md Cluster 6 (P3 Doc-truthfulness, low-risk):
- Diff: 4 lines in PARITY.md only
- Zero code risk
- Merge-ready

Applied: execution artifact runbook. Cycle #72 integration.
2026-04-23 04:02:38 +09:00
YeonGyu-Kim
c5b6fa5be3 fix(#161): resolve actual HEAD path in git worktrees for correct Git SHA in build metadata
Problem: In git worktrees, .git is a pointer file (not a directory), so cargo's
rerun-if-changed=.git/HEAD never triggers when commits are made. This causes
claw version to report a stale SHA after new commits.

Solution: Add resolve_git_head_path() helper that detects worktree mode:
- If .git is a file: parse gitdir pointer, watch <gitdir>/HEAD
- If .git is a directory: watch .git/HEAD (regular repo)

This ensures build.rs invalidates on each commit, making version output truthful.
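
The worktree detection described above can be sketched as follows. The helper name comes from this commit, but the signature and body are assumptions written for illustration; the actual build.rs code may differ. The sketch relies on the documented git behavior that a worktree's `.git` is a one-line file of the form `gitdir: <path>`.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Returns the HEAD file a build script should watch for `repo_root`.
/// Signature and body are illustrative assumptions.
fn resolve_git_head_path(repo_root: &Path) -> PathBuf {
    let dot_git = repo_root.join(".git");

    if dot_git.is_file() {
        // Worktree: `.git` is a pointer file containing `gitdir: <path>`.
        if let Ok(contents) = fs::read_to_string(&dot_git) {
            let pointer = contents
                .lines()
                .find_map(|line| line.strip_prefix("gitdir:"));
            if let Some(raw) = pointer {
                let gitdir = Path::new(raw.trim());
                // A relative pointer is resolved against the worktree root.
                let gitdir = if gitdir.is_absolute() {
                    gitdir.to_path_buf()
                } else {
                    repo_root.join(gitdir)
                };
                return gitdir.join("HEAD");
            }
        }
    }

    // Regular repository (or unreadable pointer): watch .git/HEAD.
    dot_git.join("HEAD")
}
```

A build script would then print `cargo:rerun-if-changed=` followed by the returned path, so that Cargo re-runs the script when that HEAD file changes.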

Verification: Binary built in worktree now reports correct SHA after commits
(before: stale, after: current HEAD).

Relates to ROADMAP #161 (filed cycle #65, implemented cycle #69).
Diagnostic-strictness family member.
Diff: 21 lines added (resolve_git_head_path + conditional rerun-if-changed).
2026-04-23 03:45:59 +09:00
YeonGyu-Kim
48da1904e0 docs(#162): add USAGE.md sections for dump-manifests, bootstrap-plan, acp, export
Parity audit (cycle #67) found 4 verbs were in claw --help but absent from USAGE.md:
- dump-manifests: upstream manifest export for parity work
- bootstrap-plan: startup component graph for debugging
- acp: Zed editor integration status (discoverability only, tracking ROADMAP #76)
- export: session transcript export (requires --resume)

Each section follows the existing USAGE.md pattern:
- Purpose statement
- Example usage
- When-to-use guidance
- Related error modes where applicable

Coverage: 12/12 binary verbs now documented (was 8/12).

Acceptance:
- All 4 verbs have dedicated sections with examples: verified by grep
- Parity audit re-run: 100% coverage

Relates to ROADMAP #162 (filed cycle #67, implemented cycle #68).
Diff: +87 lines, doc-only, zero code risk.
2026-04-23 03:39:19 +09:00
YeonGyu-Kim
92a79b5276 docs(parity): update stats to 2026-04-23 — Rust LOC +66%, test LOC +76%, 979 commits on main
Growth since 2026-04-03:
- Rust LOC: 48,599 → 80,789 (+32,190)
- Test LOC: 2,568 → 4,533 (+1,965)
- Commits: 292 → 979 (+687, now pending review phase)

Main HEAD: ad1cf92 (doctrine loop canonical example)

Key deliverables cycles #39–#63:
- Typed-error hardening family (#247–#251)
- Diagnostic-strictness principle (#57–#59)
- Help-parity sweep (#130c–#130e)
- Suffix-guard uniformity (#152)
- Verb-classification fix (#160)
- Integration-bandwidth doctrine (#62)
- Doctrine-loop pattern formalized

Status: 13 branches awaiting review (no new branches since cycle #61 branch-last protocol established)
2026-04-23 03:25:56 +09:00
YeonGyu-Kim
553893410b fix(#160): reserved-semantic verbs with positional args now emit slash-command guidance
Verbs with CLI-reserved positional-arg meanings (resume, compact, memory,
commit, pr, issue, bughunter) were falling through to Prompt dispatch
when invoked with args, causing users to see 'missing_credentials' errors
instead of guidance that the verb is a slash command.

#160 investigation revealed the underlying design question: which verbs
are 'promptable' (can start a prompt like 'explain this pattern') vs.
'reserved' (have specific CLI meaning like 'resume SESSION_ID')?

This fix implements the reserved-verb classification: at parse time,
intercept reserved verbs with trailing args and emit slash-command guidance
before falling through to Prompt. Promptable verbs (explain, bughunter, clear)
continue to route to Prompt as before.

Helper: is_reserved_semantic_verb() lists the reserved set.
All 181 tests pass (no regressions).
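
The parse-time interception can be sketched roughly as below. `is_reserved_semantic_verb` is the helper named in this commit, but its body here is an assumption built from the reserved list in the first paragraph; `reserved_verb_guidance` and the guidance text are hypothetical.

```rust
/// Reserved set as listed in the first paragraph of this commit message.
const RESERVED_VERBS: [&str; 7] = [
    "resume", "compact", "memory", "commit", "pr", "issue", "bughunter",
];

/// True when `verb` has a CLI-reserved positional-arg meaning.
fn is_reserved_semantic_verb(verb: &str) -> bool {
    RESERVED_VERBS.contains(&verb)
}

/// Returns guidance text when a reserved verb is followed by trailing
/// args, or `None` to let the existing parsing continue unchanged.
/// Hypothetical helper; the message wording is invented for the sketch.
fn reserved_verb_guidance(args: &[&str]) -> Option<String> {
    let (verb, rest) = args.split_first()?;
    if !rest.is_empty() && is_reserved_semantic_verb(*verb) {
        Some(format!(
            "`{verb}` is a slash command; use `/{verb}` inside an interactive session"
        ))
    } else {
        None
    }
}
```

Returning `None` for every other input keeps the sketch limited to the case this fix targets: a reserved verb with trailing args that would otherwise fall through to Prompt dispatch.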
2026-04-23 03:16:19 +09:00
YeonGyu-Kim
0aa0d3f7cf fix(#122b): claw doctor warns when cwd is broad path (home/root)
## What Was Broken

`claw doctor` reported "Status: ok" when run from ~/ or /, but `claw
prompt` in the same directory would error out with:

    error: claw is running from a very broad directory (/Users/yeongyu).
    The agent can read and search everything under this path.

Diagnostic deception: doctor said green, prompt said red. User runs
doctor to check their setup, sees all green, runs prompt, gets blocked.
Trust in doctor erodes.

This is the exact pattern captured in the 'Diagnostic Commands Must Be
At Least As Strict As Runtime Commands' principle recorded in ROADMAP.md
at cycle #57.

## Root Cause

Two code paths perform the broad-cwd check:
- CliAction::Prompt handler → `enforce_broad_cwd_policy()` (errors out)
- CliAction::Repl handler → same function

But render_doctor_report() never called detect_broad_cwd(). The workspace
health check only looked at whether cwd was inside a git project, not
whether cwd was a dangerously broad path.

## What This Fix Does

Extend `check_workspace_health()` to also probe `detect_broad_cwd()`:

    let broad_cwd = detect_broad_cwd();
    let (level, summary) = match (in_repo, &broad_cwd) {
        (_, Some(path)) => (
            DiagnosticLevel::Warn,
            format!(
                "current directory is a broad path ({}); Prompt/REPL will \
                 refuse to run here without --allow-broad-cwd",
                path.display()
            ),
        ),
        (true, None) => (DiagnosticLevel::Ok, "project root detected".to_string()),
        (false, None) => (DiagnosticLevel::Warn, "not inside a git project".to_string()),
    };

The check now warns about BOTH failure modes with clear messaging about
what Prompt/REPL will do.

## Dogfood Verification

Before fix:
    $ cd ~ && claw doctor
    Workspace
      Status           warn
      Summary          current directory is not inside a git project
    [all green otherwise]

    $ echo | claw prompt "test"
    error: claw is running from a very broad directory (/Users/yeongyu)...

After fix:
    $ cd ~ && claw doctor
    Workspace
      Status           warn
      Summary          current directory is a broad path (/Users/yeongyu);
                       Prompt/REPL will refuse to run here without
                       --allow-broad-cwd

    $ cd / && claw doctor
    Workspace
      Status           warn
      Summary          current directory is a broad path (/); ...

Non-regression:
    $ cd /tmp/my-project && claw doctor
    Workspace
      Status           warn
      Summary          current directory is not inside a git project
    (unchanged)

    $ cd /path/to/real/git/project && claw doctor
    Workspace
      Status           ok
      Summary          project root detected on branch main
    (unchanged)

## Regression Tests Added

- `workspace_check_in_project_dir_reports_ok` — non-broad + in-project = OK
- `workspace_check_outside_project_reports_warn` — non-broad + not-in-project = Warn with 'not inside git project' summary
- 181 binary tests pass (was 179, added 2)

## Related

- Principle: 'Diagnostic Commands Must Be At Least As Strict As Runtime
  Commands' (ROADMAP.md cycle #57)
- Companion to #122 (stale-base preflight in doctor)
- Sibling: next step is probably a full runtime-vs-doctor audit for
  other asymmetries (auth, sandbox, plugins, hooks)
2026-04-23 02:35:49 +09:00
YeonGyu-Kim
9dd7e79eb2 fix(#130e-B): route plugins/prompt --help to dedicated help topics
## What Was Broken (ROADMAP #130e Category B)

Two remaining surface-level help outliers after #130e-A:

    $ claw plugins --help
    Unknown /plugins action '--help'. Use list, install, enable, disable, uninstall, or update.

    $ claw prompt --help
    claw v0.1.0  (top-level help — wrong help topic)

`plugins` treated `--help` as an invalid subaction name. `prompt`
was explicitly listed in the early `wants_help` interception with
commit/pr/issue, which routed to top-level help instead of
prompt-specific help.

## Root Cause (Traced)

1. **plugins**: `parse_local_help_action()` didn't have a "plugins"
   arm, so `["plugins", "--help"]` returned None and continued into
   the `"plugins"` parser arm (main.rs:1031), which treated `--help`
   as the `action` argument. Runtime layer then rejected it as
   "Unknown action".

2. **prompt**: At main.rs:~800, there was an early interception for
   `--help` following certain subcommands (prompt, commit, pr, issue)
   that forced `wants_help = true`, routing to generic top-level help
   instead of letting parse_local_help_action produce a prompt-specific
   topic.

## What This Fix Does

Same pattern as #130c/#130d/#130e-A:

1. **LocalHelpTopic enum extended** with Plugins, Prompt variants
2. **parse_local_help_action() extended** to map both new cases
3. **Help topic renderers added** with accurate usage info
4. **Early prompt-interception removed** — prompt now falls through to
   parse_local_help_action like other subcommands. commit/pr/issue
   (which aren't actual subcommands yet) remain in the early list.

## Dogfood Verification

Before fix:
    $ claw plugins --help
    Unknown /plugins action '--help'. Use list, install, enable, ...

    $ claw prompt --help
    claw v0.1.0
    (top-level help, not prompt-specific)

After fix:
    $ claw plugins --help
    Plugins
      Usage            claw plugins [list|install|enable|disable|uninstall|update] [<target>]
      Purpose          manage bundled and user plugins from the CLI surface
      ...

    $ claw prompt --help
    Prompt
      Usage            claw prompt <prompt-text>
      Purpose          run a single-turn, non-interactive prompt and exit
      Flags            --model · --allowedTools · --output-format · --compact
      ...

## Non-Regression Verification

- `claw plugins` (no args) → still displays plugin inventory 
- `claw plugins list` → still works correctly 
- `claw prompt "text"` → still requires credentials, runs prompt 
- All 180 binary tests pass 
- All 466 library tests pass 

## Regression Tests Added (4+ assertions)

- `plugins --help` → HelpTopic(Plugins)
- `prompt --help` → HelpTopic(Prompt)
- Short forms `plugins -h` / `prompt -h` both work
- `prompt "hello world"` still routes to Prompt action with correct text

## HELP-PARITY SWEEP COMPLETE

All 22 top-level subcommands now emit proper help topics:

| Command | Status |
|---|---|
| help --help | #130e-A |
| version --help | pre-existing |
| status --help | pre-existing |
| sandbox --help | pre-existing |
| doctor --help | pre-existing |
| acp --help | pre-existing |
| init --help | pre-existing |
| state --help | pre-existing |
| export --help | pre-existing |
| diff --help | #130c |
| config --help | #130d |
| mcp --help | pre-existing |
| agents --help | pre-existing |
| plugins --help | #130e-B (this commit) |
| skills --help | pre-existing |
| submit --help | #130e-A |
| prompt --help | #130e-B (this commit) |
| resume --help | #130e-A |
| system-prompt --help | pre-existing |
| dump-manifests --help | pre-existing |
| bootstrap-plan --help | pre-existing |

Zero outliers. Contract universally enforced.

## Related

- Closes #130e Category B (plugins, prompt surface-parity)
- Completes entire help-parity sweep family (#130c, #130d, #130e)
- Stacks on #130e-A (dispatch-order fixes) on same worktree
2026-04-23 02:07:50 +09:00
YeonGyu-Kim
0ca034472b fix(#130e-A): route help/submit/resume --help to help topics before credential check
## What Was Broken (ROADMAP #130e, filed cycle #53)

Three subcommands leaked `missing_credentials` errors when called
with `--help`:

    $ claw help --help
    [error-kind: missing_credentials]
    error: missing Anthropic credentials...

    $ claw submit --help
    [error-kind: missing_credentials]
    error: missing Anthropic credentials...

    $ claw resume --help
    [error-kind: missing_credentials]
    error: missing Anthropic credentials...

This is the same dispatch-order bug class as #251 (session verbs).
The parser fell through to the credential check before help-flag
resolution ran. Critical discoverability gap: users couldn't learn
what these commands do without valid credentials.

## Root Cause (Traced)

`parse_local_help_action()` (main.rs:1260) is called early in
`parse_args()` (main.rs:1002), BEFORE credential check. But the
match statement inside only recognized:
status, sandbox, doctor, acp, init, state, export, version,
system-prompt, dump-manifests, bootstrap-plan, diff, config.

`help`, `submit`, `resume` were NOT in the list, so the function
returned `None`, and parsing continued to credential check which
then failed.

## What This Fix Does

Same pattern as #130c (diff) and #130d (config):

1. **LocalHelpTopic enum extended** with Meta, Submit, Resume variants
2. **parse_local_help_action() extended** to map the three new cases
3. **Help topic renderers added** with accurate usage info

Three-line change to parse_local_help_action:

    "help" => LocalHelpTopic::Meta,
    "submit" => LocalHelpTopic::Submit,
    "resume" => LocalHelpTopic::Resume,

Dispatch order (parse_args):
    1. --resume parsing
    2. parse_local_help_action() ← NOW catches help/submit/resume --help
    3. parse_single_word_command_alias()
    4. parse_subcommand() ← Credential check happens here
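
The dispatch order above can be sketched as follows. This is a simplified illustration, not the code in main.rs: the enum lists only a few of the topics named in this commit series, the function signatures are assumptions, and `Parsed` is a hypothetical stand-in for the real action type.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum LocalHelpTopic {
    Meta,
    Submit,
    Resume,
    Diff,
    Config,
}

/// Maps `<verb> --help` / `<verb> -h` to a help topic, or `None` when
/// the arguments are not a local help request. Signature is assumed.
fn parse_local_help_action(rest: &[&str]) -> Option<LocalHelpTopic> {
    let (verb, flag) = match rest {
        [verb, flag] => (*verb, *flag),
        _ => return None,
    };
    if flag != "--help" && flag != "-h" {
        return None;
    }
    match verb {
        "help" => Some(LocalHelpTopic::Meta),
        "submit" => Some(LocalHelpTopic::Submit),
        "resume" => Some(LocalHelpTopic::Resume),
        "diff" => Some(LocalHelpTopic::Diff),
        "config" => Some(LocalHelpTopic::Config),
        _ => None,
    }
}

/// Hypothetical stand-in for the parser's result.
#[derive(Debug, PartialEq)]
enum Parsed {
    Help(LocalHelpTopic),
    /// Reached only when no help topic matched; the credential check
    /// would happen on this path.
    NeedsCredentials,
}

fn parse_args(rest: &[&str]) -> Parsed {
    // Help resolution runs first, so `--help` never reaches the
    // credential check.
    if let Some(topic) = parse_local_help_action(rest) {
        return Parsed::Help(topic);
    }
    Parsed::NeedsCredentials
}
```

The ordering is what matters here: a verb missing from the match falls through to the credential path, which is the failure mode this commit describes for `help`, `submit`, and `resume`.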

## Dogfood Verification

Before fix (all three):
    $ claw help --help
    [error-kind: missing_credentials]
    error: missing Anthropic credentials...

After fix:
    $ claw help --help
    Help
      Usage            claw help [--output-format <format>]
      Purpose          show the full CLI help text (all subcommands, flags, environment)
      ...

    $ claw submit --help
    Submit
      Usage            claw submit [--session <id|latest>] <prompt-text>
      Purpose          send a prompt to an existing managed session
      Requires         valid Anthropic credentials (when actually submitting)
      ...

    $ claw resume --help
    Resume
      Usage            claw resume [<session-id|latest>]
      Purpose          restart an interactive REPL attached to a managed session
      ...

## Non-Regression Verification

- `claw help` (no --help) → still shows full CLI help 
- `claw submit "text"` (with prompt) → still requires credentials 
- `claw resume` (bare) → still emits slash command guidance 
- All 180 binary tests pass 
- All 466 library tests pass 

## Regression Tests Added (6 assertions)

- `help --help` → routes to HelpTopic(Meta)
- `submit --help` → routes to HelpTopic(Submit)
- `resume --help` → routes to HelpTopic(Resume)
- Short forms: `help -h`, `submit -h`, `resume -h` all work

## Pattern Note

This is Category A of #130e (dispatch-order bugs). Same class as #251.
Category B (surface-parity: plugins, prompt) will be handled in a
follow-up commit/branch.

## Help-Parity Sweep Status

After cycle #52 (#130c diff, #130d config), help sweep revealed:

| Command | Before | After This Commit |
|---|---|---|
| help --help | missing_credentials | Meta help |
| submit --help | missing_credentials | Submit help |
| resume --help | missing_credentials | Resume help |
| plugins --help | "Unknown action" | #130e-B (next) |
| prompt --help | wrong help | #130e-B (next) |

## Related

- Closes #130e Category A (dispatch-order help fixes)
- Same bug class as #251 (session verbs)
- Stacks on #130d (config help) on same worktree branch
- #130e Category B (plugins, prompt) queued for follow-up
2026-04-23 02:03:10 +09:00
YeonGyu-Kim
19638a015e fix(#130d): accept --help / -h in claw config arm, route to help topic
## What Was Broken (ROADMAP #130d, filed cycle #52)

`claw config --help` was silently ignored — the command executed and
displayed the config dump instead of showing help:

    $ claw config --help
    Config
      Working directory /private/tmp/dogfood-probe-47
      Loaded files      0
      Merged keys       0
      (displays full config, not help)

Expected: help for the config command. Actual: silent acceptance of
`--help`, runs config display anyway.

This is the opposite outlier from #130c (which rejected help with an
error). Together they form the help-parity anomaly:
- #130c `diff --help` → error (rejects help)
- #130d `config --help` → silent ignore (runs command, ignores help)
- Others (status, mcp, export) → proper help
- Expected behavior: all commands should show help on `--help`

## Root Cause (Traced)

At main.rs:1050, the `"config"` parser arm parsed arguments positionally:

    "config" => {
        let tail = &rest[1..];
        let section = tail.first().cloned();
        // ... ignores unrecognized args like --help silently
        Ok(CliAction::Config { section, ... })
    }

Unlike the `diff` arm (#130c), `config` had no explicit check for
extra args. It positionally parsed the first arg as an optional
`section` and silently accepted/ignored any trailing arg, including
`--help`.

## What This Fix Does

Same pattern as #130c (help-surface parity):

1. **LocalHelpTopic enum extended** with new `Config` variant
2. **parse_local_help_action() extended** to map `"config"` → `LocalHelpTopic::Config`
3. **config arm guard added**: check for help flag before parsing section
4. **Help topic renderer added**: human-readable help text for config

Fix locus at main.rs:1050:

    "config" => {
        // #130d: accept --help / -h and route to help topic
        if rest.len() >= 2 && is_help_flag(&rest[1]) {
            return Ok(CliAction::HelpTopic(LocalHelpTopic::Config));
        }
        let tail = &rest[1..];
        // ... existing parsing continues
    }

## Dogfood Verification

Before fix:
    $ claw config --help
    Config
      Working directory ...
      Loaded files      0
      (no help, runs config)

After fix:
    $ claw config --help
    Config
      Usage            claw config [--cwd <path>] [--output-format <format>]
      Purpose          merge and display the resolved configuration
      Options          --cwd overrides the workspace directory
      Output           loaded files and merged key-value pairs
      Formats          text (default), json
      Related          claw status · claw doctor · claw init

Short form `claw config -h` also works.

## Non-Regression Verification

- `claw config` (no args) → still displays config dump 
- `claw config permissions` (section arg) → still works 
- All 180 binary tests pass 
- All 466 library tests pass 

## Regression Tests Added (4 assertions)

- `config --help` → routes to `HelpTopic(LocalHelpTopic::Config)`
- `config -h` (short form) → routes to help topic
- bare `config` (no args) → still routes to `Config` action
- `config permissions` (with section) → still works correctly

## Pattern Note

#130c and #130d form a pair: two outlier failure modes in help
handling for local introspection commands:
- #130c `diff` rejected help (loud error) → fixed with guard + routing
- #130d `config` silently ignored help (silent accept) → fixed with same pattern

Both are now consistent with the rest of the CLI (status, mcp, export, etc.).
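
A minimal side-by-side sketch of the two guards, under simplifying assumptions: the enums are trimmed to the variants discussed here, `CliAction::Diff` drops its `output_format` field, and the free function `parse` is a hypothetical stand-in for the real parser arms in `main.rs`.

```rust
#[derive(Debug, PartialEq)]
enum LocalHelpTopic {
    Config,
    Diff,
}

#[derive(Debug, PartialEq)]
enum CliAction {
    HelpTopic(LocalHelpTopic),
    Config { section: Option<String> },
    Diff, // the real variant also carries an output format
}

fn is_help_flag(arg: &str) -> bool {
    matches!(arg, "--help" | "-h")
}

fn parse(rest: &[&str]) -> Result<CliAction, String> {
    match rest.first().copied() {
        Some("config") => {
            // Guard runs before the positional `section` parse.
            if rest.len() >= 2 && is_help_flag(rest[1]) {
                return Ok(CliAction::HelpTopic(LocalHelpTopic::Config));
            }
            Ok(CliAction::Config { section: rest.get(1).map(|s| s.to_string()) })
        }
        Some("diff") => {
            // Guard runs before the extra-arguments rejection.
            if rest.len() == 2 && is_help_flag(rest[1]) {
                return Ok(CliAction::HelpTopic(LocalHelpTopic::Diff));
            }
            if rest.len() > 1 {
                return Err(format!(
                    "unexpected extra arguments after `claw diff`: {}",
                    rest[1..].join(" ")
                ));
            }
            Ok(CliAction::Diff)
        }
        other => Err(format!("unknown command: {:?}", other)),
    }
}
```

In both arms the help check sits ahead of the logic that previously swallowed or rejected the flag, which is why the same small guard fixes two opposite failure modes.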

## Related

- Closes #130d (config help discoverability gap)
- Completes help-parity family (#130c, #130d)
- Stacks on #130c (diff help fix) on same worktree branch
- Part of help-consistency thread (#141 audit)
2026-04-23 01:55:25 +09:00
YeonGyu-Kim
83f744adf0 fix(#130c): accept --help / -h in claw diff arm
## What Was Broken (ROADMAP #130c, filed cycle #50)

`claw diff --help` was rejected with:

    [error-kind: unknown]
    error: unexpected extra arguments after `claw diff`: --help

Other local introspection commands accept --help fine:
- `claw status --help` → shows help 
- `claw mcp --help` → shows help 
- `claw export --help` → shows help 
- `claw diff --help` → error  (outlier)

This is a help-surface parity bug: `diff` is the only local command
that rejects --help as "extra arguments" before the help detector
gets a chance to run.

## Root Cause (Traced)

At main.rs:1063, the `"diff"` parser arm rejected ALL extra args:

    "diff" => {
        if rest.len() > 1 {
            return Err(format!("unexpected extra arguments after `claw diff`: {}", ...));
        }
        Ok(CliAction::Diff { output_format })
    }

When parsing `["diff", "--help"]`, `rest.len() > 1` was true (length
is 2) and `--help` was rejected as extra argument.

Other commands (status, sandbox, doctor, init, state, export, etc.)
routed through `parse_local_help_action()` which detected
`--help` / `-h` and routed to a LocalHelpTopic. The `diff` arm
lacked this guard.

## What This Fix Does

Four minimal changes:

1. **LocalHelpTopic enum extended** with new `Diff` variant
2. **parse_local_help_action() extended** to map `"diff"` → `LocalHelpTopic::Diff`
3. **diff arm guard added**: check for help flag before extra-args validation
4. **Help topic renderer added**: human-readable help text for diff command

Fix locus at main.rs:1063:

    "diff" => {
        // #130c: accept --help / -h as first argument and route to help topic
        if rest.len() == 2 && is_help_flag(&rest[1]) {
            return Ok(CliAction::HelpTopic(LocalHelpTopic::Diff));
        }
        if rest.len() > 1 { /* existing error */ }
        Ok(CliAction::Diff { output_format })
    }

## Dogfood Verification

Before fix:
    $ claw diff --help
    [error-kind: unknown]
    error: unexpected extra arguments after `claw diff`: --help

After fix:
    $ claw diff --help
    Diff
      Usage            claw diff [--output-format <format>]
      Purpose          show local git staged + unstaged changes
      Requires         workspace must be inside a git repository
      ...

And `claw diff -h` (short form) also works.

## Non-Regression Verification

- `claw diff` (no args) → still routes to Diff action correctly
- `claw diff foo` (unknown arg) → still rejected as "unexpected extra arguments"
- `claw diff --output-format json` (valid flag) → still works
- All 180 binary tests pass
- All 466 library tests pass

## Regression Tests Added (4 assertions)

- `diff --help` → routes to HelpTopic(LocalHelpTopic::Diff)
- `diff -h` (short form) → routes to HelpTopic(LocalHelpTopic::Diff)
- bare `diff` → still routes to Diff action
- `diff foo` (unknown arg) → still errors with "extra arguments"

## Pattern

Follows #141 help-consistency work (extending LocalHelpTopic to
cover more subcommands). Clean surface-parity fix: identify the
outlier, add the missing guard. Low-risk, high-clarity.

## Related

- Closes #130c (diff help discoverability gap)
- Stacks on #130b (filesystem context) and #251 (session dispatch)
- Part of help-consistency thread (#141 audit, #145 plugins wiring)
2026-04-23 01:48:40 +09:00
YeonGyu-Kim
d49a75cad5 fix(#130b): enrich filesystem I/O errors with operation + path context
## What Was Broken (ROADMAP #130b, filed cycle #47)

In a fresh workspace, running:

    claw export latest --output /private/nonexistent/path/file.jsonl --output-format json

produced:

    {"error":"No such file or directory (os error 2)","hint":null,"kind":"unknown","type":"error"}

This violates the typed-error contract:
- Error message is a raw errno string with zero context
- Does not mention the operation that failed (export)
- Does not mention the target path
- Classifier defaults to "unknown" even though the code path knows
  this is a filesystem I/O error

## Root Cause (Traced)

run_export() at main.rs:~6915 does:

    fs::write(path, &markdown)?;

When this fails:
1. io::Error propagates via ? to main()
2. Converted to string via .to_string() in error handler
3. classify_error_kind() cannot match "os error" or "No such file"
4. Defaults to "kind": "unknown"

The information is there at the source (operation name, target path,
io::ErrorKind) but lost at the propagation boundary.

## What This Fix Does

Three changes:

1. **New helper: contextualize_io_error()** (main.rs:~260)
   Wraps an io::Error with operation name + target path into a
   recognizable message format:

       "{operation} failed: {target} ({error})"

2. **Classifier branch added** (classify_error_kind at main.rs:~270)
   Recognizes the new format and classifies as "filesystem_io_error":

       else if message.contains("export failed:") ||
               message.contains("diff failed:") ||
               message.contains("config failed:") {
           "filesystem_io_error"
       }

3. **run_export() wired** (main.rs:~6915)
   fs::write() call now uses .map_err() to enrich io::Error:

       fs::write(path, &markdown).map_err(|e| -> Box<dyn std::error::Error> {
           contextualize_io_error("export", &path.display().to_string(), e).into()
       })?;
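
The helper and classifier can be sketched together as follows. This is an approximation: the message format and the three `contains` checks come from the description above, but the `String` return type of `contextualize_io_error` and the `"unknown"` fallback branch are assumptions, and the real classifier has other branches not shown here.

```rust
use std::io;

/// Wraps an io::Error with the operation name and target path.
/// Returning a plain String is an assumption made for this sketch.
fn contextualize_io_error(operation: &str, target: &str, error: io::Error) -> String {
    format!("{} failed: {} ({})", operation, target, error)
}

/// Reduced classifier: only the filesystem branch and the default are shown.
fn classify_error_kind(message: &str) -> &'static str {
    if message.contains("export failed:")
        || message.contains("diff failed:")
        || message.contains("config failed:")
    {
        "filesystem_io_error"
    } else {
        "unknown"
    }
}
```

Because the classifier keys on the `"<operation> failed:"` prefix, any new call site has to use one of the listed operation names, or extend the classifier, to avoid falling back to `"unknown"`.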

## Dogfood Verification

Before fix:

    {"error":"No such file or directory (os error 2)","kind":"unknown","type":"error"}

After fix:

    {"error":"export failed: /private/nonexistent/path/file.jsonl (No such file or directory (os error 2))","kind":"filesystem_io_error","type":"error"}

The envelope now tells downstream claws:
- WHAT operation failed (export)
- WHERE it failed (the path)
- WHAT KIND of failure (filesystem_io_error)
- The original errno detail preserved for diagnosis

## Non-Regression Verification

- Successful export still works (emits "kind": "export" envelope as before)
- Session not found error still emits "session_not_found" (not filesystem)
- missing_credentials still works correctly
- cli_parse still works correctly
- All 180 binary tests pass
- All 466 library tests pass
- All 95 compat-harness tests pass

## Regression Tests Added

Inside the main CliAction test function:

- "export failed:" pattern classifies as "filesystem_io_error" (not "unknown")
- "diff failed:" pattern classifies as "filesystem_io_error"
- "config failed:" pattern classifies as "filesystem_io_error"
- contextualize_io_error() produces a message containing operation name
- contextualize_io_error() produces a message containing target path
- Messages produced by contextualize_io_error() are classifier-recognizable

## Scope

This is the minimum viable fix: enrich export's fs::write with context.
Future work (filed as part of #130b scope): apply same pattern to
other filesystem operations (diff, plugins, config fs reads, session
store writes, etc.). Each application is a copy-paste of the same
helper pattern.

## Pattern

Follows #145 (plugins parser interception), #248-249 (arm-level leak
templates). Helper + classifier + call site wiring. Minimal diff,
maximum observability gain.

## Related

- Closes #130b (filesystem error context preservation)
- Stacks on top of #251 (dispatch-order fix) — same worktree branch
- Ground truth for future #130 broader sweep (other io::Error sites)
2026-04-23 01:40:07 +09:00
YeonGyu-Kim
dc274a0f96 fix(#251): intercept session-management verbs at top-level parser to bypass credential check
## What Was Broken (ROADMAP #251)

Session-management verbs (list-sessions, load-session, delete-session,
flush-transcript) were falling through to the parser's `_other => Prompt`
catchall at main.rs:~1017. This construed them as `CliAction::Prompt {
prompt: "list-sessions", ... }` which then required credentials via the
Anthropic API path. The result: purely-local session operations emitted
`missing_credentials` errors instead of session-layer envelopes.

## Acceptance Criterion

The fix's essential requirement (stated by gaebal-gajae):
**"These 4 verbs stop falling through to Prompt and emitting `missing_credentials`."**
Not "all 4 are fully implemented to spec" — stubs are acceptable for
delete-session and flush-transcript as long as they route LOCALLY.

## What This Fix Does

Follows the exact pattern from #145 (plugins) and #146 (config/diff):

1. **CliAction enum** (main.rs:~700): Added 4 new variants.
2. **Parser** (main.rs:~945): Added 4 match arms before the `_other => Prompt`
   catchall. Each arm validates the verb's positional args (e.g., load-session
   requires a session-id) and rejects extra arguments.
3. **Dispatcher** (main.rs:~455):
   - list-sessions → dispatches to `runtime::session_control::list_managed_sessions_for()`
   - load-session → dispatches to `runtime::session_control::load_managed_session_for()`
   - delete-session → emits `not_yet_implemented` error (local, not auth)
   - flush-transcript → emits `not_yet_implemented` error (local, not auth)
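
A minimal sketch of the parser half of this change, assuming simplified types: the variant and field names follow the regression tests listed below, while the `one_id` helper and the exact error strings are hypothetical. The point is only that the four verbs match before the `_other => Prompt` catchall.

```rust
#[derive(Debug, PartialEq)]
enum CliAction {
    ListSessions,
    LoadSession { session_reference: String },
    DeleteSession { session_id: String },
    FlushTranscript { session_id: String },
    Prompt { prompt: String },
}

/// Hypothetical helper: exactly one positional id, nothing else.
fn one_id(verb: &str, args: &[&str]) -> Result<String, String> {
    match args {
        [id] => Ok(id.to_string()),
        [] => Err(format!("`claw {}` requires a session id", verb)),
        _ => Err(format!("unexpected extra arguments after `claw {}`", verb)),
    }
}

fn parse(rest: &[&str]) -> Result<CliAction, String> {
    let (verb, args) = rest
        .split_first()
        .ok_or_else(|| "no command given".to_string())?;
    match *verb {
        "list-sessions" => {
            if args.is_empty() {
                Ok(CliAction::ListSessions)
            } else {
                Err("unexpected extra arguments after `claw list-sessions`".to_string())
            }
        }
        "load-session" => Ok(CliAction::LoadSession { session_reference: one_id(verb, args)? }),
        "delete-session" => Ok(CliAction::DeleteSession { session_id: one_id(verb, args)? }),
        "flush-transcript" => Ok(CliAction::FlushTranscript { session_id: one_id(verb, args)? }),
        // Catchall: anything else is treated as prompt text and needs credentials later.
        _other => Ok(CliAction::Prompt { prompt: rest.join(" ") }),
    }
}
```

Once a verb is parsed into its own variant, the dispatcher can handle it locally and never enters the credential-requiring prompt path.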

## Dogfood Verification

Run on clean environment (no credentials):

```bash
$ env -i PATH=$PATH HOME=$HOME claw list-sessions --output-format json
{
  "command": "list-sessions",
  "sessions": [
    {"id": "session-1775777421902-1", ...},
    ...
  ]
}
# ✓ Session-layer envelope, not auth error

$ env -i PATH=$PATH HOME=$HOME claw load-session nonexistent --output-format json
{"error":"session not found: nonexistent", "kind":"session_not_found", ...}
# ✓ Local session_not_found error, not missing_credentials

$ env -i PATH=$PATH HOME=$HOME claw delete-session test-id --output-format json
{"command":"delete-session","error":"not_yet_implemented","kind":"not_yet_implemented","type":"error"}
# ✓ Local not_yet_implemented, not auth error

$ env -i PATH=$PATH HOME=$HOME claw flush-transcript test-id --output-format json
{"command":"flush-transcript","error":"not_yet_implemented","kind":"not_yet_implemented","type":"error"}
# ✓ Local not_yet_implemented, not auth error
```

Regression sanity:

```bash
$ claw plugins --output-format json  # #145 still works
$ claw prompt "hello" --output-format json  # still requires credentials correctly
$ claw list-sessions extra arg --output-format json  # rejects extra args with cli_parse
```

## Regression Tests Added

Inside `removed_login_and_logout_subcommands_error_helpfully` test function:

- `list-sessions` → CliAction::ListSessions (both text and JSON output)
- `load-session <id>` → CliAction::LoadSession with session_reference
- `delete-session <id>` → CliAction::DeleteSession with session_id
- `flush-transcript <id>` → CliAction::FlushTranscript with session_id
- Missing required arg errors (load-session and delete-session without ID)
- Extra args rejection (list-sessions with extra positional args)

All 180 binary tests pass. 466 library tests pass.

## Fix Scope vs. Full Implementation

This fix addresses #251 (dispatch-order bug) and #250's Option A (implement
the surfaces). list-sessions and load-session are fully functional via
existing runtime::session_control helpers. delete-session and flush-transcript
are stubbed with local "not yet implemented" errors to satisfy #251's
acceptance criterion without requiring additional session-store mutations
that can ship independently in a follow-up.

## Template

Exact same pattern as #145 (plugins) and #146 (config/diff): top-level
verb interception → CliAction variant → dispatcher with local operation.

## Related

Closes #251. Addresses #250 Option A for 4 verbs. Does not block #250
Option B (documentation scope guards) which remains valuable.
2026-04-23 01:25:32 +09:00
2 changed files with 39 additions and 684 deletions


@@ -16785,469 +16785,3 @@ Plus introduces the **NEW `same-request-shape-but-different-response-shape` axis
**Status:** Open. No source code changed. Filed 2026-04-26 10:32 KST. HEAD: `313c840` (post-#251 fast-forward verification onto gaebal-gajae's 10:30 KST cycle ExternalPatchIntake pinpoint at `313c840` — NINTH consecutive concurrent-dogfood rebase verification cycle, three-way parity confirmed local == origin == fork at HEAD `313c840` with no race detected, demonstrating both gaps #239 catalogues at the dogfood-coordination layer and #243 catalogues at the canonical-ordering layer for the NINTH cycle in a row, confirming concurrent-dogfood-rebase as a stable operational pattern that has now held for NINE cycles in a row — Jobdori files the next-monotonic-id directly atop the prior tip rather than racing for a reservation gap, while gaebal-gajae continues to file pinpoints in numeric order based on the live channel's nudge stream). Branch: feat/jobdori-168c-emission-routing. Sibling-shape cluster: 43 pinpoints (grows by +1 with #252). Pre-flight-cost-prediction cluster: 1 member (#252 alone, founder). Token-accounting-without-message-emission cluster: 1 member (#252 alone, founder). Server-side-pre-execution-counting cluster: 1 member (#252 alone, founder). Same-request-shape-but-different-response-shape sub-cluster: 1 member (#252 alone, founder). Two-member-major-provider-only-no-third-party-partner-set sub-cluster: 7 members (#240+#241+#247+#248+#249+#250+#252) — grows from 6 to 7 confirming continuing-pattern-status across SIX distinct axis-classes (TOOL-COMPANION-BUNDLE / COMPOUND-INPUT / COMPOUND-OUTPUT / QUAD-MODALITY-TURN / SERVER-MANAGED-WEB-SEARCH-WITH-TOOL-CHOICE-DISCRIMINATOR / SERVER-SIDE-PRE-EXECUTION-COUNTING). Eight-layer fusion shape (smaller than #241/#247/#248/#249's twelve-layer count and smaller than #250's ten-layer count, reflecting the smaller-scope-but-novel-axis-class trade-off for daily-driver-impact pinpoints). 
**NEW META-pattern introduced**: NEW-SOLO-CLUSTER-FOUNDING-WITH-DAILY-DRIVER-IMPACT discovery-pattern — distinct from META-cluster-growth (continuous or discontinuous) and distinct from complementary-pinpoint-pair-bundle (paired halves of a tool-subsystem). #252 founds the THIRD distinct discovery-pattern in the audit catalog, the audit now spans THREE structurally distinct discovery-patterns rather than two, demonstrating audit-breadth-across-discovery-pattern-classes alongside audit-balance-across-META-clusters. **PIVOT signal**: #252 deliberately PIVOTS AWAY from BOTH Cross-pinpoint-synthesis-fusion-shape META-cluster (intentionally not extending the +1-per-cycle synthesis chain) AND Tool-locality-axis META-cluster (already extended by #250 cycle #393), founding NEW solo clusters with daily-driver-impact instead. Distinct from #251's contributor-friction/external-patch-intake axis (clawability-coordination layer) by being a daily-clawing-cost-gate workflow primitive (clawability-runtime layer). Linked to #221 (batch-dispatch async pattern, prior closest-shape neighbor with synchronous-batch-via-Files-API-prerequisite, distinct dispatch shape), #224 (Voyage-AI partner-asymmetric, prior provider-asymmetric pattern), #225 (audio partner-asymmetric, prior provider-asymmetric pattern), #226 (image partner-asymmetric, prior provider-asymmetric pattern), #227 (video partner-asymmetric, prior provider-asymmetric pattern with async-task-polling-primitive — closest neighbor in the workflow-primitive-axis sense), and #239/#243 (dogfood-coordination/canonical-ordering, the operational-layer pinpoints that #252's NINTH consecutive concurrent-dogfood rebase cycle continues to demonstrate).
🪨
## Pinpoint #253 — Dogfood cycle state-vector grows without compaction/budgeting until peer-agent context windows overflow
Dogfooded 2026-04-26 11:00 KST after cycle #394: the public dogfood loop had accumulated long state vectors, commit histories, cluster deltas, and repeated parity/rebase summaries across cycles #389-#394. A peer agent explicitly paused cycle #395 because the cumulative dogfood-cycle state vector was overflowing another agent's context window mid-cycle. That pause is direct product evidence: claw-code can keep discovering and appending valid pinpoints, but the coordination transcript has no typed compaction boundary, no rolling state-vector budget, and no canonical short form that lets multiple agents continue the same branch without re-sending the full audit history every turn.
Verified operational surface: the branch already contains #239 for branch leases and #243 for canonical ordering, but those protect write coordination, not cognitive/context coordination. The live dogfood channel currently relies on humans/agents manually deciding when to shorten, pause, or restate. There is no `DogfoodCycleSummary` / `StateVectorCompact` artifact that records `{cycle, head, parent, branch, active_pinpoint, cluster_delta, race_state, next_owner, blockers}` in a bounded token shape; no max-token policy for public nudge payloads; no rolling cluster ledger that can be referenced by id instead of repeated; no continuation token that says “resume from compact #N”; and no warning when a generated update exceeds a peer-agent budget. This is distinct from `/compact` conversation summarization because the missing primitive is branch/project-level coordination state shared across agents, not a single chat-session memory summary.
Required fix shape: (a) define a compact dogfood state-vector schema with hard field limits and stable ids for cluster ledgers; (b) emit one canonical compact artifact per cycle, committed or otherwise addressable, so future cycles cite `compact:#394` rather than replaying the full narrative; (c) add a context-budget guard to nudge/report generation that warns or truncates before posting huge state vectors; (d) add a `claw dogfood status --compact` / `claw roadmap compact-state` surface that reconstructs the current branch state from git + ROADMAP markers; (e) teach peer agents to request/reply with compact state by default and expand only on demand. Acceptance: after 10+ consecutive dogfood cycles, a new agent can recover active branch/head/pinpoint/cluster trajectory/blocker state from a bounded compact artifact under a fixed token budget, and the loop does not have to pause cycle spawning just to protect peer-agent context windows. **Status:** Open. No source code changed. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-26 02:00 UTC nudge. Cluster delta: context-budgeting +1, dogfood-state-compaction +1, multi-agent-continuation-token cluster founded, bounded-state-vector-for-branch-coordination cluster founded; linked to #239/#243 as the cognitive coordination complement to write/order coordination.
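
A minimal sketch of what fix-shape items (a) and (b) might look like, assuming illustrative limits. The field names come from the `{cycle, head, parent, ...}` list above; the type layout, the `MAX_FIELD_CHARS` and `MAX_LIST_ITEMS` constants, and the use of character counts as a stand-in for a token budget are all assumptions, not an existing claw-code API.

```rust
// Illustrative limits (assumptions); a real policy would budget tokens, not chars.
const MAX_FIELD_CHARS: usize = 80;
const MAX_LIST_ITEMS: usize = 8;

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct StateVectorCompact {
    cycle: u32,
    head: String,
    parent: String,
    branch: String,
    active_pinpoint: u32,
    cluster_delta: Vec<String>,
    race_state: String,
    next_owner: String,
    blockers: Vec<String>,
}

impl StateVectorCompact {
    /// Rejects any vector that exceeds the hard field limits.
    fn validate(&self) -> Result<(), String> {
        let scalars = [
            ("head", &self.head),
            ("parent", &self.parent),
            ("branch", &self.branch),
            ("race_state", &self.race_state),
            ("next_owner", &self.next_owner),
        ];
        for (name, value) in scalars {
            if value.chars().count() > MAX_FIELD_CHARS {
                return Err(format!("field `{}` exceeds {} chars", name, MAX_FIELD_CHARS));
            }
        }
        let lists = [("cluster_delta", &self.cluster_delta), ("blockers", &self.blockers)];
        for (name, list) in lists {
            if list.len() > MAX_LIST_ITEMS {
                return Err(format!("list `{}` exceeds {} items", name, MAX_LIST_ITEMS));
            }
            if list.iter().any(|item| item.chars().count() > MAX_FIELD_CHARS) {
                return Err(format!("list `{}` has an item over {} chars", name, MAX_FIELD_CHARS));
            }
        }
        Ok(())
    }

    /// Stable short reference that later cycles can cite instead of the full narrative.
    fn continuation_token(&self) -> String {
        format!("compact:#{}", self.cycle)
    }
}
```

A report generator could call `validate()` before posting and fall back to the continuation token alone when the full vector is over budget.
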
## Pinpoint #254 — MCP Resources lifecycle (subscribe / list_changed / updated / session-bound registry) is structurally absent from the runtime
Dogfooded 2026-04-26 11:02 KST on `feat/jobdori-168c-emission-routing` after #253's context-budget-discipline pivot: claw-code wires `resources/list` and `resources/read` over JSON-RPC stdio (`rust/crates/runtime/src/mcp_stdio.rs:874-981, 1330-1346`) but the MCP Resources lifecycle surface beyond one-shot list+read is structurally absent. Resources read at session start become stale during the session with no detection path, and resources that an MCP server creates/destroys during operation never reach the parent claw session as typed events.
Verified concrete surface (rg across `rust/crates/`): zero `resources/subscribe` / `resources/unsubscribe` request method, zero `notifications/resources/updated` / `notifications/resources/list_changed` notification handlers, zero `ResourceUpdated` / `ResourceListChanged` / `ResourceCreated` / `ResourceDestroyed` typed lifecycle events, zero `version` / `valid_until` / `etag` / `staleness` field on `McpResource` at `rust/crates/runtime/src/mcp_stdio.rs:175-188`, zero session-bound `ResourceRegistry` primitive that tracks per-session resource handles with create/destroy lifecycle, zero `subscribe` capability advertisement in the initialize handshake at line 1400, zero `/resources` / `/resource-list` / `/resource-refresh` slash command in `SlashCommandSpec`, zero `claw mcp resources` CLI subcommand. The hardened lifecycle phase enum at `mcp_lifecycle_hardened.rs:16-28` enumerates `ResourceDiscovery` once at startup but has no `ResourceRefresh` / `ResourceLifecycleChange` mid-session phase. Server-emitted JSON-RPC notifications between requests are dropped silently at the transport layer because the stdio reader only correlates by `id` and has no notification dispatch table.
Gap. The MCP spec defines resources as a long-lived discovery-and-subscribe primitive: clients can `resources/subscribe` to a `uri`, receive `notifications/resources/updated { uri }` when content changes, and receive `notifications/resources/list_changed` when the available resource set itself changes (resources created/destroyed by the server). claw-code treats resources as a one-shot snapshot: list once, read on demand, never re-validate. Concrete dogfood friction: an MCP server that exposes a live database table, a watched file, or an LLM-generated artifact has no way to tell claw the row/file/artifact has changed; the agent silently reasons over a stale snapshot until the user notices the divergence.
Cluster shape novelty. Founds **two** new clusters with #254 as solo founder: (1) **Session-bound-resource-tracking-registry cluster** — typed primitives that track resource handles created/destroyed/updated within a session boundary, distinct from the existing one-shot `list_resources_once` snapshot pattern; (2) **Resource-lifecycle-event-opacity axis** — server→client lifecycle notifications dropped at the transport layer because the JSON-RPC reader has no notification dispatch separate from id-correlated responses, distinct from #229/#238/#244 persistent-WebSocket cluster (those are bidirectional client-driven streams; this is a server-pushed-notification dispatch gap on stdio JSON-RPC).
Introduces the **FOURTH distinct discovery-pattern** in the audit catalog: **PURE-CLAWABILITY-FRICTION-FROM-DOGFOODING** — pinpoints whose primary novelty is dogfood-observed friction in the agent's own runtime rather than missing API/typed-shape coverage of an external provider surface. Distinct from META-cluster-growth (#244/#247/#248/#249/#250), complementary-pinpoint-pair-bundle (#245+#250), and NEW-SOLO-CLUSTER-FOUNDING-WITH-DAILY-DRIVER-IMPACT (#252). Sibling to #239/#243/#251/#253 which are operational/coordination-layer pinpoints, but #254 is at the **protocol-runtime layer** rather than the dogfood-coordination layer — the agent's own MCP transport silently swallows lifecycle signals it should be surfacing.
Required fix shape: (a) add `resources/subscribe` + `resources/unsubscribe` request methods on `McpStdioProcess` parallel to `list_resources` / `read_resource`; (b) add a notification dispatch path on the stdio reader that routes `notifications/resources/updated` and `notifications/resources/list_changed` to a per-server channel rather than dropping them; (c) add `pub enum ResourceLifecycleEvent { Created(McpResource) | Updated { uri } | Destroyed { uri } | ListChanged }` typed event surfaced through `LaneEvents`; (d) add a session-bound `ResourceRegistry` in the runtime that tracks active subscriptions, applies updates, and fires staleness warnings; (e) add `version` / `etag` optional fields on `McpResource`; (f) advertise `resources.subscribe = true` in the initialize handshake when the runtime supports it; (g) expose `/resources`, `/resources refresh`, `/resources subscribe <uri>` slash commands and `claw mcp resources [list|read|subscribe]` CLI subcommands. Acceptance: an MCP server that emits `notifications/resources/updated { uri: "db://orders/42" }` mid-session causes claw to update its `ResourceRegistry`, fire a `ResourceUpdated` lane event, and either re-read the resource on next reference or surface a staleness marker — instead of the agent silently reasoning over a stale snapshot.
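
Fix-shape items (b) and (c) can be sketched as a routing step that separates id-correlated responses from notifications. This is a sketch under stated assumptions: `Incoming` models an already-decoded JSON-RPC message, `Routed` and `route` are hypothetical names, and the event enum is reduced to the two variants driven by the notification methods named above.

```rust
#[derive(Debug, PartialEq)]
enum ResourceLifecycleEvent {
    Updated { uri: String },
    ListChanged,
}

/// Already-decoded JSON-RPC message (hypothetical shape; real decoding is omitted).
enum Incoming {
    Response { id: u64 },
    Notification { method: String, uri: Option<String> },
}

#[derive(Debug, PartialEq)]
enum Routed {
    /// Goes back to the pending request with this id.
    Response(u64),
    /// Goes to the per-server lifecycle channel instead of being dropped.
    Lifecycle(ResourceLifecycleEvent),
    /// Unrecognized or malformed notification; surfaced for logging.
    Ignored(String),
}

fn route(msg: Incoming) -> Routed {
    match msg {
        Incoming::Response { id } => Routed::Response(id),
        Incoming::Notification { method, uri } => {
            if method == "notifications/resources/updated" {
                match uri {
                    Some(uri) => Routed::Lifecycle(ResourceLifecycleEvent::Updated { uri }),
                    None => Routed::Ignored(method),
                }
            } else if method == "notifications/resources/list_changed" {
                Routed::Lifecycle(ResourceLifecycleEvent::ListChanged)
            } else {
                Routed::Ignored(method)
            }
        }
    }
}
```

The design choice illustrated is that a message without an `id` is never discarded by default: it is either mapped to a typed lifecycle event or explicitly reported as ignored.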
**Status:** Open. No source code changed. Filed 2026-04-26 11:04 KST. HEAD: `17efd95` (post-#253 fast-forward verification onto gaebal-gajae's 11:00 KST DogfoodCycleSummary/StateVectorCompact pinpoint at `17efd95` — TENTH consecutive concurrent-dogfood rebase cycle, three-way parity local==origin==fork at `17efd95` confirmed before filing). Branch: feat/jobdori-168c-emission-routing. Cluster delta: session-bound-resource-tracking-registry 0→1 (founder), Resource-lifecycle-event-opacity 0→1 (founder), Pure-clawability-friction-from-dogfooding discovery-pattern 0→1 (founder, FOURTH distinct discovery-pattern after META-cluster-growth + complementary-pinpoint-pair-bundle + NEW-SOLO-CLUSTER-FOUNDING-WITH-DAILY-DRIVER-IMPACT). Smaller-scope by design (matches #253's context-budget-discipline). Distinct from #252's API-shape gap (this is runtime-protocol gap), distinct from #229/#238/#244 persistent-WebSocket cluster (this is stdio JSON-RPC notification dispatch), distinct from #239/#243/#253 dogfood-coordination layer (this is protocol-runtime layer). Linked to #253 as the discipline-pivot enabler that allowed this smaller-scope pinpoint to be foregrounded over the larger META-cluster-growth options.
## Pinpoint #255 — hikaMaeng fork proves local WebSearch needs a provider/spec registry intake lane, but the safe landing shape is design-first rather than blind cherry-pick
Dogfooded 2026-04-26 02:12 UTC on `feat/jobdori-168c-emission-routing` by fetching and statically reviewing the public fork `https://github.com/hikaMaeng/claw-code` at `/tmp/hikaMaeng-claw-code-read`. Interesting fork commits inspected: `262405e` (pluggable Tavily/Brave/Bing/custom/DDG fallback), `bd11289` (settings.json-only websearch config), `fa93cd3` (startup banner provider line), `5f2540a` (Firecrawl plus Brave gzip handling), `7f34d91` (external `searchProvider.json` specs), and `535be97` (web-search integration guide). Attribution: implementation ideas are from hikaMaeng / Sigrid Jin's fork work and should remain credited in any follow-up implementation commit.
Safe intake finding: do **not** cherry-pick the fork wholesale. The useful distilled ideas are smaller and align with #245/#250/#251: (1) separate provider selection from provider mechanics (`settings.json` chooses `websearch.provider` + secret; `searchProvider.json` describes endpoint/method/auth/result paths); (2) keep DDG/HTML parsing as the fallback path while JSON API providers use a generic executor; (3) make provider status visible in startup/UI so operators know which search backend is active; (4) preserve domain allow/block filtering, dedupe, and result truncation after provider-specific parsing; (5) handle provider transport quirks centrally (for example Brave gzip / response decoding) rather than in ad hoc call sites; (6) document custom-provider extension without requiring rebuilds.
Why ROADMAP-only in this branch: current `rust/crates/tools/src/lib.rs` already has a local DDG-backed `WebSearch` tool with tests and `CLAWD_WEB_SEARCH_BASE_URL` mock support, but the fork's later commits add config-schema surface, runtime config validation, root-level provider spec files, and CLI banner wiring as one cross-crate feature. Landing that feature safely needs an implementation lane with tests for config precedence, provider-spec parsing, no-secret logging, mock provider HTTP, and backward-compatible DDG behavior. A minimal cherry-pick would either break the existing test contract or introduce an unreviewed external spec file/runtime path.
Required fix shape for follow-up: (a) add a `WebSearchConfig { provider, api_key_ref/api_key }` runtime config view using existing settings precedence; (b) add a provider-spec schema with explicit allowlisted auth modes and result-path parsing, searched from project/user/system locations; (c) build a generic JSON provider executor plus a preserved DDG HTML executor; (d) keep post-parse domain filters/dedupe/truncate common; (e) add provider-status display only after config parse is non-fatal and redacts secrets; (f) add docs derived from `535be97` but rewritten in this repo's style and language; (g) add an external-patch-intake packet under #251 recording fork URL, commit range, diffstat, reviewed files, accepted/rejected ideas, and attribution.
Acceptance: `WebSearch` continues to pass existing DDG/mock tests with no settings file; setting `.claw/settings.json` to a supported provider plus a local mock `searchProvider.json` routes through the generic executor; missing API keys fail with typed/actionable errors before dispatch; provider name is visible without leaking secrets; and the commit body/ROADMAP preserves attribution to hikaMaeng's fork commits above. **Status:** Open. No source code changed in this intake commit. Cluster delta: #245 client-side configurable provider/parser registry gains concrete external implementation evidence; #251 external-fork intake gains its first reviewed fork packet; #250 server-managed search remains the complementary server-side half and is intentionally not mixed into this local-provider implementation lane.
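A minimal sketch of the provider-independent post-parse stage from (d), assuming every provider parser has already produced a normalized hit list. `SearchHit`, `host_of`, `host_matches`, and `post_process` are hypothetical names for illustration and are not taken from `rust/crates/tools/src/lib.rs`; the host extraction is deliberately naive.

```rust
use std::collections::HashSet;

/// Hypothetical normalized hit emitted by any provider-specific parser.
#[derive(Debug, Clone, PartialEq)]
pub struct SearchHit {
    pub url: String,
    pub title: String,
}

/// Naive host extraction for http(s)-style URLs (illustrative only, no URL crate).
fn host_of(url: &str) -> Option<&str> {
    let rest = url.split_once("://")?.1;
    let host = rest.split(|c: char| matches!(c, '/' | '?' | '#')).next()?;
    if host.is_empty() { None } else { Some(host) }
}

/// True when `host` equals `domain` or is a subdomain of it.
fn host_matches(host: &str, domain: &str) -> bool {
    host == domain || host.ends_with(&format!(".{domain}"))
}

/// Shared stage: allow/block filtering, then dedupe by URL, then truncation.
/// An empty `allowed` list is treated as "allow everything" (an assumption).
pub fn post_process(
    hits: Vec<SearchHit>,
    allowed: &[&str],
    blocked: &[&str],
    max_results: usize,
) -> Vec<SearchHit> {
    let mut seen = HashSet::new();
    hits.into_iter()
        .filter(|hit| {
            let Some(host) = host_of(&hit.url) else { return false };
            let allowed_ok = allowed.is_empty() || allowed.iter().any(|d| host_matches(host, d));
            let blocked_hit = blocked.iter().any(|d| host_matches(host, d));
            allowed_ok && !blocked_hit
        })
        .filter(|hit| seen.insert(hit.url.clone()))
        .take(max_results)
        .collect()
}
```

Keeping this stage outside the provider executors means a DDG HTML executor and a generic JSON executor would only differ in how they build the `Vec<SearchHit>`.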
## Pinpoint #257 — Completed OMX sessions can keep emitting stale alerts because clawhip does not read terminal workflow state
Dogfooded 2026-04-26 11:30 KST immediately after #256 landed. The OMX session `clawcode-issue-256-tool-use-result-contiguity` had already fixed, tested, committed, pushed, and marked its ralplan state terminal (`active: false`, `current_phase: complete`, blocker none). Despite that, clawhip emitted a 10-minute stale pane alert against the idle Codex prompt because it only saw pane inactivity and did not correlate the tmux session with the `.omx/state/sessions/.../ralplan-state.json` terminal state or the recent git push event.
Concrete failure mode: completed sessions become false-positive stale work. Operators have to manually capture the pane, verify commit/push/test status, and kill the session. This creates alert fatigue and can mask real stuck sessions. It is distinct from #253 context compaction and #239 branch leases: the missing primitive is stale-monitor lifecycle integration between tmux pane state, OMX workflow state, and git/event completion receipts.
Required fix shape: clawhip stale detection should classify sessions as `complete-idle` when (a) the associated OMX/Codex workflow state is terminal, (b) the pane contains a final status block with commit/pushed/tests/blocker none, or (c) a matching git event has landed for the session branch after the prompt started. For `complete-idle`, clawhip should auto-suppress stale alerts and optionally auto-retire/kill the tmux session after a grace period, emitting a compact cleanup receipt instead of repeated stale warnings. Acceptance: after a session lands a commit and marks workflow complete, no further stale alerts are emitted for that tmux session; it is either auto-killed or reported once as completed and retired. **Status:** Open. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-26 02:30 UTC nudge; live cleanup performed by killing `clawcode-issue-256-tool-use-result-contiguity` after verifying branch clean at `56f7f2e6`.
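A rough sketch of the `complete-idle` classification rule, assuming clawhip can already gather the three completion signals per tmux session. `SessionSignals`, `StaleClass`, and `classify` are hypothetical names; the choice to report a busy pane as active even when the workflow is terminal is also an assumption.

```rust
/// Hypothetical per-session signals gathered by the stale monitor.
pub struct SessionSignals {
    pub idle_secs: u64,
    /// (a) associated OMX/Codex workflow state is terminal.
    pub workflow_terminal: bool,
    /// (b) pane contains a final status block (commit/pushed/tests/blocker none).
    pub pane_has_final_status: bool,
    /// (c) a matching git event landed for the session branch after the prompt started.
    pub git_event_after_prompt: bool,
}

#[derive(Debug, PartialEq, Eq)]
pub enum StaleClass {
    Active,
    /// Idle with no completion evidence: keep alerting.
    Stale,
    /// Idle with completion evidence: suppress alerts, retire after a grace period.
    CompleteIdle,
}

pub fn classify(signals: &SessionSignals, stale_after_secs: u64) -> StaleClass {
    if signals.idle_secs < stale_after_secs {
        return StaleClass::Active;
    }
    let completed = signals.workflow_terminal
        || signals.pane_has_final_status
        || signals.git_event_after_prompt;
    if completed { StaleClass::CompleteIdle } else { StaleClass::Stale }
}
```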
## Pinpoint #258 — `--allowedTools ""` (empty value) silently coerces to zero-tool agent with no warning, joining the silent-fallback / silent-coercion family at the CLI parse boundary
Dogfooded 2026-04-26 11:32 KST on `feat/jobdori-168c-emission-routing` at HEAD `a3f5a83` (post-#257 fast-forward verification onto gaebal-gajae's clawhip-stale-monitor pinpoint). Reproduction: `claw --allowedTools "" --output-format json -p "test"` accepts the empty value with zero warnings/errors and proceeds to dispatch — there is no diagnostic event distinguishing "operator typed `--allowedTools \"\"` by accident" from "operator wants every tool disabled." The CLI parse arm at `rust/crates/rusty-claude-cli/src/main.rs:979` accepts whatever the next argv slot contains, including `""`, and pushes it into `allowed_tool_values`. The wrapper `normalize_allowed_tools` at `main.rs:1826` checks `values.is_empty()` (returns `None` when zero `--allowedTools` flags were passed entirely) but does not check whether each individual `value` is empty/whitespace-only, so the empty string flows into `current_tool_registry()?.normalize_allowed_tools(values)` at `rust/crates/tools/src/lib.rs:192`. There the `for value in values` loop applies `value.split(|ch: char| ch == ',' || ch.is_whitespace()).filter(|token| !token.is_empty())` which yields zero tokens for `""`, the inner `for token in ...` loop never executes, the unsupported-tool error path is skipped, and the function returns `Ok(Some(BTreeSet::new()))` — a `Some(empty)` distinct from the omit-the-flag-entirely `None` case. Downstream, `tool_registry.definitions(allowed_tools)` at `lib.rs:248` filters every spec/runtime/plugin tool by `allowed_tools.is_none_or(|allowed| allowed.contains(...))`, and because `allowed.contains(...)` returns false for every name against an empty set, the agent receives **zero tools** — no read_file, no write_file, no bash, no grep, no glob, no MCP, no plugins. 
The agent boots fully, the wire request is dispatched normally (no early return), and the model receives the user prompt with an empty tool list and either hallucinates without tools or stalls when it tries to call one — meanwhile the operator sees no signal that they've just asked for a tool-less agent.
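The coercion chain above can be reproduced in isolation. The sketch below mirrors the tokenization and filter logic as described in this pinpoint; `normalize_lenient` and `visible_tools` are hypothetical stand-ins, not the functions in `main.rs` or `lib.rs`, and unsupported-tool validation is omitted.

```rust
use std::collections::BTreeSet;

/// Stand-in for the described behavior: `None` when no flag was passed,
/// otherwise the set of tokens found across all values.
pub fn normalize_lenient(values: &[String]) -> Option<BTreeSet<String>> {
    if values.is_empty() {
        return None;
    }
    let mut allowed = BTreeSet::new();
    for value in values {
        // An empty string yields zero tokens, so this inner loop never runs for "".
        for token in value
            .split(|ch: char| ch == ',' || ch.is_whitespace())
            .filter(|token| !token.is_empty())
        {
            allowed.insert(token.to_string());
        }
    }
    Some(allowed)
}

/// Stand-in for the definitions filter: `None` keeps every tool,
/// `Some(empty)` keeps none.
pub fn visible_tools<'a>(all: &[&'a str], allowed: Option<&BTreeSet<String>>) -> Vec<&'a str> {
    all.iter()
        .copied()
        .filter(|name| allowed.map_or(true, |set| set.contains(*name)))
        .collect()
}
```

The only difference between "flag omitted" and "flag passed with an empty value" is `None` versus `Some(empty)`, and nothing between the parser and the dispatcher reports which one occurred.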
Gap. The CLI parse layer treats `--allowedTools ""` as semantically equivalent to "explicitly enumerate zero tools" rather than as "operator passed a malformed empty value." This is silent-empty-coercion at the CLI parse boundary: the input string was clearly an accident (no shell idiom passes an intentional empty argument that means the same as explicitly disabling everything), but the parser, the registry, and the dispatcher all fail to surface a single diagnostic. The behavior compounds with #201/#202/#203/#206/#207/#208 (silent-fallback at the provider boundary) and #213 (silent-zero-coercion on cached_tokens) at a structurally distinct layer — the CLI parse layer rather than the provider boundary.
Cluster delta: joins the silent-fallback / silent-drop / silent-strip / silent-coercion sibling-shape cluster at the CLI-parse-boundary axis, extending it across one more structural layer. Distinct from #213's silent-zero-coercion (response-side wire deserialization) by being request-side CLI input parse, distinct from #211's silent-misnomer (parameter rename) by being empty-value-acceptance rather than name-mismatch, and distinct from #29's silent fallback (provider routing) by being a tool-allowlist permissive-vs-restrictive boundary rather than a provider-routing fallback. Does NOT found a new cluster (per #253 context-budget discipline preferring extension over founding).
Required fix shape: (a) in `normalize_allowed_tools` (`rusty-claude-cli/src/main.rs:1826`), reject empty/whitespace-only values with a typed `AllowedToolsParseError::EmptyValue { flag_position }` returning a non-zero exit and an actionable error message ("--allowedTools requires at least one tool name; pass `--allowedTools none` to explicitly disable all tools, or omit the flag to enable all"); (b) introduce an explicit `none` literal token (or `--allowedTools=none` / `--no-tools`) for the legitimate "every tool disabled" use case so the empty-string accident is structurally distinct from the intentional-disable; (c) emit a `CliFlagWarning` structured event when `--output-format json` is active so downstream consumers can surface the diagnostic; (d) add tests covering `[""]`, `[" "]`, `["", "read"]`, `["read", ""]`, and the new `["none"]` literal. Acceptance: `claw --allowedTools "" -p "x"` exits non-zero with a typed error; `claw --allowedTools none -p "x"` proceeds with explicit zero-tool intent; existing `claw --allowedTools read,glob -p "x"` and `claw -p "x"` (no flag) behaviors are preserved.
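One possible shape for (a) and (b), written as a standalone sketch rather than a patch. `AllowedToolsParseError::EmptyValue { flag_position }` follows the name proposed above; the `AllowedTools` enum and `normalize_strict` are hypothetical. Treating a value that contains only separators (for example `","`) as empty is an added assumption, and tool-name validation is left out.

```rust
use std::collections::BTreeSet;

#[derive(Debug, PartialEq, Eq)]
pub enum AllowedToolsParseError {
    EmptyValue { flag_position: usize },
}

/// Three structurally distinct intents instead of `None` vs `Some(empty)`.
#[derive(Debug, PartialEq, Eq)]
pub enum AllowedTools {
    /// Flag omitted: every tool enabled.
    All,
    /// Explicit `none` literal: every tool disabled on purpose.
    None,
    /// At least one tool name was listed.
    Only(BTreeSet<String>),
}

pub fn normalize_strict(values: &[&str]) -> Result<AllowedTools, AllowedToolsParseError> {
    if values.is_empty() {
        return Ok(AllowedTools::All);
    }
    if values.len() == 1 && values[0].trim() == "none" {
        return Ok(AllowedTools::None);
    }
    let mut set = BTreeSet::new();
    for (flag_position, value) in values.iter().enumerate() {
        let tokens: Vec<&str> = value
            .split(|ch: char| ch == ',' || ch.is_whitespace())
            .filter(|token| !token.is_empty())
            .collect();
        // Reject values that contribute no tool names at all.
        if tokens.is_empty() {
            return Err(AllowedToolsParseError::EmptyValue { flag_position });
        }
        set.extend(tokens.into_iter().map(str::to_string));
    }
    Ok(AllowedTools::Only(set))
}
```

With this shape the accidental empty string can no longer reach the registry, and the zero-tool agent is reachable only through the explicit `none` token.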
Security relevance: the inverse failure mode (empty `--disallowedTools` or empty deny-list silently permitting all tools) is the exact shape upstream PR claw-code#2806 attempted to address (empty-config permission fallback safety, opened+closed within 3min on 2026-04-26). #258 catalogues the symmetric allow-list side at the CLI flag layer rather than the config layer, complementing the upstream PR's config-layer focus.
**Status:** Open. No source code changed. Filed 2026-04-26 11:32 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `a3f5a83` (post-#257 fast-forward verification). Cluster delta: silent-fallback-family extension (no new cluster founded, per #253 context-budget discipline). Smaller-scope by design (matches #253/#254/#257's discipline). Sibling: #201/#202/#203/#206/#207/#208/#213 (silent-fallback-family) at the provider boundary; #258 extends the family to the CLI parse boundary as the first member where the silent-coercion happens before any wire dispatch. Linked to upstream PR claw-code#2806 (empty-config permission fallback safety) as the symmetric config-layer half of the same anti-pattern.
## Pinpoint #259 — Dogfood status reports can publish stale branch/phase facts without provenance or freshness checks
Dogfooded 2026-04-26 12:00 KST after cycle #396: a dogfood status report posted minutes after commits #254-#258 had landed, but claimed the branch was only four commits ahead of `dev`, last commit `94f9540`, no new commits since 2026-04-23, no active session today, and no new pinpoints filed on 2026-04-26. The live branch at the same time already contained `70058a0` #254, `62adbf4` #255, `56f7f2e` #256 real code fix, `a3f5a83` #257, and `a07c0b7` #258. The report looked authoritative but was generated from stale memory rather than a fresh git/ROADMAP read.
Concrete failure mode: multi-agent dogfood coordination can regress to outdated phase summaries even while the branch is actively moving. Operators then have to manually cross-check `git log`, ROADMAP markers, and chat history to decide whether the report is actionable. This is distinct from #253 compact state-vector budgeting: #253 bounds context size; #259 requires freshness/provenance assertions before publishing a compact status.
Required fix shape: every dogfood status report should include machine-checked provenance fields (`generated_at`, `repo`, `branch`, `head`, `head_timestamp`, `roadmap_last_pinpoint`, `git_fetch_time`, `source=git+ROADMAP`, `staleness_seconds`) and refuse/label reports when the source snapshot is older than a small threshold. `claw dogfood status --compact` should fetch, parse latest ROADMAP pinpoint id, compare against local chat-memory claims, and emit `STALE_STATUS_SOURCE` if they disagree. Acceptance: a report cannot claim “no new commits/new pinpoints” while `origin/feat/jobdori-168c-emission-routing` contains newer commits/pinpoints than its own provenance head. **Status:** Open. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-26 03:00 UTC nudge; live branch was verified before filing and pushed on top of #258.
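A small sketch of the freshness gate, assuming the report carries a subset of the provenance fields listed above and that a fresh fetch plus ROADMAP parse is available for comparison. `Provenance`, `LiveSnapshot`, `StatusCheck`, and `check_status` are hypothetical names; only `STALE_STATUS_SOURCE` (rendered here as the `StaleStatusSource` variant) comes from the fix shape.

```rust
/// Hypothetical provenance header attached to a status report.
pub struct Provenance {
    pub head: String,
    pub roadmap_last_pinpoint: u32,
    pub staleness_seconds: u64,
}

/// Hypothetical result of a fresh `git fetch` plus ROADMAP parse.
pub struct LiveSnapshot {
    pub head: String,
    pub roadmap_last_pinpoint: u32,
}

#[derive(Debug, PartialEq, Eq)]
pub enum StatusCheck {
    Fresh,
    /// Carries human-readable reasons so the report can be labeled or refused.
    StaleStatusSource(Vec<String>),
}

pub fn check_status(report: &Provenance, live: &LiveSnapshot, max_staleness_seconds: u64) -> StatusCheck {
    let mut reasons = Vec::new();
    if report.staleness_seconds > max_staleness_seconds {
        reasons.push(format!(
            "snapshot is {}s old (limit {}s)",
            report.staleness_seconds, max_staleness_seconds
        ));
    }
    if report.head != live.head {
        reasons.push(format!("report head {} != live head {}", report.head, live.head));
    }
    if report.roadmap_last_pinpoint < live.roadmap_last_pinpoint {
        reasons.push(format!(
            "report stops at #{} but ROADMAP reaches #{}",
            report.roadmap_last_pinpoint, live.roadmap_last_pinpoint
        ));
    }
    if reasons.is_empty() { StatusCheck::Fresh } else { StatusCheck::StaleStatusSource(reasons) }
}
```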
## Pinpoint #260 — `--compact --output-format json` envelope silently strips six observability fields (auto_compaction, iterations, tool_uses, tool_results, prompt_cache_events, estimated_cost) that the non-compact JSON envelope emits, with no diagnostic, no marker delta beyond `compact: true`, and no documentation that the strip occurs
Dogfooded 2026-04-26 12:05 KST on `feat/jobdori-168c-emission-routing` at HEAD `1daf636` (post-#259 fast-forward verification). The dispatch in `LiveCli::run_turn_with_output` (`rust/crates/rusty-claude-cli/src/main.rs:4637-4650`) routes `CliOutputFormat::Json if compact` to `run_prompt_compact_json` (`main.rs:4665-4688`) and `CliOutputFormat::Json` (no compact) to `run_prompt_json` (`main.rs:4690-4729`). Both paths receive the SAME `runtime::TurnSummary` from `runtime.run_turn(...)`, but the two envelopes serialize wildly different field sets. `run_prompt_json` emits nine top-level keys: `message`, `model`, `iterations`, `auto_compaction`, `tool_uses`, `tool_results`, `prompt_cache_events`, `usage`, `estimated_cost`. `run_prompt_compact_json` emits four: `message`, `compact: true`, `model`, `usage`. **Six observability-critical fields are dropped silently**, most notably `auto_compaction` (the structured signal that the runtime auto-compacted the session mid-turn, removing N messages from history) and `iterations` (turn-loop iteration count, the only non-summary signal of how the agent reached the final assistant text). The `compact: true` marker is the ONLY diff a downstream JSON consumer can observe; nothing in the envelope, the help text, or any structured-error stream tells the operator that adding `--compact` discarded the auto-compaction event, the iteration count, the tool-use trace, the tool-result trace, the prompt-cache events, and the cost estimate. Operators who script `claw -p "x" --compact --output-format json | jq` to keep wire size small unknowingly lose the only signal that auto-compaction fired, and the only way to recover it is to remove `--compact` and re-run the prompt.
Gap. This is **silent-strip-on-response-envelope at the CLI output layer**, distinct from #136 (which only verified that `run_prompt_compact_json` exists and emits valid JSON with `compact: true`, never auditing what the envelope drops vs. its non-compact sibling) and distinct from #98 (which audited `--compact` being silently *ignored* outside the prompt-text path; #136 closed that by adding the dispatch arm — but the new envelope itself is the gap). The compact-JSON path was added to honor the flag, but the envelope was hand-coded with a minimal field set that omits exactly the fields a JSON-mode operator most needs (auto_compaction event, iteration count, cost). Worse, `auto_compaction` is the documented mechanism by which #134/#135's session-identity signals propagate — stripping it silently disables that downstream observability for any consumer that scripted around `--compact --output-format json`.
Cluster delta: joins the silent-fallback / silent-drop / silent-strip / silent-coercion sibling-shape cluster, extending it from 8 to 9 members. Distinct from #258 (CLI parse boundary, request-side), distinct from #213/#207/#208 (provider boundary, response-side wire deserialization), distinct from #203 (no streaming auto_compaction event at all). #260 is the FIRST member where the silent-strip happens at the **CLI response-envelope serialization layer** — after the runtime has fully populated the summary, the CLI itself drops the fields when assembling the JSON. Founds the **CLI-response-envelope-silent-strip sub-shape** within the silent-fallback family: the runtime computes the signal correctly; the CLI envelope serializer chooses not to surface it; no diagnostic surfaces the choice. Sibling-shape with #258 in that both extend the silent-fallback cluster at the CLI boundary, but #258 is request-side parse and #260 is response-side serialize — together they bracket the full CLI I/O perimeter for the silent-fallback family. Does NOT found a new top-level cluster (per #253 context-budget discipline preferring extension over founding).
Required fix shape: (a) align `run_prompt_compact_json` envelope so it emits the SAME field set as `run_prompt_json` minus only the fields whose value is genuinely stripped by the compact intent (the documented intent is "strip tool call details; print only the final assistant text" — so dropping `tool_uses`/`tool_results` is intentional, but dropping `auto_compaction`/`iterations`/`prompt_cache_events`/`estimated_cost` is not); concretely, add `iterations`, `auto_compaction`, `prompt_cache_events`, and `estimated_cost` to the compact-JSON envelope; (b) document the field-set delta in `--help` for `--compact` ("in JSON mode, strips `tool_uses` and `tool_results`; preserves `auto_compaction`, `iterations`, `prompt_cache_events`, `usage`, `estimated_cost`"); (c) add a regression test `run_prompt_compact_json_preserves_auto_compaction_signal` that asserts the compact-JSON envelope contains the `auto_compaction` key (null or populated) so future envelope edits cannot silently regress; (d) optionally emit a structured `EnvelopeFieldStrip` event listing dropped fields when `--output-format json` is active so downstream consumers can self-discover what the compact lane drops. Acceptance: `claw -p "x" --compact --output-format json | jq 'keys'` returns at least `["auto_compaction", "compact", "estimated_cost", "iterations", "message", "model", "prompt_cache_events", "usage"]`; the only fields stripped relative to non-compact are the documented `tool_uses`/`tool_results`; a synthetic auto-compaction event surfaces under `--compact` identically to non-compact.
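One way to keep the two envelopes from drifting is to derive the compact key set from the full key set plus an explicit strip list, instead of hand-coding both. The sketch below only models key names, not the serialized values; `FULL_KEYS`, `COMPACT_STRIPPED`, and `compact_keys` are hypothetical names.

```rust
/// Keys of the non-compact JSON envelope as enumerated in this pinpoint.
pub const FULL_KEYS: [&str; 9] = [
    "message",
    "model",
    "iterations",
    "auto_compaction",
    "tool_uses",
    "tool_results",
    "prompt_cache_events",
    "usage",
    "estimated_cost",
];

/// The only fields the compact intent is documented to strip.
pub const COMPACT_STRIPPED: [&str; 2] = ["tool_uses", "tool_results"];

/// Compact key set = full key set minus the documented strip list, plus the marker.
/// Returned sorted so it can be compared against `jq 'keys'` output.
pub fn compact_keys() -> Vec<&'static str> {
    let mut keys: Vec<&'static str> = FULL_KEYS
        .iter()
        .copied()
        .filter(|key| !COMPACT_STRIPPED.contains(key))
        .collect();
    keys.push("compact");
    keys.sort_unstable();
    keys
}
```

Under this design a new field added to the full envelope shows up in compact mode by default, and dropping it requires an explicit, reviewable entry in the strip list.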
**Status:** Open. No source code changed. Filed 2026-04-26 12:05 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `1daf636` (post-#259 fast-forward verification onto gaebal-gajae's stale-status-report-provenance pinpoint). Cluster delta: silent-fallback-family extension 8→9 (no new top-level cluster founded, per #253 context-budget discipline). CLI-response-envelope-silent-strip sub-shape introduced (response-envelope strip layer, sibling to #258's request-parse layer). Sibling: #98 (silent-flag-no-op class, predecessor at the dispatch layer, closed by #136), #136 (compact-JSON envelope existence, closed without auditing field-set parity), #203 (auto_compaction summary-only, no streaming event; #260 escalates: even the summary signal is dropped under `--compact`), #258 (CLI parse boundary silent-coercion, request-side complement to #260's response-side strip). Does not duplicate #98/#136: those audited dispatch and envelope-existence; #260 audits envelope-content-parity vs. its non-compact sibling, a structurally distinct surface.
## Pinpoint #261 — Compact dogfood summaries lack internal consistency checks for counted ranges and enumerated items
Dogfooded 2026-04-26 12:30 KST against the live status summaries after #260. Multiple compact reports correctly switched to fresh-HEAD style after #259, but still published internally inconsistent arithmetic: one report said “8 new pinpoints filed today (#252-#260)” while enumerating #252, #253, #254, #255, #256, #257, #258, #259, and #260, which is nine items. Another nearby report said “9 items across cycles #394-#400” while also listing #252-#260, again mixing range count, cycle count, and bullet count without validation. The report can be fresh and provenance-backed yet still self-contradictory.
Concrete failure mode: downstream agents use compact summaries to decide whether a cycle was handled, but a range/count mismatch forces manual recounting and can cause skipped cycle numbers or duplicate filings. This is distinct from #259: #259 verifies source freshness against git/ROADMAP; #261 verifies the report's own derived fields after freshness is established.
Required fix shape: `claw dogfood status --compact` should compute and validate `pinpoint_range_start`, `pinpoint_range_end`, `pinpoint_count`, `cycle_range`, and `listed_items_count` from the same parsed ledger, not freeform text. If the rendered text contains a numeric count or range, a pre-send validator should assert `end-start+1 == listed_items_count == pinpoint_count` and emit `STATUS_COUNT_MISMATCH` instead of publishing. Acceptance: a status report cannot say “8 items (#252-#260)” while listing nine bullets; the command either corrects the count or refuses the report with the mismatched fields. **Status:** Open. Filed as ROADMAP-only dogfood pinpoint from the 2026-04-26 03:30 UTC nudge; branch verified and pushed on top of #260.
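The pre-send assertion can be sketched as a pure function over the parsed fields. `StatusCountMismatch` and `validate_counts` are hypothetical names; the error struct stands in for the `STATUS_COUNT_MISMATCH` emission and carries the three disagreeing values.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct StatusCountMismatch {
    pub range_len: u32,
    pub pinpoint_count: u32,
    pub listed_items_count: usize,
}

/// Checks `end - start + 1 == listed_items_count == pinpoint_count`.
/// An inverted range (end < start) is treated as length 0 (an assumption).
pub fn validate_counts(
    range_start: u32,
    range_end: u32,
    pinpoint_count: u32,
    listed_items: &[u32],
) -> Result<(), StatusCountMismatch> {
    let range_len = range_end
        .checked_sub(range_start)
        .map_or(0, |delta| delta.saturating_add(1));
    let listed_items_count = listed_items.len();
    if range_len == pinpoint_count && listed_items_count == pinpoint_count as usize {
        Ok(())
    } else {
        Err(StatusCountMismatch { range_len, pinpoint_count, listed_items_count })
    }
}
```

For the report quoted above, the range #252 through #260 has length 9 and nine items are listed, so a claimed count of 8 is rejected with all three values attached.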
## Pinpoint #262 — `--max-turns N` is structurally absent from the CLI surface AND fails in two different shapes depending on argv position relative to `-p`: pre-`-p` raises `unknown option`, post-`-p` is silently absorbed into the prompt body via `args[index+1..].join(" ")` greedy slurp, with no diagnostic and no help-text mention
Dogfooded 2026-04-26 12:32 KST on `feat/jobdori-168c-emission-routing` at HEAD `2a0e5de` (post-#261 fast-forward verification onto gaebal-gajae's compact-summary self-consistency-check pinpoint). Reproduction matrix against `./rust/target/release/claw`:
- `claw -p "say hi" --max-turns 0` → exits with `[error-kind: missing_credentials]` (the prompt body becomes `"say hi --max-turns 0"` and dispatch proceeds normally; no flag was rejected, no diagnostic about `--max-turns`, the credential error is downstream of an already-mangled prompt).
- `claw --max-turns 0 -p "say hi"` → exits with `[error-kind: cli_parse]` `error: unknown option: --max-turns` (rejected by `format_unknown_option` at `rust/crates/rusty-claude-cli/src/main.rs:1565` because the parse loop sees `--max-turns` BEFORE `-p` and falls into the catch-all `other if rest.is_empty() && other.starts_with('-')` arm at `main.rs:993`).
- `claw --max-turns=0 -p "say hi"` → same `unknown option: --max-turns=0` rejection.
- `claw "hello world" --max-turns 0` → bare-prompt branch silently accepts (positional rest collects `["hello world", "--max-turns", "0"]` because the `other` arm at `main.rs:996-999` pushes any non-flag-after-rest onto `rest`).
Gap. **`--max-turns` does not exist** in the claw-code CLI surface: zero entries in `CLI_OPTION_SUGGESTIONS` (`main.rs:176-194`), zero match arms in `parse_args` (`main.rs:811-1004`), zero mention in `--help` output, zero typed `MaxTurns` field on `CliAction::Prompt` (`main.rs:696-749`), zero turn-budget plumbing into `LiveCli::run_turn_with_output` or `runtime.run_turn`. The only `turns` accounting in the runtime is the post-hoc `UsageTracker::turns()` counter (`main.rs:3156, 4915, 5762`) — a read-only odometer, not an enforced ceiling. This contrasts with Claude Code (the upstream CLI) which exposes `--max-turns N` as a documented turn-budget enforcement flag and which is the canonical way operators bound runaway tool-use loops in non-interactive `-p` mode.
Worse, the failure mode is **structurally asymmetric depending on argv position relative to `-p`** — a property no other silent-fallback family member exhibits. The `-p` arm at `main.rs:944` does `let prompt = args[index + 1..].join(" ")`, a greedy-slurp design that consumes EVERYTHING after `-p` as prompt body without re-entering the flag-parse loop. Any unknown flag passed AFTER `-p` is silently absorbed into the user's prompt. A `--max-turns 0` passed after `-p` is not just unsupported; it is invisibly mutated into prompt content, polluting the model input with operator-intended-machine-control-tokens that the model will see as natural language. A `--max-turns 0` passed BEFORE `-p` is at least surfaced as `unknown option`. The two outcomes — silent-prompt-pollution vs. typed-cli_parse-error — for the SAME flag differ ONLY by argv position, with no documentation that the boundary exists. The `-p` greedy-slurp is the actual silent-fallback site; `--max-turns` is just one observable instance of the class.
Cluster delta: joins the silent-fallback / silent-drop / silent-strip / silent-coercion / silent-prompt-absorption sibling-shape cluster, **extending it from 9 to 10 members** (#258 CLI-parse empty-coercion → #260 response-envelope strip → #262 CLI-parse position-sensitive-prompt-absorption). #262 is the FIRST member where the silent shape is **conditional on argv position relative to a sibling flag**, founding the **position-sensitive-parse-asymmetry sub-shape** within the silent-fallback family: the same input text produces a typed error or silent prompt-pollution depending only on argv ordering. Distinct from #258 (`--allowedTools ""` empty-string-coercion: silent always, regardless of position) by being position-conditional. Distinct from #260 (compact-JSON envelope strip: silent always, response-side) by being request-side argv-parse. Distinct from prior unknown-option behavior (which is not silent) because the silent path is reached only when the unknown flag arrives after `-p`. Audit-completeness for the silent-fallback chain at the **numeric-flag boundary** AND the **position-sensitive-CLI-parse boundary** simultaneously — two structurally distinct surfaces audited in one pinpoint. Does NOT found a new top-level cluster (per #253 context-budget discipline preferring extension over founding); the position-sensitive-parse-asymmetry is registered as a sub-shape inside the existing silent-fallback family.
Required fix shape: (a) declare `--max-turns N` as a typed CLI flag with `validate_max_turns` accepting `u32` (rejecting negative and non-numeric values with a typed `MaxTurnsParseError`), thread `max_turns: Option<u32>` through `CliAction::Prompt` and `LiveCli::run_turn_with_output`, and pass it as a hard ceiling into the runtime turn loop so `runtime.run_turn` returns a typed `TurnBudgetExhausted` event when the count is reached; (b) add `--max-turns` to `CLI_OPTION_SUGGESTIONS` (`main.rs:176-194`) and to `--help` output; (c) restructure the `-p` arm at `main.rs:944` so it does NOT greedily slurp `args[index+1..].join(" ")` but instead consumes only the next argv slot as the prompt and continues the flag-parse loop, OR explicitly require `-p` to be the LAST flag (rejecting any token starting with `-` after `-p` with `error: -p must be the final flag; saw '--max-turns' after the prompt`); (d) treat `--max-turns 0` semantically as "return immediately after dispatch with `iterations: 0` and no model call" (matching upstream Claude Code's documented zero-turn behavior, useful for cost-zero parse-validation runs); (e) emit a `CliFlagWarning` structured event when `--output-format json` is active and an unknown flag is detected after `-p`, so downstream consumers can surface the would-have-been-silent prompt-pollution diagnostic; (f) add tests covering `["-p", "x", "--max-turns", "0"]`, `["--max-turns", "0", "-p", "x"]`, `["-p", "x", "--unknown-flag"]`, `["--max-turns=5", "-p", "x"]`, `["-p", "x", "--max-turns=-1"]`, and `["hello", "--max-turns", "0"]` (bare-prompt rest-positional case). 
Acceptance: `claw -p "x" --max-turns 0` either rejects with a typed error OR enforces the turn budget without silently mutating the prompt; `claw --max-turns 0 -p "x"` and `claw -p "x" --max-turns 0` produce structurally equivalent outcomes (no position-sensitive divergence); `claw --help` lists `--max-turns N`; downstream JSON consumers can detect the would-have-been-silent absorption case.
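A self-contained sketch of the first option in (c): `-p` consumes exactly one argv slot and the flag loop continues, so flag position no longer changes the outcome. This is not the parser in `main.rs`; `PromptArgs`, `ParseError`, `parse_max_turns`, and this `parse_args` are hypothetical, and only `-p`, `--max-turns`, and a bare positional prompt are modeled.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum ParseError {
    UnknownOption(String),
    MissingValue(&'static str),
    InvalidMaxTurns(String),
    UnexpectedArgument(String),
}

#[derive(Debug, Default, PartialEq, Eq)]
pub struct PromptArgs {
    pub prompt: Option<String>,
    pub max_turns: Option<u32>,
}

fn parse_max_turns(raw: &str) -> Result<u32, ParseError> {
    // `u32` parsing rejects negative and non-numeric input.
    raw.parse::<u32>()
        .map_err(|_| ParseError::InvalidMaxTurns(raw.to_string()))
}

pub fn parse_args(args: &[&str]) -> Result<PromptArgs, ParseError> {
    let mut parsed = PromptArgs::default();
    let mut index = 0;
    while index < args.len() {
        match args[index] {
            "-p" => {
                // Take only the next slot as the prompt; do not slurp the rest.
                let value = args.get(index + 1).ok_or(ParseError::MissingValue("-p"))?;
                parsed.prompt = Some((*value).to_string());
                index += 2;
            }
            "--max-turns" => {
                let value = args.get(index + 1).ok_or(ParseError::MissingValue("--max-turns"))?;
                parsed.max_turns = Some(parse_max_turns(value)?);
                index += 2;
            }
            other if other.starts_with("--max-turns=") => {
                parsed.max_turns = Some(parse_max_turns(&other["--max-turns=".len()..])?);
                index += 1;
            }
            other if other.starts_with('-') => {
                return Err(ParseError::UnknownOption(other.to_string()));
            }
            other => {
                // Bare positional prompt; a second positional is rejected.
                if parsed.prompt.is_some() {
                    return Err(ParseError::UnexpectedArgument(other.to_string()));
                }
                parsed.prompt = Some(other.to_string());
                index += 1;
            }
        }
    }
    Ok(parsed)
}
```

The trade-off is that a multi-word prompt must be quoted as a single argv slot, which is a behavior change for anyone relying on the greedy slurp.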
**Status:** Open. No source code changed. Filed 2026-04-26 12:32 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `2a0e5de` (post-#261 fast-forward verification onto gaebal-gajae's compact-summary self-consistency-check pinpoint). Cluster delta: silent-fallback-family extension 9→10 (no new top-level cluster founded, per #253 context-budget discipline). Position-sensitive-parse-asymmetry sub-shape introduced (request-side argv-parse layer, sibling to #258's empty-value-coercion at the same layer and #260's response-envelope-strip at the response-side serialize layer). Sibling: #258 (`--allowedTools ""` empty-coercion at CLI parse boundary, position-invariant), #260 (`--compact --output-format json` envelope strip at CLI response-envelope layer, position-invariant), #98/#136 (predecessor `--compact` silent-no-op family at the dispatch layer). Together with #258 and #260, #262 brackets the CLI parse boundary across THREE structurally distinct silent-fallback shapes: empty-value-coercion (#258), response-envelope-strip (#260), and position-sensitive-prompt-absorption-of-unknown-flag (#262). Tenth-cycle concurrent-dogfood-rebase parity confirmed local==origin==fork at HEAD `2a0e5de` before filing.
## Pinpoint #263 — `--compact` help text says "text mode only" even though the CLI has a live compact-JSON dispatch path, creating a stale-contract trap for operators auditing JSON observability
Dogfooded 2026-04-26 13:02 KST on `feat/jobdori-168c-emission-routing` at HEAD `0e4fd38` (post-#262). Fresh `cargo run --quiet --bin claw -- --help` from `rust/` prints `--compact Strip tool call details; print only the final assistant text (text mode only; useful for piping)`. That help text is now false/stale: #260 already verified that `LiveCli::run_turn_with_output` dispatches `CliOutputFormat::Json if compact` into `run_prompt_compact_json`, so `--compact --output-format json` is a real live mode, not text-only. The product surface therefore gives operators the wrong contract at exactly the place they would look before testing the compact JSON envelope.
Concrete failure mode: an operator trying to inspect or script compact JSON observability sees help text claiming `--compact` is text-mode-only, while the runtime actually accepts compact JSON and emits a reduced JSON envelope. This can cause two bad outcomes: (1) users avoid testing/using compact JSON because the help says it should not exist; or (2) downstream claws treat compact JSON behavior as accidental/unsupported even though there is a dedicated code path. That stale help text also masked #260's more serious envelope-field strip: the documented contract never states which JSON fields compact mode preserves or drops because it incorrectly says JSON mode is out of scope.
Gap. This is a **help-contract drift / doc-to-runtime divergence** at the CLI surface, distinct from #260. #260 is about the runtime compact-JSON envelope silently stripping observability fields after dispatch; #263 is about the advertised CLI contract being stale before dispatch. The runtime has a feature the help denies exists. It is also distinct from #262's `--max-turns` absence: #262 is a missing flag plus position-sensitive parse asymmetry; #263 is an existing flag whose mode matrix is documented incorrectly.
Required fix shape: (a) update `--compact` help text to describe the actual mode matrix: text mode strips tool-call detail into final assistant text; JSON mode emits a compact JSON envelope; (b) document the compact-JSON field contract after #260 is fixed, explicitly naming preserved fields (`iterations`, `auto_compaction`, `prompt_cache_events`, `usage`, `estimated_cost`) and intentionally stripped fields (`tool_uses`, `tool_results`); (c) add a help-output regression that fails if `--compact` still says `text mode only` while `CliOutputFormat::Json if compact` remains supported; (d) optionally add `claw --help --json` / structured flag metadata later so mode compatibility can be generated from the same source as parser dispatch instead of hand-written prose. Acceptance: `claw --help` no longer claims `--compact` is text-only; compact JSON's supported status and field delta are discoverable before running a prompt; help output and dispatch matrix cannot drift silently.
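The help-output regression in (c) could be as small as a string check over the rendered help. The sketch assumes the test can obtain the help text and knows whether the compact-JSON dispatch arm is still supported; `compact_help_is_stale` is a hypothetical helper.

```rust
/// Returns true when the `--compact` help line still claims text-only behavior
/// while compact JSON dispatch is supported.
pub fn compact_help_is_stale(help_text: &str, compact_json_supported: bool) -> bool {
    compact_json_supported
        && help_text
            .lines()
            .filter(|line| line.trim_start().starts_with("--compact"))
            .any(|line| line.to_lowercase().contains("text mode only"))
}
```

A string check like this only guards the one phrase; generating help from the same metadata as the dispatch table, as suggested in (d), would remove the need for it.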
**Status:** Open. No source code changed. Filed 2026-04-26 13:02 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `0e4fd38` before filing. Cluster delta: help-contract-drift +1; CLI-contract-observability gap adjacent to #260 but not a duplicate. Concrete delta this cycle: ROADMAP-only pinpoint appended after fresh help-output verification; no implementation landed.
## Pinpoint #264 — Turn-budget primitive is structurally absent from the runtime: `max_iterations` exists as an untyped builder knob with `usize::MAX` default, the live CLI never calls it, and exhaustion produces a bare string `RuntimeError` with no typed `TurnBudgetExhausted` event, no session-state turn counter, no warning event before the cap, and no zero-turn semantics — sister pinpoint to #262 covering the runtime-side of the same turn-budget concern
Dogfooded 2026-04-26 13:05 KST on `feat/jobdori-168c-emission-routing` at HEAD `d0aa18e` (post-#263 fast-forward verification onto gaebal-gajae's `--compact` help-text-vs-dispatch-mismatch pinpoint). #262 audited the CLI **parse-side** of the turn-budget gap (no `--max-turns` flag, plus position-sensitive prompt absorption when the unknown flag arrives after `-p`). #264 audits the **runtime-side**: even if the CLI flag existed and were wired through `CliAction::Prompt`, the runtime layer it would have to plumb into has no typed turn-budget primitive at all. The existing `max_iterations: usize` field at `rust/crates/runtime/src/conversation.rs:132` is a flat untyped builder knob, the `with_max_iterations` builder at line 192 takes only a raw `usize` with no validation, the field defaults to `usize::MAX` at line 181, the cap check at lines 343-352 returns a bare string-bodied `RuntimeError::new("conversation loop exceeded the maximum number of iterations")` with no typed discriminant, and `RuntimeError` itself at lines 87-93 is a single-field struct holding only `message: String` with no `kind` enum / no `RuntimeErrorKind::TurnBudgetExhausted` variant / no machine-readable reason field. The post-completion `TurnSummary` at lines 110-118 carries `iterations: usize` as a passive odometer but has no `budget: Option<u32>` / `budget_remaining: Option<u32>` / `budget_exhausted: bool` companion fields and no in-loop event surface that warns before the cap is reached.
Verified concrete surface (rg across `rust/crates/`): zero `TurnBudget`, zero `TurnBudgetExhausted`, zero `TurnBudgetWarning`, zero `TurnsExhausted`, zero `TurnLimit`, zero `TurnCap`, zero `turn_budget`, zero `turn_limit`, zero `turn_cap` symbols anywhere in the workspace. Only two `with_max_iterations` callers exist: `conversation.rs:1768` (one unit test setting `1`) and `tools/lib.rs:3589` (subagent runtime with the hardcoded `DEFAULT_AGENT_MAX_ITERATIONS: usize = 32` const at `tools/lib.rs:3475`). The live primary CLI dispatch in `rusty-claude-cli/src/main.rs:7705` constructs `ConversationRuntime::new_with_features(...)` and **never** chains `.with_max_iterations(...)`, so the main interactive and non-interactive CLI always run with `usize::MAX` as the ceiling — the cap check on line 344 is dead code in the primary product surface and only triggers in the subagent dispatch and the one unit test. There is no telemetry/event/log emitted as the iteration count grows; `record_assistant_iteration` at line 593 records each iteration but emits no warning when iteration count crosses (say) 50% / 75% / 90% of the configured budget. There is no zero-turn semantic (a `max_iterations = 0` config would fail-fast on the first iteration with the same generic string error rather than returning a typed `iterations: 0, no_model_call: true` outcome useful for cost-zero parse-validation runs).
Gap. The runtime treats turn-count as an **untyped odometer plus a single emergency tripwire**, not as a first-class budget primitive. There is no `pub struct TurnBudget { max_turns: u32, warn_at_pct: Option<u8>, on_exhaust: ExhaustionPolicy }` shape, no `pub enum ExhaustionPolicy { Error, ReturnPartial, RequestExtension }`, no `pub enum RuntimeErrorKind { TurnBudgetExhausted { iterations: usize, max: usize }, ApiError, SessionError, HookError, … }`, no in-loop `TurnBudgetEvent::ApproachingLimit { iterations, max }` lane event, no per-session `Session::turn_counter` field that persists across turns (the iteration counter resets to 0 at the top of every `run_turn`, so a long session can run 100 turns with 32 iterations each — 3,200 model calls — without any cumulative budget tripping). The downstream effect: even if #262's CLI parse-side fix lands and a `--max-turns N` flag becomes plumbable, the runtime has no typed surface to plumb it INTO; the only available landing site is the same untyped `with_max_iterations(usize)` builder, which conflates per-turn iteration cap (the existing knob) with cumulative-session turn cap (the upstream Claude Code semantic), gives no typed exhaustion event, and silently disagrees with the subagent-only `DEFAULT_AGENT_MAX_ITERATIONS = 32` ceiling.
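The per-turn reset described in this gap can be illustrated with a minimal sketch. `SessionCounters`, `TurnOutcome`, `run_turn`, and `simulate` are hypothetical names chosen for illustration, not the claw-code types; the sketch assumes only what the paragraph states, namely that the iteration counter restarts at 0 on every turn and that no cumulative counter exists today.

```rust
/// Hypothetical per-session state; not the real `Session` struct.
#[derive(Debug, Default)]
struct SessionCounters {
    /// Cumulative model calls across every turn in the session.
    total_iterations: u64,
    /// Number of turns started so far.
    turns: u64,
}

/// Outcome of one simulated turn.
#[derive(Debug, PartialEq, Eq)]
enum TurnOutcome {
    Completed { iterations: u32 },
    PerTurnCapHit { max: u32 },
    SessionCapHit { total: u64, max: u64 },
}

/// Simulates one turn that wants `wanted` model calls.
/// The local `iterations` counter starts at 0 on every call, mirroring the
/// reset at the top of `run_turn` described in the text.
fn run_turn(
    session: &mut SessionCounters,
    wanted: u32,
    per_turn_max: u32,
    session_max: Option<u64>,
) -> TurnOutcome {
    session.turns += 1;
    let mut iterations = 0u32;
    while iterations < wanted {
        if iterations >= per_turn_max {
            return TurnOutcome::PerTurnCapHit { max: per_turn_max };
        }
        if let Some(max) = session_max {
            if session.total_iterations >= max {
                return TurnOutcome::SessionCapHit { total: session.total_iterations, max };
            }
        }
        iterations += 1;
        session.total_iterations += 1;
    }
    TurnOutcome::Completed { iterations }
}

/// Runs up to `turns` turns and returns the total model calls made plus the
/// first non-completed outcome, if any.
fn simulate(
    turns: u32,
    wanted: u32,
    per_turn_max: u32,
    session_max: Option<u64>,
) -> (u64, Option<TurnOutcome>) {
    let mut session = SessionCounters::default();
    for _ in 0..turns {
        let outcome = run_turn(&mut session, wanted, per_turn_max, session_max);
        if !matches!(outcome, TurnOutcome::Completed { .. }) {
            return (session.total_iterations, Some(outcome));
        }
    }
    (session.total_iterations, None)
}
```

With `session_max = None`, 100 turns of 32 iterations complete without any cap firing, which is the 3,200-call case from the paragraph; passing a cumulative ceiling is what makes the session-level tripwire possible at all.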
Cluster shape novelty: founds the **NEW Turn-budget-primitive cluster** with #264 as solo founder. The cluster catalogues missing typed primitives at the iteration/turn/session-budget axis: `TurnBudget` config struct, `TurnBudgetExhausted` typed exhaustion event, `TurnBudgetWarning` pre-exhaustion event, `Session::turn_counter` cumulative state, `ExhaustionPolicy` enum, and zero-turn semantics. Distinct from #262 (CLI parse-side; #262 is the request-shape gap, #264 is the type-shape gap underneath it) — together #262+#264 bracket the full turn-budget concern across the CLI parse boundary and the runtime primitive boundary. Distinct from the silent-fallback family (#258/#260/#262): the silent-fallback family is about silent input mutation and silent output stripping at boundaries; #264 is about a **missing typed primitive** layer that prevents typed errors from existing at all, regardless of whether the boundary is silent or loud. Distinct from #239 branch leases / #243 canonical ordering / #253 context-budget compaction (those are coordination/operational primitives at the cycle/branch level; #264 is a runtime-loop primitive at the model-iteration level). Distinct from #229/#238/#244 persistent-WebSocket cluster (those are bidirectional client-driven streams; #264 is a synchronous loop counter). #264 sits in the **runtime-typed-primitive layer**, parallel to #254 (MCP Resources lifecycle, also runtime-protocol layer) — both pinpoints catalogue missing typed primitives the runtime should expose to higher layers.
Discovery-pattern continuation: #264 is the first member of a **complementary-pinpoint-pair-bundle at the same turn-budget concern across two structural layers**, sister-shaped to #245+#250 (client/server complementary pair at the WebSearch concern). The pair #262 (CLI-parse layer) + #264 (runtime-primitive layer) catalogues both halves of the same operator capability gap and demonstrates that audit-completeness for a single user-facing flag often requires pinpointing TWO distinct internal layers rather than a single dispatch site. Does NOT extend the silent-fallback family (10 members at the close of #262); founds a fresh top-level cluster instead because the missing primitive is the **prerequisite layer** silent-fallback siblings would land typed errors INTO.
Required fix shape: (a) introduce `pub struct TurnBudget { pub max_turns: u32, pub max_iterations_per_turn: Option<u32>, pub warn_at_pct: Option<u8>, pub on_exhaust: ExhaustionPolicy }` with `Default::default()` returning unbounded; (b) introduce `pub enum ExhaustionPolicy { Error, ReturnPartial }` defaulting to `Error`; (c) replace `RuntimeError { message: String }` with `RuntimeError { kind: RuntimeErrorKind, message: String }` where `RuntimeErrorKind` is a typed enum including `TurnBudgetExhausted { iterations: usize, max: usize }`, `IterationBudgetExhausted { iterations: usize, max: usize }`, `ApiError`, `SessionError`, `HookError`, `HealthProbeFailed`, `Other`; (d) replace `with_max_iterations(usize)` with `with_turn_budget(TurnBudget)` (keep old builder as `#[deprecated]` alias to preserve subagent compatibility), wire the new builder through `ConversationRuntime` and `BuiltRuntime`; (e) add `Session::turn_counter: u64` persisted across turns and increment in `run_turn` before iteration loop; (f) add `pub enum RuntimeEvent { … TurnBudgetWarning { iterations, max, pct }, TurnBudgetExhausted { iterations, max }, IterationBudgetWarning { … } }` lane events emitted from inside the iteration loop at `warn_at_pct` and at exhaustion (so `--output-format json`/`stream-json` consumers can observe the budget approach before the typed error fires); (g) define zero-turn semantics: `TurnBudget { max_turns: 0, … }` returns `Ok(TurnSummary { iterations: 0, assistant_messages: vec![], … })` immediately without an API call, useful for parse-validation/cost-zero runs and matching the upstream Claude Code zero-turn contract that #262's CLI flag would expose; (h) wire `LiveCli::run_turn_with_output` (`main.rs:7705`) to pass a `TurnBudget` derived from the new `--max-turns` flag (#262 fix) plus a default sensible ceiling for the primary CLI surface; (i) add tests for (1) iteration cap typed error, (2) cumulative turn cap typed error across multiple `run_turn` calls on the same 
runtime, (3) warn-at-pct event firing exactly once per turn, (4) zero-turn fast-return, (5) `RuntimeErrorKind::TurnBudgetExhausted` round-trips through `--output-format json` `error.kind` field instead of being string-only.
Acceptance: a downstream caller can pattern-match on `RuntimeErrorKind::TurnBudgetExhausted { iterations, max }` instead of substring-matching on a generic string; `--output-format json` emits `{ "error": { "kind": "turn_budget_exhausted", "iterations": 33, "max": 32 }, … }` instead of a bare error message; `TurnBudget { max_turns: 0 }` returns immediately with `iterations: 0`; a 90%-of-budget warning event fires before exhaustion; the subagent runtime keeps its `DEFAULT_AGENT_MAX_ITERATIONS = 32` semantic via the new `TurnBudget` builder; the primary CLI runtime gains a default budget instead of `usize::MAX`; #262's CLI flag fix has a typed runtime surface to land on.
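A minimal sketch of how the acceptance criteria above could fit together, assuming a single `max_turns` ceiling and a one-shot percentage warning. `BudgetTracker`, `BudgetEvent`, `Summary`, and `run_loop` are hypothetical names introduced here; only `TurnBudget`, `ExhaustionPolicy`, `warn_at_pct`, and `on_exhaust` come from the proposed fix shape, and the real runtime loop is not modelled.

```rust
/// Policy applied when the budget is spent (shape taken from the fix proposal).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum ExhaustionPolicy {
    #[default]
    Error,
    ReturnPartial,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct TurnBudget {
    max_turns: u32,
    warn_at_pct: Option<u8>,
    on_exhaust: ExhaustionPolicy,
}

impl Default for TurnBudget {
    /// Unbounded by default, as the proposal requires.
    fn default() -> Self {
        Self { max_turns: u32::MAX, warn_at_pct: None, on_exhaust: ExhaustionPolicy::default() }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
enum BudgetEvent {
    Warning { iterations: u32, max: u32, pct: u8 },
    Exhausted { iterations: u32, max: u32 },
}

/// Hypothetical in-loop tracker.
struct BudgetTracker {
    budget: TurnBudget,
    iterations: u32,
    warned: bool,
}

impl BudgetTracker {
    fn new(budget: TurnBudget) -> Self {
        Self { budget, iterations: 0, warned: false }
    }

    /// Call before each model call. `Err` means no budget remains;
    /// `Ok(Some(_))` is the one-time warning when the threshold is crossed.
    fn begin_iteration(&mut self) -> Result<Option<BudgetEvent>, BudgetEvent> {
        let max = self.budget.max_turns;
        if self.iterations >= max {
            return Err(BudgetEvent::Exhausted { iterations: self.iterations, max });
        }
        self.iterations += 1;
        if let Some(pct) = self.budget.warn_at_pct {
            // u64 arithmetic avoids overflow when `max` is u32::MAX.
            let used_pct = u64::from(self.iterations) * 100 / u64::from(max);
            if !self.warned && used_pct >= u64::from(pct) {
                self.warned = true;
                return Ok(Some(BudgetEvent::Warning { iterations: self.iterations, max, pct }));
            }
        }
        Ok(None)
    }
}

#[derive(Debug, PartialEq, Eq)]
struct Summary {
    iterations: u32,
    warnings: Vec<BudgetEvent>,
    partial: bool,
}

/// Simulates a turn that wants `wanted` model calls under `budget`.
fn run_loop(budget: TurnBudget, wanted: u32) -> Result<Summary, BudgetEvent> {
    // Zero-turn semantics: return immediately without any model call.
    if budget.max_turns == 0 {
        return Ok(Summary { iterations: 0, warnings: Vec::new(), partial: false });
    }
    let mut tracker = BudgetTracker::new(budget);
    let mut warnings = Vec::new();
    while tracker.iterations < wanted {
        match tracker.begin_iteration() {
            Ok(Some(warning)) => warnings.push(warning),
            Ok(None) => {}
            Err(exhausted) => {
                return match budget.on_exhaust {
                    ExhaustionPolicy::Error => Err(exhausted),
                    ExhaustionPolicy::ReturnPartial => {
                        Ok(Summary { iterations: tracker.iterations, warnings, partial: true })
                    }
                };
            }
        }
    }
    Ok(Summary { iterations: tracker.iterations, warnings, partial: false })
}
```

In this sketch the exhaustion event reports the count of completed iterations, so a budget of 32 yields `iterations: 32`; whether the real envelope should report the attempted iteration instead is a design choice the sketch does not settle.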
**Status:** Open. No source code changed. Filed 2026-04-26 13:05 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `d0aa18e` (post-#263 fast-forward verification onto gaebal-gajae's `--compact` help-text-vs-dispatch-mismatch pinpoint). Cluster delta: Turn-budget-primitive cluster 0→1 (founder, NEW SOLO CLUSTER); complementary-pinpoint-pair-bundle discovery-pattern extended (sister to #245+#250 WebSearch client/server pair, now #262+#264 turn-budget CLI-parse/runtime-primitive pair). Smaller-scope by design (matches #253/#254/#257/#258/#260/#261/#262/#263 context-budget discipline). Sister: #262 (CLI parse-side; #262+#264 bracket the full turn-budget concern across two structural layers). Distinct from silent-fallback family (#258/#260/#262 catalogue silent-mutation at boundaries; #264 catalogues a missing typed primitive layer that those boundaries would land typed errors INTO). Distinct from #254 MCP Resources lifecycle (also runtime-protocol layer but resource-handle axis, not iteration/turn axis). Eleventh-cycle concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `d0aa18e+#264` after push.
## Pinpoint #265 — Non-interactive output has no `stream-json` event lane even though the provider path already streams internally, forcing automation to choose between human text streaming and one-shot summary JSON
Dogfooded 2026-04-26 13:30 KST on `feat/jobdori-168c-emission-routing` at HEAD `d5568eb` (post-#264). Fresh CLI verification shows `claw --help` advertises only `--output-format text|json`, and `cargo run --quiet --bin claw -- --output-format stream-json -p "noop"` fails immediately with `[error-kind: cli_parse] error: unsupported value for --output-format: stream-json (expected text or json)`. Source inspection confirms the mode boundary is structural: `CliOutputFormat` has only `Text` and `Json` (`rust/crates/rusty-claude-cli/src/main.rs:793-805`), `run_turn_with_output` only dispatches `Text`, `Json`, and compact variants (`main.rs:4638-4648`), while the actual Anthropic client path always uses streaming internally (`MessageRequest { stream: true }` at `main.rs:7928`) and converts provider stream chunks into `AssistantEvent` values in `consume_stream` (`main.rs:7966-8095`). Those events are accumulated and returned, not exposed as a line-delimited machine stream.
Concrete failure mode: automation and downstream claws cannot observe turn progress as typed JSON events while a prompt is running. In `text` mode the operator sees live human-rendered Markdown/tool output, but parsers have to scrape terminal prose. In `json` mode the consumer receives one final envelope only after the turn completes, so long-running tool loops, post-tool stalls, prompt-cache events, tool starts/results, auto-compaction, and future #264 budget warnings/exhaustion cannot be routed until the end (or at all, depending on envelope fields). This is exactly the surface that would need to carry #264's `TurnBudgetWarning` before exhaustion and #260's compact-envelope observability fields during execution; without a stream-json lane, those typed runtime events have nowhere deterministic to go.
Gap. This is an **event/log opacity gap at the CLI output layer**, distinct from #260 and #263. #260 is about fields silently missing from the final compact JSON envelope; #263 is stale help text for an existing compact-JSON path; #265 is the absence of a machine-readable streaming output mode despite the provider stream and internal `AssistantEvent` pipeline already existing. It is also distinct from #264: #264 defines the missing turn-budget primitive/events; #265 identifies the CLI event lane those events need to surface through in non-interactive automation.
Required fix shape: (a) add `CliOutputFormat::StreamJson` parsed from `--output-format stream-json` and documented in help; (b) add a `run_prompt_stream_json` dispatch path that emits JSON Lines with stable event names (`message_start`, `text_delta`, `tool_use`, `tool_result`, `usage_delta`, `prompt_cache`, `auto_compaction`, `turn_budget_warning`, `turn_budget_exhausted`, `message_stop`, `error`, `final_summary`); (c) ensure human Markdown rendering is disabled or explicitly separated when `stream-json` is active so stdout remains valid JSONL; (d) include stable sequence numbers and timestamps so consumers can reconstruct order without scraping; (e) add tests that `--output-format stream-json` is accepted, stdout is JSONL-only, tool-use and final-summary events both appear, and runtime errors are emitted as typed `error` events before non-zero exit. Acceptance: a downstream claw can run `claw --output-format stream-json -p "..."` and react to tool/budget/compaction/error events before the final assistant message, with no terminal-prose scraping.
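The JSON Lines lane in (b) through (d) could look roughly like the following sketch. `StreamEvent`, `JsonlEmitter`, `escape_json`, and `render` are hypothetical names, the event subset is illustrative, and the JSON is hand-formatted with the standard library only so the example stays self-contained; timestamps are omitted to keep the output deterministic.

```rust
use std::io::{self, Write};

/// Hypothetical subset of the proposed stream events.
enum StreamEvent<'a> {
    MessageStart,
    TextDelta { text: &'a str },
    ToolUse { name: &'a str },
    TurnBudgetWarning { iterations: u32, max: u32 },
    Error { kind: &'a str, message: &'a str },
    MessageStop,
}

/// Escapes a string for embedding inside a JSON string literal.
fn escape_json(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Writes one JSON object per line, each tagged with a monotonic `seq`.
struct JsonlEmitter<W: Write> {
    out: W,
    seq: u64,
}

impl<W: Write> JsonlEmitter<W> {
    fn new(out: W) -> Self {
        Self { out, seq: 0 }
    }

    fn emit(&mut self, event: &StreamEvent<'_>) -> io::Result<()> {
        let body = match event {
            StreamEvent::MessageStart => r#""type":"message_start""#.to_string(),
            StreamEvent::TextDelta { text } => {
                format!(r#""type":"text_delta","text":"{}""#, escape_json(text))
            }
            StreamEvent::ToolUse { name } => {
                format!(r#""type":"tool_use","name":"{}""#, escape_json(name))
            }
            StreamEvent::TurnBudgetWarning { iterations, max } => {
                format!(r#""type":"turn_budget_warning","iterations":{iterations},"max":{max}"#)
            }
            StreamEvent::Error { kind, message } => format!(
                r#""type":"error","kind":"{}","message":"{}""#,
                escape_json(kind),
                escape_json(message)
            ),
            StreamEvent::MessageStop => r#""type":"message_stop""#.to_string(),
        };
        writeln!(self.out, "{{\"seq\":{},{}}}", self.seq, body)?;
        self.seq += 1;
        Ok(())
    }
}

/// Renders events into an in-memory JSONL string (handy for tests).
fn render(events: &[StreamEvent<'_>]) -> String {
    let mut emitter = JsonlEmitter::new(Vec::new());
    for event in events {
        emitter.emit(event).expect("writing to a Vec cannot fail");
    }
    String::from_utf8(emitter.out).expect("emitter only writes UTF-8")
}
```

Because newlines inside text deltas are escaped, every event occupies exactly one line, which is the property that keeps stdout valid JSONL when human Markdown rendering is switched off.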
**Status:** Open. No source code changed. Filed 2026-04-26 13:30 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `d5568eb` before filing. Cluster delta: CLI-event-stream-observability +1; prerequisite output lane for #260/#264 follow-up implementation. Concrete delta this cycle: ROADMAP-only pinpoint appended after live help/parse/source verification.
## Pinpoint #266 — `RuntimeErrorKind` typed-error-kind enum is structurally absent: `RuntimeError` is a single-field `{ message: String }` newtype with zero typed discriminants, and the CLI compensates by reverse-engineering the discriminant downstream via a 22-branch substring-matching `classify_error_kind(message: &str) -> &'static str` function — the runtime throws away typed information at construction, then the CLI scrapes it back via prose-pattern-matching
Dogfooded 2026-04-26 13:35 KST on `feat/jobdori-168c-emission-routing` at HEAD `8975354` (post-rebase fast-forward onto gaebal-gajae's #265 `--output-format stream-json` lane-absent pinpoint). #264 audited the **Turn-budget primitive** runtime layer and noted in passing that `RuntimeError` lacks a `kind: RuntimeErrorKind` field and named `TurnBudgetExhausted` as a future variant alongside `ApiError`/`SessionError`/`HookError`. #266 is the **dedicated structural audit of that typed-error-taxonomy gap itself** — sister pinpoint to #264, founding the **Typed-error-kind-enumeration cluster** with #266 as solo founder.
Verified concrete surface (all paths relative to `rust/crates/`): `RuntimeError` is defined at `runtime/src/conversation.rs:87-93` as `pub struct RuntimeError { message: String }` with a single `RuntimeError::new(impl Into<String>)` constructor at lines 91-97 and `Display`/`std::error::Error` impls at lines 100-106. There is **zero** `RuntimeErrorKind` enum, **zero** `RuntimeError::kind()` accessor, **zero** `kind: RuntimeErrorKind` field, **zero** typed discriminant, **zero** machine-readable reason, and **zero** structured payload (`iterations`, `max`, `path`, `operation`, `retryable`, etc.). `rg "pub enum RuntimeError|RuntimeErrorKind" rust/crates/` returns no matches anywhere in the workspace. The sibling type `ToolError` at `conversation.rs:64-83` shares the same single-field `{ message: String }` shape — the typed-error gap is symmetric across both runtime error types.
Construction-site count: `rg 'RuntimeError::new' rust/crates/` returns **20 call sites** total (12 inside `runtime/src/conversation.rs`, 8 inside `rusty-claude-cli/src/main.rs`). Every single one passes a free-form `String` or `format!(...)` expression — no construction site emits a typed discriminant. Representative examples: `conversation.rs:324` (`format!("conversation loop exceeded the maximum number of iterations")`), `conversation.rs:740` (`"assistant stream produced no content"`), `main.rs:7976`/`7997`/`8007`/`8125` (`format_user_visible_api_error(...)` — API failures collapsed into prose), `main.rs:8000` (`"post-tool continuation nudge exhausted"`), `main.rs:8951`/`8968` (filesystem operations: `error.to_string()`).
Downstream counter-evidence — `classify_error_kind(message: &str) -> &'static str` at `rusty-claude-cli/src/main.rs:270-348` (78 lines, **22 substring-match branches**, called from `main.rs:215` (panic-handler tagging), `:243`/`:245`/`:249` (top-level error printer for both `text` and `json` modes), and `:2982` (mid-run error envelope)). Branch enumeration: `missing_credentials`, `filesystem_io_error`, `missing_manifests`, `missing_worker_state`, `session_not_found`, `session_load_failed`, `no_managed_sessions`, `cli_parse` (×10 distinct substring patterns: `unrecognized argument`, `unknown option`, `prompt subcommand requires`, starts-with `empty prompt:`, `unsupported value for --`, `missing value for --`, `unsupported permission mode`, `invalid value for --`, `model string cannot be empty`, ``unexpected extra arguments after `claw``), `slash_command_requires_repl`, `invalid_model_syntax`, `unsupported_command`, `unsupported_resumed_command`, `confirmation_required`, `api_http_error` (substring-OR over `api failed` / `api returned`), and a fall-through to `unknown`. Each branch carries inline comments referencing the historical pinpoint that motivated it (`// #169`, `// #170`, `// #171`, `// #247`, `// #130b`) — proving the function has accreted patterns one-pinpoint-at-a-time as new error prose shapes leaked in.
Downstream consumption: `--output-format json` envelope at `main.rs:215`/`:245`/`:249` emits `{ "type": "error", "error": "<bare prose>", "kind": "<classify_error_kind result>" }` where the `kind` field is recovered AT THE CLI BOUNDARY via the substring scrape, **not propagated from a typed runtime field**. The runtime never had the discriminant; the CLI invents it back. ROADMAP §4.44 (lines 758-785) and ROADMAP #130 (lines 4978-5122) both explicitly note this gap as a typed-error contract debt — #130's New evidence section at line 5120 calls out exactly this: "the typed-error contract is thus twice-broken on this path: (a) the io::ErrorKind information is discarded at the `?` in `run_export()`, AND (b) the flat `io::Error::Display` string is then fed to a classifier that has no patterns for filesystem errno strings." Neither §4.44 nor #130 audited the **`RuntimeErrorKind` enum itself** as the structurally-absent primitive; both treated the gap as classifier-pattern-missing rather than as **typed-discriminant-missing-at-the-source**.
Gap. The runtime treats the error class as a **stringly-typed value** rather than a typed enum. The classifier function is a **lossy reverse-decompiler** of information that should have been carried as a typed field from `RuntimeError` construction through the CLI emit. Two structural failures: (1) **forward-direction loss** — every `RuntimeError::new(...)` call site already knows the kind at the source (e.g., `conversation.rs:324` knows it's iteration-exhaustion, `main.rs:8000` knows it's post-tool-nudge-exhaustion, `main.rs:7976` knows it's API failure) but throws that knowledge away by collapsing into a `String`; (2) **reverse-direction fragility** — the CLI substring-scrape can mis-classify any error whose prose accidentally matches a pattern from a different error class (e.g., a legitimate API error containing the literal text `"unknown option"` would be misclassified as `cli_parse`), and silently degrades to `"unknown"` for any error class that has not yet been patched into the classifier. The 22-branch accretion is itself counter-evidence: every new error class needs both a `RuntimeError::new("<unique prose>")` site AND a corresponding `classify_error_kind` substring branch, with no compiler enforcement that the two stay in sync. New error classes that ship without a classifier branch silently fall through to `"unknown"`.
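The reverse-direction fragility can be reproduced with a small stand-in. `classify_by_substring` below is a deliberately simplified, hypothetical classifier with three branches, not the real `classify_error_kind`; the branch order and the `Kind`/`TypedError` types are assumptions made for the demonstration.

```rust
/// Simplified stand-in for a prose-scraping classifier (hypothetical).
fn classify_by_substring(message: &str) -> &'static str {
    if message.contains("unknown option") || message.contains("unrecognized argument") {
        "cli_parse"
    } else if message.contains("api failed") || message.contains("api returned") {
        "api_http_error"
    } else {
        "unknown"
    }
}

/// Typed alternative: the construction site records the kind directly.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Kind {
    CliParse,
    ApiHttp,
    PostToolNudgeExhausted,
}

impl Kind {
    fn as_str(self) -> &'static str {
        // Exhaustive match: adding a variant without a name fails to compile.
        match self {
            Kind::CliParse => "cli_parse",
            Kind::ApiHttp => "api_http_error",
            Kind::PostToolNudgeExhausted => "post_tool_nudge_exhausted",
        }
    }
}

#[derive(Debug)]
struct TypedError {
    kind: Kind,
    message: String,
}

/// Builds an API error whose upstream prose happens to mention a CLI-style phrase.
fn api_error_with_confusing_prose() -> TypedError {
    TypedError {
        kind: Kind::ApiHttp,
        message: "api returned 400: unknown option 'foo' in request body".to_string(),
    }
}
```

The same message is labelled `cli_parse` by the scraper and `api_http_error` by the typed field, and an error class with no branch (`post-tool continuation nudge exhausted`) falls through to `unknown` in the scraper while the enum still names it.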
Cluster shape novelty: founds the **NEW Typed-error-kind-enumeration cluster** with #266 as solo founder. The cluster catalogues missing typed discriminants at the runtime-error-taxonomy axis: `RuntimeErrorKind` enum, `ToolErrorKind` enum, per-variant typed payloads (`TurnBudgetExhausted { iterations, max }`, `IterationBudgetExhausted { iterations, max }`, `ApiError { status, retryable }`, `SessionError { kind, path }`, `HookError { hook_name, exit_code }`, `FilesystemError { path, operation, errno }`, `ParseError { argv_index, raw }`), and the structural removal of `classify_error_kind` as a CLI-side reverse-decompiler in favor of typed field propagation. Sister to #264 (Turn-budget primitive cluster); the two pinpoints form the **third complementary-pinpoint-pair-bundle**, following the #245+#250 (WebSearch client/server) and #262+#264 (turn-budget CLI-parse/runtime-primitive) pattern. #264 catalogues a single missing typed-event primitive; #266 catalogues the missing typed-discriminant-axis that ALL runtime errors (turn-budget, API, session, hook, filesystem, etc.) need in order to express themselves in typed form. #266 is the **prerequisite layer** #264's `TurnBudgetExhausted` variant would land into.
Distinct from #260/#263/#265 (output-mode/help-text-contract gaps at the CLI output layer). Distinct from §4.44 (which proposed a typed envelope at the JSON boundary but did not audit the absent runtime enum at the source). Distinct from #130/#130b (which catalogued context-loss at filesystem `?` propagation and classifier-pattern-missing at the CLI, but did not catalogue the absent enum). Distinct from the silent-fallback family (#207/#208/#222/#231/#236/#246/#249/#258/#260/#262 — silent input/output mutation at boundaries; #266 is about the **type system itself missing a discriminant axis**).
Discovery-pattern continuation: this is the **third complementary-pinpoint-pair-bundle** in the dogfood corpus (#245+#250, #262+#264, now #264+#266), and the **second consecutive cycle pair-bundle** (#264 filed 13:05 KST, #266 filed 13:35 KST same cycle-day) — confirming complementary-pair-bundles as a stable discovery-pattern that systematically expands when a single-layer pinpoint is filed. Pair-bundle ratio: 3 of 67 pinpoints in the #200-range (≈4.5%) are bundled — small but consistent. Founds NEW cluster (Typed-error-kind-enumeration) rather than extending silent-fallback (which closes at 11 with #265).
Required fix shape: (a) introduce `pub enum RuntimeErrorKind { TurnBudgetExhausted { iterations: u32, max: u32 }, IterationBudgetExhausted { iterations: u32, max: u32 }, ApiError { status: Option<u16>, retryable: bool }, SessionError { kind: SessionErrorKind, path: Option<PathBuf> }, HookError { hook_name: String, exit_code: Option<i32> }, FilesystemError { path: PathBuf, operation: FilesystemOp, errno: Option<i32> }, ParseError { argv_index: Option<usize>, raw: Option<String> }, ToolStreamExhausted, EmptyAssistantStream, PostToolNudgeExhausted, ConfirmationRequired, Other }` at `runtime/src/conversation.rs`; (b) replace `RuntimeError { message: String }` with `RuntimeError { kind: RuntimeErrorKind, message: String }` adding `RuntimeError::kind(&self) -> &RuntimeErrorKind` accessor; (c) add typed constructors `RuntimeError::turn_budget_exhausted(iterations, max)`, `::api(status, retryable, message)`, `::session(kind, path, message)`, etc., and keep `RuntimeError::new(message)` as a `#[deprecated]` alias that constructs `kind: RuntimeErrorKind::Other` so existing call sites compile while the migration proceeds; (d) audit all 20 `RuntimeError::new` call sites and migrate each to a typed constructor — `conversation.rs:324` → `turn_budget_exhausted(...)`, `conversation.rs:740` → `tool_stream_exhausted()`, `conversation.rs:745` → `empty_assistant_stream()`, `main.rs:7976`/`:7997`/`:8007`/`:8125` → `api(...)`, `main.rs:8000` → `post_tool_nudge_exhausted()`, etc.; (e) extend `--output-format json` envelope to emit `error.kind` from `RuntimeError::kind()` with stable serde-renamed snake_case discriminant strings (`turn_budget_exhausted`, `api_error`, `session_error`, etc.), and emit per-variant typed fields (`error.iterations`, `error.max`, `error.path`, `error.retryable`, etc.); (f) replace `classify_error_kind(message: &str)` with `classify_error_kind(error: &RuntimeError) -> &'static str { error.kind().as_str() }` — the function survives as a serde-rename helper but no longer substring-scrapes prose; (g) add `RuntimeErrorKind::as_str()` and `FromStr` round-trip and golden-fixture tests for every variant proving the JSON envelope round-trips through `error.kind`; (h) deprecate the substring branches in `classify_error_kind` over a one-cycle window to give external consumers time to migrate from prose-scraping to typed-field-reading.
Acceptance: a downstream caller can pattern-match on `error.kind()` returning `RuntimeErrorKind::TurnBudgetExhausted { iterations, max }` instead of substring-matching `message.contains("conversation loop exceeded")`; `--output-format json` emits `{ "error": { "kind": "turn_budget_exhausted", "iterations": 33, "max": 32, "message": "…" }, … }` with typed payload fields per variant; the 22-branch substring classifier shrinks to a single `error.kind().as_str()` call; new error classes added to the runtime are compile-time-visible at every emit point because the enum requires exhaustive matching; a new error variant added without a classifier branch becomes a compiler error rather than silently degrading to `"unknown"`; #264's `TurnBudgetExhausted` variant has a typed home; #130/#130b's filesystem context-loss has a `RuntimeErrorKind::FilesystemError { path, operation, errno }` typed home rather than a classifier substring branch.
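A reduced sketch of the typed error described in this acceptance paragraph, with four of the proposed variants. The constructor names follow the fix shape above, but the message wording, the hand-written `to_json_envelope` helper, and `escape_json` are assumptions for illustration; a real implementation would more likely derive the envelope with serde.

```rust
use std::fmt;

#[derive(Debug, Clone, PartialEq, Eq)]
enum RuntimeErrorKind {
    TurnBudgetExhausted { iterations: u32, max: u32 },
    ApiError { status: Option<u16>, retryable: bool },
    EmptyAssistantStream,
    Other,
}

impl RuntimeErrorKind {
    /// Stable snake_case discriminant. The match is exhaustive, so a new
    /// variant without a name is a compile error rather than "unknown".
    fn as_str(&self) -> &'static str {
        match self {
            RuntimeErrorKind::TurnBudgetExhausted { .. } => "turn_budget_exhausted",
            RuntimeErrorKind::ApiError { .. } => "api_error",
            RuntimeErrorKind::EmptyAssistantStream => "empty_assistant_stream",
            RuntimeErrorKind::Other => "other",
        }
    }
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct RuntimeError {
    kind: RuntimeErrorKind,
    message: String,
}

/// Minimal JSON string escaping (quotes, backslashes, control characters).
fn escape_json(input: &str) -> String {
    let mut out = String::new();
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

impl RuntimeError {
    /// Legacy constructor kept so old call sites compile during migration.
    #[deprecated(note = "use a typed constructor")]
    fn new(message: impl Into<String>) -> Self {
        Self { kind: RuntimeErrorKind::Other, message: message.into() }
    }

    fn turn_budget_exhausted(iterations: u32, max: u32) -> Self {
        Self {
            kind: RuntimeErrorKind::TurnBudgetExhausted { iterations, max },
            message: format!("turn budget exhausted after {iterations} iterations (max {max})"),
        }
    }

    fn api(status: Option<u16>, retryable: bool, message: impl Into<String>) -> Self {
        Self { kind: RuntimeErrorKind::ApiError { status, retryable }, message: message.into() }
    }

    fn kind(&self) -> &RuntimeErrorKind {
        &self.kind
    }

    /// Hand-rolled envelope: kind first, then per-variant payload, then message.
    fn to_json_envelope(&self) -> String {
        let payload = match &self.kind {
            RuntimeErrorKind::TurnBudgetExhausted { iterations, max } => {
                format!(r#","iterations":{iterations},"max":{max}"#)
            }
            RuntimeErrorKind::ApiError { status, retryable } => {
                let status = status.map_or_else(|| "null".to_string(), |s| s.to_string());
                format!(r#","status":{status},"retryable":{retryable}"#)
            }
            RuntimeErrorKind::EmptyAssistantStream | RuntimeErrorKind::Other => String::new(),
        };
        format!(
            r#"{{"error":{{"kind":"{}"{},"message":"{}"}}}}"#,
            self.kind.as_str(),
            payload,
            escape_json(&self.message)
        )
    }
}

impl fmt::Display for RuntimeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str(&self.message)
    }
}

impl std::error::Error for RuntimeError {}
```

A caller can then write `matches!(err.kind(), RuntimeErrorKind::TurnBudgetExhausted { .. })` instead of `message.contains(...)`, and the CLI-side classifier reduces to `err.kind().as_str()`.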
**Status:** Open. No source code changed. Filed 2026-04-26 13:35 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `8975354` before filing (post-rebase fast-forward onto gaebal-gajae's #265 `--output-format stream-json` lane-absent pinpoint). Cluster delta: Typed-error-kind-enumeration cluster 0→1 (founder, NEW SOLO CLUSTER); complementary-pinpoint-pair-bundle discovery-pattern extended to 3 bundles total (#245+#250 WebSearch, #262+#264 turn-budget, #264+#266 turn-budget-runtime/typed-error-axis). Smaller-scope by design (matches #253/#254/#257/#258/#260/#261/#262/#263/#264/#265 context-budget discipline). Sister: #264 (Turn-budget primitive runtime-side; #264 names `TurnBudgetExhausted` as a future `RuntimeErrorKind` variant; #266 is the dedicated structural audit of that absent enum itself). Distinct from §4.44 typed-error contract proposal (which targets the JSON envelope boundary; #266 targets the runtime enum that the envelope would serialize FROM). Distinct from #130/#130b classifier-pattern-missing (which is downstream of the absent enum; #266 catalogues the upstream root cause). Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `8975354+#266` after push.
## Pinpoint #267 — `prompt TEXT` subcommand has the same post-prompt greedy-slurp control-token absorption shape as `-p`, so flags after `prompt` become model input instead of parse errors
Dogfooded 2026-04-26 14:02 KST on `feat/jobdori-168c-emission-routing` at HEAD `fae9fd9` (post-#266). While probing the help-advertised `claw [--model MODEL] [--output-format text|json] prompt TEXT` path, source inspection showed the `prompt` subcommand arm does `let prompt = rest[1..].join(" ")` at `rust/crates/rusty-claude-cli/src/main.rs:1239`, then immediately returns `CliAction::Prompt` without validating whether any later token is a flag-looking control token. This mirrors #262's `-p` greedy slurp (`args[index+1..].join(" ")`) but on the documented `prompt TEXT` subcommand surface rather than the short `-p` compat alias.
Concrete failure mode: `claw prompt "say hi" --max-turns 0`, `claw prompt "say hi" --output-format json`, or `claw prompt "say hi" --definitely-unknown` are structurally parsed as prompt text (`"say hi --max-turns 0"`, etc.) rather than as either (a) recognized flags that continue parsing, or (b) typed `cli_parse` errors for unsupported trailing flags. In contrast, the same unknown flag before the first positional token is rejected by the global `other if rest.is_empty() && other.starts_with('-')` arm. The help text advertises `prompt TEXT` as the safe explicit non-interactive form, but the explicit form still makes the boundary after `TEXT` invisible: machine-control tokens after the prompt become model input.
Gap. This is distinct from #262 but sibling-shaped. #262 filed the missing `--max-turns` flag plus position-sensitive absorption after `-p` and bare prompt forms. #267 covers the long-form documented `prompt TEXT` subcommand. The `prompt` arm is a separate parser site (`rest[1..].join(" ")`) with separate acceptance criteria and deserves its own regression because a fix that only rewrites the `-p` arm would leave the documented subcommand silently absorbing flags. This extends the position-sensitive-parse-asymmetry sub-shape from short-option prompt mode to the explicit `prompt` subcommand surface.
Required fix shape: (a) define a delimiter contract for `prompt TEXT`: either require all flags before `prompt` and reject any `rest[1..]` token that starts with `-` unless escaped via `--`, or parse `prompt` as consuming exactly one TEXT argv and then resume global flag parsing; (b) support `--` as an explicit literal-prompt delimiter so users can intentionally include flag-looking text (`claw prompt -- "explain --max-turns"`); (c) emit a typed `CliFlagWarning`/`cli_parse` JSON error when a flag-looking token appears after the prompt without `--`; (d) add parser tests for `prompt x --max-turns 0`, `prompt x --output-format json`, `prompt x --definitely-unknown`, `prompt -- "x --max-turns 0"`, and `--output-format json prompt x`. Acceptance: the documented `prompt TEXT` path no longer silently mutates trailing control tokens into model input; fixes for #262 cannot pass while leaving this long-form parser site greedy.
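One of the two delimiter contracts in (a), the "reject flag-looking tokens unless `--` comes first" variant, can be sketched as follows. `PromptParse`, `parse_prompt_args`, and `greedy_join` are hypothetical names; `rest` here means the argv tokens after the `prompt` keyword, and the sketch assumes `--` is only honoured as the first token.

```rust
#[derive(Debug, PartialEq, Eq)]
enum PromptParse {
    /// The joined prompt text to send to the model.
    Prompt(String),
    /// A flag-looking token appeared after the prompt without a `--` escape.
    FlagAfterPrompt { token: String, index: usize },
    EmptyPrompt,
}

/// The greedy behaviour described in the text, kept for comparison:
/// every trailing token becomes prompt text.
fn greedy_join(rest: &[&str]) -> String {
    rest.join(" ")
}

/// Stricter parse: `rest` holds the argv tokens that follow `prompt`.
fn parse_prompt_args(rest: &[&str]) -> PromptParse {
    // Explicit literal delimiter: everything after a leading `--` is prompt text.
    if rest.first() == Some(&"--") {
        let text = rest[1..].join(" ");
        return if text.is_empty() {
            PromptParse::EmptyPrompt
        } else {
            PromptParse::Prompt(text)
        };
    }
    // Without the delimiter, any token that looks like a flag is an error
    // instead of being absorbed into the prompt.
    for (index, token) in rest.iter().enumerate() {
        if token.starts_with('-') && token.len() > 1 {
            return PromptParse::FlagAfterPrompt { token: (*token).to_string(), index };
        }
    }
    let text = rest.join(" ");
    if text.is_empty() {
        PromptParse::EmptyPrompt
    } else {
        PromptParse::Prompt(text)
    }
}
```

The `FlagAfterPrompt` result carries the offending token and its index so the caller can turn it into a typed `cli_parse` error; the alternative contract (consume exactly one TEXT argv, then resume global flag parsing) would replace the rejection loop with a hand-off to the flag parser.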
**Status:** Open. No source code changed. Filed 2026-04-26 14:02 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `fae9fd9` before filing. Cluster delta: position-sensitive-parse-asymmetry sub-shape +1 documented-subcommand member; sibling to #262, not duplicate. Concrete delta this cycle: ROADMAP-only pinpoint appended after source verification of the `prompt` arm.
## Pinpoint #268 — MCP `tools/list` is never re-fetched on session resume: the runtime trusts the static `.claw.json` server list at `/mcp` time and the cached/qualified-name tool catalog at runtime build time, with zero staleness detection or live refresh path between server-restart events that change the tool catalog and the next `claw --resume` invocation
Dogfooded 2026-04-26 14:08 KST on `feat/jobdori-168c-emission-routing` at HEAD `d90b5f0` (post-rebase fast-forward onto gaebal-gajae's #267 `prompt TEXT` greedy-slurp pinpoint). #254 audited the **MCP Resources lifecycle** absence (`subscribe`/`list_changed`/`updated` for resources). #268 is the **sister pinpoint on the tool axis**: even the existing one-shot `tools/list` discovery is structurally bound to runtime startup and is never re-fetched on resume, so a session that restarts an MCP server (adding/removing/renaming tools) and then runs `claw --resume <session>` proceeds against the previous boot's stale tool catalog with no staleness signal. Founds the **NEW Session-resume-tool-catalog-staleness cluster** with #268 as solo founder, complementary to #254's resource-axis lifecycle gap.
Verified concrete surface (all paths relative to `rust/crates/`): the resume entrypoint at `rusty-claude-cli/src/main.rs:2974` (`fn resume_session(session_path, commands, output_format)`) loads the persisted `Session` via `current_session_store().load_session(reference)` at `:5620-5634` (`fn load_session_reference`) and dispatches each `/<cmd>` through `run_resume_command(session_path, &session, &command)` at `:3467`. **The resume path never calls `build_runtime_mcp_state` and never instantiates `RuntimeMcpState::new` and never calls `manager.discover_tools_best_effort()`**: `rg "build_runtime_mcp_state|RuntimeMcpState::new" rust/crates/rusty-claude-cli/src/main.rs` returns the only two construction sites at `:7267` (`build_runtime_plugin_state_with_loader`) and `:4311` (the impl), and neither is reachable from `resume_session`. The resume-mode `/mcp` slash arm at `main.rs:3596-3613` calls `commands::handle_mcp_slash_command(args, &cwd)` which at `commands/src/lib.rs:2341-2347` calls `loader.load()` and then `render_mcp_summary_report(cwd, runtime_config.mcp().servers())` — i.e., it dumps the **configured-server list from `.claw.json`** without spawning any MCP process or issuing any `tools/list` request. The function never touches `McpServerManager`, never spawns stdio, never sends an `initialize` handshake, never sends `tools/list`. There is **zero** `tool_catalog`/`tool_snapshot`/`cached_tools`/`tool_list_at`/`tool_revision`/`tool_etag` field on `Session` at `runtime/src/session.rs:91-105` (the persisted struct fields are: `version`, `session_id`, `created_at_ms`, `updated_at_ms`, `messages`, `compaction`, `fork`, `workspace_root`, `prompt_history`, `last_health_check_ms`, `model`, `persistence`). The startup-time discovery report at `main.rs:4119-4205` (`impl RuntimeMcpState { fn new(...) }`) calls `runtime.block_on(manager.discover_tools_best_effort())` ONCE and stores the result in `RuntimeMcpState { runtime, manager, pending_servers, degraded_report }` — the in-memory snapshot is held for the lifetime of the process and never re-issued. There is **zero** `refresh_tools` / `reload_tools` / `refetch_catalog` / `recheck_servers` method on `RuntimeMcpState` or `McpServerManager`; `rg "refresh_tool|reload_tool|refetch|recheck_server" rust/crates/runtime/src/` returns no matches.
Downstream symptom matrix: (1) **Server tools changed between sessions** — user adds/removes a tool on an MCP server (e.g., `git`/`gh`/local-tooling MCP servers commonly add tools across versions). On `claw --resume <session>` the resumed `/mcp` view shows the old configured-server list with no tool count and no live `tools/list` cross-check; on continued prompts the tool registry built at runtime startup contains either the now-stale tool set OR the freshly-discovered set with no audit trail of which prompts ran against which catalog. (2) **MCP server replaced with a different binary at the same `command:` path** — the new binary advertises a different tool set; the resumed session has zero detection path. (3) **MCP server now-unavailable** — a server that was reachable at session-start but is offline at resume: there is no liveness probe in resume mode, only a configured-list dump, so `/mcp` reports the server as configured without flagging it as unreachable until a `tools/call` fails downstream. (4) **Tool descriptor drift** (description, JSON-schema input shape, qualified name): a tool that kept its name but changed its `input_schema` between server versions: the runtime tool registry built at the FIRST session start at `main.rs:4129-4133` (`mcp_runtime_tool_definition`) snapshots `tool.tool.input_schema.clone()` once; subsequent re-builds at the NEXT session start would pick up the new schema, but mid-session the agent is reasoning over the boot-time schema with no `version`/`etag`/`schema_revision` field on `ManagedMcpTool` to detect drift.
Gap. Three structural absences on the same axis: (a) **resume-mode tool-list refresh**: `run_resume_command` never instantiates `RuntimeMcpState`, so resumed sessions cannot even attempt a fresh `tools/list`; the resumed `/mcp` slash command at `main.rs:3596-3613` dispatches to a config-only renderer rather than a live-MCP renderer. (b) **mid-session tool-list refresh**: even the long-running session at first start instantiates `RuntimeMcpState` exactly once at `main.rs:7267` (`build_runtime_plugin_state_with_loader`) and never re-calls `discover_tools_best_effort()` afterwards; if the agent is alive when an MCP server's `notifications/tools/list_changed` notification fires, there is no notification-dispatch path on the JSON-RPC reader (the same notification-dispatch absence #254 catalogues for `notifications/resources/list_changed`). (c) **persisted tool-catalog snapshot**: the `Session` JSONL file at `.claw/sessions/<fingerprint>/<id>.jsonl` does not record which tool catalog was active when each turn ran, so post-hoc audit cannot tell which catalog version the assistant assumed. The composite gap means an MCP server that legitimately advertises `notifications/tools/list_changed` per the MCP 2025-03-26 spec is silently treated as having a frozen tool catalog from process boot.
Cluster shape novelty. Founds the **NEW Session-resume-tool-catalog-staleness cluster** with #268 as solo founder, distinct from #254's resource-axis lifecycle absence (which targets `resources/subscribe`+`resources/list_changed`+`resources/updated`+`ResourceRegistry` on the data-handle axis). #268 targets the **tool-handle axis**: `tools/list_changed` notification handler, `RuntimeMcpState::refresh_tool_catalog`, resume-mode `RuntimeMcpState` re-instantiation, persisted `tool_catalog_revision` on `Session`. #254 and #268 form the **fourth complementary-pinpoint-pair-bundle** in the dogfood corpus (after #245+#250 WebSearch, #262+#264 turn-budget, #264+#266 typed-error-axis), tracking the **two missing axes of MCP capability lifecycle** — resources and tools — that the spec advertises as live-subscribable but the runtime treats as one-shot-snapshot.
Distinct from #207/#208/#222/#231/#236/#246/#249/#258/#260/#262/#265 silent-fallback-input-mutation (those are CLI-layer prompt/output silent mutation; #268 is missing-refresh-of-discovery-data at the protocol-runtime layer). Distinct from #254 (resources axis vs tools axis). Distinct from #266 (typed-error enum vs missing-refresh primitive — orthogonal axes). Distinct from #259 session-state schema gaps (which catalogue what's in the JSONL; #268 catalogues what `Session` should have but does not — the tool-catalog-revision field). Distinct from #229/#238/#244 persistent-WebSocket-stream cluster (those are bidirectional client-driven streams; #268 is server-pushed-notification-handler absence on stdio JSON-RPC, structurally identical to #254's gap but on the tool axis).
Discovery-pattern continuation: this is the **fourth complementary-pinpoint-pair-bundle** (#245+#250, #262+#264, #264+#266, now #254+#268). #254 was filed earlier this dogfood-day at 11:02 KST; #268 closes the resources↔tools axis-pair as both being structurally one-shot. Pair-bundle ratio: 4 of 68 pinpoints in the #200-range (≈5.9%) bundled — confirms complementary-pair-bundles as a **stable discovery-pattern that systematically expands when an axis gap is filed and an orthogonal sister axis exists**. #268 also extends the **PURE-CLAWABILITY-FRICTION-FROM-DOGFOODING** discovery-pattern (#254's founding pattern) — the agent's own MCP runtime treats catalog discovery as boot-once rather than spec-compliant subscribe/refresh, so the agent silently reasons over a boot-time tool view that diverges from server reality across process lifetimes.
Required fix shape: (a) add `notifications/tools/list_changed` notification handler on the JSON-RPC stdio reader (parallel to #254's resources handler) routing to a per-server channel; (b) add `pub enum ToolCatalogLifecycleEvent { ToolListChanged | ToolAdded(McpTool) | ToolRemoved { qualified_name: String } | ToolSchemaChanged { qualified_name: String, old_schema: JsonValue, new_schema: JsonValue } }` typed event surfaced through `LaneEvents`; (c) add `RuntimeMcpState::refresh_tool_catalog(&mut self) -> Result<McpToolDiscoveryReport, ...>` that re-runs `manager.discover_tools_best_effort()` and diffs against the previous snapshot, emitting `ToolCatalogLifecycleEvent`s for the delta; (d) instrument the resume entrypoint at `main.rs:2974` (`resume_session`) to instantiate `RuntimeMcpState` (or a lightweight liveness-only variant) when the session has any MCP server configured, refresh the catalog, and surface the diff in the resume-mode `/mcp` output rather than dumping the static config; (e) add `revision: u64` and optional `etag: Option<String>` to `ManagedMcpTool`/`McpTool` so persisted session JSONL turns can record `tool_catalog_revision` per turn; (f) extend `Session` with `pub last_tool_catalog_revision: Option<u64>` (and bump `SESSION_VERSION` from `1` to `2` per #259); (g) advertise `tools.listChanged = true` in the initialize handshake at `mcp_stdio.rs:1400` when the runtime supports it; (h) expose `/mcp tools refresh` slash command and `claw mcp tools refresh` CLI subcommand; (i) emit a typed `mcp_tool_catalog_stale` warning to `--output-format json` when resume detects the catalog has diverged from the snapshot embedded in the last session turn. 
Acceptance: an MCP server that adds a tool between session-end and `claw --resume <session>` causes the resumed `/mcp` output to show `+1 tool added: srv__new_tool` rather than the static configured-server list with no live cross-check; an MCP server that emits `notifications/tools/list_changed` mid-session causes a `ToolListChanged` lane event and refreshes the tool registry rather than being silently dropped; the persisted session JSONL records `tool_catalog_revision` per turn so post-hoc audit can identify which catalog snapshot the assistant reasoned over.
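The diff step in fix-shape (c) can be sketched as a pure function over two catalog snapshots. This is a minimal illustration only, under stated assumptions: `CatalogSnapshot` and `diff_catalogs` are hypothetical names, schemas are plain strings rather than `JsonValue`, and the event enum is a simplified subset of the one proposed in fix-shape (b) (it carries only the qualified name, not an `McpTool`).

```rust
use std::collections::BTreeMap;

/// Simplified, hypothetical version of the event enum from fix-shape (b).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ToolCatalogLifecycleEvent {
    ToolAdded { qualified_name: String },
    ToolRemoved { qualified_name: String },
    ToolSchemaChanged { qualified_name: String, old_schema: String, new_schema: String },
}

/// Hypothetical snapshot shape: qualified tool name -> serialized input schema.
pub type CatalogSnapshot = BTreeMap<String, String>;

/// Compare the previous snapshot with a freshly discovered one and
/// return one event per removed, changed, or added tool.
pub fn diff_catalogs(
    previous: &CatalogSnapshot,
    current: &CatalogSnapshot,
) -> Vec<ToolCatalogLifecycleEvent> {
    let mut events = Vec::new();
    // Tools present before: either removed, schema-changed, or unchanged.
    for (name, old_schema) in previous {
        match current.get(name) {
            None => events.push(ToolCatalogLifecycleEvent::ToolRemoved {
                qualified_name: name.clone(),
            }),
            Some(new_schema) if new_schema != old_schema => {
                events.push(ToolCatalogLifecycleEvent::ToolSchemaChanged {
                    qualified_name: name.clone(),
                    old_schema: old_schema.clone(),
                    new_schema: new_schema.clone(),
                })
            }
            Some(_) => {}
        }
    }
    // Tools present only in the fresh snapshot are additions.
    for name in current.keys() {
        if !previous.contains_key(name) {
            events.push(ToolCatalogLifecycleEvent::ToolAdded {
                qualified_name: name.clone(),
            });
        }
    }
    events
}
```

An empty result would mean the resumed session can keep using the boot-time catalog; a non-empty result is what the resume-mode `/mcp` output and the `mcp_tool_catalog_stale` warning would be rendered from.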
**Status:** Open. No source code changed. Filed 2026-04-26 14:08 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `d90b5f0` before filing (post-rebase fast-forward onto gaebal-gajae's #267 `prompt TEXT` greedy-slurp pinpoint). Cluster delta: Session-resume-tool-catalog-staleness cluster 0→1 (founder, NEW SOLO CLUSTER); complementary-pinpoint-pair-bundle discovery-pattern extended to 4 bundles total (#245+#250 WebSearch, #262+#264 turn-budget, #264+#266 typed-error-axis, #254+#268 MCP-resources/tools-lifecycle-axis-pair). Sister: #254 (MCP Resources lifecycle on the data-handle axis; #268 is the tool-handle axis sister). Smaller-scope by design (matches #253/#254/#257/#258/#260/#261/#262/#263/#264/#265/#266/#267 context-budget discipline). Distinct from #266 (typed-error enum vs missing-refresh primitive — orthogonal axes). Distinct from #259 session-state schema gaps (#259 catalogues what's in JSONL; #268 catalogues what `Session` should have but does not — the `last_tool_catalog_revision` field). Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `d90b5f0+#268` after push.
## Pinpoint #269 — Dogfood status transport lacks channel-aware payload budgeting and delivery receipts: long compact reports can truncate mid-stanza or fail as `Message failed` while the cycle still treats the report as posted
Dogfooded 2026-04-26 14:30 KST from the live `#clawcode-building-in-public` dogfood loop immediately after #268. The status reporter attempted to publish a growing same-day summary; the visible Discord output truncated mid-sentence at #263 (`**#263** — \`--compact\` help text claims text…`), then a later helper message self-reported `Truncated mid-sentence at #263`, and the cron emitted `Cron job "clawcode-dogfood-cycle-reminder" failed: ⚠️ ✉️ Message failed` followed by another timeout. The loop still printed meta-prose saying the Discord report was posted, even though the transport evidence showed partial delivery/failure.
Concrete failure mode: the dogfood status path can exceed a channel/provider payload limit and either (a) deliver only a prefix, cutting a pinpoint stanza in half, or (b) fail after attempting delivery, while the cycle's own state/reporting does not carry a typed `delivery_status` / `delivered_bytes` / `truncated_at` receipt. Operators then see conflicting truth: a report says "posted", the channel contains a partial report, and cron says `Message failed` or times out. This is distinct from #253 (state-vector context-budget discipline) and #261 (derived count/range self-consistency): those validate what the summary *says*; #269 validates whether the rendered payload fits the target channel and whether delivery actually succeeded.
Gap. There is no channel-aware pre-send budget gate for dogfood status payloads, no stanza-safe chunker, no checksum/part numbering, and no authoritative delivery receipt bound to the cycle id. A compact summary can be internally fresh (#259) and arithmetically consistent (#261) yet still be operationally unusable because the transport cuts it mid-stanza or the message send fails after side effects. The status generator also lacks a fail-closed rule: a failed/partial send should mark the cycle as `delivery_failed` or `partial_delivery`, not publish/echo a success summary.
Required fix shape: (a) add per-channel payload budget metadata (`max_chars`, `safe_chars`, markdown overhead, attachment/thread fallback) to dogfood report rendering; (b) preflight-render the report and split into stanza-safe chunks before send, never in the middle of a pinpoint bullet; (c) add part numbering and a short report id/checksum (`dogfood-status d90b5f0 part 1/3`) so downstream claws can detect missing chunks; (d) record/send a typed delivery receipt with `status: delivered|partial|failed`, `message_ids`, `bytes_sent`, `chunks_sent`, `truncated_at`, and provider error; (e) if any chunk fails, emit a compact failure notice and do not mark the report as posted; (f) regression-test a report containing #257-#268-sized entries against the Discord character budget and assert the split boundaries align to pinpoint stanzas. Acceptance: a same-day summary can never silently truncate mid-pinpoint; a send failure produces a typed `delivery_failed` receipt with no contradictory "posted" success prose; cron timeout can distinguish `timed_out_before_send` vs `timed_out_after_partial_send` (#246 sibling).
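Fix-shapes (b) through (d) can be sketched roughly as follows. This is a std-only illustration, not the reporter's actual code: `chunk_stanzas`, `label_parts`, `receipt_status`, and `DeliveryStatus` are hypothetical names, the budget is counted in `char`s, and the caller is assumed to pass a `max_chars` that already leaves room for the part header.

```rust
/// Hypothetical receipt status mirroring `delivered|partial|failed` in fix-shape (d).
#[derive(Debug, PartialEq, Eq)]
pub enum DeliveryStatus {
    Delivered,
    Partial,
    Failed,
}

/// Pack whole stanzas into chunks of at most `max_chars` characters each.
/// A stanza is never split; an oversized stanza is reported by index.
pub fn chunk_stanzas(stanzas: &[&str], max_chars: usize) -> Result<Vec<String>, usize> {
    let mut chunks = Vec::new();
    let mut current = String::new();
    let mut current_len = 0usize;
    for (index, stanza) in stanzas.iter().enumerate() {
        let len = stanza.chars().count();
        if len > max_chars {
            return Err(index);
        }
        // One extra character for the newline joining stanzas in a chunk.
        let needed = if current.is_empty() { len } else { current_len + 1 + len };
        if needed > max_chars {
            chunks.push(std::mem::take(&mut current));
            current_len = 0;
        }
        if !current.is_empty() {
            current.push('\n');
            current_len += 1;
        }
        current.push_str(stanza);
        current_len += len;
    }
    if !current.is_empty() {
        chunks.push(current);
    }
    Ok(chunks)
}

/// Prefix each chunk with a report id and `part i/n` so readers can spot gaps.
pub fn label_parts(report_id: &str, chunks: &[String]) -> Vec<String> {
    let total = chunks.len();
    chunks
        .iter()
        .enumerate()
        .map(|(i, body)| format!("dogfood-status {} part {}/{}\n{}", report_id, i + 1, total, body))
        .collect()
}

/// Fail-closed classification: only a full send counts as delivered.
pub fn receipt_status(chunks_total: usize, chunks_sent: usize) -> DeliveryStatus {
    if chunks_total > 0 && chunks_sent == chunks_total {
        DeliveryStatus::Delivered
    } else if chunks_sent == 0 {
        DeliveryStatus::Failed
    } else {
        DeliveryStatus::Partial
    }
}
```

Returning the index of an oversized stanza, rather than truncating it, keeps the failure visible to the caller, which matches the fail-closed rule in the Gap paragraph.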
**Status:** Open. No source code changed. Filed 2026-04-26 14:32 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `62b20c7` before filing. Cluster delta: dogfood-transport-delivery-receipt +1; sibling to #246 (cron timeout ambiguity), #253 (state-vector budgeting), and #261 (summary self-consistency), but distinct transport/payload-budget layer. Concrete delta this cycle: ROADMAP-only pinpoint appended from live channel failure evidence.
## Pinpoint #270 — Help-text flag listing omits `--reasoning-effort`, `--base-commit`, `--allow-broad-cwd`, and the `-p` short-prompt alias even though `parse_args` accepts and validates them, so operators auditing `claw --help` see no contract for runtime-tunable flags that are fully wired but documented only via inline error messages
Dogfooded 2026-04-26 14:33 KST on `feat/jobdori-168c-emission-routing` at HEAD `364566c` (post-rebase fast-forward onto gaebal-gajae's #269 dogfood-transport-payload-budget pinpoint). Fresh `cargo run --quiet --bin claw -- --help` output was captured in full; the `Flags:` block lists exactly seven flags (`--model`, `--output-format`, `--compact`, `--permission-mode`, `--dangerously-skip-permissions`, `--allowedTools`, `--version/-V`) and the `Examples:` block uses only those. Source inspection of `parse_args` at `rust/crates/rusty-claude-cli/src/main.rs:875-942` shows the dispatcher additionally accepts `--reasoning-effort {low|medium|high}` (lines 916-936, with both space-separated and `=`-form, validated against the literal set), `--base-commit COMMIT` (lines 905-915, both forms), `--allow-broad-cwd` (lines 939-942, boolean toggle), and `-p PROMPT` as a Claw Code compat short-prompt alias (line 943-onward, greedy `args[index+1..].join(" ")`). Live verification: `cargo run --quiet --bin claw -- --reasoning-effort medium prompt "noop"` parses past the CLI parse stage and fails downstream with `[error-kind: missing_credentials]` (Anthropic auth missing), confirming the flag is dispatch-accepted; `--reasoning-effort yolo` yields `[error-kind: cli_parse] invalid value for --reasoning-effort: 'yolo'; must be low, medium, or high` (test at `main.rs:11288-11294`). None of these four surface tokens appear in any `writeln!` invocation inside `print_help_to` (`main.rs:9328-9480`).
Concrete failure mode: an operator running `claw --help` to audit available knobs cannot discover that reasoning effort, custom base-commit-for-diff, broad-cwd permission, or `-p` shorthand exist. They learn about `--reasoning-effort` only by typing it wrong and seeing the validation error message; about `--base-commit` only by reading source or `MERGE_CHECKLIST.md`/git scripts; about `-p` only by reading examples in third-party docs or by observing other automation. This is the second member of the help-contract-drift cluster founded by #263 (`--compact` help text said "text mode only" while runtime had a live compact-JSON dispatch path): #263 was a stale-mode-matrix on a flag that *is* listed; #270 is whole-flag absence — flags that are fully wired (parse, validate, propagate, downstream-consume) and never documented in the help surface they should anchor. Both are stale-contract-vs-runtime divergences at the CLI surface, but on different sub-axes (advertised-flag-stale-mode vs unadvertised-flag-fully-wired).
Gap. This is a **help-contract drift / fully-wired-but-undocumented-flag** divergence at the CLI surface, distinct from #263. #263 catalogues a documented flag whose mode matrix is incorrect; #270 catalogues four flags that the dispatcher fully accepts and the runtime fully consumes but `print_help_to` never lists in its `Flags:` block. It is also distinct from #262 (missing `--max-turns` flag — that flag is *neither* in help *nor* in dispatch), distinct from #267 (`prompt TEXT` greedy-slurp parse-asymmetry — that is a parse-side contract gap, not a help-listing gap), and distinct from #265 (`stream-json` output mode absent from both help and dispatch). #270 is specifically the **dispatch-accepted-but-help-omitted** sub-shape, which the help-contract-drift cluster needs to cover symmetrically alongside #263's documented-but-stale sub-shape.
Required fix shape: (a) extend the `Flags:` block in `print_help_to` to list `--reasoning-effort {low|medium|high}` with the validated value set inline (matching the error message at `main.rs:921-924`), `--base-commit COMMIT` with a one-line description tying it to `/diff` and merge-status workflows, `--allow-broad-cwd` with the security-scope semantics (when broad-cwd traversal is allowed and what it overrides), and `-p PROMPT` as the documented Claw Code compat one-shot alias with a pointer to the canonical `prompt TEXT` subcommand and to #267's greedy-slurp caveat; (b) add an `Examples:` line for at least `--reasoning-effort` and `--base-commit` matching their actual usage shape; (c) add a parser/help-parity regression test that asserts every `match`-arm string literal in `parse_args` for top-level flags appears at least once in the captured `print_help_to` output (mechanically forces help-vs-dispatch sync for future flags); (d) extend that test to also cover `=`-form variants so the next `--foo=` flag added cannot drift; (e) emit a typed `cli_parse` warning when help is generated but a registered flag has no help line (compile-time/test-time enforcement). Acceptance: `claw --help` lists every dispatch-accepted top-level flag; the parity test fails when a new flag is added to `parse_args` without a help line; #263 and #270 together close the help-contract-drift cluster on both the stale-mode and fully-omitted sub-axes.
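The parity check in fix-shape (c) could look roughly like the sketch below. It is an assumption-laden illustration: `flags_missing_from_help` is a hypothetical helper, the flag list is passed in by hand rather than extracted from `parse_args`, and the help text is matched by whole token so that `-p` is not falsely satisfied by the `-p` inside `--permission-mode`.

```rust
/// True when `flag` appears in the help text as a standalone token.
/// Splitting on whitespace, `,`, `/`, and `=` covers forms such as
/// `--version/-V` and `--flag=VALUE`.
fn help_mentions(help_text: &str, flag: &str) -> bool {
    help_text
        .split(|c: char| c.is_whitespace() || c == ',' || c == '/' || c == '=')
        .any(|token| token == flag)
}

/// Return every dispatch-accepted flag that the help text never lists.
/// An empty result means help and dispatch are in sync.
pub fn flags_missing_from_help<'a>(dispatch_flags: &[&'a str], help_text: &str) -> Vec<&'a str> {
    dispatch_flags
        .iter()
        .copied()
        .filter(|flag| !help_mentions(help_text, flag))
        .collect()
}
```

A regression test would assert that the returned list is empty for the captured `print_help_to` output, so adding a flag to the dispatcher without a help line fails the build.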
**Status:** Open. No source code changed. Filed 2026-04-26 14:34 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `364566c` before filing (post-rebase fast-forward onto gaebal-gajae's #269 dogfood-transport-payload-budget pinpoint). Cluster delta: help-contract-drift cluster 1→2 (#263 founder + #270 second-member, closes the documented-but-stale ↔ fully-wired-but-undocumented sub-axis pair); complementary-pinpoint-pair-bundle discovery-pattern extended to 5 bundles total (#245+#250 WebSearch, #262+#264 turn-budget, #264+#266 typed-error-axis, #254+#268 MCP-resources/tools-lifecycle, now #263+#270 help-contract-drift-stale-mode-vs-omitted). Smaller-scope by design (matches #253/#254/#257/#258/#260/#261/#262/#263/#264/#265/#266/#267/#268/#269 context-budget discipline). Sister: #263 (help-contract-drift founder; #263 stale-mode-matrix on a listed flag, #270 whole-flag absence on dispatch-accepted flags). Distinct from #262 (flag missing from BOTH help and dispatch). Distinct from #265 (`stream-json` output mode absent from BOTH help and dispatch). Distinct from #267 (parse-asymmetry, not help-listing). Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `364566c+#270` after push.
## Pinpoint #271 — Dogfood status generation lacks a repo-identity/source-of-truth guard, so the same claw-code nudge can drift from `ultraworkers/claw-code` to `code-yeongyu/claw-code` and publish an authoritative-looking report for the wrong project
Dogfooded 2026-04-26 15:00 KST from the live `#clawcode-building-in-public` loop after #270. The nudge explicitly targeted the active `claw-code` dogfood branch (`feat/jobdori-168c-emission-routing`, canonical remotes `origin=https://github.com/ultraworkers/claw-code`, `fork=https://github.com/Yeachan-Heo/claw-code`), and the branch had just advanced through #269/#270. Minutes later a Jobdori status report switched context to `code-yeongyu/claw-code` (private Rust port), described `main` as dormant since 2026-04-02, reported no `ROADMAP.md`, and filed a new pinpoint about `dev/rust` branch drift. That report was structurally plausible but belonged to a different repository/project, not the live dogfood branch that Clawhip was nudging.
Concrete failure mode: a dogfood cycle can satisfy the shape of the requested report while silently changing the repo identity underneath it. Operators then see an authoritative status block with `Repo: code-yeongyu/claw-code`, `Active sessions: 0`, and stale branch analysis, interleaved with the actual `ultraworkers/claw-code` ROADMAP cycle. This creates stale-branch confusion and queue pollution: the wrong repo gets analyzed, the active ROADMAP branch is skipped for that cycle, and subsequent status summaries may mix pinpoints from two unrelated claw-code lineages.
Gap. There is no mandatory repo-identity assertion in dogfood status generation. Reports include freeform `repo` text, but they are not checked against a canonical tuple such as `{remote_url, branch, worktree_path, roadmap_path, expected_head_prefix}` before publishing. This is distinct from #259 (freshness/provenance against git+ROADMAP for the chosen repo) and #269 (transport delivery receipts): #271 validates that the chosen repo is the **intended** repo before any freshness or delivery checks run. A report can be fresh, internally consistent, and delivered successfully while still being for the wrong repository.
Required fix shape: (a) define a canonical dogfood target identity for each Clawhip nudge (`repo_owner/name`, `remote_url`, `branch`, `worktree_path`, required backlog file such as `ROADMAP.md`, and optional fork remote); (b) before generating status or filing a pinpoint, assert the current cwd/remotes/branch/backlog-file match that identity; (c) emit `DOGFOOD_REPO_MISMATCH` and refuse to publish if the active repo is `code-yeongyu/claw-code` or any sibling while the nudge targets `ultraworkers/claw-code`; (d) include the verified tuple in every status report as machine fields, not prose; (e) add regression coverage where two repos named `claw-code` exist and the status command must reject the wrong one despite similar names. Acceptance: a `claw-code` dogfood nudge cannot produce a status report for `code-yeongyu/claw-code`; wrong-repo analysis fails closed with a typed mismatch receipt instead of entering the public status stream.
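Fix-shapes (a) through (c) amount to comparing an expected identity tuple with the observed one and failing closed on the first difference. The sketch below assumes hypothetical names (`DogfoodTarget`, `RepoMismatch`, `assert_repo_identity`) and a reduced tuple of four fields; how the observed values are read from the worktree is left out.

```rust
use std::fmt;

/// Hypothetical canonical identity for one dogfood nudge (subset of fix-shape (a)).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DogfoodTarget {
    pub repo: String,
    pub remote_url: String,
    pub branch: String,
    pub backlog_file: String,
}

/// Typed mismatch receipt naming the first field that differed.
#[derive(Debug, PartialEq, Eq)]
pub struct RepoMismatch {
    pub field: &'static str,
    pub expected: String,
    pub actual: String,
}

impl fmt::Display for RepoMismatch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "DOGFOOD_REPO_MISMATCH: {} expected '{}', got '{}'",
            self.field, self.expected, self.actual
        )
    }
}

/// Refuse to proceed unless every identity field matches the nudge target.
pub fn assert_repo_identity(
    expected: &DogfoodTarget,
    observed: &DogfoodTarget,
) -> Result<(), RepoMismatch> {
    let checks = [
        ("repo", &expected.repo, &observed.repo),
        ("remote_url", &expected.remote_url, &observed.remote_url),
        ("branch", &expected.branch, &observed.branch),
        ("backlog_file", &expected.backlog_file, &observed.backlog_file),
    ];
    for (field, want, got) in checks {
        if want != got {
            return Err(RepoMismatch {
                field,
                expected: want.clone(),
                actual: got.clone(),
            });
        }
    }
    Ok(())
}
```

Comparing the full `owner/name` string rather than the bare repository name is what lets two repos both called `claw-code` be told apart.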
**Status:** Open. No source code changed. Filed 2026-04-26 15:02 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `61be826` before filing. Cluster delta: dogfood-repo-identity-guard +1; sibling to #259 (freshness/provenance), #253 (state-vector context budget), and #269 (transport delivery), but distinct source-of-truth selection layer. Concrete delta this cycle: ROADMAP-only pinpoint appended from live wrong-repo report evidence.
## Pinpoint #272 — `--max-turns 0` zero-turn semantics are unspecified at the spec layer: three valid interpretations (parse-error, unlimited-sentinel, run-zero-iterations-fast-return) coexist with no canonicalization, the existing pinpoints #262 and #264 each prescribe the same fast-return resolution in passing without anchoring to upstream prior art, and the interaction matrix between `max_turns: 0` and `--allowedTools`, `--dangerously-skip-permissions`, session-state recording, hook execution, and `system_prompt` event emission is undefined
Dogfooded 2026-04-26 15:04 KST on `feat/jobdori-168c-emission-routing` at HEAD `29c262c` (post-rebase fast-forward onto gaebal-gajae's #271 dogfood-repo-identity-guard pinpoint). Reproduction matrix against `./rust/target/release/claw` (current built artifact, no source change required to demonstrate the gap):
- `claw --max-turns 0 -p "say hi"` → `[error-kind: cli_parse] error: unknown option: --max-turns` (rejected pre-`-p`, sister to #262 parse-side absence).
- `claw -p "say hi" --max-turns 0` → `[error-kind: missing_credentials]` (silently absorbed into prompt body, sister to #262 position-sensitive prompt-pollution).
- `claw --max-turns 0 prompt "say hi"` → `unknown option: --max-turns` (subcommand path also rejects).
- `claw --max-turns=0 -p "say hi"` → `unknown option: --max-turns=0` (`=`-form same).
This confirms #262's surface verdict (the flag does not exist) and #264's runtime verdict (no typed primitive to plumb into). What remains uncovered, and what #272 catalogues, is the **spec/contract layer underneath both**: even granting #262's CLI flag and #264's `TurnBudget` struct land, the literal value `0` has at least three operationally distinct interpretations, none of which is anchored to a referenced precedent or bound by an interaction matrix in the existing audit triangle.
Three competing `--max-turns 0` semantics, each operationally valid, mutually inconsistent:
(1) **Parse-time error** (`MaxTurnsParseError: must be ≥ 1`): treats `0` as out-of-range like a negative number; matches the strict-validator family (`--reasoning-effort yolo` at `main.rs:11288-11294` rejects out-of-set values with a typed error). Operationally useful for catching shell-substitution bugs (`--max-turns $UNSET_VAR` expanding to `0`). Cost: blocks the cost-zero parse-validation use case the existing pinpoints both endorse.
(2) **Unlimited sentinel** (`max_turns: 0` → `usize::MAX`): C-stdlib and some Rust APIs use `0` as "no limit", though Rust precedent is inconsistent (e.g. `std::io::Read::take(0)` yields a reader that returns no bytes, `tokio::sync::mpsc::channel(0)` rejects, and `tower::limit::ConcurrencyLimit::new(0)` admits zero permits). The existing `max_iterations: usize::MAX` default at `runtime/conversation.rs:181` already encodes "unlimited" as a max-int sentinel, and a careless port of that idiom could land `0`-as-unlimited. Operationally hostile: a user typing `--max-turns 0` to validate-only would instead unleash an unbounded loop.
(3) **Run-zero-iterations fast-return** (`Ok(TurnSummary { iterations: 0, assistant_messages: vec![], no_model_call: true })`): the resolution both #262 fix-shape (d) and #264 fix-shape (g) prescribe in passing. Operationally useful for cost-zero parse-validation ("does my CLI invocation parse, do my hooks load, do my tools register?") without consuming model tokens. But the existing pinpoints prescribe this resolution **without citing the upstream contract** (Anthropic Claude Code's documented `--max-turns 0` behavior, OpenAI Codex's analogous flag, or any Rust runtime precedent), and **without specifying the interaction matrix** below.
Verified concrete surface (rg across `rust/crates/`): zero `MaxTurnsZero`, zero `ZeroTurnFastReturn`, zero `no_model_call`, zero documented `0`-handling for `with_max_iterations` callers. The existing test at `runtime/conversation.rs:1768` constructs `with_max_iterations(1)` (smallest positive integer tested), not `with_max_iterations(0)` — so even the runtime primitive #264 catalogues has no test coverage for the zero-edge case. The subagent default `DEFAULT_AGENT_MAX_ITERATIONS: usize = 32` at `tools/lib.rs:3475` is also untested at boundary `0`.
Undefined interaction matrix (each cell needs a documented contract before any `--max-turns 0` semantics is canonical): (i) `--max-turns 0 --allowedTools "Read,Bash"` — does the empty turn still validate the allow-list (could surface `--allowedTools` parse errors typed) or skip validation (cheaper but loses the use case)? (ii) `--max-turns 0 --dangerously-skip-permissions` — does the zero-turn run still record the dangerous-permission flag in session metadata for audit, or is the session never created? (iii) `--max-turns 0` with a `.claw-session.jsonl` resume — does it append a no-op turn record (preserving session continuity), or silently no-op (saving a row but losing the audit trail of the zero-turn invocation)? (iv) `--max-turns 0` with `PreToolUse` / `PostToolUse` hooks registered — do hooks fire (giving observability of "dispatch reached") or skip (matching the no-tool-call contract)? (v) `--max-turns 0 --output-format json` — does the JSON envelope include `iterations: 0, assistant_messages: []` (consistent with #260's compact-JSON envelope shape) or emit a degenerate `null`/empty body? (vi) `--max-turns 0` and `system_prompt` event lane — does the event still emit (so consumers see the system prompt that *would* have been sent) or skip emission?
Gap. The spec layer is the **prerequisite that #262 and #264's fix-shapes both implicitly assume but neither documents**. #262 prescribes "return immediately after dispatch with `iterations: 0` and no model call" as the right semantic in fix-shape (d), but does not anchor that choice to a referenced upstream contract, does not address negative values explicitly (`--max-turns -1`: error or alias for unlimited?), and does not address `u32::MAX` (does any positive integer mean unlimited, or does the field have to become `Option<u32>`?). #264 prescribes the same `Ok(TurnSummary { iterations: 0, … })` in fix-shape (g) but inherits the same un-anchored decision and adds no interaction-matrix coverage. **No pinpoint in the existing turn-budget cluster catalogues the canonicalization act itself**: choosing semantic (3) over (1) and (2), citing the precedent, and binding the choice across the six interaction cells above.
Cluster shape novelty: completes the **turn-budget audit triangle** (#262 = CLI-parse layer, #264 = runtime-typed-primitive layer, #272 = spec/contract layer). The triangle now covers all three structural slots a single user-facing flag must occupy before it can ship: the request-shape gap (parse), the type-shape gap (primitive), and the meaning-shape gap (spec). Distinct from #262 (parse-side absence; #262 is "flag does not exist on the parser", #272 is "even if it existed, the value `0` has no canonical meaning"). Distinct from #264 (runtime-primitive absence; #264 is "the type cannot represent the budget", #272 is "the budget value `0` resolves to three different runtime behaviors"). Distinct from #266 (`RuntimeErrorKind` typed-error enum gap; #266 catalogues missing typed-error discriminants, #272 catalogues missing typed-success-with-zero-iterations contract).
Discovery-pattern continuation: founds the **Spec/contract-canonicalization-gap** sub-shape inside the turn-budget cluster, the FIRST pinpoint where the gap is not absent-flag (#262), absent-primitive (#264), or absent-error-kind (#266) but **absent-canonical-meaning-for-an-edge-value**. This sub-shape is portable: any other knob with a numeric or sentinel-shaped value (`--max-iterations`, `--timeout 0`, `--retries 0`, `--max-output-tokens 0`) has the same three-way ambiguity until canonicalized. Extends the audit-triangle pattern itself to a **three-layer-completeness** primitive: a single user-facing capability requires audit at parse-layer + primitive-layer + spec-layer before any of the three is shippable. Sister-shaped to #245+#250 (WebSearch client+server pair) and #262+#264 (turn-budget parse+primitive pair) but extends those bundles from 2-tuple to 3-tuple.
Required fix shape: (a) write a `TURN_BUDGET_SPEC.md` (or similar canonical-contract document under `docs/specs/`) that anchors `--max-turns 0` semantics to a referenced upstream precedent (Anthropic Claude Code's documented zero-turn behavior, with link), explicitly resolves the choice as semantic (3) `run-zero-iterations-fast-return`, and explicitly rejects semantics (1) and (2) with rationale; (b) define negative-value handling: `--max-turns -1` rejected at parse-time as `cli_parse` error (mirrors `--reasoning-effort yolo` typed-rejection precedent), with no alias semantics; (c) define unlimited-budget handling: introduce `--max-turns unlimited` as an explicit string sentinel OR document that no value means unlimited and unbounded loops require omitting the flag (avoid u32::MAX-as-sentinel which breaks downstream JSON consumers); (d) document the six-cell interaction matrix above with one paragraph per cell, each binding to a typed event/receipt: zero-turn run with `--allowedTools` validates the allow-list (cell i), records dangerous-permission flag in session-meta (cell ii), appends a typed `ZeroTurnInvocation` row to session.jsonl (cell iii), skips hook execution (cell iv) consistent with no-tool-call contract, emits `iterations: 0, assistant_messages: []` JSON envelope (cell v), and emits the `system_prompt` event so consumers can audit the would-have-been-sent prompt (cell vi); (e) replace `max_turns: u32` with `max_turns: TurnLimit` enum where `pub enum TurnLimit { Unlimited, ZeroFastReturn, Bounded(NonZeroU32) }` so the type system enforces the spec at compile-time and `0` cannot be confused with `unlimited` at any call site; (f) add tests for each interaction-matrix cell (i-vi) plus `TurnLimit::ZeroFastReturn` round-trip through `--output-format json`; (g) cross-reference the spec document from the `--max-turns` help text (#262 fix-shape (b) addition) and from the `TurnBudget` doc-comment (#264 fix-shape (a)) so future readers find canonical meaning 
before runtime behavior. Acceptance: `--max-turns 0` has exactly one documented behavior across CLI/runtime/JSON layers; the type system prevents semantics (1) and (2) from being silently introduced by a refactor; the six interaction cells each have a typed receipt; #262 and #264's fix-shapes can land knowing which `0` they are encoding; future zero-edge knobs (`--retries 0`, `--timeout 0`) have a canonicalization template to follow.
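A minimal sketch of fix-shape (e), assuming the flag value arrives as an optional string. The `TurnLimit` enum is taken from the text above; `parse_turn_limit` is a hypothetical helper, and treating both an omitted flag and the literal `unlimited` as `Unlimited` is an assumption that merges the two options in (c).

```rust
use std::num::NonZeroU32;

/// Turn budget as proposed in fix-shape (e): `0` and "unlimited" are
/// distinct variants, so no call site can confuse them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum TurnLimit {
    Unlimited,
    ZeroFastReturn,
    Bounded(NonZeroU32),
}

/// Hypothetical parser for the optional `--max-turns` value.
/// Assumption: a missing flag and the string `unlimited` both mean `Unlimited`.
pub fn parse_turn_limit(raw: Option<&str>) -> Result<TurnLimit, String> {
    match raw {
        None | Some("unlimited") => Ok(TurnLimit::Unlimited),
        Some(text) => {
            // Negative or non-numeric input fails the u32 parse and is
            // rejected at parse time, as fix-shape (b) requires.
            let n: u32 = text
                .parse()
                .map_err(|_| format!("cli_parse: invalid --max-turns value `{text}`"))?;
            // Zero maps to the fast-return variant; anything else is bounded.
            Ok(NonZeroU32::new(n).map_or(TurnLimit::ZeroFastReturn, TurnLimit::Bounded))
        }
    }
}
```

With this shape, `--max-turns 0` can only ever produce `TurnLimit::ZeroFastReturn`, and `--max-turns -1` never reaches the runtime.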
**Status:** Open. No source code changed. Filed 2026-04-26 15:04 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `29c262c` before filing (post-rebase fast-forward onto gaebal-gajae's #271 dogfood-repo-identity-guard pinpoint). Cluster delta: turn-budget cluster 2→3 (#262 parse-layer + #264 primitive-layer + #272 spec-layer = audit-triangle complete on a single user-facing flag); Spec/contract-canonicalization-gap sub-shape introduced (NEW sub-shape inside turn-budget cluster, portable to any sentinel-shaped numeric flag); complementary-pinpoint-pair-bundle discovery-pattern extended from 5 bundles to a first **three-tuple** (#262+#264+#272). Smaller-scope by design (matches #253-#271 context-budget discipline). Sister: #262 (parse-side; #262+#272 bracket the parse boundary's request-shape and meaning-shape gaps), #264 (primitive-side; #264+#272 bracket the runtime layer's type-shape and meaning-shape gaps), #266 (runtime-error-enum gap, parallel typed-meaning-axis but on errors not successes). Distinct from #271 (repo-identity guard at the dogfood layer; orthogonal to the turn-budget audit triangle). Distinct from silent-fallback family (catalogues silent input/output mutation; #272 catalogues missing canonical meaning at edge value). Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `29c262c+#272` after push.
## Pinpoint #273 — `claw status --output-format json` reports branch and dirty/clean state but omits HEAD SHA, upstream remote URL, ahead/behind counts, fetch timestamp, and source-of-truth repo identity, so machine consumers cannot detect stale/wrong-repo status from the product status surface itself
Dogfooded 2026-04-26 15:31 KST on `feat/jobdori-168c-emission-routing` at HEAD `ba6c5bc` (post-#272). Fresh `cargo run --quiet --bin claw -- --output-format json status` from `rust/` emits a useful workspace object (`cwd`, `project_root`, `git_branch`, `git_state`, changed/staged/unstaged/untracked counts, config/memory counts), but it omits the actual commit identity and provenance fields needed by dogfood automation: no `head_sha`, no `head_message`, no `head_timestamp`, no `upstream_branch`, no `upstream_remote_url`, no ahead/behind counts, no `last_fetch_at`, no canonical repo/source-of-truth slug, no `roadmap_last_pinpoint`, and no staleness marker. Text mode has the same gap: `Git branch feat/jobdori-168c-emission-routing` and `Git state clean`, but no commit/remotes/freshness.
Concrete failure mode: downstream claws can call the product-owned `claw status` surface and still cannot prove they are on the same HEAD as origin/fork, cannot distinguish `ultraworkers/claw-code` from a similarly named sibling repo, cannot detect that a local worktree is behind by one or more ROADMAP filings, and cannot cite the exact commit used for a dogfood report without shelling out to `git rev-parse`, `git remote -v`, `git rev-list --left-right --count`, and ROADMAP parsing. This is exactly the metadata that #259 (fresh status provenance) and #271 (repo identity guard) require, but it is missing from the canonical local status command that automation would naturally consume.
Gap. `claw status` is currently a local workspace cleanliness snapshot, not a provenance/freshness snapshot. That is fine for a human pre-commit check, but insufficient for recurring dogfood/status automation. This is distinct from #259, which requires dogfood status reports to include provenance; #273 identifies that the underlying product status surface does not provide those fields. It is distinct from #271, which requires repo-identity guards in dogfood generation; #273 identifies the missing repo identity fields in the product's JSON status. It is distinct from #269 transport delivery: #273 is pre-delivery status truth.
Required fix shape: (a) extend `claw status --output-format json` `workspace` with `git_head_sha`, `git_head_short`, `git_head_message`, `git_head_timestamp`, `upstream_branch`, `upstream_remote_name`, `upstream_remote_url`, `ahead`, `behind`, `last_fetch_at` (nullable), and `source_of_truth_repo` derived from the existing `OFFICIAL_REPO_URL`; (b) add optional `roadmap_last_pinpoint` / `roadmap_path` when `ROADMAP.md` exists at project root; (c) add `staleness_seconds` or `freshness_status` when upstream data is available, and `freshness_status: unknown_no_fetch` when it is not; (d) mirror the key fields in text mode compactly; (e) regression-test clean, ahead, behind, detached, no-upstream, and wrong-remote fixtures. Acceptance: a dogfood reporter can consume only `claw status --output-format json` plus ROADMAP content and refuse stale/wrong-repo reports without ad hoc git shelling.
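Fix-shape (c) could be sketched roughly as follows. Only `unknown_no_fetch` and `staleness_seconds` come from the text above; the `Fresh`/`Stale` split, the `max_age_secs` threshold, and the function name are assumptions for illustration.

```rust
/// Freshness classification for the status surface.
/// Assumption: `Fresh` and `Stale` are illustrative variant names; only
/// `UnknownNoFetch` mirrors the `unknown_no_fetch` value named in the text.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FreshnessStatus {
    Fresh { staleness_seconds: u64 },
    Stale { staleness_seconds: u64 },
    UnknownNoFetch,
}

/// Derives a freshness status from an optional last-fetch timestamp.
/// All timestamps are Unix epoch seconds; `max_age_secs` is a hypothetical knob.
pub fn freshness_status(
    now_epoch_secs: u64,
    last_fetch_at: Option<u64>,
    max_age_secs: u64,
) -> FreshnessStatus {
    match last_fetch_at {
        // No fetch recorded: report unknown rather than guessing.
        None => FreshnessStatus::UnknownNoFetch,
        Some(fetched) => {
            // saturating_sub guards against clock skew putting the fetch in the future.
            let staleness_seconds = now_epoch_secs.saturating_sub(fetched);
            if staleness_seconds <= max_age_secs {
                FreshnessStatus::Fresh { staleness_seconds }
            } else {
                FreshnessStatus::Stale { staleness_seconds }
            }
        }
    }
}
```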
**Status:** Open. No source code changed. Filed 2026-04-26 15:31 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `ba6c5bc` before filing. Cluster delta: product-status-provenance +1; sibling to #259 (report provenance) and #271 (repo identity guard), but distinct product status surface layer. Concrete delta this cycle: ROADMAP-only pinpoint appended after live `claw status` JSON/text verification.
## Pinpoint #274 — MCP tool calls and results render with the generic untyped fallback formatter while native tools get rich field-aware renderers, so every `mcp__server__tool` invocation prints a 96-char JSON summary plus a raw pretty-printed result block instead of the structured icon/path/lines/preview affordances `bash`/`Read`/`Write`/`Edit`/`Glob`/`Grep`/`WebSearch` enjoy
Dogfooded 2026-04-26 15:32 KST on `feat/jobdori-168c-emission-routing` at HEAD `f36f283` (post-#273). Static audit of `rust/crates/rusty-claude-cli/src/main.rs` `format_tool_call_start` (line 8504) and `format_tool_result` (line 8557) confirms the rendering arms match exclusively on native tool aliases: `"bash" | "Bash"`, `"read_file" | "Read"`, `"write_file" | "Write"`, `"edit_file" | "Edit"`, `"glob_search" | "Glob"`, `"grep_search" | "Grep"`, `"web_search" | "WebSearch"`. MCP qualified names follow the `mcp__{server}__{tool}` shape produced by `runtime::mcp::mcp_tool_name` (e.g., `mcp__claude_ai_Example_Server__weather_tool`); none of these match any specialized arm and both functions fall through to the wildcard `_ =>` branch — `summarize_tool_payload(input)` (96-char JSON-compaction truncate) for the call-start banner and `format_generic_tool_result(icon, name, &parsed)` (pretty-printed JSON dump capped at 60 lines / 4000 chars) for the result.
Concrete failure mode: an MCP server like `mcp__filesystem__read_file` performs the same logical operation as the native `Read` tool, but the user sees `╰─ mcp__filesystem__read_file ─╮` with `{"path":"…"}` JSON-summarized to 96 chars and a raw JSON pretty-print of the file content as result, instead of the native `📄 Reading <path>…` start banner with structured `✓ read_file: <line-count> lines` rendering. Same goes for MCP search tools (no `🔎` icon, no match summary), MCP write tools (no `✏️ Writing <path> (<lines> lines)` banner, no diff preview), and MCP shell tools (no `format_bash_result` exit-code/stdout/stderr structuring). Worse, MCP tool input schemas are typically known to the client (`tools/list` returns `inputSchema`), so the renderer has the metadata to extract semantic fields like `path`, `query`, `command`, `content`, `pattern` — it just doesn't.
Gap. The renderer treats MCP tools as opaque black boxes even though they cover the same semantic categories as native tools (file ops, search, shell, web). This is distinct from #254 (MCP Resources lifecycle absence — server-side concept), distinct from #258/#266/#272 (CLI parse / typed-error / spec gaps), distinct from #268 (`tools/list` re-fetch on resume — staleness, not rendering), and distinct from #261 (compact-summary internal consistency — doesn't touch tool rendering). It founds a NEW `mcp-vs-native-tool-rendering-parity` cluster on the rendering axis and pairs structurally with #268 along the MCP-axis (#268 = catalog freshness, #274 = rendering parity), forming an MCP cross-axis bundle.
Required fix shape: (a) introduce a `ToolRenderingProfile` enum keyed off the runtime tool category (FileRead, FileWrite, FileEdit, Search, Shell, WebSearch, Generic) and have both native and MCP tools advertise their category at registration time; (b) when an MCP tool's `inputSchema` declares well-known field names (`path`, `paths`, `query`, `pattern`, `command`, `content`), extract them in `format_tool_call_start` via a schema-aware path-extractor that supersedes the current `extract_tool_path` hardcoded key list; (c) thread server identity through the renderer so MCP tool banners can show `⚡ {server} · {tool}` instead of the raw `mcp__server__tool` underscore-glob; (d) emit a `tool_call_render_kind` field in the JSON envelope (`native_typed`, `mcp_typed`, `mcp_generic`, `untyped_fallback`) so dogfood audits can count parity coverage over time; (e) regression-test that an MCP tool whose schema declares `path: string` renders the same `📄 Reading {path}…` start banner as native `Read`, that an MCP shell tool with `command: string` renders the same exit-code/stdout/stderr structure as native `bash`, and that unknown-schema MCP tools fall back to `format_generic_tool_result` cleanly without breaking layout. Acceptance: an MCP `read_file` and a native `Read` of the same path produce visually equivalent terminal output, and the JSON envelope's per-tool `render_kind` field is populated for every tool call.
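A rough sketch of fix-shapes (a) through (c), assuming the renderer can see the top-level property names of an MCP tool's `inputSchema`. The variant list comes from the text above; `split_mcp_name`, `profile_from_fields`, and the field-to-profile precedence are hypothetical, and the real category would more likely be advertised at registration time than inferred.

```rust
/// Rendering categories named in fix-shape (a).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolRenderingProfile {
    FileRead,
    FileWrite,
    FileEdit,
    Search,
    Shell,
    WebSearch,
    Generic,
}

/// Splits `mcp__{server}__{tool}` into `(server, tool)`.
/// Assumption: the server segment itself contains no double underscore.
pub fn split_mcp_name(qualified: &str) -> Option<(&str, &str)> {
    qualified.strip_prefix("mcp__")?.split_once("__")
}

/// Hypothetical heuristic: pick a profile from well-known schema field names.
/// The precedence order here is an illustration, not a documented rule.
pub fn profile_from_fields(fields: &[&str]) -> ToolRenderingProfile {
    let has = |name: &str| fields.iter().any(|field| *field == name);
    if has("command") {
        ToolRenderingProfile::Shell
    } else if has("query") || has("pattern") {
        ToolRenderingProfile::Search
    } else if has("path") && has("content") {
        ToolRenderingProfile::FileWrite
    } else if has("path") || has("paths") {
        ToolRenderingProfile::FileRead
    } else {
        // Unknown schemas keep the existing generic fallback.
        ToolRenderingProfile::Generic
    }
}
```

Under these assumptions, `mcp__filesystem__read_file` with a `path` field would resolve to server `filesystem`, tool `read_file`, and the `FileRead` profile, which is enough to select the same start banner as native `Read`.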
**Status:** Open. No source code changed. Filed 2026-04-26 15:32 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `f36f283` before filing. Cluster delta: founds NEW `mcp-vs-native-tool-rendering-parity` cluster (1 member); pairs with #268 (MCP catalog freshness) on the MCP-axis as a cross-axis bundle (rendering × staleness). Concrete delta this cycle: ROADMAP-only pinpoint appended after static audit of `format_tool_call_start` / `format_tool_result` rendering arms vs `runtime::mcp::mcp_tool_name` qualified-name shape — zero MCP-aware match arms, full fallback to generic untyped path.
## Pinpoint #275 — `claw doctor --output-format json` splits repo provenance across unrelated `install source`, `workspace`, and `system` checks, so automation cannot consume one authoritative workspace provenance object even though the needed fragments are partially present
Dogfooded 2026-04-26 16:02 KST on `feat/jobdori-168c-emission-routing` at HEAD `fdf8890` (post-#274). Fresh `cargo run --quiet --bin claw -- --output-format json doctor` shows the data needed for provenance is scattered: the `install source` check has `official_repo: https://github.com/ultraworkers/claw-code`, the `workspace` check has `cwd`, `project_root`, `git_branch`, and `git_state`, while the `system` check has `git_sha: fdf88903`. There is no single object tying those together as "this workspace is repo X, branch Y, head Z, clean/dirty, upstream A, source-of-truth B". The text report has the same split: official repo under Install source, branch/clean under Workspace, Git SHA under System.
Concrete failure mode: a dogfood/status consumer that uses `doctor` instead of `status` can see official source-of-truth and a local Git SHA, but cannot know whether that SHA belongs to the workspace branch being inspected, whether the workspace remote actually matches the official repo, whether the local branch is ahead/behind origin/fork, or whether the official repo check is merely a static install warning unrelated to the current cwd. This is the doctor-surface sibling of #273: #273 found `claw status` lacks provenance fields entirely; #275 finds `claw doctor` has fragments but no normalized provenance object.
Gap. Diagnostic surfaces duplicate and diverge: `status` is branch/clean focused, `doctor` is health-check focused, and neither emits a canonical `workspace_provenance` object. This forces downstream claws to shell out to git or scrape multiple `doctor.checks[]` entries and infer joins by name. It is distinct from #259/#271 (dogfood report provenance/repo guard) and distinct from #273 (status surface missing fields); #275 targets the doctor health surface's fragmented schema.
Required fix shape: (a) add a top-level `workspace_provenance` object to `doctor` JSON containing `project_root`, `cwd`, `git_head_sha`, `git_branch`, `git_state`, `remote_urls`, `upstream_branch`, `ahead`, `behind`, `official_repo`, and `repo_identity_status: matches_official|fork_of_official|mismatch|unknown`; (b) have `status` and `doctor` share the same provenance struct/renderer so fields cannot drift; (c) in text mode, add one compact Provenance section instead of scattering related fields across Install source/Workspace/System; (d) add tests proving a wrong remote reports `repo_identity_status: mismatch` without needing downstream string scraping. Acceptance: automation can read one `doctor.workspace_provenance` object and decide whether the current cwd is the intended claw-code worktree at the expected HEAD.
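One way the `repo_identity_status` field in (a) might be computed, as a sketch only. The four status values come from the text above; `repo_slug` and `classify_repo_identity` are hypothetical, and treating "same repository name under a different owner" as a fork is a loud assumption, since real fork detection would need provider metadata.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RepoIdentityStatus {
    MatchesOfficial,
    ForkOfOfficial,
    Mismatch,
    Unknown,
}

/// Normalizes `https://host/owner/repo(.git)` and `git@host:owner/repo(.git)`
/// into a lowercase `owner/repo` slug. Returns None for shapes it does not know.
fn repo_slug(url: &str) -> Option<String> {
    let trimmed = url.trim().trim_end_matches('/');
    let trimmed = trimmed.strip_suffix(".git").unwrap_or(trimmed);
    let path = if let Some((_, rest)) = trimmed.split_once("://") {
        rest.split_once('/')?.1 // drop the host segment
    } else if let Some((_, rest)) = trimmed.split_once(':') {
        rest // scp-style remote: everything after the colon
    } else {
        return None;
    };
    let (owner, repo) = path.split_once('/')?;
    if owner.is_empty() || repo.is_empty() || repo.contains('/') {
        return None;
    }
    Some(format!("{}/{}", owner.to_lowercase(), repo.to_lowercase()))
}

/// Hypothetical classifier over the workspace's remote URLs.
pub fn classify_repo_identity(remote_urls: &[&str], official: &str) -> RepoIdentityStatus {
    let Some(official_slug) = repo_slug(official) else {
        return RepoIdentityStatus::Unknown;
    };
    let slugs: Vec<String> = remote_urls.iter().filter_map(|url| repo_slug(url)).collect();
    if slugs.is_empty() {
        return RepoIdentityStatus::Unknown;
    }
    if slugs.iter().any(|slug| *slug == official_slug) {
        return RepoIdentityStatus::MatchesOfficial;
    }
    // Assumption: same repo name under another owner is reported as a fork.
    let official_name = official_slug.split_once('/').map(|(_, name)| name);
    if slugs.iter().any(|slug| slug.split_once('/').map(|(_, name)| name) == official_name) {
        RepoIdentityStatus::ForkOfOfficial
    } else {
        RepoIdentityStatus::Mismatch
    }
}
```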
**Status:** Open. No source code changed. Filed 2026-04-26 16:02 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `fdf8890` before filing. Cluster delta: product-diagnostic-provenance +1; sister to #273 (`status` provenance) but distinct `doctor` fragmented-schema surface. Concrete delta this cycle: ROADMAP-only pinpoint appended after live `doctor` JSON/text verification.
## Pinpoint #276 — `--allowedTools` help line advertises only the camelCase form and an opaque "repeatable; comma-separated" prose, while `parse_args` additionally dispatches the kebab-case alias `--allowed-tools` and both `=`-form variants (`--allowedTools=VAL` / `--allowed-tools=VAL`), so the listed-flag's alias-and-value-shape coverage in help is structurally incomplete vs the parser — third help-contract-drift sub-axis distinct from #263 (listed-flag-stale-mode-matrix) and #270 (whole-flag-omitted)
Dogfooded 2026-04-26 16:14 KST on `feat/jobdori-168c-emission-routing` at HEAD `0240cad` (post-rebase fast-forward onto gaebal-gajae's #275 `claw doctor` provenance fragmentation pinpoint). Fresh `cargo run --quiet --bin claw -- --help` lists `--allowedTools TOOLS Restrict enabled tools (repeatable; comma-separated aliases supported)` at `rust/crates/rusty-claude-cli/src/main.rs:9418` and `claw [--model MODEL] [--allowedTools TOOL[,TOOL...]]` at `:9334` — both surfaces use only the camelCase `--allowedTools` token. Source inspection of `parse_args` at `:979-994` shows the dispatcher accepts four shape variants for the same flag: `--allowedTools VAL` (line 979 first arm), `--allowed-tools VAL` (same arm OR-pattern), `--allowedTools=VAL` (line 986 prefix arm), and `--allowed-tools=VAL` (line 990 second prefix arm). Live verification with the freshly rebuilt `target/debug/claw`: all four shapes parse past the CLI parse stage and fail downstream with `[error-kind: missing_credentials]`, confirming each variant is fully dispatch-accepted. Repeated invocation (`--allowedTools read --allowedTools glob`) also dispatches successfully, confirming the help's bare "repeatable" prose corresponds to a real wired surface but with no documented composition rule (does the second occurrence replace, append, or set-union the first?).
Concrete failure mode: an operator scripting `claw` with shell tooling that prefers kebab-case-only conventions (e.g. POSIX-style argv generators, automation that derives flag names from snake_case fields via `s/_/-/g`, or downstream claws that mirror Anthropic Claude Code's documented `--allowed-tools` form) sees `claw --help` advertise only `--allowedTools` and either (a) avoids the kebab-case alias under the false belief it does not exist; or (b) discovers it accidentally by typing it; or (c) reads source. The `=`-form for both casings is the same: an operator habituated to `--flag=value` shell syntax cannot tell from help that `--allowedTools=read,glob` is a real path and may avoid it. The "repeatable" prose has no example showing whether `--allowedTools read --allowedTools glob` set-unions to `{read, glob}` or whether the second occurrence overwrites — fresh verification of `normalize_allowed_tools` at `main.rs:1826` and `current_tool_registry()?.normalize_allowed_tools(values)` at `tools/src/lib.rs:192` shows the values are accumulated into a flat `Vec<String>` and each is comma-split-and-flattened into a `BTreeSet`, so the actual semantic is set-union — but the help prose neither states this nor cites the alternative.
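The set-union semantic described above can be illustrated with a small standalone sketch. This is not the body of `normalize_allowed_tools`; `union_allowed_tools` is a hypothetical stand-in, and trimming whitespace and skipping empty entries are assumptions (the empty-value case is the subject of #258).

```rust
use std::collections::BTreeSet;

/// Flattens every occurrence of the flag into one deduplicated, ordered set.
/// Each occurrence may itself be a comma-separated list.
pub fn union_allowed_tools(occurrences: &[&str]) -> BTreeSet<String> {
    occurrences
        .iter()
        .copied()
        .flat_map(|value| value.split(','))
        .map(str::trim)
        // Assumption: empty entries are skipped here; real handling may differ.
        .filter(|tool| !tool.is_empty())
        .map(str::to_owned)
        .collect()
}
```

Under this model `--allowedTools read --allowedTools glob,read` yields `{glob, read}`: later occurrences add to the set rather than replacing it.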
Gap. The help-contract-drift cluster's third sub-axis: a listed flag whose dispatch surface accepts more shapes (alias casings, `=`-form, repetition with a defined composition rule) than the help line advertises. This is distinct from #263 (listed-flag-stale-mode-matrix: a documented flag whose mode-compatibility prose contradicts dispatch — `--compact` claims text-only while JSON dispatch path exists) and distinct from #270 (whole-flag-omitted: a fully wired flag with zero help line — `--reasoning-effort`, `--base-commit`, `--allow-broad-cwd`, `-p`). #276 catalogues the **listed-flag-incomplete-shape-coverage** sub-shape: the flag IS in help, but the help understates which forms parse, what the value-shape composition rule is, and whether common alias conventions (kebab-case, `=`-form, repetition semantics) apply. The cluster now has 3 members (#263 + #270 + #276) covering three structurally distinct help-vs-dispatch divergence shapes: stale mode (listed, wrong info), omission (unlisted, full info elsewhere), and incomplete shape (listed, partial info).
Distinct from #258 (`--allowedTools ""` empty-value silent-coercion at the CLI parse boundary; #258 is a runtime acceptance-of-malformed-input gap, #276 is a help-text-vs-dispatch-shape-coverage gap on the same flag). Distinct from #267 (`prompt TEXT` greedy-slurp parse-asymmetry; that's a parse-side contract gap on a different surface). Distinct from #265 (`stream-json` output mode absent from BOTH help and dispatch). Distinct from #262 (`--max-turns` flag missing from BOTH help and dispatch).
Discovery-pattern continuation: completes the help-contract-drift cluster's three-sub-axis audit triangle (stale-mode #263 + omitted-flag #270 + incomplete-shape #276), structurally analogous to the turn-budget audit triangle (#262 parse + #264 primitive + #272 spec) — both clusters now occupy three distinct structural slots a single CLI surface can fail at. Extends the **complementary-pinpoint-pair-bundle** discovery-pattern from 5 pair-bundles + 1 three-tuple (turn-budget #262+#264+#272) to 5 pair-bundles + 2 three-tuples (now help-contract-drift #263+#270+#276). The two three-tuples are sister-shaped: each catalogues that audit-completeness for a single user-facing CLI surface requires pinpointing THREE distinct sub-axes rather than two.
Required fix shape: (a) extend the `Flags:` block at `print_help_to:9418` to advertise both casings (`--allowedTools, --allowed-tools TOOLS`) on a single line, matching the OR-pattern in `parse_args:979`; (b) document `=`-form support inline (`--allowedTools=TOOLS, --allowed-tools=TOOLS` accepted) — the existing prose offers no signal that `=`-form parses; (c) document the repetition composition rule explicitly: `repeatable: each occurrence set-unions into the allow-list; pass once per logical group or comma-separate within one occurrence` — eliminating the ambiguity between replace/append/union semantics; (d) add an `Examples:` line showing kebab-case + repetition: `claw --allowed-tools read --allowed-tools glob "summarize Cargo.toml"`; (e) add a help-vs-dispatch alias-coverage regression test that asserts every flag-name-string-literal in `parse_args` (including OR-patterns and `starts_with` prefix arms) appears at least once in the captured `print_help_to` output — mechanically forces help-vs-dispatch alias sync for future flags; (f) extend the test to cover `=`-form variants by asserting that any flag accepting `--foo VAL` form also has its `--foo=VAL` form documented when both arms exist; (g) audit other listed flags for the same incomplete-shape-coverage sub-shape: `--output-format` (does the parser accept `--output-format=json`? — yes per `:889`, but help shows only the space-form), `--permission-mode` (same — `:893` accepts `=`-form, help shows only space-form), `--model` (verify), `--base-commit` after #270 fix lands. Acceptance: `claw --help` discloses every shape variant the parser accepts for every listed flag; the regression test fails when a new alias/equals-form arm is added without a help update; #263 + #270 + #276 together close the help-contract-drift cluster on all three structurally distinct help-vs-dispatch divergence shapes.
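The regression test in (e) and (f) could start from a check like the one below. It is a sketch under assumptions: `undocumented_flags` is a hypothetical helper, the flag literals are passed in by hand rather than extracted from `parse_args`, and plain substring matching can produce false passes when one flag name is a prefix of another.

```rust
/// Returns every parser-accepted flag literal that never appears in the help text.
/// Substring matching is a simplification; a real test would tokenize the help.
pub fn undocumented_flags<'a>(parser_flags: &[&'a str], help_text: &str) -> Vec<&'a str> {
    parser_flags
        .iter()
        .copied()
        .filter(|flag| !help_text.contains(*flag))
        .collect()
}
```

Run against the current help line, which only mentions `--allowedTools TOOLS`, this reports `--allowed-tools`, `--allowedTools=`, and `--allowed-tools=` as undocumented, which is the drift this pinpoint describes.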
**Status:** Open. No source code changed. Filed 2026-04-26 16:14 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `0240cad` before filing (post-rebase fast-forward onto gaebal-gajae's #275 `claw doctor` provenance fragmentation pinpoint). Cluster delta: help-contract-drift cluster 2→3 (#263 stale-mode-matrix + #270 whole-flag-omitted + #276 listed-flag-incomplete-shape-coverage = three-sub-axis audit triangle complete on the help surface); complementary-pinpoint-pair-bundle discovery-pattern extended to 5 pair-bundles + 2 three-tuples (turn-budget #262+#264+#272 + help-contract-drift #263+#270+#276). Smaller-scope by design (matches #253-#275 context-budget discipline). Sister: #263 (stale-mode-matrix sub-axis), #270 (whole-flag-omitted sub-axis); #276 occupies the third structurally distinct sub-axis (listed-flag-incomplete-shape-coverage). Distinct from #258 (silent-coercion of empty value at parse boundary on the same flag; orthogonal layer — runtime acceptance vs help advertisement). Distinct from turn-budget cluster three-tuple (#262+#264+#272: parse/primitive/spec layers; #263+#270+#276 are three sub-axes within a single layer — the help surface). Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `0240cad+#276` after push.
## Pinpoint #277 — Dogfood reminder delivery can fail with bare `Unknown Channel` because the nudge/report path does not pre-resolve and validate channel targets against the live provider directory before attempting send
Dogfooded 2026-04-26 16:30 KST from the live `#clawcode-building-in-public` loop after #276. The dogfood reminder/status loop emitted `Cron job "clawcode-dogfood-cycle-reminder" failed: Error: Unknown Channel` in the same channel that was otherwise actively receiving git hooks, Jobdori reports, and Clawhip nudges. The visible error contains no target channel id/name, no provider account, no guild id, no route key, no indication of whether the channel was deleted, missing from cache, on the wrong provider, or permission-denied, and no retry/fallback target.
Concrete failure mode: a recurring dogfood nudge can die before delivery because the configured target cannot be resolved at send time, but the only surfaced signal is a bare provider error. Operators cannot tell whether the cron used a stale channel id, a name instead of id, the wrong Discord account/guild, a missing allowlist route, or a transient directory/cache miss. This is distinct from #269: #269 covers payload-size truncation and post-send delivery receipts; #277 covers pre-send target resolution and channel identity validation before any payload is sent.
Gap. There is no typed `channel_resolution` preflight in the dogfood delivery path. A robust cycle should resolve `{provider, guild_id, channel_id, channel_name, route_key}` before rendering/sending the report, cache the resolved identity with a freshness timestamp, and fail closed with a typed diagnostic if the target is unknown. The current surface lets a low-level `Unknown Channel` bubble up with no context, which makes the next action ambiguous and risks repeated cron failures against the same bad target.
Required fix shape: (a) add a channel-target preflight before dogfood reminder send that resolves the configured target to a canonical channel id/guild/provider tuple; (b) emit a typed `delivery_target_resolution_failed { provider, configured_target, guild_id, reason, directory_freshness_ms, fallback_targets }` event instead of bare `Unknown Channel`; (c) distinguish `not_found`, `permission_denied`, `wrong_provider`, `wrong_guild`, `cache_stale`, and `deleted_or_archived`; (d) include the resolved target tuple in successful delivery receipts (#269 sibling) so later reports can prove which channel was used; (e) add regression coverage where a stale channel id fails at preflight with a typed diagnostic and does not attempt message send. Acceptance: dogfood cron never surfaces a naked `Unknown Channel`; it reports the exact configured target, resolution failure class, and safe next action.
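A sketch of the preflight in (a) through (c), assuming the provider directory is available as an in-memory map keyed by channel id. The failure classes come from (c); every type, field, and the check order below are hypothetical.

```rust
use std::collections::HashMap;

/// Failure classes listed in fix-shape (c).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ResolutionFailure {
    NotFound,
    PermissionDenied,
    WrongProvider,
    WrongGuild,
    CacheStale,
    DeletedOrArchived,
}

/// Hypothetical cached directory entry for one channel.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct DirectoryEntry {
    pub provider: String,
    pub guild_id: String,
    pub channel_name: String,
    pub archived: bool,
    pub can_send: bool,
}

/// Hypothetical configured delivery target.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ConfiguredTarget {
    pub provider: String,
    pub guild_id: String,
    pub channel_id: String,
}

/// Resolves the target before any payload is rendered or sent.
pub fn resolve_target<'a>(
    directory: &'a HashMap<String, DirectoryEntry>,
    directory_freshness_ms: u64,
    max_directory_age_ms: u64,
    target: &ConfiguredTarget,
) -> Result<&'a DirectoryEntry, ResolutionFailure> {
    // A stale directory cannot prove absence, so report staleness first.
    if directory_freshness_ms > max_directory_age_ms {
        return Err(ResolutionFailure::CacheStale);
    }
    let entry = directory
        .get(&target.channel_id)
        .ok_or(ResolutionFailure::NotFound)?;
    if entry.provider != target.provider {
        return Err(ResolutionFailure::WrongProvider);
    }
    if entry.guild_id != target.guild_id {
        return Err(ResolutionFailure::WrongGuild);
    }
    if entry.archived {
        return Err(ResolutionFailure::DeletedOrArchived);
    }
    if !entry.can_send {
        return Err(ResolutionFailure::PermissionDenied);
    }
    Ok(entry)
}
```

A caller would map the `Err` value into the `reason` field of the `delivery_target_resolution_failed` event and skip the send entirely.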
**Status:** Open. No source code changed. Filed 2026-04-26 16:32 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `cad7bb1` before filing. Cluster delta: dogfood-delivery-target-resolution +1; sibling to #269 (payload/delivery receipt) and #246 (cron timeout ambiguity), distinct pre-send channel identity layer. Concrete delta this cycle: ROADMAP-only pinpoint appended from live `Unknown Channel` cron failure evidence.
## Pinpoint #278 — `Session::from_jsonl` and `Session::from_json` parse the `version` field, store it verbatim, but never compare it against `SESSION_VERSION`, so a session file with any `u32` version (past, future, or corrupted) loads successfully and is silently treated as the current schema with default-filled-or-dropped fields
Dogfooded 2026-04-26 16:34 KST on `feat/jobdori-168c-emission-routing` at HEAD `4e4edc8` (post fast-forward onto gaebal-gajae's #277 channel-resolution preflight pinpoint). Static audit of `rust/crates/runtime/src/session.rs` shows `const SESSION_VERSION: u32 = 1;` at line 12, and three call sites that touch the field: `Session::new` at line 162 sets `version: SESSION_VERSION` on creation; `Session::from_json` at lines 338-343 parses `version` via `required_u32` and returns `SessionError::Format("version out of range")` only when `u32::try_from` fails; `Session::from_jsonl` at lines 406+445 initializes a local `let mut version = SESSION_VERSION;` and overwrites it from the `session_meta` record's `version` field via `required_u32`. In every loader path the parsed value is stored in the returned `Session.version` field without any comparison against `SESSION_VERSION`. `grep -rn "SESSION_VERSION" rust/crates/runtime/src/session.rs` returns exactly three hits — declaration + two assignment sites — and zero comparison/migration/reject sites. `grep -iE "migrat|incompat|upgrade|downgrade|reject|mismatch|future|forward" rust/crates/runtime/src/session.rs` returns zero matches.
Concrete failure mode: a session JSONL produced by a future claw-code release with `version: 2` and new schema fields (e.g. an unforeseen `tool_call_render_kind` per #274, a `workspace_provenance` block per #275, a `health_check_history` array, or a renamed `compaction` shape) is loaded by an older claw-code build. The older build silently accepts `version: 2`, drops every unknown record type via `"unsupported JSONL record type at line {}: {other}"` (which is structurally fine for forward-compat), but stores `self.version = 2` in memory, then on next save writes back a `session_meta` record with `version: 2` mixed with v1-shape data. Symmetrically, a corrupted or hand-edited session with `version: 999` or `version: 0` loads without warning and round-trips as if it were the live schema. There is no operator-visible signal — no warning log, no typed error, no `--strict-version` opt-in — that the on-disk schema does not match the binary's expectations. Combined with #259/#271/#273/#275 (provenance fragmentation across dogfood/status/doctor surfaces), this means a session file's schema-of-record is not auditable from any product surface either.
Gap. There is no version-mismatch policy. The on-disk `version` field is treated as an opaque tag rather than a contract. A correct loader for a versioned format must either (i) accept only `version == SESSION_VERSION` and reject everything else with a typed `SchemaVersionMismatch { found, expected }` error, (ii) maintain an explicit migration table that upgrades older versions to the current shape and refuses unknown future versions, or (iii) document a forward-compat policy with explicit field-level handling rules. claw-code does none of these; the field is parsed for storage only. This is structurally identical to a database without a schema_version column being silently bumped — the data still loads, but downstream consumers cannot tell whether they got the schema they expected.
Distinct from #259 (dogfood report provenance — runtime emission, not on-disk persistence). Distinct from #271 (repo-identity guard — workspace remote provenance, not session schema). Distinct from #273/#275 (status/doctor diagnostic-surface provenance fragmentation — product surface field layout, not session loader semantics). Distinct from #266 (typed-error-kind for credentials missing — error-kind for one specific runtime decision, not a missing-error-kind for schema versioning). Distinct from the entire help-contract-drift cluster (#263+#270+#276 — CLI surface vs dispatcher), the turn-budget triangle (#262+#264+#272 — parse/primitive/spec layers), the MCP-axis cluster (#254+#268+#274+#275 — MCP runtime/catalog/rendering/doctor), and the provenance quartet (#259+#271+#273+#275 — provenance surfacing). #278 founds a NEW `persisted-schema-version-policy` cluster on the persistence-layer axis — the first pinpoint to target what the loader does (or fails to do) with the `version` field on disk rather than what the diagnostic surface emits about state.
Discovery-pattern continuation: extends the structural-gap-without-source-change discovery-pattern (silent-fallback cluster style — accept input that should be rejected) into the on-disk persistence layer. Pairs structurally with the silent-fallback cluster: silent-fallback accepts malformed CLI/runtime input without a typed error; #278 accepts wrong-version on-disk state without a typed error. Both are structurally-absent-error-kind gaps but in different layers (input vs persisted state). Pairs orthogonally with the provenance quartet: provenance is about emitting state-of-record at runtime; #278 is about validating state-of-record at load. Together they form a state-of-record cross-axis bundle (emission × validation).
Required fix shape: (a) add a typed `SessionError::SchemaVersionMismatch { found: u32, expected: u32, policy: VersionPolicy }` variant where `VersionPolicy` is `Strict | MigrationAvailable | ForwardCompatibleReadOnly`; (b) at the top of `from_json` and after the `session_meta` parse in `from_jsonl`, compare the parsed `version` against `SESSION_VERSION` and short-circuit when not equal under the configured policy; (c) introduce a small `migrate_session(version: u32, raw: &JsonValue) -> Result<Session, SessionError>` table even if the only entry today is `1 -> 1` identity, so future versions land with one well-known extension point; (d) when loading a future version under a `ForwardCompatibleReadOnly` policy, refuse to write back to the same path (preserve the original) and surface a one-time warning; (e) extend tests to cover `version: 0` (rejected/migrated), `version: 2` (rejected or read-only), `version: 999` (rejected), and a missing `version` field (rejected with a clear message rather than silently defaulting to `SESSION_VERSION`); (f) add a session-version provenance line to `claw status` and `claw doctor` output (closes the #273/#275 surface gap for this specific schema-of-record dimension) so operators can inspect on-disk schema age vs binary schema age without scraping JSONL. Acceptance: a session file with any version other than `SESSION_VERSION` produces a typed, surface-visible diagnostic before any in-memory state is mutated; the migration table is the single extension point for future bumps; `claw status` and `claw doctor` show `session_schema_version` as a first-class provenance field.
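The comparison in (a), (b), and (d) might look like the sketch below. `SchemaVersionMismatch` and `VersionPolicy` follow the text above; `LoadMode` and `check_session_version` are hypothetical, and because no migrations exist yet, `MigrationAvailable` behaves like `Strict` in this illustration.

```rust
pub const SESSION_VERSION: u32 = 1;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VersionPolicy {
    Strict,
    MigrationAvailable,
    ForwardCompatibleReadOnly,
}

/// Only the variant relevant to this sketch is shown.
#[derive(Debug, PartialEq, Eq)]
pub enum SessionError {
    SchemaVersionMismatch {
        found: u32,
        expected: u32,
        policy: VersionPolicy,
    },
}

/// Hypothetical outcome telling the caller whether write-back is allowed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LoadMode {
    ReadWrite,
    ReadOnly,
}

/// Compares the on-disk version against the binary's schema version
/// before any in-memory session state is built.
pub fn check_session_version(found: u32, policy: VersionPolicy) -> Result<LoadMode, SessionError> {
    if found == SESSION_VERSION {
        return Ok(LoadMode::ReadWrite);
    }
    match policy {
        // Future versions may be inspected but never written back, per (d).
        VersionPolicy::ForwardCompatibleReadOnly if found > SESSION_VERSION => {
            Ok(LoadMode::ReadOnly)
        }
        // Assumption: with an empty migration table, every other mismatch is rejected.
        _ => Err(SessionError::SchemaVersionMismatch {
            found,
            expected: SESSION_VERSION,
            policy,
        }),
    }
}
```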
**Status:** Open. No source code changed. Filed 2026-04-26 16:34 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `4e4edc8` before filing (post fast-forward onto gaebal-gajae's #277 channel-resolution preflight). Cluster delta: founds NEW `persisted-schema-version-policy` cluster (1 member) on the persistence-layer axis. Cross-axis bundle with silent-fallback cluster (input vs persisted state, structurally-absent-error-kind) and with provenance quartet #259+#271+#273+#275 (emission vs load-validation of state-of-record). Concrete delta this cycle: ROADMAP-only pinpoint appended after static audit of `Session::from_json`/`Session::from_jsonl` version-handling arms — three call sites for `SESSION_VERSION` (1 declaration + 2 assignments, 0 comparisons), zero migration/mismatch sites in the entire `runtime/src/session.rs` file. Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `4e4edc8+#278` after push.
## Pinpoint #279 — Session loader silently drops unknown fields inside known JSON/JSONL records with no extension policy or preservation map, so future schema data can be erased even if #278 adds version checks later
Dogfooded 2026-04-26 17:02 KST on `feat/jobdori-168c-emission-routing` at HEAD `6c154c9` (post-#278). Static audit of `rust/crates/runtime/src/session.rs` shows the loader rejects unknown top-level JSONL record `type` values (`unsupported JSONL record type` at lines 476-480), but for known records it cherry-picks recognized fields and drops all extras. `Session::from_json` reads `version`, `messages`, timestamps, `compaction`, `fork`, `workspace_root`, `prompt_history`, and `model`, then constructs `Session` with no `extensions` / `unknown_fields` preservation. `from_jsonl` does the same for `session_meta`, `message`, `compaction`, and `prompt_history`: any future field inside a known record (for example #268's `tool_catalog_revision`, #272's `ZeroTurnInvocation`, or #273's workspace provenance) is ignored on load and omitted on the next save/render.
Concrete failure mode: a future v2 session can add fields under known record types and be loaded by a v1 binary without an error, warning, or preservation. If the session is later saved/rotated/compacted, those fields disappear. #278 catches that the `version` tag itself is never compared, but even with a version comparison policy the field-level behavior needs a separate contract: should unknown fields in known records be rejected, preserved, quarantined, or ignored? Today the answer is implicit silent data loss.
Gap. The session persistence format lacks an extension/unknown-field policy. It is neither strict (fail on unknown fields in known records) nor forward-compatible (preserve unknown fields for round-trip) nor explicitly lossy (emit a warning/receipt when dropping them). This is distinct from #278, which targets the `version` field not being compared; #279 targets the per-record field preservation policy after a record type is accepted. It is also distinct from #259/#273 provenance emission gaps: this is load/save behavior for persisted state-of-record.
Required fix shape: (a) define a schema policy for unknown fields in known session records: strict reject for incompatible versions OR lossless preservation via `extensions: BTreeMap<String, JsonValue>`; (b) if rejecting, include record type, line number, and field names in a typed `SessionSchemaError::UnknownFields`; (c) if preserving, round-trip unknown fields through `to_json`, `render_jsonl_snapshot`, compaction, and rotation; (d) add `schema_extensions_dropped` warning telemetry if any field is intentionally ignored; (e) regression-test JSON and JSONL sessions containing a future `tool_catalog_revision` field inside `session_meta` and a future per-message field, proving they either fail typed or survive a load-save round-trip. Acceptance: a future schema field in a known record can never disappear silently.
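The lossless-preservation branch of (a) and (c) can be sketched as follows. This is an illustration under stated assumptions: `SessionMeta`, `load_meta`, and `save_meta` are hypothetical, and field values are kept as raw JSON text in a `BTreeMap<String, String>` so the sketch needs no JSON crate (the fix shape itself names `BTreeMap<String, JsonValue>`).

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a parsed `session_meta` record.
#[derive(Debug, Default, PartialEq)]
pub struct SessionMeta {
    pub version: Option<String>,
    pub model: Option<String>,
    /// Unknown fields are kept here instead of being dropped on load.
    pub extensions: BTreeMap<String, String>,
}

/// Pick out the recognized fields and preserve everything else.
pub fn load_meta(fields: &BTreeMap<String, String>) -> SessionMeta {
    let mut meta = SessionMeta::default();
    for (key, value) in fields {
        match key.as_str() {
            "version" => meta.version = Some(value.clone()),
            "model" => meta.model = Some(value.clone()),
            _ => {
                meta.extensions.insert(key.clone(), value.clone());
            }
        }
    }
    meta
}

/// Write known fields back alongside the preserved extensions.
pub fn save_meta(meta: &SessionMeta) -> BTreeMap<String, String> {
    let mut out = meta.extensions.clone();
    if let Some(version) = &meta.version {
        out.insert("version".to_string(), version.clone());
    }
    if let Some(model) = &meta.model {
        out.insert("model".to_string(), model.clone());
    }
    out
}
```

With this shape, a future `tool_catalog_revision` field inside `session_meta` survives a load-save round-trip unchanged, which is the property the regression test in (e) would assert.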
**Status:** Open. No source code changed. Filed 2026-04-26 17:02 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `6c154c9` before filing. Cluster delta: persisted-schema-version-policy 1→2 (#278 version-comparison + #279 unknown-field policy); sibling to #278, distinct field-preservation layer. Concrete delta this cycle: ROADMAP-only pinpoint appended after static audit of `Session::from_json` / `from_jsonl` field selection.
## Pinpoint #280 — Hook execution progress events (`PreToolUse`/`PostToolUse` Started/Completed/Cancelled) are eprintln-text-only when `emit_output=true` and dropped on the floor entirely when `--output-format json` is selected, so the per-turn JSON envelope shows `tool_uses`/`tool_results`/`auto_compaction`/`usage` but zero hook execution evidence even though hooks fire and can deny/mutate/cancel tools
Dogfooded 2026-04-26 17:05 KST on `feat/jobdori-168c-emission-routing` at HEAD `bdcf3fa` (post fast-forward onto gaebal-gajae's #279 unknown-fields silent-drop pinpoint). Static audit of the JSON-mode dispatch path: `LiveCli::run_prompt_json` at `rust/crates/rusty-claude-cli/src/main.rs:4690-4729` calls `prepare_turn_runtime(false)` (line 4691), where the bool argument is `emit_output`. `prepare_turn_runtime` at `:4567-4587` then calls `build_runtime(..., emit_output, ...)` at `:4574-4583`, which at `:7726-7728` only attaches `with_hook_progress_reporter(Box::new(CliHookProgressReporter))` when `emit_output` is `true`. In the JSON-mode path, `emit_output=false`, so the runtime gets NO progress reporter at all. `Conversation::run_pre_tool_use_hook` at `rust/crates/runtime/src/conversation.rs:224-241` and `run_post_tool_use_hook` at `:243-273` both branch on `if let Some(reporter) = self.hook_progress_reporter.as_mut()`: with no reporter, the hook still EXECUTES (the `None` arm passes `None` for the reporter slot to `HookRunner::run_pre_tool_use_with_context`/`run_post_tool_use_with_context`), but every `HookProgressEvent::{Started, Completed, Cancelled}` emission inside `runtime/src/hooks.rs` (lines 347, 366, 376, 388, 400) is conditioned on a `reporter.on_event(...)` call that never happens. Even in the text path where `emit_output=true`, `CliHookProgressReporter::on_event` at `main.rs:7735-7762` writes to `eprintln!` only: never to stdout, never to the JSON envelope, never to a `--output-format json`-consumable structured stream. The JSON envelope at `:4699-4724` exposes the top-level fields `message`, `model`, `iterations`, `auto_compaction`, `tool_uses`, `tool_results`, `prompt_cache_events`, `usage`, and `estimated_cost`, and zero hook evidence.
Concrete failure mode: an operator scripts `claw -p "do thing" --output-format json | jq` against a workspace whose `~/.claw/settings.json` defines a `PreToolUse` hook that **denies** specific tool patterns, **mutates** tool input (the `updated_input_json` path at `runtime/src/hooks.rs:149`), or **cancels** the turn (`is_cancelled`/`is_failed`/`is_denied` arms at `conversation.rs:409-440`). The conversation outcome reflects the hook decision (the tool may be denied with `PermissionOutcome::Deny { reason: "PreToolUse hook cancelled tool ..." }` at `conversation.rs:413-416`), and the deny reason flows into a `tool_result` synthesized at the permission layer — but the JSON envelope never records that a hook fired, what hook (`pre_tool_use`/`post_tool_use`/`pre_tool_use_failure`/`post_tool_use_failure`), what command, what exit code, what abort path, what stdout/stderr preview, or even how many hook invocations happened across the turn. The operator's only signal that a hook touched the turn is reverse-inference from a `tool_result` reason string, which is itself a #266-style untyped-prose surface. For an automation that wants to verify "the security-audit PreToolUse hook ran on every Write/Bash invocation in this turn," the JSON envelope is structurally incapable of answering — there is no `hooks: [...]` array, no `hook_invocations: N`, no `pre_tool_use_outcomes`, no per-invocation receipt. Symmetrically, when `emit_output=true` (text mode), hook events go to stderr as `[hook pre_tool_use] BashTool: pretty.sh` prose, which is human-readable but not structured and competes with model streaming text on the same terminal.
Gap. Hook execution observability is wired into the runtime (`HookProgressEvent` enum at `runtime/src/hooks.rs:40-58`, `HookProgressReporter` trait at `:59-61`, reporter call sites at `:347/366/388/400`) but only ever surfaces as opt-in stderr text, never as a structured channel that aligns with the existing `tool_uses`/`tool_results` JSON-envelope schema. The structural shape is identical to #107 (doctor-side hook subsystem opacity) but at a strictly different layer: #107 is about `claw doctor` hook-config visibility (audit-once); #280 is about per-turn `--output-format json` hook-execution visibility (per-prompt evidence). It is also adjacent to but distinct from #265 (`stream-json` mode entirely absent, so no streaming lane at all): #265 is about the missing output mode, #280 is about a missing field within the existing JSON output mode. Distinct from #260 (`--compact --output-format json` strips six fields: hooks were never one of those six in EITHER non-compact or compact envelope; this is a third structural absence, not a strip-on-compact). Distinct from #109 (config validator warnings stderr-only): different subsystem, same plumbing pattern (structured-data-relegated-to-stderr-prose). Distinct from the #259/#273/#275 provenance quartet: provenance emits state-of-record fields; #280 is per-turn execution-event evidence.
Founds NEW **`Hook-execution-event-envelope-coverage`** cluster on the per-turn-observability axis (1 member, #280 solo founder). Pairs with the silent-fallback family on the structurally-absent-evidence axis: silent-fallback accepts malformed input without a typed error; #280 accepts hook execution without a typed receipt. Pairs with #107 (doctor-side hook opacity) on the hooks-subsystem-observability axis to form a **hook-observability-pair** spanning both diagnostic surfaces (audit-once doctor + per-turn JSON envelope). Pairs with #265 (stream-json absent) on the structured-output-axis to form a **JSON-output-completeness-pair**: #265 catalogues the missing streaming output mode entirely, #280 catalogues a missing field within the one-shot JSON envelope that does exist. Extends the **complementary-pinpoint-pair-bundle** discovery-pattern: #280 forms the seventh pair-bundle (#107 + #280 hook-observability-spanning-doctor-and-runtime).
Distinct from #266 (typed-error-kind enumeration — the runtime needs typed discriminants on `RuntimeError`; #280 is about hook-execution evidence as a positive-path field, not error-kind taxonomy on the failure path). Distinct from #278 (`SESSION_VERSION` never compared on load — persistence layer; #280 is the runtime/CLI envelope layer) and from gaebal-gajae's #279 (unknown-fields silent-drop in known JSONL records — also persistence; #280 is per-turn output emission, not on-disk storage).
Required fix shape: (a) extend `run_prompt_json` envelope at `main.rs:4699-4724` with a top-level `hooks: [HookInvocationReceipt]` array where each entry is `{ event: "pre_tool_use" | "post_tool_use" | "pre_tool_use_failure" | "post_tool_use_failure", tool_name, tool_use_id, command, started_at, completed_at, duration_ms, outcome: "completed" | "cancelled" | "denied" | "failed", exit_code: Option<i32>, stdout_preview: Option<String>, stderr_preview: Option<String>, updated_input_changed: bool, permission_override: Option<String> }`; (b) add a JSON-mode `HookProgressReporter` impl that buffers `HookProgressEvent`s into a `Vec<HookInvocationReceipt>` instead of writing to stderr, attached unconditionally in `prepare_turn_runtime` (drop the `emit_output` gate at `main.rs:7726`); (c) align the `hooks` field with the existing `tool_uses`/`tool_results` array schema so consumers who already index by `tool_use_id` can correlate hook receipts to the tool they fired around; (d) extend `run_prompt_compact_json` (`main.rs:4665-4688`) to include either the full `hooks` array or a `hook_invocations: N` count so the compact envelope does not silently strip a fourth observability dimension on top of the six already documented in #260; (e) add a top-level `hooks_summary: { pre_tool_use_count, post_tool_use_count, denied_count, failed_count, cancelled_count }` for cheap aggregate consumers; (f) regression-test a workspace whose `settings.json` defines a `PreToolUse` hook that denies one tool and a `PostToolUse` hook that runs on success, asserting the `--output-format json` envelope contains exactly two hook receipts with the correct outcomes and stdout/stderr previews; (g) document the `hooks` field in `--help` and the wire-format docs alongside `tool_uses`/`tool_results`; (h) close the loop with #107 by having `claw doctor --output-format json` add a `hooks_subsystem` block describing configured hooks (audit-once) so #107 + #280 collectively close the hook-observability-spanning-doctor-and-runtime pair. Acceptance: an operator scripting `claw -p "x" --output-format json | jq '.hooks[]'` can enumerate every hook invocation that fired during the turn with its outcome, and a security-audit hook can prove it ran without scraping stderr or reverse-engineering tool_result reason strings.
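The buffering reporter in (b) could look roughly like this. The sketch is hypothetical throughout: the real `HookProgressEvent` variants carry fields not visible in this audit, the `on_event` signature is assumed, and the receipt is reduced to three fields for brevity.

```rust
/// Assumed, simplified event shape; the real enum is in runtime/src/hooks.rs.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum HookProgressEvent {
    Started { event: String, tool_name: String },
    Completed { event: String, tool_name: String },
    Cancelled { event: String, tool_name: String },
}

/// Assumed trait signature.
pub trait HookProgressReporter {
    fn on_event(&mut self, progress: &HookProgressEvent);
}

/// Reduced receipt: only the fields needed to show the buffering idea.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct HookInvocationReceipt {
    pub event: String,
    pub tool_name: String,
    /// `None` while the hook is still running.
    pub outcome: Option<&'static str>,
}

/// Collects receipts in memory instead of printing to stderr.
pub struct BufferingHookReporter {
    receipts: Vec<HookInvocationReceipt>,
}

impl BufferingHookReporter {
    pub fn new() -> Self {
        Self { receipts: Vec::new() }
    }

    pub fn receipts(&self) -> &[HookInvocationReceipt] {
        &self.receipts
    }

    /// Count finished receipts with the given outcome (for a summary block).
    pub fn count_outcome(&self, outcome: &str) -> usize {
        self.receipts
            .iter()
            .filter(|r| r.outcome.map_or(false, |o| o == outcome))
            .count()
    }

    /// Close the most recent open receipt for this hook event and tool.
    fn finish(&mut self, event: &str, tool_name: &str, outcome: &'static str) {
        if let Some(open) = self
            .receipts
            .iter_mut()
            .rev()
            .find(|r| r.outcome.is_none() && r.event == event && r.tool_name == tool_name)
        {
            open.outcome = Some(outcome);
        }
    }
}

impl HookProgressReporter for BufferingHookReporter {
    fn on_event(&mut self, progress: &HookProgressEvent) {
        match progress {
            HookProgressEvent::Started { event, tool_name } => {
                self.receipts.push(HookInvocationReceipt {
                    event: event.clone(),
                    tool_name: tool_name.clone(),
                    outcome: None,
                });
            }
            HookProgressEvent::Completed { event, tool_name } => {
                self.finish(event, tool_name, "completed");
            }
            HookProgressEvent::Cancelled { event, tool_name } => {
                self.finish(event, tool_name, "cancelled");
            }
        }
    }
}
```

A production version would also need a way to read the buffer back after the runtime takes ownership of the boxed reporter (for example a shared handle); that plumbing is left out here.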
**Status:** Open. No source code changed. Filed 2026-04-26 17:05 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `bdcf3fa` before filing (post fast-forward onto gaebal-gajae's #279 unknown-fields silent-drop pinpoint). Cluster delta: founds NEW `Hook-execution-event-envelope-coverage` cluster (1 member, #280 solo founder); pair-bundle with #107 (doctor-side hook opacity); seventh complementary-pinpoint-pair-bundle in the discovery-pattern catalogue (turn-budget #262+#264 + WebSearch #245+#250 + 4 prior pairs + hook-observability #107+#280); cross-cluster with silent-fallback family (structurally-absent-evidence axis), with #265 (JSON-output-completeness pair: missing-mode vs missing-field-in-existing-mode), and with #260 (third strip dimension on top of compact-envelope's six). Concrete delta this cycle: ROADMAP-only pinpoint appended after static audit of `prepare_turn_runtime`/`build_runtime`/`CliHookProgressReporter`/`Conversation::run_pre_tool_use_hook` showing reporter is conditionally `None` in JSON mode and the JSON envelope at `:4699-4724` carries zero hook fields. Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `bdcf3fa+#280` after push.
## Pinpoint #281 — Dogfood filing is not a two-phase transaction: a subagent can commit/push ROADMAP successfully, crash before Discord reporting, and leave the public cycle in an ambiguous half-committed state with no recovery receipt
Dogfooded 2026-04-26 17:30 KST from the live #407 recovery incident. The #407 subagent successfully filed #280, pushed commit `cf32b83` to origin/fork, then crashed during Discord posting after a gateway restart / WebSocket closure. Jobdori had to manually recover by checking origin/fork parity, reading ROADMAP to learn what #280 was, and posting a recovery message. The git state was correct, but the public coordination state was incomplete until manual repair.
Concrete failure mode: ROADMAP commit/push and channel report are currently independent side effects with no shared transaction id or durable outbox. If the agent dies between them, downstream claws see neither a guaranteed success nor a guaranteed failure: git may contain the filing, chat may not, and the next nudge may duplicate or skip the item depending on which source it trusts. This is distinct from #269 (payload chunking/delivery receipts) and #277 (channel target resolution): those cover message delivery mechanics; #281 covers the atomicity boundary between repository mutation and public report publication.
Gap. There is no dogfood filing transaction ledger with phases like `planned`, `roadmap_committed`, `pushed_origin`, `pushed_fork`, `report_posted`, `recovered`. There is no durable outbox entry containing the report body before send, no idempotency key keyed by commit SHA/pinpoint id, and no automatic recovery worker that posts the missing report after restart. Manual recovery worked only because Jobdori noticed the crash and re-read git.
Required fix shape: (a) before committing, create a durable filing transaction record with `pinpoint_id`, branch, expected commit message, report body, target channel, and idempotency key; (b) update it after commit, origin push, fork push, and Discord send, including message id(s); (c) on agent/gateway restart, scan for records stuck at `pushed_*` but not `report_posted` and publish an idempotent recovery report; (d) include the transaction id in both commit body and Discord post so duplicates can be suppressed; (e) add tests simulating crash-after-push-before-post to prove the report is recovered exactly once. Acceptance: a successful ROADMAP push can never remain silently unreported after a crash; recovery is automatic and machine-auditable.
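The recovery scan in (c) can be sketched with an in-memory ledger. Everything here is hypothetical: the struct fields, the `recover_unreported` helper, and the use of a `HashSet` of idempotency keys as a stand-in for "what the channel has already seen". Only the phase names come from the Gap paragraph above.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FilingPhase {
    Planned,
    RoadmapCommitted,
    PushedOrigin,
    PushedFork,
    ReportPosted,
    Recovered,
}

/// Hypothetical durable record, written before the commit is made.
#[derive(Debug, Clone)]
pub struct FilingTransaction {
    pub pinpoint_id: u32,
    pub idempotency_key: String,
    pub report_body: String,
    pub phase: FilingPhase,
}

/// On restart, post reports for records stuck at a pushed phase.
/// `posted_keys` suppresses duplicates; returns how many reports were sent.
pub fn recover_unreported<F>(
    ledger: &mut [FilingTransaction],
    posted_keys: &mut HashSet<String>,
    mut post: F,
) -> usize
where
    F: FnMut(&FilingTransaction),
{
    let mut recovered = 0;
    for tx in ledger.iter_mut() {
        let stuck = matches!(tx.phase, FilingPhase::PushedOrigin | FilingPhase::PushedFork);
        if !stuck {
            continue;
        }
        // `insert` returns false when the key was already posted.
        if posted_keys.insert(tx.idempotency_key.clone()) {
            post(&*tx);
            recovered += 1;
        }
        tx.phase = FilingPhase::Recovered;
    }
    recovered
}
```

Running the scan twice over the same ledger posts each stuck report once, which is the exactly-once property the crash-after-push-before-post test in (e) is meant to prove.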
**Status:** Open. No source code changed. Filed 2026-04-26 17:32 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `cf32b83` before filing. Cluster delta: dogfood-filing-transactionality +1; sibling to #269/#277 delivery-layer gaps, distinct git↔chat two-phase atomicity layer. Concrete delta this cycle: ROADMAP-only pinpoint appended from #407 crash-after-push-before-post evidence.
## Pinpoint #282 — `--cwd` flag is parsed only by the `system-prompt` subcommand; the primary `claw -p` runtime dispatch path has no `--cwd` override and silently uses the process's `env::current_dir()` for tool execution, bash, file_ops, and git_context
Dogfooded 2026-04-26 17:35 KST. Static audit of `rust/crates/rusty-claude-cli/src/main.rs` shows exactly two occurrences of the literal string `"--cwd"`: one in `parse_system_prompt_args` at `:1950` (for `claw system-prompt --cwd <path>` only) and one inside a test fixture at `:10307` exercising that same subcommand. The 25+ other call sites that resolve a working directory all call `env::current_dir()` unconditionally (`:501, :514, :553, :1820, :1834, :1888, :1898, :2263, :2327, :3354, :3572, :3607, :3628, :3638, :3671, :3687, :3812, :3889, :3946, :4513, :4978, :5177, :5198, …`), meaning `claw -p "x"`, `claw run-prompt`, `claw query-tools`, `claw doctor`, `claw render-diff` and friends all operate on whatever directory the shell was in when the binary launched, with no flag to override.
Gap. Operators wrapping claw-code in dispatchers, schedulers, IDE agents, MCP brokers, and per-task worktree harnesses (the exact shape of the dogfood loop, of `oh-my-opencode`, of CI runners, of subagent spawners) cannot point a single long-lived claw process at different working directories per invocation. They have to either `cd` in a wrapping shell (which is racy across concurrent invocations of one process and impossible across threads of one process) or spawn a fresh process with `Command::current_dir`. The official upstream Claude Code CLI documents `--cwd` as a top-level flag (`https://docs.anthropic.com/en/docs/agents-and-tools/claude-code/cli-reference`) so dispatchers built against upstream silently drop the directory hint when pointed at claw-code, and the bash tool, file_ops, git_context, hook execution, and session save/load all anchor on the wrong directory with no error and no warning. PARITY.md does not list `--cwd` in its supported-flags table, but the `system-prompt` subcommand's local `--cwd` parser creates a misleading half-implementation that hides the runtime-path gap from grep-based audits.
Distinct from #149 (working-directory permission policy), #178 (allow-broad-cwd), and #277 (channel target resolution): those govern whether a cwd is permitted, whether broad cwds are gated, and where a report is sent. #282 is upstream of all three — without a `--cwd` flag on the runtime path, the policy never receives an explicit caller-supplied directory to evaluate; it only sees the inherited process cwd. Distinct from MCP-axis (#254/#268/#274/#275) because `--cwd` governs the host-side process's filesystem anchor before any MCP server is contacted.
Required fix shape: (a) extend the global argument parser at `:824-1264` (where `allow_broad_cwd`, `--output-format`, `--date` and friends are wired) to recognize a top-level `--cwd <path>` flag that is canonicalized once and threaded into every command variant (`Run`, `Query`, `Doctor`, `RenderDiff`, `RunPrompt`, `RunPromptJson`, etc.) via a single `cwd: Option<PathBuf>` field on the dispatch struct; (b) replace the 25+ raw `env::current_dir()` calls inside the runtime crate with a `resolve_cwd(global_cwd_override)` helper that prefers the explicit override and falls back to `env::current_dir()` only when none was supplied; (c) audit `bash.rs`, `file_ops.rs`, `git_context.rs`, `hooks.rs`, `branch_lock.rs`, and `compact.rs` so each receives the resolved cwd as a parameter rather than re-querying `env::current_dir()` on its own; (d) reuse `enforce_broad_cwd_policy` against the explicit override so #178's policy gate triggers on caller-supplied paths; (e) add an integration test running `claw -p "pwd via bash tool" --cwd /tmp/scratch` from a different process cwd and asserting the bash tool's `pwd` output is `/tmp/scratch`, not the launcher's directory; (f) document `--cwd` in `--help`, `claw doctor --output-format json`'s capability block, and PARITY.md's supported-flags table so the upstream gap is no longer silent. Acceptance: a long-lived claw process or per-task dispatcher can point each `claw -p` invocation at a distinct working directory by flag without spawning a new OS process or mutating the parent process's cwd.
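A minimal sketch of the `resolve_cwd` helper from (b), assuming it takes the override as an `Option<&Path>` and canonicalizes it. The directory check and the error message are additions for illustration, not behavior confirmed by the audit.

```rust
use std::env;
use std::io;
use std::path::{Path, PathBuf};

/// Prefer the caller-supplied `--cwd` override; fall back to the process cwd
/// only when no override was given.
pub fn resolve_cwd(global_cwd_override: Option<&Path>) -> io::Result<PathBuf> {
    match global_cwd_override {
        Some(path) => {
            // Canonicalize once so downstream tools all see the same absolute path.
            let canonical = path.canonicalize()?;
            if canonical.is_dir() {
                Ok(canonical)
            } else {
                // Assumed behavior: a file path is rejected rather than silently used.
                Err(io::Error::new(
                    io::ErrorKind::InvalidInput,
                    format!("--cwd is not a directory: {}", canonical.display()),
                ))
            }
        }
        None => env::current_dir(),
    }
}
```

Because the override fails loudly on a missing or non-directory path, a dispatcher that passes a bad `--cwd` gets an error instead of tools running in the launcher's directory.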
**Status:** Open. No source code changed. Filed 2026-04-26 17:38 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `db27ac2` before filing (post fast-forward onto gaebal-gajae's #281 dogfood-filing-transactionality pinpoint). Cluster delta: founds NEW `cwd-flag-runtime-dispatch-gap` cluster (1 member, #282 solo founder); cross-cluster with silent-fallback family (claw-code silently inherits parent process cwd instead of erroring on missing `--cwd`), with PARITY.md half-implementation pattern, and with #178/#149 cwd-policy-gate axis (upstream of policy enforcement). Concrete delta this cycle: ROADMAP-only pinpoint appended after static audit confirmed exactly two `"--cwd"` literal occurrences (one in `parse_system_prompt_args`, one in a test fixture), zero on the primary runtime dispatch path. Concurrent-dogfood-rebase parity will be confirmed local==origin==fork at HEAD `db27ac2+#282` after push.
## Pinpoint #283 — `auto_compaction_input_tokens_threshold` is only settable via environment variable `CLAUDE_CODE_AUTO_COMPACT_INPUT_TOKENS`; no config-file key and no CLI flag expose it, so the 100 000-token default is a silent constant from an operator's perspective
Static audit of `rust/crates/runtime/src/conversation.rs`. `new_with_features` (`:166-189`) calls `auto_compaction_threshold_from_env()` unconditionally at construction time; the builder method `with_auto_compaction_input_tokens_threshold` (`:198-201`) exists but is never called in the CLI dispatch path — `grep -n "with_auto_compaction_input_tokens_threshold" rust/crates/rusty-claude-cli/src/main.rs` returns zero results. `auto_compaction_threshold_from_env` reads `CLAUDE_CODE_AUTO_COMPACT_INPUT_TOKENS` from the process environment (`:690-697`); if absent or unparseable it falls back to `DEFAULT_AUTO_COMPACTION_INPUT_TOKENS_THRESHOLD = 100_000` (`:18, :703`). `RuntimeFeatureConfig` (`:56-68` of `config.rs`) has no `compaction_threshold` field; the config loader at `:300-340` of `config.rs` never attempts to populate one. `build_runtime_with_plugin_state` (`:7680-7740` of `main.rs`) builds `ConversationRuntime::new_with_features` from `feature_config` and never calls the builder method afterward.
Gap. An operator who wants to raise or lower the compaction threshold for a project (e.g., a repo with a large context that should compact at 200 000 tokens, or a tight CI harness that should compact at 50 000 tokens) has three choices: (a) set the env var before every invocation, which is fragile across wrappers that launch new processes without inheriting the caller's env; (b) live with 100 000, which may be wrong for the model or context size; (c) compile a custom binary. No `settings.json` key, no `.clawconfig` field, no `--compaction-threshold` CLI flag. The builder method proves the design allows per-runtime override but the CLI path never routes any input to it. Distinct from #282 (`--cwd` gap): #282 is about filesystem context; #283 is about conversation compaction policy. Distinct from #109/ConfigValidator: no validation failure occurs; the default simply fires silently.
Required fix shape: (a) add `auto_compaction_threshold: Option<u32>` to `RuntimeFeatureConfig` (`:56` of `config.rs`); (b) populate it from a `settings.json`/`.clawrc` key (e.g., `autoCompactionInputTokensThreshold`) in the config loader alongside existing feature flags; (c) add a top-level `--compaction-threshold <N>` CLI flag in the global arg parser, parsed into `CliArgs`; (d) in `build_runtime_with_plugin_state`, call `.with_auto_compaction_input_tokens_threshold(...)` with precedence: CLI flag > config file > env var > compiled default; (e) surface the resolved threshold in `claw doctor --output-format json` under a `compaction` block so operators can inspect which source won; (f) validate in `ConfigValidator` that the threshold is a positive integer and warn on values under 10 000 (probable misconfiguration). Acceptance: `claw -p "x" --compaction-threshold 200000` uses 200 000; a `settings.json` with `"autoCompactionInputTokensThreshold": 150000` uses 150 000; the env var still overrides the compiled default but not the config file or the CLI flag; `claw doctor` shows `compaction.threshold_source` as one of `cli`, `config`, `env`, `default`.
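The precedence in (d) can be sketched as a pure function that also reports which source won, which is what the `compaction.threshold_source` field in (e) would need. `ThresholdSource` and `resolve_compaction_threshold` are hypothetical names, and treating zero as "unset" is an assumption drawn from the positive-integer rule in (f).

```rust
pub const DEFAULT_AUTO_COMPACTION_INPUT_TOKENS_THRESHOLD: u32 = 100_000;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ThresholdSource {
    Cli,
    Config,
    Env,
    Default,
}

/// Resolve the threshold with precedence CLI flag > config file > env var > default.
/// `env_raw` is the raw value of CLAUDE_CODE_AUTO_COMPACT_INPUT_TOKENS, if set.
pub fn resolve_compaction_threshold(
    cli: Option<u32>,
    config: Option<u32>,
    env_raw: Option<&str>,
) -> (u32, ThresholdSource) {
    // Assumption: zero is not a usable threshold and falls through to the next source.
    let positive = |value: &u32| *value > 0;
    // An unparseable env value is ignored, matching the fallback described in the audit.
    let env = env_raw.and_then(|raw| raw.trim().parse::<u32>().ok());

    if let Some(value) = cli.filter(positive) {
        return (value, ThresholdSource::Cli);
    }
    if let Some(value) = config.filter(positive) {
        return (value, ThresholdSource::Config);
    }
    if let Some(value) = env.filter(positive) {
        return (value, ThresholdSource::Env);
    }
    (
        DEFAULT_AUTO_COMPACTION_INPUT_TOKENS_THRESHOLD,
        ThresholdSource::Default,
    )
}
```

Keeping the resolution free of I/O means the CLI, the config loader, and `claw doctor` could all call the same function and agree on the winning source.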
**Status:** Open. No source code changed. Filed 2026-04-26 18:00 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `b05561c` (post-rebase onto origin/main, #282 already present). Cluster delta: extends `silent-fallback-family` (threshold is silently inherited from env-only path, config/CLI lanes absent); builder method `with_auto_compaction_input_tokens_threshold` present but unreachable from CLI path. Concrete delta this cycle: ROADMAP-only pinpoint appended after static audit confirmed zero calls to `with_auto_compaction_input_tokens_threshold` in `main.rs`, no compaction field in `RuntimeFeatureConfig`, and env-var as the sole runtime override. Concurrent-dogfood-rebase parity will be confirmed local==origin==fork after push.
## Pinpoint #284 — `/ultraplan` is documented as deep multi-step planning but the REPL implementation only prints a static three-line placeholder, so users complaining it is difficult to use are hitting a contract/behavior void rather than a UX copy issue
Dogfooded 2026-04-26 18:24 KST after Sigrid reported that users are complaining `ultraplan` is difficult to use. Static audit found the documented contract in `USAGE.md`: `/ultraplan [task]` is described as "Deep planning with multi-step reasoning" and promises "a structured plan with numbered steps, reasoning for each step, and expected outcomes." The actual REPL dispatch path in `rust/crates/rusty-claude-cli/src/main.rs` routes `SlashCommand::Ultraplan { task }` to `self.run_ultraplan(task.as_deref())`, and `run_ultraplan` only executes `println!("{}", format_ultraplan_report(task));`. `format_ultraplan_report` returns a static three-line report: `Task`, `Action break work into a multi-step execution plan`, and `Output plan should cover goals, risks, sequencing, verification, and rollback`. No internal prompt is run, no planning tool is invoked, no progress reporter is attached, no persisted plan artifact is created, and no numbered plan is produced.
Concrete failure mode: a user runs `/ultraplan refactor auth`, expecting an actual deep plan because docs and help say so, but receives meta-instructions about what a plan should contain. The command looks successful yet produces no usable plan. This explains "difficult to use" complaints better than a wording issue: the command is discoverable and documented, but its behavior is a placeholder masquerading as a product feature.
Gap. There is no product contract boundary separating implemented slash commands from scaffold/stub commands for `/ultraplan`. `STUB_COMMANDS` filters some unimplemented commands from help/completion, but `/ultraplan` is not filtered because it has a handler; the handler is still functionally stubbed. This is distinct from #280 hook-envelope opacity and #283 hidden env-only config: #284 is a user-facing slash-command promise vs runtime behavior gap.
Required fix shape: choose one of two explicit paths. Product path: implement `/ultraplan` by calling the internal prompt/runtime with an `InternalPromptProgressReporter::ultraplan`, persist the generated plan in session state, include numbered steps/risks/verification/rollback, and support JSON/resume behavior if appropriate. Honesty path: demote `/ultraplan` to a clearly labeled placeholder/stub, remove the "deep planning" promise from USAGE/help, and point users to the actual planning workflow. Acceptance: running `/ultraplan <task>` either returns a real structured plan or clearly refuses as not implemented; it must not succeed with a static meta-template.
**Status:** Open. No source code changed. Filed 2026-04-26 18:25 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `1a7b8ea` before filing. Cluster delta: slash-command-contract-vs-runtime +1; concrete user-signal source: Sigrid report of `ultraplan` usability complaints. Concrete delta this cycle: ROADMAP-only follow-up appended from docs/code audit.
## Pinpoint #285 — Provider/model/websearch selection is split across hard-coded registries and env vars instead of a single settings-file contract, blocking user-requested multi-provider/multi-model config and swappable search backends
Dogfooded 2026-04-26 18:27 KST from Sigrid's channel request to make search engine selection and multi-provider/multi-model declarations configurable in settings. Static audit shows the runtime settings layer only parses a single optional `model`, `aliases`, permission settings, MCP, plugins, sandbox, OAuth, provider fallbacks, and trusted roots in `rust/crates/runtime/src/config.rs`. There is no structured `providers`, `models`, or `websearch` config section. Provider routing is still mostly code/env driven: `rust/crates/api/src/providers/mod.rs` has a hard-coded `MODEL_REGISTRY`, prefix checks (`claude`, `grok`, `openai/`, `qwen`, `kimi`), and env-var based base URL/auth resolution. `ProviderClient::from_model` dispatches from the model string and env metadata rather than a loaded provider graph. `WebSearch` ignores runtime settings entirely: `build_search_url` uses `CLAWD_WEB_SEARCH_BASE_URL` if set, otherwise DuckDuckGo HTML search.
Concrete failure mode: a user wants one settings file to declare `providers.lmstudio = { type: "openai", url: "http://.../v1" }`, `models[] = { name, provider, maxContext }`, default `model`, default permission mode, and `websearch = { provider: "tavily", apiKey: ... }`. Today the model name can be set, but the provider endpoint/auth/model metadata/search backend cannot be expressed as first-class config. Users must rely on global env vars, hard-coded model prefix heuristics, and hidden DuckDuckGo/base-url behavior, which makes local LM Studio/vLLM/Ollama, hosted OpenAI-compatible providers, and Tavily/Brave/search-provider swaps difficult to reason about and impossible to inspect via `claw doctor` as one coherent source of truth.
Gap. Claw Code lacks a declarative provider graph and websearch backend contract in `settings.json`. This is distinct from #283 (one compaction threshold only env-settable): #285 is the broader provider/search capability plane. It also intersects with #273/#275 provenance because `status`/`doctor` cannot report the real provider source-of-truth if it lives partly in env and partly in model-prefix code.
Required fix shape: (a) add schema-backed settings sections for `providers`, `models`, and `websearch` with safe secret handling (support env indirection for API keys instead of encouraging raw key commits); (b) define precedence `CLI > local/project/user config > env > built-in defaults`; (c) make `ProviderClient` resolve from the merged config graph, including custom OpenAI-compatible base URLs, auth env/key refs, max context, max output, and reasoning/tool quirks; (d) make `WebSearch` dispatch through configured providers such as DuckDuckGo, Tavily, Brave, or custom base URL; (e) surface the resolved provider/model/search backend in `claw status --output-format json` and `claw doctor`; (f) add tests for LM Studio-style OpenAI-compatible config, multi-model selection, and Tavily-style search backend config without leaking raw API keys in output. Acceptance: the user-requested provider/model/search shape can be placed in settings, resolved deterministically, and audited without relying on undocumented env-only behavior.
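The precedence rule in (b) and the env-indirection idea in (a) can be sketched as below. This is a minimal illustration, not code from the repository: `Source`, `SecretRef`, and `resolve_setting` are hypothetical names, and the environment is passed in as a plain map so the sketch stays deterministic.

```rust
use std::collections::HashMap;

/// Where a resolved value came from, so a status/doctor surface could report it.
/// (Hypothetical type; not part of the current codebase.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Source {
    Cli,
    Config,
    Env,
    BuiltIn,
}

/// A secret stored as a reference to an environment variable, never as a raw key.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SecretRef {
    pub env_var: String,
}

impl SecretRef {
    /// Look the key up at use time. The map stands in for the process environment.
    pub fn resolve(&self, env: &HashMap<String, String>) -> Option<String> {
        env.get(&self.env_var).cloned()
    }

    /// Rendering that is safe for status output: shows the reference, not the secret.
    pub fn redacted(&self) -> String {
        format!("env:{}", self.env_var)
    }
}

/// Pick the first present value in the order CLI > config > env > built-in default,
/// and report which layer supplied it.
pub fn resolve_setting(
    cli: Option<&str>,
    config: Option<&str>,
    env: Option<&str>,
    built_in: &str,
) -> (String, Source) {
    if let Some(value) = cli {
        return (value.to_string(), Source::Cli);
    }
    if let Some(value) = config {
        return (value.to_string(), Source::Config);
    }
    if let Some(value) = env {
        return (value.to_string(), Source::Env);
    }
    (built_in.to_string(), Source::BuiltIn)
}
```

Returning the `Source` alongside the value is one way to let `claw status` and `claw doctor` report the provenance of each resolved field without printing the secret itself.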
**Status:** Open. No source code changed. Filed 2026-04-26 18:28 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `92a598e` before filing. Cluster delta: declarative-provider-websearch-config +1; concrete user-signal source: Sigrid request in #clawcode-building-in-public. Concrete delta this cycle: ROADMAP-only follow-up appended from config/provider/websearch audit.
## Pinpoint #286 — Parallel `Agent` execution can leave forever-running manifests because background thread lifecycle is not durable across process/gateway death and has no heartbeat/stale reaper
Dogfooded 2026-04-26 18:32 KST after Sigrid requested heavy dogfooding around parallel execution and async execution because users report mistakes there. Static audit of `rust/crates/tools/src/lib.rs` shows `execute_agent` writes an `AgentOutput` manifest with `status: "running"`, `derivedState: "working"`, and a `lane.started` event, then calls `spawn_agent_job`. `spawn_agent_job` launches a detached `std::thread::Builder::spawn` closure and immediately returns `Ok(())`; the `JoinHandle` is discarded. The only transition out of `running` happens inside the in-process thread via `run_agent_job` → `persist_agent_terminal_state(..., "completed"|"failed")`, or if spawn itself fails before the thread starts.
Concrete failure mode: if the parent process/gateway crashes, restarts, OOMs, or is killed after the `running` manifest is written but before the detached thread persists terminal state, the manifest remains `running` forever. There is no durable job queue, PID/thread identity, heartbeat timestamp, lease, resume record, or stale reaper. `derive_agent_state("running", ..)` always returns `working`, so downstream parallel/team coordination sees the lane as active rather than `orphaned`, `lost`, or `needs_recovery`. This is exactly the class of parallel/async mistake users notice: a lane looks alive because a JSON file says `running`, not because any worker is actually executing.
Gap. Agent parallelism has a fire-and-forget in-process thread model but reports as durable background execution. Tests cover spawn failure and fake completion/failure, but they do not simulate crash-after-running-manifest-before-terminal-state, dropped `JoinHandle`, process restart, stale heartbeat, or reaper classification. This is distinct from #281 dogfood git↔Discord transactionality: #286 is runtime lane lifecycle durability for parallel worker execution.
Required fix shape: (a) persist a durable agent job record with `agent_id`, owner process id/start time, heartbeat timestamp, and phase before spawning; (b) either retain/track `JoinHandle`s in a supervisor or move execution to a durable worker queue; (c) update heartbeat during long `run_turn` execution; (d) on startup/tool access, scan manifests stuck in `running` beyond a lease and classify them as `orphaned_worker` / `needs_recovery` instead of `working`; (e) expose stale/orphaned lane state in Agent/Team status and lane events; (f) regression-test crash-after-manifest-before-terminal-state by creating a running manifest with stale heartbeat and verifying the reaper emits a typed blocker. Acceptance: a parallel Agent lane cannot remain silently `running` forever after its executor disappears.
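The lease check in (d) can be sketched as below, assuming a job record that carries a heartbeat timestamp. `AgentJobRecord`, `LaneState`, and `classify` are hypothetical names for illustration; the real manifest fields and state names may differ.

```rust
use std::time::{Duration, SystemTime};

/// Derived lane state as a reaper might report it. (Hypothetical enum.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LaneState {
    Working,
    OrphanedWorker,
    Completed,
    Failed,
}

/// Minimal durable job record: the persisted status plus the last heartbeat.
pub struct AgentJobRecord {
    pub agent_id: String,
    pub status: String,
    pub last_heartbeat: SystemTime,
}

/// Classify a record at time `now`. A non-terminal record whose heartbeat is
/// older than `lease` is treated as orphaned instead of working.
pub fn classify(record: &AgentJobRecord, now: SystemTime, lease: Duration) -> LaneState {
    match record.status.as_str() {
        "completed" => LaneState::Completed,
        "failed" => LaneState::Failed,
        // Any other status is treated as in-flight in this sketch.
        _ => {
            // `duration_since` fails if the heartbeat is in the future
            // (clock skew); treat that case as a fresh heartbeat.
            let age = now
                .duration_since(record.last_heartbeat)
                .unwrap_or(Duration::ZERO);
            if age > lease {
                LaneState::OrphanedWorker
            } else {
                LaneState::Working
            }
        }
    }
}
```

Passing `now` in as a parameter keeps the classification pure, which makes the crash-after-manifest regression in (f) expressible as a plain unit test over a record with a stale heartbeat.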
**Status:** Open. No source code changed. Filed 2026-04-26 18:33 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `639e1e3` before filing. Cluster delta: parallel-agent-lifecycle-durability +1; concrete user-signal source: Sigrid request to dogfood parallel/async execution mistakes. Concrete delta this cycle: ROADMAP-only pinpoint appended from Agent spawn/lifecycle audit.
## Pinpoint #287 — Auto-compaction is reactive-after-success instead of preflight-before-request, so oversized resumed sessions can hit context-window failure and “session broke / auto-compact did not work” before compaction ever runs
Dogfooded 2026-04-26 18:38 KST after Sigrid reported frequent session breakage where sessions are not maintained and auto-compaction does not appear to work. Static audit of `rust/crates/runtime/src/conversation.rs` shows `run_turn` calls `maybe_auto_compact()` only after the assistant/tool loop completes successfully and after provider usage has been recorded. `maybe_auto_compact` checks `self.usage_tracker.cumulative_usage().input_tokens` against `auto_compaction_input_tokens_threshold`; that usage is reconstructed from prior assistant message usage and updated from successful provider events, not from a preflight estimate of the prompt/session that is about to be sent. If the next request is already too large and the provider returns `context_window_blocked` before a successful usage event, `maybe_auto_compact` is never reached. CLI error formatting then tells the user to run `/compact` manually, which is exactly the visible failure mode: session continuity breaks first, auto-compact never fires.
Concrete failure mode: a long/resumed session grows near or beyond model context. The next turn is sent without preflight compaction because current auto-compaction is only post-turn. The provider rejects the request for context window size, `run_turn` returns `Err`, the runtime shuts down plugins, and no compaction is persisted. The user sees a broken session/context-window error and must manually recover with `/compact`, despite auto-compaction being advertised as protecting long sessions.
Gap. Auto-compaction lacks a pre-request guard based on `estimate_session_tokens(&session) + estimated_new_prompt_tokens + requested_output_tokens` and lacks a retry path that compacts and resends after a typed context-window failure. This is distinct from #283 (threshold config is env-only): #287 is the timing/trigger semantics that make auto-compaction fail in the exact oversized-session case users expect it to handle. It also intersects with session-maintenance complaints because failed turns do not persist a compacted recovery state.
Required fix shape: (a) add a preflight auto-compact phase before provider dispatch using estimated session/request size and model context metadata; (b) include the threshold, estimated session tokens, estimated request tokens, and context window in a typed `auto_compaction_preflight` event/status surface; (c) after `context_window_blocked`, optionally run a safe compact-and-retry once, with an explicit receipt; (d) persist the compacted session before retry so session continuity is recoverable even if the retry fails; (e) surface whether compaction was skipped because the session was below threshold, no messages were removable, or compaction would not fit; (f) add regression coverage where a resumed oversized session compacts before request and does not hit provider context-window rejection first. Acceptance: an oversized maintained session gets compacted or fails with a typed “not compactable” reason before provider context-window failure, never with silent “auto-compact did not run.”
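The preflight decision in (a) and the typed outcomes in (e) can be sketched as below. This is an illustration under stated assumptions: `PreflightInput`, `PreflightDecision`, and `preflight` are hypothetical names, the token counts are assumed to come from an estimator such as the `estimate_session_tokens` mentioned above, and the limit used here is the raw context window rather than a configured threshold.

```rust
/// Outcome of the pre-request guard. (Hypothetical enum.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PreflightDecision {
    /// The projected request fits; dispatch as-is.
    Send,
    /// The request does not fit now, but would after compaction.
    CompactFirst,
    /// Even after removing everything removable, the request would not fit.
    NotCompactable,
}

/// Inputs to the guard, all in tokens. Values are estimates supplied by the caller.
pub struct PreflightInput {
    pub estimated_session_tokens: u64,
    pub estimated_new_prompt_tokens: u64,
    pub requested_output_tokens: u64,
    pub context_window: u64,
    /// How many session tokens compaction is expected to free.
    pub removable_session_tokens: u64,
}

/// Decide before provider dispatch whether to send, compact first, or fail
/// with a typed "not compactable" reason.
pub fn preflight(input: &PreflightInput) -> PreflightDecision {
    // Saturating arithmetic avoids overflow on pathological estimates.
    let projected = input
        .estimated_session_tokens
        .saturating_add(input.estimated_new_prompt_tokens)
        .saturating_add(input.requested_output_tokens);
    if projected <= input.context_window {
        return PreflightDecision::Send;
    }
    let after_compaction = projected.saturating_sub(input.removable_session_tokens);
    if after_compaction <= input.context_window {
        PreflightDecision::CompactFirst
    } else {
        PreflightDecision::NotCompactable
    }
}
```

A real guard would likely compare against `auto_compaction_input_tokens_threshold` as well as the model's context window; the sketch collapses both into one limit to keep the decision logic visible.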
**Status:** Open. No source code changed. Filed 2026-04-26 18:39 KST. Branch: feat/jobdori-168c-emission-routing. HEAD: `79eeaae` before filing. Cluster delta: session-continuity-auto-compaction-semantics +1; concrete user-signal source: Sigrid report of frequent session breakage and auto-compaction not working. Concrete delta this cycle: ROADMAP-only pinpoint appended from auto-compaction trigger audit.


@@ -9192,136 +9192,44 @@ fn permission_policy(
}
fn convert_messages(messages: &[ConversationMessage]) -> Vec<InputMessage> {
let mut converted = Vec::new();
let mut index = 0;
while index < messages.len() {
let message = &messages[index];
match message.role {
MessageRole::Assistant => {
let tool_use_ids = message
.blocks
.iter()
.filter_map(|block| match block {
ContentBlock::ToolUse { id, .. } => Some(id.clone()),
_ => None,
})
.collect::<Vec<_>>();
let (tool_result_blocks, next_index) = if tool_use_ids.is_empty() {
(Vec::new(), index + 1)
} else {
collect_immediate_tool_results(messages, index + 1)
};
let has_all_tool_results = !tool_use_ids.is_empty()
&& tool_use_ids.iter().all(|id| {
tool_result_blocks.iter().any(|block| {
matches!(block, InputContentBlock::ToolResult { tool_use_id, .. } if tool_use_id == id)
})
});
let paired_tool_result_blocks = if has_all_tool_results {
tool_result_blocks
.into_iter()
.filter(|block| {
matches!(block, InputContentBlock::ToolResult { tool_use_id, .. } if tool_use_ids.contains(tool_use_id))
})
.collect::<Vec<_>>()
} else {
Vec::new()
};
let content = message
.blocks
.iter()
.filter_map(|block| match block {
ContentBlock::Text { text } => Some(InputContentBlock::Text {
text: text.clone(),
}),
ContentBlock::ToolUse { id, name, input } if has_all_tool_results => {
Some(InputContentBlock::ToolUse {
id: id.clone(),
name: name.clone(),
input: serde_json::from_str(input)
.unwrap_or_else(|_| serde_json::json!({ "raw": input })),
})
}
ContentBlock::ToolUse { .. } | ContentBlock::ToolResult { .. } => None,
})
.collect::<Vec<_>>();
if !content.is_empty() {
converted.push(InputMessage {
role: "assistant".to_string(),
content,
});
}
if has_all_tool_results && !paired_tool_result_blocks.is_empty() {
converted.push(InputMessage {
role: "user".to_string(),
content: paired_tool_result_blocks,
});
index = next_index;
} else {
index += 1;
}
}
MessageRole::Tool => {
// Anthropic requires tool_result blocks to appear in the user message
// immediately following their assistant tool_use. A bare Tool-role
// message here is orphaned (for example after a resume/edit/compaction
// boundary) and would be rejected with a provider 400.
index += 1;
}
MessageRole::System | MessageRole::User => {
let content = message
.blocks
.iter()
.filter_map(|block| match block {
ContentBlock::Text { text } => Some(InputContentBlock::Text {
text: text.clone(),
}),
ContentBlock::ToolUse { .. } | ContentBlock::ToolResult { .. } => None,
})
.collect::<Vec<_>>();
if !content.is_empty() {
converted.push(InputMessage {
role: "user".to_string(),
content,
});
}
index += 1;
}
}
}
converted
}
fn collect_immediate_tool_results(
messages: &[ConversationMessage],
start: usize,
) -> (Vec<InputContentBlock>, usize) {
let mut blocks = Vec::new();
let mut index = start;
while let Some(message) = messages.get(index) {
if message.role != MessageRole::Tool {
break;
}
blocks.extend(message.blocks.iter().filter_map(|block| match block {
ContentBlock::ToolResult {
tool_use_id,
output,
is_error,
..
} => Some(InputContentBlock::ToolResult {
tool_use_id: tool_use_id.clone(),
content: vec![ToolResultContentBlock::Text {
text: output.clone(),
}],
is_error: *is_error,
}),
ContentBlock::Text { .. } | ContentBlock::ToolUse { .. } => None,
}));
index += 1;
}
(blocks, index)
messages
.iter()
.filter_map(|message| {
let role = match message.role {
MessageRole::System | MessageRole::User | MessageRole::Tool => "user",
MessageRole::Assistant => "assistant",
};
let content = message
.blocks
.iter()
.map(|block| match block {
ContentBlock::Text { text } => InputContentBlock::Text { text: text.clone() },
ContentBlock::ToolUse { id, name, input } => InputContentBlock::ToolUse {
id: id.clone(),
name: name.clone(),
input: serde_json::from_str(input)
.unwrap_or_else(|_| serde_json::json!({ "raw": input })),
},
ContentBlock::ToolResult {
tool_use_id,
output,
is_error,
..
} => InputContentBlock::ToolResult {
tool_use_id: tool_use_id.clone(),
content: vec![ToolResultContentBlock::Text {
text: output.clone(),
}],
is_error: *is_error,
},
})
.collect::<Vec<_>>();
(!content.is_empty()).then(|| InputMessage {
role: role.to_string(),
content,
})
})
.collect()
}
#[allow(clippy::too_many_lines)]
@@ -9525,7 +9433,7 @@ mod tests {
PromptHistoryEntry, SlashCommand, StatusUsage, DEFAULT_MODEL, LATEST_SESSION_REFERENCE,
STUB_COMMANDS,
};
use api::{ApiError, InputContentBlock, MessageResponse, OutputContentBlock, Usage};
use api::{ApiError, MessageResponse, OutputContentBlock, Usage};
use plugins::{
PluginManager, PluginManagerConfig, PluginTool, PluginToolDefinition, PluginToolPermission,
};
@@ -12992,93 +12900,6 @@ UU conflicted.rs",
assert_eq!(converted[1].role, "assistant");
assert_eq!(converted[2].role, "user");
}
#[test]
fn converts_parallel_tool_results_into_immediate_single_user_message_256() {
let messages = vec![
ConversationMessage::assistant(vec![
ContentBlock::ToolUse {
id: "tool-1".to_string(),
name: "read".to_string(),
input: "{\"path\":\"a\"}".to_string(),
},
ContentBlock::ToolUse {
id: "tool-2".to_string(),
name: "read".to_string(),
input: "{\"path\":\"b\"}".to_string(),
},
]),
ConversationMessage::tool_result(
"tool-1".to_string(),
"read".to_string(),
"a".to_string(),
false,
),
ConversationMessage::tool_result(
"tool-2".to_string(),
"read".to_string(),
"b".to_string(),
false,
),
];
let converted = super::convert_messages(&messages);
assert_eq!(converted.len(), 2);
assert_eq!(converted[0].role, "assistant");
assert_eq!(converted[1].role, "user");
assert!(matches!(
converted[0].content.as_slice(),
[
InputContentBlock::ToolUse { id: id1, .. },
InputContentBlock::ToolUse { id: id2, .. }
] if id1 == "tool-1" && id2 == "tool-2"
));
assert!(matches!(
converted[1].content.as_slice(),
[
InputContentBlock::ToolResult { tool_use_id: id1, .. },
InputContentBlock::ToolResult { tool_use_id: id2, .. }
] if id1 == "tool-1" && id2 == "tool-2"
));
}
#[test]
fn drops_orphan_tool_use_and_tool_result_before_anthropic_dispatch_256() {
let messages = vec![
ConversationMessage::assistant(vec![
ContentBlock::Text {
text: "before tool".to_string(),
},
ContentBlock::ToolUse {
id: "orphan".to_string(),
name: "bash".to_string(),
input: "{\"command\":\"pwd\"}".to_string(),
},
]),
ConversationMessage::user_text("resume prompt"),
ConversationMessage::tool_result(
"orphan".to_string(),
"bash".to_string(),
"late".to_string(),
false,
),
];
let converted = super::convert_messages(&messages);
assert_eq!(converted.len(), 2);
assert_eq!(converted[0].role, "assistant");
assert!(matches!(
converted[0].content.as_slice(),
[InputContentBlock::Text { text }] if text == "before tool"
));
assert_eq!(converted[1].role, "user");
assert!(matches!(
converted[1].content.as_slice(),
[InputContentBlock::Text { text }] if text == "resume prompt"
));
}
#[test]
fn repl_help_mentions_history_completion_and_multiline() {
let help = render_repl_help();