Comparing dba4f281f0..0121f20a09 - claw-code - Spatulaa Git

MayaTheShy/claw-code

mirror of https://github.com/instructkr/claw-code.git synced 2026-06-11 11:19:28 -04:00

Author	SHA1	Message	Date
YeonGyu-Kim	0121f20a09	roadmap: #222 filed — Models list endpoint typed taxonomy is structurally absent: zero `GET /v1/models` and zero `GET /v1/models/{id}` surface across `rust/crates/api/src/providers/anthropic.rs` and `rust/crates/api/src/providers/openai_compat.rs` (rg returns zero hits for `/v1/models`, `list_models`, `fetch_models`, `get_models`, `available_models`, `model_catalog`, `ModelInfo`, `ModelList`, `ListModelsResponse`, `OwnedBy`, `ModelObject`, `ModelCatalog` across `rust/`), zero `Model` / `ModelInfo` / `ModelList` / `ListModelsResponse` typed taxonomy in `rust/crates/api/src/types.rs`, zero `list_models<'a>(&'a self) -> ProviderFuture<'a, ModelList>` and zero `retrieve_model<'a>(&'a self, model_id: &'a str) -> ProviderFuture<'a, ModelInfo>` methods on the `Provider` trait at `rust/crates/api/src/providers/mod.rs:17-30` (only `send_message` and `stream_message` exist, both per-request), zero `list_models` dispatch on the `ProviderClient` enum at `rust/crates/api/src/client.rs:8-14` (three variants Anthropic/Xai/OpenAi, all closed under per-request synchronous dispatch), zero `claw models` / `claw model list` / `claw list-models` CLI subcommand surface at `rust/crates/rusty-claude-cli/src/main.rs`, zero `/models` slash command in the SlashCommandSpec table at `rust/crates/commands/src/lib.rs`, zero validation against an authoritative source on `set_model` at `rust/crates/rusty-claude-cli/src/main.rs:4989-5037` (user can type `/model claude-banana-9000` and the runtime accepts it, swaps the active model to that string, and only fails at request time when the upstream provider returns 404 / invalid_model_error), and the existing `/providers` slash command at `rust/crates/commands/src/lib.rs:716-720` is just a literal alias for `/doctor` at `rust/crates/commands/src/lib.rs:1386-1389` despite advertising `summary: "List available model providers"` (advertised-but-rerouted shape — actively misleading at the UX layer, distinct from #220 's 
advertised-but-unbuilt shape because the parse arm dispatches to a different command entirely instead of returning a clear unsupported error) — the canonical model-discovery affordance is invisible across every CLI / REPL / slash-command / Provider-trait / ProviderClient-enum / data-model surface, leaving claw-code's local hardcoded 13-entry `MODEL_REGISTRY` (3 anthropic + 5 grok + 1 kimi + 4 prefix routes for openai/gpt/qwen/kimi at `rust/crates/api/src/providers/mod.rs:52-134` and 166-225) and its 6-entry `model_token_limit` match arm (`rust/crates/api/src/providers/mod.rs:277-301` covering claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5-20251213, grok-3, grok-3-mini, kimi-k2.5, kimi-k1.5 — returns None for current production IDs claude-opus-4-7, claude-haiku-4-6, gpt-5.2, o3, o4-mini, kimi-k3, qwen3-max, grok-4, deepseek-reasoner) as the only model-name knowledge the runtime has access to, with no way to refresh it, no way to discover new model IDs that providers publish, no way to validate user-supplied model strings, no way to cross-link to the `pricing_for_model` cost estimator (#209 substring-matching gap), no way to cross-link to the `model_token_limit` preflight check (#210 max_tokens shadow-fork gap silently no-ops on unknown models), no way to cross-link to the future `is_batch_request` flag (#221 batch-dispatch gap requires knowing which models support batch), and USAGE.md:426-440 documents only six model rows out of nine MODEL_REGISTRY entries (kimi alias missing from the documented table, four prefix routes mentioned only in passing prose, zero documentation of /v1/models endpoint usage / zero documentation of model-catalog discovery / zero documentation of "what to do when your provider ships a new model that isn't in claw-code's hardcoded registry") — the canonical model-discovery affordance is the most universally-available endpoint in the LLM API ecosystem (older than `/v1/chat/completions` itself, older than `/v1/embeddings`, older than 
`/v1/messages`, the literal first endpoint after auth on every OpenAI-compat provider since 2020 and on Anthropic since 2024-12-04, GA-shipped first-class typed surfaces in every Python/TypeScript SDK in the ecosystem) and claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero `/v1/models` integration AND a misleading `/providers` slash command that aliases to `/doctor` — both gaps are unique to claw-code in the surveyed ecosystem (Jobdori cycle #374 / extends #168c emission-routing audit / explicit follow-on candidate from #221 's seven-layer-endpoint-family-absence shape — the third of three named candidates: Files API typed taxonomy / Embeddings API typed taxonomy / Models list endpoint typed taxonomy, and the most clawability-impacting because it's the upstream root cause of three downstream gaps already catalogued in this audit / sibling-shape cluster grows to twenty-one: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221/#222 / wire-format-parity cluster grows to twelve: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221+#222 / capability-parity cluster grows to four: #218+#220+#221+#222 / discovery-and-validation cluster: #222 alone but it's the upstream root cause of #209 's pricing-fallback gap, #210 's max_tokens shadow-fork gap, and #221 's batch-dispatch gap / eight-layer-endpoint-family-absence-with-misleading-alias shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + CLI-subcommand-surface + slash-command-surface-with-misleading-alias + set_model-validation + downstream-consumers-with-stale-data) is the largest single advertised-vs-actual gap catalogued, distinct from prior single-field (#211/#212/#214) / response-only (#213/#207) / header-only (#215 ) / three-dimensional (#216 ) / classifier-leakage (#217 ) / four-layer (#218 ) / false-positive-opt-in (#219 ) / five-layer-feature-absence (#220 ) / 
seven-layer-endpoint-family-absence (#221 ) members; the advertised-but-rerouted shape is novel — strict-superset of #220 's advertised-but-unbuilt because the parse arm dispatches to a different command instead of returning a clear unsupported error, applies to any future SlashCommandSpec entry where the `summary` field describes a feature different from what the parse arm dispatches to / external validation: Anthropic Models API reference at https://docs.anthropic.com/en/api/models-list documenting `GET /v1/models` GA 2024-12-04 with paginated `before_id` / `after_id` / `limit` and `ModelInfo { id, type: "model", display_name, created_at }` shape, Anthropic retrieve reference at https://docs.anthropic.com/en/api/models documenting `GET /v1/models/{model_id}` for single-model lookup, OpenAI Models API at https://platform.openai.com/docs/api-reference/models documenting the literal first endpoint after auth with `Model { id, object: "model", created, owned_by }` and `ModelList { object: "list", data: Vec<Model> }`, OpenAI Python SDK `client.models.list()` and `client.models.retrieve(model_id)` first-class typed surface, Anthropic Python SDK `client.models.list()` parallel surface GA-shipped 2024-12-04 alongside the API endpoint, Anthropic TypeScript SDK `client.models.list()`, AWS Bedrock ListFoundationModels API documenting Bedrock-anthropic-relay equivalent with `FoundationModelSummary` provider+model+modalities+active flag, Azure OpenAI Models reference with deployment-aware catalog, Vertex AI `projects.locations.models.list` for Vertex-published Anthropic/Gemini/3rd-party models, DeepSeek/Moonshot/Alibaba-DashScope/xAI parallel `/v1/models` OpenAI-compat shape, OpenRouter Models API at https://openrouter.ai/api/v1/models — the canonical "live model catalog with pricing" reference and the model that anomalyco/opencode-via-models.dev uses for pricing-data freshness, simonw/llm `llm models` and `llm models default <model>` first-class CLI subcommand backed by 
per-plugin model registration with models.dev-equivalent freshness, simonw/llm plugin-registration architecture for ad-hoc model addition, Vercel AI SDK 6 `provider.languageModels()` and `provider.embeddingModels()` first-class typed catalog APIs, LangChain `init_chat_model(model_provider, model_name)` reflective discovery via provider-defined catalogs and `BaseChatModel.aget_models` async catalog query, models.dev (https://models.dev ) — community-maintained authoritative model catalog with pricing + capability flags + provider routing, used by anomalyco/opencode for pricing-data freshness with explicit fallback metadata when a model id isn't in the catalog (the canonical "external authoritative source for model metadata" reference), anomalyco/opencode `models.dev` integration with periodic refresh and explicit `{ provider: unknown, reason: not_in_pricing_table }` fallback metadata, charmbracelet/crush typed catalog with provider+model+input/output-pricing, continue.dev config-file-driven catalog with auto-refresh from provider endpoints, zed-industries/zed bundled JSON catalog with periodic upstream refresh, TabbyML/tabby model catalog via plugin registration, llama.cpp server `/v1/models` local-model catalog via OpenAI-compat shape, LM Studio `/v1/models` local-model catalog, Ollama `/api/tags` and `/v1/models` local-model catalog with both Ollama-native and OpenAI-compat shapes, llamafile bundled-model catalog, LiteLLM models reference covering 100+ models at proxy level, portkey.ai gateway-level catalog, helicone.ai observability-platform model catalog with per-model usage stats, prompthub.us model-catalog-as-service, OpenTelemetry GenAI semconv `gen_ai.request.model` and `gen_ai.response.model` documented as required attributes for spans (every observability backend treats model as a first-class structured signal requiring authoritative-source validation), OpenAPI 3.1 spec for `/v1/models` at https://github.com/openai/openai-openapi as canonical 
machine-readable schema, Anthropic API stability versioning at https://docs.anthropic.com/en/api/versioning with `anthropic-version` header semver-stable since 2023-06-01 and models endpoint stable since 2024-12-04. Thirty-two ecosystem references, three first-class models-endpoint specs (Anthropic, OpenAI, OpenRouter), GA timeline of 16 months on Anthropic's side and 6+ years on OpenAI's side, eight first-class CLI/SDK implementations (Anthropic Python+TypeScript, OpenAI Python, simonw/llm, Vercel AI SDK, LangChain, Zed, charmbracelet/crush), seven first-class local-model catalogs (Ollama, LM Studio, llama.cpp server, llamafile, Tabby, Continue.dev, LiteLLM proxy), one community-maintained authoritative pricing source (models.dev) used by the closest peer coding agent. claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero `/v1/models` integration AND a misleading `/providers` slash command that aliases to `/doctor` — both gaps are unique to claw-code in the surveyed ecosystem, the model-discovery gap is the upstream root cause of three downstream cost-and-correctness gaps already catalogued in this audit (#209 / #210 / #221 ), and the misleading-alias-shape is novel within the cluster — #222 closes the upstream root cause of three downstream gaps and unblocks live-catalog-driven cost-estimation, max-tokens-validation, batch-capability-detection, and CLI-vs-slash-command-symmetry that the runtime's clawability doctrine treats as canonical baseline expectations.	2026-04-26 02:15:43 +09:00
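
The `set_model` validation gap described in #222 can be sketched as a lookup against a catalog populated from a `GET /v1/models` response. This is a minimal sketch, not code from the repository: `ModelCatalog`, `ModelValidation`, and `validate` are hypothetical names, and `ModelInfo` keeps only the `id`, `display_name`, and `created_at` fields quoted in the commit message. How the catalog is fetched and refreshed is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical mirror of the `ModelInfo` shape quoted in the commit message.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelInfo {
    pub id: String,
    pub display_name: String,
    pub created_at: String,
}

/// Hypothetical in-memory catalog, assumed to be filled from a models-list response.
#[derive(Debug, Default)]
pub struct ModelCatalog {
    models: BTreeMap<String, ModelInfo>,
}

/// Outcome of checking a user-supplied model string against the catalog.
#[derive(Debug, PartialEq)]
pub enum ModelValidation {
    Known(ModelInfo),
    Unknown { requested: String, suggestions: Vec<String> },
}

impl ModelCatalog {
    pub fn from_models(models: Vec<ModelInfo>) -> Self {
        let models = models.into_iter().map(|m| (m.id.clone(), m)).collect();
        Self { models }
    }

    /// Accept only ids present in the catalog; otherwise suggest ids that
    /// share the family prefix (the text before the first '-').
    pub fn validate(&self, requested: &str) -> ModelValidation {
        if let Some(info) = self.models.get(requested) {
            return ModelValidation::Known(info.clone());
        }
        let family = requested.split('-').next().unwrap_or(requested);
        let suggestions = self
            .models
            .keys()
            .filter(|id| id.starts_with(family))
            .cloned()
            .collect();
        ModelValidation::Unknown { requested: requested.to_string(), suggestions }
    }
}
```

Under this sketch, `/model claude-banana-9000` would be rejected at `set_model` time with a list of known `claude-` ids, rather than failing later when the provider returns 404.
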
YeonGyu-Kim	9acd4f14da	roadmap: #221 filed — Message Batches API is structurally absent: zero `/v1/messages/batches` endpoint, zero `/v1/batches` endpoint, zero `MessageBatch` / `BatchedRequest` / `BatchedResult` / `BatchProcessingStatus` / `BatchRequestCounts` typed taxonomy across `rust/crates/api/src/types.rs` (zero hits for `batches`, `MessageBatch`, `BatchedRequest`, `custom_id`, `processing_status`), zero `submit_batch` / `retrieve_batch` / `retrieve_batch_results` / `cancel_batch` / `list_batches` methods on the `Provider` trait at `rust/crates/api/src/providers/mod.rs:17-30` (only `send_message` and `stream_message` exist, both per-request synchronous), zero batch dispatch on `ProviderClient` enum at `rust/crates/api/src/client.rs:8-14` (three variants Anthropic/Xai/OpenAi all closed under sync send_message + stream_message), zero `BatchSubmittedEvent` / `BatchInProgressEvent` / `BatchEndedEvent` typed events on the runtime telemetry sink, zero `claw batch` / `claw batches` CLI subcommand surface at `rust/crates/rusty-claude-cli/src/main.rs`, zero `/batch` slash command in `SlashCommandSpec` table at `rust/crates/commands/src/lib.rs`, zero `pending_batches` field in `claw status --json` output, zero `is_batch_request` flag on `pricing_for_model` cost estimator (so even if Batch API were wired, cost would over-charge by 2x), zero `batch_input_tokens_per_million_usd` / `batch_output_tokens_per_million_usd` fields in the `Pricing` struct — the API has been GA on Anthropic since 2024-10-08 (18 months ago at filing time, with explicit 'anthropic-beta: message-batches-2024-09-24' opt-in header documented) and on OpenAI since 2024-04-15 (24 months ago at filing time), uniformly offers 50% input-and-output token discount, accepts up to 100,000 requests per batch with 256MB total payload (Anthropic) or unlimited via Files API (OpenAI), 24-hour completion SLO; combining with #219 's also-missing prompt-caching opt-in (90% input savings) gives a compounded ~95% 
input-cost asymmetry on bulk ingest scenarios — the single largest cost-reduction lever in the entire API parity audit, missing at the endpoint-family level rather than the per-field level (Jobdori cycle #373 / extends #168c emission-routing audit / sibling-shape cluster grows to twenty: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220/#221 / wire-format-parity cluster grows to eleven: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220+#221 / capability-parity cluster grows to three: #218+#220+#221 / cost-parity cluster grows to eight: #204+#207+#209+#210+#213+#216+#219+#221 — #221 compounds with #219 to ~95% bulk-ingest cost asymmetry, the largest cost gap in the cluster / seven-layer-endpoint-family-absence shape (endpoint-URL + data-model-taxonomy + Provider-trait-method + ProviderClient-enum-dispatch + Worker-registry-status-enum + CLI-subcommand-surface + pricing-tier-flag) is the largest single capability absence catalogued, exceeding #220 's five-layer-feature-absence / endpoint-family-level absence shape is novel — applies to follow-on candidates 'Files API typed taxonomy is absent' (the OpenAI batch path's prerequisite endpoint, also absent), 'Embeddings API typed taxonomy is absent' (`/v1/embeddings` cross-cutting), 'Models list endpoint typed taxonomy is absent' (`/v1/models` / Anthropic Models API) / external validation: Anthropic Message Batches API reference at https://docs.anthropic.com/en/api/messages-batches documenting five operations on `/v1/messages/batches` + GA 2024-10-08 + 50% discount + 100k-requests-per-batch + 256MB-total-payload + 24-hour-SLO + `custom_id` correlation field, Anthropic launch announcement at anthropic.com/news/message-batches-api documenting '50% off both input and output tokens' positioning, Anthropic Pricing page documenting Batch API column with 50% across Sonnet 3.5/4/4.5/4.6 + Opus 3/4/4.6 + Haiku 3.5, Anthropic Python SDK `client.messages.batches.create(requests=[...])` 
first-class typed surface, Anthropic TypeScript SDK parallel surface, AWS Bedrock InvokeModelBatch / batch-inference docs (Bedrock-anthropic-relay path), OpenAI Batch API reference at platform.openai.com/docs/api-reference/batch documenting GA 2024-04-15 + 50% discount + JSONL-via-Files-API + completion_window:'24h', OpenAI launch announcement at openai.com/index/openai-introduces-batch-api documenting 'process batches asynchronously and receive results within 24 hours at a 50% discount', DeepSeek/Moonshot/Alibaba-DashScope/xAI batch-inference parallel surfaces, OpenRouter batch passthrough, simonw/llm `--batch` flag, Vercel AI SDK `generateBatch` + provider-specific batch passthrough, LangChain `Runnable.batch()` + `Runnable.abatch()` first-class Python+TypeScript parity, LangSmith batch-aware tracing, llmindset.co.uk independent cost-calculus validation, Medium 'process 10,000 queries without breaking the bank' tutorial, Steve Kinney's Anthropic-Batch-with-Temporal workflow-orchestration article, ai.moda Anthropic-Batch+Caching 95%-compounded-savings analysis (proves #219+#221 together close the largest cost gap), VentureBeat industry-press coverage, Reddit r/ClaudeAI launch thread, zed-industries/zed#19945 (peer ecosystem with same gap), RooCodeInc/Roo-Code#8667 (peer ecosystem with same gap), n8n Anthropic-batch-processing workflow, startground.com batch-deals tracker, silicondata.com 2026-pricing per-model batch breakdown, Hacker News batch-mechanics discussions, OpenTelemetry GenAI semconv `gen_ai.request.batch_id` + `gen_ai.batch.processing_status` + `gen_ai.batch.request_counts` documented attributes, IANA `application/x-ndjson` + `application/jsonl` MIME-type registrations / claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero batch-dispatch capability despite the API being GA on both major providers for 18+ months — parity floor against every other CLI/SDK/coding-agent in 2024-2025, the largest single cost-reduction 
lever in the entire emission-routing audit, and the largest endpoint-family-level capability gap catalogued so far)	2026-04-26 01:45:20 +09:00
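
The "~95%" figure in #221 follows from multiplying the two discounts the message cites: 50% off for batch and 90% off for cache-read input tokens. A minimal sketch of that arithmetic, assuming the discounts stack multiplicatively and every input token is a cache read (both are simplifying assumptions, and the function names are hypothetical):

```rust
/// Fraction of the list input price still paid after the discounts apply.
/// Assumes batch = 50% off and cache read = 90% off, as quoted in the commit message.
pub fn effective_input_multiplier(batch: bool, cache_read: bool) -> f64 {
    let batch_factor = if batch { 0.5 } else { 1.0 };
    let cache_factor = if cache_read { 0.1 } else { 1.0 };
    batch_factor * cache_factor
}

/// Savings relative to list price, in percent.
pub fn savings_percent(batch: bool, cache_read: bool) -> f64 {
    (1.0 - effective_input_multiplier(batch, cache_read)) * 100.0
}
```

With both flags set the multiplier is 0.5 × 0.1 = 0.05, so 95% of the list input price is saved. Batch alone saves 50% and cache reads alone save 90%.
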
YeonGyu-Kim	d46c423c1d	roadmap: #220 filed — Image/vision input is structurally impossible across the entire data model: zero `image` content-block taxonomy variant on `InputContentBlock` (`types.rs:80-94` has only Text/ToolUse/ToolResult — three of three exhaustive variants, zero Image, zero Document, zero MediaType, zero ImageSource, zero base64/file_id slot, zero `media_type` field anywhere in `rust/crates/api/src/`), zero parse arm for `/image <path>` and `/screenshot` slash commands despite their advertised summaries ("Add an image file to the conversation" at `commands/lib.rs:585`, "Take a screenshot and add to conversation" at `commands/lib.rs:578`) being in the canonical SlashCommandSpec table since project inception, both gated under STUB_COMMANDS at `main.rs:8381-8382` (UX patch over missing-feature, not missing-feature fix), `ResolvedAttachment` at `tools/lib.rs:2660-2666` carries path/size/is_image triple but no bytes / no base64 / no media_type / no upload affordance / no transport-ready payload despite `is_image_path` at line 5276 correctly classifying png/jpg/jpeg/gif/webp/bmp/svg extensions and the SendUserMessage/Brief tool surfacing `isImage: true` in JSON envelope (asserted at line 8969); `build_chat_completion_request` (`openai_compat.rs:845`) and `translate_message` (`openai_compat.rs:946`) have three-arm exhaustive matches over Text/ToolUse/ToolResult with no Image arm and no `{type: "image", source: {type: "base64", media_type, data}}` Anthropic-canonical wire shape and no `{type: "image_url", image_url: {url: "data:image/...;base64,..."}}` OpenAI-compat wire shape; the markdown renderer at `render.rs:379-426` handles `Tag::Image` and `TagEnd::Image` for output rendering (asymmetric capability — model emits image markdown → rendered as colored `[image:url]` link, user attaches image → silent black hole at API boundary); the runtime's own worker_boot test fixture at `worker_boot.rs:1324`+`:1349` literally hard-codes "Explain this KakaoTalk 
screenshot for a friend" as the canonical task-classification example for worker prompt-mismatch recovery — claw-code uses screenshot analysis as a runtime-classifier signal while having zero capability to actually send a screenshot to the model; TUI-ENHANCEMENT-PLAN.md:57 backlogs the gap as "No image/attachment preview" but the gap is far worse than no preview — there is no transport, no codec, no envelope, no anything from the byte stream to the wire (Jobdori cycle #372 / extends #168c emission-routing audit / sibling-shape cluster grows to nineteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219/#220 / wire-format-parity cluster grows to ten: #211+#212+#213+#214+#215+#216+#217+#218+#219+#220 / capability-parity cluster (strict-superset including user-facing surfacing): #218+#220 / five-layer-structural-absence shape (data-model-variant + slash-command-parse-arm + attachment-metadata-threading + request-builder-translation + OS-integration-helper) is the largest single feature absence yet catalogued, exceeding #218 's four-layer; advertised-but-unbuilt shape is novel — UX-layer cousin of #219 's false-positive-opt-in shape — applicable to other STUB_COMMAND entries with capability-claim summaries / claw-code is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero image-input capability despite Anthropic Vision GA on 2024-03-04 (25 months ago at filing time, default-on for all Claude 3.5+ models with 5MB-per-image / 32MB-per-request / 100-images-per-request limits) and OpenAI Vision GA on 2024-05-13 (23 months ago) and Google Gemini multimodal GA on 2024-02-15 (26 months ago), making this a regression against the upstream claude-code CLI claw-code is porting from / external validation: Anthropic Vision API reference at platform.claude.com/docs/en/build-with-claude/vision documenting the canonical {type, source: {type, media_type, data}} content block, Anthropic Messages API reference, Anthropic Files API 
beta with file_id reference for repeated-image-use efficiency, AWS Bedrock prompt-caching docs with image-block coverage and 20-images-per-request stricter limit and same cachePoint:{} pattern from #219 , OpenAI Vision API reference documenting the {type:image_url, image_url:{url}} data-URL shape used by GPT-4o/4o-mini/5-vision/o1-vision/o3-vision/DeepSeek-VL2/Qwen-VL/QwQ-VL/MiniMax-VL/Moonshot kimi-VL, Google Gemini multimodal API documenting {inline_data:{mime_type, data}} shape, anomalyco/opencode#16184 (look_at tool image-file-from-disk handling bug), anomalyco/opencode#15728 (Read tool image-handling bug), anomalyco/opencode#8875 (custom-provider attachment-allowlist gap), anomalyco/opencode#17205 (text-only-model token-burn on image attachment) — all four are integration-quality gaps in opencode while claw-code is missing the capability entirely (~85% vs 0% parity asymmetry, the largest in the cluster), charmbracelet/crush vision-input via terminal paste, simonw/llm --attachment flag, Vercel AI SDK experimental_attachments + image content blocks, LangChain HumanMessage content blocks, LangGraph image-message routing, OpenAI Python and Anthropic Python SDK first-class image-typed messages, anthropic-quickstarts vision examples, claude-code official CLI paste-image and screenshot shortcuts (the upstream this is a regression against), OpenTelemetry GenAI semconv gen_ai.input.attachments and gen_ai.input.images.count multimodal observability attributes, IANA MIME-type registry RFC 4288/4289)	2026-04-26 01:18:43 +09:00
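
The two wire shapes named in #220 can be sketched as follows. This is an illustration only: `ImageBlock`, `media_type_for_path`, and the two render methods are hypothetical names, the payload is assumed to be base64-encoded already, and the extension table is restricted to the four raster types (PNG, JPEG, GIF, WebP), which is an assumption of this sketch and not the repository's `is_image_path` list.

```rust
use std::path::Path;

/// Hypothetical image payload: media type plus already-base64-encoded bytes.
pub struct ImageBlock {
    pub media_type: String,
    pub base64_data: String,
}

/// Map a file extension to a media type (assumed subset: png/jpeg/gif/webp).
pub fn media_type_for_path(path: &str) -> Option<&'static str> {
    let ext = Path::new(path).extension()?.to_str()?.to_ascii_lowercase();
    match ext.as_str() {
        "png" => Some("image/png"),
        "jpg" | "jpeg" => Some("image/jpeg"),
        "gif" => Some("image/gif"),
        "webp" => Some("image/webp"),
        _ => None,
    }
}

impl ImageBlock {
    /// Anthropic-style content block: {type: "image", source: {type: "base64", ...}}.
    /// No JSON escaping is done; media types and base64 text contain no quotes or backslashes.
    pub fn to_anthropic_json(&self) -> String {
        format!(
            r#"{{"type":"image","source":{{"type":"base64","media_type":"{}","data":"{}"}}}}"#,
            self.media_type, self.base64_data
        )
    }

    /// OpenAI-compat content part: {type: "image_url", image_url: {url: "data:...;base64,..."}}.
    pub fn to_openai_json(&self) -> String {
        format!(
            r#"{{"type":"image_url","image_url":{{"url":"data:{};base64,{}"}}}}"#,
            self.media_type, self.base64_data
        )
    }
}
```

In a real implementation the block would presumably be a new variant on `InputContentBlock` serialized through the crate's existing serializer. The hand-rolled `format!` calls here only keep the sketch dependency-free.
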
YeonGyu-Kim	2858aeccff	roadmap: #219 filed — Anthropic prompt-caching opt-in is structurally impossible: cache_control marker has zero codebase footprint (rg returns 0 hits across rust/ src/ docs/ tests/) despite the wire-side beta header 'prompt-caching-scope-2026-01-05' being unconditionally enabled at every Anthropic request (telemetry/lib.rs:16,452,469 + anthropic.rs:1443); five cacheable surfaces are uniformly locked: pub system: Option<String> at types.rs:11 is a flat string with no array form so no system-block cache_control slot exists; InputContentBlock variants Text/ToolUse/ToolResult at types.rs:80-99 have no cache_control field; ToolResultContentBlock variants Text/Json at types.rs:100-103 have no cache_control field; ToolDefinition at types.rs:105-110 has no cache_control field; openai_compat path translate_message at openai_compat.rs:946 and build_chat_completion_request at openai_compat.rs:850 emit flat-string system+content with no cache_control or Bedrock cachePoint translation; ~600 LOC of response-side cache stats infrastructure (prompt_cache.rs PromptCacheStats / PromptCacheRecord / PromptCache trait) accumulates a zero stream because no payload was opted in, and four hardcoded zero-coercion sites (openai_compat.rs:477-478, 489-490, 597-598, 1211-1212) discard upstream cache stats from Bedrock/Vertex/kimi-anthropic-compat/MiniMax-relay even when emitted; integration test at client_integration.rs:88-89 asserts the beta header is sent but no companion test asserts payload contains a cache_control marker because the data structures cannot produce one — a uniquely paradoxical false-positive opt-in shape: wire signal advertises caching intent and data-model structurally precludes it (Jobdori cycle #371 / extends #168c emission-routing audit / sibling-shape cluster grows to eighteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218/#219 / wire-format-parity cluster grows to nine: 
#211+#212+#213+#214+#215+#216+#217+#218+#219 / cost-parity cluster grows to seven: #204+#207+#209+#210+#213+#216+#219 — #219 is the dominant cost-parity miss, ~90% input-token-cost reduction unattainable / cache-parity request/response symmetry pair: #219 (request-side opt-in absent) + #213 (response-side stats absent on openai-compat lane) / five-surface uniform-structural-absence shape: system+tools+tool_choice+messages+tool_result_content all locked, with no extra_body escape hatch since cache_control is a per-block annotation not a top-level field / false-positive-opt-in shape: novel cluster member where wire signal says yes and structure says no / external validation: Anthropic prompt-caching reference at platform.claude.com/docs/en/build-with-claude/prompt-caching documenting cache_control: {type: ephemeral} on system/tools/messages/content blocks with 5-min default TTL and 1-hour optional TTL and 90% cost reduction on cache-read tokens, Anthropic Messages API reference documenting system: Vec<SystemBlock> array form as the cacheable shape, Bedrock prompt-caching docs documenting cachePoint: {} block form for Bedrock-anthropic relay, claudecodecamp.com analysis of how prompt caching actually works in Claude Code, xda-developers article documenting claude-code's cache-token-budget knob proving caching is actively engaged, anomalyco/opencode#5416 #14203 #16848 #17910 #20110 #20265 (cache-related issues and PR for system-prompt-split-for-cache-hit-rate optimization), opencode-anthropic-cache npm package as third-party plugin proving the ecosystem expectation, LangChain anthropicPromptCachingMiddleware as first-class JS wrapper, LiteLLM prompt-caching docs with single-line cache_control pass-through for Anthropic+Bedrock, Vercel AI SDK Anthropic provider providerOptions.anthropic.cacheControl, prompthub.us multi-provider comparison treating opt-in as documented baseline, portkey.ai gateway-level pass-through, mindstudio.ai cost-impact analysis, OpenTelemetry 
GenAI semconv gen_ai.usage.input_tokens.cached as documented attribute — claw is the sole client/agent/CLI in the surveyed coding-agent ecosystem with zero cache_control request-side opt-in capability despite shipping the eligibility beta header on every Anthropic request)	2026-04-26 00:40:20 +09:00
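
The array form of `system` that #219 says is missing can be sketched as a list of text blocks, each optionally carrying `cache_control: {type: ephemeral}`. The names `SystemBlock`, `cacheable`, and `render_system` are hypothetical, and the hand-written JSON rendering stands in for whatever serializer the crate uses. It is a sketch under those assumptions, not the project's implementation.

```rust
/// Hypothetical system-prompt block with an opt-in cache marker.
pub struct SystemBlock {
    pub text: String,
    pub cacheable: bool,
}

/// Minimal JSON string escaping for the sketch (quotes, backslashes, control characters).
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render the `system` field as an array of text blocks; cacheable blocks
/// get a `cache_control` marker of type `ephemeral`.
pub fn render_system(blocks: &[SystemBlock]) -> String {
    let rendered: Vec<String> = blocks
        .iter()
        .map(|b| {
            let cache = if b.cacheable {
                r#","cache_control":{"type":"ephemeral"}"#
            } else {
                ""
            };
            format!(r#"{{"type":"text","text":"{}"{}}}"#, json_escape(&b.text), cache)
        })
        .collect();
    format!("[{}]", rendered.join(","))
}
```

A flat `Option<String>` cannot express this, because the marker is attached per block and not at the top level of the request.
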
YeonGyu-Kim	116a95a253	roadmap: #218 filed — MessageRequest has no response_format / output_config / seed / logprobs / top_logprobs / logit_bias / n / metadata fields (types.rs:6-36, thirteen fields, zero hits across rust/ for any of these); build_chat_completion_request (openai_compat.rs:845) writes thirteen optional fields and emits none of these on the wire; AnthropicClient::send_raw_request (anthropic.rs:466) renders same MessageRequest via render_json_body (telemetry/lib.rs:107) with same gaps; ChatMessage (openai_compat.rs:688) has three fields (role, content, tool_calls) and no refusal field despite the streaming-aggregator test at line 1781 explicitly including "refusal": null in test data — silent serde drop; ChunkDelta (openai_compat.rs:735) has same gap; OutputContentBlock (types.rs:147) has four variants (Text, ToolUse, Thinking, RedactedThinking) and no Refusal variant; MessageResponse.stop_reason (types.rs:127) has no slot for Anthropic's 2025-11+ stop_reason='refusal' value; net effect: claw cannot opt into OpenAI strict-schema constrained decoding (response_format json_schema, GA 2024-08), cannot opt into Anthropic GA structured outputs (output_config.format, GA 2025-11-13), cannot opt into legacy JSON mode (response_format json_object), cannot supply seed for reproducible sampling, cannot request logprobs/top_logprobs, cannot bias tokens via logit_bias, cannot request multiple completions via n, and silently discards every refusal string OpenAI emits when constrained decoding rejects a generation — refusals classified as Finished/success with empty content via #217 normalize_finish_reason mapping (Jobdori cycle #370 / extends #168c emission-routing audit / sibling-shape cluster grows to seventeen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217/#218 / wire-format-parity cluster grows to eight: #211+#212+#213+#214+#215+#216+#217+#218 / four-layer-structural-absence shape: request-struct-field + request-builder-write + 
response-struct-field + content-block-taxonomy-variant, largest single-feature absence catalogued / external validation: OpenAI Structured Outputs guide, OpenAI Chat Completions API reference, Anthropic structured-outputs reference (GA 2025-11-13), Anthropic Messages API reference (stop_reason='refusal'), Vercel AI Gateway Anthropic structured outputs, Vercel AI SDK 6 generateObject + Zod, LangChain with_structured_output, simonw/llm --schema flag, charmbracelet/crush, anomalyco/opencode#10456 open feature request citing OpenAI Codex as reference, anomalyco/opencode#5639/#11357/#13618, OpenAI Codex CI/code-review cookbook, OpenRouter structured-outputs docs, OpenAI Python SDK client.beta.chat.completions.parse, OpenTelemetry GenAI semconv gen_ai.request.response_format + gen_ai.response.refusal)	2026-04-26 00:13:01 +09:00
YeonGyu-Kim	91e290526a	roadmap: #217 filed — normalize_finish_reason (openai_compat.rs:1389) is a two-arm match (stop→end_turn, tool_calls→tool_use) with a string-passthrough fallthrough that drops three of five OpenAI-spec finish reasons (length, content_filter, function_call); MessageResponse.stop_reason is Option<String> with no enum constraint; WorkerRegistry::observe_completion (worker_boot.rs:558) classifies failure on finish=='unknown'||finish=='error' only, so OpenAI/DeepSeek/Moonshot truncation (length) and content-policy refusal (content_filter) become WorkerStatus::Finished with success events; the streaming aggregator's tool-call-block-close branch at openai_compat.rs:537 keys on 'tool_calls' literal and never fires for legacy 'function_call' shape (Azure pre-2024-02-15 / DeepSeek pre-2025-08 / SiliconFlow / OpenRouter relays); Anthropic native path produces the canonical taxonomy correctly (Jobdori cycle #369 / extends #168c emission-routing audit / sibling-shape cluster grows to sixteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216/#217 / wire-format-parity cluster grows to seven: #211+#212+#213+#214+#215+#216+#217 / classifier-leakage shape: response-side string mistranslation flows three layers deep into runtime classifier with two-literal-compare coverage / external validation: OpenAI Chat Completions API reference, Anthropic Messages API reference, OpenAI function_call deprecation notice, Azure OpenAI reference, DeepSeek/Moonshot/DashScope refs, anomalyco/opencode#19842, charmbracelet/crush typed enum, simonw/llm Reason enum, Vercel AI SDK FinishReason union, LangChain LengthFinishReasonError/ContentFilterFinishReasonError, semantic-kernel FinishReason enum, openai-python Literal type, OpenTelemetry GenAI gen_ai.response.finish_reasons spec)	2026-04-25 23:39:13 +09:00
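
A typed mapping that covers all five finish reasons listed in #217 might look like the sketch below. It is not the repository's `normalize_finish_reason`: the `StopReason` enum, its variant names, and `is_success` are hypothetical, chosen only to show how truncation and content filtering could stop being classified as success.

```rust
/// Hypothetical closed taxonomy replacing the `Option<String>` stop reason.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    ToolUse,
    MaxTokens,
    ContentFiltered,
    /// Anything the mapping does not recognize, kept verbatim for diagnostics.
    Unknown(String),
}

/// Map an OpenAI-style `finish_reason` string onto the typed taxonomy.
pub fn normalize_finish_reason(raw: &str) -> StopReason {
    match raw {
        "stop" => StopReason::EndTurn,
        // The legacy `function_call` shape is treated like `tool_calls`.
        "tool_calls" | "function_call" => StopReason::ToolUse,
        "length" => StopReason::MaxTokens,
        "content_filter" => StopReason::ContentFiltered,
        other => StopReason::Unknown(other.to_string()),
    }
}

impl StopReason {
    /// Only a clean end of turn or a tool call counts as a successful completion.
    pub fn is_success(&self) -> bool {
        matches!(self, StopReason::EndTurn | StopReason::ToolUse)
    }
}
```

With an enum in place, a completion classifier can match exhaustively over the variants, so a new upstream value surfaces as `Unknown` where the string version passed it through as success.
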
YeonGyu-Kim	ceb092abd7	roadmap: #216 filed — neither MessageRequest nor MessageResponse has any service_tier field; build_chat_completion_request (openai_compat.rs:845) writes thirteen optional fields (model, max_tokens/max_completion_tokens, messages, stream, stream_options, tools, tool_choice, temperature, top_p, frequency_penalty, presence_penalty, stop, reasoning_effort) and does not write service_tier; AnthropicClient::send_raw_request (anthropic.rs:466) renders the same MessageRequest struct via AnthropicRequestProfile::render_json_body (telemetry/lib.rs:107) which has no field for it either, only a per-client extra_body escape hatch (asymmetric — openai_compat path has zero hits for extra_body); ChatCompletionResponse / ChatCompletionChunk / OpenAiUsage all deserialize four fields each, dropping the upstream-echoed service_tier confirmation and the system_fingerprint reproducibility marker that OpenAI documents as the canonical "what backend served you" signal; claw cannot opt into OpenAI flex (~50% cheaper async batch — developers.openai.com/api/docs/guides/flex-processing), cannot opt into OpenAI priority (~1.5-2x premium SLA latency — developers.openai.com/api/docs/guides/priority-processing), cannot opt into Anthropic priority (auto/standard_only — platform.claude.com/docs/en/api/service-tiers), and cannot detect at the response layer whether a request was flex-served or silently upgraded to priority by a project-level default override (Jobdori cycle #368 / extends #168c emission-routing audit / sibling-shape cluster grows to fifteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215/#216 / wire-format-parity cluster grows to six: #211+#212+#213+#214+#215+#216 / cost-parity cluster grows to six: #204+#207+#209+#210+#213+#216 / three-dimensional-structural-absence shape: request-side write + response-side read + reproducibility marker, distinct from prior request-only #211#212 / response-only #207#213#214 / header-only #215 members / 
external validation: OpenAI flex/priority/scale-tier guides, OpenAI advanced-usage system_fingerprint guide, Anthropic service-tiers reference, OpenTelemetry GenAI semconv gen_ai.openai.request.service_tier + gen_ai.openai.response.service_tier + gen_ai.openai.response.system_fingerprint, anomalyco/opencode#12297, Vercel AI SDK serviceTier provider option, LangChain ChatOpenAI service_tier ctor param, LiteLLM service_tier pass-through, semantic-kernel OpenAIPromptExecutionSettings.ServiceTier, openai-python SDK client.chat.completions.create(service_tier=...) first-class kwarg, MiniMax/DeepSeek Anthropic-compat layer notes, badlogic/pi-mono#1381)	2026-04-25 23:12:25 +09:00
YeonGyu-Kim	2da12117eb	roadmap: #215 filed — expect_success reads only request-id/x-request-id headers and discards the rest; both OpenAiCompatClient::send_with_retry and AnthropicClient::send_with_retry sleep on pure exponential backoff (2^(n-1) * initial + jitter) that ignores upstream Retry-After (RFC 7231 §7.1.3, mandated by Anthropic on 429, emitted by OpenAI/DeepSeek/Moonshot/DashScope on 429/503/529); ApiError::Api has no retry_after field, scheduler has no input port for it; on a 60s server-specified cooldown, claw burns 3 retries in <8s against a closed gate then surfaces RetriesExhausted (Jobdori cycle #367 / extends #168c emission-routing audit / sibling-shape cluster grows to fourteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214/#215 / upstream-contract-honoring trio: #211+#213+#215 / wire-format-parity cluster: #211+#212+#213+#214+#215 / external validation: Anthropic rate-limits docs, OpenAI cookbook, DeepSeek rate-limit docs, RFC 7231 §7.1.3, openai-python#957, Vercel AI SDK LanguageModelV1RateLimit.retryAfter, LangChain BaseChatOpenAI, anomalyco/opencode#16993/#16994/#9091/#17583/#11705, charmbracelet/crush, LiteLLM Router.retry_after_strategy)	2026-04-25 22:41:49 +09:00
YeonGyu-Kim	959bdf8491	roadmap: #214 filed — ChunkDelta and ChatMessage in openai_compat.rs deserialize only content/tool_calls; delta.reasoning_content (sibling to delta.content, the canonical wire field for DeepSeek deepseek-reasoner / Alibaba Qwen3-Thinking / QwQ / vLLM reasoning-parser backends) is silently discarded at serde-deserialize time before any handler sees it; non-streaming ChatMessage has the same gap; is_reasoning_model classifier already returns true for o1/o3/o4/grok-3-mini/qwen-qwq/qwq/thinking and is consulted at line 901 to strip request-side tuning params but never on the response side to opt into reasoning_content extraction; local taxonomy already declares OutputContentBlock::Thinking and ContentBlockDelta::ThinkingDelta and the Anthropic native path correctly emits both with full test coverage at sse.rs:260,288 — the openai-compat translator has the destination types one import away and never bridges to them (Jobdori cycle #366 / extends #168c emission-routing audit / sibling-shape cluster grows to thirteen: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213/#214 / reasoning-fidelity trio: #207+#211+#214 / wire-format-parity cluster: #211+#212+#213+#214 / external validation: DeepSeek API docs, vLLM reasoning-outputs, anomalyco/opencode#24124 , charmbracelet/crush, simonw/llm, Vercel AI SDK, LangChain BaseChatOpenAI, LiteLLM, continue.dev#9245)	2026-04-25 22:16:02 +09:00
YeonGyu-Kim	347102d83b	roadmap: #213 filed — OpenAiUsage struct does not deserialize prompt_tokens_details.cached_tokens (OpenAI 2024-10) or prompt_cache_hit_tokens (DeepSeek); openai_compat path hardcodes cache_creation_input_tokens: 0 and cache_read_input_tokens: 0 at four sites; cost estimator computes $0 cache savings for every OpenAI/DeepSeek/Moonshot kimi request even when upstream prompt cache is hitting; Anthropic native path correctly populates same Usage fields from native wire format (Jobdori cycle #365 / extends #168c emission-routing audit / sibling-shape cluster grows to twelve: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212/#213 / cost-parity cluster: #204+#207+#209+#210+#213 / wire-format-parity cluster: #211+#212+#213 / external validation: OpenAI prompt caching docs, DeepSeek pricing docs, anomalyco/opencode#17223/#17121/#17056/#11995, Vercel AI SDK cachedInputTokens, charmbracelet/crush, simonw/llm)	2026-04-25 21:42:54 +09:00
Jobdori	c00981896f	roadmap: #212 filed — MessageRequest+ToolChoice cannot express parallel_tool_calls (OpenAI top-level) or disable_parallel_tool_use (Anthropic tool_choice modifier); zero hits across rust/ src/ tests/ docs/; ToolChoice is 3-variant enum with no modifier slot; openai_tool_choice mapper has 3-arm match no parallel path; provider default is parallel-on, claw cannot opt out (Jobdori cycle #364 / extends #168c emission-routing audit / sibling-shape cluster grows to eleven: #201/#202/#203/#206/#207/#208/#209/#210/#211/#212 / wire-format-parity cluster: #211+#212 / external validation: Anthropic docs, OpenAI API reference, LangChain BaseChatOpenAI, anomalyco/opencode, charmbracelet/crush#1061)	2026-04-25 21:10:50 +09:00
YeonGyu Kim	f004f74ffa	roadmap: #211 filed — build_chat_completion_request selects max_tokens_key only on wire_model.starts_with("gpt-5"), sending legacy max_tokens to OpenAI o1/o3/o4-mini reasoning models which reject it with unsupported_parameter; is_reasoning_model classifier 90 lines above already knows o-series is reasoning, taxonomy half-applied within 30-line span; no test for any o-series model (Jobdori cycle #363 / extends #168c emission-routing audit / sibling-shape cluster grows to ten: #201/#202/#203/#206/#207/#208/#209/#210/#211 / external validation: charmbracelet/crush#1061, simonw/llm#724, HKUDS/DeepTutor#54)	2026-04-25 20:38:43 +09:00
YeonGyu-Kim	02252a8585	roadmap: #210 filed — rusty-claude-cli shadows api::max_tokens_for_model with stripped 2-branch fork (opus=32k, else=64k); ignores model_token_limit registry, bypasses plugin maxOutputTokens override, silently sends 64_000 for kimi-k2.5 whose registry cap is 16_384 (4x over) (Jobdori cycle #362 / extends #168c emission-routing audit / sibling-shape cluster grows to nine: #201/#202/#203/#206/#207/#208/#209/#210)	2026-04-25 20:06:43 +09:00
YeonGyu-Kim	134e945a01	roadmap: #209 filed — pricing_for_model substring-matches haiku/opus/sonnet only; default_sonnet_tier function name carries Opus pricing constants (15.0/75.0 vs real Sonnet 3.0/15.0); every non-Anthropic model silently falls back producing 5-100x wrong cost estimates with no event signal, only a magic-string suffix on one summary line; rusty-claude-cli session JSON and anthropic.rs telemetry emit cost without pricing_source field (Jobdori cycle #361 / cost-parity cluster closer to #204+#207 / models.dev parity gap vs anomalyco/opencode)	2026-04-25 19:42:37 +09:00
Jobdori	c20d0330c1	roadmap: #208 filed — silent param/field strip on outbound serialization (4 tuning params for reasoning models, is_error for kimi), self-documenting 'silently strip' comments, no event emission, tests assert removal but not visibility (Jobdori cycle #359 / sibling-chain closer to #207 inbound-drop / completes OpenAI-compat boundary audit)	2026-04-25 19:06:56 +09:00
YeonGyu-Kim	ba3a34d6fe	roadmap: #207 filed — OpenAiUsage discards prompt_tokens_details.cached_tokens and completion_tokens_details.reasoning_tokens, cache_read_input_tokens hardcoded 0 in 4 sites breaking cost parity with Anthropic path (Jobdori cycle #358 / fix-pair with #204 / anomalyco/opencode #24233 sibling)	2026-04-25 18:34:44 +09:00
YeonGyu-Kim	0e9cff588d	roadmap: #206 filed — normalize_finish_reason covers 2/5 OpenAI finish reasons, length/content_filter/function_call unmapped (Jobdori cycle #357). Pinpoint #206: normalize_finish_reason() in openai_compat.rs only maps stop→end_turn and tool_calls→tool_use. The 'other => other' pass-through arm silently leaks length, content_filter, function_call to downstream consumers expecting Anthropic vocabulary (max_tokens, refusal, tool_use). Sibling of #201/#202/#203/#204 (silent fallbacks at provider boundary). No structured event for unmapped values; test coverage locks only the two-case happy path. Branch: feat/jobdori-168c-emission-routing HEAD: `dba4f28`	2026-04-25 18:20:04 +09:00
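
The finish-reason gap described in #206 and #217 above can be illustrated with a minimal Rust sketch. This is not code from the repository: the `StopReason` enum, the `normalize_finish_reason_typed` function and the `is_clean_completion` helper are hypothetical names introduced here. The target vocabulary (`end_turn`, `tool_use`, `max_tokens`, `refusal`) is taken from the commit messages. Treating legacy `function_call` as equivalent to `tool_calls` is an assumption.

```rust
/// Hypothetical typed stop reason. Per the commit messages, the real
/// `MessageResponse.stop_reason` is an `Option<String>` with no enum constraint.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum StopReason {
    EndTurn,
    ToolUse,
    MaxTokens,
    Refusal,
    /// Unrecognised upstream value, preserved so a caller can log it or
    /// emit a structured event instead of passing it through silently.
    Unmapped(String),
}

/// Hypothetical replacement for a two-arm string match: covers all five
/// OpenAI finish reasons named in the commit messages.
pub fn normalize_finish_reason_typed(raw: &str) -> StopReason {
    match raw {
        "stop" => StopReason::EndTurn,
        // Assumption: the legacy `function_call` shape is treated like `tool_calls`.
        "tool_calls" | "function_call" => StopReason::ToolUse,
        "length" => StopReason::MaxTokens,
        "content_filter" => StopReason::Refusal,
        other => StopReason::Unmapped(other.to_string()),
    }
}

impl StopReason {
    /// True only for reasons that indicate the model finished normally.
    /// Truncation, refusal and unmapped values are not counted as success.
    pub fn is_clean_completion(&self) -> bool {
        matches!(self, StopReason::EndTurn | StopReason::ToolUse)
    }
}
```

With a typed value like this, a completion classifier could match on the enum rather than comparing against two string literals, so `length` and `content_filter` would no longer be reported as successful completions.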

1 changed file with 3266 additions and 0 deletions

ROADMAP.md

File diff suppressed because one or more lines are too long