Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

[WIP]: Tool hardening fixes#149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Draft
jherr wants to merge2 commits intomain
base:main
Choose a base branch
Loading
fromfix/tools-hardening
Draft

Conversation

@jherr
Copy link
Contributor

@jherrjherr commentedDec 15, 2025
edited by coderabbitaibot
Loading

🎯 Changes

Fixing tool call issues and adding a lot of tests to make sure they are 100% stable and reliable.

✅ Checklist

  • I have followed the steps in theContributing guide.
  • I have tested this code locally withpnpm run test:pr.

🚀 Release Impact

  • This change affects published code, and I have generated achangeset.
  • This change is docs/CI/dev-only (no release).

Summary by CodeRabbit

Release Notes

  • New Features

    • Added support for tool execution approvals, allowing tools to require user consent before running.
    • Enhanced tool orchestration to properly handle sequential and parallel tool execution flows.
    • Improved async handling for client and server tool executions with better state management.
  • Tests

    • Added comprehensive test suites covering approval flows, client tools, server tools, and tool sequencing scenarios.
    • Introduced a test simulator adapter for deterministic tool execution testing.

✏️ Tip: You can customize this high-level summary in your review settings.

@jherrjherr requested a review froma teamDecember 15, 2025 05:31
@coderabbitai
Copy link

coderabbitaibot commentedDec 15, 2025
edited
Loading

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the.coderabbit.yaml file in this repository. To trigger a single review, invoke the@coderabbitai review command.

You can disable this status message by setting thereviews.review_status tofalse in the CodeRabbit configuration file.

Walkthrough

Introduces an approval-based tool execution workflow with async state management. Adds a queuing system for post-stream actions, tracks in-flight tool executions, convertsonToolCall callback to synchronous, and defers continuation logic. Includes comprehensive LLM simulator testing infrastructure and end-to-end test suites for tool execution flows.

Changes

Cohort / File(s)Summary
Core Chat Client Orchestration
packages/typescript/ai-client/src/chat-client.ts
Introduced queuing system (postStreamActions), in-flight tool execution tracking (pendingToolExecutions), and continuation deferral (continuationPending). ChangedonToolCall signature from async to synchronous with internal promise orchestration. AddeddrainPostStreamActions,checkForContinuation, andqueuePostStreamAction helpers. Modified stream lifecycle to wait for pending executions before finalization and drain queued actions in finally blocks.
Type System Extensions
packages/typescript/ai/src/types.ts
Added optionalapproval?: { id: string; approved: boolean } field toToolCall public API to carry user-approval metadata.
Message Conversion & State Collection
packages/typescript/ai/src/message-converters.ts,packages/typescript/ai/src/core/chat.ts
EnhancedcollectClientState to inspect both UIMessage (parts array) and ModelMessage (toolCalls array) formats for approval data. Updated message converters to conditionally include approval data in tool-call entries when approval state is 'approval-responded'.
LLM Simulator Adapter
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts
New comprehensive simulator for deterministic tool-call testing. ExportsLLMSimulatorAdapter class,SimulatorScript/SimulatorToolCall/SimulatorIteration interfaces,createLLMSimulator factory, andSimulatorScripts utility with preset patterns (single/multi-tool, approvals, sequential/parallel). Supports stateless and stateful iteration selection with ID generation.
Adapter Package Configuration
packages/typescript/smoke-tests/adapters/package.json,packages/typescript/smoke-tests/adapters/vitest.config.ts
Added exports field for./src/llm-simulator.ts, test scripts (test,test:watch), andvitest dev dependency. Created Vitest configuration with path alias (@tanstack/ai), globals, Node environment, coverage settings, and exclusion patterns.
Adapter Test Suites
packages/typescript/smoke-tests/adapters/src/tests/tools/*.test.ts
Added six test modules:server-tool.test.ts,client-tool.test.ts,approval.test.ts,multi-tool.test.ts,sequences.test.ts,error-handling.test.ts. Cover server/client tool execution, approvals, parallel/sequential flows, mixed tool types, error propagation, and iteration tracking via simulator.
E2E API & Page Route
packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts,packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx
Added/api/tools-test handler supporting multiple predefined scenarios (text-only, single/multi-tool, approval, sequential/parallel). Defined server tools (get_weather, fetch_data, get_time, delete_file with approval) and client tools (show_notification, display_chart). Added React componentToolsTestPage with scenario selector, event logging, approvals UI, tool call tracking, and integration withuseChat.
E2E Dependencies
packages/typescript/smoke-tests/e2e/package.json
Added@tanstack/tests-adapters (workspace:*) andzod (^4.1.13) dependencies.
E2E Playwright Tests
packages/typescript/smoke-tests/e2e/tests/tools/*.spec.ts
Added four test suites:approval-flow.spec.ts,client-tool.spec.ts,race-conditions.spec.ts,server-client-sequence.spec.ts. Cover approval workflows (single, sequential, parallel), client tool execution, mixed server-client sequences, race conditions, and stress tests. Each includes helpers for scenario selection, completion detection, metadata extraction, and event log parsing.

Sequence Diagram(s)

sequenceDiagram    participant Client as Client/App    participant CC as ChatClient    participant Queue as Action Queue    participant Tool as Tool Executor    participant Stream as Stream Lifecycle    Client->>CC: streamResponse(messages)    CC->>Stream: Initialize stream    Stream->>Stream: Reset pendingToolExecutions        loop During Stream        Stream->>Tool: onToolCall (sync)        Tool->>Tool: Create executionPromise        Tool->>Queue: Store in pendingToolExecutions        Tool->>Tool: Execute async (detached)        Tool-->>Queue: Remove from pendingToolExecutions on completion    end        Stream->>Stream: Stream completes    Stream->>Stream: Wait for all pendingToolExecutions    Stream->>Stream: Check streamCompletedSuccessfully        alt Auto-send criteria met        Stream->>CC: checkForContinuation()        CC->>Queue: queuePostStreamAction(continuation)    end        Stream->>Queue: drainPostStreamActions()    Queue->>Queue: Execute all queued actions    Queue-->>Client: Finally block runs        Note over CC,Queue: Tool result added while loading    Client->>CC: addToolResult(result)    alt Stream in progress        CC->>Queue: queuePostStreamAction(continuation)    else No active stream        CC->>CC: Trigger continuation directly    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

  • packages/typescript/ai-client/src/chat-client.ts: Dense async orchestration logic with state management across multiple new data structures (postStreamActions,pendingToolExecutions,continuationPending), control flow changes toonToolCall, and integration into stream lifecycle. Requires careful analysis of timing, race conditions, and state transitions.
  • Type system changes (types.ts,message-converters.ts,chat.ts): Moderate complexity; requires tracing approval field propagation across message formats.
  • LLM Simulator (llm-simulator.ts): Large new public API surface (~15 exports); comprehensive implementation supporting multiple iteration patterns and tool types. Needs validation of contract consistency across all preset scripts.
  • Adapter & E2E tests: High volume (~6 test modules + 4 E2E specs) with heterogeneous scenarios (server tools, client tools, approvals, race conditions, sequencing). Each test suite has distinct logic and assertions that demand individual scrutiny.
  • Interdependencies: Changes span chat-client → types → message converters → simulator usage in tests; tracing data flow through all layers is essential.

Areas requiring extra attention:

  • Async orchestration guards and edge cases inchat-client.ts (e.g., concurrent streams, pending execution cleanup, continuation re-entrance)
  • Approval state propagation throughcollectClientState andmessage-converters under various message formats
  • LLM simulator contract compliance with allSimulatorScripts presets
  • E2E test reliability under race conditions and timing edge cases

Poem

🐰Queues and tools dance in the sun,
Approvals bloom when work is done,
We hitch the stream with post-stream cheer,
Orchestration whispers, loud and clear,
Async hops through fields so green,
The finest tool-flow ever seen!

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
Check nameStatusExplanationResolution
Title check❓ InconclusiveThe title uses '[WIP]' prefix and is vague ('Tool hardening fixes') without clearly conveying the main changes or their scope.Replace '[WIP]' with a clear, specific title describing the primary change, such as 'Add tool execution queuing and approval support' or similar.
✅ Passed checks (2 passed)
Check nameStatusExplanation
Description check✅ PassedThe PR description addresses the template structure but lacks specific details about what 'tool call issues' were fixed and what improvements the added tests validate.
Docstring Coverage✅ PassedDocstring coverage is 96.77% which is sufficient. The required threshold is 80.00%.

Thanks for usingCodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment@coderabbitai help to get the list of available commands and usage tips.

@jherrjherr marked this pull request as draftDecember 15, 2025 05:32
@nx-cloud
Copy link

nx-cloudbot commentedDec 15, 2025
edited
Loading

View yourCI Pipeline Execution ↗ for commit20511e6

CommandStatusDurationResult
nx affected --targets=test:sherif,test:knip,tes...✅ Succeeded7sView ↗
nx run-many --targets=build --exclude=examples/**✅ Succeeded2sView ↗

☁️Nx Cloud last updated this comment at2025-12-16 17:07:44 UTC

@pkg-pr-new
Copy link

Open in StackBlitz

@tanstack/ai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai@149

@tanstack/ai-anthropic

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-anthropic@149

@tanstack/ai-client

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-client@149

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-devtools-core@149

@tanstack/ai-gemini

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-gemini@149

@tanstack/ai-ollama

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-ollama@149

@tanstack/ai-openai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-openai@149

@tanstack/ai-react

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react@149

@tanstack/ai-react-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react-ui@149

@tanstack/ai-solid

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid@149

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid-ui@149

@tanstack/ai-svelte

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-svelte@149

@tanstack/ai-vue

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue@149

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue-ui@149

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/react-ai-devtools@149

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/solid-ai-devtools@149

commit:3752db0

Copy link

@coderabbitaicoderabbitaibot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts (1)

7-16:DuplicatecollectChunks helper - see earlier comment.

This is the same helper duplicated fromerror-handling.test.ts. Consider extracting to shared test utilities.

packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts (1)

7-16:DuplicatecollectChunks helper - see earlier comment.

Third instance of this helper. Extracting to a shared module would reduce maintenance burden.

🧹 Nitpick comments (16)
packages/typescript/ai-client/src/chat-client.ts (1)

134-169:Consider defensive error message extraction.

The async execution tracking pattern is solid. However, at line 159,error.message assumes the caught value is an Error object. If a non-Error value is thrown, this could result inundefined.

           } catch (error: any) {             await this.addToolResult({               toolCallId: args.toolCallId,               tool: args.toolName,               output: null,               state: 'output-error',-              errorText: error.message,+              errorText: error?.message ?? String(error),             })
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (3)

148-169:Consider edge case: empty content string.

The content streaming splits on spaces, but an empty string'' would result inwords = [''], yielding a single empty delta. This is unlikely to cause issues but worth noting.

// Current behavior with empty string:''.split(' ')// [''] - yields one empty content chunk

If empty content should yield no chunks, add a guard:

-    if (iteration.content) {+    if (iteration.content && iteration.content.length > 0) {

223-231:Stub implementation for structuredOutput.

This is a minimal mock that always returns empty data. This is acceptable for a test simulator, but consider documenting that it's a stub.

 async structuredOutput(   _options: StructuredOutputOptions<Record<string, unknown>>, ): Promise<StructuredOutputResult<unknown>> {-  // Simple mock implementation+  // Stub implementation - always returns empty object+  // Real structured output scenarios should use a different approach   return {     data: {},     rawText: '{}',   } }

379-420:singleServerTool andsingleClientTool are identical.

Both methods produce the exact same script structure. This appears intentional (the distinction is in how the tool is defined, not in the script), but the naming could mislead users into thinking the scripts differ.

Consider either:

  1. Adding a comment clarifying that the script is the same, or
  2. Consolidating into a singlesingleTool method with a comment
 /**  * Script for a single server tool call+ * Note: Script structure is identical to singleClientTool -+ * the server/client distinction is in the tool definition, not the script.  */
packages/typescript/smoke-tests/e2e/tests/tools/client-tool.spec.ts (2)

103-115:Silent JSON parse failure returns empty array.

The empty catch block silently handles parse errors. Consider logging parse failures in debug mode to aid troubleshooting.

     try {       return JSON.parse(el.textContent || '[]')     } catch {+      console.warn('Failed to parse event-log-json:', el.textContent)       return []     }

215-232:Test expectations are intentionally relaxed.

The "triple client sequence" test notes a known issue with the 3rd tool. The comment explains this well. Consider adding a TODO or tracking issue reference.

-    // NOTE: This tests a complex multi-step continuation flow that requires-    // additional investigation. Currently verifying at least 2 tools complete.+    // TODO: Track issue for 3rd tool not triggering in complex continuation flows+    // Currently verifying at least 2 tools complete while this is investigated.
packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts (1)

76-82:Consider using typed chunk filtering.

The chunk type assertions rely on string matching. Consider using type guards for stronger typing.

// Could define a type guard for safer filtering:constisToolResultChunk=(c:StreamChunk):c isToolResultStreamChunk=>c.type==='tool_result'consttoolResultChunks=chunks.filter(isToolResultChunk)
packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts (2)

14-133:Helper functions are duplicated across test files.

TheselectScenario,waitForTestComplete,getMetadata,getEventLog, andgetToolCalls helpers are nearly identical to those inclient-tool.spec.ts. This creates maintenance burden.

Extract shared helpers to a common test utilities file:

// packages/typescript/smoke-tests/e2e/tests/tools/test-helpers.tsimport{typePage}from'@playwright/test'exportasyncfunctionselectScenario(page:Page,scenario:string):Promise<void>{// ... shared implementation}exportasyncfunctionwaitForTestComplete(page:Page,timeout?:number,expectedToolCount?:number,):Promise<void>{// ... shared implementation}// etc.

Then import in both test files:

import{selectScenario,waitForTestComplete,getMetadata}from'./test-helpers'

104-116:Return type includestimestamp field unlike client-tool.spec.ts.

ThegetEventLog return type here includestimestamp: number, whileclient-tool.spec.ts only hastype andtoolName. This inconsistency could cause confusion.

If extracting to shared helpers, standardize the return type to include all fields:

Promise<Array<{type:string;toolName:string;timestamp?:number}>>
packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx (1)

355-410:Non-null assertions ontc.approval!.id! rely on filter guarantees.

ThependingApprovals filter at line 229-231 checkstc.approval?.needsApproval but doesn't explicitly verifytc.approval.id exists. While this is likely safe in practice for a test page, consider optional chaining for defensive coding:

-                if (!respondedApprovals.current.has(tc.approval!.id!)) {-                    respondedApprovals.current.add(tc.approval!.id!)+                const approvalId = tc.approval?.id+                if (approvalId && !respondedApprovals.current.has(approvalId)) {+                    respondedApprovals.current.add(approvalId)
packages/typescript/smoke-tests/e2e/tests/tools/race-conditions.spec.ts (1)

19-47:Consider reducing arbitrary timeouts inselectScenario.

The function uses multiplewaitForTimeout calls (500ms, 200ms, 300ms) that could make tests flaky or slow. While hydration waits are sometimes necessary, consider using more deterministic waits where possible:

 async function selectScenario(page: Page, scenario: string): Promise<void> {-  // Wait for hydration-  await page.waitForTimeout(500)+  // Wait for component to be interactive+  await page.waitForSelector('#scenario-select:not([disabled])', { timeout: 3000 })

However, given this is e2e test infrastructure and React hydration timing can be unpredictable, the current approach is acceptable.

packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts (1)

7-16:Consider extractingcollectChunks to a shared test utility.

This helper function is duplicated across multiple test files (error-handling.test.ts,sequences.test.ts,client-tool.test.ts). Consider extracting it to a shared test utilities module to reduce duplication.

// e.g., packages/typescript/smoke-tests/adapters/src/tests/test-utils.tsexportasyncfunctioncollectChunks<T>(stream:AsyncIterable<T>):Promise<Array<T>>{constchunks:Array<T>=[]forawait(constchunkofstream){chunks.push(chunk)}returnchunks}
packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts (1)

7-16:Consider extractingcollectChunks helper to a shared test utility.

This helper is duplicated across multiple test files (also inapproval.test.ts). Consider extracting it to a shared test utilities module to reduce duplication and ensure consistency.

For example, create a file likepackages/typescript/smoke-tests/adapters/src/test-utils.ts:

exportasyncfunctioncollectChunks<T>(stream:AsyncIterable<T>):Promise<Array<T>>{constchunks:Array<T>=[]forawait(constchunkofstream){chunks.push(chunk)}returnchunks}
packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts (1)

427-489:Abort signal propagation may be incomplete.

TheabortController is created independently but isn't linked to the incomingrequestSignal. If the client aborts the request, the chat stream won't be notified to stop.

Consider linking them:

         const abortController = new AbortController()+        // Link request abort to our controller+        requestSignal?.addEventListener('abort', () => {+          abortController.abort()+        })+         try {

Alternatively, passrequestSignal directly iftoStreamResponse andchat support it.

packages/typescript/smoke-tests/e2e/tests/tools/approval-flow.spec.ts (2)

15-46:Consider using more robust waiting strategies instead of fixed timeouts.

TheselectScenario function uses multiplewaitForTimeout calls (500ms, 200ms, 300ms) which can be flaky in CI environments. While the retry loop andwaitForFunction help mitigate this, the hardcoded delays may cause unnecessary slowness or intermittent failures.

Consider replacing fixed timeouts with condition-based waits where possible:

 async function selectScenario(page: Page, scenario: string): Promise<void> {-  // Wait for hydration-  await page.waitForTimeout(500)+  // Wait for hydration - wait for select to be interactive+  await page.waitForSelector('#scenario-select:not([disabled])', { state: 'visible' })    // Try selecting multiple times   for (let attempt = 0; attempt < 3; attempt++) {     await page.focus('#scenario-select')     await page.selectOption('#scenario-select', scenario)-    await page.waitForTimeout(200)      const currentScenario = await page.evaluate(() =>       document.getElementById('test-metadata')?.getAttribute('data-scenario'),     )     if (currentScenario === scenario) break+    await page.waitForTimeout(100) // Small delay only on retry   }

180-218:Potential race condition in sequential approvals test.

The test waits for the approval section and buttons but the condition on lines 196-201 might pass when the first approval button is still visible after clicking. Consider explicitly waiting for the first approval to be processed before checking for the second.

     // Approve the first one     await page.click('.approve-button')+    // Wait for first approval to be processed (button should disappear)+    await page.waitForFunction(+      () => document.querySelectorAll('.approve-button').length === 0,+      { timeout: 5000 },+    ).catch(() => {}); // May not disappear if second comes quickly+     // Wait for second approval to appear     await page.waitForFunction(       () => {         const section = document.getElementById('approval-section')         const buttons = document.querySelectorAll('.approve-button')-        // Either new approval appeared or section is visible-        return section !== null && buttons.length > 0+        // New approval should have appeared+        return section !== null && buttons.length >= 1       },       { timeout: 10000 },     )
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and betweene011bf8 and3752db0.

⛔ Files ignored due to path filters (1)
  • pnpm-lock.yaml is excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (20)
  • packages/typescript/ai-client/src/chat-client.ts (8 hunks)
  • packages/typescript/ai/src/core/chat.ts (2 hunks)
  • packages/typescript/ai/src/message-converters.ts (1 hunks)
  • packages/typescript/ai/src/types.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/package.json (2 hunks)
  • packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/src/tests/tools/approval.test.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts (1 hunks)
  • packages/typescript/smoke-tests/adapters/vitest.config.ts (1 hunks)
  • packages/typescript/smoke-tests/e2e/package.json (1 hunks)
  • packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts (1 hunks)
  • packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx (1 hunks)
  • packages/typescript/smoke-tests/e2e/tests/tools/approval-flow.spec.ts (1 hunks)
  • packages/typescript/smoke-tests/e2e/tests/tools/client-tool.spec.ts (1 hunks)
  • packages/typescript/smoke-tests/e2e/tests/tools/race-conditions.spec.ts (1 hunks)
  • packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Use tree-shakeable adapter architecture for provider implementations - export specialized adapters (text, embedding, summarize, image) as separate imports from/adapters subpath rather than monolithic adapters
Use Zod for runtime schema validation and type inference, particularly for tool input/output definitions withtoolDefinition() and Zod schema inference
Implement isomorphic tool system usingtoolDefinition() with.server() and.client() implementations for dual-environment execution
Use type-safe per-model configuration with provider options typed based on selected model to ensure compile-time safety
Implement stream processing with StreamProcessor for handling chunked responses and support partial JSON parsing for streaming AI responses

Files:

  • packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts
  • packages/typescript/smoke-tests/adapters/vitest.config.ts
  • packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx
  • packages/typescript/smoke-tests/e2e/tests/tools/client-tool.spec.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/approval.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/approval-flow.spec.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/race-conditions.spec.ts
  • packages/typescript/smoke-tests/adapters/src/llm-simulator.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
  • packages/typescript/ai/src/message-converters.ts
  • packages/typescript/ai/src/core/chat.ts
  • packages/typescript/ai/src/types.ts
  • packages/typescript/ai-client/src/chat-client.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts
**/*.test.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Write unit tests using Vitest alongside source files with.test.ts naming convention

Files:

  • packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/approval.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (CLAUDE.md)

Use camelCase for function and variable names throughout the codebase

Files:

  • packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts
  • packages/typescript/smoke-tests/adapters/vitest.config.ts
  • packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx
  • packages/typescript/smoke-tests/e2e/tests/tools/client-tool.spec.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/approval.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/approval-flow.spec.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/race-conditions.spec.ts
  • packages/typescript/smoke-tests/adapters/src/llm-simulator.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
  • packages/typescript/ai/src/message-converters.ts
  • packages/typescript/ai/src/core/chat.ts
  • packages/typescript/ai/src/types.ts
  • packages/typescript/ai-client/src/chat-client.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts
🧠 Learnings (10)
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to **/*.{ts,tsx} : Implement isomorphic tool system using `toolDefinition()` with `.server()` and `.client()` implementations for dual-environment execution

Applied to files:

  • packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts
  • packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx
  • packages/typescript/smoke-tests/e2e/tests/tools/client-tool.spec.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts
  • packages/typescript/smoke-tests/adapters/package.json
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
  • packages/typescript/ai-client/src/chat-client.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to **/*.{ts,tsx} : Use Zod for runtime schema validation and type inference, particularly for tool input/output definitions with `toolDefinition()` and Zod schema inference

Applied to files:

  • packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts
  • packages/typescript/smoke-tests/adapters/package.json
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to packages/typescript/*/src/adapters/*.ts : Create individual adapter implementations for each provider capability (text, embed, summarize, image) with separate exports to enable tree-shaking

Applied to files:

  • packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts
  • packages/typescript/smoke-tests/e2e/package.json
  • packages/typescript/smoke-tests/adapters/vitest.config.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts
  • packages/typescript/smoke-tests/adapters/package.json
  • packages/typescript/smoke-tests/adapters/src/llm-simulator.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to packages/typescript/*/src/index.ts : Export tree-shakeable adapters with clear subpath exports in package.json (e.g., `tanstack/ai/adapters`, `tanstack/ai-openai/adapters`) to minimize bundle size

Applied to files:

  • packages/typescript/smoke-tests/e2e/package.json
  • packages/typescript/smoke-tests/adapters/vitest.config.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/adapters/package.json
  • packages/typescript/smoke-tests/adapters/src/llm-simulator.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to packages/*/package.json : Use `workspace:*` protocol for internal package dependencies in package.json (e.g., `"tanstack/ai": "workspace:*"`)

Applied to files:

  • packages/typescript/smoke-tests/e2e/package.json
📚 Learning: 2025-12-13T17:09:09.784Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.784ZLearning: Use Nx workspace with affected commands to optimize testing and building only changed packages and their dependents

Applied to files:

  • packages/typescript/smoke-tests/e2e/package.json
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to **/*.test.ts : Write unit tests using Vitest alongside source files with `.test.ts` naming convention

Applied to files:

  • packages/typescript/smoke-tests/e2e/package.json
  • packages/typescript/smoke-tests/adapters/vitest.config.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts
  • packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts
  • packages/typescript/smoke-tests/adapters/package.json
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to **/*.{ts,tsx} : Use tree-shakeable adapter architecture for provider implementations - export specialized adapters (text, embedding, summarize, image) as separate imports from `/adapters` subpath rather than monolithic adapters

Applied to files:

  • packages/typescript/smoke-tests/adapters/vitest.config.ts
  • packages/typescript/smoke-tests/adapters/package.json
  • packages/typescript/smoke-tests/adapters/src/llm-simulator.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts
  • packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to .eslintrc* : Use ESLint with custom TanStack config for linting all TypeScript and JavaScript files

Applied to files:

  • packages/typescript/smoke-tests/adapters/vitest.config.ts
📚 Learning: 2025-12-13T17:09:09.783Z
Learnt from: CRRepo: TanStack/ai PR: 0File: CLAUDE.md:0-0Timestamp: 2025-12-13T17:09:09.783ZLearning: Applies to **/*.{ts,tsx} : Implement stream processing with StreamProcessor for handling chunked responses and support partial JSON parsing for streaming AI responses

Applied to files:

  • packages/typescript/ai-client/src/chat-client.ts
🧬 Code graph analysis (10)
packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx (2)
packages/typescript/ai-client/src/chat-client.ts (3)
  • sendMessage (270-280)
  • stop (402-409)
  • addToolApprovalResponse (456-493)
packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts (1)
  • Route (424-490)
packages/typescript/smoke-tests/e2e/tests/tools/client-tool.spec.ts (2)
packages/typescript/ai-svelte/src/create-chat.svelte.ts (1)
  • isLoading (119-121)
packages/typescript/ai/src/tools/tool-calls.ts (1)
  • getToolCalls (111-115)
packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts (2)
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (3)
  • SimulatorScripts (379-500)
  • createLLMSimulator (366-370)
  • SimulatorScript (35-40)
packages/typescript/ai/src/core/chat.ts (2)
  • chat (80-107)
  • chat (750-780)
packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts (2)
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (2)
  • SimulatorScript (35-40)
  • createLLMSimulator (366-370)
packages/typescript/ai/src/core/chat.ts (2)
  • chat (80-107)
  • chat (750-780)
packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts (1)
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (4)
  • SimulatorScripts (379-500)
  • createLLMSimulator (366-370)
  • SimulatorScript (35-40)
  • approvalTool (425-441)
packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts (1)
packages/typescript/ai/src/tools/tool-calls.ts (1)
  • getToolCalls (111-115)
packages/typescript/smoke-tests/e2e/tests/tools/race-conditions.spec.ts (1)
packages/typescript/ai-svelte/src/create-chat.svelte.ts (1)
  • isLoading (119-121)
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (1)
packages/typescript/ai/src/types.ts (2)
  • ChatOptions (553-587)
  • StreamChunk (677-685)
packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts (2)
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (2)
  • SimulatorScript (35-40)
  • createLLMSimulator (366-370)
packages/typescript/ai/src/core/chat.ts (2)
  • chat (80-107)
  • chat (750-780)
packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts (2)
packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (3)
  • SimulatorScripts (379-500)
  • createLLMSimulator (366-370)
  • SimulatorScript (35-40)
packages/typescript/ai/src/core/chat.ts (2)
  • chat (80-107)
  • chat (750-780)
🪛 ast-grep (0.40.0)
packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx

[warning] 591-591: Usage of dangerouslySetInnerHTML detected. This bypasses React's built-in XSS protection. Always sanitize HTML content using libraries like DOMPurify before injecting it into the DOM to prevent XSS attacks.
Context: dangerouslySetInnerHTML
Note: [CWE-79] Improper Neutralization of Input During Web Page Generation [REFERENCES]
-https://reactjs.org/docs/dom-elements.html#dangerouslysetinnerhtml
-https://cwe.mitre.org/data/definitions/79.html

(react-unsafe-html-injection)


[warning] 598-598: Usage of dangerouslySetInnerHTML detected. This bypasses React's built-in XSS protection. Always sanitize HTML content using libraries like DOMPurify before injecting it into the DOM to prevent XSS attacks.
Context: dangerouslySetInnerHTML
Note: [CWE-79] Improper Neutralization of Input During Web Page Generation [REFERENCES]
-https://reactjs.org/docs/dom-elements.html#dangerouslysetinnerhtml
-https://cwe.mitre.org/data/definitions/79.html

(react-unsafe-html-injection)

🪛 Biome (2.1.2)
packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx

[error] 592-592: Avoid passing content using the dangerouslySetInnerHTML prop.

Setting content using code can expose users to cross-site scripting (XSS) attacks

(lint/security/noDangerouslySetInnerHtml)


[error] 599-600: Avoid passing content using the dangerouslySetInnerHTML prop.

Setting content using code can expose users to cross-site scripting (XSS) attacks

(lint/security/noDangerouslySetInnerHtml)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Preview
  • GitHub Check: Test
🔇 Additional comments (70)
packages/typescript/smoke-tests/e2e/package.json (3)

25-25:Correct use of workspace protocol for internal package.

The@tanstack/tests-adapters dependency correctly uses theworkspace:* protocol for internal monorepo dependencies, aligning with established practices.


29-29:Verify vite-tsconfig-paths configuration.

Thevite-tsconfig-paths dependency is newly added. Confirm that it's configured in the Vite config file (vite.config.ts) and is actually needed for the e2e tests, particularly if TypeScript path aliases are used in test files.


25-25:@tanstack/tests-adapters is properly configured and zod versions are consistent across core packages.

The dependency is correctly defined in the workspace and usesworkspace:* as expected. Zod versions are aligned at^4.1.13 across main packages including this smoke test, tests-adapters, and other core AI packages. Minor version variations exist in test-specific packages (ai-anthropic-live-tests uses^3.24.1) and the root ai package uses a range (^3.0.0 || ^4.0.0), which is acceptable for those contexts.

packages/typescript/ai/src/types.ts (1)

69-73:LGTM! Clean extension of ToolCall for approval data.

The optionalapproval field provides a backward-compatible way to carry approval decisions back to the server. The simpler structure (compared toToolCallPart.approval) appropriately contains only what's needed in the response payload.

packages/typescript/smoke-tests/adapters/vitest.config.ts (1)

1-26:LGTM! Well-structured Vitest configuration for the smoke-tests package.

The path alias, coverage settings, and exclusion patterns are appropriate. Based on learnings, this follows the project's convention for test file naming (.test.ts).

packages/typescript/ai/src/message-converters.ts (1)

110-117:LGTM! Correctly propagates approval data for responded tool calls.

The conditional spread ensures approval metadata is only included when the tool call has been explicitly approved or denied (approval-responded state with a definedapproved value). This aligns with the newToolCall.approval type.

packages/typescript/ai/src/core/chat.ts (1)

517-546:LGTM! Enhanced approval collection supports both message formats.

The dual-format handling (UIMessage withparts array and ModelMessage withtoolCalls array) provides necessary flexibility. Note that both code paths can potentially add to the sameapprovals Map for the same message if it happens to have both structures, but this is harmless since the approval decision would be the same value.

packages/typescript/ai-client/src/chat-client.ts (5)

29-33:LGTM! Appropriate state variables for async tool execution tracking.

ThependingToolExecutions Map keyed bytoolCallId enables proper lifecycle management, and thecontinuationPending flag prevents duplicate continuation attempts during action draining.


238-242:LGTM! Proper synchronization before stream finalization.

Awaiting all pending tool executions ensures client tools complete before the stream is finalized and continuation checks occur. The outer try/catch inprocessStream will handle any rejections.


310-373:LGTM! Well-structured stream lifecycle with proper guards and continuation logic.

The concurrent stream guard, state reset, and success flag pattern correctly handle the stream lifecycle. Thefinally block order is intentional:setIsLoading(false) beforedrainPostStreamActions allows queued actions to trigger continuation without being blocked by the loading check.


444-450:LGTM! Proper deferral of continuation when stream is in progress.

Queueing the continuation check whenisLoading is true ensures we don't attempt concurrent streams, and the action will be executed after the current stream completes.


498-529:LGTM! Robust action queue and continuation helpers.

The sequential draining and dual-guard (continuationPending andisLoading) incheckForContinuation correctly prevent concurrent continuation attempts. Thefinally block ensures the flag is always reset.

packages/typescript/smoke-tests/adapters/src/llm-simulator.ts (5)

1-40:Well-structured type definitions for the simulator.

The interfaces are clear and well-documented. TheSimulatorIteration interface properly captures the possible LLM turn outcomes (content, tool calls, finish reasons).


67-94:Clean adapter class design following the established pattern.

The adapter correctly implements the expected structure withkind,name, andmodels properties. Thereset() method provides a way to reuse the adapter instance across tests.


95-130:Complex iteration detection logic with good documentation.

The stateless operation logic for E2E tests is well thought out, enabling both stateful (unit tests) and stateless (E2E with fresh adapter instances) operation modes. The comments clearly explain the two modes.


172-194:Tool call streaming looks correct.

Each tool call gets a unique ID (either from script or auto-generated), and the chunk structure matches the expectedToolCallStreamChunk format with properindex values for parallel tools.


363-370:Factory function is clean and straightforward.

ThecreateLLMSimulator factory provides a clean API for creating adapter instances.

packages/typescript/smoke-tests/e2e/tests/tools/client-tool.spec.ts (3)

16-44:Robust scenario selection with retry logic.

TheselectScenario helper handles React hydration issues well with retries and verification. The timeouts are reasonable for test stability.


50-77:Flexible completion detection with dual conditions.

ThewaitForTestComplete helper smartly checks both thetestComplete flag and tool counts, which provides resilience against race conditions where the flag might not be set.


253-271:Good debugging support on test failure.

Capturing screenshots and logging events/metadata on failure will significantly help with debugging flaky E2E tests.

packages/typescript/smoke-tests/adapters/src/tests/tools/multi-tool.test.ts (5)

1-16:Proper imports and reusable helper.

ThecollectChunks helper is a clean utility for consuming async iterables in tests. Follows Vitest conventions as per coding guidelines.


19-83:Comprehensive parallel tool execution test.

The test properly verifies that all three server tools execute, and validates both tool call and result chunk counts. Good use ofvi.fn() for mocking.


85-154:Good coverage of mixed tool types.

This test exercises server, approval, and client tools simultaneously, verifying the correct behavior for each type:

  • Server tool executes immediately
  • Approval tool waits for approval
  • Client tool emitstool-input-available

Well-structured test usingtoolDefinition() with Zod schemas as per coding guidelines.


157-213:Solid test for repeated tool invocations with different arguments.

The test correctly verifies that the same tool can be called multiple times with different arguments, and that each call receives the correct city-specific data.


215-267:Clean tool selection test.

Verifies that only the requested tool executes when multiple tools are available. Good negative assertions for non-called tools.

packages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.ts (2)

197-209:Good sequence verification using timestamps.

The test properly verifies the order of tool executions by comparing timestamps. The conditional check handles cases where timestamps might not be present.


263-284:Text-only test uses different completion strategy.

This test waits forisLoading to become false rather than usingwaitForTestComplete. This is appropriate since there are no tools to count, but the additionalwaitForTimeout(500) seems like a workaround.

packages/typescript/smoke-tests/adapters/package.json (3)

17-18:Test scripts added correctly.

Thetest andtest:watch scripts follow standard Vitest patterns.


34-34: No issues found. The vitest version^4.0.14 is consistent across the entire monorepo, including the root package.json and all other packages. The version aligns properly with the project's testing setup.


9-14:Exports field correctly exposes the simulator for cross-package imports using .ts files.

This is appropriate for a private test package. The TypeScript configuration supports direct.ts imports viaallowImportingTsExtensions, and the e2e test package successfully imports from this export. No built output is needed since the package is private and used internally during development.

packages/typescript/smoke-tests/e2e/src/routes/tools-test.tsx (5)

1-6:LGTM!

Imports are appropriate for the test page functionality. Uses the recommended patterns:toolDefinition from@tanstack/ai, Zod for schema validation, and React hooks as expected. Based on learnings, this correctly implements the isomorphic tool system with client implementations.


27-97:LGTM!

ThecreateTrackedTools function correctly implements the isomorphic tool system usingtoolDefinition() with.client() implementations. Input and output schemas use Zod as per coding guidelines. The event tracking pattern is well-suited for e2e test observability.


156-157:LGTM!

TheuseRef pattern ensures tools are created once with the stableaddEvent callback. SinceaddEvent is memoized with an empty dependency array, this approach correctly avoids unnecessary re-creation of tools on re-renders.


172-198:LGTM!

The completion detection logic appropriately handles multiple completion states (complete,output-available, or having output/result). The dependency array is correct, and the early return guard at line 173 prevents premature completion detection.


589-609:Static analysis warning is a false positive in this context.

ThedangerouslySetInnerHTML usage here serializes application-controlled state (toolEvents,toolCalls) that doesn't contain unsanitized user input. The JSON data originates from internal tool execution tracking and is safe. This pattern of embedding JSON in<script type="application/json"> tags for test harnesses is appropriate for e2e test infrastructure.

packages/typescript/smoke-tests/e2e/tests/tools/race-conditions.spec.ts (4)

53-80:LGTM!

The completion detection logic is well-designed with two conditions (explicit completion flag OR sufficient completed tools without loading). The trailing 200ms buffer for React state propagation is a pragmatic approach for e2e tests.


143-180:LGTM!

This test effectively validates the specific bug where sequential client tool calls could block each other. The execution order verification via event log timestamps and the duration upper bound (< 10s) are good regression guards against blocking behavior.


246-276:LGTM!

Smart handling of the parallel approvals scenario - capturing button IDs before clicking prevents issues with React re-renders removing buttons from the DOM. The brief wait between clicks is appropriate for test stability.


399-414:LGTM!

The failure hook provides excellent debugging support with screenshots and state dumps. This will be valuable for investigating flaky tests or race condition failures.

packages/typescript/smoke-tests/adapters/src/tests/tools/error-handling.test.ts (5)

19-103:LGTM!

Comprehensive error handling tests covering both synchronous throws and async rejections. The assertions correctly verify that error details propagate totool_result chunks with the expected error messages.


106-145:LGTM!

Good coverage of the edge case where the LLM calls a tool that isn't registered. The test verifies graceful error handling rather than crashes.


147-183:LGTM!

Correctly tests the scenario where a tool definition has no execute function, verifying thattool-input-available is emitted for client-side handling. This aligns with the isomorphic tool system pattern.


224-268:LGTM!

Effective test for themaxIterations strategy. Verifies that the agent loop terminates at the specified limit even when the script would continue indefinitely, and confirms proper completion signaling viadone chunks.


270-349:LGTM!

Good coverage of the auto-stringification behavior for non-string tool returns. Tests verify that objects are properly JSON-serialized and primitives like numbers are stringified correctly.

packages/typescript/smoke-tests/adapters/src/tests/tools/sequences.test.ts (5)

19-133:LGTM!

Thorough testing of server tool sequencing with proper call order verification usingmock.invocationCallOrder. The assertions correctly verify both execution count and order.


135-194:LGTM!

Excellent test of the server→client tool sequence pattern. Correctly verifies that server tools producetool_result chunks while client tools (without execute) producetool-input-available chunks for client-side handling.


196-276:LGTM!

Well-designed test for client→server continuation flow. The message structure correctly simulates a scenario where a client tool result has been injected, allowing verification that the server tool executes as expected in the continuation.


278-347:LGTM!

Clean implementation of the three-tool sequence test using a shared array to track execution order. ThetoEqual(['A', 'B', 'C']) assertion provides strict order verification.


349-418:LGTM!

Effective test for the parallel→sequential tool pattern. TheindexOf comparisons correctly verify that both parallel tools (A, B) complete before the sequential tool (combine) executes.

packages/typescript/smoke-tests/adapters/src/tests/tools/client-tool.test.ts (4)

19-109:LGTM!

Comprehensive testing of client tools without execute functions. ThegetCurrentIteration() check is a clever way to verify the stream paused waiting for client input rather than advancing to the next iteration.


111-154:LGTM!

Good test demonstrating that client tools with execute functions behave similarly to server tools - they auto-execute and producetool_result chunks. The output schema with Zod provides type safety for the returned data.


156-223:LGTM!

Well-structured test for the message injection pattern. The message structure withparts containing the complete tool call correctly simulates a client that has already provided a tool result, allowing verification of the continuation flow.


225-278:LGTM!

Effective test of mixed client tool behavior where one tool has an execute function and one doesn't. The assertions correctly verify that only the tool without execute emitstool-input-available.

packages/typescript/smoke-tests/adapters/src/tests/tools/approval.test.ts (6)

1-16:LGTM! Well-structured test file with proper imports and helper.

The imports are appropriate for the test suite, and the genericcollectChunks helper is cleanly implemented for collecting async iterable chunks.


18-111:LGTM! Comprehensive approval-requested test coverage.

The tests properly verify:

  1. Approval-requested chunks are emitted for tools withneedsApproval: true
  2. Tool execution is deferred until approval is granted
  3. Iteration stops when approval is pending

Theas any casts on chunk types are acceptable for test assertions.


113-187:LGTM! Good test for approval acceptance flow.

The test correctly verifies that when an approval is pre-granted in the message history (viaparts withapproved: true), the tool executes and produces atool_result chunk.


189-259:LGTM! Denial flow properly tested.

The test correctly verifies that when approval is denied (approved: false), the tool is not executed and the LLM produces a content response instead.


261-316:LGTM! Multiple approval tools test is well-designed.

The test verifies that when multiple tools requiring approval are called in the same iteration, each emits anapproval-requested chunk and neither executes until approved.


318-380:LGTM! Critical edge case for mixed approval scenarios.

This test is particularly valuable as it verifies that non-approval tools execute immediately while approval-requiring tools wait, even when both are called in the same iteration.

packages/typescript/smoke-tests/adapters/src/tests/tools/server-tool.test.ts (4)

18-143:LGTM! Comprehensive server tool execution tests.

Good coverage of:

  • Basic tool execution with simple arguments
  • Complex nested argument schemas with Zod validation
  • Proper tool call and result chunk generation

The tests follow the coding guidelines by using Zod for runtime schema validation.


145-227:LGTM! Good coverage for different tool return types.

Tests properly verify:

  • Tools returning JSON objects withoutputSchema
  • Tools returning plain objects (framework handles stringification)
  • Result content contains expected values

229-299:LGTM! Important ID tracking verification.

These tests ensure proper correlation betweentool_call andtool_result chunks via IDs, which is critical for the agent loop to function correctly. The auto-generated ID format test (/^call-\d+-\d+$/) validates the fallback behavior.


301-343:LGTM! Good test for combined content and tool call handling.

This test verifies that content and tool calls can coexist in the same iteration, which is important for realistic LLM response patterns.

packages/typescript/smoke-tests/e2e/src/routes/api.tools-test.ts (3)

14-339:LGTM! Well-structured test scenario definitions.

The scenarios provide comprehensive coverage for:

  • Basic flows (text-only, single server/client tools)
  • Approval workflows
  • Sequential and parallel tool executions
  • Mixed server/client sequences
  • Race condition scenarios

The scenario structure aligns well with the LLM simulator'sSimulatorScript interface.


341-422:LGTM! Tool definitions follow best practices.

The tool definitions properly:

  • Use Zod for runtime schema validation as per coding guidelines
  • Implement the isomorphic tool system with.server() and.client() implementations
  • IncludeneedsApproval: true for thedelete_file tool to test approval flows

492-553:LGTM! Clean scenario-to-tools mapping.

The switch statement clearly maps each scenario to its required tools. The default case returning an empty array is consistent with thetext-only scenario behavior.

packages/typescript/smoke-tests/e2e/tests/tools/approval-flow.spec.ts (4)

48-125:LGTM! Well-designed test helper functions.

The helpers provide good abstraction for:

  • waitForApproval: Simple selector-based wait
  • waitForTestComplete: Multi-condition wait with good fallback logic
  • getMetadata/getEventLog: Clean data extraction from DOM

The 200ms buffer inwaitForTestComplete (line 85) is a pragmatic approach for React state settlement.


127-178:LGTM! Single approval flow tests are well-structured.

Both approve and deny cases properly:

  • Select the scenario
  • Wait for approval UI
  • Perform the action
  • Verify metadata counts

The 500ms wait on line 172 for denial processing is acceptable as denial may not settestComplete immediately.


220-314:LGTM! Parallel approvals and mixed flow tests are robust.

The parallel approvals test (lines 220-253) correctly:

  • Captures button IDs before clicking to handle DOM re-renders
  • Verifies both approvals are granted

The mixed flow tests properly verify event log entries for both approval and client tool execution.


316-332:LGTM! Good debugging support for test failures.

TheafterEach hook provides valuable debugging information:

  • Full-page screenshots with descriptive filenames
  • Console output of metadata and events

This will be helpful for diagnosing flaky tests in CI.

Comment on lines +273 to +360
private determineIterationFromMessages(
messages: Array<{
role: string
toolCalls?: Array<{
id: string
approval?: { id: string; approved: boolean }
}>
toolCallId?: string
parts?: Array<{
type: string
id?: string
output?: any
approval?: { approved?: boolean }
}>
}>,
): number | null {
if (!messages || messages.length === 0) {
return 0 // Fresh conversation, start at iteration 0
}

// Find all assistant messages with tool calls
const assistantToolCallMessages = messages.filter(
(m) => m.role === 'assistant' && m.toolCalls && m.toolCalls.length > 0,
)

if (assistantToolCallMessages.length === 0) {
// No tool calls in history, might be first iteration or continuation
// Check if there's a user message (fresh start)
const hasUserMessage = messages.some((m) => m.role === 'user')
return hasUserMessage ? 0 : null
}

// Get all completed tool call IDs from:
// 1. Separate tool result messages (role: 'tool')
// 2. Parts array with output set (client tool results) - UIMessage format
// 3. Parts array with approval.approved set (approval responses) - UIMessage format
// 4. toolCalls array with approval.approved set (approval responses) - ModelMessage format
const completedToolIds = new Set<string>()

for (const msg of messages) {
// Check for role: 'tool' messages (server tool results)
if (msg.role === 'tool' && msg.toolCallId) {
completedToolIds.add(msg.toolCallId)
}

// Check for UIMessage format: parts with output or approval responses
if (msg.parts) {
for (const part of msg.parts) {
if (part.type === 'tool-call' && part.id) {
// Client tool results have output set
if (part.output !== undefined) {
completedToolIds.add(part.id)
}
// Approval tools are complete when approval.approved is set (true or false)
if (part.approval?.approved !== undefined) {
completedToolIds.add(part.id)
}
}
}
}

// Check for ModelMessage format: toolCalls with approval info
if (msg.role === 'assistant' && msg.toolCalls) {
for (const tc of msg.toolCalls) {
// Approval tools are complete when approval.approved is set
if (tc.approval?.approved !== undefined) {
completedToolIds.add(tc.id)
}
}
}
}

// Count how many complete tool call rounds we have
let completedRounds = 0
for (const assistantMsg of assistantToolCallMessages) {
const toolCalls = assistantMsg.toolCalls as Array<{ id: string }>
const allToolsComplete = toolCalls.every((tc) =>
completedToolIds.has(tc.id),
)
if (allToolsComplete) {
completedRounds++
}
}

// The next iteration is completedRounds (0-indexed)
// e.g., if we've completed 1 round, we're on iteration 1
return completedRounds
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

⚠️ Potential issue |🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash# Find ModelMessage and UIMessage type definitionsecho"=== Searching for ModelMessage type definition ==="rg -n"type ModelMessage|interface ModelMessage" --type=ts -A 20| head -100echo -e"\n=== Searching for UIMessage type definition ==="rg -n"type UIMessage|interface UIMessage" --type=ts -A 20| head -100echo -e"\n=== Searching for toolCalls in message types ==="rg -n"toolCalls" --type=ts -B 2 -A 2| head -80

Repository: TanStack/ai

Length of output: 13022


🏁 Script executed:

#!/bin/bash# Find MessagePart type definition to understand tool-call partsecho"=== Searching for MessagePart type definition ==="rg -n"type MessagePart|interface MessagePart" --type=ts -A 30| head -120# Check for ToolCall definitionecho -e"\n=== Searching for ToolCall type definition ==="rg -n"type ToolCall|interface ToolCall" --type=ts -A 15| head -80# Check the actual llm-simulator.ts file to see imports and contextecho -e"\n=== Checking llm-simulator.ts imports and context ==="head -50 packages/typescript/smoke-tests/adapters/src/llm-simulator.ts

Repository: TanStack/ai

Length of output: 13699


🏁 Script executed:

#!/bin/bash# Get complete ToolCallPart and ToolResultPart definitionsecho"=== Complete ToolCallPart definition ==="rg -n"export interface ToolCallPart" --type=ts -A 20| head -50echo -e"\n=== Complete ToolResultPart definition ==="rg -n"export interface ToolResultPart" --type=ts -A 10| head -40# Check if there's a different message type being used in llm-simulatorecho -e"\n=== Search for message types in llm-simulator context ==="grep -n"messages\|Message" packages/typescript/smoke-tests/adapters/src/llm-simulator.ts| head -40

Repository: TanStack/ai

Length of output: 5228


🏁 Script executed:

#!/bin/bash# Check how determineIterationFromMessages is called and what messages are passedecho"=== Checking method calls and context ==="grep -B 5 -A 5"determineIterationFromMessages" packages/typescript/smoke-tests/adapters/src/llm-simulator.ts| head -80# Check the options type to see what message format is expectedecho -e"\n=== Checking options and message types ==="grep -n"options\." packages/typescript/smoke-tests/adapters/src/llm-simulator.ts| grep -E"messages|options"| head -20

Repository: TanStack/ai

Length of output: 1246


🏁 Script executed:

#!/bin/bash# Find the ChatOptions interface to see what message types are expectedecho"=== ChatOptions in llm-simulator ==="rg -n"ChatOptions" packages/typescript/smoke-tests/adapters/src/llm-simulator.ts -B 2 -A 2# Find the actual ChatOptions type definitionecho -e"\n=== ChatOptions type definition ==="rg -n"interface ChatOptions|type ChatOptions" --type=ts -A 30| head -100# Check what the adapter signature expectsecho -e"\n=== Check adapter method signature ==="grep -n"adapter.*(" packages/typescript/smoke-tests/adapters/src/llm-simulator.ts| head -20</parameter>

Repository: TanStack/ai

Length of output: 3072


🏁 Script executed:

#!/bin/bash# Check if there's any documentation or usage showing what message types are actually passedecho"=== Checking how messages are used in llm-simulator ==="sed -n'273,360p' packages/typescript/smoke-tests/adapters/src/llm-simulator.ts# Also check if there are tests that show actual usageecho -e"\n=== Check test usage ==="find packages/typescript/smoke-tests -name"*.test.*" -o -name"*.spec.*"| head -5</parameter>

Repository: TanStack/ai

Length of output: 3705


🏁 Script executed:

#!/bin/bash# Check for all calls to determineIterationFromMessagesecho"=== All calls to determineIterationFromMessages ==="rg -n"determineIterationFromMessages" packages/typescript/smoke-tests/adapters/src/llm-simulator.ts# Check if messages could come from different sources with different typesecho -e"\n=== Check other adapters or similar patterns ==="find packages/typescript/smoke-tests/adapters/src -name"*.ts" -type f| xargs grep -l"messages"| head -5# Verify the actual usage contextecho -e"\n=== Check full context around line 108 ==="sed -n'100,130p' packages/typescript/smoke-tests/adapters/src/llm-simulator.ts

Repository: TanStack/ai

Length of output: 1938


🏁 Script executed:

#!/bin/bash# Get the complete ToolCall type definition from types.tsecho"=== ToolCall type in ModelMessage ==="sed -n'62,74p' packages/typescript/ai/src/types.ts# Compare with what the method expectsecho -e"\n=== What the method expects vs actual ToolCall ==="echo"Method expects: toolCalls with approval?.approved"echo"Actual ToolCall has:"rg -n"export interface ToolCall" --type=ts -A 12| grep -A 12"packages/typescript/ai/src"

Repository: TanStack/ai

Length of output: 3808


🏁 Script executed:

#!/bin/bash# Check for camelCase compliance in the methodecho"=== Checking variable/function names in determineIterationFromMessages ==="sed -n'273,360p' packages/typescript/smoke-tests/adapters/src/llm-simulator.ts| grep -E"^\s*(const|let|var|for|function)"| head -30

Repository: TanStack/ai

Length of output: 631


Remove unreachable code that checks for UIMessage format.

The method only receivesArray<ModelMessage> perChatOptions, which doesn't have aparts property. Lines 318-330 that checkmsg.parts for UIMessage format are dead code—ModelMessage hastoolCalls (notparts), andUIMessage hasparts (but is never passed here). The logic for handlingModelMessage format withtoolCalls andapproval.approved is correct, but the defensive UIMessage checks should be removed to avoid confusion about supported message types.

🤖 Prompt for AI Agents
In packages/typescript/smoke-tests/adapters/src/llm-simulator.ts around lines273-360 (specifically remove the unreachable UIMessage handling at ~lines318-330), remove the block that checks msg.parts and iterates parts for type'tool-call' (the UIMessage format) because this method only receivesModelMessage instances per ChatOptions; keep the remaining logic that handlesrole === 'tool' and assistant msg.toolCalls approvals so completedToolIds isstill populated, and delete any now-unused variables or comments tied only tothe removed block.

Comment on lines +136 to +139
test.beforeEach(async ({ page }) => {
await page.goto('/tools-test')
await page.waitForSelector('#run-test-button')
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

⚠️ Potential issue |🟡 Minor

Missing:not([disabled]) check in beforeEach.

Unlikeclient-tool.spec.ts which waits for#run-test-button:not([disabled]), this file only waits for#run-test-button. This could cause test flakiness if the button is initially disabled.

 test.beforeEach(async ({ page }) => {   await page.goto('/tools-test')-  await page.waitForSelector('#run-test-button')+  await page.waitForSelector('#run-test-button:not([disabled])', {+    timeout: 10000,+  }) })
📝 Committable suggestion

‼️IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
test.beforeEach(async({ page})=>{
awaitpage.goto('/tools-test')
awaitpage.waitForSelector('#run-test-button')
})
test.beforeEach(async({ page})=>{
awaitpage.goto('/tools-test')
awaitpage.waitForSelector('#run-test-button:not([disabled])',{
timeout:10000,
})
})
🤖 Prompt for AI Agents
Inpackages/typescript/smoke-tests/e2e/tests/tools/server-client-sequence.spec.tsaround lines 136-139, the beforeEach uses awaitpage.waitForSelector('#run-test-button') which can pass while the button isstill disabled; change the wait to target the enabled button (e.g. awaitpage.waitForSelector('#run-test-button:not([disabled])') or otherwise wait untilthe element is not disabled) so the test only proceeds when the run button isinteractive.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment

Reviewers

@coderabbitaicoderabbitai[bot]coderabbitai[bot] left review comments

@AlemTuzlakAlemTuzlakAlemTuzlak approved these changes

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

2 participants

@jherr@AlemTuzlak

[8]ページ先頭

©2009-2025 Movatter.jp