With tools that produce long output (for example, generating a 500-word article), users wait a long time if the response is not streamed.
- Understand tool streaming: InputJsonEvent and partial_json
- Distinguish default (validated) from fine-grained (raw) streaming
- Know when to use fine-grained streaming
- Handle invalid JSON when using fine-grained streaming

Basic tool streaming

With streaming enabled (stream=True, or the SDK's messages.stream helper) and tools in the request, the tool arguments arrive as InputJsonEvent events:

    # Assumes client, model, messages and save_article_schema are already defined.
    with client.messages.stream(
        model=model,
        max_tokens=1000,
        messages=messages,
        tools=[save_article_schema],
    ) as stream:
        for event in stream:
            if event.type == "content_block_start":
                if event.content_block.type == "tool_use":
                    print(f"\n→ Calling tool: {event.content_block.name}")
            elif event.type == "input_json":
                # Partial JSON chunk of the tool input
                print(event.partial_json, end="", flush=True)
            elif event.type == "content_block_stop":
                pass  # this content block is complete

InputJsonEvent properties:
- partial_json: the new JSON chunk
- snapshot: the cumulative input so far

Default behavior: validated streaming
By default, Anthropic buffers the JSON chunks, validates each complete top-level key-value pair, and only then sends it.
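
The buffering described above can be imitated offline to see why the output arrives in bursts. The sketch below is only an illustration under stated assumptions: `buffer_top_level_pairs` is a hypothetical helper, the chunk split points are made up, and it performs no schema validation. It is not how the server is implemented.

```python
from typing import Iterable, Iterator


def buffer_top_level_pairs(chunks: Iterable[str]) -> Iterator[str]:
    """Release buffered text only when a top-level key-value pair is complete."""
    buffer = ""
    depth = 0          # current nesting depth of {} and []
    in_string = False  # True while inside a JSON string literal
    escaped = False    # True right after a backslash inside a string
    for chunk in chunks:
        for ch in chunk:
            buffer += ch
            if in_string:
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
                continue
            if ch == '"':
                in_string = True
            elif ch in "{[":
                depth += 1
            elif ch in "}]":
                depth -= 1
                if depth == 0:  # the whole input object just closed
                    yield buffer
                    buffer = ""
            elif ch == "," and depth == 1:  # a top-level pair just ended
                yield buffer
                buffer = ""


# Raw chunks as the model might produce them (made-up split points).
raw_chunks = [
    '{"abstract": "This ',
    'paper, in short", "meta": {"word_',
    'count": 847, "review": "ok"}}',
]
bursts = list(buffer_top_level_pairs(raw_chunks))
for burst in bursts:
    print(repr(burst))
```

Three raw chunks go in and two bursts come out, one per top-level key. The comma inside the abstract string and the comma inside the nested "meta" object do not trigger a flush, because only separators at depth 1 outside a string count.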

Example

The tool input schema expects:

    {
      "abstract": "...",
      "meta": {
        "word_count": 847,
        "review": "..."
      }
    }

Flow:
- Claude generates "abstract": "This paper"... (incomplete)
- Anthropic buffers the chunks
- Claude completes the "abstract" value
- Anthropic validates it against the schema → OK
- All abstract chunks are sent together
- Claude generates "meta": {...} → validate → send

Observation

The user sees pause → burst → pause → burst. It is not as smooth as text streaming.

Fine-grained tool calling

Flag: anthropic-beta: fine-grained-tool-streaming-2025-05-14 (check the docs for the latest version).
Alternatively, a parameter such as fine_grained=True may be available, depending on the SDK version.

Effect
- Disables server-side validation
- Chunks are sent immediately as Claude generates them
- True streaming behavior

Trade-off
The JSON may be invalid when you receive it, so you have to handle that yourself.

When to use fine-grained?

✅ Use when
- Real-time progress UI (the user sees a "typing..." effect)
- Long tool args (500+ words in one field)
- You want to start processing partial data before completion

❌ Don't use when
- Short tool args (validation overhead is minimal)
- Strict JSON is required downstream
- You don't want to handle invalid JSON
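
One possible way to start processing partial data before the tool call completes is to close any open strings and brackets in the text received so far, then try to parse the result. The sketch below uses only the standard library. `close_partial_json` and `preview` are hypothetical helper names, and the approach is best-effort: it returns None whenever the fragment cannot be repaired this simply.

```python
import json
from typing import Any, Optional


def close_partial_json(fragment: str) -> str:
    """Append the quotes and brackets needed to close a truncated JSON fragment."""
    closers = []       # stack of closing brackets still owed
    in_string = False  # True while inside a JSON string literal
    escaped = False    # True right after a backslash inside a string
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "{":
            closers.append("}")
        elif ch == "[":
            closers.append("]")
        elif ch in "}]" and closers:
            closers.pop()
    closed = fragment
    if in_string:
        if escaped:
            closed = closed[:-1]  # drop a dangling backslash
        closed += '"'
    return closed + "".join(reversed(closers))


def preview(fragment: str) -> Optional[Any]:
    """Best-effort parse of a partial tool input; None if it cannot be repaired."""
    try:
        return json.loads(close_partial_json(fragment))
    except json.JSONDecodeError:
        return None


print(preview('{"abstract": "This paper pres'))
print(preview('{"abstract": "x", "me'))
```

A fragment cut inside a string value yields a usable preview, while one cut inside a key or right after a comma yields None. A number cut mid-way (for example 84 out of 847) still parses, so previewed values should be treated as provisional until the block completes.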
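
Code example

    import json

    # Assumes client, model, messages and save_article_schema are already defined.
    with client.messages.stream(
        model=model,
        max_tokens=2000,
        messages=messages,
        tools=[save_article_schema],
        extra_headers={
            "anthropic-beta": "fine-grained-tool-streaming-2025-05-14"
        },
    ) as stream:
        raw_args = ""      # raw JSON text received so far
        current_args = {}  # last successfully parsed input
        for event in stream:
            if event.type == "input_json":
                raw_args += event.partial_json
                print(f"\rProgress: {len(raw_args)} chars", end="")
                # The text may be incomplete or invalid (e.g., "word_count": undefined).
                # Parse the accumulated text rather than event.snapshot: in recent SDK
                # versions snapshot is an already-parsed object, not a JSON string
                # (check your SDK version).
                try:
                    current_args = json.loads(raw_args)  # use the parsed data
                except json.JSONDecodeError:
                    # Expected with fine-grained streaming
                    pass

UI pattern: Progressive tool call display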

    import json

    # Assumes client is an anthropic.AsyncAnthropic instance, ui is the app's own
    # event sender, and model is defined elsewhere.
    async def stream_with_progress(messages, tools):
        tool_name = None
        accumulated_args = ""
        async with client.messages.stream(
            model=model,
            max_tokens=2000,
            messages=messages,
            tools=tools,
        ) as stream:
            async for event in stream:
                if event.type == "content_block_start" and event.content_block.type == "tool_use":
                    tool_name = event.content_block.name
                    accumulated_args = ""
                    await ui.send_event({"type": "tool_start", "name": tool_name})
                elif event.type == "input_json":
                    accumulated_args += event.partial_json
                    # Try to parse and send progress
                    try:
                        args = json.loads(accumulated_args)
                        await ui.send_event({
                            "type": "tool_progress",
                            "args_so_far": args
                        })
                    except json.JSONDecodeError:
                        pass  # not valid JSON yet
                elif event.type == "content_block_stop" and tool_name is not None:
                    await ui.send_event({"type": "tool_complete", "name": tool_name})
                    tool_name = None

The frontend shows:

    → Calling: save_article
      Abstract: "This paper presents..." [typing...]
      Word count: 847 [complete]
      Review: "Novel approach to..." [typing...]

This feels responsive and smooth.

Validation still happens at the end

Fine-grained mode only skips validation during streaming. When Claude finishes the tool_use block, the final input is still expected to be valid JSON that matches the schema.
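
A final check along these lines can be sketched with the standard library alone. Everything named here is an assumption for illustration: `SAVE_ARTICLE_SCHEMA` is a hypothetical schema shaped like the abstract/meta example earlier in this lesson, and `validate` covers only types and required keys, so it is not a full JSON Schema validator.

```python
import json
from typing import Any, Dict, List

# Hypothetical schema, modeled on the abstract/meta example input.
SAVE_ARTICLE_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["abstract", "meta"],
    "properties": {
        "abstract": {"type": "string"},
        "meta": {
            "type": "object",
            "required": ["word_count", "review"],
            "properties": {
                "word_count": {"type": "integer"},
                "review": {"type": "string"},
            },
        },
    },
}

_PY_TYPES = {"object": dict, "array": list, "string": str, "integer": int}


def validate(value: Any, schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Return a list of problems; checks only declared types and required keys."""
    expected = _PY_TYPES.get(schema.get("type", ""))
    if expected is not None and not isinstance(value, expected):
        return [f"{path}: expected {schema['type']}"]
    errors: List[str] = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{path}.{key}"))
    return errors


def finalize_tool_input(raw: str, schema: Dict[str, Any]) -> Dict[str, Any]:
    """Parse the fully accumulated tool input and reject it if it is unusable."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"final tool input is not valid JSON: {exc}") from exc
    errors = validate(args, schema)
    if errors:
        raise ValueError("; ".join(errors))
    return args
```

Calling `finalize_tool_input` once, when the tool_use block ends, keeps the mid-stream code tolerant of invalid JSON while ensuring that downstream code only ever sees checked input.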
If the final input is invalid, the error surfaces on the client:

    tool_call = response.content[0]
    args = tool_call.input  # ← this must be a valid dict

Combined with text streaming

A response can contain both text and a tool call:

    with client.messages.stream(...) as stream:  # same arguments as in the earlier examples
        for event in stream:
            if event.type == "content_block_delta":
                if event.delta.type == "text_delta":
                    print(event.delta.text, end="", flush=True)
                elif event.delta.type == "input_json_delta":
                    # Fine-grained tool chunk: event.delta.partial_json
                    ...

Modern UX: Claude "talks" (text) while "preparing the tool call" (tool args streaming).

Anti-patterns

❌ Using fine-grained when it isn't needed
Short tool input → no benefit, extra complexity.
Fix: default mode is fine for 95% of cases.

❌ Assuming the JSON is valid mid-stream

    args = json.loads(accumulated_args)  # raises JSONDecodeError mid-stream with fine-grained

Fix: wrap it in try-except.

❌ Blocking the UI while the tool streams
Rendering tool args blocks the main thread.
Fix: async rendering.

Apply it now

Exercise 1: Basic tool streaming (20 minutes)
Build a tool generate_article(topic, length) that simulates generating long output.
Test it in default mode and observe the delay patterns.
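
As a possible starting point for this exercise, the sketch below defines the tool in the name / description / input_schema shape used for tool definitions. The tool name and its two parameters come from the exercise text. The descriptions, the integer type for `length`, and the filler-word generator are assumptions made for illustration.

```python
# Hypothetical tool definition for the exercise.
generate_article_schema = {
    "name": "generate_article",
    "description": "Generate an article on a topic with roughly the requested number of words.",
    "input_schema": {
        "type": "object",
        "properties": {
            "topic": {"type": "string", "description": "Subject of the article"},
            "length": {"type": "integer", "description": "Target length in words"},
        },
        "required": ["topic", "length"],
    },
}


def generate_article(topic: str, length: int) -> str:
    """Local stand-in that simulates long output: `length` filler words about `topic`."""
    words = [f"{topic}-{i}" for i in range(length)]
    return " ".join(words)


print(len(generate_article("streaming", 500).split()))
```

Passing `generate_article_schema` in the `tools` list of the basic streaming example should be enough to observe the delay pattern. The stand-in function is only needed if you also want to execute the call locally.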

Exercise 2: Fine-grained (30 minutes)
Enable fine-grained streaming. Compare TTFT (time to first token):
- Default: ~2s (waits for validation)
- Fine-grained: ~0.3s
Handle invalid JSON gracefully in the UI.
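
To make the TTFT comparison repeatable, a small timing helper can wrap any iterable of events. The sketch below is a minimal version under stated assumptions: `measure_ttft` and `fake_stream` are hypothetical names, and the fake streams with artificial delays stand in for real API streams, so the printed numbers say nothing about actual latency.

```python
import time
from typing import Callable, Iterable, Iterator, List, Optional, Tuple


def measure_ttft(
    events: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float]:
    """Return (seconds until the first event, total seconds) for any iterable."""
    start = clock()
    first: Optional[float] = None
    for _ in events:
        if first is None:
            first = clock() - start
    return first, clock() - start


def fake_stream(first_delay: float, chunks: List[str]) -> Iterator[str]:
    """Stand-in for a real stream: waits `first_delay` seconds, then yields chunks."""
    time.sleep(first_delay)
    yield from chunks


# Artificial delays chosen only to make the difference visible.
buffered_ttft, _ = measure_ttft(fake_stream(0.05, ['{"abstract": "..."', "}"]))
immediate_ttft, _ = measure_ttft(fake_stream(0.0, ['{"abs', 'tract": "..."', "}"]))
print(f"buffered: {buffered_ttft:.3f}s, immediate: {immediate_ttft:.3f}s")
```

For the exercise, replace `fake_stream(...)` with a generator that yields each `partial_json` chunk from the real stream, once in default mode and once with the fine-grained header, and compare the first value of each returned tuple.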

Summary
🎯 Tool streaming uses InputJsonEvent: partial_json + snapshot.
🎯 Default: validated. Chunks are buffered until a key-value pair is complete.
🎯 Fine-grained: raw. Chunks arrive immediately, and the JSON may be invalid mid-stream.
🎯 Use fine-grained for long tool args + real-time UI. Default is fine for 95% of cases.
🎯 Always validate the final tool input at block_stop.