The web search tool: Claude searches the internet

6: Advanced features · Intermediate · 15 minutes

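Claude's training data has a cutoff date (for example, January 2026), so without a search tool it cannot answer questions about anything more recent:

  • "What is the price of gold today?" → it doesn't know
  • "What was the result of last week's election?" → it doesn't know
  • "What is the latest Claude model?" → it only knows up to its training cutoff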

What you will learn
  • Enable the built-in web search tool
  • Restrict search to specific domains (whitelist authoritative sources)
  • Handle the multi-block response and its citations
  • Case studies: research assistant, fact-checker

Enable

Prerequisite: turn on the web search tool in the Console under Privacy settings.

Claude decides on its own whether to search. If it does, the entire search flow runs server-side.

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

web_search_schema = {
    "type": "web_search_20250305",
    "name": "web_search",
    "max_uses": 5,  # maximum number of searches Claude may run per request
}

response = client.messages.create(
    model="claude-sonnet-5-20260205",
    max_tokens=2000,
    messages=[{
        "role": "user",
        "content": "What is the latest Anthropic model released in 2026?"
    }],
    tools=[web_search_schema]
)

Response structure

A response that used search contains several content blocks, in this order:

- TextBlock: "Let me search for that..."
- ServerToolUseBlock: query="Anthropic Claude model 2026"
- WebSearchToolResultBlock:
    - WebSearchResult: {title, url, page_age, encrypted_content}
    - WebSearchResult: {...}
    - ...
- TextBlock with citations: "Claude Opus 4.8 was released April 2026 [1]..."

Extract

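for block in response.content:
    if block.type == "text":
        print(block.text)
    elif block.type == "server_tool_use":
        # The query Claude chose to send to the search engine
        print(f"Searched: {block.input['query']}")
    elif block.type == "web_search_tool_result":
        # On failure, content is a single error object instead of a list of results
        if isinstance(block.content, list):
            for result in block.content:
                print(f"Source: {result.title} ({result.url})")
        else:
            print(f"Search failed: {block.content.error_code}")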
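The same extraction can be written against plain dictionaries, which makes it easy to test without calling the API. This is a minimal sketch, assuming the response content has been converted to a list of dicts (for instance with the SDK object's `model_dump()`) and that a failed search puts one error object, rather than a list, in `content`. The helper name `summarize_blocks` and the sample data are hypothetical.

```python
from typing import Any


def summarize_blocks(content: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Split a multi-block response into answer text, queries, sources, and errors."""
    summary: dict[str, list[str]] = {"text": [], "queries": [], "sources": [], "errors": []}
    for block in content:
        kind = block.get("type")
        if kind == "text":
            summary["text"].append(block["text"])
        elif kind == "server_tool_use":
            summary["queries"].append(block["input"]["query"])
        elif kind == "web_search_tool_result":
            results = block["content"]
            if isinstance(results, list):
                summary["sources"].extend(r["url"] for r in results)
            else:
                # Assumption: a failed search carries one error object with an error_code.
                summary["errors"].append(results.get("error_code", "unknown"))
    return summary


# Hypothetical sample shaped like the block list described above.
sample = [
    {"type": "text", "text": "Let me search for that."},
    {"type": "server_tool_use", "input": {"query": "example query"}},
    {"type": "web_search_tool_result", "content": [
        {"type": "web_search_result", "url": "https://example.com/a", "title": "A"},
    ]},
    {"type": "web_search_tool_result", "content": {
        "type": "web_search_tool_result_error", "error_code": "max_uses_exceeded"}},
]
print(summarize_blocks(sample))
```

Keeping queries, sources, and errors in separate lists makes it straightforward to log what Claude searched for independently of what it answered.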

Allowed domains: restrict search to authoritative sources

Whitelist the domains you consider reliable:

web_search_schema = {
    "type": "web_search_20250305",
    "name": "web_search",
    "max_uses": 5,
    "allowed_domains": ["nih.gov", "who.int", "nature.com"]
}

Use cases:

  • Medical chatbot → limit to PubMed and other NIH sites (nih.gov)
  • Legal → official government sites
  • Financial → SEC filings only

Restricting the domains keeps unvetted blog content out of the results, which tends to improve answer quality.
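The restriction itself is applied server-side, but a client-side check can be handy for logging or tests. The sketch below assumes that an allowed domain also covers its subdomains (so `nih.gov` covers `pubmed.ncbi.nlm.nih.gov`); the helper name `is_allowed` is hypothetical and not part of the SDK.

```python
from urllib.parse import urlparse


def is_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True if the URL's host equals an allowed domain or is a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in allowed_domains)


allowed = ["nih.gov", "who.int", "nature.com"]
print(is_allowed("https://pubmed.ncbi.nlm.nih.gov/12345/", allowed))   # subdomain of nih.gov
print(is_allowed("https://notnih.gov.example.com/page", allowed))      # look-alike host
```

Matching on `"." + d` rather than a bare suffix avoids accepting look-alike hosts such as `fakenih.gov`.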

Blocked domains

The inverse: exclude specific domains, for example to keep unreliable user-generated content out of the results:

"blocked_domains": ["reddit.com", "quora.com"]
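A small builder can keep tool definitions consistent across an application. This is a sketch under two assumptions: that a single request accepts either `allowed_domains` or `blocked_domains` but not both, and that `max_uses` must be at least 1. The function name `build_web_search_tool` is hypothetical.

```python
from typing import Any, Optional


def build_web_search_tool(
    max_uses: int = 5,
    allowed_domains: Optional[list[str]] = None,
    blocked_domains: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Build a web search tool definition with basic sanity checks."""
    if allowed_domains and blocked_domains:
        # Assumption: the two filters are mutually exclusive within one request.
        raise ValueError("Use allowed_domains or blocked_domains, not both")
    if max_uses < 1:
        raise ValueError("max_uses must be at least 1")
    tool: dict[str, Any] = {
        "type": "web_search_20250305",
        "name": "web_search",
        "max_uses": max_uses,
    }
    if allowed_domains:
        tool["allowed_domains"] = list(allowed_domains)
    if blocked_domains:
        tool["blocked_domains"] = list(blocked_domains)
    return tool


print(build_web_search_tool(max_uses=3, blocked_domains=["reddit.com", "quora.com"]))
```

The returned dict can be passed directly in the `tools` list of `client.messages.create`.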

Citations inline

When Claude backs a statement with a source, the response text carries embedded citations:

"Anthropic released Claude Opus 4.8 in April 2026, featuring improvements
in reasoning and coding [1]. The model supports a context window of up to 200K tokens [2]."

Citation metadata:

{
    "type": "web_search_result_location",
    "cited_text": "...",  # quoted text from source
    "url": "...",
    "title": "..."
}

UI rendering: hovering over [1] shows the source.
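One way to turn citation metadata into numbered markers for a UI is sketched below. It assumes each text block is a dict with an optional `citations` list whose entries carry `url` and `title`; the helper name `render_with_footnotes` and the sample data are hypothetical.

```python
from typing import Any


def render_with_footnotes(content: list[dict[str, Any]]) -> str:
    """Join text blocks, append [n] markers, and list each distinct source once."""
    sources: list[tuple[str, str]] = []  # (url, title) in first-seen order
    parts: list[str] = []
    for block in content:
        if block.get("type") != "text":
            continue
        parts.append(block["text"])
        for cite in block.get("citations") or []:
            key = (cite["url"], cite["title"])
            if key not in sources:
                sources.append(key)
            parts.append(f" [{sources.index(key) + 1}]")
    footnotes = [f"[{i}] {title} ({url})" for i, (url, title) in enumerate(sources, 1)]
    return "".join(parts) + "\n\n" + "\n".join(footnotes)


# Hypothetical content: the first block has no citations, the last reuses source A.
sample_content = [
    {"type": "text", "text": "Here is what I found. "},
    {"type": "text", "text": "Claim one.",
     "citations": [{"url": "https://example.com/a", "title": "Source A", "cited_text": "..."}]},
    {"type": "text", "text": " Claim two.",
     "citations": [{"url": "https://example.com/b", "title": "Source B", "cited_text": "..."}]},
    {"type": "text", "text": " Claim three.",
     "citations": [{"url": "https://example.com/a", "title": "Source A", "cited_text": "..."}]},
]
print(render_with_footnotes(sample_content))
```

Reusing the same number for a repeated source keeps the footnote list short when several claims rest on one page.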

Example: research assistant

Claude searches multiple academic sources, synthesizes the findings, and cites them.

response = client.messages.create(
    model="claude-sonnet-5-20260205",
    max_tokens=3000,
    messages=[{
        "role": "user",
        "content": """Research: what are the latest advances in battery technology in 2026?
Focus on lithium alternatives. Summary in 300 words with citations."""
    }],
    tools=[{
        "type": "web_search_20250305",
        "name": "web_search",
        "max_uses": 10,
        # Assumption: a listed domain also covers its subdomains, so ieee.org includes spectrum.ieee.org
        "allowed_domains": ["nature.com", "sciencedirect.com", "ieee.org"]
    }]
)

Case studies

🔍 Content writer with fact-checking

The writer prompts: "Write an article about X and fact-check all claims with web search."

Claude searches, verifies each claim, and cites its sources, reducing factual errors in articles by a reported 80%.

💰 Financial news digest

"Today's top 5 news stories affecting AAPL stock" → web search → summary.

🏥 Medical info chatbot

allowed_domains=["nih.gov", "mayoclinic.org"] → reliable medical information only. Include a "consult your doctor" disclaimer.

📊 Competitive intelligence

"What did company X launch in the last 30 days?" → search news sites and blogs.

Web search vs. extended thinking

              Web search                     Extended thinking
Purpose       Fresh external info            Deep reasoning on existing info
Latency       5-15 s (search time)           5-10 s
Cost          $ (search + content tokens)    $$ (thinking tokens)
Use when      Need current facts             Need careful reasoning

The two can be combined: search to get the facts, then thinking to synthesize them.
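Combining the two features in one request might look like the sketch below. It assumes extended thinking is enabled through a `thinking` parameter with a `budget_tokens` value and that `max_tokens` must exceed that budget; the helper name `build_request` and the default numbers are hypothetical.

```python
from typing import Any


def build_request(question: str, thinking_budget: int = 2000, max_uses: int = 5) -> dict[str, Any]:
    """Assemble keyword arguments for client.messages.create (hypothetical helper)."""
    return {
        "model": "claude-sonnet-5-20260205",  # model ID copied from this lesson's examples
        # Assumption: max_tokens must be larger than the thinking budget,
        # so leave headroom for the visible answer.
        "max_tokens": thinking_budget + 2000,
        "thinking": {"type": "enabled", "budget_tokens": thinking_budget},
        "tools": [{"type": "web_search_20250305", "name": "web_search", "max_uses": max_uses}],
        "messages": [{"role": "user", "content": question}],
    }


request = build_request("Compare this year's solid-state battery announcements.")
print(request["max_tokens"], request["thinking"])
```

With a configured client, the dict can be unpacked as `client.messages.create(**request)`.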

Anti-patterns

❌ Leaving max_uses unset

Claude can run 20+ searches in a single request, which gets expensive.

Fix: set max_uses, typically 3 to 5.
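To see what searches actually cost, the number of searches in a response can be read from its usage data. The sketch below assumes the usage payload, as a dict, exposes `server_tool_use.web_search_requests`, and it uses a placeholder rate of $10 per 1,000 searches; check the current pricing page before relying on that number. Token costs for the search results are not included.

```python
from typing import Any

# Assumption: placeholder rate, not taken from this lesson; verify against current pricing.
PRICE_PER_1000_SEARCHES_USD = 10.0


def search_cost_usd(usage: dict[str, Any]) -> float:
    """Estimate the search-only cost of one response from its usage payload."""
    server_tools = usage.get("server_tool_use") or {}
    searches = server_tools.get("web_search_requests", 0)
    return searches * PRICE_PER_1000_SEARCHES_USD / 1000


# Hypothetical usage payload for a response that ran three searches.
sample_usage = {"input_tokens": 1200, "output_tokens": 400,
                "server_tool_use": {"web_search_requests": 3}}
print(f"{search_cost_usd(sample_usage):.2f} USD")
```

Logging this value per request makes it easy to spot prompts that routinely hit the max_uses ceiling.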

❌ No allowed_domains in a sensitive domain

A medical chatbot that searches random blogs can surface misinformation.

Fix: whitelist authoritative sources.

❌ Forgetting to enable the tool in the Console

The API returns an error if web search is not enabled.

Fix: go to Console → Privacy and enable it.

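❌ Using search for static information

"Who is the president of the US?" Claude already knows this, so a search is wasted.

Fix: let Claude decide. Just provide the tool schema and it will skip the search when it isn't needed.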

Apply it now

Exercise 1: Research query (15 minutes)

Ask Claude (with web search enabled) to research the latest trend in your industry. Observe:

  • Search queries used
  • Sources found
  • Citations in output

Exercise 2: Restricted search (10 minutes)

Ask a medical question in two ways:

  • allowed_domains=["nih.gov"]
  • No restriction

Compare the quality and the sources.
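For the comparison in Exercise 2, counting sources per host gives a quick view of how the two runs differ. This is a minimal sketch; the URL lists are hypothetical stand-ins for the sources you collect from each response.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(urls: list[str]) -> Counter:
    """Count how many source URLs came from each host."""
    return Counter((urlparse(u).hostname or "").lower() for u in urls)


# Hypothetical source lists collected from the two runs.
restricted = ["https://pubmed.ncbi.nlm.nih.gov/1/", "https://www.nih.gov/news"]
unrestricted = [
    "https://www.nih.gov/news",
    "https://blog.example.com/post",
    "https://blog.example.com/other",
]

# Hosts that only show up when no domain restriction is applied.
extra_hosts = sorted(set(domain_counts(unrestricted)) - set(domain_counts(restricted)))
print(domain_counts(unrestricted).most_common(1))
print(extra_hosts)
```

Hosts that appear only in the unrestricted run are the ones worth reviewing for reliability.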

Summary

🎯 Web search is a built-in tool that runs server-side; Anthropic handles the search.

🎯 Enable it in the Console first, then pass the schema with max_uses and allowed_domains.

🎯 The response is multi-block: text + server_tool_use + results + citations.

🎯 Whitelist domains for domain-critical applications.

🎯 Claude decides automatically and sometimes skips the search when the answer is in its training data.
