
Claude API: An A-to-Z Guide for Developers

Minh Tuấn, CTO, Transform Group

26/03/2026, 20 min read


1 Getting started: Claude.ai is a great interface for individual users, but integrating Claude into your own application (a chatbot, workflow automation) requires the Anthropic API.
2 The simplest request: create a Python client with anthropic.Anthropic(), which reads ANTHROPIC_API_KEY from the environment, then call client.messages.create with a model such as "claude-sonnet-4-5".
3 Vision: pass an image to the Messages API by adding an image content block, for example a base64-encoded screenshot.png.
4 Prompt caching: cache the start of a prompt (usually a long system prompt or large documents) to cut the cost of repeated requests that share the same context.
5 Batch API: for offline tasks that do not need a real-time response (for example, processing thousands of records overnight), the Batch API cuts cost by 50% compared with the regular API.


Why use the Claude API?

Claude.ai is a great interface for individual users, but if you want to integrate Claude into your own application (building a chatbot, automating workflows, or creating an AI product), you need the Anthropic API.

The API lets you:

Call Claude programmatically from any programming language
Take full control of the system prompt, model, and parameters
Handle many requests in parallel
Integrate Claude into a CI/CD pipeline or an automated workflow
Build products where end users interact with Claude through your own interface

This article walks through the whole process, from getting an API key to deploying to production.

Step 1: Get an API key

Go to console.anthropic.com
Create an account, or sign in if you already have one
Open API Keys in the sidebar
Click Create Key
Give the key a descriptive name (for example "production-app" or "local-dev")
Copy the key right away: it is shown only once

Billing and free tier

The API has no permanent free tier: you need to add a payment method and are charged by usage (number of tokens). However, Anthropic often gives new accounts some free credit for experimentation.

Securing your API key

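# NEVER commit an API key to git
# Use environment variables instead

# .env file (add .env to .gitignore)
ANTHROPIC_API_KEY=sk-ant-...

# Load it in Python (the variable must already be exported in your shell,
# or loaded from .env by a tool such as python-dotenv)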
import os
from anthropic import Anthropic
client = Anthropic(api_key=os.environ.get("ANTHROPIC_API_KEY"))
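
A missing key is easier to diagnose if the application fails at startup rather than on the first request. The following is a minimal sketch of that idea; require_api_key is a hypothetical helper name, not part of the SDK.

```python
import os


def require_api_key(env_var: str = "ANTHROPIC_API_KEY") -> str:
    """Return the API key from the environment, or raise a clear error.

    Hypothetical helper for illustration; it only reads the environment.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"{env_var} is not set; export it or load it from your .env file"
        )
    return key
```

The returned value can then be passed as api_key when constructing the client.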

Step 2: Install the SDK

Python SDK

pip install anthropic

Node.js / TypeScript SDK

npm install @anthropic-ai/sdk
# or
yarn add @anthropic-ai/sdk

Verify the installation

# Python
python3 -c "import anthropic; print(anthropic.__version__)"

# Node.js
node -e "const Anthropic = require('@anthropic-ai/sdk'); console.log('OK')"

Step 3: Messages API basics

All interactions with Claude go through the Messages API. Here is the simplest possible request:

Python

import anthropic

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from the environment automatically

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Xin chào! Bạn có thể giúp tôi viết một email không?"}
    ]
)

print(message.content[0].text)

Node.js

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic(); // Reads ANTHROPIC_API_KEY from the environment automatically

const message = await client.messages.create({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [
    { role: 'user', content: 'Xin chào! Bạn có thể giúp tôi viết một email không?' }
  ]
});

console.log(message.content[0].text);

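Roles in the Messages API

The Messages API structures a conversation with two message roles, user and assistant, plus a top-level system parameter (system is not a role inside the messages list):

system: Overall instructions for Claude that define its role, style, and rules; passed as the system parameter
user: Messages from the user
assistant: Messages from Claude (used in multi-turn conversations)

Using a system prompt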

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system="Bạn là một trợ lý customer service cho công ty thương mại điện tử. "
           "Luôn lịch sự, hữu ích, và trả lời bằng tiếng Việt. "
           "Nếu không biết câu trả lời, hãy chuyển sang bộ phận hỗ trợ.",
    messages=[
        {"role": "user", "content": "Đơn hàng của tôi bị delay, tôi phải làm gì?"}
    ]
)

Multi-turn Conversations

The API does not store conversation history for you; you must send the full history with every request:

conversation_history = []

def chat(user_message):
    conversation_history.append({
        "role": "user",
        "content": user_message
    })

    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        system="Bạn là trợ lý lập trình Python hữu ích.",
        messages=conversation_history
    )

    assistant_message = response.content[0].text

    conversation_history.append({
        "role": "assistant",
        "content": assistant_message
    })

    return assistant_message

# Usage
print(chat("Giải thích list comprehension trong Python"))
print(chat("Cho tôi ví dụ phức tạp hơn"))
print(chat("Khi nào nên dùng list comprehension thay vì for loop?"))

Managing the context window

Claude's context window is 200K tokens. For long conversations you need a management strategy:

Sliding window: keep only the N most recent messages
Summarization: summarize the early part of the conversation when it grows too long
Selective memory: keep only the important turns
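
The sliding-window strategy can be sketched as follows. This is a minimal illustration, assuming history is a list of role/content dicts like conversation_history above; trim_history and max_messages are hypothetical names, and counting messages rather than tokens is a simplification.

```python
from typing import Dict, List

Message = Dict[str, str]


def trim_history(history: List[Message], max_messages: int = 20) -> List[Message]:
    """Return at most the last max_messages messages, starting on a user turn."""
    if max_messages <= 0:
        return []
    recent = history[-max_messages:]
    # A conversation is expected to open with a user message, so drop any
    # leading assistant turn left over after slicing.
    while recent and recent[0]["role"] != "user":
        recent = recent[1:]
    return recent


history = [
    {"role": "user", "content": "q1"},
    {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "q2"},
    {"role": "assistant", "content": "a2"},
    {"role": "user", "content": "q3"},
]
trimmed = trim_history(history, max_messages=2)  # keeps only the last user turn
```

Passing trim_history(conversation_history) as messages keeps request size bounded, at the price of forgetting older turns. A token-based budget would be more precise but needs a token count per message.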

Streaming Responses

Streaming lets you display the response incrementally as Claude generates it, which noticeably reduces perceived latency, especially for long responses.

Python streaming

with client.messages.stream(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Viết một bài thơ về Hà Nội"}]
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)

Node.js streaming

const stream = await client.messages.stream({
  model: 'claude-sonnet-4-5',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Viết một bài thơ về Hà Nội' }]
});

for await (const chunk of stream) {
  if (chunk.type === 'content_block_delta' && chunk.delta.type === 'text_delta') {
    process.stdout.write(chunk.delta.text);
  }
}

Vision API: Image analysis

Pass an image to the Messages API by adding an image content block:

import base64

# Read and base64-encode the image
with open("screenshot.png", "rb") as f:
    image_data = base64.standard_b64encode(f.read()).decode("utf-8")

message = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Mô tả UI trong screenshot này và gợi ý cải thiện UX."
                }
            ],
        }
    ],
)

You can also pass an image URL directly instead of base64 (the image must be publicly accessible).
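
For the URL variant, the image block carries a source of type "url" in place of the base64 payload. A minimal sketch, where image_url_block is a hypothetical helper and the example.com address is a placeholder:

```python
def image_url_block(url: str) -> dict:
    """Build an image content block that points at a publicly accessible URL."""
    return {"type": "image", "source": {"type": "url", "url": url}}


# Content list for a single user message: the image first, then the question.
content = [
    image_url_block("https://example.com/screenshot.png"),  # placeholder URL
    {"type": "text", "text": "Describe the UI in this screenshot."},
]
```

The content list goes into the user message exactly as in the base64 example.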

Tool Use: Function Calling

Tool use lets Claude call functions that you define. It is the foundation for building AI agents that can interact with external systems.

tools = [
    {
        "name": "get_weather",
        "description": "Lấy thông tin thời tiết hiện tại cho một thành phố",
        "input_schema": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "Tên thành phố, ví dụ 'Hà Nội', 'TP. HCM'"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Đơn vị nhiệt độ"
                }
            },
            "required": ["city"]
        }
    }
]

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    tools=tools,
    messages=[
        {"role": "user", "content": "Thời tiết Hà Nội hôm nay thế nào?"}
    ]
)

# Check whether Claude wants to use a tool
if response.stop_reason == "tool_use":
    tool_use = next(block for block in response.content if block.type == "tool_use")
    tool_name = tool_use.name
    tool_input = tool_use.input

    # Call the real function (get_weather is your own implementation)
    if tool_name == "get_weather":
        weather_result = get_weather(tool_input["city"], tool_input.get("unit", "celsius"))

    # Send the result back to Claude
    final_response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        tools=tools,
        messages=[
            {"role": "user", "content": "Thời tiết Hà Nội hôm nay thế nào?"},
            {"role": "assistant", "content": response.content},
            {
                "role": "user",
                "content": [
                    {
                        "type": "tool_result",
                        "tool_use_id": tool_use.id,
                        "content": str(weather_result)
                    }
                ]
            }
        ]
    )
    print(final_response.content[0].text)
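
The example above leaves get_weather undefined and handles only one tool. One way to generalize the dispatch step is a lookup table of handlers, sketched below. The get_weather body is a hypothetical stand-in that returns fixed data, and TOOL_HANDLERS and run_tool are illustrative names, not SDK functions.

```python
from typing import Any, Callable, Dict


def get_weather(city: str, unit: str = "celsius") -> Dict[str, Any]:
    """Hypothetical stand-in; a real version would query a weather service."""
    return {"city": city, "temperature": 28, "unit": unit}


# Map tool names (as declared in the tools list) to local functions.
TOOL_HANDLERS: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def run_tool(name: str, tool_input: Dict[str, Any], tool_use_id: str) -> Dict[str, Any]:
    """Run the requested tool and wrap the outcome in a tool_result block."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"Unknown tool: {name}", "is_error": True}
    try:
        result = handler(**tool_input)
    except Exception as exc:  # report the failure to the model instead of crashing
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"Tool failed: {exc}", "is_error": True}
    return {"type": "tool_result", "tool_use_id": tool_use_id, "content": str(result)}
```

With this in place, the block returned by run_tool(tool_use.name, tool_use.input, tool_use.id) can be sent as the content of the follow-up user message. Returning an error block lets Claude explain the failure or try again.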

Error Handling

Handling errors properly is mandatory in production:

import anthropic
from anthropic import APIConnectionError, RateLimitError, APIStatusError

client = anthropic.Anthropic()

try:
    message = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": "Hello"}]
    )
    print(message.content[0].text)

except RateLimitError as e:
    print(f"Rate limit exceeded. Retry after: {e.response.headers.get('retry-after')}")
    # Implement exponential backoff

except APIConnectionError as e:
    print(f"Connection error: {e}")
    # Retry with backoff

except APIStatusError as e:
    print(f"API error {e.status_code}: {e.message}")
    if e.status_code == 529:
        print("API overloaded, retry later")
    elif e.status_code == 401:
        print("Invalid API key")
    elif e.status_code == 400:
        print(f"Bad request: {e.body}")

Rate Limits

Anthropic applies rate limits by account tier. Treat the figures below as indicative; actual limits vary by model and change over time, so check the Console for your account's current values:

Tier	Requests/min	Tokens/min	Tokens/day
Tier 1 (new)	50	50,000	1,000,000
Tier 2	1,000	100,000	2,500,000
Tier 3	2,000	200,000	5,000,000
Tier 4	4,000	400,000	10,000,000

Your tier rises as the account builds up usage and payment history. Implement exponential backoff when you receive a 429 (rate limit error).
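
Exponential backoff with jitter can be sketched without any third-party library. This is a generic illustration, not the SDK's own retry mechanism; call_with_backoff and its parameters are hypothetical names, and the delay values are arbitrary defaults.

```python
import random
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with exponentially growing waits."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter keeps many clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")
```

A call such as call_with_backoff(lambda: client.messages.create(...), retry_on=(anthropic.RateLimitError,)) would then wait roughly 1, 2, 4, and 8 seconds between attempts. The sleep parameter is injectable so the function can be tested without waiting.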

Prompt Caching: Saving on cost

Prompt caching lets you cache the start of a prompt (usually a long system prompt or large documents) to reduce the cost of repeated requests that share the same context:

response = client.messages.create(
    model="claude-sonnet-4-5",
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": "Bạn là trợ lý phân tích pháp lý. Dưới đây là toàn bộ bộ luật dân sự Việt Nam...",
            "cache_control": {"type": "ephemeral"}  # Cache this block
        }
    ],
    messages=[{"role": "user", "content": "Điều 123 quy định gì?"}]
)

On subsequent requests, the cached portion is billed at only 10% of the normal input token price (instead of 100%). This is especially effective when:

The system prompt is very long (thousands of tokens)
You analyze the same document many times
You run RAG with a large context that is reused many times
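
A rough calculation shows the scale of the saving. The sketch below uses the $3 per million input tokens quoted for Sonnet in this article and the 10% cached-read rate stated above. The 25% surcharge on the first (cache-writing) request is an assumption, as is the simplification that every later request hits the cache.

```python
INPUT_PRICE_PER_MTOK = 3.00  # USD per million input tokens (Sonnet price quoted in this article)
CACHE_READ_FACTOR = 0.10     # cached tokens billed at 10% of the input price
CACHE_WRITE_FACTOR = 1.25    # ASSUMPTION: the first request pays a premium to write the cache


def input_cost(prefix_tokens: int, question_tokens: int, requests: int, cached: bool) -> float:
    """Estimate total input cost in USD for `requests` calls sharing one prefix."""
    if requests <= 0:
        return 0.0
    per_token = INPUT_PRICE_PER_MTOK / 1_000_000
    if not cached:
        return requests * (prefix_tokens + question_tokens) * per_token
    first = prefix_tokens * CACHE_WRITE_FACTOR + question_tokens
    later = (requests - 1) * (prefix_tokens * CACHE_READ_FACTOR + question_tokens)
    return (first + later) * per_token


# 100 questions against the same 50,000-token document, 100 tokens per question.
without_cache = input_cost(50_000, 100, requests=100, cached=False)
with_cache = input_cost(50_000, 100, requests=100, cached=True)
```

Under these assumptions the input cost falls from about $15.03 to about $1.70. Output tokens are billed as usual and are not included.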

Cost optimization

Choosing the right model

API cost differs significantly between models (prices in USD per million tokens):

Model	Input	Output	When to use
Claude Opus 4	$15	$75	Complex tasks that need the highest quality
Claude Sonnet 4	$3	$15	Most production use cases
Claude Haiku 3.5	$0.80	$4	High volume, simple tasks
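
To compare models for a given workload, the table can be turned into a small estimator. This is a sketch only: the prices are copied from the table above and may be out of date, and estimate_cost is a hypothetical helper name.

```python
# (input, output) prices in USD per million tokens, copied from the table above.
PRICES_PER_MTOK = {
    "Claude Opus 4": (15.00, 75.00),
    "Claude Sonnet 4": (3.00, 15.00),
    "Claude Haiku 3.5": (0.80, 4.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request."""
    input_price, output_price = PRICES_PER_MTOK[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Example workload: 2,000 input tokens and 500 output tokens per request.
for model in PRICES_PER_MTOK:
    print(f"{model}: ${estimate_cost(model, 2_000, 500):.4f} per request")
```

The token counts can come from response.usage on real traffic, which makes the estimate easy to check against the bill.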

Cost-saving techniques

Use Haiku for pre-filtering: let Haiku classify or filter simple requests, and escalate to Sonnet or Opus only when really needed
Optimize max_tokens: set max_tokens to fit each request rather than using a high value everywhere
Prompt caching: cache long system prompts to reduce cost
Batch processing: use the Anthropic Batch API for offline jobs (50% lower cost)

Best Practices for Production

Logging and monitoring

import logging
import time

def create_message_with_logging(messages, model="claude-sonnet-4-5"):
    start_time = time.time()

    response = client.messages.create(
        model=model,
        max_tokens=1024,
        messages=messages
    )

    duration = time.time() - start_time

    logging.info({
        "model": model,
        "input_tokens": response.usage.input_tokens,
        "output_tokens": response.usage.output_tokens,
        "duration_ms": round(duration * 1000),
        "stop_reason": response.stop_reason
    })

    return response

Timeout and retry

import anthropic
from tenacity import retry, stop_after_attempt, wait_exponential

client = anthropic.Anthropic(timeout=30.0)  # 30-second timeout

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=4, max=10)
)
def resilient_create(messages):
    return client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=1024,
        messages=messages
    )

Input validation

Validate and sanitize user input before putting it into a prompt
Set message length limits to avoid token overflow
Implement content moderation if you accept input from untrusted users
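
The first two points can be sketched as a small pre-processing step. The limit of 4,000 characters and the helper name clean_user_input are illustrative assumptions; a character limit is only a rough proxy for a token limit, and this check does not replace content moderation.

```python
import re

MAX_INPUT_CHARS = 4000  # hypothetical limit; tune it for your application

# Control characters other than tab, newline, and carriage return.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_user_input(text: str, max_chars: int = MAX_INPUT_CHARS) -> str:
    """Strip control characters and enforce basic size limits on user input."""
    cleaned = _CONTROL_CHARS.sub("", text).strip()
    if not cleaned:
        raise ValueError("Input is empty")
    if len(cleaned) > max_chars:
        raise ValueError(f"Input exceeds {max_chars} characters")
    return cleaned
```

Rejecting oversized input, rather than silently truncating it, keeps the behavior predictable for the caller.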

Security

Never expose the API key on the client side (browser)
Always call the API from the server side
Implement authentication for your own API endpoints
Monitor usage to catch anomalies early

Extended Thinking via the API

Extended Thinking lets Claude "think" for longer before answering, which helps on complex problems:

response = client.messages.create(
    model="claude-opus-4-20250514",  # Extended Thinking is supported on Opus 4 and Sonnet 4
    max_tokens=16000,
    thinking={
        "type": "enabled",
        "budget_tokens": 10000  # Token budget for thinking (must be below max_tokens)
    },
    messages=[
        {
            "role": "user",
            "content": "Phân tích trade-offs giữa microservices và monolith cho startup 5 người với 10K users."
        }
    ]
)

# The result can contain both thinking blocks and text blocks
for block in response.content:
    if block.type == "thinking":
        print("=== Thinking ===")
        print(block.thinking)
    elif block.type == "text":
        print("=== Response ===")
        print(block.text)

Batch API: Bulk processing

For offline tasks that do not need a real-time response (for example, processing thousands of records overnight), the Batch API cuts cost by 50% compared with the regular API:

import anthropic

client = anthropic.Anthropic()

# Create a batch containing multiple requests
message_batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": "product-001",
            "params": {
                "model": "claude-3-5-haiku-20241022",
                "max_tokens": 512,
                "messages": [{"role": "user", "content": "Viết mô tả 50 từ cho sản phẩm: Áo thun cotton trắng size M"}]
            }
        },
        {
            "custom_id": "product-002",
            "params": {
                "model": "claude-3-5-haiku-20241022",
                "max_tokens": 512,
                "messages": [{"role": "user", "content": "Viết mô tả 50 từ cho sản phẩm: Quần jeans xanh slim fit"}]
            }
        }
        # You can add thousands more requests
    ]
)

print(f"Batch ID: {message_batch.id}")

# Check the status later
batch_status = client.messages.batches.retrieve(message_batch.id)
print(f"Status: {batch_status.processing_status}")

The Batch API processes a batch within 24 hours. It is a good fit for generating product descriptions, moderating large datasets, and translating content libraries.
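
With thousands of records, the requests list is usually generated rather than written by hand. A minimal sketch, assuming the records are available as a dict of product ID to product name; build_batch_requests is a hypothetical helper and the model ID is passed in by the caller.

```python
from typing import Dict, List


def build_batch_requests(products: Dict[str, str], model: str, max_tokens: int = 512) -> List[dict]:
    """Turn {product_id: product_name} into a list of batch request entries."""
    return [
        {
            "custom_id": product_id,  # used later to match each result to its record
            "params": {
                "model": model,
                "max_tokens": max_tokens,
                "messages": [
                    {
                        "role": "user",
                        "content": f"Write a 50-word description for the product: {name}",
                    }
                ],
            },
        }
        for product_id, name in products.items()
    ]


products = {
    "product-001": "White cotton T-shirt, size M",
    "product-002": "Blue slim-fit jeans",
}
# The model ID below is an assumption; confirm the current ID in the official docs.
requests = build_batch_requests(products, model="claude-3-5-haiku-20241022")
```

The resulting list can be passed as requests to client.messages.batches.create. Keeping custom_id equal to your own record ID makes it straightforward to join results back to the source data.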

Anthropic API vs Claude.ai: Which should you choose?

Criterion	Claude.ai (Web/App)	Anthropic API
Audience	End users, individuals	Developers, businesses
Cost	$0-$25/user/month	Pay per token (variable)
Training data	Yes (can opt out)	No
Customization	Limited	Full control
Integration	Standalone app	Integrates into any app
Artifacts	Visual interface	Text output only

Rule of thumb: if you are building a product, use the API. If you are using Claude for yourself, use claude.ai.

Testing and development workflow

Use Workbench to prototype prompts

Before writing code, use console.anthropic.com/workbench to test prompts interactively. Workbench lets you:

Test prompts against different models
Adjust parameters (temperature, max_tokens) directly
Export Python/TypeScript code snippets straight from Workbench
Compare output from different models side by side

Unit testing with mock responses

import unittest
from unittest.mock import MagicMock, patch

class TestMyAIFeature(unittest.TestCase):

    @patch('anthropic.Anthropic')
    def test_summarize_text(self, mock_anthropic):
        # Mock the Claude response
        mock_client = MagicMock()
        mock_anthropic.return_value = mock_client

        mock_response = MagicMock()
        mock_response.content[0].text = "Tóm tắt: văn bản nói về X."
        mock_client.messages.create.return_value = mock_response

        # Exercise your own function
        result = summarize_text("Văn bản dài...")
        self.assertIn("Tóm tắt", result)

        # Verify the API was called exactly once
        mock_client.messages.create.assert_called_once()
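
The test above depends on a summarize_text function that is not shown, and on that function constructing its client through anthropic.Anthropic at call time. An alternative that avoids patching is to pass the client in as an argument. The sketch below is self-contained; this summarize_text is a hypothetical implementation written for the example.

```python
import unittest
from unittest.mock import MagicMock


def summarize_text(client, text: str) -> str:
    """Hypothetical feature under test: ask the model for a summary."""
    response = client.messages.create(
        model="claude-sonnet-4-5",
        max_tokens=256,
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return response.content[0].text


class TestSummarizeText(unittest.TestCase):
    def test_returns_model_text(self):
        # Fake client whose create() returns one text block.
        mock_client = MagicMock()
        mock_client.messages.create.return_value.content = [
            MagicMock(text="Summary: the text is about X.")
        ]

        result = summarize_text(mock_client, "A long document...")

        self.assertIn("Summary", result)
        mock_client.messages.create.assert_called_once()
        # Inspect the keyword arguments the function sent to the API.
        kwargs = mock_client.messages.create.call_args.kwargs
        self.assertEqual(kwargs["max_tokens"], 256)
```

Injecting the client keeps the test independent of where and when the real client is created, and lets the same function run against a real client in production.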

Further resources

To go deeper into the Claude API:

docs.anthropic.com: the most complete official documentation
console.anthropic.com/workbench: prototype and test prompts
github.com/anthropics/anthropic-cookbook: practical code examples and patterns
anthropic.com/research: papers on Constitutional AI and the techniques behind Claude

Conclusion

The Anthropic API is one of the best-designed AI APIs available today: clear documentation, a stable SDK, and a rich feature set that spans streaming, tool use, and vision.

The keys to success with the Claude API in production are handling errors properly, implementing retry logic, choosing a model that fits your use case and budget, and monitoring usage continuously. With that foundation, the Claude API can scale from a prototype to a production system serving millions of requests.

The full official documentation is at docs.anthropic.com, which remains the most reliable reference when you have questions about a specific feature.
