{"product_id":"claude-api-authentication-rate-limits-va-error-handling","title":"Claude API: Authentication, Rate Limits, and Error Handling","description":"\u003ch2\u003eIntroduction\u003c\/h2\u003e\n\u003cp\u003eWorking with the Claude API requires a solid grasp of three foundations: authentication, rate limits, and error handling. This is not just \"nice to have\" knowledge: without it, your application will fail in production under heavy load.\u003c\/p\u003e\n\n\u003cp\u003eThis article covers each area in depth with practical code examples, clear explanations of the error codes, and proven patterns for building production-ready API applications.\u003c\/p\u003e\n\n\u003ch2\u003eAuthentication with the Claude API\u003c\/h2\u003e\n\n\u003ch3\u003eWhat are API keys?\u003c\/h3\u003e\n\u003cp\u003eThe Claude API uses an API key to authenticate every request. An API key is a string that begins with \u003ccode\u003esk-ant-\u003c\/code\u003e; it is the credential Anthropic uses to identify which organization or user a request comes from.\u003c\/p\u003e\n\n\u003cp\u003eThere are two kinds of API key on the Anthropic platform:\u003c\/p\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003cstrong\u003eStandard API keys:\u003c\/strong\u003e Created and managed at console.anthropic.com and scoped to a workspace in your organization; used to call the API from development and production workloads\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eAdmin API keys:\u003c\/strong\u003e Created by organization admins for the Admin API, which manages organization resources such as members, workspaces, and API keys\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3\u003eCreating and managing API keys\u003c\/h3\u003e\n\u003cp\u003eTo create a new API key:\u003c\/p\u003e\n\u003col\u003e\n\u003cli\u003eLog in to \u003cstrong\u003econsole.anthropic.com\u003c\/strong\u003e\n\u003c\/li\u003e\n\u003cli\u003eOpen \u003cstrong\u003eAPI Keys\u003c\/strong\u003e in the sidebar\u003c\/li\u003e\n\u003cli\u003eClick \u003cstrong\u003eCreate Key\u003c\/strong\u003e and give it a name that describes its purpose (for example: \"production-app-v2\", 
\"dev-testing\")\u003c\/li\u003e\n\u003cli\u003eCopy the key immediately; Anthropic displays it only once\u003c\/li\u003e\n\u003c\/ol\u003e\n\n\u003cblockquote\u003eImportant: the full API key cannot be displayed again after it is created. If you lose a key, you must create a new one and revoke the old one.\u003c\/blockquote\u003e\n\n\u003ch3\u003eUsing the API key in code\u003c\/h3\u003e\n\u003cp\u003eThe correct way to pass the API key is through the \u003ccode\u003ex-api-key\u003c\/code\u003e header:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003ecurl https:\/\/api.anthropic.com\/v1\/messages \\\n  -H \"x-api-key: YOUR_API_KEY\" \\\n  -H \"anthropic-version: 2023-06-01\" \\\n  -H \"content-type: application\/json\" \\\n  -d '{\n    \"model\": \"claude-sonnet-4-5\",\n    \"max_tokens\": 1024,\n    \"messages\": [{\"role\": \"user\", \"content\": \"Hello\"}]\n  }'\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eIn Python with the official SDK:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003eimport anthropic\nimport os\n\n# Read the key from an environment variable; do NOT hardcode it in code\nclient = anthropic.Anthropic(\n    api_key=os.environ.get(\"ANTHROPIC_API_KEY\")\n)\n\nmessage = client.messages.create(\n    model=\"claude-sonnet-4-5\",\n    max_tokens=1024,\n    messages=[{\"role\": \"user\", \"content\": \"Hello\"}]\n)\nprint(message.content[0].text)\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eIn Node.js\/TypeScript:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003eimport Anthropic from \"@anthropic-ai\/sdk\";\n\nconst client = new Anthropic({\n  apiKey: process.env.ANTHROPIC_API_KEY,\n});\n\nconst message = await client.messages.create({\n  model: \"claude-sonnet-4-5\",\n  max_tokens: 1024,\n  messages: [{ role: \"user\", content: \"Hello\" }],\n});\n\nconsole.log(message.content[0].text);\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch3\u003eAPI key security best practices\u003c\/h3\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003cstrong\u003eUse environment variables:\u003c\/strong\u003e Never 
hardcode the key in source code\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eDo not commit keys to git:\u003c\/strong\u003e Add \u003ccode\u003e.env\u003c\/code\u003e to \u003ccode\u003e.gitignore\u003c\/code\u003e\n\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eRotate regularly:\u003c\/strong\u003e Create a new key and revoke the old one on a schedule (for example, every 90 days)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003ePrinciple of least privilege:\u003c\/strong\u003e Create a separate key for each environment (dev, staging, production)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eMonitor usage:\u003c\/strong\u003e Watch the usage dashboard to detect anomalies\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3\u003eRequired headers\u003c\/h3\u003e\n\u003cp\u003eEach request to the Claude API uses the following headers:\u003c\/p\u003e\n\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eHeader\u003c\/th\u003e\n\u003cth\u003eValue\u003c\/th\u003e\n\u003cth\u003eRequired\u003c\/th\u003e\n\u003c\/tr\u003e\n\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ccode\u003ex-api-key\u003c\/code\u003e\u003c\/td\u003e\n\u003ctd\u003eYour API key\u003c\/td\u003e\n\u003ctd\u003eYes\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ccode\u003eanthropic-version\u003c\/code\u003e\u003c\/td\u003e\n\u003ctd\u003e\u003ccode\u003e2023-06-01\u003c\/code\u003e\u003c\/td\u003e\n\u003ctd\u003eYes\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ccode\u003econtent-type\u003c\/code\u003e\u003c\/td\u003e\n\u003ctd\u003e\u003ccode\u003eapplication\/json\u003c\/code\u003e\u003c\/td\u003e\n\u003ctd\u003eYes\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\u003ccode\u003eanthropic-beta\u003c\/code\u003e\u003c\/td\u003e\n\u003ctd\u003eBeta feature name\u003c\/td\u003e\n\u003ctd\u003eNo (only when using a beta feature)\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003ch2\u003eRate Limits: Understanding and 
Handling Them\u003c\/h2\u003e\n\n\u003ch3\u003eHow rate limits are structured\u003c\/h3\u003e\n\u003cp\u003eAnthropic enforces rate limits along several dimensions at once:\u003c\/p\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003cstrong\u003eRPM (Requests Per Minute):\u003c\/strong\u003e Maximum number of requests per minute\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTPM (Tokens Per Minute):\u003c\/strong\u003e Maximum number of tokens (input + output) per minute\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eITPM (Input Tokens Per Minute):\u003c\/strong\u003e Maximum number of input tokens per minute\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003cp\u003eRate limits vary by tier and by model. A new account starts at Tier 1 and moves up as its spend grows:\u003c\/p\u003e\n\n\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eTier\u003c\/th\u003e\n\u003cth\u003eRequirement\u003c\/th\u003e\n\u003cth\u003eClaude Sonnet 4 RPM\u003c\/th\u003e\n\u003cth\u003eTPM\u003c\/th\u003e\n\u003c\/tr\u003e\n\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003eTier 1\u003c\/td\u003e\n\u003ctd\u003eNew account\u003c\/td\u003e\n\u003ctd\u003e50\u003c\/td\u003e\n\u003ctd\u003e40,000\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eTier 2\u003c\/td\u003e\n\u003ctd\u003e$40+ spent\u003c\/td\u003e\n\u003ctd\u003e1,000\u003c\/td\u003e\n\u003ctd\u003e80,000\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eTier 3\u003c\/td\u003e\n\u003ctd\u003e$200+ spent\u003c\/td\u003e\n\u003ctd\u003e2,000\u003c\/td\u003e\n\u003ctd\u003e160,000\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003eTier 4\u003c\/td\u003e\n\u003ctd\u003e$400+ spent\u003c\/td\u003e\n\u003ctd\u003e4,000\u003c\/td\u003e\n\u003ctd\u003e400,000\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003cp\u003eThe exact figures change over time; check docs.anthropic.com\/rate-limits for the latest 
information.\u003c\/p\u003e\n\n\u003ch3\u003eRate limit headers in the response\u003c\/h3\u003e\n\u003cp\u003eEvery API response includes headers that report the current rate limit state:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003eanthropic-ratelimit-requests-limit: 1000\nanthropic-ratelimit-requests-remaining: 999\nanthropic-ratelimit-requests-reset: 2024-12-01T00:00:00Z\nanthropic-ratelimit-tokens-limit: 80000\nanthropic-ratelimit-tokens-remaining: 79500\nanthropic-ratelimit-tokens-reset: 2024-12-01T00:01:00Z\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eRead these headers to adjust your application's request rate proactively, before you hit a limit.\u003c\/p\u003e\n\n\u003ch2\u003eError Codes: Handling Each Error Type\u003c\/h2\u003e\n\n\u003ch3\u003eHTTP status code overview\u003c\/h3\u003e\n\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eStatus Code\u003c\/th\u003e\n\u003cth\u003eError name\u003c\/th\u003e\n\u003cth\u003eCommon causes\u003c\/th\u003e\n\u003c\/tr\u003e\n\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003e400\u003c\/td\u003e\n\u003ctd\u003eBad Request\u003c\/td\u003e\n\u003ctd\u003eMalformed request, missing field, or invalid value\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e401\u003c\/td\u003e\n\u003ctd\u003eUnauthorized\u003c\/td\u003e\n\u003ctd\u003eAPI key missing or invalid\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e403\u003c\/td\u003e\n\u003ctd\u003eForbidden\u003c\/td\u003e\n\u003ctd\u003eAPI key lacks permission, or the region is blocked\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e404\u003c\/td\u003e\n\u003ctd\u003eNot Found\u003c\/td\u003e\n\u003ctd\u003eEndpoint does not exist\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e429\u003c\/td\u003e\n\u003ctd\u003eToo Many Requests\u003c\/td\u003e\n\u003ctd\u003eRate limit exceeded\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e500\u003c\/td\u003e\n\u003ctd\u003eInternal 
Server Error\u003c\/td\u003e\n\u003ctd\u003eError on Anthropic's side\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e529\u003c\/td\u003e\n\u003ctd\u003eOverloaded\u003c\/td\u003e\n\u003ctd\u003eThe API is overloaded; retry later\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003ch3\u003eError response format\u003c\/h3\u003e\n\u003cp\u003eWhen an error occurs, the API returns JSON with this structure:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003e{\n  \"type\": \"error\",\n  \"error\": {\n    \"type\": \"rate_limit_error\",\n    \"message\": \"Rate limit exceeded for model claude-sonnet-4-5\"\n  }\n}\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eCommon error types:\u003c\/p\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003ccode\u003einvalid_request_error\u003c\/code\u003e: 400, the request is invalid\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003eauthentication_error\u003c\/code\u003e: 401, the API key is wrong\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003epermission_error\u003c\/code\u003e: 403, permission denied\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003enot_found_error\u003c\/code\u003e: 404, resource not found\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003erate_limit_error\u003c\/code\u003e: 429, rate limit exceeded\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003eapi_error\u003c\/code\u003e: 500, server error\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003eoverloaded_error\u003c\/code\u003e: 529, API overloaded\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2\u003eRetry Strategy: Handling Errors Intelligently\u003c\/h2\u003e\n\n\u003ch3\u003eExponential Backoff\u003c\/h3\u003e\n\u003cp\u003eWhen you hit a 429 or 529 error, do not retry immediately; use exponential backoff with jitter:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003eimport anthropic\nimport time\nimport random\n\ndef make_request_with_retry(client, max_retries=5, **kwargs):\n    \"\"\"\n    Send a request, retrying with exponential backoff.\n    \"\"\"\n    for attempt in 
range(max_retries):\n        try:\n            return client.messages.create(**kwargs)\n        except anthropic.RateLimitError:\n            if attempt == max_retries - 1:\n                raise  # Re-raise once the retries are exhausted\n\n            # Exponential backoff: 1s, 2s, 4s, 8s (the final attempt re-raises)\n            base_delay = 2 ** attempt\n            # Add jitter to avoid a thundering herd\n            jitter = random.uniform(0, 1)\n            delay = base_delay + jitter\n\n            print(f\"Rate limited. Retrying in {delay:.2f}s (attempt {attempt + 1}\/{max_retries})\")\n            time.sleep(delay)\n        except anthropic.APIStatusError as e:\n            if e.status_code == 529:  # Overloaded\n                if attempt == max_retries - 1:\n                    raise\n                delay = 2 ** attempt + random.uniform(0, 1)\n                time.sleep(delay)\n            else:\n                raise  # Do not retry other errors\n\n# Usage\nclient = anthropic.Anthropic()\nresponse = make_request_with_retry(\n    client,\n    model=\"claude-sonnet-4-5\",\n    max_tokens=1024,\n    messages=[{\"role\": \"user\", \"content\": \"Hello\"}]\n)\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch3\u003eSDK built-in retry\u003c\/h3\u003e\n\u003cp\u003eThe Anthropic SDKs ship with a built-in retry mechanism. 
You can configure it when you create the client:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003eimport anthropic\n\n# Python SDK: retries automatically with exponential backoff\nclient = anthropic.Anthropic(\n    max_retries=3,  # Default is 2\n)\n\n# Or disable retries entirely\nclient_no_retry = anthropic.Anthropic(\n    max_retries=0,\n)\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cpre\u003e\u003ccode\u003eimport Anthropic from \"@anthropic-ai\/sdk\";\n\n\/\/ Node.js SDK\nconst client = new Anthropic({\n  maxRetries: 3, \/\/ Default is 2\n  timeout: 20 * 1000, \/\/ 20-second timeout\n});\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch3\u003eWhen to retry and when not to\u003c\/h3\u003e\n\u003ctable\u003e\n\u003cthead\u003e\n\u003ctr\u003e\n\u003cth\u003eError Code\u003c\/th\u003e\n\u003cth\u003eRetry?\u003c\/th\u003e\n\u003cth\u003eReason\u003c\/th\u003e\n\u003c\/tr\u003e\n\u003c\/thead\u003e\n\u003ctbody\u003e\n\u003ctr\u003e\n\u003ctd\u003e400\u003c\/td\u003e\n\u003ctd\u003eNo\u003c\/td\u003e\n\u003ctd\u003eYour request is wrong; fix the code first\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e401\u003c\/td\u003e\n\u003ctd\u003eNo\u003c\/td\u003e\n\u003ctd\u003eThe API key is wrong; retrying will not help\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e403\u003c\/td\u003e\n\u003ctd\u003eNo\u003c\/td\u003e\n\u003ctd\u003ePermission problem; needs manual intervention\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e429\u003c\/td\u003e\n\u003ctd\u003eYes (with backoff)\u003c\/td\u003e\n\u003ctd\u003eTemporary; clears after a short wait\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e500\u003c\/td\u003e\n\u003ctd\u003eYes (limited)\u003c\/td\u003e\n\u003ctd\u003eMay be a transient server-side error\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e529\u003c\/td\u003e\n\u003ctd\u003eYes (with a longer backoff)\u003c\/td\u003e\n\u003ctd\u003eThe API is overloaded; you need to wait 
longer\u003c\/td\u003e\n\u003c\/tr\u003e\n\u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003ch2\u003eRequest and Response Format\u003c\/h2\u003e\n\n\u003ch3\u003eBasic request structure\u003c\/h3\u003e\n\u003cpre\u003e\u003ccode\u003e{\n  \"model\": \"claude-sonnet-4-5\",\n  \"max_tokens\": 1024,\n  \"messages\": [\n    {\n      \"role\": \"user\",\n      \"content\": \"Explain recursion with a simple example\"\n    }\n  ],\n  \"system\": \"You are a friendly programming teacher.\",\n  \"temperature\": 0.7,\n  \"stream\": false\n}\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eKey parameters:\u003c\/p\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003cstrong\u003emodel:\u003c\/strong\u003e Model ID (for example \u003ccode\u003eclaude-opus-4-20250514\u003c\/code\u003e, \u003ccode\u003eclaude-sonnet-4-5\u003c\/code\u003e, \u003ccode\u003eclaude-3-5-haiku-latest\u003c\/code\u003e)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003emax_tokens:\u003c\/strong\u003e Maximum number of output tokens; this field is required\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003emessages:\u003c\/strong\u003e Array of conversation turns (user\/assistant)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003esystem:\u003c\/strong\u003e System prompt; a top-level field, not part of the messages array\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003etemperature:\u003c\/strong\u003e 0.0 (most deterministic) to 1.0 (most creative); default 1.0\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003estream:\u003c\/strong\u003e Enables streaming mode\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3\u003eResponse structure\u003c\/h3\u003e\n\u003cpre\u003e\u003ccode\u003e{\n  \"id\": \"msg_01XFDUDYJgAACzvnptvVoYEL\",\n  \"type\": \"message\",\n  \"role\": \"assistant\",\n  \"content\": [\n    {\n      \"type\": \"text\",\n      \"text\": \"Recursion is when a function calls itself...\"\n    }\n  ],\n  \"model\": \"claude-sonnet-4-5\",\n  \"stop_reason\": \"end_turn\",\n  \"stop_sequence\": null,\n  \"usage\": {\n    \"input_tokens\": 25,\n    \"output_tokens\": 
156\n  }\n}\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eThe \u003ccode\u003estop_reason\u003c\/code\u003e field tells you why generation stopped:\u003c\/p\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003ccode\u003eend_turn\u003c\/code\u003e: the model finished naturally\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003emax_tokens\u003c\/code\u003e: the max_tokens limit was reached, so the response may be truncated\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003estop_sequence\u003c\/code\u003e: a defined stop sequence was encountered\u003c\/li\u003e\n\u003cli\u003e\n\u003ccode\u003etool_use\u003c\/code\u003e: the model wants to use a tool\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2\u003eSDK Error Handling Patterns\u003c\/h2\u003e\n\n\u003ch3\u003ePython: comprehensive handling\u003c\/h3\u003e\n\u003cpre\u003e\u003ccode\u003eimport anthropic\nimport logging\n\nlogger = logging.getLogger(__name__)\n\ndef safe_claude_call(client, **kwargs):\n    try:\n        response = client.messages.create(**kwargs)\n\n        # Check whether the response was truncated\n        if response.stop_reason == \"max_tokens\":\n            logger.warning(\"Response truncated by max_tokens. Increase max_tokens if needed.\")\n\n        return response\n\n    except anthropic.AuthenticationError:\n        logger.error(\"Invalid API key. Check ANTHROPIC_API_KEY.\")\n        raise\n    except anthropic.PermissionDeniedError:\n        logger.error(\"Permission denied. Check the API key's permissions.\")\n        raise\n    except anthropic.BadRequestError as e:\n        logger.error(f\"Invalid request: {e.message}\")\n        raise\n    except anthropic.RateLimitError:\n        # The SDK has already used up its automatic retries by the time this is raised\n        logger.warning(\"Rate limit still exceeded after the SDK's automatic retries.\")\n        raise\n    except anthropic.APIStatusError as e:\n        logger.error(f\"API error {e.status_code}: {e.message}\")\n        raise\n    except anthropic.APIConnectionError:\n        logger.error(\"Cannot connect to the Anthropic API. 
Check your network.\")\n        raise\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch3\u003eTypeScript: with type safety\u003c\/h3\u003e\n\u003cpre\u003e\u003ccode\u003eimport Anthropic from \"@anthropic-ai\/sdk\";\n\nasync function safeClaude(\n  client: Anthropic,\n  params: Anthropic.MessageCreateParamsNonStreaming\n): Promise\u0026lt;Anthropic.Message\u0026gt; {\n  try {\n    const response = await client.messages.create(params);\n\n    if (response.stop_reason === \"max_tokens\") {\n      console.warn(\"Response truncated. Consider increasing max_tokens.\");\n    }\n\n    return response;\n  } catch (error) {\n    if (error instanceof Anthropic.AuthenticationError) {\n      throw new Error(\"Invalid API key\");\n    }\n    if (error instanceof Anthropic.RateLimitError) {\n      \/\/ Raised only after the SDK's automatic retries are exhausted\n      console.warn(\"Rate limited: the SDK's automatic retries were exhausted\");\n      throw error;\n    }\n    if (error instanceof Anthropic.APIError) {\n      console.error(`API Error ${error.status}: ${error.message}`);\n      throw error;\n    }\n    throw error;\n  }\n}\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch2\u003eStreaming: Handling Responses in Real Time\u003c\/h2\u003e\n\n\u003ch3\u003eWhen should you use streaming?\u003c\/h3\u003e\n\u003cp\u003eInstead of waiting for the complete response and showing it all at once, streaming lets you display each token as soon as it is generated. 
Use streaming when:\u003c\/p\u003e\n\u003cul\u003e\n\u003cli\u003eYou are building a chatbot UX: the user sees the response right away, with no \"frozen\" feeling\u003c\/li\u003e\n\u003cli\u003eResponses are long: there is no need to wait tens of seconds before anything appears\u003c\/li\u003e\n\u003cli\u003eYou want to let users cancel generation early\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3\u003eStreaming with the Python SDK\u003c\/h3\u003e\n\u003cpre\u003e\u003ccode\u003eimport anthropic\n\nclient = anthropic.Anthropic()\n\n# Stream using a context manager\nwith client.messages.stream(\n    model=\"claude-sonnet-4-5\",\n    max_tokens=1024,\n    messages=[{\"role\": \"user\", \"content\": \"Explain black holes\"}]\n) as stream:\n    for text in stream.text_stream:\n        print(text, end=\"\", flush=True)\n\n    # Get the final message once the stream has finished\n    final_message = stream.get_final_message()\n\nprint(f\"\\nTokens used: {final_message.usage.input_tokens} in, {final_message.usage.output_tokens} out\")\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch3\u003eStreaming with the Node.js SDK\u003c\/h3\u003e\n\u003cpre\u003e\u003ccode\u003eimport Anthropic from \"@anthropic-ai\/sdk\";\n\nconst client = new Anthropic();\n\nconst stream = await client.messages.stream({\n  model: \"claude-sonnet-4-5\",\n  max_tokens: 1024,\n  messages: [{ role: \"user\", content: \"Explain black holes\" }],\n});\n\nfor await (const chunk of stream) {\n  if (\n    chunk.type === \"content_block_delta\" \u0026amp;\u0026amp;\n    chunk.delta.type === \"text_delta\"\n  ) {\n    process.stdout.write(chunk.delta.text);\n  }\n}\n\nconst finalMessage = await stream.finalMessage();\nconsole.log(`\\nUsage: ${finalMessage.usage.input_tokens} in \/ ${finalMessage.usage.output_tokens} out`);\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch3\u003eRate limits and streaming\u003c\/h3\u003e\n\u003cp\u003eStreaming is still subject to rate limits. A streaming request still counts as one request toward the RPM limit, and all of its tokens (input + output) count toward the TPM limit. 
A 429 error can occur before the stream starts, but not partway through a stream.\u003c\/p\u003e\n\n\u003ch2\u003eMonitoring and Observability\u003c\/h2\u003e\n\n\u003ch3\u003eTracking usage\u003c\/h3\u003e\n\u003cp\u003eEvery response returns a \u003ccode\u003eusage\u003c\/code\u003e object with the number of tokens consumed. Aggregate these figures to keep costs under control:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003eclass UsageTracker:\n    def __init__(self):\n        self.total_input_tokens = 0\n        self.total_output_tokens = 0\n        self.request_count = 0\n\n    def track(self, response):\n        self.total_input_tokens += response.usage.input_tokens\n        self.total_output_tokens += response.usage.output_tokens\n        self.request_count += 1\n\n    def cost_estimate_usd(self, model=\"claude-sonnet-4-5\"):\n        \"\"\"Estimate the cost in USD from the tracked usage.\"\"\"\n        pricing = {\n            \"claude-opus-4-20250514\": (15.0, 75.0),      # (input, output) USD per 1M tokens\n            \"claude-sonnet-4-5\": (3.0, 15.0),\n            \"claude-3-5-haiku-latest\": (0.80, 4.0),\n        }\n        # Unknown models fall back to the Sonnet price\n        input_price, output_price = pricing.get(model, (3.0, 15.0))\n        cost = (self.total_input_tokens \/ 1_000_000 * input_price +\n                self.total_output_tokens \/ 1_000_000 * output_price)\n        return round(cost, 4)\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch2\u003eBest Practices Summary\u003c\/h2\u003e\n\n\u003ch3\u003eChecklist before deploying to production\u003c\/h3\u003e\n\u003cul\u003e\n\u003cli\u003e\n\u003cstrong\u003eAPI key security:\u003c\/strong\u003e Use a secret manager (AWS Secrets Manager, HashiCorp Vault) instead of an env file\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eRetry with backoff:\u003c\/strong\u003e Always handle 429 and 529 with exponential backoff\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eCircuit breaker:\u003c\/strong\u003e Implement this pattern to stop calling the API when the error rate 
is high\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eTimeout:\u003c\/strong\u003e Set a reasonable timeout (30-60s for typical requests)\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eLogging:\u003c\/strong\u003e Log error codes; never log the API key or sensitive content\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eUsage monitoring:\u003c\/strong\u003e Alert when token usage approaches the limit\u003c\/li\u003e\n\u003cli\u003e\n\u003cstrong\u003eGraceful degradation:\u003c\/strong\u003e Provide a fallback for when the API is unavailable\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch3\u003eCommon mistakes to avoid\u003c\/h3\u003e\n\u003cul\u003e\n\u003cli\u003eNot setting \u003ccode\u003emax_tokens\u003c\/code\u003e: the request will be rejected\u003c\/li\u003e\n\u003cli\u003eHardcoding the API key in source code\u003c\/li\u003e\n\u003cli\u003eNot handling \u003ccode\u003estop_reason == \"max_tokens\"\u003c\/code\u003e: responses get truncated silently\u003c\/li\u003e\n\u003cli\u003eRetrying every error code: 400, 401, and 403 should not be retried\u003c\/li\u003e\n\u003cli\u003eNot reading the rate limit headers: you miss the chance to adjust proactively\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003ch2\u003eConclusion\u003c\/h2\u003e\n\u003cp\u003eAuthentication, rate limits, and error handling are the three pillars of a durable Claude API integration. Start with sound API key management, implement exponential backoff for rate limit errors, and handle each error code appropriately.\u003c\/p\u003e\n\n\u003cp\u003eAnthropic's official SDKs (Python and Node.js) already handle many edge cases automatically; use them instead of implementing HTTP calls from scratch. 
Read the rate limit headers proactively and monitor token usage so that neither bills nor downtime catch you by surprise.\u003c\/p\u003e\n\u003chr\u003e\n\u003ch3\u003eRelated articles\u003c\/h3\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ca href=\"\/en\/products\/anthropic-console-qu%E1%BA%A3n-ly-api-billing-va-workbench\"\u003eAnthropic Console: API management, billing, and workbench\u003c\/a\u003e\u003c\/li\u003e\n\u003cli\u003e\u003ca href=\"\/en\/products\/building-effective-agents-v%E1%BB%9Bi-claude-h%C6%B0%E1%BB%9Bng-d%E1%BA%ABn-ki%E1%BA%BFn-truc\"\u003eBuilding Effective Agents with Claude: An architecture guide\u003c\/a\u003e\u003c\/li\u003e\n\u003cli\u003e\u003ca href=\"\/en\/products\/context-engineering-ngh%E1%BB%87-thu%E1%BA%ADt-qu%E1%BA%A3n-ly-context-cho-claude\"\u003eContext Engineering: The art of managing context for Claude\u003c\/a\u003e\u003c\/li\u003e\n\u003cli\u003e\u003ca href=\"\/en\/products\/b%E1%BA%AFt-d%E1%BA%A7u-v%E1%BB%9Bi-claude-vision-g%E1%BB%ADi-hinh-%E1%BA%A3nh-qua-api\"\u003eGetting started with Claude Vision: Sending images through the API\u003c\/a\u003e\u003c\/li\u003e\n\u003cli\u003e\u003ca href=\"\/en\/products\/claude-cho-data-t%E1%BA%A1o-bi%E1%BB%83u-d%E1%BB%93-va-visualization\"\u003eClaude for Data: Creating charts and visualizations\u003c\/a\u003e\u003c\/li\u003e\n\u003c\/ul\u003e","brand":"Minh Tuấn","offers":[{"title":"Default Title","offer_id":47721067970772,"sku":null,"price":0.0,"currency_code":"VND","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0821\/0264\/9044\/files\/claude-api-authentication-rate-limits-va-error-handling.jpg?v=1774521086","url":"https:\/\/claude.vn\/en\/products\/claude-api-authentication-rate-limits-va-error-handling","provider":"CLAUDE.VN","version":"1.0","type":"link"}