Source: Anthropic

Uploading PDFs to the Claude API: Reading and Summarizing Documents

Minh Tuấn, CTO, Transform Group

26/03/2026 · 5 min read

Claude can read and understand PDFs: beyond extracting raw text, it interprets the content, tables, headings, and overall structure of a document. That makes it well suited to applications that process contracts, reports, research papers, and technical documentation.

Limits and requirements

Maximum PDF size: 32 MB
Maximum page count: 100 pages
Upload format: Base64-encoded string
Claude reads both text-based PDFs and scanned PDFs (via OCR)
Supported from claude-3-5-sonnet-20241022 onward
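
The size limit above can be checked locally before any request is made. The sketch below is a minimal pre-flight check, not part of the Anthropic SDK: `base64_size` and `check_pdf` are hypothetical helper names, and it assumes the 32 MB figure applies to the Base64-encoded payload (Base64 grows data by roughly a third). It does not count pages, which would need a PDF library.

```python
from pathlib import Path
from typing import List

MAX_REQUEST_BYTES = 32 * 1024 * 1024  # the 32 MB limit quoted above


def base64_size(raw_size: int) -> int:
    """Length of the padded Base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


def check_pdf(pdf_path: str, limit: int = MAX_REQUEST_BYTES) -> List[str]:
    """Return a list of problems; an empty list means the file looks sendable."""
    path = Path(pdf_path)
    if not path.is_file():
        return [f"{pdf_path}: file not found"]

    problems = []
    # Every PDF file begins with the "%PDF-" signature.
    with path.open("rb") as f:
        if f.read(5) != b"%PDF-":
            problems.append("file does not start with the %PDF- signature")

    # Assumption: the limit is compared against the encoded size, not the raw size.
    encoded = base64_size(path.stat().st_size)
    if encoded > limit:
        problems.append(
            f"Base64 payload would be {encoded} bytes, over the {limit}-byte limit"
        )
    return problems
```

Under that assumption, a raw file of about 24 MB is already at the limit once encoded, so it is worth checking the encoded size rather than the size on disk.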

Step 1: Read the PDF and encode it as Base64

import anthropic
import base64
from pathlib import Path

client = anthropic.Anthropic()

def load_pdf_as_base64(pdf_path: str) -> str:
    """Read a PDF file and return its contents as a Base64 string."""
    with open(pdf_path, "rb") as f:
        pdf_data = f.read()
    return base64.standard_b64encode(pdf_data).decode("utf-8")

# Read the PDF
pdf_base64 = load_pdf_as_base64("bao_cao_tai_chinh.pdf")
print(f"PDF size: {len(pdf_base64)} characters (base64)")

Step 2: Send the PDF to the API

def analyze_pdf(pdf_path: str, question: str) -> str:
    """Send a PDF to Claude and ask a question about it."""
    pdf_base64 = load_pdf_as_base64(pdf_path)

    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_base64,
                    },
                },
                {
                    "type": "text",
                    "text": question
                }
            ],
        }],
    )

    return response.content[0].text

# Example usage
answer = analyze_pdf(
    "hop_dong_thue_nha.pdf",
    "Summarize the most important terms of this contract in Vietnamese."
)
)
print(answer)
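
The same `document` content block is written out by hand in every function of this article. One way to avoid the repetition is to build it in one place. The sketch below assumes two hypothetical helpers, `pdf_document_block` and `pdf_user_message`, which are not part of the SDK; they only assemble the dictionaries that `client.messages.create` receives in the example above.

```python
from typing import Optional


def pdf_document_block(pdf_base64: str, title: Optional[str] = None) -> dict:
    """Build the 'document' content block for a Base64-encoded PDF."""
    block = {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": pdf_base64,
        },
    }
    if title is not None:
        block["title"] = title  # optional label for the document
    return block


def pdf_user_message(pdf_base64: str, question: str,
                     title: Optional[str] = None) -> dict:
    """Build a user message that carries the PDF first, then the question."""
    return {
        "role": "user",
        "content": [
            pdf_document_block(pdf_base64, title),
            {"type": "text", "text": question},
        ],
    }
```

With these in place, the call in `analyze_pdf` could pass `messages=[pdf_user_message(pdf_base64, question)]` and keep the request shape identical.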

Summarizing long documents

def summarize_pdf(pdf_path: str, summary_type: str = "general") -> dict:
    """
    Summarize a PDF in one of several styles.

    summary_type:
    - "general": General overview
    - "executive": Executive summary for leadership
    - "technical": Focus on technical details
    - "action_items": Extract action items and deadlines
    """
    pdf_base64 = load_pdf_as_base64(pdf_path)

    prompts = {
        "general": """Summarize this document in Vietnamese. Include:
1. The main purpose of the document
2. The key points (at most 5)
3. Conclusions or recommendations""",

        "executive": """Write an executive summary of this document in Vietnamese.
Assume the reader is a CEO with limited time. At most 150 words.
Cover: the problem, the proposed solution, and the business impact.""",

        "technical": """Extract all important technical information from the document.
Include: specifications, requirements, constraints, and technical decisions.""",

        "action_items": """List all action items, tasks, and deadlines from the document.
Format: a structured list with the owner (if any) and the deadline.""",
    }

    prompt = prompts.get(summary_type, prompts["general"])

    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1500,
        messages=[{
            "role": "user",
            "content": [
                {
                    "type": "document",
                    "source": {
                        "type": "base64",
                        "media_type": "application/pdf",
                        "data": pdf_base64,
                    },
                },
                {"type": "text", "text": prompt}
            ],
        }],
    )

    return {
        "file": pdf_path,
        "type": summary_type,
        "summary": response.content[0].text,
        "tokens_used": response.usage.input_tokens + response.usage.output_tokens,
    }

# Test
result = summarize_pdf("annual_report_2024.pdf", "executive")
print(f"Summary ({result['tokens_used']} tokens):")
print(result["summary"])

Extracting structured information

import json

def extract_structured_data(pdf_path: str, schema: dict) -> dict:
    """
    Extract specific information from a PDF according to a predefined schema.
    """
    pdf_base64 = load_pdf_as_base64(pdf_path)

    schema_str = json.dumps(schema, ensure_ascii=False, indent=2)

    prompt = f"""Read the PDF document and extract information according to the following JSON schema:

{schema_str}

Return valid JSON that matches the schema exactly. If a piece of information is not in the document, use null.
Return only the JSON, with no other text."""

    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2000,
        messages=[
            {
                "role": "user",
                "content": [
                    {
                        "type": "document",
                        "source": {
                            "type": "base64",
                            "media_type": "application/pdf",
                            "data": pdf_base64,
                        },
                    },
                    {"type": "text", "text": prompt}
                ],
            },
            # Prefill the assistant turn so the output starts as a JSON object
            {"role": "assistant", "content": "{"}
        ],
        temperature=0.0,
    )

    json_str = "{" + response.content[0].text
    return json.loads(json_str)

# Example: extract contract information
contract_schema = {
    "contract_type": "string",
    "parties": {
        "party_a": {"name": "string", "address": "string"},
        "party_b": {"name": "string", "address": "string"}
    },
    "effective_date": "string (YYYY-MM-DD)",
    "end_date": "string (YYYY-MM-DD)",
    "total_value": "number",
    "currency": "string",
    "key_terms": ["string"],
    "penalty_clauses": "string or null"
}

data = extract_structured_data("hop_dong.pdf", contract_schema)
print(f"Contract type: {data['contract_type']}")
print(f"Value: {data['total_value']} {data['currency']}")
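
The prefill trick above relies on the model stopping right after the closing brace. If it adds a trailing remark, `json.loads` raises an error. A slightly more tolerant parse is sketched below using only the standard library; `parse_prefilled_json` is a hypothetical helper name, and it still raises if the JSON itself is malformed or truncated.

```python
import json


def parse_prefilled_json(completion: str, prefill: str = "{") -> dict:
    """Re-attach the prefill, parse the first JSON object, ignore trailing text."""
    text = prefill + completion
    # raw_decode parses one JSON value from the start of the string and
    # returns it with the index where parsing stopped, so anything after
    # the closing brace is simply not looked at.
    obj, _end = json.JSONDecoder().raw_decode(text)
    if not isinstance(obj, dict):
        raise ValueError("expected a JSON object")
    return obj
```

In `extract_structured_data`, the last two lines could then become `return parse_prefilled_json(response.content[0].text)`.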

Processing multiple PDFs at once

from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def batch_summarize_pdfs(pdf_folder: str, max_workers: int = 3) -> list:
    """Summarize multiple PDFs in parallel."""
    pdf_files = list(Path(pdf_folder).glob("*.pdf"))
    print(f"Found {len(pdf_files)} PDF files")

    def process_one(pdf_path):
        try:
            result = summarize_pdf(str(pdf_path), "general")
            return {"file": pdf_path.name, "status": "success", "summary": result["summary"]}
        except Exception as e:
            return {"file": pdf_path.name, "status": "error", "error": str(e)}

    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        results = list(executor.map(process_one, pdf_files))

    success = sum(1 for r in results if r["status"] == "success")
    print(f"Done: {success}/{len(pdf_files)} files")
    return results

# Summarize every PDF in the folder
results = batch_summarize_pdfs("./reports/", max_workers=3)
for r in results:
    print(f"\n{r['file']}: {r['status']}")
    if r["status"] == "success":
        print(r["summary"][:200] + "...")
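
Running several requests in parallel makes transient failures such as rate limits more likely, and the batch function above records them as errors without a second attempt. A generic retry wrapper with exponential backoff is sketched below. `with_retries` is a hypothetical helper, not an SDK function, and the delays (1 s, 2 s, 4 s) are arbitrary choices. The SDK client also has its own `max_retries` setting, so this may be redundant in some setups.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def with_retries(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on the given exceptions with doubling delays."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            # Wait base_delay, then 2x, then 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")
```

Inside `process_one`, the call could be wrapped as `with_retries(lambda: summarize_pdf(str(pdf_path), "general"), retry_on=(anthropic.RateLimitError,))` so that only rate-limit errors are retried.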

Q&A over a document

class PDFChatbot:
    """A chatbot that answers questions about a single PDF."""

    def __init__(self, pdf_path: str):
        self.pdf_base64 = load_pdf_as_base64(pdf_path)
        self.pdf_name = Path(pdf_path).name
        self.conversation_history = []
        print(f"Loaded PDF: {self.pdf_name}")

    def ask(self, question: str) -> str:
        # Add the question to the history
        self.conversation_history.append({
            "role": "user",
            "content": question
        })

        # Build the messages with the PDF at the start of the conversation
        messages = [
            {
                "role": "user",
                "content": [
                    {
                        "type": "document",
                        "source": {
                            "type": "base64",
                            "media_type": "application/pdf",
                            "data": self.pdf_base64,
                        },
                        "title": self.pdf_name,
                    },
                    {"type": "text", "text": "This is the document you will use to answer my questions."}
                ],
            },
            {"role": "assistant", "content": "I have read the document. What would you like to ask?"},
            # Append the conversation history
            *self.conversation_history,
        ]

        response = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=1000,
            system="Answer questions based on the document. If the information is not in the document, say so clearly.",
            messages=messages,
        )

        answer = response.content[0].text
        self.conversation_history.append({"role": "assistant", "content": answer})
        return answer

# Usage
chatbot = PDFChatbot("bao_cao_thuong_nien_2024.pdf")

print(chatbot.ask("What was the revenue in 2024?"))
print(chatbot.ask("Did it increase or decrease compared with 2023?"))
print(chatbot.ask("Which risks are mentioned in the report?"))
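
The chatbot above resends the whole conversation history on every call, so a long session keeps growing on top of the PDF itself. One possible mitigation, sketched below, is to keep only the most recent messages. `trim_history` is a hypothetical helper and the cap is an arbitrary choice. It also drops a leading assistant message so that the trimmed history still starts with a user turn, which the message layout in `ask` expects.

```python
from typing import Dict, List


def trim_history(history: List[Dict[str, str]],
                 max_messages: int = 20) -> List[Dict[str, str]]:
    """Keep at most max_messages recent messages, starting on a user turn."""
    if max_messages <= 0:
        return []
    trimmed = history[-max_messages:]
    # The PDF exchange that precedes the history ends with an assistant
    # message, so the history itself has to begin with a user message.
    while trimmed and trimmed[0]["role"] != "user":
        trimmed = trimmed[1:]
    return trimmed
```

In `ask`, `*self.conversation_history` could be replaced by `*trim_history(self.conversation_history)`. The trade-off is that the model no longer sees the earliest questions, so follow-ups that refer far back may lose context.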

Tips and best practices

Use a capable model: claude-opus-4-5 generally handles complex documents better than Haiku
Compress the PDF first: if the PDF is large, compress it to under 32 MB before sending
Split long documents: PDFs over 100 pages need to be split into several parts
Provide context: state what the document is (contract, report, etc.) so Claude interprets its format correctly
Cache the PDF: if you ask several questions about the same PDF, use Prompt Caching so the document is not reprocessed at full cost on every request (the Base64 data is still sent each time)
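
For the tip about splitting long documents, the first step is deciding which pages go into which part. The sketch below only computes the page ranges. `page_chunks` is a hypothetical helper, and the actual splitting of the file would need a PDF library, which is not shown here.

```python
from typing import List, Tuple


def page_chunks(total_pages: int, max_pages: int = 100) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (first, last) page ranges of at most max_pages."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if max_pages < 1:
        raise ValueError("max_pages must be at least 1")
    return [
        (first, min(first + max_pages - 1, total_pages))
        for first in range(1, total_pages + 1, max_pages)
    ]


# A 250-page report becomes three parts.
print(page_chunks(250))  # [(1, 100), (101, 200), (201, 250)]
```

Each part can then be summarized on its own and the partial summaries combined in a final request.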

PDF processing is one of the most common enterprise use cases for Claude. Combine it with Prompt Caching to reduce cost when working with long documents.
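
Prompt Caching is requested by adding a `cache_control` field to a content block. The sketch below shows what the document block from the earlier examples might look like with caching enabled. `cached_pdf_block` is a hypothetical helper name, and whether a given request actually hits the cache depends on conditions that the source does not cover, such as minimum prompt length and cache lifetime.

```python
def cached_pdf_block(pdf_base64: str) -> dict:
    """Build a 'document' block that asks the API to cache the PDF prefix."""
    return {
        "type": "document",
        "source": {
            "type": "base64",
            "media_type": "application/pdf",
            "data": pdf_base64,
        },
        # Marks the end of the prefix that may be cached and reused
        # by later requests that start with the same content.
        "cache_control": {"type": "ephemeral"},
    }
```

The block is used exactly where the plain document block appears in `PDFChatbot.ask`. The response's `usage` object should then report cache activity (for example `cache_read_input_tokens`), which is a simple way to confirm that later questions are reusing the cached document.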

Writer for the Claude AI knowledge platform for Vietnamese readers. Software engineer with more than 20 years of experience, passionate about AI and sharing technology knowledge.
