Intermediate · Technical · Claude API · Source: Anthropic

Agent Workflows — Chaining, Routing, Parallelization

Minh Tuấn, CTO, Transform Group

26/03/2026 · 6 min read



Anthropic has distilled its experience building many AI applications into five basic agentic patterns. These are "design patterns" for AI workflows: like the Factory, Observer, or Strategy patterns in object-oriented programming, but for LLM systems.

Understanding and correctly applying these five patterns helps you build complex AI applications that are structured, maintainable, and scalable.

Why do we need agentic patterns?

A single LLM call has limits: a finite context window, no parallelism, and no feedback loop. Agentic patterns address these limits by orchestrating multiple LLM calls into a pipeline.

Pattern 1: Prompt Chaining (sequential processing)

Split a complex task into sequential steps, where the output of one step is the input of the next.

import anthropic

client = anthropic.Anthropic()

def prompt_chain(document_text):
    """Three-step pipeline: Extract -> Translate -> Summarize"""

    # Step 1: extract the key information
    step1 = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": f"""Extract key facts from this document.
Output as JSON with fields: main_topic, key_points (list), entities (list).

Document:
{document_text}"""
        }]
    )
    extracted = step1.content[0].text

    # Step 2: translate into Vietnamese
    step2 = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": f"""Translate this JSON data to Vietnamese.
Keep the JSON structure exactly the same.

{extracted}"""
        }]
    )
    translated = step2.content[0].text

    # Step 3: write the executive summary
    step3 = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=500,
        messages=[{
            "role": "user",
            "content": f"""Based on these extracted facts, write a 3-sentence executive summary in Vietnamese.

Facts:
{translated}"""
        }]
    )

    return {
        "extracted": extracted,
        "translated": translated,
        "summary": step3.content[0].text
    }
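
Step 1 asks the model for JSON, but the reply is plain text and may arrive wrapped in a code fence or with a sentence around it. A minimal sketch of a tolerant parser, assuming the reply contains exactly one top-level JSON object; `parse_json_reply` is a hypothetical helper, not part of the Anthropic SDK:

```python
import json


def parse_json_reply(text: str) -> dict:
    """Parse a JSON object out of a model reply (hypothetical helper).

    Assumes the reply holds one top-level JSON object, possibly surrounded
    by a code fence or a short sentence.
    """
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in reply")
    # json.JSONDecodeError is a subclass of ValueError, so callers can
    # catch ValueError for both "not found" and "malformed".
    return json.loads(text[start:end + 1])
```

With this in place, `parse_json_reply(extracted)` could be called before Step 2, so that a malformed extraction fails early instead of being translated and summarized.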

When to use Prompt Chaining:

The task has natural logical steps (extract → transform → generate)
Intermediate results need to be checked or stored
Different steps can use different models (Haiku for simple steps, Opus for complex ones)
You want to retry each step independently when an error occurs
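
The points about checking intermediate results and retrying a single step can be sketched as a small wrapper. This is an illustration under assumptions, not the article's implementation: `run_step` is a hypothetical name, `step` stands for any zero-argument function that performs one LLM call and returns its text, and `check` is any function that validates or parses that text and raises `ValueError` when it is unusable.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def run_step(step: Callable[[], str], check: Callable[[str], T], max_attempts: int = 3) -> T:
    """Run one chain step, re-running only that step until `check` accepts it."""
    last_error = None
    for _ in range(max_attempts):
        raw = step()  # e.g. one client.messages.create(...) call
        try:
            return check(raw)  # parsed/validated result goes to the next step
        except ValueError as err:
            last_error = err  # remember why this attempt failed, then retry
    raise RuntimeError(f"step failed after {max_attempts} attempts") from last_error
```

For a step that must return JSON, `json.loads` can serve as `check`, since its `JSONDecodeError` is a `ValueError`. Earlier steps are not repeated, which is the main cost advantage of retrying per step.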

Pattern 2: Routing (classification and dispatch)

A "classifier" LLM analyzes the input and decides which specialized handler to route it to.

def routing_agent(user_query):
    """Route the query to the appropriate specialist"""

    # Router: classify the intent
    router_response = client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=100,
        messages=[{
            "role": "user",
            "content": f"""Classify this query into exactly one category:
- TECHNICAL: code, bugs, API, architecture questions
- BILLING: pricing, invoices, subscriptions
- GENERAL: other questions

Query: {user_query}

Respond with only the category name."""
        }]
    )

    category = router_response.content[0].text.strip().upper()  # tolerate case differences

    # Specialist handlers
    specialists = {
        "TECHNICAL": handle_technical,
        "BILLING": handle_billing,
        "GENERAL": handle_general
    }

    handler = specialists.get(category, handle_general)
    return handler(user_query)

def handle_technical(query):
    return client.messages.create(
        model="claude-opus-4-5",  # Stronger model for technical questions
        max_tokens=4000,
        system="You are a senior software engineer. Provide detailed technical answers with code examples.",
        messages=[{"role": "user", "content": query}]
    ).content[0].text

def handle_billing(query):
    return client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=1000,
        system="You are a billing specialist. Be precise about pricing and refer to official documentation.",
        messages=[{"role": "user", "content": query}]
    ).content[0].text

def handle_general(query):
    return client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=1000,
        messages=[{"role": "user", "content": query}]
    ).content[0].text

When to use Routing:

There are several kinds of questions or tasks that specialists handle better
You want to use an expensive model only when it is really needed
Different kinds of domain expertise need to be applied
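
A router is only useful if its reply maps cleanly onto a handler, and a model may answer "Category: BILLING." instead of the bare label. The sketch below shows one way to normalize the reply before the dictionary lookup. `normalize_category` and `VALID_CATEGORIES` are hypothetical names; the labels are the three used in the example above.

```python
VALID_CATEGORIES = ("TECHNICAL", "BILLING", "GENERAL")


def normalize_category(raw: str, default: str = "GENERAL") -> str:
    """Map a free-text router reply onto one known category label."""
    cleaned = raw.strip().upper()
    if cleaned in VALID_CATEGORIES:
        return cleaned
    # Tolerate replies such as "Category: BILLING." by searching for a label
    # inside the text; fall back when zero or several labels appear.
    found = [c for c in VALID_CATEGORIES if c in cleaned]
    return found[0] if len(found) == 1 else default
```

Falling back to the general handler on an ambiguous reply keeps the expensive specialist from being invoked by accident.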

Pattern 3: Parallelization (parallel processing)

Split the task into independent sub-tasks, run them in parallel, then aggregate the results.

import asyncio
import anthropic

async_client = anthropic.AsyncAnthropic()

async def analyze_product_reviews(reviews: list[str]):
    """Analyze many reviews in parallel"""

    async def analyze_single_review(review: str, review_id: int):
        response = await async_client.messages.create(
            model="claude-haiku-4-5",
            max_tokens=300,
            messages=[{
                "role": "user",
                "content": f"""Analyze this review. Return JSON:
{{
  "sentiment": "positive/negative/neutral",
  "score": 1-5,
  "key_issue": "main complaint or praise",
  "actionable": true/false
}}

Review: {review}"""
            }]
        )
        return {"id": review_id, "analysis": response.content[0].text}

    # Run all of them in parallel
    tasks = [
        analyze_single_review(review, i)
        for i, review in enumerate(reviews)
    ]
    results = await asyncio.gather(*tasks)

    # Aggregate
    return aggregate_reviews(results)

def aggregate_reviews(results):
    """Aggregate the results from all reviews"""
    # In production: parse the JSON, compute stats, etc.
    return {
        "total": len(results),
        "results": results
    }

# Usage
reviews = ["Great product!", "Delivery was slow", "Amazing quality, worth the price"]
results = asyncio.run(analyze_product_reviews(reviews))
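
Launching every request at once works for three reviews, but with a large batch it can hit API rate limits, and one failed call makes `asyncio.gather` raise for the whole batch. A minimal sketch of bounded fan-out, assuming a cap of a few concurrent calls is acceptable; `gather_bounded` and `worker` are hypothetical names, where `worker` stands for any coroutine function such as a wrapper around `analyze_single_review`:

```python
import asyncio
from typing import Awaitable, Callable


async def gather_bounded(
    items: list[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,
) -> list:
    """Run `worker` over `items` with at most `max_concurrency` calls in flight.

    Results come back in input order; a failed call is returned as its
    exception object instead of aborting the whole batch.
    """
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item: str):
        async with semaphore:  # wait here while the cap is reached
            return await worker(item)

    return await asyncio.gather(*(guarded(i) for i in items), return_exceptions=True)
```

The caller can then separate successes from failures with `isinstance(result, Exception)` and retry only the failed items.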

Fan-out/Fan-in with Voting

A variant of Parallelization: run the same task several times, then vote to select the best result:

async def parallel_with_voting(question: str, n_votes: int = 3):
    """Run the question N times, then have an LLM pick the best answer"""

    async def single_run(run_id: int):
        response = await async_client.messages.create(
            model="claude-haiku-4-5",
            max_tokens=500,
            temperature=0.7,  # Some randomness so the answers differ
            messages=[{"role": "user", "content": question}]
        )
        return response.content[0].text

    tasks = [single_run(i) for i in range(n_votes)]
    answers = await asyncio.gather(*tasks)

    # Voting: use an LLM to choose the best answer
    voting_prompt = f"""
You got {n_votes} different answers to the same question.
Choose the best one and explain why.

Question: {question}

Answers:
""" + "\n".join([f"{i+1}. {a}" for i, a in enumerate(answers)])

    final = await async_client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=500,
        messages=[{"role": "user", "content": voting_prompt}]
    )
    return final.content[0].text
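
The function above uses a second LLM call as a judge. When the answers are short and closed-form (a label, yes/no, a number), a plain majority vote needs no extra call. The following is a sketch of that alternative, not the article's method; `majority_vote` is a hypothetical helper.

```python
from collections import Counter


def majority_vote(answers: list[str]) -> tuple[str, float]:
    """Return the most common answer and the share of runs that agreed with it.

    Answers are compared after trimming whitespace and lowercasing, so this
    only suits short, closed-form outputs.
    """
    if not answers:
        raise ValueError("need at least one answer")
    counts = Counter(a.strip().lower() for a in answers)
    winner, votes = counts.most_common(1)[0]  # ties go to the first answer seen
    return winner, votes / len(answers)
```

The agreement share is a rough confidence signal: a low value suggests the question is ambiguous or the runs disagree, and the result may deserve a stronger model or human review.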

Pattern 4: Orchestrator-Workers

An orchestrator LLM dynamically analyzes the task and coordinates several specialized worker LLMs:

def orchestrator_workers(complex_task: str):
    """The orchestrator creates the plan; workers execute it"""

    # Orchestrator: create the execution plan
    plan_response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=2000,
        messages=[{
            "role": "user",
            "content": f"""You are an orchestrator. Break this task into subtasks.
Output JSON array of steps, each with:
- step_id: number
- description: what to do
- worker_type: "researcher"/"writer"/"coder"/"analyst"
- depends_on: list of step_ids that must complete first

Task: {complex_task}"""
        }]
    )

    import json
    plan = json.loads(plan_response.content[0].text)

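    # Execute steps in plan order (assumes the orchestrator lists each step after its dependencies)
    results = {}
    for step in plan:
        # Gather results from dependencies that have already run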
        deps = step.get("depends_on", [])
        dep_context = {dep_id: results[dep_id] for dep_id in deps if dep_id in results}

        # Call the matching worker
        worker_result = call_worker(
            step["worker_type"],
            step["description"],
            dep_context
        )
        results[step["step_id"]] = worker_result

    # Orchestrator synthesizes the results
    final = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=3000,
        messages=[{
            "role": "user",
            "content": f"""Synthesize these worker results into a final deliverable.
Original task: {complex_task}
Results: {json.dumps(results, ensure_ascii=False)}"""
        }]
    )
    return final.content[0].text

def call_worker(worker_type: str, task: str, context: dict):
    system_prompts = {
        "researcher": "You are a research specialist. Find facts and provide citations.",
        "writer": "You are a professional writer. Create clear, engaging content.",
        "coder": "You are a senior developer. Write clean, well-documented code.",
        "analyst": "You are a data analyst. Provide quantitative insights."
    }
    return client.messages.create(
        model="claude-haiku-4-5",
        max_tokens=2000,
        system=system_prompts.get(worker_type, "You are a helpful assistant."),
        messages=[{
            "role": "user",
            "content": f"Task: {task}\nContext from previous steps: {context}"
        }]
    ).content[0].text
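
The loop in `orchestrator_workers` runs steps in the order the model listed them, so a step placed before its dependency would run without that context. One way to guard against this is to sort the plan by `depends_on` first. The sketch below uses the standard-library `graphlib` module (Python 3.9+); `order_plan` is a hypothetical helper, and it assumes each step is a dict with the `step_id` and `depends_on` fields requested in the orchestrator prompt.

```python
from graphlib import CycleError, TopologicalSorter


def order_plan(plan: list[dict]) -> list[dict]:
    """Return the plan's steps sorted so every step follows its dependencies."""
    steps = {step["step_id"]: step for step in plan}
    # Map each step to the set of steps that must finish before it.
    graph = {sid: set(step.get("depends_on", [])) for sid, step in steps.items()}
    unknown = {d for deps in graph.values() for d in deps} - set(steps)
    if unknown:
        raise ValueError(f"plan references unknown step_ids: {sorted(unknown)}")
    try:
        ordered_ids = list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        raise ValueError("plan contains a dependency cycle") from err
    return [steps[sid] for sid in ordered_ids]
```

Iterating over `order_plan(plan)` instead of `plan` would make the dependency lookup reliable, and a malformed plan would fail before any worker call is paid for.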

Pattern 5: Evaluator-Optimizer

A generator produces the output, an evaluator assesses it, and the loop repeats until the output reaches a quality threshold:

def evaluator_optimizer(task: str, max_iterations: int = 3):
    """Improve the output over several iterations"""
    current_output = None
    feedback_history = []

    for iteration in range(max_iterations):
        # Generator: create or improve the output
        gen_prompt = task if iteration == 0 else f"""
Task: {task}

Previous attempt:
{current_output}

Feedback received:
{chr(10).join(feedback_history)}

Please improve the output addressing all feedback points."""

        current_output = client.messages.create(
            model="claude-opus-4-5",
            max_tokens=3000,
            messages=[{"role": "user", "content": gen_prompt}]
        ).content[0].text

        # Evaluator: score and give feedback
        eval_response = client.messages.create(
            model="claude-haiku-4-5",
            max_tokens=500,
            messages=[{
                "role": "user",
                "content": f"""Evaluate this output for the task: {task}

Output:
{current_output}

Rate 1-10 and list specific improvements needed.
If score >= 8, output APPROVED.
Format: SCORE: X
FEEDBACK: ..."""
            }]
        ).content[0].text

        if "APPROVED" in eval_response or "SCORE: 9" in eval_response or "SCORE: 10" in eval_response:
            return {"output": current_output, "iterations": iteration + 1, "approved": True}

        feedback_history.append(f"Iteration {iteration + 1}: {eval_response}")

    return {"output": current_output, "iterations": max_iterations, "approved": False}
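
The approval check above matches substrings, so feedback containing "NOT APPROVED" would still pass. A stricter option is to parse the number after `SCORE:` and compare it with the threshold. The sketch below assumes the evaluator follows the `SCORE: X` format requested in the prompt; `parse_score` and `is_approved` are hypothetical helpers.

```python
import re
from typing import Optional

_SCORE_RE = re.compile(r"SCORE:\s*(\d{1,2})\b")


def parse_score(evaluation: str) -> Optional[int]:
    """Extract the integer after 'SCORE:' from an evaluator reply, if present."""
    match = _SCORE_RE.search(evaluation)
    if match is None:
        return None
    score = int(match.group(1))
    return score if 1 <= score <= 10 else None  # reject out-of-range values


def is_approved(evaluation: str, threshold: int = 8) -> bool:
    """Approve only on a parsed numeric score, not on the word APPROVED."""
    score = parse_score(evaluation)
    return score is not None and score >= threshold
```

A reply with no parsable score counts as not approved, so a malformed evaluation leads to another iteration rather than an accidental early exit.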

Choosing the right pattern

Pattern	Use when	Complexity
Prompt Chaining	The task has clear, sequential steps	Low
Routing	Many input types that specialists handle better	Low
Parallelization	Independent tasks that need speed, or voting	Medium
Orchestrator-Workers	Complex tasks whose steps are not known in advance	High
Evaluator-Optimizer	Output quality matters and iteration is needed	Medium

For more detail on individual patterns, see: Evaluator-Optimizer Pattern and Orchestrator-Workers Architecture.
