Bài tập tổng hợp — Viết prompt production-grade — Building with the Claude API

Bạn sẽ học được

Combine 4 technique (clear, specific, examples, XML) vào 1 prompt
Iterate prompt qua 3-4 version với measurable improvement
Document prompt evolution trong changelog
Ship 1 prompt production-ready cho use case cụ thể

Đề bài

Build 1 prompt production cho scenario bạn chọn từ 3 option dưới. Iterate v1 → v4, đo improvement.

Option A: Resume screener

Input: CV text + job description.

Output JSON:

{
  "match_score": 85,
  "strengths": ["...", "...", "..."],
  "gaps": ["...", "..."],
  "recommendation": "Interview | Pass | Junior pool",
  "questions_to_ask": ["...", "..."]
}

Option B: Customer email reply generator

Input: Incoming customer email + context (product, previous interactions, team policy).

Output: Draft reply email (plain text).

Option C: Meeting notes → action items

Input: Meeting transcript (50-500 words).

Output JSON:

{
  "decisions": ["...", "..."],
  "action_items": [
    {"task": "...", "owner": "...", "deadline": "..."}
  ],
  "risks_raised": ["..."],
  "followup_needed": true
}

Quy trình

Bước 1: Set goal (5 phút)

Trong notebook 04_final_prompt.ipynb, write:

Bước 2: v1 — Baseline (10 phút)

Viết prompt đơn giản nhất. Chạy qua 5 test case. Chấm điểm bằng tay (1-10):

## Goal

App: [Option A/B/C]

Prompt mục tiêu: [1 câu tóm tắt]

Success criteria (checklist khi eval):
- [ ] [criteria 1]
- [ ] [criteria 2]
- [ ] [criteria 3]
- [ ] [criteria 4]
- [ ] [criteria 5]

Test dataset: [bao nhiêu cases, source, cover edge không?]

Bước 2: v1 — Baseline (10 phút)

Bước 3: v2 — Apply Clear & Direct (10 phút)

Rewrite first line thành action command. Add basic structure.

## v1 — Baseline

Prompt:
[paste]

Test results:
- Case 1: score 3/10. Issue: ...
- Case 2: score 2/10. Issue: ...
- ...

Average: 2.6/10

Bước 3: v2 — Apply Clear & Direct (10 phút)

Bước 4: v3 — Add Guidelines (10 phút)

Add 5-6 quality guidelines. Specify format, length, must-include.

## v2 — Clear & Direct

Prompt:
[paste]

Changes from v1:
- Line 1: changed from "X" to "Y"

Test results:
- Case 1: score 5/10. Better structure.
- ...

Average: 4.5/10 (+73%)

Bước 4: v3 — Add Guidelines (10 phút)

Bước 5: v4 — Add Examples + XML (15 phút)

Wrap context trong XML tags. Add 2 examples (1 normal, 1 edge case).

## v3 — + Guidelines

Prompt:
[paste]

Changes:
- Added 6 guidelines

Test results:
- Case 1: score 7/10. All required fields present.
- ...

Average: 7.2/10

Bước 5: v4 — Add Examples + XML (15 phút)

Bước 6: Verify & Document (10 phút)

Check:

[ ] Run 5 test case lần 2 với v4 — stable score?
[ ] Có regression (case trước đây pass giờ fail) không?
[ ] Document changelog trong file prompts_changelog.md

## v4 — + Examples + XML

Prompt:
[paste]

Changes:
- Wrapped input in XML tags
- Added 2 examples

Test results:
- Case 1: score 9/10. Edge case handled.
- ...

Average: 8.5/10

Deliverable

Submit 3 thứ:

1. File my_prompt.py

2. File prompts_changelog.md

# my_prompt.py

PROMPT_V4 = """[final prompt]"""


def run_prompt(inputs: dict) -> str:
    # ... code gọi Claude với prompt
    pass

2. File prompts_changelog.md

3. Notebook 04_final_prompt.ipynb

Chạy v4 qua 5-10 test case, print output. Ready to demo.

# Prompt Evolution

## v4 (Current, 2026-04-20)
Score: 8.5/10
Changes: Added examples + XML structure
Author: [your name]

## v3 (2026-04-19)
Score: 7.2/10
Changes: Added 6 quality guidelines

## v2 (2026-04-19)
Score: 4.5/10
Changes: Clear & direct rewrite

## v1 (2026-04-18) — Baseline
Score: 2.6/10

Reference: Ví dụ hoàn chỉnh (Option A — Resume Screener)

Version này score 8.5/10 trên test 10 case. Ship production.

PROMPT_V4 = """Evaluate candidate resume against job description. Return structured assessment.

<job_description>
{job_description}
</job_description>

<candidate_resume>
{resume}
</candidate_resume>

<evaluation_framework>
1. Technical skills match (weight: 40%)
2. Experience relevance (weight: 30%)
3. Cultural fit signals (weight: 15%)
4. Growth potential (weight: 15%)
</evaluation_framework>

<guidelines>
- match_score: integer 0-100 (weighted avg of above)
- strengths: exactly 3 items, specific quotes from resume
- gaps: 1-3 items, concrete missing skills/experience
- recommendation: "Interview" if score > 70, "Pass" if < 50, "Junior pool" if 50-70
- questions_to_ask: exactly 3 items, target gaps or ambiguity
</guidelines>

<examples>
<example>
<input_job>Senior Python Engineer - 5+ years - AWS + Kubernetes</input_job>
<input_resume>John - 6 years Python, Django, some AWS basics, no K8s</input_resume>
<output>
{{
  "match_score": 72,
  "strengths": [
    "6 years Python exceeds 5-year requirement",
    "Django framework implies strong backend fundamentals",
    "Some AWS exposure"
  ],
  "gaps": [
    "No Kubernetes experience mentioned",
    "Unclear depth of AWS (only 'basics')"
  ],
  "recommendation": "Interview",
  "questions_to_ask": [
    "Can you walk through an AWS project in detail?",
    "Any exposure to container orchestration (Docker Compose, ECS, K8s)?",
    "Most complex distributed system you've designed?"
  ]
}}
</output>
</example>

<example>
<input_job>Senior Data Scientist - ML + deep learning</input_job>
<input_resume>Mary - Business analyst 3 years, Excel, SQL, some Python</input_resume>
<output>
{{
  "match_score": 25,
  "strengths": [
    "SQL skills transferable to data engineering",
    "Some Python exposure shows willingness to learn"
  ],
  "gaps": [
    "No ML or deep learning experience",
    "No statistical modeling background",
    "Role is too senior for current skill set"
  ],
  "recommendation": "Pass",
  "questions_to_ask": [
    "Interest in ML training path?",
    "Would Junior Data Analyst role interest you?",
    "Any ML projects outside of work?"
  ]
}}
</output>
</example>
</examples>

Return JSON only, no markdown wrapper."""

Tips & Tricks

Tip 1: Debug output quality by adding reasoning field

Tạm thời thêm <reasoning> vào output để thấy Claude "nghĩ" gì:

Sau khi hiểu logic, remove _reasoning cho production.

Tip 2: Test với extreme cases

Prompt production phải handle gracefully.

Tip 3: Version control prompt

Commit prompt vào git cùng code. PR review cho prompt như PR review code.

Input rỗng
Input quá dài (10K words)
Input ngôn ngữ khác (Vietnamese, Chinese)
Input ambiguous

{
  "match_score": 72,
  "_reasoning": "...",  // ← debugging
  "strengths": [...]
}

Tip 3: Version control prompt

git add my_prompt.py prompts_changelog.md
git commit -m "feat: prompt v4 — add examples for edge cases"

Self-review checklist

Content quality

Code quality

Documentation

[ ] v4 cover tất cả success criteria?
[ ] Test dataset đa dạng (normal + edge)?
[ ] 3-5 round iteration?
[ ] Score tăng consistently?
[ ] my_prompt.py có docstring?
[ ] run_prompt() có type hints?
[ ] Error handling (API fail, JSON parse)?
[ ] prompts_changelog.md có tất cả version?
[ ] Notebook demo chạy được fresh?
[ ] Test case cả positive và negative?

Tóm tắt

🎯 4 techniques áp dụng theo thứ tự: Clear & Direct → Specific Guidelines → Examples → XML.

🎯 Mỗi iteration đo score, không phải "feeling".

🎯 Changelog + version control — prompt như code.

🎯 Test dataset đa dạng — 20% edge case.

🎯 Ship khi score stable 8+/10 qua 3 lần run.

Nội dung này có hữu ích không?

Kiểm tra kiến thức

Củng cố những gì bạn vừa học

15 câu trắc nghiệm · đạt từ 70% · câu hỏi và đáp án xáo trộn mỗi lần.