Providing examples — Few-shot prompting

3: Prompt Engineering | Intermediate | 20 minutes

Suppose someone is teaching you to cook phở. There are two ways to do it:

You will learn to
  • Distinguish zero-shot, one-shot, and few-shot prompting
  • Wrap examples in XML tags so Claude parses them correctly
  • Choose good examples (representative, corner cases, diverse)
  • Recognize when examples help the most
  • Avoid over-fitting the prompt to its examples

Zero-shot vs Few-shot

Zero-shot

Instruction only, no examples:

Classify this tweet as positive or negative: "I love this app so much!"

Strength: concise, and works for common tasks. Weakness: Claude has to guess what you want, so it may miss corner cases.

One-shot

One example:

Classify tweets as positive or negative.

Example:
Tweet: "Great game tonight!"
Label: Positive

Now classify:
Tweet: "I love this app so much!"

Strength: Claude sees the expected format. Weakness: a single example cannot cover nuance.

Few-shot (2-5 examples)

Several examples, especially corner cases:

Classify tweets as positive or negative.

Examples:

Tweet: "Great game tonight!"
Label: Positive

Tweet: "Oh yeah, I really needed a flight delay tonight! Excellent!"
Label: Negative  (← sarcasm!)

Tweet: "The food was OK, nothing special."
Label: Negative  (← mild, not neutral)

Now classify:
Tweet: "I love this app so much!"

Strength: covers corner cases (sarcasm, mild sentiment). Weakness: costs more tokens and risks over-fitting to the examples.

Rule: 2-5 examples are usually enough. Use more than 5 only for highly nuanced tasks.
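
The three prompt styles differ only in how many examples sit between the instruction and the new input, so they can be generated by one helper. The sketch below is a minimal illustration; the function name `build_few_shot_prompt` and its parameters are assumptions, not part of the lesson.

```python
def build_few_shot_prompt(instruction, examples, new_input):
    """Assemble a plain-text prompt from (tweet, label) pairs.

    An empty `examples` list yields a zero-shot prompt, one pair yields
    a one-shot prompt, and several pairs yield a few-shot prompt.
    """
    lines = [instruction, ""]
    if examples:
        lines.append("Examples:")
        lines.append("")
        for tweet, label in examples:
            lines.append(f'Tweet: "{tweet}"')
            lines.append(f"Label: {label}")
            lines.append("")
    lines.append("Now classify:")
    lines.append(f'Tweet: "{new_input}"')
    return "\n".join(lines)


instruction = "Classify tweets as positive or negative."
examples = [
    ("Great game tonight!", "Positive"),
    ("Oh yeah, I really needed a flight delay tonight! Excellent!", "Negative"),
    ("The food was OK, nothing special.", "Negative"),
]

zero_shot = build_few_shot_prompt(instruction, [], "I love this app so much!")
one_shot = build_few_shot_prompt(instruction, examples[:1], "I love this app so much!")
few_shot = build_few_shot_prompt(instruction, examples, "I love this app so much!")
print(few_shot)
```

Keeping the examples in a list makes it easy to run the same eval with zero, one, or several examples and compare the scores.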

XML structure for examples

Claude works particularly well with XML tags. Use them to give your examples a clear structure.

Benefits:

  • Claude can parse the boundaries between examples easily
  • You can add <reasoning> to teach the logic, not just the label
  • Consistent structure leads to consistent output

prompt = """Classify tweet sentiment: Positive, Negative, or Neutral.

<examples>
<example>
<input>Great game tonight!</input>
<output>Positive</output>
</example>

<example>
<input>Oh yeah, I really needed a flight delay tonight! Excellent!</input>
<output>Negative</output>
<reasoning>Sarcasm. Words positive, but context shows frustration.</reasoning>
</example>

<example>
<input>The food was OK.</input>
<output>Neutral</output>
</example>
</examples>

Now classify:
<input>{tweet}</input>"""
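
When the examples live in data rather than in a hand-written string, the same tag structure can be generated. The sketch below is one possible way to do it; `format_example` and `build_xml_prompt` are hypothetical helper names, and escaping the text with `xml.sax.saxutils.escape` is an assumption made here so that a stray `<` in a tweet cannot be mistaken for a tag.

```python
from xml.sax.saxutils import escape


def format_example(text, label, reasoning=None):
    """Render one example; the <reasoning> tag is added only when given."""
    parts = [
        "<example>",
        f"<input>{escape(text)}</input>",
        f"<output>{escape(label)}</output>",
    ]
    if reasoning:
        parts.append(f"<reasoning>{escape(reasoning)}</reasoning>")
    parts.append("</example>")
    return "\n".join(parts)


def build_xml_prompt(instruction, examples, tweet):
    """Wrap all examples in <examples> and append the new input."""
    blocks = "\n\n".join(format_example(*ex) for ex in examples)
    return (
        f"{instruction}\n\n"
        f"<examples>\n{blocks}\n</examples>\n\n"
        f"Now classify:\n<input>{escape(tweet)}</input>"
    )


examples = [
    ("Great game tonight!", "Positive"),
    (
        "Oh yeah, I really needed a flight delay tonight! Excellent!",
        "Negative",
        "Sarcasm. Words positive, but context shows frustration.",
    ),
    ("The food was OK.", "Neutral"),
]

prompt = build_xml_prompt(
    "Classify tweet sentiment: Positive, Negative, or Neutral.",
    examples,
    "Coffee <3 mornings",
)
print(prompt)
```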

Corner case examples

The strongest examples are corner cases: scenarios where Claude tends to fail.

Sentiment analysis

Normal:

  • "Love it!" → Positive
  • "Terrible service" → Negative

Corner cases:

  • "Not bad" → Positive (mild)
  • "Could be worse" → Neutral/slightly Negative
  • "It's fine, I guess" → Neutral
  • "Best movie since Plan 9 from Outer Space" → Negative (sarcasm; Plan 9 is a notoriously bad movie)

Which to include? The corner cases. Skip the obvious ones.

Product categorization

Normal:

  • "iPhone 15" → Electronics
  • "Running shoes" → Sports

Corner cases:

  • "Fitness tracker" → Electronics OR Sports? → Pick one and explain
  • "Yoga mat" → Sports OR Home? → Pick one
  • "Water bottle" → Kitchen OR Sports? → Context matters
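
The "corner cases first" rule can be applied mechanically once candidate examples are tagged. The sketch below assumes a hypothetical `kind` field on each candidate ("normal" or "corner"); that field and the `pick_examples` helper are illustrative, not something the lesson defines.

```python
CANDIDATES = [
    {"text": "Love it!", "label": "Positive", "kind": "normal"},
    {"text": "Terrible service", "label": "Negative", "kind": "normal"},
    {"text": "Not bad", "label": "Positive", "kind": "corner"},
    {"text": "It's fine, I guess", "label": "Neutral", "kind": "corner"},
    {
        "text": "Best movie since Plan 9 from Outer Space",
        "label": "Negative",
        "kind": "corner",
    },
]


def pick_examples(candidates, limit=3):
    """Prefer corner cases; use normal cases only if slots remain."""
    # sorted() is stable, so candidates keep their order within each group.
    ordered = sorted(candidates, key=lambda c: c["kind"] != "corner")
    return ordered[:limit]


chosen = pick_examples(CANDIDATES)
for c in chosen:
    print(f'{c["text"]!r} -> {c["label"]}')
```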

Case study: Meal planner with examples (v5)

From lesson 6.17, the v4 score was 7.86. Version 5 adds an example that shows Claude:

  • The exact format structure
  • The level of detail (portions in g, timing, rationale)
  • The tone (direct, numeric)

Score v5: 8.9/10 (+13% vs v4)

prompt_v5 = f"""Generate a one-day meal plan for an athlete.

<example>
<input>
Height: 180cm, Weight: 80kg, Goal: Build muscle, Restrictions: None
</input>
<output>
Total: 3200 calories | Protein 200g | Carbs 350g | Fat 80g

8:00 Breakfast (700 cal):
- 3 eggs scrambled (210g) — 210 cal, 18g protein
- Oatmeal with banana (80g dry oats, 1 banana) — 400 cal, 10g protein
- Greek yogurt (150g) — 90 cal, 15g protein

12:00 Lunch (900 cal):
- Grilled chicken breast (200g) — 330 cal, 60g protein
- Brown rice (80g dry) — 280 cal, 6g protein
- Broccoli + olive oil (150g + 10ml) — 80 + 90 cal

15:00 Snack (500 cal):
- Protein shake (30g whey + 250ml milk) — 250 cal, 30g protein
- Almonds (30g) — 180 cal, 6g protein

19:00 Dinner (1100 cal):
- Salmon (200g) — 400 cal, 40g protein
- Sweet potato (250g) — 215 cal, 4g protein
- Salad + avocado (200g + 100g) — 200 + 160 cal

Rationale: High protein (2.5g/kg body weight) for muscle building. 
Timing around workout (assumed 17:00-18:00).
</output>
</example>

Now generate for:
Height: {h}, Weight: {w}, Goal: {goal}, Restrictions: {restrictions}"""
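
The +13% quoted for v5 is a relative gain over the v4 score, not a difference in points. A quick check of the arithmetic, using only the two scores stated in this case study:

```python
score_v4 = 7.86
score_v5 = 8.9

# Absolute gain in points, and gain relative to the v4 baseline.
absolute_gain = score_v5 - score_v4
relative_gain = absolute_gain / score_v4

print(f"absolute: +{absolute_gain:.2f} points")
print(f"relative: +{relative_gain:.1%}")
```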

Sourcing examples from evals

The best examples come from previous evals: the outputs that scored highest.

Workflow

  • Run prompt v1 against the test dataset
  • Identify the cases that scored 9-10/10
  • Extract their input + output pairs
  • Use them as examples in v2

Insight: prompt engineering improves much faster once you have an eval loop, because each round lets you draw on what worked in previous outputs.

# After eval v1
best_cases = sorted(eval_results, key=lambda x: x.score, reverse=True)[:3]

examples_text = "\n\n".join(f"""
<example>
<input>{case.input}</input>
<output>{case.output}</output>
</example>""" for case in best_cases)

prompt_v2 = f"""{base_instruction}

<examples>
{examples_text}
</examples>

Now handle:
<input>{{input}}</input>"""
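
The snippet above always takes the top three results, even if some of them scored poorly. A self-contained variant that also enforces the "scored 9-10" step might look like the sketch below. The `EvalResult` record, the helper names, and the sample data are assumptions for illustration; a real eval harness will have its own result type.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical shape of one eval record."""
    input: str
    output: str
    score: float


def select_examples(results, min_score=9.0, limit=3):
    """Keep only results at or above min_score, best first, at most `limit`."""
    top = [r for r in results if r.score >= min_score]
    top.sort(key=lambda r: r.score, reverse=True)
    return top[:limit]


def render_examples(cases):
    """Render the selected cases as <example> blocks."""
    return "\n\n".join(
        f"<example>\n<input>{c.input}</input>\n<output>{c.output}</output>\n</example>"
        for c in cases
    )


# Made-up eval results, for demonstration only.
eval_results = [
    EvalResult("input A", "output A", 9.5),
    EvalResult("input B", "output B", 6.0),
    EvalResult("input C", "output C", 9.0),
    EvalResult("input D", "output D", 7.5),
]

best = select_examples(eval_results)
examples_text = render_examples(best)
print(examples_text)
```

With the threshold in place, a weak eval run produces fewer examples rather than promoting mediocre outputs to examples.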

Anti-patterns

❌ Examples that are too generic

<example>
<input>Tweet about food</input>
<output>Positive</output>
</example>

Problem: the input is not specific, so Claude learns nothing from it.

Fix: use a concrete, realistic input.

❌ Examples that contradict the instruction

Instruction: "Respond in 100 words."
<example>
<output>[3-word response]</output>
</example>

Problem: Claude cannot tell whether to follow the example or the instruction.

Fix: an example must illustrate the instruction, not contradict it.

❌ Biased examples

<example 1><input>good</input><output>Positive</output></example>
<example 2><input>great</input><output>Positive</output></example>
<example 3><input>amazing</input><output>Positive</output></example>

Every example is Positive, so Claude may lean toward Positive for all future inputs.

Fix: balance the set. Include positive, negative, neutral, and edge cases.

❌ Over-specific examples

Examples that all match one scenario cause Claude to fail when the input falls outside it.

Fix: use diverse examples. Three examples should cover three different scenarios.

❌ Too many examples

10+ examples bloat the prompt, cost tokens, and can overwhelm Claude.

Fix: 2-5 examples. Pick diverse ones that include corner cases.
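
Two of these anti-patterns, biased labels and too many examples, can be caught automatically before a prompt is sent. The sketch below is a minimal lint under stated assumptions: `lint_examples` is a hypothetical helper, and the default limit of 5 simply mirrors the 2-5 guideline from this lesson.

```python
from collections import Counter


def lint_examples(examples, max_examples=5):
    """Return warnings for a list of (input_text, label) example pairs."""
    warnings = []
    if len(examples) > max_examples:
        warnings.append(f"too many examples: {len(examples)} > {max_examples}")
    labels = Counter(label for _, label in examples)
    if len(examples) > 1 and len(labels) == 1:
        only_label = next(iter(labels))
        warnings.append(f"biased: every example is labeled {only_label!r}")
    return warnings


biased = [("good", "Positive"), ("great", "Positive"), ("amazing", "Positive")]
balanced = [
    ("Great game tonight!", "Positive"),
    ("Terrible service", "Negative"),
    ("The food was OK.", "Neutral"),
]

print(lint_examples(biased))
print(lint_examples(balanced))
```

Generic, contradictory, or over-specific examples still need a human review; a label count cannot detect them.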

Apply it now

Exercise 1: Upgrade a prompt with 3 examples (20 minutes)

Take a prompt you already have. Add 3 examples:

  • Normal case
  • Corner case
  • Edge case

Test: does the score go up? If it does, keep the examples. If it drops, something is wrong with them (generic? contradictory? biased?).

Exercise 2: Example + reasoning (15 minutes)

Extend one example by adding <reasoning>:

<example>
<input>...</input>
<output>...</output>
<reasoning>This is the right output because...</reasoning>
</example>

Reasoning helps Claude learn the logic, not just the pattern.

Test: check whether Claude's output gets "smarter".
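
The keep-or-discard test in Exercise 1 comes down to comparing scores on the same test cases before and after adding examples. A minimal sketch, assuming per-case scores are already available as lists; `keep_examples` and the numbers below are made up for illustration:

```python
from statistics import mean


def keep_examples(scores_before, scores_after, min_gain=0.0):
    """Return True when the mean score with examples beats the baseline mean."""
    return mean(scores_after) - mean(scores_before) > min_gain


# Hypothetical scores for the same three test cases.
baseline = [7.0, 8.0, 7.5]
with_examples = [8.5, 9.0, 8.0]

if keep_examples(baseline, with_examples):
    print("Score went up: keep the examples.")
else:
    print("Score did not improve: check for generic, contradictory, or biased examples.")
```

Setting `min_gain` above zero is one way to avoid keeping examples whose effect is within the noise of the eval.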

Lesson summary

🎯 Few-shot beats zero-shot for nuanced tasks: use 2-5 examples.

🎯 Wrap examples in XML tags so Claude parses the boundaries correctly.

🎯 Corner case examples are the strongest: they teach Claude to handle tricky input.

🎯 Source examples from your eval history: the best ones are outputs that already scored 9-10.

🎯 Keep examples diverse and balanced, with no bias toward one output category.
