Environment inspection — Agent phải "nhìn" được — Building with the Claude API

Agent perform action — không biết kết quả nếu không inspect.

Bạn sẽ học được

Hiểu tại sao environment inspection critical cho agents
Pattern: "read before write"
System prompt instruct Claude inspect after actions
Apply for file, UI, API contexts

Example: Computer Use

Claude clicks button:

What happened?

Answer: screenshot after action.

Computer Use always pairs click → screenshot. Claude sees result → adapts.

Did page navigate?
Did dialog open?
Did page just load something?
Did click miss the button?

Claude: computer_click(x=450, y=300)
Result: "Click sent"

Pattern: Read before write

Ví dụ: File edit

Read before write = safe.

User: Add a new route to main.py

Bad agent:
  edit_file("main.py", old="routes = [", new="routes = [\n  new_route,")
  ← may create syntax error if pattern different than assumed

Good agent:
  1. read_file("main.py") ← inspect first
  2. [sees actual structure]
  3. edit with exact pattern that matches
  4. read_file("main.py") ← verify

System prompt for inspection

Nudge Claude inspect via system prompt:

Sets expectation Claude will inspect.

system = """You are a coding assistant. Follow these rules:

1. ALWAYS read files before editing — never guess content.
2. After edits, re-read to verify changes applied correctly.
3. If action fails or result unexpected, investigate before retry.
4. When running commands, check output carefully for errors.
"""

Tools that enable inspection

Files

Processes

UIs

APIs

read_file(path) — see content
ls(dir) — see what exists
grep(pattern, path) — find specific content
get_output(command_id) — read logs of long-running
check_status(job_id) — is it done?
screenshot() — see current state
get_element_text(selector) — read value
get_status(resource_id) — latest state
list_resources() — inventory

Case study: Video creation agent

Agent creates video with FFmpeg. Need verify quality.

Without inspection

With inspection

1. generate_script(topic) → script
2. text_to_speech(script) → audio
3. generate_images(script) → images
4. ffmpeg_combine(...) → video file

Agent: "Done, video created."
Reality: Audio/video misaligned, typo in script, images wrong...

With inspection

Claude iterate: verify each stage before proceed. Final video higher quality.

system = """Video creation agent. Always verify:
- After generating script, read it aloud mentally — catches typos
- After audio, sample length — matches expected duration?
- After images, check count + describe each to confirm right
- After video, extract frames at intervals → verify content
- Use bash+whisper.cpp → generate captions → verify dialogue placement
"""

Inspection granularity

Fine-grained

Every action followed by verify:

Coarse-grained

Verify periodically or at end:

Smart (recommended)

Verify high-stakes actions + batch verify low-stakes.

Pros: Catch errors early
Cons: Many turns, expensive
Pros: Fewer turns
Cons: Errors propagate

High-stakes (verify each):
  - Edit production code
  - Delete files
  - Send emails

Low-stakes (batch verify):
  - Format whitespace
  - Rename variables (trivial)

Hidden costs — overhead

Inspection tools tốn tokens. Budget:

Fix: Instruct Claude batch when safe.

Task: edit 10 lines
Actions: 10 reads + 10 edits + 10 verify reads = 30 tool calls

vs batch:
1 read → understand → 10 edits → 1 final read = 12 tool calls

Observability for user

Show inspection in UI:

User sees progression. Trust builds.

Agent status:
  [✓] Read main.py
  [✓] Analyzed structure
  [⏳] Editing routes...
  [ ] Verifying changes...
  [ ] Running tests...

Anti-patterns

❌ No verify after action

"I think it worked" = fragile assumption.

Fix: System prompt instruct verify.

❌ Verify everything

30 verify calls for 10 edits = unnecessarily slow.

Fix: Smart — high-stakes only.

❌ Ignore inspection results

Claude reads "Error in line 5" and continues anyway.

Fix: Prompt "If verification shows issue, stop and investigate."

❌ Circular inspection

Read → edit → read → edit → read... indefinitely.

Fix: max_turns. After N reads without progress, escalate.

Ví dụ code structure

System prompt + right tools = agent can inspect.

SYSTEM = """Agent for codebase maintenance.

Workflow for edits:
1. read_file(path) first, understand context
2. Propose edit with exact old_str (unique in file)
3. edit_file(path, old_str, new_str)
4. read_file(path) to verify change correct
5. If verify fails, investigate + retry

Never edit without reading first.
Never assume tests pass — run them explicitly.
"""


async def run_agent(task: str):
    messages = [{"role": "user", "content": task}]
    
    for turn in range(30):  # agent may need many turns
        response = chat(messages, system=SYSTEM, tools=tools)
        messages.append({"role": "assistant", "content": response.content})
        
        if response.stop_reason == "end_turn":
            return text_from_message(response)
        
        if response.stop_reason == "tool_use":
            results = await run_tools(response)
            messages.append({"role": "user", "content": results})

Áp dụng ngay

Bài tập 1: Add inspection to your agent (20 phút)

Take Module 9 reminder agent. Add:

Test: "Remind me about doctor tomorrow" when such reminder already exists.

Bài tập 2: Computer Use simulation (optional, advanced)

Design agent với screenshot() + click(x, y) tools. Even simulated, force Claude pattern: click → screenshot → decide next click.

Tool list_reminders()
System prompt: "Before set_reminder, list existing to avoid duplicates"

Tóm tắt

🎯 Environment inspection critical — agent blind without.

🎯 "Read before write" pattern.

🎯 System prompt nudges Claude inspect.

🎯 Smart granularity: verify high-stakes, batch low-stakes.

🎯 Observability UI shows inspection → user trust.

Nội dung này có hữu ích không?