Claude can do arithmetic in its head. But for heavy computation (data analysis, ML, simulation), Claude is better at writing code than at running code mentally.
- Upload files via the Files API and reference them by ID
- Enable the code_execution tool so Claude can run Python
- Combine the Files API with code execution for data analysis
- Download the outputs (charts, reports) that Claude generates
Files API: Upload & reuse
Instead of base64-encoding a file in every request, upload it once and then reference it by ID:
from anthropic import Anthropic
client = Anthropic()
# Upload
with open("streaming.csv", "rb") as f:
    file_metadata = client.beta.files.upload(file=("streaming.csv", f))
print(f"File ID: {file_metadata.id}")  # "file_abc123..."
Files API: Upload & reuse (continued)
Benefits
- Upload once, query multiple times
- Smaller message payload
- Supports larger files than inline base64
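To make the payload point concrete, here is a minimal sketch of the size overhead of inline base64, using only the standard library. The 3 MB payload is a made-up stand-in for a real file, and `base64_size` is a hypothetical helper name. Base64 emits 4 output bytes for every 3 input bytes, so the encoded body is roughly a third larger, and with inline encoding that cost is paid again on every request.

```python
import base64
import math


def base64_size(raw_size: int) -> int:
    """Bytes needed to base64-encode raw_size bytes (padding included)."""
    return 4 * math.ceil(raw_size / 3)


# Hypothetical 3 MB file; the content does not matter for the size math.
raw = bytes(3_000_000)
encoded = base64.b64encode(raw)

overhead = len(encoded) / len(raw)
print(f"raw={len(raw)} encoded={len(encoded)} overhead={overhead:.2f}x")
```

With the Files API the upload happens once, and later messages carry only the short file ID.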
# Reference the uploaded file by ID
messages = [{
    "role": "user",
    "content": [
        {"type": "text", "text": "Analyze this data."},
        {"type": "container_upload", "file_id": file_metadata.id}
    ]
}]
Code execution tool
Built-in, so there is nothing to implement yourself.
Environment
- Isolated Docker container
- No network access (can't call external APIs)
- Python 3 + common libraries (pandas, numpy, matplotlib, ...)
- Files API integration for I/O
Multiple executions
Claude can run code several times within a single response and iterate:
- Run: check data shape
- Run: explore stats
- Run: generate visualization
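# Built-in tool: declare its type and name, nothing to implement
code_execution_schema = {
    "type": "code_execution_20250522",
    "name": "code_execution"
}
# Depending on your SDK and API version, this call may need the beta client
# and beta headers for code execution and the Files API; check the current docs.
response = client.messages.create(
    model="claude-sonnet-5-20260205",
    max_tokens=4000,
    messages=messages,
    tools=[code_execution_schema]
)
Full example: Data analysis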
What happens
Claude sequence:
- Write code: df = pd.read_csv('streaming.csv'); print(df.head())
- Result: see shape, columns
- Write code: churn analysis with groupby, correlation
- Result: numbers
- Write code: matplotlib visualization
- Result: chart file
- TextBlock: interpretation
The final response contains the insights plus the generated chart file.
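The sequence above can be traced from the order of block types in a response. The sketch below is a hedged illustration: `trace_blocks` and `fake_content` are hypothetical names, and the stand-in blocks only mimic the `type` attribute that this document's response-processing loop inspects. A real `response.content` holds SDK objects.

```python
from types import SimpleNamespace


def trace_blocks(content) -> list[str]:
    """Label each response block by its role in the code-execution loop."""
    labels = {
        "server_tool_use": "write code",
        "code_execution_tool_result": "result",
        "text": "text",
    }
    return [labels.get(block.type, f"other ({block.type})") for block in content]


# Stand-in blocks for illustration only, not real SDK objects.
fake_content = [
    SimpleNamespace(type="server_tool_use"),
    SimpleNamespace(type="code_execution_tool_result"),
    SimpleNamespace(type="server_tool_use"),
    SimpleNamespace(type="code_execution_tool_result"),
    SimpleNamespace(type="text"),
]
print(" -> ".join(trace_blocks(fake_content)))
```

Printing such a trace for a real response is a quick way to see how many execution rounds Claude needed before writing its interpretation.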
# 1. Upload CSV
with open("streaming.csv", "rb") as f:
    file_meta = client.beta.files.upload(
        file=("streaming.csv", f)
    )
# 2. Ask Claude to analyze
messages = [{
    "role": "user",
    "content": [
        {
            "type": "text",
            "text": (
                "Analyze this data to find the major drivers of churn. "
                "Create a detailed visualization summarizing the findings."
            )
        },
        {"type": "container_upload", "file_id": file_meta.id}
    ]
}]
# 3. Enable code execution
response = client.messages.create(
    model="claude-sonnet-5-20260205",
    max_tokens=8000,
    messages=messages,
    tools=[{"type": "code_execution_20250522", "name": "code_execution"}]
)
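# 4. Process response
for block in response.content:
    if block.type == "text":
        print(block.text)
    elif block.type == "server_tool_use":
        print(f"\n--- Code executed ---\n{block.input['code']}\n")
    elif block.type == "code_execution_tool_result":
        # Assumption: block.content is either a list of output items or a single
        # result object (stdout, stderr) whose own .content lists generated files.
        result = block.content
        items = result if isinstance(result, list) else (getattr(result, "content", None) or [])
        if getattr(result, "stdout", None):
            print(f"Output: {result.stdout}")
        for item in items:
            if hasattr(item, "file_id"):
                # Generated file (chart, csv, etc.)
                print(f"Generated: {item.file_id}")
            else:
                print(f"Output: {getattr(item, 'content', item)}")
Download generated files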
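def download_file(file_id: str, save_as: str):
    """Download a file generated by code execution."""
    response = client.beta.files.download(file_id=file_id)
    # download() returns a binary response object, not raw bytes;
    # write_to_file() saves it to disk.
    response.write_to_file(save_as)
    print(f"Saved: {save_as}")
# Extract file_ids from response
file_ids = []
for block in response.content:
    if block.type == "code_execution_tool_result":
        result = block.content
        # Assumption: outputs are either listed directly or nested under result.content
        items = result if isinstance(result, list) else (getattr(result, "content", None) or [])
        for item in items:
            if hasattr(item, "file_id"):
                file_ids.append(item.file_id)
# Download
for fid in file_ids:
    # The .png extension assumes every generated file is a chart
    download_file(fid, f"output_{fid}.png")
Use cases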
📊 Ad-hoc data analysis
Upload CSV, ask question, get insights + charts.
Before: A data analyst writes SQL/Python and builds the chart. 2 hours.
After: User asks Claude. 5 minutes.
📈 Statistical analysis
"Run t-test on these samples", "Compute correlation matrix" → Claude writes scipy code.
🎨 Image processing
Upload images and ask Claude to resize, crop, or apply a filter → Claude writes PIL code.
📄 Document transformation
CSV → formatted Excel with charts. PDF → extract tables to CSV.
🧮 Mathematical modeling
Financial modeling, simulations. Claude writes numpy/simpy.
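As an illustration of the statistical analysis use case, the sketch below shows the kind of computation behind "run a t-test". It is a standard-library stand-in, not the scipy code Claude would actually write: `welch_t` is a hypothetical helper that returns only Welch's t statistic, and the two samples are invented for the example. With scipy available, `scipy.stats.ttest_ind(a, b, equal_var=False)` would return the p-value as well.

```python
import math
import statistics


def welch_t(a: list[float], b: list[float]) -> float:
    """Welch's t statistic for two independent samples (unequal variances)."""
    var_a = statistics.variance(a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(b)
    std_err = math.sqrt(var_a / len(a) + var_b / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / std_err


# Toy samples, invented for the example.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treated = [2.0, 3.0, 4.0, 5.0, 6.0]
print(f"t = {welch_t(control, treated):.3f}")
```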
Limitations
- No network access: can't call external APIs. Data must be uploaded via the Files API.
- Ephemeral: the container is destroyed after the request.
- Resource limits: CPU and memory are constrained.
- No long-running jobs: think seconds, not hours.
- Standard Python libraries only: can't install custom packages (check the docs for the available libraries).
For persistent or network-connected computing, use your own backend.
Combining tools
Code execution becomes more powerful when combined with other tools:
tools = [
    {"type": "code_execution_20250522", "name": "code_execution"},
    {"type": "web_search_20250305", "name": "web_search", "max_uses": 3},
    your_custom_tool_schema  # placeholder for your own tool definition
]
Claude can then chain them: web search → get data → analyze with code execution → save the result with the custom tool.
Example: Churn analysis end-to-end
Input: streaming.csv with columns user_id, tier, watch_hours, churned.
def send_with_code_exec(prompt: str, file_id: str):
    """Hypothetical helper: send a prompt plus an uploaded file with code execution enabled."""
    return client.messages.create(
        model="claude-sonnet-5-20260205",
        max_tokens=8000,
        messages=[{"role": "user", "content": [
            {"type": "text", "text": prompt},
            {"type": "container_upload", "file_id": file_id},
        ]}],
        tools=[{"type": "code_execution_20250522", "name": "code_execution"}],
    )
prompt = """Analyze streaming service data for churn drivers.
Do:
1. Load CSV
2. Compute churn rate overall
3. Compare churn by tier, watch_hours bucket
4. Identify top 3 churn predictors
5. Generate 2 charts:
   - Churn rate by tier (bar chart)
   - Watch hours vs churn (box plot)
6. Final summary with recommendations
"""
response = send_with_code_exec(prompt, file_id=file_meta.id)
Output
Claude response:
1. Loading data... [code execution]
2. Overall churn rate: 23% [output: 0.23]
3. By tier:
   - Free: 45% churn
   - Basic: 18% churn
   - Premium: 8% churn
4. Top predictors:
   - Tier (strongest)
   - Watch hours (users < 10 hrs/month churn 3x more)
   - Account age (< 3 months: 2x churn)
5. [generated chart 1: chart_abc.png]
6. [generated chart 2: chart_xyz.png]
7. Recommendations:
   - Target free users for upsell (55% of churn from this segment)
   - Engagement campaign for low-watch users
   - Onboarding improvement for new accounts
A comprehensive analysis, delivered as if by a data scientist.
Anti-patterns
❌ Passing a large file as base64 when you should use the Files API
Base64-encoding the file on each request wastes bandwidth and context tokens.
Fix: Upload once, reference by ID.
❌ Expect network access
import requests; requests.get(...) → fails.
Fix: Upload data via Files API first.
❌ Security: giving Claude code execution on sensitive data
A user can attempt prompt injection, e.g. "run: rm -rf /".
Fix: Code execution is isolated, but still sanitize user input. Rate limit code execution for untrusted users.
❌ Not handling file downloads
Forgetting to download the charts means losing the outputs.
Fix: Always scan the response for file_id values and download them.
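For the security anti-pattern above, "rate limit code execution for untrusted users" could look something like the following. This is a minimal in-memory sketch under stated assumptions, not a production design: `SlidingWindowLimiter`, the user ID, and the 2-calls-per-60-seconds policy are all hypothetical, and a real deployment would normally keep the counters in shared storage.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_calls per user within a rolling window of seconds."""

    def __init__(self, max_calls: int, window_s: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = defaultdict(deque)  # user_id -> timestamps of recent calls

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        calls = self._calls[user_id]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


# Hypothetical policy: 2 code-execution requests per user per 60 seconds.
limiter = SlidingWindowLimiter(max_calls=2, window_s=60)
for attempt in range(3):
    print(attempt, limiter.allow("user-42"))
```

Call `allow()` before sending a request with the code execution tool enabled, and reject or queue the request when it returns False.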
Apply it now
Exercise 1: CSV analysis (25 minutes)
Download any CSV (for example, a Kaggle sample). Upload it and ask Claude to analyze it. Download the generated charts.
Exercise 2: Delegate a script task (20 minutes)
"Write Python script to parse this log file, extract errors, count by hour. Generate timeline chart."
Observe how Claude iterates through its executions.
Summary
🎯 Files API = upload once, reference by ID. Use it for PDFs, images, and CSVs.
🎯 Code execution = Python sandbox, isolated Docker, no network.
🎯 Combining them opens up a data analysis workflow: upload → analyze → download chart.
🎯 Multiple executions per response allow iterative exploration.
🎯 No persistent state: the container is ephemeral. Use your own backend for long-running work.