Perplexity Computer vs Claude Code vs Cowork vs Manus: A real-world test of 4 AI agents on the same tasks

Minh Tuấn, CTO, Transform Group

27/03/2026 · 6 min read

When AI agents are tested as real products, not demos

Most AI tool comparisons share one problem: they rely on cherry-picked examples or artificial tasks that showcase the strengths of one particular tool. Daria Cupareanu at AiBlewMyMind took a different approach, and the results reveal things the AI community needs to know.

Test methodology:

The same 2 tasks were assigned to all 4 tools
Outputs were evaluated by an "LLM Council": 4 different AI models that scored the outputs anonymously, without knowing which tool produced which output
Detailed cost tracking for each task and each tool
Evaluation based on accuracy, actionability, and quality
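
The blind-scoring setup can be sketched in a few lines of Python. This is only an illustration under assumptions: the source does not say how the LLM Council combined its judges' rankings, so mean rank is used here as a stand-in, and the names `anonymize` and `aggregate_ranks` are hypothetical.

```python
import random
from statistics import mean


def anonymize(outputs, seed=0):
    """Map neutral labels ("Response A", ...) to tool names in shuffled order.

    `outputs` is a dict of tool name -> output text. Judges would only be
    shown the labels, so they cannot tell which tool produced which output.
    """
    rng = random.Random(seed)
    tools = list(outputs)
    rng.shuffle(tools)
    return {f"Response {chr(ord('A') + i)}": tool for i, tool in enumerate(tools)}


def aggregate_ranks(rankings):
    """Combine per-judge rankings (best first) into one order by mean rank.

    Mean rank is an assumption made for this sketch, not the documented
    LLM Council method.
    """
    places = {}
    for ranking in rankings:
        for place, label in enumerate(ranking, start=1):
            places.setdefault(label, []).append(place)
    return sorted(places, key=lambda label: mean(places[label]))
```

Each judge returns an ordered list of labels, `aggregate_ranks` merges those lists, and the label-to-tool mapping from `anonymize` is consulted only after scoring is finished.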

Task 1: Real Estate Property Dossier

Objective: Research a property in Brooklyn and produce a comprehensive report covering comparable sales, zoning data, neighborhood trends, school ratings, walkability scores, and risk flags.

This is a realistic research task: the kind of work users actually need done, where the data has to be accurate.

Results

Tool	Cost	LLM Council Rank	Key finding
Perplexity Computer	~$18	🥇 1st	"Only tool to get zoning code (R6A) correct"
Claude Code	N/A (subscription)	🥈 2nd	Solid executive summary, zoning errors
Claude Cowork	N/A (subscription)	🥉 3rd	Guessed zoning with "Likely", lost credibility
Manus AI	~$0.56	4th	"Hallucinated irrelevant info about Marine Terminal"

Perplexity Computer won clearly thanks to its ability to verify primary sources. It "acted analytically": when it was unsure, it searched further instead of guessing. Claude Cowork's costliest error was prefixing its zoning information with the word "Likely", which immediately undermined the credibility of the entire report in the eyes of the LLM Council.

Task 2: AI News Briefing App

Objective: Build and deploy a working application that generates personalized AI news filtered by industry, role, and current priorities.

This task demands both technical implementation AND content quality.

Results

Tool	Cost	LLM Council Rank	Key finding
Perplexity Computer	~$7.92	🥇 1st	"Acted like high-level consultant, news anchored in present"
Claude Cowork	N/A (subscription)	🥈 2nd	Best visual design, lacked deep intelligence
Claude Code	N/A (subscription)	🥉 3rd	"Catastrophic hallucination with 2026 dates"
Manus AI	~$0.47	4th	"Generic news, broken links, zero actionable value"

Claude Code's critical failure stands out: it built an app with a "catastrophic hallucination" around 2026 dates, meaning it fabricated recent events instead of acknowledging that it lacked real-time data. This is exactly the kind of failure mode that is most dangerous in production contexts.

3 key observations from this test

Observation 1: Design ≠ Accuracy

The Claude tools (Cowork and Code) consistently produced visually superior outputs: the apps looked better and the reports were better formatted. But appearance did not correlate with accuracy or functionality.

Lesson for enterprise buyers: don't get distracted by polish. Test accuracy on tasks whose results can be checked against ground truth.

Observation 2: Real-time data access is a major differentiator

Perplexity Computer won largely because it has real-time web access and prioritized verification. The Claude models, though stronger at reasoning, lacked reliable access to current information and did not always acknowledge this clearly.

This is a gap Anthropic is addressing with Connectors and web search features, but there is still some distance to Perplexity's core strength.

Observation 3: The cost asymmetry is more complicated than it looks

Perplexity: $7-18 per task (transparent, usage-based)
Claude: $0 per task beyond the subscription (opaque per-task cost)

Which one is cheaper depends on your usage pattern. If you run 100 tasks/month with Claude, the $20 Pro plan works out to $0.20/task. The same 100 tasks on Perplexity could cost $700-1800.

But if you only need 1-2 high-stakes research tasks/month where accuracy is critical, Perplexity at $7-18/task may be the better value.
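
The arithmetic behind this comparison is easy to check directly. The sketch below uses only the figures quoted in this article ($20/month, roughly $7-18 per task). The function names are hypothetical, and it assumes the subscription covers every task in the month.

```python
def subscription_cost_per_task(monthly_fee, tasks_per_month):
    """Effective per-task cost of a flat monthly subscription."""
    return monthly_fee / tasks_per_month


def usage_based_monthly_cost(cost_per_task, tasks_per_month):
    """Total monthly spend when every task is billed separately."""
    return cost_per_task * tasks_per_month


def break_even_tasks(monthly_fee, cost_per_task):
    """Tasks per month at which both pricing models cost the same."""
    return monthly_fee / cost_per_task


if __name__ == "__main__":
    # Figures quoted in the article: $20/month vs roughly $7-18 per task.
    print(subscription_cost_per_task(20, 100))
    print(usage_based_monthly_cost(7, 100), usage_based_monthly_cost(18, 100))
    print(break_even_tasks(20, 18), break_even_tasks(20, 7))
```

With these numbers the break-even point falls between about 1.1 and 2.9 tasks per month, which matches the 1-2 tasks/month range where usage-based pricing can come out ahead.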

Verdict: "If you can only pick one platform: Claude"

Although Perplexity won both tasks on accuracy, Kai offers a surprising recommendation:

"If you can only pick one platform: Claude at $20/month for both Cowork and Code. However, for accuracy-dependent research work, supplementing with Perplexity Computer's real-time data access provides measurable value despite higher per-use costs."

The reasoning: Claude at $20/month for both Cowork AND Code is exceptional value if you use both. Perplexity Computer has the real-time advantage, but its per-task cost is much higher with frequent use.

Recommended workflow:

Daily tasks, content creation, coding, automation: Claude (Cowork + Code)
High-stakes research that needs verified real-time data: supplement with Perplexity Computer
Cheap prototyping and basic automation: Manus AI ($0.47/task is remarkably cheap)

What does this mean for choosing a tool?

This test confirms an important truth: there is no one-size-fits-all AI agent. Each tool has its own strength profile:

Perplexity: Accuracy king for current events and verifiable facts
Claude Code: Technical execution excellence, best design quality
Claude Cowork: Workflow automation, best UX for non-technical users
Manus AI: Cheapest option for simple, low-stakes tasks

Professionals with diverse needs will end up with a multi-tool stack. Individuals on a tight budget will focus on the tool that best fits their primary use case.

Deep dive: Why does Claude fall short on real-time data?

Claude Code's "catastrophic hallucination with 2026 dates" in Task 2 is not an isolated incident. It reflects a structural limitation of language models and how they handle time.

LLMs are trained on data up to a cutoff point. They have no real knowledge of what happened after that cutoff. When asked about recent events, a model has 3 options:

Acknowledge that it does not know and decline (safest)
Use reasoning to infer what might have happened (acceptable)
Generate plausible-sounding but fabricated information (dangerous)

In this test, Claude Code chose option 3 for some 2026 dates, which is a known failure mode that needs careful watching.

Anthropic is addressing this with web search integration and Connectors, but an AI news app specifically depends on real-time data, so it needs explicit web access tools to work reliably.
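
One way to keep a news app from taking option 3 is to refuse to display anything that is not backed by a retrieved source. The following is a minimal sketch, not how any of the tested tools work: `NewsItem`, `verified_items`, `build_briefing`, and the 7-day window are all hypothetical, and fetching items from a real web search tool is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class NewsItem:
    title: str
    published: Optional[date]   # date reported by the retrieved source
    source_url: Optional[str]   # where the item was actually fetched from


def verified_items(items, today, max_age_days=7):
    """Keep only items with a source URL and a plausible, recent date."""
    oldest = today - timedelta(days=max_age_days)
    return [
        item for item in items
        if item.source_url
        and item.published is not None
        and oldest <= item.published <= today  # rejects future and stale dates
    ]


def build_briefing(items, today):
    """Render a briefing, or say so explicitly when nothing can be verified."""
    kept = verified_items(items, today)
    if not kept:
        # Option 1 from the list above: acknowledge the gap instead of guessing.
        return "No verified recent news available."
    return "\n".join(
        f"{item.published.isoformat()} {item.title} ({item.source_url})"
        for item in kept
    )
```

This filter only drops unsourced, stale, or future-dated items. It does not confirm that a URL resolves or that the page supports the headline, so a production app would still need a real retrieval and verification step.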

Manus AI: Surprise underdog with extremely low cost

Manus AI ranked last in both tasks, but it has one notable distinction: $0.56 for Task 1 and $0.47 for Task 2.

That pricing is dramatically lower than the alternatives. For low-stakes, simple automation tasks, Manus AI can be a rational choice:

Simple web scraping
Basic document formatting
Routine email drafts
Simple research summaries (where accuracy is not critical)

But for tasks that require accuracy, completeness, or judgment, the cost advantage does not justify the quality trade-offs.

Lesson for AI tool selection: Matching the tool to the task

This test provides a clear decision framework:

Task type	Best tool	Reason
Current events research	Perplexity Computer	Real-time web access + verification
Code and technical work	Claude Code	Best technical reasoning, visual output
Workflow automation	Claude Cowork	Best UX, Connectors integration
Simple, low-stakes tasks	Manus AI	Cheapest per-task cost

For a deep dive into Claude Code's capabilities, "autonomous coding agent with Claude" shows its full potential. For automation workflows with Claude Cowork, "Claude automation with Zapier/Make/n8n" is a good starting point. And to put Perplexity's real-time data advantage in context, "AI tools comparison" covers the broader competitive landscape.

References

AiBlewMyMind: Perplexity Computer vs Claude Code vs Cowork vs Manus (Kai, 23/03/2026)
LLM Council evaluation methodology

Writer for the Claude AI knowledge platform for Vietnamese readers. Software engineer with more than 20 years of experience, passionate about AI and sharing technology knowledge.
