{"product_id":"computer-use-demo-claude-diều-khiển-may-tinh-của-bạn","title":"Computer Use Demo: Claude Controls Your Computer","description":"\n\u003cp\u003eComputer Use is one of Claude's most striking capabilities: it can \u003cstrong\u003esee the computer screen and control the mouse and keyboard\u003c\/strong\u003e much as a person would. Instead of only generating text, Claude can open apps, navigate websites, fill in forms, and carry out complex desktop tasks.\u003c\/p\u003e\n\n\u003cp\u003eThis article walks you through setting up a safe environment with Docker and building your first computer use demo.\u003c\/p\u003e\n\n\u003ch2\u003eHow does Computer Use work?\u003c\/h2\u003e\n\n\u003cp\u003eTechnically, computer use rests on three concepts:\u003c\/p\u003e\n\n\u003col\u003e\n  \u003cli\u003e\n\u003cstrong\u003eScreenshot\u003c\/strong\u003e: your code captures the screen and sends it to Claude as a base64 image through the vision API\u003c\/li\u003e\n  \u003cli\u003e\n\u003cstrong\u003eAction Tools\u003c\/strong\u003e: Claude calls tools to click, type, scroll, or press keyboard shortcuts\u003c\/li\u003e\n  \u003cli\u003e\n\u003cstrong\u003eFeedback Loop\u003c\/strong\u003e: after each action, a new screenshot is taken to confirm the result\u003c\/li\u003e\n\u003c\/ol\u003e\n\n\u003cpre\u003e\u003ccode\u003eClaude looks at the screen\n      |\n      v\n[Analyze: what needs to happen next?]\n      |\n      v\n[Call a tool: click(x, y) \/ type(text) \/ screenshot()]\n      |\n      v\n[Receive the result + a new screenshot]\n      |\n      v\n[Repeat until the task is done]\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eAn important point: Claude \u003cstrong\u003ehas no direct access\u003c\/strong\u003e to the OS. It only sees screenshots and issues commands through tools. 
You, the developer, are the one who implements those tools.\u003c\/p\u003e\n\n\u003ch2\u003eSetting up a safe Docker environment\u003c\/h2\u003e\n\n\u003cp\u003eRun computer use inside Docker to isolate it from the host machine:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003e# Dockerfile\nFROM ubuntu:22.04\n\n# Install the X11 virtual display and automation tools.\n# Note: firefox-esr may not be available in the default Ubuntu 22.04 repositories;\n# if the install fails, add a repository that provides it or use a Debian base image.\nRUN apt-get update \u0026amp;\u0026amp; apt-get install -y     xvfb     x11vnc     xdotool     scrot     python3     python3-pip     firefox-esr     --no-install-recommends\n\n# Install Python dependencies\nRUN pip3 install anthropic pillow\n\n# Create a non-root user for better security\nRUN useradd -m -s \/bin\/bash claudeuser\nUSER claudeuser\nWORKDIR \/home\/claudeuser\n\nCOPY demo.py .\n\n# Start the virtual display on :99, then run the demo against it\nCMD [\"bash\", \"-c\", \"Xvfb :99 -screen 0 1366x768x24 \u0026amp; sleep 1 \u0026amp;\u0026amp; DISPLAY=:99 python3 demo.py\"]\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eBuild and run:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003edocker build -t computer-use-demo .\ndocker run -e ANTHROPIC_API_KEY=your_key_here computer-use-demo\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch2\u003eDefining the Computer Use Tools\u003c\/h2\u003e\n\n\u003cp\u003eAnthropic provides a standard tool schema for computer use. 
You implement the backend side:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003eimport anthropic\nimport subprocess\nimport base64\n\nclient = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment\n\n# Tool 1: Take a screenshot\ndef take_screenshot() -\u0026gt; str:\n    \"\"\"Capture the screen and return it as a base64-encoded PNG.\"\"\"\n    # \"-\" makes scrot write the image to stdout; check=True raises if scrot fails\n    result = subprocess.run(\n        [\"scrot\", \"-\", \"-z\"],\n        capture_output=True,\n        check=True\n    )\n    return base64.b64encode(result.stdout).decode()\n\n# Tool 2: Mouse click\ndef mouse_click(x: int, y: int, button: str = \"left\") -\u0026gt; str:\n    button_map = {\"left\": \"1\", \"middle\": \"2\", \"right\": \"3\"}\n    btn = button_map.get(button, \"1\")\n    subprocess.run([\"xdotool\", \"mousemove\", str(x), str(y)])\n    subprocess.run([\"xdotool\", \"click\", btn])\n    return f\"Clicked {button} at ({x}, {y})\"\n\n# Tool 3: Type text\ndef type_text(text: str) -\u0026gt; str:\n    # \"--\" keeps text that starts with \"-\" from being parsed as an option\n    subprocess.run([\"xdotool\", \"type\", \"--clearmodifiers\", \"--\", text])\n    return f\"Typed: {text[:50]}...\"\n\n# Tool 4: Press a key or shortcut\ndef key_press(key: str) -\u0026gt; str:\n    subprocess.run([\"xdotool\", \"key\", key])\n    return f\"Pressed key: {key}\"\n\n# Tool 5: Scroll\ndef scroll(x: int, y: int, direction: str, amount: int = 3) -\u0026gt; str:\n    # X11 maps the scroll wheel to mouse buttons 4 (up) and 5 (down)\n    btn = \"4\" if direction == \"up\" else \"5\"\n    # Move the pointer first so the scroll lands on the intended element\n    subprocess.run([\"xdotool\", \"mousemove\", str(x), str(y)])\n    subprocess.run([\"xdotool\", \"click\", \"--repeat\", str(amount), btn])\n    return f\"Scrolled {direction} {amount} times at ({x}, {y})\"\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch2\u003eTool Schemas in Anthropic's Format\u003c\/h2\u003e\n\n\u003cpre\u003e\u003ccode\u003e# Anthropic's built-in computer tool. It is a beta tool, so it must be sent\n# through the beta Messages API with the matching beta flag; check the\n# Anthropic docs for the tool version your model supports.\ncomputer_tools = [\n    {\n        \"type\": \"computer_20241022\",\n        \"name\": \"computer\",\n        \"display_width_px\": 1366,\n        \"display_height_px\": 768,\n        \"display_number\": 99  # matches the Xvfb display :99 from the Dockerfile\n    }\n]\n\n# Or define more detailed custom tools yourself:\ncustom_tools = [\n    {\n        \"name\": \"screenshot\",\n        \"description\": \"Capture 
the current screen and return it as a base64 PNG\",\n        \"input_schema\": {\n            \"type\": \"object\",\n            \"properties\": {},\n            \"required\": []\n        }\n    },\n    {\n        \"name\": \"mouse_click\",\n        \"description\": \"Click the mouse at coordinates (x, y)\",\n        \"input_schema\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"x\": {\"type\": \"integer\", \"description\": \"X coordinate (pixels)\"},\n                \"y\": {\"type\": \"integer\", \"description\": \"Y coordinate (pixels)\"},\n                \"button\": {\n                    \"type\": \"string\",\n                    \"enum\": [\"left\", \"middle\", \"right\"],\n                    \"description\": \"Mouse button, defaults to left\"\n                }\n            },\n            \"required\": [\"x\", \"y\"]\n        }\n    },\n    {\n        \"name\": \"type_text\",\n        \"description\": \"Type text at the current cursor position\",\n        \"input_schema\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"text\": {\"type\": \"string\", \"description\": \"Text to type\"}\n            },\n            \"required\": [\"text\"]\n        }\n    },\n    {\n        \"name\": \"key_press\",\n        \"description\": \"Press a key or shortcut, for example: Return, ctrl+c, alt+Tab\",\n        \"input_schema\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"key\": {\"type\": \"string\", \"description\": \"Key name in xdotool format\"}\n            },\n            \"required\": [\"key\"]\n        }\n    },\n    {\n        \"name\": \"scroll\",\n        \"description\": \"Scroll the mouse wheel at coordinates (x, y)\",\n        \"input_schema\": {\n            \"type\": \"object\",\n            \"properties\": {\n                \"x\": {\"type\": \"integer\", \"description\": \"X coordinate (pixels)\"},\n                \"y\": {\"type\": \"integer\", \"description\": \"Y coordinate (pixels)\"},\n                \"direction\": {\n                    \"type\": \"string\",\n                    \"enum\": [\"up\", \"down\"],\n                    \"description\": \"Scroll direction\"\n                },\n                \"amount\": {\"type\": \"integer\", \"description\": \"Number of scroll steps, defaults to 3\"}\n            },\n            \"required\": [\"x\", \"y\", \"direction\"]\n        }\n    }\n]\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch2\u003eAgent Loop with Vision\u003c\/h2\u003e\n\n\u003cp\u003eThe key difference: when you send a screenshot, you use an \u003cstrong\u003eimage content block\u003c\/strong\u003e, not text:\u003c\/p\u003e\n\n\u003cpre\u003e\u003ccode\u003edef run_computer_agent(task: str) -\u0026gt; str:\n    \"\"\"\n    Run the computer use agent on the 
given task.\n    \"\"\"\n    # Take the initial screenshot\n    screenshot_b64 = take_screenshot()\n\n    # Build the first message, including the image\n    messages = [\n        {\n            \"role\": \"user\",\n            \"content\": [\n                {\n                    \"type\": \"image\",\n                    \"source\": {\n                        \"type\": \"base64\",\n                        \"media_type\": \"image\/png\",\n                        \"data\": screenshot_b64\n                    }\n                },\n                {\n                    \"type\": \"text\",\n                    \"text\": f\"This is the current screen. Your task: {task}\"\n                }\n            ]\n        }\n    ]\n\n    system = \"\"\"You are an AI that controls a computer.\n    Before each action, observe the screen carefully.\n    After each action, take a new screenshot to confirm the result.\n    If something fails, try a different approach.\n    Report back when the task is complete.\"\"\"\n\n    # Map tool names to the functions that implement them\n    tool_map = {\n        \"screenshot\": lambda: take_screenshot(),\n        \"mouse_click\": mouse_click,\n        \"type_text\": type_text,\n        \"key_press\": key_press,\n        \"scroll\": scroll\n    }\n\n    for _ in range(50):  # At most 50 model turns\n        response = client.messages.create(\n            model=\"claude-sonnet-4-5\",\n            max_tokens=4096,\n            system=system,\n            tools=custom_tools,\n            messages=messages\n        )\n\n        messages.append({\n            \"role\": \"assistant\",\n            \"content\": response.content\n        })\n\n        # Any stop reason other than tool_use means there is nothing left to execute\n        if response.stop_reason != \"tool_use\":\n            return next(\n                (b.text for b in response.content if hasattr(b, \"text\")), \"\"\n            )\n\n        tool_results = []\n        for block in response.content:\n            if block.type == \"tool_use\":\n                print(f\"Action: {block.name}({block.input})\")\n                result = tool_map[block.name](**block.input)\n\n          
      # For a screenshot, return an image block\n                if block.name == \"screenshot\":\n                    tool_results.append({\n                        \"type\": \"tool_result\",\n                        \"tool_use_id\": block.id,\n                        \"content\": [\n                            {\n                                \"type\": \"image\",\n                                \"source\": {\n                                    \"type\": \"base64\",\n                                    \"media_type\": \"image\/png\",\n                                    \"data\": result\n                                }\n                            }\n                        ]\n                    })\n                else:\n                    tool_results.append({\n                        \"type\": \"tool_result\",\n                        \"tool_use_id\": block.id,\n                        \"content\": str(result)\n                    })\n\n        messages.append({\n            \"role\": \"user\",\n            \"content\": tool_results\n        })\n\n    return \"Timeout: reached the maximum number of actions\"\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch2\u003eDemo: Filling In a Web Form Automatically\u003c\/h2\u003e\n\n\u003cpre\u003e\u003ccode\u003e# Open Firefox and fill in the search form\nresult = run_computer_agent(\n    \"Open Firefox, go to google.com, \"\n    \"search for 'anthropic claude api', \"\n    \"and take a screenshot of the first result\"\n)\n\nprint(result)\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003cp\u003eClaude will automatically:\u003c\/p\u003e\n\u003col\u003e\n  \u003cli\u003eObserve the screen and determine that Firefox is not open yet\u003c\/li\u003e\n  \u003cli\u003eDouble-click the Firefox icon (or use a terminal)\u003c\/li\u003e\n  \u003cli\u003eTake a screenshot after Firefox opens\u003c\/li\u003e\n  \u003cli\u003eClick the address bar\u003c\/li\u003e\n  \u003cli\u003eType \u003ccode\u003egoogle.com\u003c\/code\u003e and press Enter\u003c\/li\u003e\n  \u003cli\u003eClick the search box and type the query\u003c\/li\u003e\n  
\u003cli\u003eTake a screenshot of the final result\u003c\/li\u003e\n\u003c\/ol\u003e\n\n\u003ch2\u003eSafety Considerations: Important!\u003c\/h2\u003e\n\n\u003cp\u003eComputer use is a powerful feature that carries real risks. Anthropic recommends:\u003c\/p\u003e\n\n\u003cul\u003e\n  \u003cli\u003e\n\u003cstrong\u003eAlways use a sandbox\u003c\/strong\u003e: Docker or a virtual machine. Do NOT run directly on the host machine.\u003c\/li\u003e\n  \u003cli\u003e\n\u003cstrong\u003eLimit privileges\u003c\/strong\u003e: use a non-root user with no sudo rights inside the container\u003c\/li\u003e\n  \u003cli\u003e\n\u003cstrong\u003eMonitor actions\u003c\/strong\u003e: log every action before executing it and allow human review\u003c\/li\u003e\n  \u003cli\u003e\n\u003cstrong\u003eConfirm sensitive actions\u003c\/strong\u003e: deleting files, sending email, making purchases, and similar actions need human confirmation\u003c\/li\u003e\n  \u003cli\u003e\n\u003cstrong\u003eNetwork isolation\u003c\/strong\u003e: restrict network access inside the container\u003c\/li\u003e\n\u003c\/ul\u003e\n\n\u003cpre\u003e\u003ccode\u003eSENSITIVE_PATTERNS = [\n    \"rm -rf\", \"delete\", \"format\",\n    \"send email\", \"purchase\", \"payment\"\n]\n\ndef safe_action(action_name: str, action_input: dict, tool_map: dict) -\u0026gt; str:\n    # tool_map is passed in because it is local to run_computer_agent\n    # Check for dangerous actions\n    input_str = str(action_input).lower()\n    for pattern in SENSITIVE_PATTERNS:\n        if pattern in input_str:\n            confirm = input(\n                f\"WARNING: Sensitive action '{action_name}' \"\n                f\"with input '{input_str[:50]}'. \"\n                f\"Confirm? 
(y\/n): \"\n            )\n            if confirm.lower() != 'y':\n                return \"Action cancelled by the user\"\n\n    return tool_map[action_name](**action_input)\u003c\/code\u003e\u003c\/pre\u003e\n\n\u003ch2\u003eSummary\u003c\/h2\u003e\n\n\u003ctable\u003e\n  \u003cthead\u003e\n    \u003ctr\u003e\n\u003cth\u003eComponent\u003c\/th\u003e\n\u003cth\u003eRole\u003c\/th\u003e\n\u003cth\u003eTechnology\u003c\/th\u003e\n\u003c\/tr\u003e\n  \u003c\/thead\u003e\n  \u003ctbody\u003e\n    \u003ctr\u003e\n\u003ctd\u003eScreenshot\u003c\/td\u003e\n\u003ctd\u003eClaude sees the screen\u003c\/td\u003e\n\u003ctd\u003escrot + base64\u003c\/td\u003e\n\u003c\/tr\u003e\n    \u003ctr\u003e\n\u003ctd\u003eMouse control\u003c\/td\u003e\n\u003ctd\u003eClick, scroll\u003c\/td\u003e\n\u003ctd\u003exdotool\u003c\/td\u003e\n\u003c\/tr\u003e\n    \u003ctr\u003e\n\u003ctd\u003eKeyboard\u003c\/td\u003e\n\u003ctd\u003eType, hotkeys\u003c\/td\u003e\n\u003ctd\u003exdotool type\/key\u003c\/td\u003e\n\u003c\/tr\u003e\n    \u003ctr\u003e\n\u003ctd\u003eSandbox\u003c\/td\u003e\n\u003ctd\u003eSafe isolation\u003c\/td\u003e\n\u003ctd\u003eDocker + Xvfb\u003c\/td\u003e\n\u003c\/tr\u003e\n    \u003ctr\u003e\n\u003ctd\u003eVision API\u003c\/td\u003e\n\u003ctd\u003eClaude analyzes the image\u003c\/td\u003e\n\u003ctd\u003eClaude vision + base64\u003c\/td\u003e\n\u003c\/tr\u003e\n  \u003c\/tbody\u003e\n\u003c\/table\u003e\n\n\u003cp\u003eComputer Use opens the door to automating a wide range of desktop tasks, from filling in forms and handling email to automated UI testing. 
Next steps: see the \u003ca href=\"\/en\/collections\/san-pham\"\u003eBrowser Use Demo\u003c\/a\u003e for more in-depth web automation with Puppeteer, or go back to \u003ca href=\"\/en\/collections\/nang-cao\"\u003eLLM Agent from Scratch\u003c\/a\u003e to understand the foundations of agent architecture.\u003c\/p\u003e","brand":"Minh Tuấn","offers":[{"title":"Default 
Title","offer_id":47721909878996,"sku":null,"price":0.0,"currency_code":"VND","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0821\/0264\/9044\/files\/computer-use-demo-claude-di_u-khi_n-may-tinh-c_a-b_n_d4ac3fcb-5a83-4c90-8b35-2e074f419803.jpg?v=1774521835","url":"https:\/\/claude.vn\/en\/products\/computer-use-demo-claude-di%e1%bb%81u-khi%e1%bb%83n-may-tinh-c%e1%bb%a7a-b%e1%ba%a1n","provider":"CLAUDE.VN","version":"1.0","type":"link"}