Roots: Granting File Access with Clear Boundaries (MCP: Advanced Topics)

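Imagine you are writing an MCP server for video editing with a single tool, convert(input_path, format). You test it locally, and the user gives Claude a request such as "convert biking.mp4", naming the file without its full path.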

What you will learn

Explain the two roles of roots: context for the model and a security boundary.
Distinguish roots from other permission mechanisms (OAuth scopes, filesystem permissions).
Understand that the SDK does not enforce roots; the developer must check paths explicitly.
Implement the is_path_allowed() helper correctly (including symlink and relative-path handling).
List three patterns for exposing roots (CLI args, config file, prompt inject).

What roots are: two roles at once

The two roles are distinct but complementary:

Context answers "where should I look?" (helpful).
Boundary answers "where must I not go?" (protective).

Both are needed. One without the other is either dangerous or useless.

┌──────────────────────────────────────────────────────────┐
│                                                          │
│  ROOTS = LIST OF DIRECTORIES SERVER CAN ACCESS           │
│                                                          │
│  Example:                                                │
│    [                                                     │
│      "file:///Users/jimmy/Documents/projects/",          │
│      "file:///Users/jimmy/Desktop/"                      │
│    ]                                                     │
│                                                          │
│  ────────────────────────────────────────────            │
│                                                          │
│  Vai trò 1: CONTEXT                                      │
│  ────────────────                                        │
│  Claude knows "these are the likely locations of the     │
│  files the user means" → searches within roots instead   │
│  of the whole filesystem.                                │
│                                                          │
│  Vai trò 2: BOUNDARY (SECURITY)                          │
│  ───────────────────────                                 │
│  Server SHOULD reject requests for paths outside roots.  │
│  Protects sensitive data (credentials, SSH keys, ...).   │
│                                                          │
└──────────────────────────────────────────────────────────┘

Workflow với roots

Key observation: roots/list is a server → client request. This means:

A bidirectional transport is required. stateless_http=True disables roots (see lesson 10.5).
The client must declare support for the roots capability.

USER                 CLIENT               SERVER (your tool)
 │                     │                      │
 │ "convert biking.mp4"│                      │
 │────────────────────▶│                      │
 │                     │ tools/call convert   │
 │                     │─────────────────────▶│
 │                     │                      │
 │                     │ server does not know │
 │                     │ the full path of     │
 │                     │ biking.mp4           │
 │                     │                      │
 │                     │ ◀──── roots/list ────│  ← server asks
 │                     │       request        │
 │                     │                      │
 │                     │ ListRootsResult:     │
 │                     │ ["~/Movies",         │
 │                     │  "~/Desktop"]        │
 │                     │─────────────────────▶│
 │                     │                      │
 │                     │                      │ server searches
 │                     │                      │ the 2 roots
 │                     │                      │ for biking.mp4
 │                     │                      │ → found at
 │                     │                      │ ~/Movies/biking.mp4
 │                     │                      │
 │                     │                      │ do conversion
 │                     │                      │
 │                     │ ◀─── tool result ────│
 │                     │                      │
 │ "Done!"             │                      │
 │◀────────────────────│                      │
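
On the wire, the roots/list exchange in the diagram is a pair of JSON-RPC messages. The sketch below builds them as Python dicts following the message shapes in the MCP roots specification; the id value and the two root entries are illustrative and not taken from a real session.

```python
import json

# Capability the client declares during initialization so the server
# knows it may send roots/list (listChanged is optional).
client_capabilities = {"roots": {"listChanged": True}}

# Server -> client request.
roots_list_request = {"jsonrpc": "2.0", "id": 1, "method": "roots/list"}

# Client -> server response; the URIs and names are made-up examples.
roots_list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "roots": [
            {"uri": "file:///Users/jimmy/Movies", "name": "Movies"},
            {"uri": "file:///Users/jimmy/Desktop", "name": "Desktop"},
        ]
    },
}


def root_uris(response: dict) -> list[str]:
    """Extract the root URIs from a roots/list response."""
    return [root["uri"] for root in response["result"]["roots"]]


print(json.dumps(roots_list_request))
print(root_uris(roots_list_response))
```

With the Python SDK you normally never build these dicts by hand; ctx.session.list_roots() on the server and the list_roots_callback on the client produce them for you.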

Both sides need an implementation

Server side: request roots when needed

Points to note:

ctx.session.list_roots(): the server requests roots from the client at runtime.
Path.resolve(): important, because it resolves symlinks and relative paths.
relative_to(root): the Pythonic way to check whether a path lies inside a root.
The SDK does not enforce roots: if you never call is_path_allowed(), the tool can access any path. Security is not automatic.

import sys
from pathlib import Path
from urllib.parse import urlparse, unquote

from mcp.server.fastmcp import FastMCP, Context

mcp = FastMCP(name="video-tools")


def _uri_to_path(uri: str) -> Path:
    """Convert file:// URI to OS-appropriate Path. Handles Windows 'file:///C:/foo'."""
    parsed = urlparse(uri)
    path = unquote(parsed.path)
    # Windows: URL path '/C:/foo' → strip leading slash
    if sys.platform == "win32" and path.startswith("/") and len(path) > 2 and path[2] == ":":
        path = path[1:]
    return Path(path).resolve()


async def is_path_allowed(ctx: Context, requested_path: str) -> bool:
    """Check if requested path falls within any approved root."""
    roots = await ctx.session.list_roots()
    requested = Path(requested_path).resolve()

    for root in roots.roots:
        root_path = _uri_to_path(root.uri)
        try:
            requested.relative_to(root_path)
            return True  # requested is inside root
        except ValueError:
            continue  # not in this root, try next

    return False


@mcp.tool()
async def convert_video(
    input_path: str,
    format: str,
    *,
    ctx: Context,
) -> str:
    # Security check first
    if not await is_path_allowed(ctx, input_path):
        raise PermissionError(
            f"Path {input_path} is outside approved roots. "
            f"Please use a file in allowed directories."
        )

    # Do conversion
    # ... actual ffmpeg code ...
    # with_suffix swaps only the final extension, whatever the input format is
    output_path = Path(input_path).with_suffix(f".{format}")
    return f"Converted to {output_path}"
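
The server code above covers the boundary role. The context role, where the server locates a bare filename such as biking.mp4 inside the roots, might look like the following minimal sketch. find_in_roots is a hypothetical helper (not part of the SDK) and assumes the root URIs have already been converted to local directories, for example with _uri_to_path.

```python
from pathlib import Path


def find_in_roots(filename: str, root_dirs: list[Path]) -> list[Path]:
    """Return every file named `filename` found under the given roots."""
    if Path(filename).name != filename:
        raise ValueError("expected a bare filename without directories")
    matches: list[Path] = []
    for root in root_dirs:
        root = root.resolve()
        if not root.is_dir():
            continue
        # Note: rglob treats its argument as a glob pattern.
        for candidate in root.rglob(filename):
            resolved = candidate.resolve()
            # A symlink inside a root may point elsewhere, so re-check containment.
            if resolved.is_file() and resolved.is_relative_to(root):
                matches.append(resolved)
    return matches
```

Returning a list rather than the first hit leaves the decision to the caller: with several matches, the tool can ask the user which file was meant instead of guessing.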

Client side: expose roots

Three common patterns for supplying roots:

Pattern A: CLI args (used in the code below). Easy for the developer to control, fast dev flow.
Pattern B: Config file. The user edits ~/.config/mcp/roots.json and the client loads it.
Pattern C: Prompt inject. The user tells Claude "use folder X as root"; the client parses the intent and injects the root at runtime.

Each pattern has its own UX trade-off.

import asyncio
import sys
from pathlib import Path

from mcp.client.stdio import stdio_client, StdioServerParameters
from mcp import ClientSession
from mcp.types import Root, ListRootsResult


# Three patterns can supply roots; this client uses Pattern A.

# Pattern A: CLI args
root_paths = sys.argv[1:]
# Usage: python client.py /path/to/videos /another/path


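async def list_roots_callback(context) -> ListRootsResult:
    # The SDK passes the request context as the only argument; it is unused here.
    roots = [
        Root(
            # as_uri() percent-encodes the path and also handles Windows drive letters
            uri=Path(p).resolve().as_uri(),
            name=Path(p).name,
        )
        for p in root_paths
    ]
    return ListRootsResult(roots=roots)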


async def run():
    server_params = StdioServerParameters(
        command="uv", args=["run", "server.py"]
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(
            read, write,
            list_roots_callback=list_roots_callback,  # ← register
        ) as session:
            await session.initialize()

            result = await session.call_tool(
                name="convert_video",
                arguments={"input_path": "/Users/jimmy/Movies/biking.mp4", "format": "mov"},
            )
            print(result.content[0].text)


if __name__ == "__main__":
    if not root_paths:
        print("Usage: python client.py <root1> [root2] ...")
        sys.exit(1)
    asyncio.run(run())
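
Pattern B (config file) could be sketched as follows. The JSON layout {"roots": ["<path>", ...]} is an assumption made for illustration; the lesson names the file ~/.config/mcp/roots.json but does not define its schema, and load_root_uris is a hypothetical helper.

```python
import json
from pathlib import Path


def load_root_uris(config_path: Path) -> list[str]:
    """Read {"roots": ["/some/dir", "~/other"]} and return file:// URIs."""
    data = json.loads(config_path.read_text(encoding="utf-8"))
    uris = []
    for raw in data.get("roots", []):
        # Expand "~" and make the path absolute before building the URI.
        path = Path(raw).expanduser().resolve()
        uris.append(path.as_uri())
    return uris
```

The returned URIs can feed the same Root(...) construction used in list_roots_callback, so only the source of the paths changes between Pattern A and Pattern B.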

is_path_allowed: common pitfalls

Writing this helper correctly is trickier than many people expect.

Mistake 1: String prefix match

Problems:

/Users/alice/Documents starts with /Users/alice → OK.
/Users/alice/../../etc/passwd also "starts with" the root if it is not resolved → BUG!
/Users/alice-evil/secrets starts with /Users/alice → false match!

# ❌ BUG
def is_path_allowed(requested, root):
    return requested.startswith(root)
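
To make the failure concrete, this short sketch runs the prefix check against the three paths from this section. naive_is_allowed is just the buggy function under another name.

```python
def naive_is_allowed(requested: str, root: str) -> bool:
    return requested.startswith(root)


root = "/Users/alice"
cases = [
    "/Users/alice/Documents",          # legitimate
    "/Users/alice/../../etc/passwd",   # escapes via ".."
    "/Users/alice-evil/secrets",       # sibling directory sharing the prefix
]

results = {path: naive_is_allowed(path, root) for path in cases}
for path, allowed in results.items():
    print(f"{path!r}: allowed={allowed}")
```

All three paths are accepted, although only the first one is actually inside the root.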

Correct approach: pathlib.Path.resolve() + relative_to()

resolve() handles:

.. in the path → collapsed.
Symlinks → followed (the result is the canonical path).
Relative paths → made absolute.

relative_to() raises ValueError if the path is not inside the root. Catch that exception instead of doing a string check.

# ✅ Correct
from pathlib import Path

def is_path_allowed(requested: str, roots: list[str]) -> bool:
    req = Path(requested).resolve()
    for root in roots:
        root_p = Path(root).resolve()
        try:
            req.relative_to(root_p)
            return True
        except ValueError:
            pass
    return False
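
A quick way to convince yourself is to build a throwaway directory tree with a symlink that points out of the root. The sketch below repeats the helper so it runs on its own; it assumes a POSIX system, since creating symlinks on Windows may require extra privileges.

```python
import os
import tempfile
from pathlib import Path


def is_path_allowed(requested: str, roots: list[str]) -> bool:
    req = Path(requested).resolve()
    for root in roots:
        try:
            req.relative_to(Path(root).resolve())
            return True
        except ValueError:
            pass
    return False


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp).resolve()
    root = base / "root"
    outside = base / "outside"
    root.mkdir()
    outside.mkdir()
    (root / "ok.txt").write_text("fine")
    (outside / "secret.txt").write_text("secret")
    # A symlink that lives inside the root but points outside it.
    os.symlink(outside, root / "escape")

    inside_ok = is_path_allowed(str(root / "ok.txt"), [str(root)])
    via_dotdot = is_path_allowed(str(root / ".." / "outside" / "secret.txt"), [str(root)])
    via_symlink = is_path_allowed(str(root / "escape" / "secret.txt"), [str(root)])

print(inside_ok, via_dotdot, via_symlink)
```

Only the first path should be allowed: both the ".." traversal and the symlink resolve to a location under outside/, which relative_to() rejects.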

Mistake 2: Case sensitivity on Windows/macOS

Windows and macOS filesystems are case-insensitive by default:

# ❌ On Windows/macOS (case-insensitive FS)
# /Users/Alice and /users/alice can be the same physical directory

os.path.normcase is the standard-library helper for this on Windows:

import os

# Fragment: exact-match comparison of two Path objects, req and root_p
if os.path.normcase(str(req.resolve())) == os.path.normcase(str(root_p.resolve())):
    return True

normcase() lowercases the path (and normalizes slashes) on Windows only. On POSIX systems, including Linux and macOS, it returns the path unchanged, so macOS needs an explicit case-insensitive comparison or an identity check such as os.path.samefile() for paths that exist. On Windows, normcase() is safer than calling str(...).lower() yourself, which can mishandle Unicode characters in some paths.
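
Because os.path.normcase returns POSIX paths unchanged (macOS included), a case-insensitive containment check has to be written explicitly there. The sketch below compares path components after casefold(). is_within and DEFAULT_CASE_INSENSITIVE are hypothetical names, the inputs are assumed to be resolved absolute POSIX paths, and casefold() only approximates the filesystem's own case-folding rules.

```python
import sys
from pathlib import PurePosixPath


def is_within(requested: str, root: str, case_insensitive: bool) -> bool:
    """Component-wise containment check on already-resolved absolute paths."""
    req_parts = PurePosixPath(requested).parts
    root_parts = PurePosixPath(root).parts
    if case_insensitive:
        req_parts = tuple(p.casefold() for p in req_parts)
        root_parts = tuple(p.casefold() for p in root_parts)
    # Comparing whole components avoids the "/Users/alice-evil" prefix bug.
    return req_parts[: len(root_parts)] == root_parts


# Assumption: treat macOS as case-insensitive by default. A volume can be
# formatted case-sensitive, so a real server may need to probe the filesystem.
DEFAULT_CASE_INSENSITIVE = sys.platform == "darwin"
```

Passing the flag explicitly keeps the function testable on any platform; production code would pass DEFAULT_CASE_INSENSITIVE or the result of a filesystem probe.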

Mistake 3: TOCTOU (time-of-check to time-of-use) race

# ❌ Check OK, then use
if is_path_allowed(path):
    with open(path) as f:
        data = f.read()
# Between the check and the open, an attacker can swap in a symlink!

Mitigations:

Open with os.open(path, O_NOFOLLOW) so the call fails if the final path component is a symlink.
Or call resolve() again immediately before opening and verify once more (this narrows the window but does not close it).
Production-grade: use openat() with a file descriptor for the root.
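
The O_NOFOLLOW and openat() mitigations from this section can be combined in Python through os.open with dir_fd. This is a minimal POSIX-only sketch: read_in_root is a hypothetical helper that reads only direct children of the root, and a real implementation would walk nested paths one component at a time in the same way.

```python
import os


def read_in_root(root_dir: str, name: str) -> bytes:
    """Read a direct child of root_dir, refusing symlinks (POSIX only)."""
    if not name or "/" in name or name in (".", ".."):
        raise ValueError("name must be a single path component")
    root_fd = os.open(root_dir, os.O_RDONLY | os.O_DIRECTORY)
    try:
        # dir_fd makes this an openat() call relative to the root descriptor;
        # O_NOFOLLOW makes it fail if `name` itself is a symlink.
        fd = os.open(name, os.O_RDONLY | os.O_NOFOLLOW, dir_fd=root_fd)
    finally:
        os.close(root_fd)
    with os.fdopen(fd, "rb") as f:
        return f.read()
```

Because the check and the open happen in a single system call on a descriptor the server already holds, there is no gap in which the entry can be swapped for a symlink.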

Advanced patterns

Pattern 1: Nested roots

The user can grant multiple roots with different levels of access:

roots = [
    Root(uri="file:///Users/jimmy/Documents/public/", name="public"),
    Root(uri="file:///Users/jimmy/Documents/work/", name="work"),
]

A tool might treat public/ as read-only and work/ as read+write. That depends on the server's logic.

Pattern 2: Glob roots

Some clients support glob patterns:

Root(uri="file:///Users/jimmy/Projects/*/src/", name="src folders")

This matches multiple folders. The pattern has to be expanded into concrete directories before the containment check. Glob matching is not part of the roots specification, so treat it as a client-specific extension.

Pattern 3: Virtual roots

Roots do not have to be filesystem paths. They can use other URI schemes:

Root(uri="s3://my-bucket/videos/", name="S3 videos")
Root(uri="https://api.example.com/docs/", name="API docs")

The server checks the scheme and uses the appropriate client (boto3, httpx) for access. Note that the current MCP specification requires root URIs to use the file:// scheme, so check what your SDK version accepts before relying on this pattern.

Pattern 4: Dynamic roots

The user can change roots during a session. The client then emits notifications/roots/list_changed.

The server listens → receives the notification → re-lists roots → updates its cached allowed list.
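
The listen, re-list, update cycle for dynamic roots can be sketched as a small cache. RootsCache is a hypothetical helper; how you register a handler for notifications/roots/list_changed depends on your SDK version, so the sketch only shows what that handler would call. fake_fetch stands in for a real roots/list round trip.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class RootsCache:
    """Caches root URIs and refetches them after an invalidation."""

    def __init__(self, fetch: Callable[[], Awaitable[list[str]]]) -> None:
        self._fetch = fetch
        self._roots: Optional[list[str]] = None

    def invalidate(self) -> None:
        # Call this from whatever handler receives notifications/roots/list_changed.
        self._roots = None

    async def get(self) -> list[str]:
        if self._roots is None:
            self._roots = await self._fetch()
        return self._roots


async def demo() -> list[list[str]]:
    current = ["file:///Users/jimmy/Movies"]

    async def fake_fetch() -> list[str]:
        return list(current)

    cache = RootsCache(fake_fetch)
    first = await cache.get()
    current.append("file:///Users/jimmy/Desktop")  # the user adds a root
    stale = await cache.get()   # still the cached list
    cache.invalidate()          # the list_changed notification arrives
    fresh = await cache.get()
    return [first, stale, fresh]


snapshots = asyncio.run(demo())
```

The stale read in the middle is exactly the failure mode of a cache that never listens for the notification.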

Security considerations: don't let a small bug become a big vulnerability

1. Validate every input path

Not just paths that come from Claude. Paths the server computes internally should also go through is_path_allowed(). Defense in depth.

2. Log path access

logger.info(f"Access granted: {path}")
logger.warning(f"Access denied: {path}")

This gives you an audit trail when an incident occurs.

3. Rate limit file operations

Limit exfiltration through the tool: cap the number of files per second and the file size per call.

4. Minimum scope

Users should grant the smallest root possible. Do not grant /Users/jimmy/ (the whole home directory). Grant only /Users/jimmy/Documents/project-xyz/.

5. Document clearly

State it explicitly in the README: "This server requires read access to approved roots only. It will NEVER access files outside those roots."

6. Test with unusual paths

../../../etc/passwd
A symlink pointing outside the roots
NULL byte in the path: /safe/path\x00/../secret
Very long path
Unicode / emoji in the path
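
Some of the unusual inputs listed under "Test with unusual paths" can be screened before any filesystem call is made. The sketch below is one possible pre-check: screen_raw_path and MAX_PATH_CHARS are hypothetical names, and the length limit is an assumed conservative value, not a platform constant.

```python
MAX_PATH_CHARS = 4096  # assumption: a conservative cap, adjust per platform


def screen_raw_path(raw: str) -> str:
    """Reject obviously malformed input before touching the filesystem."""
    if not raw:
        raise ValueError("empty path")
    if "\x00" in raw:
        raise ValueError("NUL byte in path")
    if len(raw) > MAX_PATH_CHARS:
        raise ValueError("path too long")
    return raw
```

This does not replace the containment check: ".." segments and symlinks still have to go through resolve() and relative_to(). Unicode and emoji are deliberately allowed, since they are valid in file names.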

Examples by industry

🛠️ IDE integration: review_code

Pain: An MCP server reviews code. The user opens project ~/work/proj-a; the server should not access ~/work/proj-b or ~/.ssh.

Solution:

The IDE (Cursor) exposes the current workspace folder as a root.
The server checks every file path before reading or writing.
Result: The security boundary automatically matches the workspace boundary. Zero leakage between projects.

🎬 Video editor MCP: extract_clips

Pain: A tool extracts clips from videos. The user grants access to a videos folder.

Solution:

Claude Desktop exposes ~/Movies as a root.
The tool extract_clips(video, start, end) verifies that video is inside ~/Movies before running ffmpeg.
Result: The user is safe. Even a mischievous prompt ("extract a clip from ~/.ssh/id_rsa") is denied.

📝 Document analysis MCP

Pain: Analyze PDFs in a user's folder. The enterprise has strict regulations on data location.

Solution:

The user provides only one folder containing the documents to analyze.
The server logs every access and sends the logs to the SIEM.
Every path outside the root is rejected with an explanatory error message.
Result: Passes the compliance audit; you can demonstrate that the server accessed no data outside the scope the user explicitly allowed.

🔬 Data science MCP: query_dataset

Pain: A tool analyzes datasets. The user has 10 datasets, each with its own compliance requirements.

Solution:

Root-per-dataset pattern. The user exposes one root per analysis session.
The server sees only one dataset at a time → no cross-contamination.
Result: A clear audit trail for each analysis.

🎓 Education: grade_student_essays

Pain: A teacher needs Claude to review 30 essays from one class. The essays/ folder holds essays from many classes, and the teacher does not want Claude reading the others.

Solution:

The teacher exposes only the folder essays/class-9A-2026/ as a root.
The server refuses every path outside the root, even when Claude is cleverly hinted otherwise.
A list_accessible_paths() tool lets the teacher confirm the scope beforehand.
Result: Compliance with the school's privacy policy, with no per-class setup in the server code.

🏥 Healthcare: search_medical_notes

Pain: A tool searches clinical notes in a medical-records folder. HIPAA/PDPA rules are strict: no access to files outside the patient currently being treated.

Solution:

Session-per-patient workflow. When a consultation starts, the clinician exposes the folder patients/PT-00142/ as a root.
Every tool (search_notes, read_note, extract_lab_results) uses is_path_allowed().
Every access is written to the hospital's SIEM as an audit log.
Result: Passes the HIPAA audit; you can demonstrate that the server accessed only the scope of the patient being treated.

Anti-patterns

❌ Trusting the path from a tool argument

Symptom: open(input_path) with no check.

Correct approach: Always call is_path_allowed() first.

❌ Comparing paths as strings

Symptom: if path.startswith(root):.

Correct approach: Path.resolve().relative_to().

❌ Not handling case-insensitive filesystems

Symptom: The tool works on Linux but fails on macOS.

Correct approach: Normalize case before comparing on Darwin/Windows.

❌ Caching roots and never refreshing

Symptom: The user adds a new root, but the server keeps using the stale cache.

Correct approach: Listen for notifications/roots/list_changed, or call list_roots() on every request (costs latency but is always correct).

❌ Exposing unnecessary roots

Symptom: The user "grants all folders" → the MCP server can access everything.

Correct approach: Document the best practice: grant least privilege. Provide a sample config with a small scope.

❌ Not handling an empty roots list

Symptom: The server calls list_roots() → the result is empty → the tool crashes because there is no accessible path.

Correct approach:

if not roots.roots:
    return "Please configure at least one root directory to use this tool."

Advanced tips

Tip 1: Log roots at startup

When the session initializes, list the roots and log them:

roots = await ctx.session.list_roots()
await ctx.info(f"Active roots: {[r.name for r in roots.roots]}")

The user knows which folders the server can see. High transparency.

Tip 2: Provide a helper tool

Add a list_accessible_paths() tool that the user can query:

@mcp.tool()
async def list_accessible_paths(*, ctx: Context) -> str:
    roots = await ctx.session.list_roots()
    return "Accessible roots:\n" + "\n".join(
        f"- {r.name}: {r.uri}" for r in roots.roots
    )

Developers debug faster.

Tip 3: Offer read_any and read_in_root versions

This keeps tools flexible: sometimes the user wants to read a file outside the roots (for example from a URL), sometimes access must be strict. Use different tools for different use cases.

Tip 4: Root metadata

Besides the URI and name, attach custom metadata:

Root(
    uri="file:///Users/jimmy/work/",
    name="work",
    metadata={"readonly": True, "dataSensitivity": "high"}
)

The server respects the metadata when deciding write permission.

Tip 5: Use chroot / a sandbox if you are paranoid

If you truly need strong security (a multi-tenant server), run the tool in a subprocess with chroot/namespace/container. Even if is_path_allowed has a bug, the OS layer prevents breakout.
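
For the root metadata tip, the write-permission decision could be sketched like this. RootPolicy and can_access are hypothetical names, the readonly flag mirrors the custom metadata shown in this section, and all paths are assumed to be resolved absolute POSIX paths already.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class RootPolicy:
    path: str        # resolved absolute directory of the root
    readonly: bool   # taken from the root's custom metadata


def can_access(requested: str, roots: list[RootPolicy], write: bool) -> bool:
    """Allow reads in any root, writes only in roots not marked readonly."""
    req = PurePosixPath(requested)
    for root in roots:
        if req.is_relative_to(PurePosixPath(root.path)):
            if not write or not root.readonly:
                return True
    return False
```

Keeping the policy in a plain data class separates the decision from the transport, so the same function serves both a read_in_root style tool and a write tool.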

Apply it now

Exercise 1: Write is_path_allowed with edge cases (20 minutes)

Step 1: Create the helper function:

from pathlib import Path

def is_path_allowed(requested: str, roots: list[str]) -> bool:
    # Your implementation
    pass

Step 2: Test it with these cases:

Case	Expected
requested="/Users/alice/docs/a.txt", roots=["/Users/alice/docs/"]	True
requested="/Users/alice/../bob/secret.txt", roots=["/Users/alice/"]	False
requested="/Users/alice/docs/" (trailing slash), roots=["/Users/alice/docs"]	True
requested="/USERS/Alice/DOCS/a.txt" (case), roots=["/Users/alice/docs/"]	macOS: True, Linux: False
requested="~/docs/a.txt", roots=["/Users/alice/docs/"]	True (after expansion)

Step 3: Fix the bugs you find.

Exercise 2 (optional): Security test

Write 10 malicious path examples (path traversal, symlink attacks, race conditions). Check that the helper rejects all of them. If any pass → bug.
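
For Exercise 1, a small harness can run the table rows against whatever implementation you write. The sketch below includes only the three rows that do not depend on the operating system or home directory. run_cases and always_true are hypothetical names, and always_true is a deliberately wrong stand-in to show what a failure report looks like.

```python
from typing import Callable

# Platform-independent rows from the table. The case-sensitivity and "~" rows
# depend on the OS and home directory, so add them for your own machine.
CASES = [
    ("/Users/alice/docs/a.txt", ["/Users/alice/docs/"], True),
    ("/Users/alice/../bob/secret.txt", ["/Users/alice/"], False),
    ("/Users/alice/docs/", ["/Users/alice/docs"], True),
]


def run_cases(impl: Callable[[str, list[str]], bool]) -> list[str]:
    """Return a description of every failing case (empty list means all passed)."""
    failures = []
    for requested, roots, expected in CASES:
        got = impl(requested, roots)
        if got != expected:
            failures.append(
                f"{requested!r} with {roots}: expected {expected}, got {got}"
            )
    return failures


def always_true(requested: str, roots: list[str]) -> bool:
    return True  # deliberately wrong stand-in for your implementation


print(run_cases(always_true))
```

Replace always_true with your own is_path_allowed and extend CASES with the malicious paths from Exercise 2; an empty list means every case passed.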

Lesson summary

🎯 Roots = the list of directories the server may access. Two roles: context (finding files) and boundary (security).

🎯 The server requests roots through list_roots. This is a server → client request, so it depends on a bidirectional transport.

🎯 The SDK does not auto-enforce. The developer must write is_path_allowed(). No check means every path is accessible.

🎯 String prefix matching is a bug. Use Path.resolve().relative_to() to get it right.

🎯 Three patterns for exposing roots: CLI args, config file, prompt inject. Each is a UX trade-off.

🎯 Case sensitivity matters. macOS and Windows are case-insensitive by default, Linux is not. Handle it per platform.

🎯 Defense in depth. Roots are one layer. OS permissions, sandboxing, and audit logs are additional layers.

