
Creating Embeddings with Voyage AI: Text to Vectors

Minh Tuấn, CTO, Transform Group

26/03/2026 · 6 min read




Embeddings underpin most RAG systems: they convert text into numeric vectors so that passages with similar meaning end up close together in a high-dimensional space. Voyage AI is the embedding provider that Anthropic's documentation recommends, since Anthropic does not offer an embedding model of its own.

This article explains what an embedding is, how to use the Voyage AI API, and how to integrate it into your RAG pipeline.

What is an embedding?

Imagine every word or sentence placed on a multi-dimensional "map of meaning". Words and sentences with similar meanings sit close together, and those with different meanings sit far apart. An embedding model converts text into coordinates on that map.

Examples:

"The dog" and "The cat": vectors close together (both are pets)
"The dog" and "The car": vectors far apart (different categories)
"API key error" and "Cannot authenticate": vectors close together (both concern authentication issues)

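To make the map idea concrete, here is a minimal sketch with hand-made 3-D coordinates. The numbers are invented for illustration only (a real model produces the coordinates, and in far more dimensions), and `toy_vectors` and `distance` are hypothetical names.

```python
import numpy as np

# Made-up 3-D coordinates purely for illustration; real embeddings have
# hundreds or thousands of dimensions and are produced by a model.
toy_vectors = {
    "dog": np.array([0.9, 0.8, 0.1]),
    "cat": np.array([0.8, 0.9, 0.1]),
    "car": np.array([0.1, 0.2, 0.9]),
}


def distance(a, b):
    """Euclidean distance between two named points on the toy map."""
    return float(np.linalg.norm(toy_vectors[a] - toy_vectors[b]))


print(f"dog-cat: {distance('dog', 'cat'):.3f}")
print(f"dog-car: {distance('dog', 'car'):.3f}")
```

With these coordinates the dog-cat distance comes out much smaller than the dog-car distance, which is the property a trained embedding model is meant to provide for real text.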
Voyage AI Models

Voyage AI offers several models for different use cases:

Model	Dimensions	Use case	Context length
voyage-3	1024	General purpose, balanced	32K tokens
voyage-3-lite	512	Cost-efficient, high volume	32K tokens
voyage-code-3	1024	Source code, technical docs	32K tokens
voyage-finance-2	1024	Financial documents	32K tokens
voyage-law-2	1024	Legal documents	16K tokens
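
The dimension column matters in practice because it drives how much space the stored vectors take. The sketch below estimates raw storage from the table, assuming each value is stored as a 4-byte float32 and ignoring any index overhead; `MODEL_DIMENSIONS` and `estimate_storage_bytes` are hypothetical names.

```python
# Dimensions copied from the table above.
MODEL_DIMENSIONS = {
    "voyage-3": 1024,
    "voyage-3-lite": 512,
    "voyage-code-3": 1024,
    "voyage-finance-2": 1024,
    "voyage-law-2": 1024,
}


def estimate_storage_bytes(num_texts, model, bytes_per_value=4):
    """Raw vector storage, assuming one float32 (4 bytes) per dimension."""
    return num_texts * MODEL_DIMENSIONS[model] * bytes_per_value


for name in ("voyage-3", "voyage-3-lite"):
    size_gb = estimate_storage_bytes(1_000_000, name) / 1e9
    print(f"{name}: {size_gb:.2f} GB for 1M texts")
```

Under these assumptions, halving the dimension halves the storage, which is one reason the lite model suits high-volume workloads.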

Installation and setup

pip install voyageai numpy

import os
import voyageai
import numpy as np

client = voyageai.Client(api_key=os.environ.get("VOYAGE_API_KEY"))
print("Voyage AI client ready")
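
A missing key otherwise surfaces only on the first API call. As an optional sketch, a small helper can fail early with a clearer message; `require_api_key` is a hypothetical name, not part of the Voyage AI SDK.

```python
import os


def require_api_key(env=os.environ, name="VOYAGE_API_KEY"):
    """Return the API key from env, or raise a clear error if it is missing."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the client")
    return key
```

It could then be used as `voyageai.Client(api_key=require_api_key())` in place of the `os.environ.get` call.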

Creating basic embeddings

def embed_texts(texts, model="voyage-3", input_type=None):
    """
    Create embeddings for a list of texts.

    input_type options:
    - None: Not specified (for general tasks)
    - "document": Text to be indexed (optimized for retrieval)
    - "query": A search query (optimized for search)
    """
    result = client.embed(
        texts=texts,
        model=model,
        input_type=input_type
    )
    return result.embeddings

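# Basic example (Vietnamese texts kept on purpose to show cross-lingual matching)
texts = [
    "Hướng dẫn cài đặt Claude API",        # "Claude API installation guide" in Vietnamese
    "Claude API installation guide",
    "Cách nấu phở bò ngon",                # "How to cook delicious beef pho"
    "Lỗi 401 Unauthorized khi gọi API",    # "401 Unauthorized error when calling the API"
    "Không thể xác thực — thiếu API key"   # "Cannot authenticate: missing API key"
]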

embeddings = embed_texts(texts)
print(f"Embedded {len(texts)} texts")
print(f"Embedding dimension: {len(embeddings[0])}")

Computing cosine similarity

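Cosine similarity measures the angle between two vectors, with values from -1 (opposite directions) to 1 (same direction). As a rough rule of thumb for embeddings: 0.8 and above means very similar, 0.5 to 0.8 means related, and below 0.5 means unrelated. These thresholds vary by model, so calibrate them on your own data.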
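
Written out, the measure is the dot product of the two vectors divided by the product of their lengths. The vectors a and b in the worked line are made-up values chosen for easy arithmetic.

```latex
\cos\theta = \frac{\mathbf{a}\cdot\mathbf{b}}{\lVert\mathbf{a}\rVert\,\lVert\mathbf{b}\rVert}
\qquad
\text{e.g. } \mathbf{a}=(1,0),\ \mathbf{b}=(1,1):\quad
\cos\theta = \frac{1\cdot 1 + 0\cdot 1}{1\cdot\sqrt{2}} = \frac{1}{\sqrt{2}} \approx 0.707
```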

def cosine_similarity(vec1, vec2):
    """Compute the cosine similarity between two vectors."""
    vec1 = np.array(vec1)
    vec2 = np.array(vec2)

    dot_product = np.dot(vec1, vec2)
    norm1 = np.linalg.norm(vec1)
    norm2 = np.linalg.norm(vec2)

    if norm1 == 0 or norm2 == 0:
        return 0.0

    return dot_product / (norm1 * norm2)

# Compare every other text against the first one
print("Similarity scores:")
reference = texts[0]  # "Hướng dẫn cài đặt Claude API"
ref_embedding = embeddings[0]

for i, (text, emb) in enumerate(zip(texts, embeddings)):
    if i == 0:
        continue
    score = cosine_similarity(ref_embedding, emb)
    print(f"  [{score:.3f}] {text}")

Expected output (illustrative; exact scores depend on the model version):

  [0.912] Claude API installation guide  (English translation = very similar)
  [0.234] Cách nấu phở bò ngon           (completely different topic)
  [0.687] Lỗi 401 Unauthorized khi gọi API  (also about the API, but a different issue)
  [0.645] Không thể xác thực — thiếu API key  (related to API issues)

Semantic Search function

def semantic_search(query, corpus_texts, corpus_embeddings=None, top_k=3):
    """
    Semantic search over a corpus.

    Args:
        query: The search query
        corpus_texts: List of texts to search
        corpus_embeddings: Pre-computed embeddings (optional, avoids re-embedding)
        top_k: Number of results to return
    """
    # Embed the query with input_type="query"
    query_embedding = embed_texts([query], input_type="query")[0]

    # Embed the corpus if embeddings were not supplied
    if corpus_embeddings is None:
        corpus_embeddings = embed_texts(corpus_texts, input_type="document")

    # Compute similarity against every document
    similarities = [
        cosine_similarity(query_embedding, doc_emb)
        for doc_emb in corpus_embeddings
    ]

    # Sort and take the top_k
    ranked = sorted(
        zip(similarities, corpus_texts),
        key=lambda x: x[0],
        reverse=True
    )

    return ranked[:top_k]

# Test semantic search
knowledge_base = [
    "Claude API requires an API key for authentication. Get a key at console.anthropic.com",
    "Rate limits: 50 requests/minute for Tier 1, 1000 requests/minute for Tier 4",
    "The claude-haiku-4-5 model has the lowest cost and suits high-volume production workloads",
    "Streaming responses display text token by token, improving UX",
    "Maximum context window: 200K tokens for claude-opus-4-5",
    "Python SDK: pip install anthropic. TypeScript SDK: npm install @anthropic-ai/sdk",
    "Function calling (tool use) lets Claude call external APIs",
    "Vision API: send base64 images in messages for Claude to analyze"
]

# Pre-compute embeddings once
kb_embeddings = embed_texts(knowledge_base, input_type="document")

# Search
query = "How do I install the Python library?"
results = semantic_search(query, knowledge_base, kb_embeddings)

print(f"\nQuery: {query}")
print("Top results:")
for score, text in results:
    print(f"  [{score:.3f}] {text}")
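
The search function above calls `cosine_similarity` once per document, which is fine for a handful of texts. For larger corpora, the same ranking can be computed with one matrix multiplication. The following is a minimal sketch using toy 2-D vectors in place of real embeddings; `top_k_by_cosine` is a hypothetical helper, not part of any SDK.

```python
import numpy as np


def top_k_by_cosine(query_vec, doc_matrix, k=3):
    """Return (index, score) pairs for the k rows most similar to query_vec."""
    docs = np.asarray(doc_matrix, dtype=float)
    query = np.asarray(query_vec, dtype=float)

    # Normalise rows and query so that a dot product equals cosine similarity.
    doc_norms = np.linalg.norm(docs, axis=1, keepdims=True)
    docs_unit = docs / np.where(doc_norms == 0, 1.0, doc_norms)
    query_norm = np.linalg.norm(query)
    query_unit = query / (query_norm if query_norm else 1.0)

    scores = docs_unit @ query_unit      # one score per document
    order = np.argsort(-scores)[:k]      # indices of the highest scores first
    return [(int(i), float(scores[i])) for i in order]


# Toy 2-D vectors standing in for real embeddings.
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(top_k_by_cosine([1.0, 0.1], toy_docs, k=2))
```

Returning indices rather than texts keeps the helper independent of how the documents themselves are stored.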

Batch embedding for higher throughput

Voyage AI accepts multiple texts in a single call. Sending texts in batches rather than one at a time cuts the number of round trips and raises throughput considerably:

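def batch_embed(texts, model="voyage-3", input_type="document", batch_size=128):
    """
    Embed a large list of texts in batches.
    Voyage AI caps both the number of texts and the total tokens per request;
    the exact limits depend on the model, so check the current API docs.
    A batch size of 128 is a conservative default.
    """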
    all_embeddings = []

    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        result = client.embed(
            texts=batch,
            model=model,
            input_type=input_type
        )
        all_embeddings.extend(result.embeddings)

        # Report progress
        processed = min(i + batch_size, len(texts))
        print(f"  Processed {processed}/{len(texts)} texts")

    return all_embeddings

# Example: embed 1000 documents
import random
import string

large_corpus = [
    f"Document {i}: " + "".join(random.choices(string.ascii_letters, k=100))
    for i in range(1000)
]

print("Batch embedding 1000 documents...")
embeddings_large = batch_embed(large_corpus, batch_size=128)
print(f"Done! {len(embeddings_large)} embeddings created")
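
A long batch job can hit a transient failure such as a rate limit or a network error partway through. One possible way to handle that is to retry the failed batch with exponential backoff, sketched below. `embed_in_batches` and its parameters are hypothetical, and the embedding call is passed in as `embed_fn` so the sketch makes no assumptions about the SDK's error classes.

```python
import time


def embed_in_batches(texts, embed_fn, batch_size=128, max_retries=3,
                     base_delay=1.0, sleep=time.sleep):
    """Embed texts batch by batch, retrying a failed batch with exponential backoff."""
    all_embeddings = []
    for start in range(0, len(texts), batch_size):
        batch = texts[start:start + batch_size]
        for attempt in range(max_retries + 1):
            try:
                all_embeddings.extend(embed_fn(batch))
                break  # batch succeeded, move on to the next one
            except Exception:
                # Broad catch for brevity; real code should catch only the
                # errors that are worth retrying.
                if attempt == max_retries:
                    raise
                sleep(base_delay * (2 ** attempt))  # 1s, 2s, 4s, ...
    return all_embeddings
```

With the client from earlier, `embed_fn` could be `lambda batch: client.embed(texts=batch, model="voyage-3", input_type="document").embeddings`.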

Multilingual Embeddings

Voyage AI supports cross-lingual search, so a Vietnamese query can match English documents:

mixed_corpus = [
    "Claude API requires authentication with an API key",
    "Claude API cần xác thực bằng API key",
    "Rate limits apply to all API endpoints",
    "Giới hạn request áp dụng cho tất cả endpoints",
    "Streaming is supported for real-time responses",
]

mixed_embeddings = embed_texts(mixed_corpus, input_type="document")

# Vietnamese query ("How do I authenticate with the API?")
vi_query = "Làm sao xác thực với API?"
results = semantic_search(vi_query, mixed_corpus, mixed_embeddings)

print(f"Query (VN): {vi_query}")
print("Results:")
for score, text in results:
    print(f"  [{score:.3f}] {text}")

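Reranking with Voyage AI

After retrieving a broad candidate set (for example, the top 50), use a reranker to select the most relevant few (for example, the top 5). The knowledge base here is small, so the example retrieves 5 candidates and reranks those: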

def rerank(query, candidates, top_k=5, model="rerank-2"):
    """
    Rerank candidates by relevance to the query.
    Typically more precise than cosine similarity alone.
    """
    result = client.rerank(
        query=query,
        documents=candidates,
        model=model,
        top_k=top_k
    )

    reranked = [
        {"text": r.document, "score": r.relevance_score}
        for r in result.results
    ]
    return reranked

# Two-stage retrieval: vector search -> rerank
initial_results = [text for _, text in semantic_search(
    "Claude authentication",
    knowledge_base,
    kb_embeddings,
    top_k=5
)]

reranked = rerank("How to authenticate with Claude API?", initial_results)
print("Reranked results:")
for r in reranked:
    print(f"  [{r['score']:.3f}] {r['text']}")
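
To feed the reranked passages into a RAG pipeline, they need to be assembled into a prompt for the language model. The sketch below shows one possible layout; `build_rag_prompt` and the prompt wording are assumptions for illustration, and the sample passages are made up in the same shape that `rerank()` returns.

```python
def build_rag_prompt(question, passages, max_passages=5):
    """Format reranked passages and the question into a single prompt string."""
    selected = passages[:max_passages]
    context = "\n".join(
        f"[{i}] {p['text']}" for i, p in enumerate(selected, start=1)
    )
    return (
        "Answer the question using only the numbered context passages.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical reranker output, in the same shape rerank() returns.
example = [
    {"text": "Claude API requires an API key for authentication.", "score": 0.91},
    {"text": "Streaming returns text token by token.", "score": 0.12},
]
print(build_rag_prompt("How do I authenticate?", example, max_passages=1))
```

Numbering the passages makes it easy to ask the model to cite which passage supports its answer.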

Conclusion

Voyage AI provides high-quality embeddings with multilingual support, including Vietnamese, along with domain-specific models for code, finance, and legal text. Combined with Claude, it is a solid foundation for a RAG system.

Next steps: apply this in "RAG with Pinecone + Claude" to build a production-ready RAG pipeline, or read "LlamaIndex + Claude" to work with a high-level framework.
