llama-text-embed-v2 | NVIDIA

Hosted

METRIC: cosine

DIMENSION: 1024, 2048, 768, 512, 384

MAX INPUT TOKENS: 2048

TASK: embedding

PRICE: $0.16

Overview

nvidia/llama-text-embed-v2 is a state-of-the-art embedding model available natively in Pinecone Inference. Developed by NVIDIA Research, it is built on the Llama 3.2 1B architecture and optimized for high retrieval quality with low-latency inference. Also known as llama-3_2-nv-embedqa-1b-v2, the model distills techniques from NVIDIA’s industry-leading NV-2 (7B parameters) into an efficient, production-ready solution.

Retrieval quality: The model surpasses OpenAI’s text-embedding-3-large across multiple benchmarks, in some cases improving accuracy by more than 20%.
Real-time queries: Predictable and consistent query speeds for responsive search, with p99 latencies 12x faster than OpenAI’s text-embedding-3-large.
Multilingual: Supports 26 languages, including English, Spanish, Chinese, Hindi, Japanese, Korean, French, and German.
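
The model card lists cosine as the similarity metric. As a rough illustration of what that metric measures, the sketch below computes cosine similarity by hand on toy vectors. The function name `cosine_similarity` and the 3-dimensional vectors are made up for this example; real embeddings from this model have 384 to 2048 components, and the index computes the score for you.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical values, not model output).
query = [1.0, 0.0, 1.0]
doc_same_direction = [2.0, 0.0, 2.0]   # same direction, different length
doc_orthogonal = [0.0, 3.0, 0.0]       # shares no component with the query

print(cosine_similarity(query, doc_same_direction))
print(cosine_similarity(query, doc_orthogonal))
```

Because cosine similarity depends only on direction, the vector that is twice as long as the query still scores as a perfect match, while the orthogonal one scores zero.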

Installation

pip install --upgrade pinecone

Create index

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Create a dense index with integrated inference
index_name = "llama-text-embed-v2"

pc.create_index_for_model(
    name=index_name,
    cloud="aws",
    region="us-east-1",
    embed={
        "model": "llama-text-embed-v2",
        "field_map": {
            "text": "text"  # Map the record field to be embedded
        }
    }
)

index = pc.Index(index_name)

Embed & upsert

data = [
    {"id": "vec1", "text": "Apple is a popular fruit known for its sweetness and crisp texture."},
    {"id": "vec2", "text": "The tech company Apple is known for its innovative products like the iPhone."},
    {"id": "vec3", "text": "Many people enjoy eating apples as a healthy snack."},
    {"id": "vec4", "text": "Apple Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces."},
    {"id": "vec5", "text": "An apple a day keeps the doctor away, as the saying goes."},
    {"id": "vec6", "text": "Apple Computer Company was founded on April 1, 1976, by Steve Jobs, Steve Wozniak, and Ronald Wayne as a partnership."}
]

index.upsert_records(
    namespace="example-namespace",
    records=data
)
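
For datasets larger than the six records above, it is usually safer to send records in batches rather than in one call. The sketch below wraps the same `index.upsert_records` call in a small batching helper. The helper names `chunk_records` and `upsert_in_batches` are hypothetical, and the default batch size of 96 is an assumption about the per-request limit for text records; check the current Pinecone limits before relying on it.

```python
def chunk_records(records, batch_size=96):
    """Yield consecutive slices of `records` with at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

def upsert_in_batches(index, records, namespace, batch_size=96):
    """Call index.upsert_records once per batch and return the record count sent.

    `index` is expected to expose upsert_records(namespace=..., records=...),
    as in the example above. The default batch_size is an assumed limit.
    """
    sent = 0
    for batch in chunk_records(records, batch_size):
        index.upsert_records(namespace=namespace, records=batch)
        sent += len(batch)
    return sent

# Offline demonstration with placeholder records; no network call is made here.
sample = [{"id": f"vec{i}", "text": f"document {i}"} for i in range(1, 201)]
print([len(batch) for batch in chunk_records(sample)])
```

With a real index, the call would look like `upsert_in_batches(index, data, "example-namespace")`.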

Query

query_payload = {
    "inputs": {
        "text": "Tell me about the tech company known as Apple."
    },
    "top_k": 3
}

results = index.search(
    namespace="example-namespace",
    query=query_payload
)

print(results)
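
Printing the raw response is fine for a first look, but in practice you will likely want the ID, score, and text of each hit. The sketch below assumes the response can be read like a dictionary with hits under `result` → `hits`, each carrying `_id`, `_score`, and `fields`; that shape is an assumption to verify against your SDK version. The helper name `summarize_hits` and the sample response, including its scores, are hypothetical placeholders rather than real model output.

```python
def summarize_hits(response, text_field="text"):
    """Return (id, score, text) tuples from a search response.

    Assumes response["result"]["hits"] is a list of hits with
    "_id", "_score", and "fields" keys (an assumed response shape).
    """
    summary = []
    for hit in response["result"]["hits"]:
        summary.append((hit["_id"], round(hit["_score"], 3), hit["fields"][text_field]))
    return summary

# Hand-written stand-in for a search response; the scores are placeholders.
sample_response = {
    "result": {
        "hits": [
            {"_id": "vec2", "_score": 0.91234,
             "fields": {"text": "The tech company Apple is known for its innovative products like the iPhone."}},
            {"_id": "vec4", "_score": 0.87654,
             "fields": {"text": "Apple Inc. has revolutionized the tech industry with its sleek designs and user-friendly interfaces."}},
        ]
    }
}

for record_id, score, text in summarize_hits(sample_response):
    print(f"{record_id}  {score}  {text}")
```

With a live index, you would pass the `results` object from the query above instead of `sample_response`.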
