voyage-law-2 | Voyage AI

METRIC: cosine, dot product

DIMENSION: 1024

MAX INPUT TOKENS: 16000

TASK: embedding

Overview

Optimized for legal retrieval and RAG. See the blog post for details, and visit the Voyage documentation for an overview of all Voyage embedding models and rerankers. The model is accessed through the Voyage Python client, and you must register for a Voyage API key to use it.
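The model card lists both cosine and dot product as supported metrics. As a minimal sketch of how the two relate, assuming the embeddings are unit length (an assumption worth checking against the Voyage documentation), the toy vectors below show that both metrics give the same score once vectors are normalized. The helper names `dot`, `cosine`, and `normalize` are illustrative and not part of either SDK.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

# Toy 3-dimensional vectors standing in for 1024-dimensional embeddings.
query = normalize([1.0, 2.0, 2.0])
doc = normalize([2.0, 1.0, 2.0])

# For unit-length vectors both norms are 1, so the two scores coincide.
print(round(dot(query, doc), 6), round(cosine(query, doc), 6))
```

If that assumption holds for your vectors, choosing `metric="cosine"` or `metric="dotproduct"` when creating the index should produce the same ranking.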

Using the model

Installation

!pip install -qU voyageai pinecone

Create Index

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="API_KEY")

# Create Index
index_name = "voyage-law-2"

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        dimension=1024,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

index = pc.Index(index_name)

Embed & Upsert

# Embed data
data = [
    {"id": "vec1", "text": "The plaintiff alleges breach of contract and seeks damages for financial losses incurred."},
    {"id": "vec2", "text": "This Agreement shall commence on the Effective Date and remain in force unless terminated earlier."},
    {"id": "vec3", "text": "Apple Inc. is named in a class-action lawsuit alleging monopolistic practices in its App Store policies."},
    {"id": "vec4", "text": "All disputes arising under this Agreement shall be resolved through binding arbitration in accordance with applicable laws."},
    {"id": "vec5", "text": "The parties hereby agree to maintain confidentiality regarding any proprietary information shared."},
]

import voyageai

# Replace the placeholder with your Voyage API key
vo = voyageai.Client(api_key="VOYAGE_API_KEY")

model_id = "voyage-law-2"

def embed(docs: list[str], input_type: str) -> list[list[float]]:
    embeddings = vo.embed(
        docs,
        model=model_id,
        input_type=input_type
    ).embeddings
    return embeddings

# Use "document" input type for documents
embeddings = embed([d["text"] for d in data], input_type="document")

vectors = []
for d, e in zip(data, embeddings):
    vectors.append({
        "id": d['id'],
        "values": e,
        "metadata": {'text': d['text']}
    })

index.upsert(
    vectors=vectors,
    namespace="ns1"
)
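The example above embeds and upserts five records in a single call. For a larger corpus it is usually safer to work in batches, since both the embedding API and the upsert call limit request size. The helper below is a minimal sketch of that idea. The name `chunked` and the batch size are illustrative assumptions, not values taken from the Voyage or Pinecone documentation.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")

def chunked(items: Sequence[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield list(items[start:start + batch_size])

# Hypothetical stand-in for a larger list of legal passages.
texts = [f"clause {i}" for i in range(5)]

# batch_size=2 is only for demonstration; pick a value within the API limits.
batches = list(chunked(texts, batch_size=2))
print([len(b) for b in batches])
```

Each batch of records could then be passed through `embed(...)` with `input_type="document"` and sent to `index.upsert(...)` in turn.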

Query

query = "Tell me about the tech company known as Apple"

# Use "query" input type for queries
x = embed([query], input_type="query")

results = index.query(
    namespace="ns1",
    vector=x[0],
    top_k=3,
    include_values=False,
    include_metadata=True
)

print(results)
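Printing the raw response shows every field at once. To read the hits more easily, a sketch like the one below pulls out the ID, score, and stored text of each match. It assumes the matches have been converted to plain dictionaries with `id`, `score`, and `metadata` keys. The `summarize_matches` helper and the sample scores are hypothetical and only illustrate the shape of the data.

```python
def summarize_matches(matches):
    """Return (id, rounded score, text) tuples from a list of match dictionaries."""
    rows = []
    for m in matches:
        # Metadata may be absent if include_metadata was False.
        text = (m.get("metadata") or {}).get("text", "")
        rows.append((m["id"], round(m["score"], 3), text))
    return rows

# Hypothetical matches; real scores will differ.
sample_matches = [
    {"id": "vec3", "score": 0.5, "metadata": {"text": "Apple Inc. is named in a class-action lawsuit."}},
    {"id": "vec1", "score": 0.25, "metadata": None},
]

for match_id, score, text in summarize_matches(sample_matches):
    print(match_id, score, text)
```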
