voyage-finance-2 | Voyage AI

METRIC

cosine, dot product

DIMENSION

1024

MAX INPUT TOKENS

32000

TASK

embedding

Overview

voyage-finance-2 is an embedding model optimized for finance retrieval and retrieval-augmented generation (RAG). See the Voyage AI blog post for details, and the Voyage documentation for an overview of all Voyage embedding models and rerankers.

The model is accessed through the Voyage Python client, which requires a Voyage API key. Register with Voyage AI to obtain one.
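
Because both services need an API key, one common pattern is to read the keys from environment variables rather than pasting them into the code. The following is a minimal sketch, not part of the official instructions: the helper `require_env` and the variable names `VOYAGE_API_KEY` and `PINECONE_API_KEY` are assumptions chosen for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set the {name} environment variable before running.")
    return value


# Hypothetical usage; the variable names are assumptions for this sketch:
# VOYAGE_API_KEY = require_env("VOYAGE_API_KEY")
# PINECONE_API_KEY = require_env("PINECONE_API_KEY")
```

Failing early with a named variable tends to be easier to debug than an authentication error returned later by either API.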

Using the model

Installation

!pip install -qU voyageai pinecone

Create Index

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="API_KEY")

# Create Index
index_name = "voyage-finance-2"

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        dimension=1024,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

index = pc.Index(index_name)
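
The index above is created with the cosine metric, while the model card also lists dot product. As a rough illustration of how the two relate, the plain-Python sketch below shows that cosine similarity and dot product coincide when both vectors have unit length. Whether the embeddings returned by the model are unit length is an assumption here and should be checked against the Voyage documentation. The toy 3-dimensional vectors are made up for the example.

```python
import math


def dot(a: list[float], b: list[float]) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of the norms."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


# Toy vectors for illustration only; real embeddings here have 1024 dimensions
a = normalize([3.0, 4.0, 0.0])
b = normalize([4.0, 3.0, 0.0])

# For unit-length vectors the two scores agree (up to floating-point rounding)
print(cosine(a, b), dot(a, b))
```

For vectors that are not unit length, the two metrics can rank results differently, so the index metric should match how the embeddings are meant to be compared.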

Embed & Upsert

# Embed data
data = [
    {"id": "vec1", "text": "The stock market saw a sharp decline in response to rising interest rates."},
    {"id": "vec2", "text": "Investors are shifting towards bonds as a safer investment amid economic uncertainty."},
    {"id": "vec3", "text": "Apple's quarterly earnings exceeded expectations, driving its stock price higher."},
    {"id": "vec4", "text": "Cryptocurrencies like Bitcoin remain volatile but attract significant investor interest."},
    {"id": "vec5", "text": "The Federal Reserve hinted at a potential pause in rate hikes to assess inflation trends."},
]

import voyageai

# Replace the placeholder with your Voyage API key
vo = voyageai.Client(api_key="VOYAGE_API_KEY")

model_id = "voyage-finance-2"

def embed(docs: list[str], input_type: str) -> list[list[float]]:
    embeddings = vo.embed(
        docs,
        model=model_id,
        input_type=input_type
    ).embeddings
    return embeddings

# Use "document" input type for documents
embeddings = embed([d["text"] for d in data], input_type="document")

vectors = []
for d, e in zip(data, embeddings):
    vectors.append({
        "id": d['id'],
        "values": e,
        "metadata": {'text': d['text']}
    })

index.upsert(
    vectors=vectors,
    namespace="ns1"
)
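
The example above embeds and upserts five records in a single call. For larger datasets, requests are usually split into batches. The sketch below shows one way to do that with a small generator. The helper name `batched` and the batch size of 100 are assumptions for illustration, not limits taken from either API; check the Voyage and Pinecone documentation for the actual per-request limits.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Hypothetical usage with the names defined earlier on this page
# (the batch size of 100 is an arbitrary assumption):
# for batch in batched(data, 100):
#     embeddings = embed([d["text"] for d in batch], input_type="document")
#     index.upsert(
#         vectors=[
#             {"id": d["id"], "values": e, "metadata": {"text": d["text"]}}
#             for d, e in zip(batch, embeddings)
#         ],
#         namespace="ns1",
#     )
```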

Query

query = "Tell me about the tech company known as Apple"

# Use "query" input type for queries
x = embed([query], input_type="query")

results = index.query(
    namespace="ns1",
    vector=x[0],
    top_k=3,
    include_values=False,
    include_metadata=True
)

print(results)
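
Printing the raw response is enough for a quick check. To read the matches more easily, a small helper can pull out the ID, score, and stored text of each match. This is a sketch under the assumption that the response supports dictionary-style access with a `matches` list whose entries carry `id`, `score`, and `metadata` fields, and that each record was upserted with a `text` metadata field as above. The helper name `summarize_matches` is hypothetical.

```python
def summarize_matches(response) -> list[str]:
    """Format each match as 'id  score=...  text' for quick inspection."""
    lines = []
    for match in response["matches"]:
        # Assumes include_metadata=True and a "text" field stored at upsert time
        text = match["metadata"]["text"]
        lines.append(f'{match["id"]}  score={match["score"]:.3f}  {text}')
    return lines


# Hypothetical usage with the query response from above:
# for line in summarize_matches(results):
#     print(line)
```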
