Sparse-dense vectors
Overview
Pinecone supports vectors with both sparse and dense values. This lets you perform hybrid search: semantic and keyword search over your data in a single query, with the results combined for greater relevance. This topic describes the sparse-dense vector format in Pinecone.
To see sparse-dense embeddings in action, see the Ecommerce hybrid search example.
Sparse-dense vector format
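Pinecone represents sparse values as a dictionary of two arrays: indices and values. The arrays are parallel: the entry at a given position in values is the weight for the dimension at the same position in indices.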
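As a minimal sketch of this format, the following snippet converts a mapping of dimension index to weight into the indices/values dictionary. The helper name to_sparse_values is hypothetical and not part of the Pinecone client. Sorting by index is done here only to make the output reproducible; it is not presented as a requirement of the format.

```python
def to_sparse_values(weights):
    """Convert a {index: weight} mapping to {'indices': [...], 'values': [...]}.

    Hypothetical helper for illustration; not part of the Pinecone client.
    """
    # Drop zero weights and sort by index so the output is reproducible.
    items = sorted((i, w) for i, w in weights.items() if w != 0)
    return {
        'indices': [i for i, _ in items],
        'values': [float(w) for _, w in items],
    }


sparse_values = to_sparse_values({10: 0.5, 45: 0.5, 16: 0.2})
print(sparse_values)
# {'indices': [10, 16, 45], 'values': [0.5, 0.2, 0.5]}
```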
Example
The following example defines two records with sparse and dense values.
import pinecone

# Assumes the client has already been initialized (for example, with pinecone.init).
index = pinecone.Index('example-index')

records = [
    {'id': 'vec1',
     # The 'values' are dense vector values.
     'values': [0.1, 0.2, 0.3],
     'metadata': {'genre': 'drama'},
     # The 'sparse_values' hold the sparse indices and their weights.
     'sparse_values': {
         'indices': [10, 45, 16],
         'values': [0.5, 0.5, 0.2]
     }
    },
    {'id': 'vec2',
     'values': [0.2, 0.3, 0.4],
     'metadata': {'genre': 'action'},
     'sparse_values': {
         'indices': [15, 40, 11],
         'values': [0.4, 0.5, 0.2]
     }
    }
]
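Before sending records like these, it can help to check their shape locally. The following is a sketch only: check_sparse_dense_record is a hypothetical helper, and the rules it applies (equal-length arrays, non-negative integer indices, no duplicate indices) are assumptions about what a well-formed record looks like, not a statement of Pinecone's own validation.

```python
def check_sparse_dense_record(record):
    """Return a list of problems found in a record dict; empty if none.

    Hypothetical helper; the rules below are assumptions for illustration.
    """
    problems = []
    if not record.get('id'):
        problems.append("missing 'id'")
    if not record.get('values'):
        problems.append("missing dense 'values'")
    sparse = record.get('sparse_values')
    if sparse is not None:
        indices = sparse.get('indices', [])
        values = sparse.get('values', [])
        # Each index needs exactly one matching value.
        if len(indices) != len(values):
            problems.append("'indices' and 'values' differ in length")
        if any(not isinstance(i, int) or i < 0 for i in indices):
            problems.append("'indices' must be non-negative integers")
        if len(set(indices)) != len(indices):
            problems.append("'indices' contains duplicates")
    return problems


good = {'id': 'vec1', 'values': [0.1, 0.2, 0.3],
        'sparse_values': {'indices': [10, 45, 16], 'values': [0.5, 0.5, 0.2]}}
bad = {'id': 'vec2', 'values': [0.2, 0.3, 0.4],
       'sparse_values': {'indices': [15, 15], 'values': [0.4]}}

print(check_sparse_dense_record(good))
print(check_sparse_dense_record(bad))
```

Returning a list of problems rather than raising on the first one makes it easier to report every issue in a batch at once.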
To upsert sparse-dense records, pass them in the vectors parameter of an upsert request.
Examples
The following example upserts two vectors with sparse and dense values.
index = pinecone.Index('example-index')

upsert_response = index.upsert(
    vectors=[
        {'id': 'vec1',
         'values': [0.1, 0.2, 0.3],
         'metadata': {'genre': 'drama'},
         'sparse_values': {
             'indices': [10, 45, 16],
             'values': [0.5, 0.5, 0.2]
         }},
        {'id': 'vec2',
         'values': [0.2, 0.3, 0.4],
         'metadata': {'genre': 'action'},
         'sparse_values': {
             'indices': [15, 40, 11],
             'values': [0.4, 0.5, 0.2]
         }}
    ],
    namespace='example-namespace'
)
The following example queries an index using a sparse-dense vector.
query_response = index.query(
    namespace="example-namespace",
    top_k=10,
    vector=[0.1, 0.2, 0.3],
    sparse_vector={
        'indices': [10, 45, 16],
        'values': [0.5, 0.5, 0.2]
    }
)
To learn about weighting your sparse and dense vectors in queries, see Weighting sparse and dense vectors.
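As a rough illustration of what weighting can look like, the sketch below scales the dense values by alpha and the sparse values by 1 - alpha before querying. This convex-combination scheme and the helper name weight_query are assumptions made for this example; the linked topic remains the authoritative description.

```python
def weight_query(dense, sparse, alpha):
    """Scale dense values by alpha and sparse values by (1 - alpha).

    Hypothetical helper. alpha=1.0 keeps only the dense (semantic) signal,
    alpha=0.0 keeps only the sparse (keyword) signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    weighted_dense = [v * alpha for v in dense]
    weighted_sparse = {
        'indices': list(sparse['indices']),
        'values': [v * (1.0 - alpha) for v in sparse['values']],
    }
    return weighted_dense, weighted_sparse


dense = [0.1, 0.2, 0.3]
sparse = {'indices': [10, 45, 16], 'values': [0.5, 0.5, 0.2]}

# Favor the dense signal three to one over the sparse signal.
weighted_dense, weighted_sparse = weight_query(dense, sparse, alpha=0.75)

# The results could then be passed to a query such as the one above:
# index.query(top_k=10, vector=weighted_dense, sparse_vector=weighted_sparse)
```

Scaling both parts at query time means the stored records stay unchanged, so different values of alpha can be tried without re-upserting.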