# Sparse-dense embeddings

## Overview

Pinecone supports vectors with sparse and dense values, which allows you to perform semantic and keyword search over your data in one query and combine the results for greater relevance. This topic describes how sparse-dense vectors work in Pinecone.

To see sparse-dense embeddings in action, see the Ecommerce hybrid search example.

## Pinecone sparse-dense vectors allow keyword-aware semantic search

Pinecone sparse-dense vectors allow you to perform keyword-aware semantic search. Semantic search results for out-of-domain queries can be less relevant; combining these with keyword search results can improve relevance.

Because Pinecone allows you to create your own sparse vectors, you can use sparse-dense queries to solve the Maximum Inner Product Search (MIPS) problem for sparse-dense vectors of any real values. This includes emerging use cases such as retrieval over learned sparse representations for text data using SPLADE.

## Sparse-dense workflow

Using sparse-dense vectors involves the following general steps:

- Create dense vectors using an external embedding model.
- Create sparse vectors using an external model.
- Create an index that supports sparse-dense vectors (s1 or p1 with the `dotproduct` metric).
- Upsert dense and sparse vectors to your index.
- Search the index using sparse-dense vectors.
- Pinecone returns the most similar sparse-dense vectors.

## Sparse versus dense vectors in Pinecone

Pinecone supports dense and sparse embeddings as a single vector. These types of embeddings represent different types of information and enable distinct kinds of search. Dense vectors enable semantic search. Semantic search returns the most similar results according to a specific distance metric even if no exact matches are present. This is possible because dense vectors generated by embedding models such as SBERT are numerical representations of semantic meaning.

Sparse vectors have a very large number of dimensions, where only a small proportion of values are non-zero. When used for keyword search, each sparse vector represents a document; the dimensions represent words from a dictionary, and the values represent the importance of these words in the document. Keyword search algorithms like BM25 compute the relevance of text documents based on the number of keyword matches, their frequency, and other factors.
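To make this concrete, here is a toy sketch of turning a document into a sparse vector of term importances. The vocabulary and the normalized-term-frequency weighting are illustrative stand-ins for a real keyword model such as BM25, and none of this is Pinecone API:

```python
from collections import Counter

# Hypothetical dictionary mapping words to sparse dimensions.
vocabulary = {'the': 0, 'quick': 1, 'brown': 2, 'fox': 3, 'jumps': 4}

def to_sparse_vector(text: str) -> dict:
    """Map a document to Pinecone's {'indices': [...], 'values': [...]} shape.

    Uses normalized term frequency as a simple stand-in for a real
    importance score like BM25.
    """
    counts = Counter(t for t in text.lower().split() if t in vocabulary)
    total = sum(counts.values())
    # Pair each dimension index with its score, sorted by index.
    pairs = sorted((vocabulary[t], n / total) for t, n in counts.items())
    return {'indices': [i for i, _ in pairs],
            'values': [v for _, v in pairs]}
```

For example, `to_sparse_vector("The quick brown fox")` produces a vector with non-zero values only at the four dimensions for those words; all other dictionary dimensions are implicitly zero.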

## Creating sparse vector embeddings

Keyword-aware semantic search requires vector representations of documents. Because Pinecone indexes accept sparse vectors rather than documents, you can control how sparse vectors are generated to represent your documents.

For examples of sparse vector generation, see SPLADE for Sparse Vector Search Explained, our SPLADE generation notebook, and our BM25 generation notebook.

**Note**

Pinecone supports sparse vectors with up to 1000 non-zero values.
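A simple client-side check can catch vectors that would exceed this limit before you upsert them. This helper is hypothetical, not part of the Pinecone SDK:

```python
MAX_SPARSE_NONZEROS = 1000  # limit stated in the note above

def check_sparse_values(sparse_values: dict) -> None:
    """Raise if a sparse vector is malformed or exceeds the non-zero limit.

    Hypothetical client-side helper, not part of the Pinecone SDK.
    """
    indices = sparse_values['indices']
    values = sparse_values['values']
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    nonzeros = sum(1 for v in values if v != 0)
    if nonzeros > MAX_SPARSE_NONZEROS:
        raise ValueError(f"too many non-zero values: {nonzeros}")
```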

## Pinecone creates sparse-dense vectors from your sparse and dense embeddings

In Pinecone, each vector consists of dense vector values and, optionally, sparse vector values as well. Pinecone does not support vectors with only sparse values.

## p1 and s1 indexes using the `dotproduct` metric support sparse-dense vectors

Pinecone stores sparse-dense vectors in p1 and s1 indexes. In order to query an index using sparse values, the index must use the `dotproduct` metric. Attempting to query any other index with sparse values returns an error.

Indexes created before February 22, 2023 do not support sparse values.

## Sparse-dense queries include sparse and dense vector values

To query your sparse-dense vectors, you provide a query vector containing both sparse and dense values. Pinecone ranks vectors in your index by considering the full dot product over the entire vector; the score of a vector is the sum of the dot product of its dense values with the dense part of the query, together with the dot product of its sparse values with the sparse part of the query.
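The scoring described above can be sketched in plain Python. This mirrors the dot-product computation conceptually; it is not how Pinecone computes scores internally:

```python
def dense_dot(a, b):
    """Dot product of two dense vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))

def sparse_dot(q, v):
    """Dot product of two sparse vectors in {'indices', 'values'} form.

    Only dimensions present in both vectors contribute.
    """
    lookup = dict(zip(v['indices'], v['values']))
    return sum(val * lookup.get(idx, 0.0)
               for idx, val in zip(q['indices'], q['values']))

def hybrid_score(q_dense, q_sparse, v_dense, v_sparse):
    # Score = dense dot product + sparse dot product, summed as one
    # dot product over the full sparse-dense vector.
    return dense_dot(q_dense, v_dense) + sparse_dot(q_sparse, v_sparse)
```

Using the example vector `vec1` below as both query and stored vector, the dense part contributes 0.14 and the sparse part 0.54, for a total score of 0.68.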

### Sparse-dense vector format

Pinecone represents sparse values as a dictionary of two arrays: `indices` and `values`. You can upsert these values inside the vector parameter to upsert a sparse-dense vector.

**Example**

The following example upserts two vectors with sparse and dense values.

```python
import pinecone

# Assumes pinecone.init() has already been called with your API key.
index = pinecone.Index('example-index')

upsert_response = index.upsert(
    vectors=[
        {
            'id': 'vec1',
            'values': [0.1, 0.2, 0.3],
            'metadata': {'genre': 'drama'},
            'sparse_values': {
                'indices': [10, 45, 16],
                'values': [0.5, 0.5, 0.2]
            }
        },
        {
            'id': 'vec2',
            'values': [0.2, 0.3, 0.4],
            'metadata': {'genre': 'action'},
            'sparse_values': {
                'indices': [15, 40, 11],
                'values': [0.4, 0.5, 0.2]
            }
        }
    ],
    namespace='example-namespace'
)
```

The following example queries an index using a sparse-dense vector.

```python
query_response = index.query(
    namespace="example-namespace",
    top_k=10,
    vector=[0.1, 0.2, 0.3],
    sparse_vector={
        'indices': [10, 45, 16],
        'values': [0.5, 0.5, 0.2]
    }
)
```

## Sparse-dense queries do not support explicit weighting

Because Pinecone's index views your sparse-dense vector as a single vector, it does not offer a built-in parameter to adjust the weight of a query's dense part against its sparse part; the index is agnostic to density or sparsity of coordinates in your vectors. You may, however, incorporate a linear weighting scheme by customizing your query vector, as we demonstrate in the function below.

**Examples**

The following example transforms vector values using an alpha parameter.

```python
def hybrid_score_norm(dense, sparse, alpha: float):
    """Hybrid score using a convex combination

    alpha * dense + (1 - alpha) * sparse

    Args:
        dense: Array of floats representing a dense vector
        sparse: a dict of `indices` and `values`
        alpha: scale between 0 and 1
    """
    if alpha < 0 or alpha > 1:
        raise ValueError("Alpha must be between 0 and 1")
    hs = {
        'indices': sparse['indices'],
        'values': [v * (1 - alpha) for v in sparse['values']]
    }
    return [v * alpha for v in dense], hs
```

The following example transforms a vector using the above function, then queries a Pinecone index.

```python
sparse_vector = {
    'indices': [10, 45, 16],
    'values': [0.5, 0.5, 0.2]
}
dense_vector = [0.1, 0.2, 0.3]

hdense, hsparse = hybrid_score_norm(dense_vector, sparse_vector, alpha=0.75)

query_response = index.query(
    namespace="example-namespace",
    top_k=10,
    vector=hdense,
    sparse_vector=hsparse
)
```
