Sparse-dense vectors

Overview

Pinecone supports vectors with both sparse and dense values. This lets you perform hybrid search: semantic and keyword search over your data in a single query, with the results combined for greater relevance. This topic describes the sparse-dense vector format in Pinecone.

For sparse-dense embeddings in action, see the Ecommerce hybrid search example.

Sparse-dense vector format

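Pinecone represents sparse values as a dictionary of two parallel arrays: indices and values. Each entry in indices gives the position of a non-zero element, and the entry at the same position in values gives that element's value.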
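As a rough illustration of this format, the sketch below converts an index-to-weight mapping into the two arrays. The helper name to_sparse_values and the input mapping are hypothetical and not part of the Pinecone client. Sorting by index and dropping zero weights are choices made here for readability, not documented requirements.

```python
from typing import Dict


def to_sparse_values(weights: Dict[int, float]) -> Dict[str, list]:
    """Convert a hypothetical {index: weight} mapping into parallel arrays."""
    # Drop zero weights: a sparse vector only needs its non-zero elements.
    items = sorted((i, w) for i, w in weights.items() if w != 0.0)
    return {
        'indices': [i for i, _ in items],
        'values': [float(w) for _, w in items],
    }


sparse_values = to_sparse_values({10: 0.5, 45: 0.5, 16: 0.2, 7: 0.0})
print(sparse_values)
```

The resulting dictionary has the same shape as the sparse_values field used in the records in this topic.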

Example

The following example defines two records with sparse and dense values.

import pinecone

# Assumes the Pinecone client has already been initialized.
index = pinecone.Index('example-index')

records=[
    {'id': 'vec1',
     # The 'values' are dense vector values.
     'values': [0.1, 0.2, 0.3],
     'metadata': {'genre': 'drama'},
     'sparse_values': {
         'indices': [10, 45, 16],
         'values': [0.5, 0.5, 0.2]
     }
    },
    {'id': 'vec2',
     'values': [0.2, 0.3, 0.4],
     'metadata': {'genre': 'action'},
     'sparse_values': {
         'indices': [15, 40, 11],
         'values': [0.4, 0.5, 0.2]
     }
    }
]
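Because indices and values are parallel arrays, a quick client-side check before sending records can catch mismatched lengths early. The following is a minimal sketch. The function check_sparse_values is a hypothetical helper, not part of the Pinecone client, and the non-negative integer check is an assumption about valid indices.

```python
def check_sparse_values(sparse_values: dict) -> None:
    """Raise ValueError if the two arrays are not well-formed parallel arrays."""
    indices = sparse_values['indices']
    values = sparse_values['values']
    if len(indices) != len(values):
        raise ValueError(
            f'indices has {len(indices)} entries but values has {len(values)}'
        )
    # Assumption: indices are non-negative integer positions.
    for i in indices:
        if not isinstance(i, int) or i < 0:
            raise ValueError(f'invalid sparse index: {i!r}')


record = {
    'id': 'vec1',
    'values': [0.1, 0.2, 0.3],
    'metadata': {'genre': 'drama'},
    'sparse_values': {
        'indices': [10, 45, 16],
        'values': [0.5, 0.5, 0.2],
    },
}

# Returns without raising for a well-formed record.
check_sparse_values(record['sparse_values'])
```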

To upsert sparse-dense records, pass them in the vectors parameter of the upsert operation.

Examples

The following example upserts two vectors with sparse and dense values.

index = pinecone.Index('example-index') 

upsert_response = index.upsert(
    vectors=[
        {'id': 'vec1',
         'values': [0.1, 0.2, 0.3],
         'metadata': {'genre': 'drama'},
         'sparse_values': {
             'indices': [10, 45, 16],
             'values': [0.5, 0.5, 0.2]
         }},
        {'id': 'vec2',
         'values': [0.2, 0.3, 0.4],
         'metadata': {'genre': 'action'},
         'sparse_values': {
             'indices': [15, 40, 11],
             'values': [0.4, 0.5, 0.2]
         }}
    ],
    namespace='example-namespace'
)

The following example queries an index using a sparse-dense vector.

query_response = index.query(
    namespace="example-namespace",
    top_k=10,
    vector=[0.1, 0.2, 0.3],
    sparse_vector={
        'indices': [10, 45, 16],
        'values':  [0.5, 0.5, 0.2]
    }
)

To learn about weighting your sparse and dense vectors in queries, see Weighting sparse and dense vectors.
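As a rough sketch of what such weighting can look like, one common approach is a convex combination: scale the dense values by a factor alpha and the sparse values by 1 - alpha before querying. This approach, the function name weight_by_alpha, and the alpha parameter are assumptions for illustration and are not taken from this page.

```python
from typing import Dict, List, Tuple


def weight_by_alpha(
    dense: List[float], sparse: Dict[str, list], alpha: float
) -> Tuple[List[float], Dict[str, list]]:
    """Scale dense values by alpha and sparse values by (1 - alpha).

    alpha=1.0 keeps only the dense signal; alpha=0.0 keeps only the sparse one.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError('alpha must be between 0 and 1')
    weighted_dense = [v * alpha for v in dense]
    weighted_sparse = {
        # Indices are unchanged; only the weights are rescaled.
        'indices': list(sparse['indices']),
        'values': [v * (1.0 - alpha) for v in sparse['values']],
    }
    return weighted_dense, weighted_sparse


dense_query, sparse_query = weight_by_alpha(
    [0.1, 0.2, 0.3],
    {'indices': [10, 45, 16], 'values': [0.5, 0.5, 0.2]},
    alpha=0.75,
)
```

The two returned values could then be passed as the vector and sparse_vector arguments of a query like the one above.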

Next steps