This page shows you how to query your sparse-dense vectors (hybrid search) and explains how Pinecone ranks hybrid search results.

Only indexes using the dotproduct distance metric support querying sparse-dense vectors.

This feature is in public preview.

Query records with sparse-dense values

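To query records with sparse-dense values, use the query operation and pass a sparse_vector alongside the dense vector. The sparse_vector value is an object with two keys, indices and values: indices lists the nonzero positions, and values gives the value at each of those positions.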

The following example queries an index using a sparse-dense vector:
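
A minimal sketch of such a query, wrapped in a hypothetical helper named query_sparse_dense. It assumes index is an already-initialized Pinecone index object and uses only the query keyword arguments that appear elsewhere on this page (namespace, top_k, vector, sparse_vector). The helper name, its defaults, and the length check are illustrative additions, not part of the Pinecone client.

```python
from typing import Any, Dict, List


def query_sparse_dense(
    index: Any,
    dense: List[float],
    sparse: Dict[str, list],
    top_k: int = 5,
    namespace: str = "example-namespace",
):
    """Query `index` with a dense vector plus a sparse vector.

    Assumption: `index` is an existing Pinecone index handle whose
    `query` method accepts the keyword arguments used below.
    """
    # The sparse part must pair every index with exactly one value.
    if len(sparse["indices"]) != len(sparse["values"]):
        raise ValueError("sparse 'indices' and 'values' must have equal length")
    return index.query(
        namespace=namespace,
        top_k=top_k,
        vector=dense,
        sparse_vector=sparse,
    )


# With a connected index, a call might look like this (illustrative values):
# query_response = query_sparse_dense(
#     index,
#     dense=[0.1, 0.2, 0.3],
#     sparse={"indices": [10, 45, 16], "values": [0.5, 0.5, 0.2]},
# )
```

The matches that come back depend entirely on the records stored in the index.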

The value of query_response resembles the following:

Python
{'matches': [{'id': 'vec5', 'score': 0.44, 'values': []},
             {'id': 'vec1', 'score': 0.32, 'values': []},
             {'id': 'vec2', 'score': 0.26000002, 'values': []},
             {'id': 'vec3', 'score': 0.200000018, 'values': []},
             {'id': 'vec4', 'score': 0.140000015, 'values': []}]
}
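
To work with a response of this shape, the matches can be read as a list of records, each with an id and a score. The sketch below treats the response as a plain dict copied from the sample output; the actual client may return a response object that exposes these fields differently, and the helper name top_matches is hypothetical.

```python
# Sample response copied from the output above, treated as a plain dict.
query_response = {
    "matches": [
        {"id": "vec5", "score": 0.44, "values": []},
        {"id": "vec1", "score": 0.32, "values": []},
        {"id": "vec2", "score": 0.26000002, "values": []},
        {"id": "vec3", "score": 0.200000018, "values": []},
        {"id": "vec4", "score": 0.140000015, "values": []},
    ]
}


def top_matches(response, min_score=0.0):
    """Return (id, score) pairs scoring at least min_score, in response order."""
    return [
        (match["id"], match["score"])
        for match in response["matches"]
        if match["score"] >= min_score
    ]


print(top_matches(query_response, min_score=0.25))
# [('vec5', 0.44), ('vec1', 0.32), ('vec2', 0.26000002)]
```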

Query a sparse-dense index with explicit weighting

Pinecone treats a sparse-dense vector as a single vector, so it offers no built-in parameter for weighting the dense part of a query against its sparse part; the index is agnostic to the density or sparsity of coordinates in your vectors. You can, however, apply a linear weighting scheme by scaling the query vector yourself, as the function below demonstrates.

Examples

The following function scales the dense values of a query by alpha and its sparse values by 1 - alpha.

Python
def hybrid_score_norm(dense, sparse, alpha: float):
    """Hybrid score using a convex combination

    alpha * dense + (1 - alpha) * sparse

    Args:
        dense: array of floats representing the dense vector
        sparse: a dict of `indices` and `values`
        alpha: weight between 0 and 1; 1 is dense only, 0 is sparse only
    """
    if alpha < 0 or alpha > 1:
        raise ValueError("Alpha must be between 0 and 1")
    hs = {
        'indices': sparse['indices'],
        'values':  [v * (1 - alpha) for v in sparse['values']]
    }
    return [v * alpha for v in dense], hs

The following example transforms a vector using the above function, then queries a Pinecone index.

Python
sparse_vector = {
    'indices': [10, 45, 16],
    'values':  [0.5, 0.5, 0.2]
}
dense_vector = [0.1, 0.2, 0.3]

hdense, hsparse = hybrid_score_norm(dense_vector, sparse_vector, alpha=0.75)

query_response = index.query(
    namespace="example-namespace",
    top_k=10,
    vector=hdense,
    sparse_vector=hsparse
)
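
To see why scaling the query acts as a weighting, the sketch below computes a hybrid score by hand. It assumes the score is a plain dot product over the dense part plus a dot product over the sparse indices the query and record share, which follows from the dotproduct metric and the single-vector view described above. The stored record (doc_dense, doc_sparse) and the helper names are hypothetical.

```python
def dot_dense(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


def dot_sparse(a, b):
    """Dot product of two sparse vectors given as indices/values dicts."""
    b_lookup = dict(zip(b["indices"], b["values"]))
    # Only indices present in both vectors contribute to the sum.
    return sum(v * b_lookup.get(i, 0.0) for i, v in zip(a["indices"], a["values"]))


def hybrid_score(query_dense, query_sparse, doc_dense, doc_sparse, alpha):
    """alpha * (dense dot product) + (1 - alpha) * (sparse dot product)."""
    if alpha < 0 or alpha > 1:
        raise ValueError("Alpha must be between 0 and 1")
    return (alpha * dot_dense(query_dense, doc_dense)
            + (1 - alpha) * dot_sparse(query_sparse, doc_sparse))


# Query values from the example above.
query_dense = [0.1, 0.2, 0.3]
query_sparse = {"indices": [10, 45, 16], "values": [0.5, 0.5, 0.2]}

# Hypothetical stored record, invented for illustration only.
doc_dense = [0.2, 0.1, 0.4]
doc_sparse = {"indices": [10, 16, 99], "values": [1.0, 0.5, 0.3]}

for alpha in (0.0, 0.75, 1.0):
    score = hybrid_score(query_dense, query_sparse, doc_dense, doc_sparse, alpha)
    print(f"alpha={alpha}: score={score:.2f}")
```

For this record the dense dot product is about 0.16 and the sparse dot product about 0.60, so alpha=0.75 yields roughly 0.27. Setting alpha to 1 ignores the sparse part entirely, and setting it to 0 ignores the dense part.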

See also