Pinecone supports vectors with sparse and dense values, which allows you to perform hybrid search on your Pinecone index. Hybrid search combines semantic and keyword search in one query for more relevant results. Semantic search results for out-of-domain queries can be less relevant; combining these with keyword search results can improve relevance. This topic describes how hybrid search with sparse-dense vectors works in Pinecone.

For sparse-dense embeddings in action, see the Ecommerce hybrid search example.

This feature is in public preview. Review the current limitations and the considerations for serverless indexes, and test thoroughly before using it in production.

Hybrid search in Pinecone

In Pinecone, you perform hybrid search with sparse-dense vectors. Sparse-dense vectors combine dense and sparse embeddings as a single vector. Sparse and dense vectors represent different types of information and enable distinct kinds of search.

Dense vectors

The basic vector type in Pinecone is a dense vector. Dense vectors enable semantic search. Semantic search returns the most similar results according to a specific distance metric even if no exact matches are present. This is possible because dense vectors generated by embedding models such as SBERT are numerical representations of semantic meaning.
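
As a minimal sketch of ranking by a distance metric, the following example scores a few records against a query using the dot product. The three-dimensional vectors and the record IDs are invented for illustration; real embedding models such as SBERT produce vectors with hundreds of dimensions.

```python
# Toy dense vectors; the numbers are invented for illustration, not model output.
docs = {
    "doc-cats": [0.9, 0.1, 0.0],
    "doc-dogs": [0.7, 0.3, 0.1],
    "doc-tax": [0.0, 0.2, 0.9],
}


def dot(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank(query, docs):
    """Return record IDs sorted from most to least similar to the query."""
    return sorted(docs, key=lambda doc_id: dot(query, docs[doc_id]), reverse=True)


query = [0.8, 0.2, 0.0]  # hypothetical embedding of a pet-related query
ranking = rank(query, docs)
print(ranking)
```

No record matches the query exactly, yet the two pet-related records still rank ahead of the unrelated one.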

Sparse vectors

Sparse vectors have a very large number of dimensions, where only a small proportion of values are non-zero. When used for keyword search, each sparse vector represents a document; the dimensions represent words from a dictionary, and the values represent the importance of these words in the document. Keyword search algorithms like the BM25 algorithm compute the relevance of text documents based on the number of keyword matches, their frequency, and other factors.
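
The following sketch shows one way to store such a vector compactly, as parallel lists of non-zero dimension indices and their values. The five-word vocabulary and the `to_sparse` helper are hypothetical, and the values are raw term counts rather than BM25 weights.

```python
from collections import Counter

# Hypothetical dictionary mapping words to dimension indices.
vocabulary = {"hybrid": 0, "search": 1, "vector": 2, "keyword": 3, "index": 4}


def to_sparse(text, vocabulary):
    """Encode text as parallel lists of non-zero dimension indices and values.

    Values are raw term counts; an algorithm like BM25 would weight them further.
    """
    counts = Counter(w for w in text.lower().split() if w in vocabulary)
    by_index = {vocabulary[w]: float(c) for w, c in counts.items()}
    indices = sorted(by_index)
    return {"indices": indices, "values": [by_index[i] for i in indices]}


sparse = to_sparse("keyword search and hybrid search", vocabulary)
print(sparse)
```

Only the dimensions for words that occur in the text are stored; every other dimension is implicitly zero.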

Sparse-dense workflow

Using sparse-dense vectors involves the following general steps:

  1. Create dense vectors using an external embedding model.
  2. Create sparse vectors using an external model.
  3. Create an index with the dotproduct metric.
  4. Upsert sparse-dense vectors to your index.
  5. Search the index using sparse-dense vectors.
  6. Pinecone returns sparse-dense vectors.
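
The steps above can be sketched as follows. The `make_record` helper, the record ID, the index name, and all numeric values are hypothetical. The commented client calls assume the Pinecone Python client's `upsert` and `query` methods with the `sparse_values` and `sparse_vector` fields; check the API reference for the client version you use.

```python
def make_record(record_id, dense, sparse_indices, sparse_values):
    """Build one sparse-dense record in the dictionary shape assumed for upserts."""
    if len(sparse_indices) != len(sparse_values):
        raise ValueError("sparse indices and values must have the same length")
    return {
        "id": record_id,
        "values": dense,  # dense embedding from step 1
        "sparse_values": {  # sparse embedding from step 2
            "indices": sparse_indices,
            "values": sparse_values,
        },
    }


record = make_record("doc-1", [0.1, 0.2, 0.3], [10, 45, 16], [0.5, 0.5, 0.2])
print(record)

# With a client and an index that uses the dotproduct metric (step 3),
# steps 4 and 5 would look roughly like this (not executed here):
#   from pinecone import Pinecone
#   index = Pinecone(api_key="YOUR_API_KEY").Index("hypothetical-index-name")
#   index.upsert(vectors=[record])
#   index.query(vector=[0.1, 0.2, 0.3],
#               sparse_vector={"indices": [10, 45], "values": [0.5, 0.5]},
#               top_k=3)
```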

Considerations for serverless indexes

Query execution

The implementation of hybrid search is meaningfully different between pod-based and serverless indexes. If you switch from one to the other, you may experience a regression in accuracy.

When you query a serverless index, query planners choose clusters of records based on their similarity to the dense vector value in the query. Query executors then select records based on the similarity of both their dense and sparse vector values to the dense and sparse vector values in the query. Because the initial selection of clusters is based only on dense vector values, this process can affect accuracy in cases where records contain dense and sparse values that are from different data representations (e.g., image and text) or are otherwise not strongly correlated.

For more details about the execution of hybrid queries, see Serverless architecture.
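
To illustrate why dense-only cluster selection can affect accuracy, here is a toy model of the two stages described above. It is a sketch under stated assumptions, not Pinecone's implementation: the clusters, centroids, and records are invented, and the final score is assumed to be the dense dot product plus the sparse dot product.

```python
def dot(a, b):
    """Dot product of two equal-length dense vectors."""
    return sum(x * y for x, y in zip(a, b))


def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {dimension: value} dicts."""
    return sum(v * b[i] for i, v in a.items() if i in b)


# Hypothetical clusters, each with a dense centroid and member records.
clusters = [
    {"centroid": [1.0, 0.0], "records": [
        {"id": "a", "dense": [0.9, 0.1], "sparse": {7: 0.2}},
        {"id": "b", "dense": [0.8, 0.0], "sparse": {3: 1.0}},
    ]},
    {"centroid": [0.0, 1.0], "records": [
        {"id": "c", "dense": [0.1, 0.9], "sparse": {3: 5.0}},
    ]},
]


def hybrid_query(dense_q, sparse_q, clusters, n_clusters=1):
    # Stage 1: choose clusters using the dense query value only.
    chosen = sorted(
        clusters, key=lambda c: dot(dense_q, c["centroid"]), reverse=True
    )[:n_clusters]
    # Stage 2: score records in the chosen clusters on dense plus sparse similarity.
    scored = [
        (dot(dense_q, r["dense"]) + sparse_dot(sparse_q, r["sparse"]), r["id"])
        for c in chosen
        for r in c["records"]
    ]
    return [record_id for _, record_id in sorted(scored, reverse=True)]


dense_q, sparse_q = [1.0, 0.0], {3: 1.0}
narrow = hybrid_query(dense_q, sparse_q, clusters, n_clusters=1)
full = hybrid_query(dense_q, sparse_q, clusters, n_clusters=2)
print(narrow, full)
```

In this toy data, record `c` has the highest combined score, but its dense values are weakly correlated with its sparse values, so it is missed when only the cluster nearest to the dense query value is searched.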

Sparse retrieval costs

Retrieving sparse vectors from a serverless index consumes additional read units (RUs). See Understanding cost for more details.

Limitations

Pinecone sparse-dense vectors have the following limitations:

  • Records with sparse vector values must also contain dense vector values.

  • Sparse vector values can contain up to 1000 non-zero values and 4.2 billion dimensions.

  • Only indexes using the dotproduct distance metric support querying sparse-dense vectors.

    Upserting, updating, and fetching sparse-dense vectors in indexes with a different distance metric will succeed, but querying will return an error.

  • Indexes created before February 22, 2023 do not support sparse vectors.
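
The record-level limits above can be checked on the client side before upserting. The following is a sketch only: `check_record` is a hypothetical helper, the record shape assumes `values` and `sparse_values` fields, and "4.2 billion dimensions" is interpreted as indices below 2**32, which is an assumption.

```python
MAX_NONZERO = 1000
# Assumption: "4.2 billion dimensions" is read here as indices below 2**32.
MAX_DIMENSION = 2**32


def check_record(record):
    """Return a list of limitation violations for one record (empty if none)."""
    problems = []
    sparse = record.get("sparse_values")
    if sparse is None:
        return problems  # dense-only records are not subject to these limits
    if not record.get("values"):
        problems.append("sparse values require dense values")
    indices, values = sparse["indices"], sparse["values"]
    if len(indices) != len(values):
        problems.append("indices and values differ in length")
    if sum(1 for v in values if v != 0) > MAX_NONZERO:
        problems.append("more than 1000 non-zero sparse values")
    if any(i < 0 or i >= MAX_DIMENSION for i in indices):
        problems.append("sparse index outside the supported dimension range")
    return problems


ok = {"id": "x", "values": [0.1], "sparse_values": {"indices": [1, 2], "values": [0.5, 0.5]}}
print(check_record(ok))
```

The distance metric and the index creation date are index-level properties, so this helper does not cover them.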

See also