Encode sparse vectors
This page shows you two ways to encode sparse vectors for use in hybrid search: using Pinecone Inference with the pinecone-sparse-english-v0 embedding model, or using the Pinecone Text Client with the BM25 or SPLADE algorithm.
In most cases, Pinecone Inference with the pinecone-sparse-english-v0 model will produce better results. However, if you cannot send text to Pinecone’s endpoints due to privacy considerations, you can run the Pinecone Text Client locally and send Pinecone just the vectors.
Use Pinecone Inference
Pinecone Inference is a service that gives you access to embedding and reranking models hosted on Pinecone’s infrastructure, including the pinecone-sparse-english-v0 model for sparse embeddings. Built on the innovations of the DeepImpact architecture, pinecone-sparse-english-v0 estimates the lexical importance of tokens by leveraging their context, unlike traditional retrieval models like BM25, which rely solely on term frequency.
To encode sparse vectors with Pinecone Inference, do the following:
- Install the latest Pinecone Python SDK and the integrated inference plugin, pinecone-plugin-records.
  The pinecone-plugin-records plugin is not currently compatible with the pinecone[grpc] version of the Python SDK.
- Use the embed operation, setting the model parameter to pinecone-sparse-english-v0 and the input_type parameter to passage or query. If you want to include string tokens in the response, also set return_tokens to true.
  The returned object contains the sparse embedding for each input.
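The two steps above can be sketched in Python roughly as follows. This is a minimal sketch, not the official example: it assumes the SDK exposes the embed operation as pc.inference.embed(model=..., inputs=..., parameters=...) on a Pinecone client, and the helper name embed_sparse and the sample text are hypothetical. The client is passed in as an argument, so the sketch makes no claim about how you create it or where your API key comes from.

```python
# Assumed install step (run in your shell, not in Python):
#   pip install --upgrade pinecone pinecone-plugin-records


def embed_sparse(pc, texts, input_type="passage", return_tokens=False):
    """Request sparse embeddings for `texts` from pinecone-sparse-english-v0.

    `pc` is expected to be a Pinecone client. The call shape
    pc.inference.embed(model, inputs, parameters) is an assumption about
    the SDK; check it against the version you install.
    """
    # The page names exactly two valid input types.
    if input_type not in ("passage", "query"):
        raise ValueError("input_type must be 'passage' or 'query'")

    return pc.inference.embed(
        model="pinecone-sparse-english-v0",
        inputs=list(texts),
        parameters={
            "input_type": input_type,
            # Set to True to include string tokens in the response.
            "return_tokens": return_tokens,
        },
    )


# Example usage (requires network access and a valid API key):
#   from pinecone import Pinecone
#   pc = Pinecone(api_key="YOUR_API_KEY")
#   result = embed_sparse(pc, ["What is a sparse vector?"],
#                         input_type="query", return_tokens=True)
```

Use input_type="passage" when embedding documents for upsert and input_type="query" when embedding search queries.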
Use the Pinecone Text Client
The Pinecone Text Client is a public Python package that provides text utilities designed for seamless integration with Pinecone’s sparse-dense (hybrid) search.
To convert your text corpus to sparse vectors, you can use either BM25 or SPLADE. This guide uses BM25, which is more common.
- Install the Pinecone Text Client.
- Initialize the BM25 encoder and fit it to your corpus of documents.
  To do so, create a BM25Encoder object and call its fit() function on the corpus, formatted as an array of strings.
  If you want to use the default parameters for BM25Encoder, you can call the default method. The default parameters were fitted on the MS MARCO passage ranking dataset.
- After the encoder is initialized and fit, you can encode documents and queries as sparse vectors.
  Encode a new document to get a sparse vector, doc_sparse_vector, for upsert into a Pinecone index.
  Encode a query string to get a sparse vector, query_sparse_vector, for use in a hybrid search query.
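The encoder steps above might look like the following in Python. Treat it as a sketch under assumptions rather than the original example: it assumes the package is installed as pinecone-text, that BM25Encoder is importable from pinecone_text.sparse, and that documents and queries are encoded with encode_documents() and encode_queries(). The helper names build_encoder and encode_pair are hypothetical. The import happens inside the function so the code loads even where the package is not installed.

```python
# Assumed install step (run in your shell, not in Python):
#   pip install pinecone-text


def build_encoder(corpus=None):
    """Return a BM25 encoder fitted to `corpus`, or the default encoder.

    With no corpus, fall back to BM25Encoder.default(), whose parameters
    were fitted on the MS MARCO passage ranking dataset.
    """
    from pinecone_text.sparse import BM25Encoder  # assumed import path

    if corpus is None:
        return BM25Encoder.default()

    encoder = BM25Encoder()
    encoder.fit(list(corpus))  # the corpus is an array of strings
    return encoder


def encode_pair(encoder, document, query):
    """Encode one document (for upsert) and one query (for hybrid search)."""
    doc_sparse_vector = encoder.encode_documents(document)
    query_sparse_vector = encoder.encode_queries(query)
    return doc_sparse_vector, query_sparse_vector


# Example usage (requires the pinecone-text package):
#   corpus = ["The quick brown fox", "jumps over the lazy dog"]
#   encoder = build_encoder(corpus)
#   doc_vec, query_vec = encode_pair(encoder, "The brown fox is quick",
#                                    "Which fox is brown?")
```

Documents and queries go through separate encoding calls, so keep the two outputs apart: the document vector goes into the upsert request and the query vector into the hybrid search query.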