Upsert and search with integrated inference
This page shows you how to use integrated inference to upsert and search without extra steps for embedding data and reranking results.
This feature is in public preview.
1. Install dependencies
Install the latest Pinecone Python SDK and integrated inference plugin as follows:
The pinecone-plugin-records plugin is not currently compatible with the pinecone[grpc] version of the Python SDK.
2. Create an index
Use the create_for_model operation to create a serverless index configured for a specific embedding model. The name, cloud, region, model, and field_map parameters are required. Other parameters are optional and default based on the embedding model chosen.
- For model, specify one of Pinecone’s hosted embedding models.
- For field_map, specify the name of the field in your source document that contains the data for embedding.
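The required parameters above can be sketched as a small payload builder. This is an illustration only, not the SDK's API: build_index_config is a hypothetical helper, grouping model and field_map under an "embed" key is an assumption about the request shape, and the index name, cloud, region, model name, and chunk_text field are placeholder values.

```python
REQUIRED = ("name", "cloud", "region", "model", "field_map")


def build_index_config(**params):
    """Hypothetical helper: check required parameters and shape a request payload."""
    missing = [key for key in REQUIRED if not params.get(key)]
    if missing:
        raise ValueError(f"missing required parameters: {', '.join(missing)}")
    return {
        "name": params["name"],
        "cloud": params["cloud"],
        "region": params["region"],
        # Assumption: the embedding settings are grouped under "embed".
        "embed": {"model": params["model"], "field_map": params["field_map"]},
    }


config = build_index_config(
    name="example-index",              # placeholder index name
    cloud="aws",                       # placeholder cloud
    region="us-east-1",                # placeholder region
    model="multilingual-e5-large",     # assumed to be a hosted embedding model
    field_map={"text": "chunk_text"},  # assumed mapping: model input -> document field
)
```

Validating locally before sending the request surfaces a missing field_map or region as a clear error rather than a failed API call.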
The response will look like this:
3. Upsert data
Once you have an index configured for a specific embedding model, use the /records/upsert operation to convert your source data to embeddings and upsert them into a namespace in the index.
Note the following requirements for each document in the request body:
- Each document must contain a unique _id, which will serve as the unique record identifier in the index namespace.
- Each document must contain a field with the data for embedding. This field must match the field_map specified when creating the index.
- Any additional fields in the document will be stored in the index and can be returned in search results or used to filter search results.
- When using the API directly, documents are specified using the NDJSON format, also known as line-delimited JSON or JSONL, with one document per line. The Python SDK transforms the list of dictionary entries into the correct NDJSON format for you.
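The requirements above can be checked before sending anything. The following is a minimal sketch using only the standard library: to_ndjson is a hypothetical helper, the sample documents are made up for illustration, and chunk_text stands in for whatever field your field_map names.

```python
import json


def to_ndjson(documents, embed_field):
    """Validate documents and serialize them as NDJSON (one JSON object per line)."""
    seen_ids = set()
    lines = []
    for doc in documents:
        doc_id = doc.get("_id")
        if not doc_id:
            raise ValueError("each document needs an _id")
        if doc_id in seen_ids:
            raise ValueError(f"duplicate _id: {doc_id}")
        if embed_field not in doc:
            raise ValueError(f"document {doc_id} is missing the {embed_field!r} field")
        seen_ids.add(doc_id)
        lines.append(json.dumps(doc, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# Made-up sample documents; "category" is an additional field stored with the record.
docs = [
    {"_id": "rec1", "chunk_text": "Apples are a good source of fiber.", "category": "nutrition"},
    {"_id": "rec2", "chunk_text": "Apple released a new phone.", "category": "product"},
]
body = to_ndjson(docs, embed_field="chunk_text")
```

A helper like this is only needed when calling the API directly; as noted above, the Python SDK performs the NDJSON conversion for you.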
4. Search the index
Use the /records/search operation to convert a query to a vector embedding and then search your namespace for the most semantically similar records, along with their similarity scores.
Note the following:
- The inputs field must be text.
- The top_k parameter must specify the number of similar records to return.
- Optionally, you can specify:
  - The fields to return. If not specified, the response will include all fields.
  - A filter to narrow down the search results.
  - rerank parameters to rerank the initial search results based on relevance to the query.
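As a rough sketch of how these parameters fit together, the request can be assembled as a plain dictionary. build_search_body is a hypothetical helper, nesting inputs and top_k under a "query" key is an assumption about the request shape, and the field names passed in are placeholders.

```python
def build_search_body(text, top_k, fields=None):
    """Hypothetical helper: shape a search request from the parameters listed above."""
    if not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    # Assumption: the query text and top_k are nested under a "query" key.
    body = {"query": {"inputs": {"text": text}, "top_k": top_k}}
    if fields is not None:
        body["fields"] = list(fields)  # leave out to have all fields returned
    return body


body = build_search_body("Disease prevention", top_k=4, fields=["category", "chunk_text"])
```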
Basic search
In the previous step, you upserted 8 documents, some about Apple, the technology company, and some about apple, the fruit.
First, search for the 4 documents most semantically related to the query, “Disease prevention”:
Notice that the response includes only documents about the fruit, not the tech company:
Search with reranking
To rerank initial search results based on relevance to the query, add the rerank parameter, including the reranking model you want to use, the number of reranked results to return, and the fields to use for reranking, if different from the main query.
For example, repeat the search for the 4 documents most semantically related to the query, “Disease prevention”, but this time rerank the results and return only the 2 most relevant documents:
Notice that the 2 returned documents are the most relevant for the query, the first relating to reducing chronic diseases, the second relating to preventing diabetes:
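The reranking settings described in this section can be sketched as an extra block on the search request. with_rerank is a hypothetical helper; the key names top_n and rank_fields, the model name bge-reranker-v2-m3, and the chunk_text field are assumptions used for illustration.

```python
import copy


def with_rerank(body, model, top_n, rank_fields=None):
    """Hypothetical helper: return a copy of a search body with rerank settings added."""
    rerank = {"model": model, "top_n": top_n}
    if rank_fields:
        # Only needed when reranking should use fields other than the default.
        rerank["rank_fields"] = list(rank_fields)
    reranked = copy.deepcopy(body)  # leave the original request untouched
    reranked["rerank"] = rerank
    return reranked


# Retrieve 4 candidates, then keep the 2 that the reranker scores highest.
base = {"query": {"inputs": {"text": "Disease prevention"}, "top_k": 4}}
request = with_rerank(base, model="bge-reranker-v2-m3", top_n=2, rank_fields=["chunk_text"])
```

Keeping top_k larger than the number of reranked results gives the reranker a wider candidate pool to choose from.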
Search with filtering
Your upserted documents also contain a category field. Now use that field as a filter to search for the 2 documents related to Apple, the tech company, that are in the “product” category:
Notice that the response includes only documents about Apple, the tech company, that are in the “product” category:
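To make the effect of the filter concrete, the sketch below emulates a simple equality filter locally and shows one way the filter might be attached to a request. The matches helper, the sample records, the query text, and the placement of filter under the "query" key are all assumptions for illustration; the actual filtering is performed by the index.

```python
def matches(record, flt):
    """Return True when every key in the filter equals the record's value."""
    return all(record.get(key) == value for key, value in flt.items())


# Made-up sample records, not the documents upserted earlier.
records = [
    {"_id": "rec1", "chunk_text": "Apples are a good source of fiber.", "category": "nutrition"},
    {"_id": "rec2", "chunk_text": "Apple released a new phone.", "category": "product"},
    {"_id": "rec3", "chunk_text": "Apple was founded in 1976.", "category": "history"},
]

flt = {"category": "product"}
eligible = [rec["_id"] for rec in records if matches(rec, flt)]

# Assumption: the filter travels alongside inputs and top_k in the query.
body = {"query": {"inputs": {"text": "Apple corporation"}, "top_k": 2, "filter": flt}}
```

Only records that pass the filter are candidates for the semantic search, so a record in another category is excluded even if its text is close to the query.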