Integrated inference requires a serverless index configured for a specific embedding model. You can either create a new index for a model or configure an existing index for a model.
To create an index that accepts source text and converts it to vectors automatically using an embedding model hosted by Pinecone, use the create_for_model operation as follows:
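As a rough sketch of what such a call might involve, the snippet below assembles the settings for an index with integrated embedding. The index name, cloud, region, and model name are illustrative assumptions rather than values given in this guide, and the SDK method named in the comment (`create_index_for_model`) should be checked against the SDK version you have installed. The `field_map` entry ties the model's text input to the `chunk_text` field used by the records later in this guide.

```python
def index_settings(name, text_field="chunk_text", model="multilingual-e5-large"):
    """Return keyword arguments describing an index with integrated embedding.

    The cloud, region, and default model name are assumptions for illustration.
    """
    return {
        "name": name,
        "cloud": "aws",
        "region": "us-east-1",
        "embed": {
            "model": model,
            # Map the model's text input to the record field holding the source text.
            "field_map": {"text": text_field},
        },
    }


settings = index_settings("example-index")

# With an initialized client `pc` (not created in this sketch), the call would
# be along the lines of: pc.create_index_for_model(**settings)
print(settings["embed"]["field_map"])
```

Keeping the settings in a plain dictionary makes it easy to confirm that the `field_map` matches the field your records actually use before any index is created.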
Once you have an index configured for a specific embedding model, use the /records/upsert operation to convert your source data to embeddings and upsert them into a namespace in the index.
Note the following requirements for each document in the request body:
Each document must contain a unique _id, which will serve as the unique record identifier in the index namespace.
Each document must contain a field with the data for embedding. This field must match the field_map specified when creating the index.
Any additional fields in the document will be stored in the index and can be returned in search results or used to filter search results.
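The requirements above can be checked locally before sending a request. The helper below is a hypothetical pre-flight check, not part of the Pinecone SDK; it assumes the index's `field_map` points at a field named `chunk_text`, and it only checks uniqueness within the batch you pass in.

```python
def check_records(records, text_field="chunk_text"):
    """Return a list of human-readable problems found in `records`."""
    problems = []
    seen_ids = set()
    for position, record in enumerate(records):
        record_id = record.get("_id")
        if not record_id:
            problems.append(f"record {position}: missing _id")
        elif record_id in seen_ids:
            problems.append(f"record {position}: duplicate _id {record_id!r}")
        else:
            seen_ids.add(record_id)
        # The field named in field_map must be present and non-empty.
        if not record.get(text_field):
            problems.append(f"record {position}: missing {text_field!r} field")
    return problems


records = [
    {"_id": "rec1", "chunk_text": "Apples are a great source of dietary fiber.", "category": "nutrition"},
    {"_id": "rec1", "chunk_text": "Apple went public in 1980."},
    {"_id": "rec3", "category": "product"},
]
for problem in check_records(records):
    print(problem)
```

Extra fields such as `category` are left alone by the check, since any additional fields are simply stored with the record.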
When using the API directly, documents are specified using the NDJSON format, also known as line-delimited JSON or JSONL, with one document per line. The Python SDK transforms the list of dictionary entries into the correct NDJSON format for you.
import time

# Target the created index for upsert and search
index = pc.Index(index_name)

index.upsert_records(
    "example-namespace",
    [
        {
            "_id": "rec1",
            "chunk_text": "Apple's first product, the Apple I, was released in 1976 and was hand-built by co-founder Steve Wozniak.",
            "category": "product",
        },
        {
            "_id": "rec2",
            "chunk_text": "Apples are a great source of dietary fiber, which supports digestion and helps maintain a healthy gut.",
            "category": "nutrition",
        },
        {
            "_id": "rec3",
            "chunk_text": "Apples originated in Central Asia and have been cultivated for thousands of years, with over 7,500 varieties available today.",
            "category": "cultivation",
        },
        {
            "_id": "rec4",
            "chunk_text": "In 2001, Apple released the iPod, which transformed the music industry by making portable music widely accessible.",
            "category": "product",
        },
        {
            "_id": "rec5",
            "chunk_text": "Apple went public in 1980, making history with one of the largest IPOs at that time.",
            "category": "milestone",
        },
        {
            "_id": "rec6",
            "chunk_text": "Rich in vitamin C and other antioxidants, apples contribute to immune health and may reduce the risk of chronic diseases.",
            "category": "nutrition",
        },
        {
            "_id": "rec7",
            "chunk_text": "Known for its design-forward products, Apple's branding and market strategy have greatly influenced the technology sector and popularized minimalist design worldwide.",
            "category": "influence",
        },
        {
            "_id": "rec8",
            "chunk_text": "The high fiber content in apples can also help regulate blood sugar levels, making them a favorable snack for people with diabetes.",
            "category": "nutrition",
        },
    ],
)

time.sleep(10)  # Wait for the upserted vectors to be indexed
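For callers who use the API directly instead of the SDK, the NDJSON body can be produced from the same kind of list of dictionaries. The following is a minimal sketch using only the standard library; the `to_ndjson` helper is a hypothetical name, and sending the body (endpoint URL, headers, authentication) is deliberately left out.

```python
import json


def to_ndjson(records):
    """Serialize a list of dicts as NDJSON: one compact JSON object per line."""
    # json.dumps escapes newlines inside strings, so each record stays on one line.
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)


records = [
    {"_id": "rec5", "chunk_text": "Apple went public in 1980.", "category": "milestone"},
    {"_id": "rec3", "chunk_text": "Apples originated in Central Asia.", "category": "cultivation"},
]
body = to_ndjson(records)
print(body)
```

Each line of `body` parses back to exactly one record, which is the property the NDJSON format relies on.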
Use the /records/search operation to convert a query to a vector embedding and then search your namespace for the most semantically similar records, along with their similarity scores.
Note the following:
The inputs field must be text.
The top_k parameter must specify the number of similar records to return.
Optionally, you can specify:
The fields to return. If not specified, the response will include all fields.
A filter to narrow down the search results.
rerank parameters to rerank the initial search results based on relevance to the query.
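The points above can be sketched as a small request builder. The body layout used here (`query.inputs.text`, `query.top_k`, and a top-level `fields` list) is an assumption about the `/records/search` request shape and should be verified against the API reference; `search_body` is a hypothetical helper, not an SDK function. The query text and `top_k` value are the ones used in the example search in this guide.

```python
import json


def search_body(text, top_k, fields=None):
    """Build a request body for a text search; `fields` is optional."""
    if not isinstance(text, str) or not text:
        raise ValueError("inputs must be non-empty text")
    if not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    body = {"query": {"inputs": {"text": text}, "top_k": top_k}}
    if fields is not None:
        # Omitting `fields` leaves the response free to include all fields.
        body["fields"] = list(fields)
    return body


body = search_body("Disease prevention", top_k=4, fields=["category", "chunk_text"])
print(json.dumps(body, indent=2))
```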
Notice that the response includes only documents about the fruit, not the tech company:
{'result': {'hits': [{'_id': 'rec6',
                      '_score': 0.8197098970413208,
                      'fields': {'category': 'nutrition',
                                 'chunk_text': 'Rich in vitamin C and other '
                                               'antioxidants, apples '
                                               'contribute to immune health '
                                               'and may reduce the risk of '
                                               'chronic diseases.'}},
                     {'_id': 'rec2',
                      '_score': 0.7929002642631531,
                      'fields': {'category': 'nutrition',
                                 'chunk_text': 'Apples are a great source of '
                                               'dietary fiber, which supports '
                                               'digestion and helps maintain a '
                                               'healthy gut.'}},
                     {'_id': 'rec8',
                      '_score': 0.7800688147544861,
                      'fields': {'category': 'nutrition',
                                 'chunk_text': 'The high fiber content in '
                                               'apples can also help regulate '
                                               'blood sugar levels, making '
                                               'them a favorable snack for '
                                               'people with diabetes.'}},
                     {'_id': 'rec3',
                      '_score': 0.7553971409797668,
                      'fields': {'category': 'cultivation',
                                 'chunk_text': 'Apples originated in Central '
                                               'Asia and have been cultivated '
                                               'for thousands of years, with '
                                               'over 7,500 varieties available '
                                               'today.'}}]},
 'usage': {'embed_total_tokens': 8, 'read_units': 6}}
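To work with a response like this in code, you can walk the `hits` list. The sketch below assumes the response is available as a plain dictionary with the structure printed above (an SDK response object may need converting first) and uses a shortened copy of that output; `summarize` is a hypothetical helper.

```python
def summarize(response):
    """Return (id, rounded score, category) for each hit, in ranked order."""
    return [
        (hit["_id"], round(hit["_score"], 3), hit["fields"].get("category"))
        for hit in response["result"]["hits"]
    ]


# Shortened copy of the response above; chunk_text is omitted for brevity.
response = {
    "result": {
        "hits": [
            {"_id": "rec6", "_score": 0.8197098970413208, "fields": {"category": "nutrition"}},
            {"_id": "rec2", "_score": 0.7929002642631531, "fields": {"category": "nutrition"}},
        ]
    },
    "usage": {"embed_total_tokens": 8, "read_units": 6},
}
for row in summarize(response):
    print(row)
```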
To rerank initial search results based on relevance to the query, add the rerank parameter, including the reranking model you want to use, the number of reranked results to return, and the fields to use for reranking, if different from the main query.
For example, repeat the search for the 4 documents most semantically related to the query, “Disease prevention”, but this time rerank the results and return only the 2 most relevant documents:
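A request for this reranked search might be assembled as follows. This is a sketch under stated assumptions: the `rerank` keys (`model`, `top_n`, `rank_fields`) and their position next to `query` reflect one reading of the `/records/search` request shape, the model name is an illustrative choice not given in this guide, and `add_rerank` is a hypothetical helper.

```python
import json


def add_rerank(body, model, top_n, rank_fields):
    """Return a copy of `body` with a rerank section added."""
    reranked = dict(body)  # shallow copy so the original request is unchanged
    reranked["rerank"] = {
        "model": model,
        "top_n": top_n,                    # number of reranked results to return
        "rank_fields": list(rank_fields),  # fields whose text is reranked
    }
    return reranked


base = {
    "query": {"inputs": {"text": "Disease prevention"}, "top_k": 4},
    "fields": ["category", "chunk_text"],
}
# "bge-reranker-v2-m3" is an assumed model name, used only for illustration.
request = add_rerank(base, model="bge-reranker-v2-m3", top_n=2, rank_fields=["chunk_text"])
print(json.dumps(request, indent=2))
```

The initial search still retrieves 4 candidates (`top_k`), and the rerank step narrows them to the 2 most relevant (`top_n`).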
Notice that the 2 returned documents are the most relevant to the query: the first is about reducing the risk of chronic diseases, and the second is about regulating blood sugar for people with diabetes:
{'result': {'hits': [{'_id': 'rec6',
                      '_score': 0.004433765076100826,
                      'fields': {'category': 'nutrition',
                                 'chunk_text': 'Rich in vitamin C and other '
                                               'antioxidants, apples '
                                               'contribute to immune health '
                                               'and may reduce the risk of '
                                               'chronic diseases.'}},
                     {'_id': 'rec8',
                      '_score': 0.0029121784027665854,
                      'fields': {'category': 'nutrition',
                                 'chunk_text': 'The high fiber content in '
                                               'apples can also help regulate '
                                               'blood sugar levels, making '
                                               'them a favorable snack for '
                                               'people with diabetes.'}}]},
 'usage': {'embed_total_tokens': 8, 'read_units': 6, 'rerank_units': 1}}
Your upserted documents also contain a category field. Now use that field as a filter to search for the 2 documents related to Apple, the tech company, that are in the “product” category:
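A filtered request of this kind might look like the sketch below. The query text is hypothetical (this guide does not state it), and both the placement of `filter` inside `query` and the `$eq` operator syntax are assumptions to verify against the API reference. The `passes` function is only a local stand-in that shows which records an equality filter would keep; the service performs the real filtering.

```python
import json

request = {
    "query": {
        "inputs": {"text": "Apple corporation"},  # hypothetical query text
        "top_k": 2,
        "filter": {"category": {"$eq": "product"}},
    },
    "fields": ["category", "chunk_text"],
}


def passes(record, metadata_filter):
    """Local illustration of an equality-only filter on record fields."""
    return all(
        record.get(field) == condition["$eq"]
        for field, condition in metadata_filter.items()
    )


sample = [
    {"_id": "rec1", "category": "product"},
    {"_id": "rec2", "category": "nutrition"},
    {"_id": "rec4", "category": "product"},
]
kept = [record["_id"] for record in sample if passes(record, request["query"]["filter"])]
print(json.dumps(request, indent=2))
print(kept)
```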
Notice that the response includes only documents about Apple, the tech company, that are in the “product” category:
{'result': {'hits': [{'_id': 'rec1',
                      '_score': 0.8127778768539429,
                      'fields': {'category': 'product',
                                 'chunk_text': "Apple's first product, the "
                                               'Apple I, was released in 1976 '
                                               'and was hand-built by '
                                               'co-founder Steve Wozniak.'}},
                     {'_id': 'rec4',
                      '_score': 0.7763848900794983,
                      'fields': {'category': 'product',
                                 'chunk_text': 'In 2001, Apple released the '
                                               'iPod, which transformed the '
                                               'music industry by making '
                                               'portable music widely '
                                               'accessible.'}}]},
 'usage': {'embed_total_tokens': 8, 'read_units': 6}}