Using LangChain and Pinecone to add knowledge to LLMs
The PineconeVectorStore class provided by LangChain can be used to interact with Pinecone indexes. It’s important to remember that you must have an existing Pinecone index before you can create a PineconeVectorStore object.
To initialize a PineconeVectorStore object, you must provide the name of the Pinecone index and an Embeddings object initialized through LangChain. There are two general approaches to initializing a PineconeVectorStore object:
Initialize without adding records: use the from_existing_index method of LangChain’s PineconeVectorStore class to initialize a vector store.
Initialize while adding records: the from_documents and from_texts methods of LangChain’s PineconeVectorStore class add records to a Pinecone index and return a PineconeVectorStore object.
The from_documents method accepts a list of LangChain Document objects, which can be created using LangChain’s CharacterTextSplitter class. The from_texts method accepts a list of strings. As with from_existing_index, you must provide the name of an existing Pinecone index and an Embeddings object.
Both of these methods handle the embedding of the provided text data and the creation of records in your Pinecone index.
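The two approaches above can be sketched as follows. This is a minimal sketch rather than a definitive implementation: it assumes the langchain-pinecone and langchain-text-splitters packages are installed, that the PINECONE_API_KEY environment variable is set, and that embeddings is any LangChain Embeddings object. The function names and the chunk_size default are hypothetical and chosen only for illustration.

```python
def connect_to_existing_index(index_name, embeddings):
    """Wrap an index that already exists; no records are written."""
    # Imported inside the function so the sketch loads without the package installed.
    from langchain_pinecone import PineconeVectorStore

    return PineconeVectorStore.from_existing_index(
        index_name=index_name, embedding=embeddings
    )


def load_texts(texts, embeddings, index_name):
    """Embed a list of strings, upsert them, and return the vector store."""
    from langchain_pinecone import PineconeVectorStore

    return PineconeVectorStore.from_texts(texts, embeddings, index_name=index_name)


def load_documents(raw_text, embeddings, index_name, chunk_size=1000):
    """Split raw text into Document objects, then embed and upsert them."""
    from langchain_pinecone import PineconeVectorStore
    from langchain_text_splitters import CharacterTextSplitter

    splitter = CharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=0)
    documents = splitter.create_documents([raw_text])
    return PineconeVectorStore.from_documents(
        documents, embeddings, index_name=index_name
    )
```

All three helpers return a PineconeVectorStore; only the last two write to the index.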
Once you have initialized a PineconeVectorStore object, you can add more records to the underlying Pinecone index (and thus also the linked LangChain object) using either the add_documents or add_texts methods.
Like from_documents and from_texts, both of these methods handle the embedding of the provided text data and the creation of records in your Pinecone index.
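A short sketch of adding records to a store that is already initialized. It assumes vectorstore is a PineconeVectorStore and relies on add_texts and add_documents each returning the list of IDs they wrote. The helper name add_more_records is hypothetical.

```python
def add_more_records(vectorstore, texts=None, documents=None):
    """Append records to an initialized vector store and return their IDs."""
    ids = []
    if texts:
        # Plain strings are embedded and upserted by add_texts.
        ids.extend(vectorstore.add_texts(texts))
    if documents:
        # LangChain Document objects go through add_documents instead.
        ids.extend(vectorstore.add_documents(documents))
    return ids
```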
Calling similarity_search on a PineconeVectorStore object returns a list of the LangChain Document objects most similar to the query provided. Although similarity_search uses a Pinecone query to find the most similar results, it includes additional steps and returns results of a different type.
The similarity_search method accepts raw text and automatically embeds it using the Embeddings object provided when you initialized the PineconeVectorStore. You can also provide a k value to determine the number of LangChain Document objects to return. The default value is k=4.
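As an illustration of the query step, the sketch below passes raw text and a k value to similarity_search and collects the page_content of each returned Document. It assumes vectorstore is an initialized PineconeVectorStore. The helper name top_matches is hypothetical.

```python
def top_matches(vectorstore, query, k=4):
    """Return the text of the k documents most similar to the raw-text query."""
    # The store embeds the query itself, using the Embeddings object it was built with.
    documents = vectorstore.similarity_search(query, k=k)
    return [document.page_content for document in documents]
```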
Several methods of the PineconeVectorStore class support using namespaces. You can also initialize your PineconeVectorStore object with a namespace to restrict all further operations to that namespace.
If you initialize your PineconeVectorStore object without a namespace, you can specify the target namespace within the operation.
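A sketch of the per-operation form, assuming vectorstore was initialized without a namespace and that both add_texts and similarity_search accept a namespace keyword argument. The helper name is hypothetical.

```python
def add_and_search_in_namespace(vectorstore, texts, query, namespace, k=4):
    """Write texts to one namespace, then query only that namespace."""
    vectorstore.add_texts(texts, namespace=namespace)
    # The same namespace must be named again at query time.
    return vectorstore.similarity_search(query, k=k, namespace=namespace)
```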
The index stats should report a total_vector_count of 0, as you haven’t added any vectors yet.
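One way to check this is to read the stats from an index handle. This sketch assumes index is a handle from the Pinecone Python client (for example, one obtained with Pinecone(api_key=...).Index(name)) and that the stats response exposes total_vector_count as an attribute. The helper name is hypothetical.

```python
def index_is_empty(index):
    """Return True when the index reports no stored vectors."""
    stats = index.describe_index_stats()
    return stats.total_vector_count == 0
```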
The text_field parameter sets the name of the metadata field that stores the raw text when you upsert records using a LangChain operation such as vectorstore.from_documents or vectorstore.add_texts.
This metadata field is used as the page_content in the Document objects retrieved from query-like LangChain operations such as vectorstore.similarity_search.
If you do not specify a value for text_field, it will default to "text".
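The role of this field can be illustrated with a small, library-free sketch. It is a simplified illustration of the mapping described above, not the library's own code. The function name and the sample metadata are hypothetical.

```python
def split_record_metadata(metadata, text_field="text"):
    """Separate the raw text from the rest of a record's metadata."""
    remaining = dict(metadata)  # copy so the caller's dict is left untouched
    page_content = remaining.pop(text_field)
    return page_content, remaining


page_content, metadata = split_record_metadata(
    {"text": "Pinecone is a vector database.", "source": "intro.txt"}
)
print(page_content)  # Pinecone is a vector database.
print(metadata)      # {'source': 'intro.txt'}
```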
You can then query the vector store using vectorstore.similarity_search.
The vector store can then be used as the retriever for a RetrievalQA object.
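A minimal sketch of building such a chain. It assumes a LangChain release that still provides RetrievalQA under langchain.chains (newer releases may have moved or deprecated it), and that llm is any LangChain chat or completion model. The helper name build_qa_chain is hypothetical.

```python
def build_qa_chain(vectorstore, llm):
    """Combine an LLM with the vector store as its retriever."""
    # Imported inside the function so the sketch loads without the package installed.
    from langchain.chains import RetrievalQA

    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # place all retrieved documents into a single prompt
        retriever=vectorstore.as_retriever(),
    )
```

The resulting chain expects its input under the "query" key and returns the answer under "result".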
To include the sources the LLM used to answer your question, you can use a variant of RetrievalQA called RetrievalQAWithSourcesChain.
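The sources variant can be sketched in the same way, under the same assumption that the installed LangChain release still provides the class under langchain.chains. The helper name is hypothetical.

```python
def build_qa_with_sources_chain(vectorstore, llm):
    """Build a QA chain that also reports which sources it drew on."""
    from langchain.chains import RetrievalQAWithSourcesChain

    return RetrievalQAWithSourcesChain.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=vectorstore.as_retriever(),
    )
```

This chain expects its input under the "question" key and returns "answer" and "sources".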
When you no longer need the index, use the delete_index operation to delete it.
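A sketch of the cleanup step, assuming the Pinecone Python client is installed. The helper name and its parameters are hypothetical. Deleting an index is irreversible, so call it only on an index you no longer need.

```python
def remove_index(index_name, api_key):
    """Permanently delete a Pinecone index and all records in it."""
    # Imported inside the function so the sketch loads without the package installed.
    from pinecone import Pinecone

    client = Pinecone(api_key=api_key)
    client.delete_index(index_name)
```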