Understanding Pinecone Assistant
Pinecone Assistant is a service that lets you upload documents, ask questions, and receive responses that reference your documents. This approach is known as retrieval-augmented generation (RAG). You can access Assistant through the Pinecone console, a Python plugin, or the Assistant API. The JavaScript and Java SDKs do not support Pinecone Assistant.
This feature is in public preview.
How it works
When you upload a document, your assistant processes the contents by chunking and embedding the text, then stores the embeddings in a vector database. When you chat with your assistant, it queries a large language model (LLM) with your prompt and any relevant information retrieved from your data sources. With this context, the LLM can provide responses grounded in your documents.
Assistant manages embedding generation, embedding storage, and LLM prompting for you; you do not access these parts of the system directly. You upload files and chat with the model, and Assistant handles all other components.
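The flow described above (chunk, embed, store, retrieve, prompt) can be illustrated with a small local sketch. This is not how Pinecone Assistant is implemented and it does not call any Pinecone API. The class `ToyAssistant`, the bag-of-words "embedding", and the prompt wording are all hypothetical stand-ins chosen so the example runs with only the Python standard library.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy embedding: word counts. A real system would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyAssistant:
    """Hypothetical in-memory stand-in for the upload and chat flow."""

    def __init__(self):
        self.store = []  # list of (chunk_text, embedding) pairs

    def upload(self, text, max_words=40):
        # Chunk the document, embed each chunk, and store the result.
        for chunk in chunk_text(text, max_words):
            self.store.append((chunk, embed(chunk)))

    def retrieve(self, question, top_k=2):
        # Rank stored chunks by similarity to the question.
        q = embed(question)
        ranked = sorted(self.store, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:top_k]]

    def build_prompt(self, question, top_k=2):
        # Combine retrieved context with the question; an LLM would receive this.
        context = "\n".join(f"- {c}" for c in self.retrieve(question, top_k))
        return f"Answer using only this context:\n{context}\n\nQuestion: {question}"
```

Calling `upload()` with a document and then `build_prompt()` with a question returns the text that would be sent to an LLM in this toy setup. With Pinecone Assistant, all of these steps happen inside the service.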
SDK support
You can use the Assistant API directly or via the Pinecone Python SDK.
To interact with Pinecone Assistant using the Python SDK, upgrade the client and install the pinecone-plugin-assistant package (for example, with pip).
Limitations
Pinecone Assistant has the following limitations:
- Supported file types: .txt and .pdf
- Max input tokens per query: 64,000
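As a minimal sketch, the two general limits above could be checked on the client before sending a request. The function names are hypothetical, the token count is assumed to come from whatever tokenizer you use, and treating file extensions as case-insensitive is an assumption rather than documented behavior.

```python
from pathlib import Path

# Values taken from the limits listed above.
SUPPORTED_EXTENSIONS = {".txt", ".pdf"}
MAX_INPUT_TOKENS_PER_QUERY = 64_000


def is_supported_file(filename):
    """Return True if the file extension is .txt or .pdf (case-insensitive, by assumption)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def fits_query_limit(input_tokens):
    """Return True if a query's input token count is within the per-query cap."""
    return 0 <= input_tokens <= MAX_INPUT_TOKENS_PER_QUERY
```

A check like this only avoids obviously invalid requests; the service remains the authority on what it accepts.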
Starter plans
The following limitations apply to each Starter organization:
- Max number of assistants: 3
- Max tokens per minute (TPM) input: 30,000
- Max number of total LLM processed tokens: 1,500,000
- Max total output tokens: 200,000
The following limitations apply to each assistant in Starter organizations:
- Max file storage: 1 GB
- Max files uploaded: 10
Standard and Enterprise plans
The following limitations apply to each Standard or Enterprise organization:
- Max number of assistants: unlimited
- Max tokens per minute (TPM) input: 150,000
- Max number of total LLM processed tokens: unlimited
- Max total output tokens: unlimited
The following limitations apply to each assistant in Standard or Enterprise organizations:
- Max file storage: 10 GB
- Max files uploaded: 10,000
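The plan limits above can be captured as data so that an application can check them before creating an assistant or uploading a file. This is an illustrative sketch: `PlanLimits`, `can_upload`, and `can_create_assistant` are hypothetical names, `None` is used here to mean "unlimited", and 1 GB is assumed to mean 10^9 bytes, which the limits above do not specify.

```python
from dataclasses import dataclass
from typing import Optional

GB = 10**9  # assumption: decimal gigabytes


@dataclass(frozen=True)
class PlanLimits:
    max_assistants: Optional[int]           # per organization; None means unlimited
    tpm_input: int                          # per organization, input tokens per minute
    max_total_llm_tokens: Optional[int]     # per organization; None means unlimited
    max_total_output_tokens: Optional[int]  # per organization; None means unlimited
    max_storage_bytes: int                  # per assistant
    max_files: int                          # per assistant


STARTER = PlanLimits(
    max_assistants=3,
    tpm_input=30_000,
    max_total_llm_tokens=1_500_000,
    max_total_output_tokens=200_000,
    max_storage_bytes=1 * GB,
    max_files=10,
)

STANDARD_OR_ENTERPRISE = PlanLimits(
    max_assistants=None,
    tpm_input=150_000,
    max_total_llm_tokens=None,
    max_total_output_tokens=None,
    max_storage_bytes=10 * GB,
    max_files=10_000,
)


def can_create_assistant(limits, assistants_so_far):
    """True if the organization may create one more assistant."""
    return limits.max_assistants is None or assistants_so_far < limits.max_assistants


def can_upload(limits, files_so_far, bytes_so_far, new_file_bytes):
    """True if one more file of new_file_bytes fits within an assistant's file limits."""
    return (files_so_far + 1 <= limits.max_files
            and bytes_so_far + new_file_bytes <= limits.max_storage_bytes)
```

For example, `can_upload(STARTER, 10, 0, 1000)` is false because a Starter assistant already holding 10 files has reached its file count limit.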
Pricing
See Pricing for up-to-date pricing information.