Pricing and limits
This page describes the pricing and limits of Pinecone Assistant.
Pricing
The cost of using Pinecone Assistant is determined by the following factors:
Invoice line item | Description |
---|---|
Assistants Context Tokens Processed | Number of tokens processed for context retrieval. |
Assistants Evaluation Tokens Out | Number of tokens used to calculate evaluation metrics. |
Assistants Evaluation Tokens Processed | Number of tokens used to prompt evaluation metrics. |
Assistants Hourly Count | Number of hours the assistant is available. |
Assistants Input Tokens | Number of tokens processed by the assistant. |
Assistants Output Tokens | Number of tokens output by the assistant. |
Assistants Total Storage GB/Hours | Total size of files stored in the assistant per month. |
See Pricing for up-to-date pricing information.
Token usage
Pinecone Assistant usage is measured in tokens. Input and output tokens are counted and priced separately.
Pinecone Assistant consumes input tokens for both planning and retrieval. Input token usage depends on the chat history, the document structure and data density (e.g., how many words are on a page), and the number of documents that meet the filter criteria. In general, the total number of input tokens used is the chat history token count plus on the order of 10,000 tokens for document retrieval. The maximum number of input tokens per query is 64,000.
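The rule of thumb above can be turned into a rough estimate. The following is a minimal sketch, not part of any Pinecone SDK: the function names are hypothetical, the 10,000-token retrieval figure is only the order-of-magnitude value quoted above, and comparing the estimate against the 64,000-token limit assumes that the limit applies to the combined total.

```python
# Rough input-token estimate based on the rule of thumb in this section.
# All names here are hypothetical helpers, not Pinecone APIs.

RETRIEVAL_OVERHEAD_TOKENS = 10_000    # order-of-magnitude figure, not an exact value
MAX_INPUT_TOKENS_PER_QUERY = 64_000   # per-query input limit stated on this page


def estimate_input_tokens(chat_history_tokens: int,
                          retrieval_overhead: int = RETRIEVAL_OVERHEAD_TOKENS) -> int:
    """Estimate input tokens as chat history plus retrieval overhead."""
    if chat_history_tokens < 0:
        raise ValueError("chat_history_tokens must be non-negative")
    return chat_history_tokens + retrieval_overhead


def exceeds_query_limit(chat_history_tokens: int) -> bool:
    """Return True if the estimate is above the per-query input limit.

    Assumption: the limit applies to history plus retrieval tokens combined.
    """
    return estimate_input_tokens(chat_history_tokens) > MAX_INPUT_TOKENS_PER_QUERY


if __name__ == "__main__":
    print(estimate_input_tokens(2_500))
    print(exceeds_query_limit(60_000))
```

Actual usage varies with document structure and filters, so treat the result as a planning figure rather than a billing prediction.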
Output tokens are the tokens generated as part of the answer. The total depends on the complexity of the question and on the number of retrieved documents that are relevant to it. The output typically ranges from a few dozen to several hundred tokens.
Limits
The following Pinecone Assistant limits apply to each organization and vary based on pricing plan:
Metric | Starter plan | Standard plan | Enterprise plan |
---|---|---|---|
Max number of assistants | 3 | Unlimited | Unlimited |
Max tokens per minute (TPM) input | 30,000 | 150,000 | 150,000 |
Max number of total LLM processed tokens | 1,500,000 | Unlimited | Unlimited |
Max input tokens per query | 64,000 | 64,000 | 64,000 |
Max total output tokens | 200,000 | Unlimited | Unlimited |
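To see how the input TPM limit bounds throughput, the sketch below divides each plan's limit by an assumed per-query input size. The dictionary and function names are hypothetical, the plan keys are illustrative labels, and the calculation assumes every query consumes the same number of input tokens, which real workloads will not.

```python
# Upper bound on queries per minute implied by the input TPM limits above.
# Hypothetical helper; values are copied from the limits table on this page.

TPM_INPUT_LIMITS = {
    "starter": 30_000,
    "standard": 150_000,
    "enterprise": 150_000,
}


def max_queries_per_minute(plan: str, input_tokens_per_query: int) -> int:
    """Return how many equally sized queries fit within one minute's TPM budget."""
    if input_tokens_per_query <= 0:
        raise ValueError("input_tokens_per_query must be positive")
    return TPM_INPUT_LIMITS[plan] // input_tokens_per_query


if __name__ == "__main__":
    # Assume 12,500 input tokens per query (illustrative value only).
    for plan in TPM_INPUT_LIMITS:
        print(plan, max_queries_per_minute(plan, 12_500))
```

With the assumed 12,500 input tokens per query, the Starter limit allows at most 2 such queries per minute and the Standard and Enterprise limits allow at most 12.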
The following file limits apply to each assistant and vary based on pricing plan:
Metric | Starter plan | Standard plan | Enterprise plan |
---|---|---|---|
Max file size | 10MB | 10MB | 10MB |
Max PDF file size | 10MB | 100MB | 100MB |
Max file storage | 1GB | 10GB | 10GB |
Max files uploaded | 10 | 10,000 | 10,000 |
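A client could check a file against these limits before attempting an upload. The sketch below is illustrative only: `FILE_LIMITS` and `check_upload` are hypothetical names, not Pinecone APIs, sizes are compared in megabytes as listed in the table, and treating any file whose name ends in `.pdf` as a PDF is an assumption.

```python
# Pre-upload check against the per-assistant file limits in the table above.
# Hypothetical helper; the storage limit is omitted to keep the sketch short.

FILE_LIMITS = {
    "starter":    {"max_file_mb": 10, "max_pdf_mb": 10,  "max_files": 10},
    "standard":   {"max_file_mb": 10, "max_pdf_mb": 100, "max_files": 10_000},
    "enterprise": {"max_file_mb": 10, "max_pdf_mb": 100, "max_files": 10_000},
}


def check_upload(plan: str, filename: str, size_mb: float,
                 current_file_count: int) -> list[str]:
    """Return a list of limit violations; an empty list means the upload fits."""
    limits = FILE_LIMITS[plan]
    problems = []

    # Assumption: PDFs are identified by file extension only.
    is_pdf = filename.lower().endswith(".pdf")
    size_limit = limits["max_pdf_mb"] if is_pdf else limits["max_file_mb"]
    if size_mb > size_limit:
        problems.append(f"file is {size_mb} MB; limit is {size_limit} MB")

    if current_file_count >= limits["max_files"]:
        problems.append(f"assistant already holds {current_file_count} files; "
                        f"limit is {limits['max_files']}")
    return problems


if __name__ == "__main__":
    print(check_upload("starter", "report.pdf", 50, 0))
    print(check_upload("standard", "report.pdf", 50, 0))
```

Returning a list of messages rather than a boolean lets the caller report every violated limit at once.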