This page describes the pricing and limits of Pinecone Assistant.
The Standard and Enterprise pricing plans include a monthly minimum usage commitment:
Plan | Minimum usage |
---|---|
Standard | $50/month |
Enterprise | $500/month |
Beyond the monthly minimum, customers are charged for what they use each month.
Examples
Usage below monthly minimum
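Suppose you are on the Standard plan and your usage in August totals $20, which is below the $50 monthly minimum. In this case, the August invoice would include line items for each service you used (totaling $20), plus a single line item covering the rest of the minimum usage commitment ($30).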
Usage exceeds monthly minimum
Suppose you are on the Standard plan and your usage in August totals $100, which is above the $50 monthly minimum. In this case, the August invoice would show only line items for each service you used (totaling $100). Because your usage exceeds the minimum usage commitment, you are charged only for your actual usage, and no additional minimum usage line item appears on your invoice.
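The two examples above can be reproduced with a short calculation. The following is a minimal sketch, not Pinecone's billing code: the function name `invoice_summary` and the dictionary keys are hypothetical, and the minimums are taken from the table on this page.

```python
from decimal import Decimal

# Monthly minimum usage commitments from the table above (USD).
# Plans without a listed minimum are treated as having none (assumption).
MONTHLY_MINIMUM = {
    "Standard": Decimal("50"),
    "Enterprise": Decimal("500"),
}


def invoice_summary(plan, usage_usd):
    """Return usage, the minimum-usage top-up line item, and the invoice total.

    Hypothetical helper for illustration only.
    """
    usage = Decimal(str(usage_usd))
    minimum = MONTHLY_MINIMUM.get(plan, Decimal("0"))
    # The top-up covers the gap between actual usage and the minimum;
    # it is zero once usage meets or exceeds the minimum.
    top_up = max(minimum - usage, Decimal("0"))
    return {"usage": usage, "minimum_top_up": top_up, "total": usage + top_up}


below = invoice_summary("Standard", 20)   # usage below the minimum
above = invoice_summary("Standard", 100)  # usage above the minimum
```

With $20 of usage on the Standard plan, the sketch yields a $30 top-up and a $50 total. With $100 of usage, the top-up is zero and the total equals actual usage.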
Customers who signed up for the Standard or Enterprise plan before July 1, 2025 will continue to pay a monthly platform fee until September 1, 2025. After that date, the minimum usage commitment explained above will replace the platform fee.
Plan | Platform fee | Usage credits |
---|---|---|
Standard | $25/month | $15/month |
Enterprise | $500/month | $150/month |
Usage credits do not roll over from month to month. Platform fees do not apply to organizations on the Starter plan or with annual commits.
The cost of using Pinecone Assistant is determined by the following factors:
Invoice line item | Description |
---|---|
Assistants Context Tokens Processed | Number of tokens processed for context retrieval. |
Assistants Evaluation Tokens Out | Number of tokens used to calculate evaluation metrics. |
Assistants Evaluation Tokens Processed | Number of tokens used to prompt evaluation metrics. |
Assistants Hourly Count | Number of hours the assistant is available. |
Assistants Input Tokens | Number of tokens processed by the assistant. |
Assistants Output Tokens | Number of tokens output by the assistant. |
Assistants Total Storage GB/Hours | Total size of files stored in the assistant per month. |
See Pricing for up-to-date pricing information.
Pinecone Assistant usage is measured in tokens. Input tokens and output tokens are counted and priced separately.
Pinecone Assistant consumes input tokens for both planning and retrieval. Input token usage is calculated based on the chat history, the document structure and data density (e.g., how many words are on a page), and the number of documents that meet the filter criteria. This means that, in general, the total number of input tokens used is the chat history token count plus on the order of 10,000 tokens used for document retrieval. The maximum number of input tokens per query is 64,000.
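As a rough illustration of the estimate above, the following sketch adds a fixed retrieval overhead of roughly 10,000 tokens to the chat history token count and compares the result with the 64,000-token per-query maximum. The function names are hypothetical, and the retrieval overhead is only an approximation: actual usage depends on document structure, data density, and filters.

```python
MAX_INPUT_TOKENS_PER_QUERY = 64_000  # per-query maximum stated on this page
RETRIEVAL_TOKENS_ESTIMATE = 10_000   # approximate retrieval overhead, not a fixed cost


def estimate_input_tokens(chat_history_tokens):
    """Rough estimate: chat history tokens plus the approximate retrieval overhead."""
    if chat_history_tokens < 0:
        raise ValueError("chat_history_tokens must be non-negative")
    return chat_history_tokens + RETRIEVAL_TOKENS_ESTIMATE


def likely_exceeds_query_limit(chat_history_tokens):
    """True if the rough estimate is above the per-query input maximum."""
    return estimate_input_tokens(chat_history_tokens) > MAX_INPUT_TOKENS_PER_QUERY


short_chat = estimate_input_tokens(2_500)
long_chat_over_limit = likely_exceeds_query_limit(60_000)
```

A check like this can help decide when to trim chat history before sending a query, though it is only a heuristic.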
Output tokens are the tokens generated as part of the answer. The total depends on the complexity of the question and on the number of retrieved documents that are relevant to it. The output typically ranges from a few dozen to several hundred tokens.
Pinecone Assistant limits vary based on pricing plan.
The following limits apply to each project:
Metric | Starter plan | Standard plan | Enterprise plan |
---|---|---|---|
Max number of assistants | 3 | Unlimited | Unlimited |
Max file storage | 1 GB | Unlimited | Unlimited |
Max tokens per minute (TPM) input | 30,000 | 150,000 | 150,000 |
Max number of total LLM processed tokens | 1,500,000 | Unlimited | Unlimited |
Max input tokens per query | 64,000 | 64,000 | 64,000 |
Max total output tokens | 200,000 | Unlimited | Unlimited |
Region | United States | Any | Any |
The following file limits apply to each assistant:
Metric | Starter plan | Standard plan | Enterprise plan |
---|---|---|---|
Max file size (.docx, .json, .md, .txt) | 10 MB | 10 MB | 10 MB |
Max file size (.pdf) | 10 MB | 100 MB | 100 MB |
Max metadata size per file | 1 KB | 1 KB | 1 KB |
Max files uploaded | 10 | 10,000 | 10,000 |
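The per-assistant file limits above lend themselves to a client-side check before upload. The sketch below is one possible way to encode the table. It is not part of any Pinecone SDK: `check_file` and its parameters are hypothetical, and the byte values assume decimal units (1 MB = 1,000,000 bytes, 1 KB = 1,000 bytes), which this page does not specify.

```python
from typing import List

MB = 1_000_000  # assumption: decimal megabytes
KB = 1_000      # assumption: decimal kilobytes

# Per-assistant file limits, transcribed from the table above.
FILE_LIMITS = {
    "Starter":    {"pdf_max": 10 * MB,  "other_max": 10 * MB, "metadata_max": 1 * KB, "max_files": 10},
    "Standard":   {"pdf_max": 100 * MB, "other_max": 10 * MB, "metadata_max": 1 * KB, "max_files": 10_000},
    "Enterprise": {"pdf_max": 100 * MB, "other_max": 10 * MB, "metadata_max": 1 * KB, "max_files": 10_000},
}
NON_PDF_EXTENSIONS = {".docx", ".json", ".md", ".txt"}


def check_file(plan: str, extension: str, size_bytes: int,
               metadata_bytes: int, files_already_uploaded: int) -> List[str]:
    """Return a list of limit violations; an empty list means no limit is exceeded."""
    limits = FILE_LIMITS[plan]
    ext = extension.lower()
    if ext == ".pdf":
        max_size = limits["pdf_max"]
    elif ext in NON_PDF_EXTENSIONS:
        max_size = limits["other_max"]
    else:
        # The table lists no size limit for other extensions, so this sketch stops here.
        return [f"no size limit listed for extension {ext}"]

    problems = []
    if size_bytes > max_size:
        problems.append("file size exceeds the plan limit")
    if metadata_bytes > limits["metadata_max"]:
        problems.append("metadata size exceeds the plan limit")
    if files_already_uploaded >= limits["max_files"]:
        problems.append("assistant already holds the maximum number of files")
    return problems


starter_pdf = check_file("Starter", ".pdf", 50 * MB, 500, 0)
standard_pdf = check_file("Standard", ".pdf", 50 * MB, 500, 0)
```

A 50 MB PDF passes on the Standard plan but fails on the Starter plan, matching the `.pdf` row of the table.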