This page describes the pricing and limits of Pinecone Assistant.

Minimum usage

The Standard and Enterprise pricing plans include a monthly minimum usage commitment:
| Plan | Minimum usage |
| --- | --- |
| Standard | $50/month |
| Enterprise | $500/month |
Beyond the monthly minimum, customers are charged for what they use each month.
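As a rough illustration of the minimum usage commitment, the sketch below assumes the monthly bill is the greater of the plan minimum and the actual usage. That reading is an assumption drawn from the wording above, and `monthly_charge` is a hypothetical helper, not part of any Pinecone API.

```python
# Monthly minimums (USD) as listed for the Standard and Enterprise plans.
MONTHLY_MINIMUM_USD = {"standard": 50.0, "enterprise": 500.0}


def monthly_charge(plan: str, usage_usd: float) -> float:
    """Return the billed amount, assuming (not confirmed by the page) that
    it is the actual usage but never less than the plan minimum."""
    return max(MONTHLY_MINIMUM_USD[plan], usage_usd)


print(monthly_charge("standard", 32.0))   # usage below the minimum -> 50.0
print(monthly_charge("standard", 81.5))   # usage above the minimum -> 81.5
```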
Customers who signed up for the Standard or Enterprise plan before July 1, 2025, will continue to pay a monthly platform fee until September 1, 2025. After that date, the minimum usage commitment explained above will replace the platform fee.
| Plan | Platform fee | Usage credits |
| --- | --- | --- |
| Standard | $25/month | $15/month |
| Enterprise | $500/month | $150/month |
Usage credits do not roll over from month to month. Platform fees do not apply to organizations on the Starter plan or with annual commits.
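One plausible reading of the legacy platform fee model can be sketched as follows. It assumes the platform fee is always charged, that the monthly usage credits offset usage before anything extra is billed, and that unused credits are simply lost. The page does not spell out this arithmetic, so treat `legacy_monthly_charge` as a hypothetical illustration only.

```python
# (platform fee, usage credits) in USD per month, from the legacy table.
LEGACY_PLANS = {"standard": (25.0, 15.0), "enterprise": (500.0, 150.0)}


def legacy_monthly_charge(plan: str, usage_usd: float) -> float:
    """ASSUMPTION: bill = platform fee + usage not covered by credits."""
    fee, credits = LEGACY_PLANS[plan]
    # Credits do not roll over, so leftover credits never reduce the fee.
    billable_usage = max(0.0, usage_usd - credits)
    return fee + billable_usage


print(legacy_monthly_charge("standard", 10.0))  # 25.0 (usage within credits)
print(legacy_monthly_charge("standard", 40.0))  # 50.0 (25 fee + 25 over credits)
```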

Usage cost

The cost of using Pinecone Assistant is determined by the following factors:
| Invoice line item | Description |
| --- | --- |
| Assistants Context Tokens Processed | Number of tokens processed for context retrieval. |
| Assistants Evaluation Tokens Out | Number of tokens used to calculate evaluation metrics. |
| Assistants Evaluation Tokens Processed | Number of tokens used to prompt evaluation metrics. |
| Assistants Hourly Count | Number of hours the assistant is available. |
| Assistants Input Tokens | Number of tokens processed by the assistant. |
| Assistants Output Tokens | Number of tokens output by the assistant. |
| Assistants Total Storage GB/Hours | Total size of files stored in the assistant per month. |
See Pricing for up-to-date pricing information.
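To show how the line items combine into a bill, here is a minimal estimator. Every rate in it is a made-up placeholder, since this page does not list prices; substitute the real figures from the pricing page. The name `estimate_usage_cost` and the `(USD, per N units)` rate format are assumptions for illustration.

```python
# HYPOTHETICAL rates as (USD, per this many units). NOT Pinecone's real prices.
HYPOTHETICAL_RATES = {
    "Assistants Context Tokens Processed": (1.00, 1_000_000),
    "Assistants Evaluation Tokens Out": (1.00, 1_000_000),
    "Assistants Evaluation Tokens Processed": (1.00, 1_000_000),
    "Assistants Hourly Count": (0.01, 1),
    "Assistants Input Tokens": (2.00, 1_000_000),
    "Assistants Output Tokens": (4.00, 1_000_000),
    "Assistants Total Storage GB/Hours": (0.001, 1),
}


def estimate_usage_cost(quantities: dict, rates: dict = HYPOTHETICAL_RATES) -> float:
    """Sum quantity * rate over the invoice line items that were used."""
    total = 0.0
    for item, quantity in quantities.items():
        usd, per_units = rates[item]  # KeyError flags an unknown line item
        total += quantity / per_units * usd
    return total


example_month = {
    "Assistants Input Tokens": 1_000_000,
    "Assistants Output Tokens": 500_000,
    "Assistants Hourly Count": 720,
}
print(f"${estimate_usage_cost(example_month):.2f}")  # prints $11.20
```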

Token usage

Pinecone Assistant usage is measured in tokens, with input and output tokens counted and priced separately.
For chat, tokens are defined as follows:
  • Input tokens are based on the messages sent to the assistant and the context snippets retrieved from the assistant and sent to a model. Messages sent to the assistant can include messages from the chat history in addition to the newest message. Input tokens appear as prompt_tokens in API responses and as Assistants Input Tokens on invoices.
  • Output tokens are based on the answer from the model. Output tokens appear as completion_tokens in API responses and as Assistants Output Tokens on invoices.
You can use API response details to monitor ongoing token usage, and you can control the size of context snippets retrieved from the assistant to limit token usage.
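Monitoring token usage from API responses could look like the sketch below. It assumes each chat response has been parsed into a dictionary with a `usage` object holding the `prompt_tokens` and `completion_tokens` fields named above. The `usage` key and the `TokenUsage` class are assumptions for illustration, so check the actual response shape before relying on them.

```python
from dataclasses import dataclass


@dataclass
class TokenUsage:
    """Running totals of input and output tokens across chat calls."""
    prompt_tokens: int = 0       # billed as Assistants Input Tokens
    completion_tokens: int = 0   # billed as Assistants Output Tokens

    def add(self, response: dict) -> None:
        # ASSUMPTION: token counts live under a top-level "usage" key.
        usage = response.get("usage", {})
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


tracker = TokenUsage()
tracker.add({"usage": {"prompt_tokens": 9_200, "completion_tokens": 310}})
tracker.add({"usage": {"prompt_tokens": 11_050, "completion_tokens": 420}})
print(tracker.prompt_tokens, tracker.completion_tokens, tracker.total_tokens)
```

Because chat history is resent with each new message, input totals tend to grow faster than output totals over a long conversation.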

Limits

Pinecone Assistant limits vary based on pricing plan. The following limits apply to each project:
| Metric | Starter plan | Standard plan | Enterprise plan |
| --- | --- | --- | --- |
| Max number of assistants | 3 | Unlimited | Unlimited |
| Max file storage | 1 GB | Unlimited | Unlimited |
| Max tokens per minute (TPM) input | 30,000 | 150,000 | 150,000 |
| Max number of total LLM processed tokens | 1,500,000 | Unlimited | Unlimited |
| Max input tokens per query | 64,000 | 64,000 | 64,000 |
| Max total output tokens | 200,000 | Unlimited | Unlimited |
| Region | United States | Any | Any |
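A client can guard against the input token limits before sending a query. The sketch below is a client-side approximation that tracks input tokens over a sliding 60-second window. How the service itself measures the per-minute limit is not stated here, so the sliding window and the `InputTokenBudget` class are assumptions.

```python
import time
from collections import deque

MAX_INPUT_TOKENS_PER_QUERY = 64_000
TPM_INPUT_LIMIT = {"starter": 30_000, "standard": 150_000, "enterprise": 150_000}


class InputTokenBudget:
    """Client-side guard for the per-query and per-minute input limits."""

    def __init__(self, plan: str, clock=time.monotonic):
        self.limit = TPM_INPUT_LIMIT[plan]
        self.clock = clock
        self.window = deque()  # (timestamp, tokens) pairs from the last 60 s

    def try_spend(self, tokens: int) -> bool:
        """Record the query and return True if it fits the current budget."""
        if tokens > MAX_INPUT_TOKENS_PER_QUERY:
            raise ValueError("query exceeds the 64,000 input token limit")
        now = self.clock()
        # Drop entries older than 60 seconds (ASSUMED sliding window).
        while self.window and now - self.window[0][0] >= 60.0:
            self.window.popleft()
        used = sum(spent for _, spent in self.window)
        if used + tokens > self.limit:
            return False  # caller should wait and retry later
        self.window.append((now, tokens))
        return True


budget = InputTokenBudget("starter")
print(budget.try_spend(20_000))  # True: within the 30,000 TPM Starter limit
print(budget.try_spend(20_000))  # False: would exceed 30,000 in this minute
```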
The following file limits apply to each assistant:
| Metric | Starter plan | Standard plan | Enterprise plan |
| --- | --- | --- | --- |
| Max file size (.docx, .json, .md, .txt) | 10 MB | 10 MB | 10 MB |
| Max file size (.pdf) | 10 MB | 100 MB | 100 MB |
| Max metadata size per file | 1 KB | 1 KB | 1 KB |
| Max files uploaded | 10 | 10,000 | 10,000 |
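The per-assistant file limits can be checked before an upload is attempted. This is a minimal sketch: `check_upload` is a hypothetical helper, the byte sizes assume 1 MB = 1,000,000 bytes and 1 KB = 1,000 bytes, and metadata size is assumed to mean the length of its UTF-8 encoded JSON. None of these definitions are confirmed by this page.

```python
import json
from pathlib import Path

MB = 1_000_000               # ASSUMPTION: decimal megabytes
MAX_METADATA_BYTES = 1_000   # ASSUMPTION: 1 KB = 1,000 bytes
PDF_LIMIT_MB = {"starter": 10, "standard": 100, "enterprise": 100}
MAX_FILES = {"starter": 10, "standard": 10_000, "enterprise": 10_000}
TEXT_EXTENSIONS = {".docx", ".json", ".md", ".txt"}  # 10 MB on every plan


def check_upload(filename: str, size_bytes: int, metadata: dict,
                 plan: str, files_already_uploaded: int = 0) -> list[str]:
    """Return a list of limit violations; an empty list means none found."""
    problems = []
    ext = Path(filename).suffix.lower()
    if ext == ".pdf":
        limit = PDF_LIMIT_MB[plan] * MB
    elif ext in TEXT_EXTENSIONS:
        limit = 10 * MB
    else:
        limit = None
        problems.append(f"no size limit listed for extension {ext!r}")
    if limit is not None and size_bytes > limit:
        problems.append(f"file is {size_bytes} bytes, limit is {limit}")
    # ASSUMPTION: metadata size is measured as encoded JSON.
    if len(json.dumps(metadata).encode("utf-8")) > MAX_METADATA_BYTES:
        problems.append("metadata exceeds 1 KB")
    if files_already_uploaded >= MAX_FILES[plan]:
        problems.append(f"assistant already holds {files_already_uploaded} files")
    return problems


print(check_upload("report.pdf", 50 * MB, {"team": "finance"}, "starter"))
print(check_upload("report.pdf", 50 * MB, {"team": "finance"}, "standard"))
```

The first call reports a size violation because Starter caps PDFs at 10 MB, while the second returns an empty list.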