Pinecone Assistant limits vary based on subscription plan.

Object limits

Object limits are restrictions on the number or size of assistant-related objects. Limits below are scoped per organization except for Assistants per project, which is scoped per project.
| Metric | Starter plan | Builder plan | Standard plan | Enterprise plan |
| --- | --- | --- | --- | --- |
| Assistants per project | 5 | 200 | Unlimited | Unlimited |
| File storage per org | 1 GB | 3 GB | Unlimited | Unlimited |
| Chat input tokens per org | 500,000 / month* | 2,000,000 / month | Unlimited | Unlimited |
| Chat output tokens per org | 300,000 / month | 1,000,000 / month | Unlimited | Unlimited |
| Context retrieval tokens per org | 500,000 / month | 2,000,000 / month | Unlimited | Unlimited |
| Ingestion units per org | 1,000 / month | 10,000 / month | Unlimited | Unlimited |
| File size (.docx, .json, .md, .txt) | 10 MB | 10 MB | 10 MB | 10 MB |
| File size (.pdf) | 10 MB | 50 MB | 100 MB | 100 MB |
| Metadata size per file | 16 KB | 16 KB | 16 KB | 16 KB |

*1,000,000 input tokens/month to explore Marketplace apps until June 30, 2026.

Additionally, the following limits apply to multimodal PDFs (currently in public preview):

Multimodal PDF processing uses the same ingestion unit as standard uploads; it is billed at about twice the standard per-unit rate (see Pricing and limits). Object and rate limits for assistants also apply; see #limits and #rate-limits.

| Metric | Starter plan | Builder plan | Standard plan | Enterprise plan |
| --- | --- | --- | --- | --- |
| Max file size | 10 MB | 10 MB | 50 MB | 50 MB |
| Page limit | 100 | 100 | 100 | 100 |
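
Checking a file against these limits before uploading avoids a rejected request. The following is a minimal sketch in Python, not part of any Pinecone SDK: the function names `max_file_bytes` and `upload_problems` are hypothetical, the plan keys are illustrative, and it assumes 1 MB means 1,000,000 bytes because this page does not say whether the limits are decimal or binary.

```python
import os

# Assumption: 1 MB is counted as 1,000,000 bytes. The page does not say whether
# limits are decimal or binary, so the smaller (safer) reading is used here.
MB = 1_000_000

PLANS = ("starter", "builder", "standard", "enterprise")

# Values copied from the tables above, in MB, in the same order as PLANS.
PDF_MB = dict(zip(PLANS, (10, 50, 100, 100)))
MULTIMODAL_PDF_MB = dict(zip(PLANS, (10, 10, 50, 50)))
OTHER_FILE_MB = 10  # .docx, .json, .md, .txt on every plan
OTHER_EXTENSIONS = {".docx", ".json", ".md", ".txt"}
MULTIMODAL_PAGE_LIMIT = 100


def max_file_bytes(filename, plan, multimodal=False):
    """Return the listed size limit in bytes for this file type and plan."""
    ext = os.path.splitext(filename)[1].lower()
    if ext == ".pdf":
        table = MULTIMODAL_PDF_MB if multimodal else PDF_MB
        return table[plan] * MB
    if multimodal:
        raise ValueError("multimodal limits apply only to .pdf files")
    if ext in OTHER_EXTENSIONS:
        return OTHER_FILE_MB * MB
    raise ValueError(f"no size limit listed for {ext!r} files")


def upload_problems(filename, size_bytes, plan, multimodal=False, pages=None):
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    limit = max_file_bytes(filename, plan, multimodal)
    if size_bytes > limit:
        problems.append(f"{size_bytes} bytes exceeds the {limit}-byte limit")
    # The page limit is only listed for multimodal PDFs.
    if multimodal and pages is not None and pages > MULTIMODAL_PAGE_LIMIT:
        problems.append(f"{pages} pages exceeds the {MULTIMODAL_PAGE_LIMIT}-page limit")
    return problems
```

For example, `upload_problems("report.pdf", 60_000_000, "standard", multimodal=True)` reports a size violation, while the same file passes as a standard (non-multimodal) upload on that plan.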

Rate limits

Rate limits help protect your applications from misuse and maintain the health of our shared infrastructure. These limits are designed to support typical production workloads while ensuring reliable performance for all users. Most rate limits can be adjusted upon request. If you need higher limits to scale your application, contact Support with details about your use case. Requests that exceed a rate limit fail and return a 429 - TOO_MANY_REQUESTS status.
To handle rate limits, implement retry logic with exponential backoff.
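
One way to implement such retry logic is sketched below in Python. It is deliberately independent of any client library: `RateLimitedError` is a hypothetical stand-in for whatever exception your HTTP client or SDK raises on a 429 response, and the retry count and delay values are illustrative defaults rather than recommendations from this page. The sketch uses randomized ("full jitter") delays so that many clients do not retry in lockstep.

```python
import random
import time


class RateLimitedError(Exception):
    """Hypothetical stand-in for the error your client raises on HTTP 429."""


def call_with_backoff(request, max_retries=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call request(); on RateLimitedError, wait and retry with exponential backoff.

    The wait ceiling doubles on each attempt (base_delay, 2x, 4x, ...) and is
    capped at max_delay. The actual wait is drawn uniformly from [0, ceiling].
    """
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimitedError:
            if attempt == max_retries:
                raise  # retries exhausted; surface the error to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))
```

Wrap each rate-limited call in a zero-argument function or lambda, for example `call_with_backoff(lambda: send_chat(message))`, where `send_chat` is your own function that raises `RateLimitedError` on a 429.
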
| Metric | Starter plan | Builder plan | Standard plan | Enterprise plan |
| --- | --- | --- | --- | --- |
| Assistant list/get requests per minute | 40 | 50 | 100 | 500 |
| Assistant create/update requests per minute | 20 | 25 | 50 | 100 |
| Assistant delete requests per minute | 20 | 25 | 50 | 100 |
| File get requests per minute | 100 | 150 | 300 | 6,000 |
| File list requests per minute | 50 | 75 | 150 | 3,000 |
| File upload requests per minute | 5 | 15 | 20 | 300 |
| Multimodal PDF upload requests per minute | 5 | 10 | 20 | 40 |
| File delete requests per minute | 5 | 15 | 20 | 300 |
| Chat input tokens per minute | 100,000 | 200,000 | 300,000 | 1,000,000 |
| Chat history tokens per query | 64,000 | 64,000 | 64,000 | 64,000 |
| Evaluation input tokens per minute | Not available | Not available | 150,000 | 500,000 |
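
Besides retrying after a 429, a client can pace itself so that it rarely reaches a per-minute token limit. The sketch below tracks tokens sent in a sliding 60-second window. It rests on several assumptions: this page does not say whether the service measures a fixed or a sliding window, the caller must supply its own token estimate for each request, and the class name `TokenWindow` is hypothetical. Treat it as a client-side courtesy check, not a guarantee against 429 responses.

```python
import time
from collections import deque

# Chat input tokens per minute, copied from the table above.
CHAT_INPUT_TOKENS_PER_MINUTE = {
    "starter": 100_000,
    "builder": 200_000,
    "standard": 300_000,
    "enterprise": 1_000_000,
}


class TokenWindow:
    """Client-side sliding-window budget for tokens sent per minute."""

    def __init__(self, limit, window_seconds=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        self._events = deque()  # (timestamp, tokens) in arrival order
        self._used = 0          # tokens currently counted in the window

    def _expire(self, now):
        # Drop entries that are at least one full window old.
        while self._events and now - self._events[0][0] >= self.window:
            _, tokens = self._events.popleft()
            self._used -= tokens

    def try_acquire(self, tokens):
        """Reserve tokens if they fit in the current window; return True on success."""
        now = self.clock()
        self._expire(now)
        if self._used + tokens > self.limit:
            return False
        self._events.append((now, tokens))
        self._used += tokens
        return True
```

A caller would check `window.try_acquire(estimated_tokens)` before each chat request and wait briefly when it returns `False`. Combining this with retry on 429 covers the cases where the estimate or the window model is off.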