Prepare your project structure
One of the first steps towards building a production-ready Pinecone index is configuring your project correctly.

- Consider creating separate projects for your development and production indexes, so you can test changes before deploying them to production.
- Ensure that you have properly configured user access to the Pinecone console, so that only users who need to access the production index can do so.
- Ensure that you have properly configured access through the API by managing API keys and using API key permissions (see the sketch below).
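As an illustrative sketch, one way to select a project-scoped API key per environment; the environment variable names and the `APP_ENV` convention here are assumptions, not part of Pinecone's API:

```python
import os

from pinecone import Pinecone

# Assumed convention: one API key per project, stored in separate
# environment variables (names here are hypothetical).
ENV_TO_KEY = {
    "development": "PINECONE_API_KEY_DEV",
    "production": "PINECONE_API_KEY_PROD",
}

env = os.environ.get("APP_ENV", "development")
pc = Pinecone(api_key=os.environ[ENV_TO_KEY[env]])
```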
Enforce security
Use Pinecone’s security features to protect your production data:

- Data security
  - Private endpoints
  - Customer-managed encryption keys (CMEK)
- Authorization
  - API keys
  - Role-based access control (RBAC)
- Organization single sign-on (SSO)
- Audit logs
- Bring your own cloud
Design your indexes for scale
Follow these best practices when designing and populating your indexes:

- Data ingestion: For large datasets (10M+ records), import from object storage for the most efficient and cost-effective ingestion. For ongoing ingestion, upsert in batches to optimize speed and efficiency. See the data ingestion overview for details.
- Dimensionality: Consider the dimensionality of your vectors. Higher dimensions can offer more accuracy but require more resources.
- Data modeling: Use structured IDs (e.g., `document_id#chunk_number`) for efficient operations. Design metadata to support filtering, linking related chunks, and traceability. See the data modeling guide for details.
- Namespaces: Use namespaces to keep data separate among tenants, rather than creating multiple indexes for this purpose. Namespaces are more efficient and more affordable in the long run (see the sketch after this list).
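A minimal sketch combining these practices: batched upserts with structured record IDs and a per-tenant namespace. The index name, namespace, and placeholder embeddings are assumptions for illustration:

```python
import random

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("docs-index")  # hypothetical index name

# Placeholder embeddings standing in for a real embedding model.
chunk_embeddings = [[random.random() for _ in range(1536)] for _ in range(250)]

# One record per chunk, with structured IDs: document_id#chunk_number.
records = [
    {
        "id": f"doc42#{i}",
        "values": embedding,
        "metadata": {"document_id": "doc42", "chunk_number": i},
    }
    for i, embedding in enumerate(chunk_embeddings)
]

# Upsert in batches rather than one record at a time.
BATCH_SIZE = 100
for start in range(0, len(records), BATCH_SIZE):
    index.upsert(
        vectors=records[start : start + BATCH_SIZE],
        namespace="tenant-a",  # one namespace per tenant (assumed scheme)
    )
```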
Understand database limits
Architect your application to work within Pinecone’s database limits:

- Rate limits: Serverless indexes have per-second operation limits for queries, upserts, updates, and deletes. Implement error handling with exponential backoff to handle rate limit errors gracefully (see the sketch under Implement error handling below).
- Size limits: Be aware of constraints on vector dimensionality, metadata size per record, record ID length, maximum `top_k` values, and query result sizes. Design your data model accordingly (a validation sketch follows this list).
- Index limits: Plan for index capacity based on your plan tier. Use namespaces to partition data within indexes rather than creating multiple indexes.
- Plan limits: Starter plans have monthly read/write unit limits. Upgrade to Standard or Enterprise for unlimited read/write units and higher throughput needs.
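One way to guard against size limits is to validate records before upserting. The threshold values in this sketch are assumptions for illustration; replace them with the documented limits for your plan and index type:

```python
import json

# Assumed limit values for illustration only; check Pinecone's documented
# limits for your plan and index type before relying on these numbers.
MAX_ID_LENGTH = 512          # characters per record ID (assumed)
MAX_METADATA_BYTES = 40_960  # metadata bytes per record (assumed)
EXPECTED_DIMENSION = 1536    # must match the index's dimension (assumed)

def validate_record(record: dict) -> None:
    """Raise ValueError if a record would exceed an assumed size limit."""
    if len(record["id"]) > MAX_ID_LENGTH:
        raise ValueError(f"ID too long: {record['id'][:32]}...")
    if len(record["values"]) != EXPECTED_DIMENSION:
        raise ValueError(f"Bad dimension: {len(record['values'])}")
    metadata_bytes = len(json.dumps(record.get("metadata", {})).encode())
    if metadata_bytes > MAX_METADATA_BYTES:
        raise ValueError(f"Metadata too large: {metadata_bytes} bytes")
```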
Test your query results
Before you move your index to production, make sure that your index is returning accurate results in the context of your application by identifying the appropriate metrics for evaluating your results.
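For example, a minimal sketch of one common metric, recall@k, which compares retrieved IDs against a hand-labeled ground-truth set (the labeled data here is hypothetical):

```python
def recall_at_k(retrieved_ids: list[str], relevant_ids: set[str], k: int) -> float:
    """Fraction of relevant records that appear in the top-k results."""
    if not relevant_ids:
        return 0.0
    hits = sum(1 for rid in retrieved_ids[:k] if rid in relevant_ids)
    return hits / len(relevant_ids)

# Hypothetical labeled example: which records should match this query.
retrieved = ["doc42#3", "doc17#0", "doc42#4", "doc99#1"]
relevant = {"doc42#3", "doc42#4", "doc8#2"}
print(recall_at_k(retrieved, relevant, k=4))  # 2/3 ≈ 0.67
```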
Optimize performance

Before serving production workloads, optimize your Pinecone implementation:

- Increase search relevance: Use techniques like reranking, metadata filtering, hybrid search, and chunking strategies to improve result quality. See increase search relevance for details.
- Increase throughput: Import from object storage, upsert in batches, use parallel operations, and leverage Python SDK optimizations like gRPC. See increase throughput for details.
- Decrease latency: Use namespaces, filter by metadata, target indexes by host, reuse connections, and deploy in the same cloud region as your index. See decrease latency for details (a query sketch follows this list).
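A sketch of a query that targets the index by host and combines a namespace with a metadata filter to narrow the search space; the host value, namespace, and metadata field are assumptions:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Target the index by host (hypothetical host value) to avoid an
# extra lookup of the index endpoint.
index = pc.Index(host="docs-index-abc123.svc.example.pinecone.io")

results = index.query(
    vector=[0.1] * 1536,  # placeholder query embedding
    top_k=10,
    namespace="tenant-a",  # search only this tenant's records
    filter={"document_id": {"$eq": "doc42"}},  # assumed metadata field
    include_metadata=True,
)
```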
Back up your indexes

To enable long-term retention, compliance archiving, and deployment of new indexes, consider backing up your production indexes by creating a backup or collection.
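For a pod-based index, a static copy can be created as a collection; a minimal sketch, where the collection and index names are assumptions:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Create a static snapshot of a pod-based index as a collection
# (names here are hypothetical).
pc.create_collection(name="docs-index-2024-backup", source="docs-index")
```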
Implement error handling

Prepare your application to handle errors gracefully:

- Implement error handling and retry logic with exponential backoff (see the sketch below)
- Handle different error types appropriately (4xx vs 5xx)
- Monitor error rates and set up alerts
- Check status.pinecone.io before escalating issues
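A minimal retry sketch with exponential backoff that retries rate-limit (429) and server (5xx) errors while failing fast on other client errors. The exception import path and `status` attribute reflect the Python SDK but should be verified against the SDK version you use:

```python
import random
import time

from pinecone.exceptions import PineconeApiException

def with_retries(operation, max_attempts=5, base_delay=1.0):
    """Run `operation`, retrying retryable errors with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except PineconeApiException as e:
            status = getattr(e, "status", None)
            retryable = status == 429 or (status is not None and status >= 500)
            if not retryable or attempt == max_attempts - 1:
                raise  # non-retryable 4xx, or retries exhausted
            # Exponential backoff with jitter: ~1s, 2s, 4s, ... plus noise.
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)

# Usage (index defined elsewhere):
# with_retries(lambda: index.upsert(vectors=batch, namespace="tenant-a"))
```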