This page provides recommendations and best practices for preparing your Pinecone indexes for production, anticipating production issues, and enabling reliability and growth.

Prepare your project structure

One of the first steps towards building a production-ready Pinecone index is configuring your project correctly.
  • Consider creating separate projects for your development and production indexes, so that you can test changes to an index before deploying them to production.
  • Ensure that you have properly configured user access to the Pinecone console, so that only those users who need to access the production index can do so.
  • Ensure that you have properly configured access through the API by managing API keys and using API key permissions.
Consider how best to manage the API keys associated with your production project. In order to make calls to the Pinecone API, you must provide a valid API key for the relevant Pinecone project.
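
One way to keep development and production keys apart is to read them from environment variables instead of hard-coding them. The sketch below is only an illustration: the variable names `PINECONE_API_KEY_DEV` and `PINECONE_API_KEY_PROD` and the `load_api_key` helper are hypothetical, not a Pinecone convention.

```python
import os

# Hypothetical environment variable names; adapt them to your own secret management.
_KEY_VARS = {
    "development": "PINECONE_API_KEY_DEV",
    "production": "PINECONE_API_KEY_PROD",
}


def load_api_key(environment, env=None):
    """Return the API key for the given environment, or raise if it is missing."""
    env = os.environ if env is None else env
    if environment not in _KEY_VARS:
        raise ValueError(f"unknown environment: {environment!r}")
    var_name = _KEY_VARS[environment]
    key = env.get(var_name, "").strip()
    if not key:
        # Fail fast instead of sending requests with an empty key.
        raise RuntimeError(f"{var_name} is not set")
    return key
```

Failing at startup when a key is missing tends to be easier to diagnose than an authentication error on the first request.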

Enforce security

Use Pinecone’s security features to protect your production data:
  • Data security
    • Private endpoints
    • Customer-managed encryption keys (CMEK)
  • Authorization
    • API keys
    • Role-based access control (RBAC)
    • Organization single sign-on (SSO)
  • Audit logs
  • Bring your own cloud

Design your indexes for scale

Follow these best practices when designing and populating your indexes:

Understand database limits

Architect your application to work within Pinecone’s database limits:
  • Rate limits: Serverless indexes have per-second operation limits for queries, upserts, updates, and deletes. Implement error handling with exponential backoff to handle rate limit errors gracefully.
  • Size limits: Be aware of constraints on vector dimensionality, metadata size per record, record ID length, maximum top_k values, and query result sizes. Design your data model accordingly.
  • Index limits: Plan for index capacity based on your plan tier. Use namespaces to partition data within indexes rather than creating multiple indexes.
  • Plan limits: Starter plans have monthly read/write unit limits. Upgrade to Standard or Enterprise for unlimited read/write units and higher throughput needs.

Test your query results

Before you move your index to production, make sure that your index is returning accurate results in the context of your application by identifying the appropriate metrics for evaluating your results.
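
As an illustration of one such metric, the sketch below computes recall@k against a hand-labeled set of relevant record IDs. It assumes you have already collected, for each test query, the ranked IDs returned by your index and the IDs you consider relevant. The function names are illustrative, and recall@k is only one of several metrics you might choose.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of relevant IDs that appear in the top-k retrieved IDs."""
    if k <= 0:
        raise ValueError("k must be positive")
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = relevant.intersection(retrieved_ids[:k])
    return len(hits) / len(relevant)


def mean_recall_at_k(results, k):
    """Average recall@k over a list of (retrieved_ids, relevant_ids) pairs."""
    scores = [recall_at_k(retrieved, relevant, k) for retrieved, relevant in results]
    return sum(scores) / len(scores) if scores else 0.0
```

Tracking the mean over a fixed set of test queries makes it possible to compare index or embedding changes before they reach production.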

Optimize performance

Before serving production workloads, optimize your Pinecone implementation:
  • Increase search relevance: Use techniques like reranking, metadata filtering, hybrid search, and chunking strategies to improve result quality. See increase search relevance for details.
  • Increase throughput: Import from object storage, upsert in batches, use parallel operations, and leverage Python SDK optimizations like gRPC. See increase throughput for details.
  • Decrease latency: Use namespaces, filter by metadata, target indexes by host, reuse connections, and deploy in the same cloud region as your index. See decrease latency for details.
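
To illustrate the "upsert in batches" recommendation, here is a minimal sketch of a batching helper. `upsert_fn` is a hypothetical callable standing in for whatever upsert call your client exposes, and the default batch size of 100 is an assumption for the example, not a documented limit.

```python
from itertools import islice


def batched(records, batch_size):
    """Yield successive lists of at most batch_size records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def upsert_in_batches(upsert_fn, records, batch_size=100):
    """Call upsert_fn once per batch and return the number of records sent.

    upsert_fn is a hypothetical stand-in for your client's upsert call.
    """
    total = 0
    for batch in batched(records, batch_size):
        upsert_fn(batch)
        total += len(batch)
    return total
```

Because `batched` works on any iterable, records can be streamed from a file or generator without loading them all into memory first.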

Back up your indexes

In order to enable long-term retention, compliance archiving, and deployment of new indexes, consider backing up your production indexes by creating a backup or collection.

Implement error handling

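Prepare your application to handle errors gracefully. For rate limit errors in particular, retry with exponential backoff rather than failing or retrying immediately.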
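
Retrying with exponential backoff, as recommended for rate limit errors, could be sketched as follows. `RateLimitError` is a hypothetical stand-in for whichever exception your client raises when a request is rate limited, and the attempt count and delay values are assumptions chosen for illustration.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate limit error your client raises."""


def with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                 sleep=time.sleep):
    """Run operation(), retrying on RateLimitError with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return operation()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts; surface the error to the caller
            # The delay ceiling doubles on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * (2 ** attempt))
            # Jitter: sleep a random amount up to the ceiling to avoid synchronized retries.
            sleep(random.uniform(0, delay))
```

The `sleep` parameter exists only so the delay can be replaced in tests. In application code, wrap the call you want to protect in a function or lambda and pass it as `operation`.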

Configure monitoring

Prepare to monitor the production performance and availability of your indexes.

Configure CI/CD

Use Pinecone in CI/CD to safely test changes before deploying them to production.

Know how to get support

If you need help, contact Support, or talk to the Pinecone community. Ensure that your plan tier matches the support and availability SLAs you need. This may require you to upgrade to Enterprise.