Customers who sign up for a Standard or Enterprise plan on or after August 18, 2025 cannot create pod-based indexes. Instead, create serverless indexes, and consider using dedicated read nodes for large workloads (millions of records or more, and moderate or high query rates).
Describe a pod-based index
Use the describe_index endpoint to get a complete description of a specific index:
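For example, with the Python SDK (the index name here is illustrative):

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Returns the index's configuration, including its unique DNS host.
index_description = pc.describe_index("docs-example")
print(index_description)
```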
Do not target an index by name in production. When you target an index by name for data operations such as upsert and query, the SDK gets the unique DNS host for the index using the describe_index operation. This is convenient for testing but should be avoided in production, because describe_index uses a different API than data operations and therefore adds an extra network call and point of failure. Instead, get the index host once and cache it for reuse, or specify the host directly.
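For example, a minimal sketch with the Python SDK that resolves the host once and reuses it for data operations (index name, dimension, and namespace are illustrative):

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Look up the host once (e.g., at application startup) and cache it,
# instead of resolving it on every data operation.
index_host = pc.describe_index("docs-example").host

# Target the index directly by host for data operations.
index = pc.Index(host=index_host)
index.upsert(vectors=[("id-1", [0.1] * 1536)], namespace="example-namespace")
```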
Delete a pod-based index

Use the delete_index operation to delete a pod-based index and all of its associated resources.
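For example, with the Python SDK (index name illustrative):

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Deletes the index and all of its data.
pc.delete_index("docs-example")
```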
You are billed for a pod-based index even when it is not in use.
If deletion protection is enabled on an index, requests to delete it fail and return a 403 - FORBIDDEN status with an error indicating that deletion protection is enabled.
You can delete an index using the Pinecone console. For the index you want to delete, click the three dots to the right of the index name, then click Delete.
Selective metadata indexing
For pod-based indexes, Pinecone indexes all metadata fields by default. When metadata fields contain many unique values, pod-based indexes consume significantly more memory, which can lead to performance issues, pod fullness, and a reduction in the number of vectors that fit per pod. To avoid indexing high-cardinality metadata that is not needed for filtering your queries and to keep memory utilization low, specify which metadata fields to index using the metadata_config parameter.
Since high-cardinality metadata does not cause high memory utilization in serverless indexes, selective metadata indexing is not supported.
The metadata_config parameter is a JSON object containing the names of the metadata fields to index. For example, an index configured to index only the genre metadata field behaves as follows: queries against this index that filter for the genre metadata field may return results; queries that filter for other metadata fields behave as though those fields do not exist.
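A sketch with the Python SDK: for pod-based indexes, the fields to index are listed under the indexed key of metadata_config (index name, dimension, and environment are illustrative):

```python
from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key="YOUR_API_KEY")

pc.create_index(
    name="docs-example",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-west1-gcp",
        pod_type="p1.x1",
        # Index only the "genre" metadata field; other fields are stored
        # but not indexed, so they cannot be used in query filters.
        metadata_config={"indexed": ["genre"]},
    ),
)
```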
Prevent index deletion
This feature requires Pinecone API version 2024-07, Python SDK v5.0.0, Node.js SDK v3.0.0, Java SDK v2.0.0, or Go SDK v1.0.0 or later.

To prevent an index and its data from accidental deletion, set the deletion_protection parameter to enabled.
To enable deletion protection when creating a new index:
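A minimal sketch with the Python SDK; the index name, dimension, and pod environment are illustrative:

```python
from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key="YOUR_API_KEY")

pc.create_index(
    name="docs-example",
    dimension=1536,
    metric="cosine",
    deletion_protection="enabled",  # requests to delete this index will fail
    spec=PodSpec(environment="us-west1-gcp", pod_type="p1.x1"),
)
```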
With deletion protection enabled, requests to delete the index fail and return a 403 - FORBIDDEN status with an error indicating that deletion protection is enabled.
Disable deletion protection
Before you can delete an index with deletion protection enabled, you must first disable deletion protection as follows:
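For example, with the Python SDK:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Turn off deletion protection so the index can be deleted.
pc.configure_index("docs-example", deletion_protection="disabled")
```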
Delete an entire namespace

In pod-based indexes, reads and writes share compute resources, so deleting an entire namespace with many records can increase the latency of read operations. In such cases, consider deleting records in batches.
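A minimal sketch with the Python SDK; INDEX_HOST and the namespace name are placeholders:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index(host="INDEX_HOST")  # use your cached index host

# Delete every record in the namespace.
index.delete(delete_all=True, namespace="example-namespace")
```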
Delete records in batches

In pod-based indexes, reads and writes share compute resources, so deleting an entire namespace or a large number of records can increase the latency of read operations. To avoid this, delete records in batches of up to 1,000, with a brief sleep between requests. Use smaller batches if the index has active read traffic.
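A minimal sketch of batched deletion with the Python SDK; the ID list, batch size, and 0.5-second pause are illustrative:

```python
import time

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index(host="INDEX_HOST")  # use your cached index host

# Hypothetical list of record IDs targeted for deletion.
ids_to_delete = [f"id-{i}" for i in range(250_000)]

BATCH_SIZE = 1000  # delete at most 1,000 records per request

for start in range(0, len(ids_to_delete), BATCH_SIZE):
    index.delete(
        ids=ids_to_delete[start:start + BATCH_SIZE],
        namespace="example-namespace",
    )
    time.sleep(0.5)  # brief pause so reads are not starved of compute
```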
Delete records by metadata

In pod-based indexes, if you are targeting a large number of records for deletion and the index has active read traffic, consider deleting records in batches.
To delete records based on their metadata, pass a metadata filter expression to the delete operation. This deletes all records in the namespace that match the filter expression. For example, the following code deletes all records with a genre field set to documentary from namespace example-namespace:
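A sketch with the Python SDK, using Pinecone's $eq metadata filter operator:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index(host="INDEX_HOST")  # use your cached index host

# Delete all records whose "genre" metadata field equals "documentary".
index.delete(
    filter={"genre": {"$eq": "documentary"}},
    namespace="example-namespace",
)
```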
Tag an index
When configuring an index, you can tag the index to help with index organization and management. For more details, see Tag an index.

Manage costs
Set a project pod limit
To control costs, project owners can set the maximum total number of pods allowed across all pod-based indexes in a project. The default pod limit is 5.

- Go to Settings > Projects.
- For the project you want to update, click the ellipsis (…) menu > Configure.
- In the Pod Limit section, update the number of pods.
- Click Save Changes.
Back up inactive pod-based indexes
For each pod-based index, billing is determined by the per-minute price per pod and the number of pods the index uses, regardless of index activity. When a pod-based index is not in use, back it up using collections and delete the inactive index. When you’re ready to use the vectors again, you can create a new index from the collection. This new index can also use a different index type or size. Because it’s relatively cheap to store collections, you can reduce costs by only running an index when it’s in use.
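A minimal sketch of this workflow with the Python SDK; the index name, dimension, environment, pod type, and collection name are illustrative:

```python
from pinecone import Pinecone, PodSpec

pc = Pinecone(api_key="YOUR_API_KEY")

# Back up the inactive index to a collection. Wait until the collection
# is ready before deleting the source index.
pc.create_collection(name="docs-example-backup", source="docs-example")
pc.delete_index("docs-example")

# Later, recreate the index from the collection. The new index can use a
# different pod type or size than the original.
pc.create_index(
    name="docs-example",
    dimension=1536,
    metric="cosine",
    spec=PodSpec(
        environment="us-west1-gcp",
        pod_type="s1.x1",
        source_collection="docs-example-backup",
    ),
)
```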
Choose the right index type and size

Pod sizes are designed for different applications, and some are more expensive than others. Choose the appropriate pod type and size so you pay only for the resources you need. For example, the s1 pod type provides large storage capacity and lower overall costs, with slightly higher query latencies than p1 pods. By switching to a different pod type, you may be able to reduce costs while still getting the performance your application needs.
For pod-based indexes, project owners can set limits for the total number of pods across all indexes in the project. The default pod limit is 5.
Monitor performance
Pinecone generates time-series performance metrics for each Pinecone index. You can monitor these metrics directly in the Pinecone console or with tools like Prometheus or Datadog.

Use the Pinecone console

To view performance metrics in the Pinecone console:

- Open the Pinecone console.
- Select the project containing the index you want to monitor.
- Go to Database > Indexes.
- Select the index.
- Go to the Metrics tab.
Use Datadog
To monitor Pinecone with Datadog, use Datadog’s Pinecone integration. This feature is available on Standard and Enterprise plans.
Use Prometheus
This feature is available on Standard and Enterprise plans. When using Bring Your Own Cloud, you must configure Prometheus monitoring within your VPC.

To configure the integration, add a job to the scrape_configs section of your prometheus.yml file and update it with values for your Prometheus integration:
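A minimal sketch of such a job, assuming the metrics endpoint host has the form metrics.ENVIRONMENT.pinecone.io; confirm the exact target for your environment before use:

```yaml
scrape_configs:
  - job_name: "pinecone"
    scheme: https
    metrics_path: "/metrics"
    authorization:
      credentials: "API_KEY"
    static_configs:
      # Assumed target host; verify the metrics endpoint for your
      # environment in the Pinecone console or documentation.
      - targets: ["metrics.ENVIRONMENT.pinecone.io"]
```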
- Replace API_KEY with an API key for the project you want to monitor. If necessary, you can create a new API key in the Pinecone console.
- Replace ENVIRONMENT with the environment of the pod-based indexes you want to monitor.
Available metrics
The following metrics are available when you integrate Pinecone with Prometheus:

| Name | Type | Description |
|---|---|---|
| pinecone_vector_count | gauge | The number of records per pod in the index. |
| pinecone_request_count_total | counter | The number of data plane calls made by clients. |
| pinecone_request_error_count_total | counter | The number of data plane calls made by clients that resulted in errors. |
| pinecone_request_latency_seconds | histogram | The distribution of server-side processing latency for Pinecone data plane calls. |
| pinecone_index_fullness | gauge | The fullness of the index on a scale of 0 to 1. |
Metric labels

Each metric contains the following labels:

| Label | Description |
|---|---|
| pid | Process identifier. |
| index_name | Name of the index to which the metric applies. |
| project_name | Name of the project containing the index. |
| request_type | Type of request: upsert, delete, fetch, query, or describe_index_stats. Included only in pinecone_request_* metrics. |
Example queries

- Return the average latency in seconds for all requests against the Pinecone index docs-example.
- Return the record count for the index docs-example.
- Return the total number of requests against the index docs-example over one minute.
- Return the total number of upsert requests against the index docs-example over one minute.
- Return the total number of errors returned by the index docs-example over one minute.
- Return the index fullness metric for the index docs-example.
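Minimal PromQL sketches for these queries, assuming the standard _sum and _count series that Prometheus derives from histogram metrics; the index name and aggregation choices are illustrative:

```promql
# Average latency in seconds for all requests against docs-example
avg(pinecone_request_latency_seconds_sum{index_name="docs-example"}
  / pinecone_request_latency_seconds_count{index_name="docs-example"})

# Record count for docs-example
sum(pinecone_vector_count{index_name="docs-example"})

# Total requests against docs-example over one minute
sum(increase(pinecone_request_count_total{index_name="docs-example"}[1m]))

# Total upsert requests against docs-example over one minute
sum(increase(pinecone_request_count_total{index_name="docs-example", request_type="upsert"}[1m]))

# Total errors returned by docs-example over one minute
sum(increase(pinecone_request_error_count_total{index_name="docs-example"}[1m]))

# Index fullness (0 to 1) for docs-example
max(pinecone_index_fullness{index_name="docs-example"})
```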
Troubleshooting
Index fullness errors
Serverless indexes automatically scale as needed. However, pod-based indexes can run out of capacity. When that happens, upserts of new records fail with an error indicating that the index is full.
High-cardinality metadata and over-provisioning
This Loom video walkthrough shows you how to manage two scenarios:

- The first scenario involves customers loading an index with high-cardinality metadata. This can cause unforeseen problems, so it is important to know how to handle it. The same approach applies whenever you need to change your metadata configuration.
- The second scenario involves customers who have over-provisioned the number of pods they need. Specifically, it covers re-scaling an index when a customer has previously scaled vertically and now wants to scale the index back down.