Customers who sign up for a Standard or Enterprise plan on or after August 18, 2025 cannot create pod-based indexes. Instead, create serverless indexes, and consider using dedicated read capacity for large workloads (millions of records or more, and moderate or high query rates).
This page shows you how to manage pod-based indexes. For guidance on serverless indexes, see Manage serverless indexes.

Describe a pod-based index

Use the describe_index operation to get a complete description of a specific index:
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

pc.describe_index(name="docs-example")

# Response:
# {'dimension': 1536,
#  'host': 'docs-example-4mkljsz.svc.aped-4627-b74a.pinecone.io',
#  'metric': 'cosine',
#  'name': 'docs-example',
#  'spec': {'pod': {'environment': 'us-east-1-aws',
#                   'pod_type': 's1.x1',
#                   'pods': 1,
#                   'replicas': 1,
#                   'shards': 1}},
#  'status': {'ready': True, 'state': 'Ready'}}

Delete a pod-based index

Use the delete_index operation to delete a pod-based index and all of its associated resources.
You are billed for a pod-based index even when it is not in use.
# pip install "pinecone[grpc]"
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

pc.delete_index(name="docs-example")
If deletion protection is enabled on an index, requests to delete it will fail and return a 403 - FORBIDDEN status with the following error:
Deletion protection is enabled for this index. Disable deletion protection before retrying.
Before you can delete such an index, you must first disable deletion protection.
You can delete an index using the Pinecone console. For the index you want to delete, click the three dots to the right of the index name, then click Delete.

Selective metadata indexing

For pod-based indexes, Pinecone indexes all metadata fields by default. When metadata fields contain many unique values, a pod-based index consumes significantly more memory, which can lead to performance issues, pod fullness, and a reduction in the number of vectors that fit per pod. To avoid indexing high-cardinality metadata that is not needed for filtering your queries, and to keep memory utilization low, specify which metadata fields to index using the metadata_config parameter.
Because high-cardinality metadata does not cause high memory utilization in serverless indexes, selective metadata indexing is not supported for serverless indexes.
The value for the metadata_config parameter is a JSON object containing the names of the metadata fields to index.
JSON
{
    "indexed": [
        "metadata-field-1",
        "metadata-field-2",
        "metadata-field-n"
    ]
}
The following example creates a pod-based index that indexes only the genre metadata field. Queries against this index that filter on the genre metadata field may return results; queries that filter on other metadata fields behave as though those fields do not exist.
from pinecone.grpc import PineconeGRPC as Pinecone, PodSpec

pc = Pinecone(api_key="YOUR_API_KEY")

pc.create_index(
  name="docs-example",
  dimension=1536,
  metric="cosine",
  spec=PodSpec(
    environment="us-west1-gcp",
    pod_type="p1.x1",
    pods=1,
    metadata_config={
      "indexed": ["genre"]
    }
  ),
  deletion_protection="disabled"
)

Prevent index deletion

This feature requires Pinecone API version 2024-07, Python SDK v5.0.0, Node.js SDK v3.0.0, Java SDK v2.0.0, or Go SDK v1.0.0 or later.
You can prevent an index and its data from being accidentally deleted when creating a new index or when configuring an existing index. In both cases, you set the deletion_protection parameter to enabled. To enable deletion protection when creating a new index:
# pip install "pinecone[grpc]"
from pinecone.grpc import PineconeGRPC as Pinecone, PodSpec

pc = Pinecone(api_key="YOUR_API_KEY")

pc.create_index(
  name="docs-example",
  dimension=1536,
  metric="cosine",
  spec=PodSpec(
    environment="us-west1-gcp",
    pod_type="p1.x1",
    pods=1
  ),
    deletion_protection="enabled"
)
To enable deletion protection when configuring an existing index:
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

pc.configure_index(
   name="docs-example", 
   deletion_protection="enabled"
)
When deletion protection is enabled on an index, requests to delete the index fail and return a 403 - FORBIDDEN status with the following error:
Deletion protection is enabled for this index. Disable deletion protection before retrying.

Disable deletion protection

Before you can delete an index with deletion protection enabled, you must first disable deletion protection as follows:
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

pc.configure_index(
   name="docs-example", 
   deletion_protection="disabled"
)

Delete an entire namespace

In pod-based indexes, reads and writes share compute resources, so deleting an entire namespace with many records can increase the latency of read operations. In such cases, consider deleting records in batches.
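If the added read latency is acceptable, you can instead delete all records in a namespace in a single call by passing delete_all=True to the delete operation. The following is a minimal sketch; the index host and namespace are placeholders:
# pip install "pinecone[grpc]"
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# To get the unique host for an index,
# see https://docs.pinecone.io/guides/manage-data/target-an-index
index = pc.Index(host="INDEX_HOST")

# Delete every record in the namespace
index.delete(delete_all=True, namespace="example-namespace")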

Delete records in batches

In pod-based indexes, reads and writes share compute resources, so deleting an entire namespace or a large number of records can increase the latency of read operations. To avoid this, delete records in batches of up to 1000, with a brief sleep between requests. Consider using smaller batches if the index has active read traffic.
from pinecone import Pinecone
import numpy as np
import time

pc = Pinecone(api_key='API_KEY')

INDEX_NAME = 'INDEX_NAME'
NAMESPACE = 'NAMESPACE_NAME'
# Consider using smaller batches if you have a high RPS for read operations
BATCH = 1000

index = pc.Index(name=INDEX_NAME)
dimensions = index.describe_index_stats()['dimension']

# Create the query vector
query_vector = np.random.uniform(-1, 1, size=dimensions).tolist()
results = index.query(vector=query_vector, namespace=NAMESPACE, top_k=BATCH)

# Delete in batches until the query returns no results
while len(results['matches']) > 0:
    ids = [i['id'] for i in results['matches']]
    index.delete(ids=ids, namespace=NAMESPACE)
    time.sleep(0.01)
    results = index.query(vector=query_vector, namespace=NAMESPACE, top_k=BATCH)

Delete records by metadata

In pod-based indexes, if you are targeting a large number of records for deletion and the index has active read traffic, consider deleting records in batches.
To delete records from a namespace based on their metadata values, pass a metadata filter expression to the delete operation. This deletes all records in the namespace that match the filter expression. For example, the following code deletes all records with a genre field set to documentary from namespace example-namespace:
# pip install "pinecone[grpc]"
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# To get the unique host for an index, 
# see https://docs.pinecone.io/guides/manage-data/target-an-index
index = pc.Index(host="INDEX_HOST")

index.delete(
    filter={
        "genre": {"$eq": "documentary"}
    },
    namespace="example-namespace" 
)

Tag an index

When configuring an index, you can tag the index to help with index organization and management. For more details, see Tag an index.
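For example, assuming your SDK version supports the tags parameter on configure_index, you can attach a tag as in the following sketch (the tag key and value are illustrative):
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Attach an illustrative tag to an existing index
pc.configure_index(
   name="docs-example",
   tags={"environment": "production"}
)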

Manage costs

Set a project pod limit

To control costs, project owners can set the maximum total number of pods allowed across all pod-based indexes in a project. The default pod limit is 5.
  1. Go to Settings > Projects.
  2. For the project you want to update, click the ellipsis (…) menu > Configure.
  3. In the Pod Limit section, update the number of pods.
  4. Click Save Changes.

Back up inactive pod-based indexes

For each pod-based index, billing is determined by the per-minute price per pod and the number of pods the index uses, regardless of index activity. When a pod-based index is not in use, back it up using collections and delete the inactive index. When you’re ready to use the vectors again, you can create a new index from the collection. This new index can also use a different index type or size. Because it’s relatively cheap to store collections, you can reduce costs by only running an index when it’s in use.
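The following is a rough sketch of this workflow; the collection name, environment, and pod type are illustrative, and you should wait for the collection to reach the Ready state before deleting the source index:
from pinecone.grpc import PineconeGRPC as Pinecone, PodSpec

pc = Pinecone(api_key="YOUR_API_KEY")

# 1. Back up the inactive index to a collection
pc.create_collection(name="example-collection", source="docs-example")

# 2. After the collection is ready, delete the inactive index to stop pod billing
pc.delete_index(name="docs-example")

# 3. Later, recreate an index from the collection, optionally with a different pod type or size
pc.create_index(
  name="docs-example",
  dimension=1536,
  metric="cosine",
  spec=PodSpec(
    environment="us-west1-gcp",
    pod_type="s1.x1",
    pods=1,
    source_collection="example-collection"
  )
)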

Choose the right index type and size

Pod sizes are designed for different applications, and some are more expensive than others. Choose the appropriate pod type and size, so you pay for the resources you need. For example, the s1 pod type provides large storage capacity and lower overall costs with slightly higher query latencies than p1 pods. By switching to a different pod type, you may be able to reduce costs while still getting the performance your application needs.
For pod-based indexes, project owners can set limits for the total number of pods across all indexes in the project. The default pod limit is 5.
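For example, to scale the pod size within the same pod type family, you can use configure_index, as in the following sketch (moving between pod type families, such as p1 to s1, instead requires recreating the index from a collection, as described above):
from pinecone.grpc import PineconeGRPC as Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Scale an existing p1.x1 index up to p1.x2 within the same pod type family
pc.configure_index(
   name="docs-example",
   pod_type="p1.x2"
)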

Monitor performance

Pinecone generates time-series performance metrics for each Pinecone index. You can monitor these metrics directly in the Pinecone console or with tools like Prometheus or Datadog.

Use the Pinecone Console

To view performance metrics in the Pinecone console:
  1. Open the Pinecone console.
  2. Select the project containing the index you want to monitor.
  3. Go to Database > Indexes.
  4. Select the index.
  5. Go to the Metrics tab.

Use Datadog

To monitor Pinecone with Datadog, use Datadog’s Pinecone integration.
This feature is available on Standard and Enterprise plans.

Use Prometheus

This feature is available on Standard and Enterprise plans. When using Bring Your Own Cloud, you must configure Prometheus monitoring within your VPC.
To monitor all pod-based indexes in a specific region of a project, insert the following snippet into the scrape_configs section of your prometheus.yml file and update it with values for your Prometheus integration:
scrape_configs:
  - job_name: "pinecone-pod-metrics"
    scheme: https
    metrics_path: '/metrics'
    authorization:
      credentials: API_KEY
    static_configs:
      - targets: ["metrics.ENVIRONMENT.pinecone.io" ]
  • Replace API_KEY with an API key for the project you want to monitor. If necessary, you can create a new API key in the Pinecone console.
  • Replace ENVIRONMENT with the environment of the pod-based indexes you want to monitor.
For more configuration details, see the Prometheus docs.

Available metrics

The following metrics are available when you integrate Pinecone with Prometheus:
Name | Type | Description
pinecone_vector_count | gauge | The number of records per pod in the index.
pinecone_request_count_total | counter | The number of data plane calls made by clients.
pinecone_request_error_count_total | counter | The number of data plane calls made by clients that resulted in errors.
pinecone_request_latency_seconds | histogram | The distribution of server-side processing latency for Pinecone data plane calls.
pinecone_index_fullness | gauge | The fullness of the index on a scale of 0 to 1.

Metric labels

Each metric contains the following labels:
Label | Description
pid | Process identifier.
index_name | Name of the index to which the metric applies.
project_name | Name of the project containing the index.
request_type | Type of request: upsert, delete, fetch, query, or describe_index_stats. This label is included only in pinecone_request_* metrics.

Example queries

Return the average latency in seconds for all requests against the Pinecone index docs-example:
avg by (request_type) (pinecone_request_latency_seconds{index_name="docs-example"})
Return the vector count for the Pinecone index docs-example:
sum ((avg by (app) (pinecone_vector_count{index_name="docs-example"})))
Return the total number of requests against the Pinecone index docs-example over one minute:
sum by (request_type)(increase(pinecone_request_count_total{index_name="docs-example"}[60s]))
Return the total number of upsert requests against the Pinecone index docs-example over one minute:
sum by (request_type)(increase(pinecone_request_count_total{index_name="docs-example", request_type="upsert"}[60s]))
Return the total errors returned by the Pinecone index docs-example over one minute:
sum by (request_type) (increase(pinecone_request_error_count_total{index_name="docs-example"}[60s]))
Return the index fullness metric for the Pinecone index docs-example:
round(max (pinecone_index_fullness{index_name="docs-example"} * 100))

Troubleshooting

Index fullness errors

Serverless indexes automatically scale as needed. However, pod-based indexes can run out of capacity. When that happens, upserting new records will fail with the following error:
console
Index is full, cannot accept data.
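To see how close an index is to capacity before upserts start failing, you can check the index_fullness value returned by describe_index_stats. A minimal sketch:
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

index = pc.Index(name="docs-example")

# index_fullness is reported on a scale of 0 to 1
stats = index.describe_index_stats()
print(stats["index_fullness"])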

High-cardinality metadata and over-provisioning

This Loom video walkthrough shows you how to manage two scenarios:
  • The first scenario covers an index loaded with high-cardinality metadata, which can cause the memory and performance issues described above. The same approach applies whenever you need to change your metadata configuration.
  • The second scenario covers an index that has been over-provisioned with more pods than it needs, and shows how to scale an index back down after it has previously been scaled up vertically.