This page shows you how to monitor the overall usage and costs for your Pinecone organization as well as usage and performance metrics for individual indexes.
To view usage and costs across your Pinecone organization, you must be an organization owner. Also, this feature is available only to organizations on the Standard or Enterprise plans.
The Usage dashboard in the Pinecone console gives you a detailed report of usage and costs across your organization, broken down by each billable SKU or aggregated by project or service. You can view the report in the console or download it as a CSV file.
Dates are shown in UTC to match billing invoices. Cost data is delayed up to three days from the actual usage date.
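As a rough sketch of working with the downloaded CSV, the Python below totals cost per project and per SKU. The column names (date, project, sku, cost_usd) and the sample rows are hypothetical and chosen only for illustration; check the header row of your own export before adapting it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of an exported usage report. The column names and
# values are assumptions for illustration, not the real export schema.
SAMPLE_CSV = """date,project,sku,cost_usd
2025-01-01,project-a,read_units,0.50
2025-01-01,project-b,read_units,0.25
2025-01-02,project-a,storage,1.00
"""


def cost_by(rows, key):
    """Sum the (assumed) cost_usd column, grouped by the given column."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += float(row["cost_usd"])
    return dict(totals)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
by_project = cost_by(rows, "project")
by_sku = cost_by(rows, "sku")
print(by_project)
print(by_sku)
```

To use this with a real export, replace the in-memory sample with an open file handle and adjust the column names to match the file.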
You can monitor index-level usage and performance metrics directly in the Pinecone console, or you can pull them into Prometheus. For more details, see Monitoring.
Query, fetch, and list by ID requests return a usage parameter with the read unit consumption of each request that is made.
While Pinecone tracks read unit usage with decimal precision, the Pinecone API and SDKs round these values up to the nearest whole number in query, fetch, and list responses. For example, if a query uses 0.45 read units, the API and SDKs will report it as 1 read unit.
For precise read unit reporting, see index-level metrics or the organization-wide Usage dashboard.
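The rounding described above can be sketched in Python as a ceiling operation. This is only an illustration: the function name and all sample values other than 0.45 are assumptions, and the sketch does not reflect how Pinecone implements the rounding internally.

```python
import math


def reported_read_units(actual_read_units: float) -> int:
    """Round fractional read units up to the next whole number."""
    return math.ceil(actual_read_units)


# 0.45 comes from the example above; the other values are made up.
tracked = [0.45, 1.0, 2.3]
reported = [reported_read_units(ru) for ru in tracked]

# Because each response is rounded up, the sum of per-request values
# can be higher than the precisely tracked total.
print(reported, sum(reported), sum(tracked))
```

This is why per-request values are best treated as an upper bound, with the index-level metrics or the Usage dashboard as the precise source.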
Example query request:
The response looks like this:
For a more in-depth demonstration of how to use read units to inspect read costs, see this notebook.
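As a minimal sketch of tallying read units across several requests, the Python below sums the usage value from a list of responses. It assumes each response can be read like a dictionary with a usage entry containing a read_units field; the helper name and the sample responses are hypothetical, and the field name may differ between the REST API and each SDK.

```python
def total_read_units(responses):
    """Sum usage.read_units over responses; a missing usage block counts as 0."""
    total = 0
    for response in responses:
        usage = response.get("usage") or {}
        total += usage.get("read_units", 0)
    return total


# Hypothetical responses standing in for real query, fetch, or list results.
responses = [
    {"matches": [], "namespace": "example-namespace", "usage": {"read_units": 6}},
    {"matches": [], "namespace": "example-namespace", "usage": {"read_units": 1}},
    {"matches": [], "namespace": "example-namespace"},  # no usage reported
]

total = total_read_units(responses)
print(total)
```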
Requests to one of Pinecone’s hosted embedding models, either directly via the embed operation or automatically when upserting or querying an index with integrated embedding, return a usage parameter with the total tokens generated.
For example, the following request uses the multilingual-e5-large model to generate embeddings for sentences related to the word “apple”, and the response includes a summary of the embedding tokens generated:
The returned object looks like this:
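As a rough sketch of tracking embedding tokens in Python, the code below adds up the token count over several embedding responses, such as one per batch of inputs. It assumes each returned object can be read like a dictionary with a usage entry containing a total_tokens field; the helper name, the response shape, and the token counts are hypothetical placeholders, not real output.

```python
def tokens_used(embed_responses):
    """Add up usage.total_tokens over a sequence of embedding responses."""
    total = 0
    for response in embed_responses:
        usage = response.get("usage") or {}
        total += usage.get("total_tokens", 0)
    return total


# Hypothetical responses from two batches; vectors omitted, counts made up.
batches = [
    {"model": "multilingual-e5-large", "data": [], "usage": {"total_tokens": 130}},
    {"model": "multilingual-e5-large", "data": [], "usage": {"total_tokens": 70}},
]

embedding_tokens = tokens_used(batches)
print(embedding_tokens)
```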