Monitor organization-level usage and costs
To view usage and costs across your Pinecone organization, you must be an organization owner. Also, this feature is available only to organizations on the Standard or Enterprise plans.
- Go to Settings > Usage in the Pinecone console.
- Select the time range to report on. This defaults to the last 30 days.
- Select the scope for your report:
  - SKU: The usage and cost for each billable SKU, for example, read units per cloud region, storage size per cloud region, or tokens per embedding model.
  - Project: The aggregated cost for each project in your organization.
  - Service: The aggregated cost for each service your organization uses, for example, database (includes serverless backup and restore), assistants, inference (embedding and reranking), and collections.
- Choose the specific SKUs, projects, or services you want to report on. This defaults to all.
- To download the report as a CSV file, click Download.
Monitor index-level usage
You can monitor index-level usage directly in the Pinecone console, or you can pull usage metrics into Prometheus. For more details, see Monitoring.
Monitor operation-level usage
Read units
Query, fetch, and list by ID requests return a usage parameter that reports the read units consumed by that request.
While Pinecone tracks read unit usage with decimal precision, the Pinecone API and SDKs round these values up to the nearest whole number in query, fetch, and list responses. For example, if a query uses 0.45 read units, the API and SDKs will report it as 1 read unit. For precise read unit reporting, see index-level metrics or the organization-wide Usage dashboard.
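The effect of this rounding can be sketched in a few lines of Python. This is an illustration only, not SDK code: the helper names `reported_read_units` and `total_reported` are hypothetical, and the response shape (a `usage` object with a `read_units` field) is an assumption about what a query response looks like. It shows why adding up per-request values can overstate actual consumption.

```python
import math


def reported_read_units(actual_read_units: float) -> int:
    """Round up to the nearest whole number, as the text above describes."""
    return math.ceil(actual_read_units)


def total_reported(responses: list[dict]) -> int:
    """Sum the per-request read units found in response-like dicts.

    Assumes each response carries {"usage": {"read_units": <int>}};
    this shape is an assumption made for the sketch.
    """
    return sum(r["usage"]["read_units"] for r in responses)


# Three hypothetical queries that each consumed 0.45 read units.
actual = [0.45, 0.45, 0.45]
responses = [
    {"matches": [], "usage": {"read_units": reported_read_units(a)}}
    for a in actual
]

print(total_reported(responses))  # 3 (each request is reported as 1)
print(round(sum(actual), 2))      # 1.35 (decimal-precision total)
```

Because each request is rounded up on its own, a per-request tally is best treated as an upper bound.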
Embedding tokens
Requests to one of Pinecone’s hosted embedding models, either directly via the embed operation or automatically when upserting or querying an index with integrated embedding, return a usage parameter with the total tokens generated.
For example, the following request uses the multilingual-e5-large model to generate embeddings for sentences related to the word “apple” and might return this response, including a summary of the embedding tokens generated: