Pinecone Docs

This operation returns a list of all indexes in a project.

List indexes

This operation deploys a Pinecone index. This is where you specify the measure of similarity, the dimension of vectors to be stored in the index, which cloud provider you would like to deploy with, and more.
  
For guidance and examples, see [Create an index](https://docs.pinecone.io/guides/indexes/create-an-index#create-a-serverless-index).


Describe an index

This operation deletes an existing index.

Delete an index

This operation configures an existing index. 

For serverless indexes, you can configure index deletion protection, tags, and integrated inference embedding settings for the index. For pod-based indexes, you can configure the pod size, number of replicas, tags, and index deletion protection.

It is not possible to change the pod type of a pod-based index. However, you can create a collection from a pod-based index and then [create a new pod-based index with a different pod type](http://docs.pinecone.io/guides/indexes/pods/create-a-pod-based-index#create-a-pod-index-from-a-collection) from the collection. For guidance and examples, see [Configure an index](http://docs.pinecone.io/guides/indexes/pods/manage-pod-based-indexes).

Configure an index

The `describe_index_stats` operation returns statistics about the contents of an index, including the vector count per namespace, the number of dimensions, and the index fullness.

Serverless indexes scale automatically as needed, so index fullness is relevant only for pod-based indexes.

Get index stats

The `upsert` operation writes vectors into a namespace. If a new value is upserted for an existing vector ID, it will overwrite the previous value.

For guidance and examples, see [Upsert data](https://docs.pinecone.io/guides/data/upsert-data).

Upsert vectors
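The overwrite behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the semantics, not the Pinecone client: the `Store` type, the `upsert` helper, and the namespace name are hypothetical and exist only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-in for an index: namespace -> {vector ID -> values}.
Store = Dict[str, Dict[str, List[float]]]


def upsert(store: Store, namespace: str, vectors: List[Tuple[str, List[float]]]) -> int:
    """Write (id, values) pairs into a namespace; an existing ID is overwritten."""
    records = store.setdefault(namespace, {})
    for vector_id, values in vectors:
        # Inserts a new ID or replaces the value stored under an existing one.
        records[vector_id] = list(values)
    return len(vectors)


store: Store = {}
upsert(store, "example-namespace", [("vec1", [0.1, 0.2]), ("vec2", [0.3, 0.4])])
# Upserting the same ID again replaces the earlier value instead of adding a record.
upsert(store, "example-namespace", [("vec1", [0.9, 0.9])])
```

After the second call the namespace still holds two records, and `vec1` carries the new values.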

The `query` operation searches a namespace using a query vector. It retrieves the IDs of the most similar items in the namespace, along with their similarity scores.

For guidance and examples, see [Query data](https://docs.pinecone.io/guides/data/query-data).

Query vectors
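To make "most similar items, along with their similarity scores" concrete, here is a minimal brute-force sketch. It assumes cosine similarity as the metric and scans every record, which says nothing about how Pinecone performs the search internally. The `query` and `cosine_similarity` helpers and the sample records are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(namespace: Dict[str, List[float]], vector: List[float], top_k: int) -> List[Tuple[str, float]]:
    """Return the top_k (id, score) pairs, most similar first."""
    scored = [(vector_id, cosine_similarity(vector, values)) for vector_id, values in namespace.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical records in a single namespace.
records = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
matches = query(records, [1.0, 0.0], top_k=2)
```

Here `matches` contains `a` first, because it points in the same direction as the query vector, followed by `c`.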

The `fetch` operation looks up and returns vectors, by ID, from a single namespace. The returned vectors include the vector data and/or metadata.

For guidance and examples, see [Fetch data](https://docs.pinecone.io/guides/data/fetch-data).

Fetch vectors

The `update` operation updates a vector in a namespace. If a value is included, it overwrites the previous value. If `set_metadata` is included, the fields specified in it are added to the vector's metadata or overwrite the previous values of those fields.

For guidance and examples, see [Update data](https://docs.pinecone.io/guides/data/update-data).

Update a vector
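The two update paths can be sketched on a plain dictionary. This is a toy model of the semantics described above, not the Pinecone client; the `update` helper and the record layout are hypothetical.

```python
from typing import Any, Dict, List, Optional


def update(
    record: Dict[str, Any],
    values: Optional[List[float]] = None,
    set_metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Apply an update in place: replace the values and/or merge metadata fields."""
    if values is not None:
        # A supplied value replaces the stored one entirely.
        record["values"] = list(values)
    if set_metadata:
        # Named fields are added or overwritten; fields not named are left untouched.
        record.setdefault("metadata", {}).update(set_metadata)
    return record


record = {"id": "vec1", "values": [0.1, 0.2], "metadata": {"genre": "drama", "year": 2019}}
update(record, set_metadata={"year": 2020, "rating": 4})
```

After this call `year` is overwritten, `rating` is added, and both `genre` and the vector values are unchanged.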

The `delete` operation deletes vectors, by ID, from a single namespace.

For guidance and examples, see [Delete data](https://docs.pinecone.io/guides/data/delete-data).

Delete vectors

The `list` operation lists the IDs of vectors in a single namespace of a serverless index. An optional prefix can be passed to limit the results to IDs with a common prefix.

`list` returns up to 100 IDs at a time by default in sorted order (bitwise "C" collation). If the `limit` parameter is set, `list` returns up to that number of IDs instead. Whenever there are additional IDs to return, the response also includes a `pagination_token` that you can use to get the next batch of IDs. When the response does not include a `pagination_token`, there are no more IDs to return.

For guidance and examples, see [List record IDs](https://docs.pinecone.io/guides/data/list-record-ids).

**Note:** `list` is supported only for serverless indexes.

List vector IDs
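The prefix filter, byte-wise sort order, and `pagination_token` loop described above can be sketched against a fake in-memory backend. Everything here is an assumption for illustration: `list_page` stands in for the service, the token is a plain offset (real tokens should be treated as opaque), and the sample IDs are made up.

```python
from typing import Any, Dict, Iterator, List, Optional


def list_page(
    all_ids: List[str],
    prefix: str = "",
    limit: int = 100,
    pagination_token: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical backend: return one page of matching IDs in byte-wise sorted order."""
    matching = sorted(
        (i for i in all_ids if i.startswith(prefix)),
        key=lambda s: s.encode("utf-8"),  # bitwise "C" collation: compare raw bytes
    )
    start = int(pagination_token) if pagination_token else 0
    response: Dict[str, Any] = {"ids": matching[start:start + limit]}
    if start + limit < len(matching):
        # A token is included only when more IDs remain.
        response["pagination_token"] = str(start + limit)
    return response


def list_all(all_ids: List[str], prefix: str = "", limit: int = 100) -> Iterator[str]:
    """Client-side loop: keep requesting pages until no pagination_token comes back."""
    token: Optional[str] = None
    while True:
        response = list_page(all_ids, prefix, limit, token)
        yield from response["ids"]
        token = response.get("pagination_token")
        if token is None:
            break


ids = ["doc1#chunk2", "doc1#chunk1", "Doc2#chunk1", "doc1#chunk3", "doc3#chunk1"]
doc1_ids = list(list_all(ids, prefix="doc1#", limit=2))
every_id = list(list_all(ids, limit=2))
```

With byte-wise ordering, uppercase letters sort before lowercase ones, so `Doc2#chunk1` comes first in `every_id`.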

The `start_import` operation starts an asynchronous import of vectors from object storage into an index.

For guidance and examples, see [Import data](https://docs.pinecone.io/guides/data/import-data).

Start import

The `list_imports` operation lists all recent and ongoing import operations.

By default, `list_imports` returns up to 100 imports per page. If the `limit` parameter is set, `list_imports` returns up to that number of imports instead. Whenever there are additional imports to return, the response also includes a `pagination_token` that you can use to get the next batch of imports. When the response does not include a `pagination_token`, there are no more imports to return.

For guidance and examples, see [Import data](https://docs.pinecone.io/guides/data/import-data).

List imports

The `describe_import` operation returns details of a specific import operation. 

For guidance and examples, see [Import data](https://docs.pinecone.io/guides/data/import-data).

Describe an import

The `cancel_import` operation cancels an import operation if it is not yet finished. It has no effect if the operation is already finished.

For guidance and examples, see [Import data](https://docs.pinecone.io/guides/data/import-data).

Cancel an import

This operation returns a list of all collections in a project.
Serverless indexes do not support collections.


List collections

This operation creates a Pinecone collection.
  
Serverless indexes do not support collections.


Create a collection

This operation gets a description of a collection.
Serverless indexes do not support collections.


Describe a collection

This operation deletes an existing collection.
Serverless indexes do not support collections.


Delete a collection

This operation configures an existing index. 

For serverless indexes, you can configure only index deletion protection and tags. For pod-based indexes, you can configure the pod size, number of replicas, tags, and index deletion protection. 

It is not possible to change the pod type of a pod-based index. However, you can create a collection from a pod-based index and then [create a new pod-based index with a different pod type](http://docs.pinecone.io/guides/indexes/pods/create-a-pod-based-index#create-a-pod-index-from-a-collection) from the collection. For guidance and examples, see [Configure an index](http://docs.pinecone.io/guides/indexes/pods/manage-pod-based-indexes).

This operation configures the pod size and number of replicas for a pod-based index.

It is not possible to change the pod type of an index. However, you can create a collection from an index and then [create a new index with a different pod type](http://docs.pinecone.io/guides/indexes/pods/create-a-pod-based-index#create-a-pod-index-from-a-collection) from the collection.

Generate embeddings for input data.

For guidance and examples, see [Generate embeddings](https://docs.pinecone.io/guides/inference/generate-embeddings).

Rerank documents according to their relevance to a query.

For guidance and examples, see [Rerank documents](https://docs.pinecone.io/guides/inference/rerank).

This operation creates a serverless integrated inference index for a specific embedding model.

Refer to the [model guide](https://docs.pinecone.io/guides/inference/understanding-inference#embedding-models) for available models and model details.

Create an index for an embedding model

This operation converts input data to vector embeddings and then upserts the embeddings into a namespace.

Upsert records into a namespace

This operation converts a query to a vector embedding and then searches a namespace using the embedding. It returns the most similar records in the namespace, along with their similarity scores.

Search a namespace

This operation returns a list of all assistants in a project.

List assistants

The `create_assistant` endpoint [creates a Pinecone Assistant](https://docs.pinecone.io/guides/assistant/create-assistant). This is where you specify the underlying training model, which cloud provider you would like to deploy with, and more.

The `get_assistant` endpoint [gets the status](https://docs.pinecone.io/guides/assistant/manage-assistants#get-the-status-of-an-assistant) of an assistant.

Check assistant status

The `update_assistant` endpoint [updates an existing assistant](https://docs.pinecone.io/guides/assistant/manage-assistants#update-an-existing-assistant). You can modify the assistant's instructions and metadata.

Update an assistant

The `delete_assistant` endpoint [deletes an existing assistant](https://docs.pinecone.io/guides/assistant/manage-assistants#delete-an-assistant).

Delete an assistant

The `list_files` endpoint returns a [list of all files in an assistant](https://docs.pinecone.io/guides/assistant/manage-files#list-files-in-an-assistant), with an option to filter files by metadata.

List Files

The `upload_file` endpoint [uploads a file](https://docs.pinecone.io/guides/assistant/upload-file) to the specified assistant.

Upload file to assistant

The `describe_file` endpoint provides the [current status and metadata of a file](https://docs.pinecone.io/guides/assistant/manage-files#get-the-status-of-a-file) uploaded to an assistant.

Describe a file upload

The `delete_file` endpoint [deletes an uploaded file](https://docs.pinecone.io/guides/assistant/manage-files#delete-a-file) from an assistant.

Delete an uploaded file

The `chat_assistant` endpoint allows you to [chat with an assistant](https://docs.pinecone.io/guides/assistant/chat-with-assistant) and get back citations in structured form. 

This is the recommended way to chat with an assistant, as it offers more functionality and control over the assistant's responses and references than the `chat_completion_assistant` endpoint.

The `chat_completion_assistant` endpoint is used to [chat with an assistant](https://docs.pinecone.io/guides/assistant/chat-with-assistant). This endpoint is based on the widely adopted OpenAI Chat Completion API.

It is useful if you need inline citations or OpenAI-compatible responses, but has limited functionality compared to the [`chat_assistant`](https://docs.pinecone.io/reference/api/2024-07/assistant/chat_assistant) operation.

Chat through an OpenAI-compatible interface

The `metrics_alignment` endpoint [evaluates](https://docs.pinecone.io/guides/assistant/understanding-evaluation) the correctness, completeness, and alignment of a generated answer with respect to a question and a ground truth answer. Correctness and completeness are evaluated based on the precision and recall of the generated answer with respect to the facts in the ground truth answer. Alignment is the harmonic mean of correctness and completeness.
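As a worked illustration of the last point, the harmonic mean of two scores can be computed as below. This is only a sketch of the arithmetic, not the endpoint's implementation; the `alignment` function name and the choice to return 0.0 when both scores are zero are assumptions.

```python
def alignment(correctness: float, completeness: float) -> float:
    """Harmonic mean of correctness and completeness, both expected in [0, 1]."""
    if correctness + completeness == 0:
        # Assumption: treat the undefined 0/0 case as a score of 0.0.
        return 0.0
    return 2 * correctness * completeness / (correctness + completeness)


# An answer that states half of what it should correctly but covers every ground truth fact.
score = alignment(correctness=0.5, completeness=1.0)
```

Because the harmonic mean is pulled toward the lower of the two inputs, an answer cannot reach a high alignment score by doing well on only one of them.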


Examples

Notebooks

Semantic Search

Retrieval Enhanced Generative Question Answering

Chatbot Agents with LangChain

LangChain Retrieval Augmentation

GPT4 with Retrieval Augmentation

Reranking Search Results

Import from object storage