This page shows you how to create a dense or sparse serverless index.

  • Dense indexes store dense vectors, which are numerical representations of the meaning and relationships of text, images, or other types of data. You use dense indexes for semantic search or in combination with sparse indexes for hybrid search.

  • Sparse indexes store sparse vectors, which are numerical representations of the words or phrases in a document. You use sparse indexes for lexical search, or in combination with dense indexes for hybrid search.

You can create an index using the API or the Pinecone console.

Create a dense index

You can create a dense index with integrated vector embedding or a dense index for storing vectors generated with an external embedding model.

Integrated embedding

Indexes with integrated embedding do not support updating or importing with vectors.

If you want to upsert and search with source text and have Pinecone convert it to dense vectors automatically, create a dense index with integrated embedding as follows:

  • Provide a name for the index.
  • Set cloud and region to the cloud and region where the index should be deployed.
  • Set embed.model to one of Pinecone’s hosted embedding models.
  • Set embed.field_map to the name of the field in your source document that contains the data for embedding.

Other parameters are optional. See the API reference for details.

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

index_name = "example-index"

if not pc.has_index(index_name):
    pc.create_index_for_model(
        name=index_name,
        cloud="aws",
        region="us-east-1",
        embed={
            "model":"multilingual-e5-large",
            "field_map":{"text": "chunk_text"}
        }
    )
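
Once the integrated index exists, you upsert source text rather than vectors. A minimal sketch, assuming the `chunk_text` field mapping from the example above; the namespace name and record contents are illustrative, and the client calls only run when a `PINECONE_API_KEY` environment variable is set:

```python
import os

# Records for an integrated index carry source text in the field named by
# embed.field_map ("chunk_text" in the example above); Pinecone embeds the
# text automatically at upsert time.
records = [
    {"_id": "rec1", "chunk_text": "Apples are a popular fruit."},
    {"_id": "rec2", "chunk_text": "Bananas are rich in potassium."},
]

api_key = os.environ.get("PINECONE_API_KEY")  # set this to run against your project
if api_key:
    from pinecone import Pinecone
    pc = Pinecone(api_key=api_key)
    index = pc.Index("example-index")
    index.upsert_records("example-namespace", records)
```

Note that each record uses `_id` (not `id`) and otherwise contains plain fields; the field named in `embed.field_map` is the one that gets embedded.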

Bring your own vectors

If you use an external embedding model to convert your data to dense vectors, create a dense index as follows:

  • Provide a name for the index.
  • Set the vector_type to dense.
  • Specify the dimension and similarity metric of the vectors you’ll store in the index. This should match the dimension and metric supported by your embedding model.
  • Set spec.cloud and spec.region to the cloud and region where the index should be deployed. For Python, you also need to import the ServerlessSpec class.

Other parameters are optional. See the API reference for details.

from pinecone.grpc import PineconeGRPC as Pinecone
from pinecone import ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

index_name = "example-index"

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        vector_type="dense",
        dimension=1536,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        ),
        deletion_protection="disabled",
        tags={
            "environment": "development"
        }
    )
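
With a bring-your-own-vectors index, each upserted vector must match the dimension declared at creation (1536 above). A minimal sketch, with placeholder values standing in for real embeddings; the namespace and metadata are illustrative, and the client calls only run when a `PINECONE_API_KEY` environment variable is set:

```python
import os

# Vectors must match the index dimension (1536 in the example above). The
# values here are placeholders for output from your external embedding model.
dimension = 1536
vectors = [
    {"id": "vec1", "values": [0.01] * dimension, "metadata": {"source": "doc-1"}},
    {"id": "vec2", "values": [0.02] * dimension, "metadata": {"source": "doc-2"}},
]

api_key = os.environ.get("PINECONE_API_KEY")  # set this to run against your project
if api_key:
    from pinecone import Pinecone
    pc = Pinecone(api_key=api_key)
    index = pc.Index("example-index")
    index.upsert(vectors=vectors, namespace="example-namespace")
```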

Create a sparse index

This feature is in public preview.

You can create a sparse index with integrated vector embedding or a sparse index for storing vectors generated with an external embedding model.

Integrated embedding

If you want to upsert and search with source text and have Pinecone convert it to sparse vectors automatically, create a sparse index with integrated embedding as follows:

  • Provide a name for the index.
  • Set cloud and region to the cloud and region where the index should be deployed.
  • Set embed.model to one of Pinecone’s hosted sparse embedding models.
  • Set embed.field_map to the name of the field in your source document that contains the data for embedding.

Other parameters are optional. See the API reference for details.

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

index_name = "example-index"

if not pc.has_index(index_name):
    pc.create_index_for_model(
        name=index_name,
        cloud="aws",
        region="us-east-1",
        embed={
            "model":"pinecone-sparse-english-v0",
            "field_map":{"text": "chunk_text"}
        }
    )
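
Querying a sparse integrated index works the same way: you send text, and Pinecone embeds it server-side with the index's sparse model before ranking by keyword overlap. A minimal sketch, assuming the index and namespace names from the examples above; the client calls only run when a `PINECONE_API_KEY` environment variable is set:

```python
import os

# A lexical query against a sparse integrated index. The query text is
# embedded server-side (pinecone-sparse-english-v0 in the example above).
query = {"top_k": 3, "inputs": {"text": "potassium-rich fruit"}}

api_key = os.environ.get("PINECONE_API_KEY")  # set this to run against your project
if api_key:
    from pinecone import Pinecone
    pc = Pinecone(api_key=api_key)
    index = pc.Index("example-index")
    results = index.search(namespace="example-namespace", query=query)
    for hit in results["result"]["hits"]:
        print(hit["_id"], hit["_score"])
```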

Bring your own vectors

If you use an external embedding model to convert your data to sparse vectors, create a sparse index as follows:

  • Provide a name for the index.
  • Set the vector_type to sparse.
  • Set the distance metric to dotproduct. Sparse indexes do not support other distance metrics.
  • Set spec.cloud and spec.region to the cloud and region where the index should be deployed.

Other parameters are optional. See the API reference for details.

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

index_name = "example-index"

if not pc.has_index(index_name):
    pc.create_index(
        name=index_name,
        vector_type="sparse",
        metric="dotproduct",
        spec=ServerlessSpec(cloud="aws", region="eu-west-1")
    )

Create an index from a backup

You can create a dense or sparse index from a backup. For more details, see Restore an index.

Index options

Cloud regions

When creating an index, you must choose the cloud and region where you want the index to be hosted. The following table lists the available public clouds and regions and the plans that support them:

| Cloud | Region                     | Supported plans               | Availability phase   |
|-------|----------------------------|-------------------------------|----------------------|
| aws   | us-east-1 (Virginia)       | Starter, Standard, Enterprise | General availability |
| aws   | us-west-2 (Oregon)         | Standard, Enterprise          | General availability |
| aws   | eu-west-1 (Ireland)        | Standard, Enterprise          | General availability |
| gcp   | us-central1 (Iowa)         | Standard, Enterprise          | General availability |
| gcp   | europe-west4 (Netherlands) | Standard, Enterprise          | General availability |
| azure | eastus2 (Virginia)         | Standard, Enterprise          | General availability |

The cloud and region cannot be changed after a serverless index is created.

On the free Starter plan, you can create serverless indexes in the us-east-1 region of AWS only. To create indexes in other regions, upgrade your plan.

Similarity metrics

When creating a dense index, you can choose from the following similarity metrics: euclidean, cosine, and dotproduct. For the most accurate results, choose the similarity metric used to train the embedding model for your vectors. For more information, see Vector Similarity Explained.

Sparse indexes must use the dotproduct metric.
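
To see how the metrics differ, here is a toy comparison on two small dense vectors; the vectors are made up for illustration:

```python
import math

# Two small dense vectors; b is a scaled copy of a, so they point the same way.
a = [1.0, 2.0, 3.0]
b = [2.0, 4.0, 6.0]

# dotproduct: sum of elementwise products (sensitive to magnitude).
dot = sum(x * y for x, y in zip(a, b))

# cosine: dot product normalized by both magnitudes (direction only).
cos = dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# euclidean: straight-line distance (smaller means more similar).
euc = math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(dot, cos, euc)  # cosine is exactly 1.0 because b is a scaled copy of a
```

Note how cosine ignores magnitude (the scaled copy scores a perfect 1.0) while dotproduct and euclidean do not; this is why the metric should match the one your embedding model was trained with.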

Embedding models

Dense vectors and sparse vectors are the basic units of data in Pinecone, which was purpose-built to store and work with them. Dense vectors represent the semantics of data such as text, images, and audio recordings, while sparse vectors represent documents or queries in a way that captures keyword information.

To transform data into vector format, you use an embedding model. Pinecone hosts several embedding models so it’s easy to manage your vector storage and search process on a single platform. You can use a hosted model to embed your data as an integrated part of upserting and querying, or you can use a hosted model to embed your data as a standalone operation.
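
The standalone path goes through the Inference API. A minimal sketch, assuming the `multilingual-e5-large` model from the earlier example; the `input_type` and `truncate` parameters shown are the common ones for this model, and the client calls only run when a `PINECONE_API_KEY` environment variable is set:

```python
import os

# Texts to embed as a standalone operation, outside of upsert or query.
inputs = ["Apples are a popular fruit.", "Bananas are rich in potassium."]

api_key = os.environ.get("PINECONE_API_KEY")  # set this to run against your project
if api_key:
    from pinecone import Pinecone
    pc = Pinecone(api_key=api_key)
    # input_type distinguishes documents ("passage") from queries ("query")
    # for asymmetric embedding models.
    embeddings = pc.inference.embed(
        model="multilingual-e5-large",
        inputs=inputs,
        parameters={"input_type": "passage", "truncate": "END"},
    )
    print(len(embeddings[0]["values"]))  # multilingual-e5-large produces 1024 dimensions
```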

Pinecone hosts several embedding models, including multilingual-e5-large for dense vectors and pinecone-sparse-english-v0 for sparse vectors, as used in the examples above.

To understand how cost is calculated for embedding, see Understanding cost.