This page defines concepts in Pinecone and how they relate to each other.

Organization

An organization is a group of one or more projects that use the same billing. Organizations allow one or more users to control billing and permissions for all of the projects belonging to the organization.

For more information, see Understanding organizations.

Project

A project belongs to an organization and contains one or more indexes. Each project belongs to exactly one organization, but only users who belong to the project can access the indexes in that project. API keys and Assistants are project-specific.

For more information, see Understanding projects.

Index

There are two types of serverless indexes, dense and sparse.

For details on pod-based indexes, see Using pods.

Dense index

A dense index stores dense vectors. A dense vector is a series of numbers that represents the meaning and relationships of text, images, or other types of data. Each number is a coordinate along one dimension, so the vector as a whole corresponds to a point in a multidimensional space. Vectors that are closer together in that space are semantically similar.

When you query a dense index, Pinecone retrieves the dense vectors that are the most semantically similar to the query. This is often called semantic search, nearest neighbor search, similarity search, or just vector search.
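As a rough illustration of the idea, the sketch below ranks a few records by cosine similarity to a query vector. It is a brute-force toy, not how Pinecone implements search; the function names, record IDs, and 3-dimensional vectors are all made up for the example, and cosine similarity is only one possible similarity metric.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, records, top_k=2):
    """Return the IDs of the top_k records most similar to the query."""
    ranked = sorted(
        records,
        key=lambda record_id: cosine_similarity(query, records[record_id]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical 3-dimensional vectors; real embeddings typically have
# hundreds or thousands of dimensions.
records = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "truck": [0.0, 0.1, 0.9],
}

result = nearest([1.0, 0.0, 0.0], records)
print(result)
```

The two vectors that point in nearly the same direction as the query rank first, while the vector pointing elsewhere is left out.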

Sparse index

A sparse index stores sparse vectors, each of which is a series of numbers that represents the words or phrases in a document. Sparse vectors have a very large number of dimensions, but only a small proportion of the values are non-zero.

When you search a sparse index, Pinecone retrieves the sparse vectors that most exactly match the words or phrases in the query. Query terms are scored independently and then summed, with the most similar records scored highest. This is often called lexical search or keyword search.
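One way to picture "scored independently and then summed" is a dot product over the dimensions that the query and a document share. The sketch below assumes that interpretation; the dict-based vector layout, the dimension numbers, and the weights are invented for illustration.

```python
def sparse_score(query, document):
    """Sum the per-term contributions for dimensions present in both vectors."""
    return sum(
        weight * document[dimension]
        for dimension, weight in query.items()
        if dimension in document
    )

# Hypothetical sparse vectors stored as {dimension_index: weight}.
# Dimensions that are absent are implicitly zero.
docs = {
    "doc1": {10: 1.0, 42: 2.0},
    "doc2": {42: 0.5, 77: 3.0},
}
query = {42: 1.0, 77: 1.0}

scores = {doc_id: sparse_score(query, vector) for doc_id, vector in docs.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

Each query term contributes only where the document has a non-zero weight for the same dimension, so a document that matches more of the query terms, or matches them with higher weights, scores higher.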

Namespace

A namespace is a partition within a dense or sparse index. It divides records in an index into separate groups.

Every upsert, query, and other data operation targets exactly one namespace.
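The partitioning can be sketched with a toy in-memory index in which each namespace is its own dictionary of records. `ToyIndex` and the namespace names are hypothetical and exist only to show that an operation sees the records of one namespace at a time.

```python
from collections import defaultdict

class ToyIndex:
    """In-memory stand-in for an index; each namespace holds its own records."""

    def __init__(self):
        self._namespaces = defaultdict(dict)

    def upsert(self, namespace, record_id, vector):
        # Writes land in exactly one namespace.
        self._namespaces[namespace][record_id] = vector

    def ids(self, namespace):
        # Reads see only the records of the requested namespace.
        # .get() avoids creating an empty namespace as a side effect.
        return sorted(self._namespaces.get(namespace, {}))

index = ToyIndex()
index.upsert("tenant-a", "doc1", [0.1, 0.2])
index.upsert("tenant-b", "doc1", [0.9, 0.8])
index.upsert("tenant-b", "doc2", [0.5, 0.5])

print(index.ids("tenant-a"))
print(index.ids("tenant-b"))
```

In this toy, the same record ID can appear in two namespaces without conflict because the partitions never share storage.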

For more information, see Use namespaces.

Record

A record is a basic unit of data and consists of a record ID, a dense vector or a sparse vector (depending on the type of index), and optional metadata.
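A minimal sketch of that shape, assuming hypothetical field names (`dense`, `sparse`, `metadata`) rather than the names used by any Pinecone SDK:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Record:
    """Hypothetical record: an ID, one vector (dense or sparse), optional metadata."""
    id: str
    dense: Optional[list] = None   # e.g. [0.1, 0.2, 0.3]
    sparse: Optional[dict] = None  # e.g. {dimension_index: weight}
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # A record carries exactly one vector, matching the type of its index.
        if (self.dense is None) == (self.sparse is None):
            raise ValueError("provide exactly one of dense or sparse")

dense_record = Record(id="doc1#chunk1", dense=[0.1, 0.2, 0.3], metadata={"genre": "news"})
sparse_record = Record(id="doc1#chunk2", sparse={42: 1.5, 1007: 0.3})
print(dense_record)
print(sparse_record)
```

Metadata defaults to an empty dictionary here because it is optional; the vector is required, and the check in `__post_init__` rejects a record that has neither vector or both.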

For more information, see Upsert data.

Record ID

A record ID is a record’s unique ID. Use ID prefixes to segment your data beyond namespaces.

Dense vector

A dense vector, also referred to as a vector embedding or simply a vector, is a series of numbers that represents the meaning and relationships of data. Each number is a coordinate along one dimension, so the vector as a whole corresponds to a point in a multidimensional space. Vectors that are closer together in that space are semantically similar.

Dense vectors are stored in dense indexes.

You use a dense embedding model to convert data to dense vectors. The embedding model can be external to Pinecone or hosted on Pinecone infrastructure and integrated with an index.

For more information about dense vectors, see What are vector embeddings?.

Sparse vector

A sparse vector, also referred to as a sparse vector embedding, has a large number of dimensions, but only a small proportion of those values are non-zero. Sparse vectors are often used to represent documents or queries in a way that captures keyword information. Each dimension in a sparse vector typically represents a word from a dictionary, and the non-zero values represent the importance of these words in the document.
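To make the "one dimension per dictionary word" idea concrete, the sketch below turns a short text into a sparse vector that lists only its non-zero dimensions. The five-word vocabulary, the `to_sparse` helper, and the `indices`/`values` layout are assumptions for illustration, and raw word counts stand in for the importance weights that a real sparse embedding model would produce.

```python
from collections import Counter

# Hypothetical dictionary mapping each known word to a dimension index.
vocabulary = {"apple": 0, "banana": 1, "cherry": 2, "date": 3, "elderberry": 4}

def to_sparse(text, vocabulary):
    """Represent text by the non-zero dimensions only (word counts as weights)."""
    counts = Counter(word for word in text.lower().split() if word in vocabulary)
    weights = {vocabulary[word]: float(n) for word, n in counts.items()}
    indices = sorted(weights)
    return {"indices": indices, "values": [weights[i] for i in indices]}

sparse = to_sparse("apple cherry apple", vocabulary)
print(sparse)
```

Only two of the five dimensions are stored; with a dictionary of tens of thousands of words, the saving from omitting zeros becomes far larger.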

Sparse vectors are stored in sparse indexes.

You use a sparse embedding model to convert data to sparse vectors. The embedding model can be external to Pinecone or hosted on Pinecone infrastructure and integrated with an index.

Metadata

Metadata is additional information that can be attached to vector embeddings to provide more context and enable additional filtering capabilities. For example, the original text that an embedding was generated from can be stored in the metadata.
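The filtering idea can be sketched as an exact-match check over each record's metadata. The `matches` helper, the field names, and the sample records below are hypothetical; this is not Pinecone's filter syntax.

```python
def matches(metadata, conditions):
    """True when every key in conditions is present in metadata with an equal value."""
    return all(
        key in metadata and metadata[key] == value
        for key, value in conditions.items()
    )

# Hypothetical records; vectors are omitted because only metadata matters here.
records = [
    {"id": "a", "metadata": {"genre": "news", "year": 2023, "text": "Markets rallied today."}},
    {"id": "b", "metadata": {"genre": "sports", "year": 2023, "text": "The final went to overtime."}},
    {"id": "c", "metadata": {"genre": "news", "year": 2021, "text": "A new bridge opened."}},
]

news_ids = [r["id"] for r in records if matches(r["metadata"], {"genre": "news"})]
recent_news_ids = [
    r["id"] for r in records if matches(r["metadata"], {"genre": "news", "year": 2023})
]
print(news_ids, recent_news_ids)
```

Storing the original text alongside the other fields means a matching record can be shown to a user without a second lookup.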

Other concepts

Although not represented in the diagram above, Pinecone also contains the following concepts:

API key

An API key is a unique token that authenticates and authorizes access to the Pinecone APIs. API keys are project-specific.

User

A user is a member of organizations and projects. Users are assigned specific roles at the organization and project levels that determine the user’s permissions in the Pinecone console.

For more information, see Manage organization members and Manage project members.

Backup or collection

A backup is a static copy of a serverless index.

Backups only consume storage. They are non-queryable representations of a set of records. You can create a backup from an index, and you can create a new index from that backup. The new index configuration can differ from the original source index: for example, it can have a different name. However, it must have the same number of dimensions and similarity metric as the source index.
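The restore constraint can be expressed as a small check. `IndexConfig` and `can_restore` are hypothetical names used only to illustrate the rule stated above: the name may change, but the dimension and similarity metric may not.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IndexConfig:
    """Hypothetical subset of an index configuration."""
    name: str
    dimension: int
    metric: str

def can_restore(source, target):
    """A backup of source can seed target only if dimension and metric match."""
    return source.dimension == target.dimension and source.metric == target.metric

source = IndexConfig(name="products", dimension=1536, metric="cosine")
renamed = IndexConfig(name="products-copy", dimension=1536, metric="cosine")
resized = IndexConfig(name="products-small", dimension=768, metric="cosine")

print(can_restore(source, renamed))
print(can_restore(source, resized))
```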

For more information, see Understanding backups.

Pinecone Assistant

Pinecone Assistant is a service that allows you to upload documents, ask questions, and receive responses that reference your documents. This is known as retrieval-augmented generation (RAG).

For more information, see Pinecone Assistant Overview.

Pinecone Inference

Pinecone Inference is an API service that provides access to embedding models hosted on Pinecone’s infrastructure.

For more information, see Understanding Pinecone Inference.
