This page defines concepts in Pinecone and how they relate to each other.

Organization

An organization is a group of one or more projects that use the same billing. Organizations allow one or more users to control billing and permissions for all of the projects belonging to the organization.

For more information, see Understanding organizations.

Project

A project belongs to exactly one organization and contains one or more indexes. Only users who belong to the project can access the indexes in that project. API keys and Assistants are project-specific.

For more information, see Understanding projects.

Index

An index is the highest-level organizational unit of data. It defines the dimension (i.e., number of values in a vector) of the vectors to be stored and the similarity metric to be used when querying them. Normally, you choose a dimension and similarity metric based on the embedding model used to create your vectors.
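To make the two index settings concrete, the sketch below computes three widely used similarity metrics (dot product, cosine similarity, and Euclidean distance) for vectors of the same dimension. It is plain Python and does not call Pinecone; the function names and sample vectors are illustrative assumptions.

```python
import math

def check_dimension(a, b):
    # Vectors can only be compared when they have the same dimension.
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} != {len(b)}")

def dot_product(a, b):
    check_dimension(a, b)
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Undefined for all-zero vectors (division by zero); not handled here.
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)

def euclidean_distance(a, b):
    check_dimension(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Two hypothetical 3-dimensional vectors.
query = [1.0, 0.0, 0.0]
stored = [0.0, 1.0, 0.0]

scores = {
    "dot_product": dot_product(query, stored),
    "cosine": cosine_similarity(query, stored),
    "euclidean": euclidean_distance(query, stored),
}
```

Higher is more similar for dot product and cosine similarity, while lower is more similar for Euclidean distance, which is one reason the metric is usually chosen to match the embedding model.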

In Pinecone, there are two types of indexes: serverless and pod-based.

For more information, see Understanding indexes.

Namespace

A namespace is a partition within an index. It divides records in an index into separate groups.

All upserts, queries, and other data operations target exactly one namespace.
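As a rough mental model, an index can be pictured as a mapping from namespace names to separate groups of records. The toy class below is an in-memory stand-in written for illustration only; `ToyIndex` and its methods are hypothetical and are not the Pinecone client API.

```python
from collections import defaultdict

class ToyIndex:
    """In-memory stand-in for an index partitioned into namespaces."""

    def __init__(self):
        # namespace name -> {record ID -> vector values}
        self._namespaces = defaultdict(dict)

    def upsert(self, records, namespace):
        # Every write targets exactly one namespace.
        for record_id, values in records:
            self._namespaces[namespace][record_id] = values

    def list_ids(self, namespace):
        # Every read also targets exactly one namespace.
        return sorted(self._namespaces.get(namespace, {}))

index = ToyIndex()
index.upsert([("a", [0.1, 0.2])], namespace="tenant-1")
index.upsert([("a", [0.9, 0.8]), ("b", [0.5, 0.5])], namespace="tenant-2")
```

In this model the same record ID can exist in two namespaces without conflict, and a read against `tenant-1` never sees records from `tenant-2`.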

For more information, see Use namespaces.

Record

A record is a basic unit of data and consists of a record ID, a dense vector, and, optionally, metadata and a sparse vector. Each of these is defined below.

For more information, see Upsert data.
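A record can be pictured as a small data structure with the parts defined in the sections that follow. The sketch below is an assumption-laden illustration: the class names, the field names, and the choice to treat metadata and sparse values as optional are modeling choices for this example, not a definitive schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

@dataclass
class SparseValues:
    indices: List[int]    # positions of the non-zero dimensions
    values: List[float]   # the non-zero values at those positions

@dataclass
class Record:
    id: str                                       # record ID
    values: List[float]                           # dense vector
    sparse_values: Optional[SparseValues] = None  # sparse vector (assumed optional)
    metadata: Dict[str, object] = field(default_factory=dict)  # assumed optional

# A hypothetical record with a 3-dimensional dense vector and some metadata.
record = Record(
    id="doc1#chunk1",
    values=[0.1, 0.2, 0.3],
    metadata={"source": "handbook.pdf", "page": 4},
)
```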

Record ID

A record ID is a record’s unique ID. Use ID prefixes to segment your data beyond namespaces.
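One way to picture ID prefixes is a naming scheme in which each ID starts with the identifier of its parent document. The sketch below uses plain Python string matching; the `doc#chunk` pattern, the `#` separator, and the helper name are assumptions made for illustration.

```python
# Hypothetical record IDs following a "<document>#<chunk>" pattern.
record_ids = ["doc1#chunk1", "doc1#chunk2", "doc2#chunk1", "doc10#chunk1"]

def ids_with_prefix(ids, prefix):
    # Keep only the IDs that begin with the given prefix.
    return [record_id for record_id in ids if record_id.startswith(prefix)]

# Including the separator selects only doc1's chunks.
doc1_chunks = ids_with_prefix(record_ids, "doc1#")

# Omitting the separator also matches "doc10#chunk1".
loose_match = ids_with_prefix(record_ids, "doc1")
```

Including the separator in the prefix keeps `doc1` from also matching `doc10`, which is worth deciding before IDs are assigned.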

Dense vector

A dense vector, also referred to as a vector embedding or simply a vector, is the basic vector type in Pinecone. It is a series of numerical values that together capture the semantic information of the data, that is, the patterns, relationships, and underlying structures needed to understand it. Dense vectors are generated by AI models, such as LLMs.

Metadata

Metadata is additional information that can be attached to vector embeddings to provide more context and enable additional filtering capabilities. For example, the original text of the embeddings can be stored in the metadata.
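The filtering idea can be sketched with plain dictionaries: each record carries a metadata dictionary, and a filter keeps only the records whose metadata matches every condition. The field names, sample values, and exact-match semantics below are assumptions for illustration and do not represent Pinecone's filter syntax.

```python
# Hypothetical records; the original text is stored in the metadata.
records = [
    {"id": "a", "metadata": {"genre": "drama", "year": 2020, "text": "A quiet town."}},
    {"id": "b", "metadata": {"genre": "comedy", "year": 2020, "text": "A loud party."}},
    {"id": "c", "metadata": {"genre": "drama", "year": 2019, "text": "A long road."}},
]

def matches(metadata, conditions):
    # True when every condition key is present with an equal value.
    return all(metadata.get(key) == value for key, value in conditions.items())

def filter_records(items, conditions):
    return [item for item in items if matches(item["metadata"], conditions)]

dramas_2020 = filter_records(records, {"genre": "drama", "year": 2020})
original_texts = [item["metadata"]["text"] for item in dramas_2020]
```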

Sparse vector

A sparse vector, also referred to as a sparse vector embedding, has a large number of dimensions, but only a small proportion of those values are non-zero. Sparse vectors are often used to represent documents or queries in a way that captures keyword information. Each dimension in a sparse vector typically represents a word from a dictionary, and the non-zero values represent the importance of these words in the document.
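As an illustration of this representation, the sketch below turns a short text into a sparse vector stored as parallel lists of indices and values. The five-word vocabulary is a toy assumption, and raw word counts stand in for "importance"; real systems typically use a much larger dictionary and a weighting scheme such as BM25 or a learned model.

```python
from collections import Counter

# Toy dictionary mapping each known word to a dimension (an assumption).
vocabulary = {"apple": 0, "banana": 1, "cherry": 2, "date": 3, "elderberry": 4}

def to_sparse(text, vocab):
    # Count only the words that appear in the dictionary.
    counts = Counter(token for token in text.lower().split() if token in vocab)
    # Store just the non-zero dimensions, ordered by index.
    pairs = sorted((vocab[token], float(count)) for token, count in counts.items())
    return {
        "indices": [index for index, _ in pairs],
        "values": [value for _, value in pairs],
    }

sparse = to_sparse("banana date banana", vocabulary)
# sparse == {"indices": [1, 3], "values": [2.0, 1.0]}
```

Only two of the five dimensions are stored here; the remaining dimensions are implicitly zero, which is what keeps very high-dimensional sparse vectors compact.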

For more information, see Understanding hybrid search.

Other concepts

Although not represented in the diagram above, Pinecone also contains the following concepts:

API key

An API key is a unique token that authenticates and authorizes access to the Pinecone APIs. API keys are project-specific.

User

A user is a member of organizations and projects. Users are assigned specific roles at the organization and project levels that determine the user’s permissions in the Pinecone console.

For more information, see Manage organization members and Manage project members.

Collection

A collection is a static copy of a pod-based index. It is a non-queryable representation of a set of vectors and metadata. A collection can only be created from a pod-based index, but you can create either a serverless index or a pod-based index from a collection.

For more information, see Understanding collections.

Pinecone Assistant

Pinecone Assistant is a service that allows you to upload documents, ask questions, and receive responses that reference your documents. This is known as retrieval-augmented generation (RAG).

For more information, see Understanding Pinecone Assistant.

Pinecone Inference

Pinecone Inference is an API service that provides access to embedding models hosted on Pinecone’s infrastructure.

For more information, see Understanding Pinecone Inference.
