In Retrieval Augmented Generation (RAG) use cases, it is best practice to chunk large documents into smaller segments, embed each chunk separately, and then store each embedded chunk as a distinct record in Pinecone. This page shows you how to model, store, and manage such records in serverless indexes.

Use ID prefixes

ID prefixes let you address a group of related records at once, which is especially useful for listing and bulk deletion. Prefixes are commonly used to represent the following:

  • Hierarchical relationships: When you have multiple records representing chunks of a single document, use a common ID prefix to reference the document. This is the main use of ID prefixes for RAG.
  • Versioning: Assign a multi-level ID prefix to denote the version of the content.
  • Content typing: For multi-modal search, assign an ID prefix to identify different kinds of objects (e.g., text, images, videos) in the database.
  • Source identification: Assign an ID prefix to denote the source of the content. For example, if you want to disconnect a given user’s account that was a data source, you can easily find and delete all of the records associated with the user.

Use ID prefixes to reference parent documents

When you have multiple records representing chunks of a single document, use a common ID prefix to reference the document.

You can use any prefix pattern you like, but make sure you use the same pattern for all child records of a document. For example, the following are all valid IDs for the first chunk of doc1, each using a different prefix pattern:

  • doc1#chunk1
  • doc1_chunk1
  • doc1___chunk1
  • doc1:chunk1
  • doc1chunk1

Prefixes can also be multi-level. For example, doc1#v1#chunk1 and doc1#v2#chunk1 can represent different versions of the first chunk of doc1.

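Pinecone does not validate ID prefixes on upsert or update, so choose a unique, consistent delimiter that does not appear anywhere else in your IDs. The following example creates a serverless index and upserts four chunks of doc1 that share the doc1# prefix: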

from pinecone import Pinecone, ServerlessSpec

pc = Pinecone(api_key="YOUR_API_KEY")

pc.create_index(
  name="serverless-index",
  dimension=8,
  metric="cosine",
  spec=ServerlessSpec(
    cloud="aws",
    region="us-east-1"
  )
)

index = pc.Index("serverless-index")

index.upsert(
  vectors=[
    {"id": "doc1#chunk1", "values": [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]},
    {"id": "doc1#chunk2", "values": [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]},
    {"id": "doc1#chunk3", "values": [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]},
    {"id": "doc1#chunk4", "values": [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]}
  ],
  namespace="ns1"
)
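
One way to keep the delimiter consistent is to build every ID through a single helper. The sketch below is illustrative only: make_record_id, make_prefix, and DELIMITER are hypothetical names, not part of the Pinecone SDK, and rejecting components that contain the delimiter is an assumption about how you might enforce the convention on the client side, since Pinecone does not validate prefixes.

```python
DELIMITER = "#"  # assumed delimiter; pick one that never appears in your IDs


def make_record_id(*parts: str) -> str:
    """Join ID components (document, optional version, chunk) with DELIMITER."""
    if not parts:
        raise ValueError("at least one ID component is required")
    for part in parts:
        if not part:
            raise ValueError("ID components must be non-empty")
        if DELIMITER in part:
            raise ValueError(
                f"component {part!r} contains the delimiter {DELIMITER!r}"
            )
    return DELIMITER.join(parts)


def make_prefix(*parts: str) -> str:
    """Build a prefix for list/delete calls; it always ends with DELIMITER."""
    return make_record_id(*parts) + DELIMITER


print(make_record_id("doc1", "chunk1"))        # doc1#chunk1
print(make_record_id("doc1", "v2", "chunk1"))  # doc1#v2#chunk1
print(make_prefix("doc1", "v2"))               # doc1#v2#
```

Building prefixes with the same helper that builds IDs means a later change of delimiter only has to be made in one place.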

List all record IDs for a parent document

When all records related to a document use a common ID prefix, you can use the list operation (Python or REST API) or the listPaginated operation (Node.js) with the namespace and prefix parameters to fetch the IDs of the records.

The list and listPaginated operations are supported only for serverless indexes.

from pinecone import Pinecone

pc = Pinecone(api_key='YOUR_API_KEY')
index = pc.Index("serverless-index")

# To iterate over all result pages using a generator function
for ids in index.list(prefix='doc1#', namespace='ns1'):
    print(ids)

# To manually control pagination, use list_paginated().
# See https://docs.pinecone.io/docs/get-record-ids#paginate-through-results for details.

# Response:
# ['doc1#chunk1', 'doc1#chunk2', 'doc1#chunk3', 'doc1#chunk4']

When there are additional IDs to return, the response includes a pagination_token that you can use to get the next batch of IDs. For more details, see Paginate through list results.

With the record IDs, you can then use the fetch operation to fetch the content of the records.
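
As a rough sketch of that list-then-fetch flow, the hypothetical helper fetch_document_chunks below collects the records for one parent document. It assumes that index.list yields batches of IDs, as in the example above, and that the response from index.fetch exposes a vectors mapping of ID to record, as in recent versions of the Python SDK; check this against the SDK version you use.

```python
def fetch_document_chunks(index, doc_prefix, namespace):
    """Return {record_id: record} for every record whose ID starts with doc_prefix.

    `index` is expected to be a Pinecone index handle, e.g. pc.Index("...").
    """
    records = {}
    # index.list() yields one batch (list) of IDs per page of results.
    for ids in index.list(prefix=doc_prefix, namespace=namespace):
        if not ids:
            continue
        response = index.fetch(ids=list(ids), namespace=namespace)
        # Assumption: the fetch response exposes a `vectors` dict keyed by ID.
        records.update(response.vectors)
    return records
```

For example, fetch_document_chunks(index, "doc1#", "ns1") would return the records for every chunk of doc1 in the ns1 namespace. Fetching one batch per page keeps each fetch request no larger than a page of list results.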

Delete all records for a parent document

To delete all records representing chunks of a single document, first list the record IDs based on their common ID prefix, and then delete the records by ID:

from pinecone import Pinecone

pc = Pinecone(api_key='YOUR_API_KEY')
index = pc.Index("serverless-index")

# Each iteration yields one page of IDs that share the prefix
for ids in index.list(prefix='doc1#', namespace='ns1'):
    print(ids) # ['doc1#chunk1', 'doc1#chunk2', 'doc1#chunk3', 'doc1#chunk4']
    index.delete(ids=ids, namespace='ns1')

Work with multi-level ID prefixes

The examples above are based on a simple ID prefix (doc1#), but it’s also possible to work with more complex, multi-level prefixes.

For example, let’s say you use the prefix pattern document#version#chunk to distinguish versions of a document. To delete all records for one version of a document, first list the record IDs based on the relevant document#version# prefix, including the trailing delimiter, and then delete the records by ID:

from pinecone import Pinecone

pc = Pinecone(api_key='YOUR_API_KEY')
index = pc.Index("serverless-index")

# The trailing '#' keeps the prefix from also matching versions such as v10
for ids in index.list(prefix='doc1#v1#', namespace='ns1'):
    print(ids) # ['doc1#v1#chunk1', 'doc1#v1#chunk2', 'doc1#v1#chunk3']
    index.delete(ids=ids, namespace='ns1')
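
The trailing delimiter in a version prefix matters once version numbers reach two digits. The snippet below is a local illustration only: it does not call Pinecone, it assumes that prefix matching behaves like a plain string starts-with comparison, and the IDs and the match_prefix helper are made up for the example.

```python
ids = [
    "doc1#v1#chunk1",
    "doc1#v1#chunk2",
    "doc1#v10#chunk1",
    "doc1#v2#chunk1",
]


def match_prefix(all_ids, prefix):
    """Return the IDs that start with prefix (stand-in for a prefix listing)."""
    return [record_id for record_id in all_ids if record_id.startswith(prefix)]


print(match_prefix(ids, "doc1#v1"))
# ['doc1#v1#chunk1', 'doc1#v1#chunk2', 'doc1#v10#chunk1']  (v10 matched too)

print(match_prefix(ids, "doc1#v1#"))
# ['doc1#v1#chunk1', 'doc1#v1#chunk2']
```

Ending the prefix with the delimiter restricts the match to exactly one version.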

However, if you wanted to delete all records across all versions of a document, you would list the record IDs based on the doc1# part of the prefix that is common to all versions and then delete the records by ID:

from pinecone import Pinecone

pc = Pinecone(api_key='YOUR_API_KEY')
index = pc.Index("serverless-index")

for ids in index.list(prefix='doc1#', namespace='ns1'):
    print(ids) # ['doc1#v1#chunk1', 'doc1#v1#chunk2', 'doc1#v1#chunk3', 'doc1#v2#chunk1', 'doc1#v2#chunk2', 'doc1#v2#chunk3']
    index.delete(ids=ids, namespace='ns1')

RAG using pod-based indexes

Pod-based indexes do not support the list operation. Instead of using ID prefixes to reference parent documents, store a reference to the parent document as a metadata key-value pair on each record. If you later need to delete the records, you can pass a metadata filter expression to the delete operation.
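
A minimal sketch of this approach follows. The metadata key document_id, the ID format, and both helper names are hypothetical; the filter uses the $eq operator from Pinecone's metadata filtering syntax. The index argument is assumed to be a handle to a pod-based index obtained with pc.Index(...).

```python
def build_chunk_records(document_id, chunk_embeddings):
    """Attach the parent document's ID to each chunk record as metadata."""
    return [
        {
            "id": f"{document_id}-chunk{n}",  # any unique ID works here
            "values": values,
            "metadata": {"document_id": document_id},  # hypothetical key name
        }
        for n, values in enumerate(chunk_embeddings, start=1)
    ]


def delete_document(index, document_id, namespace):
    """Delete every record whose metadata references document_id."""
    index.delete(
        filter={"document_id": {"$eq": document_id}},
        namespace=namespace,
    )
```

You would pass the output of build_chunk_records("doc1", embeddings) to index.upsert(vectors=..., namespace="ns1"), and later call delete_document(index, "doc1", "ns1") to remove all chunks of that document in one request.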