After creating a Pinecone index, you can start inserting vector embeddings and metadata into the index.

Inserting records

  1. Create a client instance and target an index:
import pinecone

pinecone.init(api_key="YOUR_API_KEY", environment="YOUR_ENVIRONMENT")
index = pinecone.Index("pinecone-index")
  1. Use the upsert operation to write records into the index:
# Insert sample data (5 8-dimensional vectors)
    ("A", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1]),
    ("B", [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2]),
    ("C", [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3]),
    ("D", [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4]),
    ("E", [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5])

Immediately after the upsert response is received, records may not be visible to queries yet. This is because Pinecone is eventually consistent. In most situations, you can check if the records have been received by checking for the record counts returned by describe_index_stats() to be updated. Keep in mind that if you have multiple replicas, they may not all become consistent at the same time.


Batching upserts

For clients upserting larger amounts of data, you should insert data into an index in batches of 100 vectors or fewer over multiple upsert requests.


import random
import itertools

def chunks(iterable, batch_size=100):
    """A helper function to break an iterable into chunks of size batch_size."""
    it = iter(iterable)
    chunk = tuple(itertools.islice(it, batch_size))
    while chunk:
        yield chunk
        chunk = tuple(itertools.islice(it, batch_size))

vector_dim = 128
vector_count = 10000

# Example generator that generates many (id, vector) pairs
example_data_generator = map(lambda i: (f'id-{i}', [random.random() for _ in range(vector_dim)]), range(vector_count))

# Upsert data with 100 vectors per upsert request
for ids_vectors_chunk in chunks(example_data_generator, batch_size=100):
    index.upsert(vectors=ids_vectors_chunk)  # Assuming `index` defined elsewhere

Sending upserts in parallel

By default, all vector operations sent using the Python client block until the response has been received. But using our client they can be made asynchronous. For the Batching Upserts example this can be done as follows:

# Upsert data with 100 vectors per upsert request asynchronously
# - Create pinecone.Index with pool_threads=30 (/legacy/limits to 30 simultaneous requests)
# - Pass async_req=True to index.upsert()
with pinecone.Index('example-index', pool_threads=30) as index:
    # Send requests in parallel
    async_results = [
        index.upsert(vectors=ids_vectors_chunk, async_req=True)
        for ids_vectors_chunk in chunks(example_data_generator, batch_size=100)
    # Wait for and retrieve responses (this raises in case of error)
    [async_result.get() for async_result in async_results]

Pinecone is thread-safe, so you can launch multiple read requests and multiple write requests in parallel. Launching multiple requests can help with improving your throughput. However, reads and writes can’t be performed in parallel, therefore writing in large batches might affect query latency and vice versa.

If you experience slow uploads, see Performance tuning for advice.

Partitioning an index into namespaces

You can organize the records added to an index into partitions, or “namespaces,” to limit queries and other vector operations to only one such namespace at a time. For more information, see: Namespaces.

Inserting records with metadata

You can insert records that contain metadata as key-value pairs.

You can then use the metadata to filter for those criteria when sending the query. Pinecone will search for similar vector embeddings only among those items that match the filter. For more information, see: Metadata Filtering.

    ("A", [0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1], {"genre": "comedy", "year": 2020}),
    ("B", [0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2, 0.2], {"genre": "documentary", "year": 2019}),
    ("C", [0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3, 0.3], {"genre": "comedy", "year": 2019}),
    ("D", [0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4], {"genre": "drama"}),
    ("E", [0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5], {"genre": "drama"})

Upserting records with sparse values

Sparse vector values can be upserted alongside dense vector values.

index = pinecone.Index('example-index') 

upsert_response = index.upsert(
    {'id': 'vec1',
      'values': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
      'metadata': {'genre': 'drama'},
      'sparse_values': {
          'indices': [1, 5],
          'values': [0.5, 0.5]
    {'id': 'vec2',
      'values': [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9],
      'metadata': {'genre': 'action'},
      'sparse_values': {
          'indices': [5, 6],
          'values': [0.4, 0.5]


The following limitations apply to upserting records with sparse vectors:

  • You cannot upsert a record with sparse vector values without dense vector values.
  • Only s1 and p1 pod types using the dotproduct metric support querying sparse vectors. There is no error at upsert time: if you attempt to query any other pod type using sparse vectors, Pinecone returns an error.
  • You can only upsert sparse vector values of sizes up to 1000 non-zero values.
  • Indexes created before February 22, 2023 do not support sparse values.

Troubleshooting index fullness errors

When upserting data, you may receive the following error:

Index is full, cannot accept data.

New upserts may fail as the capacity becomes exhausted. While your index can still serve queries, you need to scale your environment to accommodate more vectors.

To resolve this issue, you can scale your index.