To ingest data into an index, you can import from object storage or use the upsert operation.

Import from object storage

Importing from object storage is the most efficient and cost-effective method for loading large numbers of records into an index. You store your data as Parquet files in object storage, integrate your object storage with Pinecone, and then start an asynchronous, long-running operation that imports and indexes your records.

This feature is in public preview and available only on Standard and Enterprise plans.
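For example, once your S3 bucket is integrated with Pinecone, you can start an import with the Python SDK's start_import operation. A minimal sketch; the index name, bucket URI, and integration ID are placeholders, and the error_mode values shown are an assumption:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

# Start an asynchronous import of Parquet files from an S3 prefix.
# The URI and integration ID below are placeholders.
response = index.start_import(
    uri="s3://example-bucket/imports/",
    integration_id="your-integration-id",
    error_mode="CONTINUE",  # skip failing records rather than aborting
)
print(response.id)  # operation ID for tracking the import
```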

Import limitations

  • Import does not support integrated embedding.
  • Import only supports AWS S3 as a data source.
  • You cannot import data from S3 Express One Zone storage.
  • You cannot import data into existing namespaces.
  • Each import request can import up to 1 TB of data into a maximum of 100 namespaces, with no more than 10 GB per file and no more than 100,000 files per import.
  • Each import takes at least 10 minutes to complete, so plan to poll the operation for status (see the sketch after this list).
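Because imports are long-running, you typically poll the operation until it reaches a terminal state. A minimal sketch, assuming the Python SDK's describe_import method; the operation ID, the percent_complete field, and the exact status strings are assumptions:

```python
import time

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

operation_id = "101"  # placeholder: the ID returned by start_import

# Poll until the import reaches a terminal state; imports take at
# least 10 minutes, so a long polling interval is appropriate.
while True:
    op = index.describe_import(id=operation_id)
    print(op.status, op.percent_complete)
    if op.status in ("Completed", "Failed", "Cancelled"):
        break
    time.sleep(60)
```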

Import cost

Upsert

For ongoing ingestion into an index, either one record at a time or in batches, use the upsert operation. Batch upserting can improve throughput and is a good option for larger numbers of records (up to 1000 per batch) if you cannot work around import’s current limitations.
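For instance, a simple batching loop keeps each request under the limits listed below. A minimal sketch using the Python SDK's upsert operation; the index name, namespace, vector dimension, and batch size are placeholder choices:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

# Example records as (id, values, metadata) tuples; values are illustrative.
records = [(f"rec-{i}", [0.1] * 1536, {"source": "demo"}) for i in range(2500)]

# Upsert in batches; each request must stay under 2 MB and 1000 records.
BATCH_SIZE = 200
for start in range(0, len(records), BATCH_SIZE):
    batch = records[start:start + BATCH_SIZE]
    index.upsert(vectors=batch, namespace="example-namespace")
```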

Upsert limits

Metric                                    Limit
Max upsert size                           2 MB or 1000 records
Max metadata size per record              40 KB
Max length for a record ID                512 characters
Max dimensionality for dense vectors      20,000
Max non-zero values for sparse vectors    1000
Max dimensionality for sparse vectors     4.2 billion
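For reference, here is what a single record with both dense and sparse values can look like within these limits. A sketch assuming the Python SDK's dictionary record format; the index name, namespace, and all values are placeholders:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

# One hybrid record: the ID stays under 512 characters, metadata well
# under 40 KB, and the sparse vector has far fewer than 1000 non-zero values.
record = {
    "id": "rec-1",
    "values": [0.1] * 1536,  # dense vector (max dimensionality: 20,000)
    "sparse_values": {
        "indices": [10, 45, 1602],  # positions of the non-zero values
        "values": [0.5, 0.3, 0.2],
    },
    "metadata": {"source": "demo"},
}
index.upsert(vectors=[record], namespace="example-namespace")
```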

Upsert cost

Data freshness

Pinecone is eventually consistent, so there can be a slight delay before new or changed records are visible to queries. You can view index stats to check data freshness.
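One way to check is to compare index statistics against the number of records you expect. A minimal sketch using the Python SDK's describe_index_stats operation; the index name is a placeholder:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

# Stats reflect records once they are visible to queries, so comparing
# counts before and after a write shows when the write has landed.
stats = index.describe_index_stats()
print(stats.total_vector_count)  # total records across all namespaces
print(stats.namespaces)          # per-namespace record counts
```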