Data ingestion overview
To ingest data into an index, you can import from object storage or use the upsert operation.
To control costs when ingesting large datasets (10,000,000+ records), use import instead of upsert.
Import from object storage
Importing from object storage is the most efficient and cost-effective method to load large numbers of records into an index. You store your data as Parquet files in object storage, integrate your object storage with Pinecone, and then start an asynchronous, long-running operation that imports and indexes your records.
This feature is in public preview and available only on Standard and Enterprise plans.
Upsert
For ongoing ingestion into an index, either one record at a time or in batches, use the upsert operation. Batch upserting can improve throughput and is a good option for larger numbers of records if you cannot work around import’s current limitations.
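As a rough illustration of batch upserting, the sketch below splits a stream of records into fixed-size batches and hands each batch to an upsert callable. The helper names (`chunked`, `upsert_in_batches`), the `upsert_fn` parameter, and the default batch size of 100 are assumptions made for this example and are not part of any SDK. With the Pinecone Python SDK, `upsert_fn` could plausibly be a small wrapper such as `lambda batch: index.upsert(vectors=batch)`.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator


def chunked(records: Iterable[dict], batch_size: int) -> Iterator[list[dict]]:
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    # islice pulls up to batch_size items; an empty list ends the loop.
    while batch := list(islice(iterator, batch_size)):
        yield batch


def upsert_in_batches(
    upsert_fn: Callable[[list[dict]], object],  # hypothetical: sends one batch
    records: Iterable[dict],
    batch_size: int = 100,  # illustrative default, not a documented limit
) -> int:
    """Send records to upsert_fn one batch at a time; return the record count."""
    total = 0
    for batch in chunked(records, batch_size):
        upsert_fn(batch)
        total += len(batch)
    return total
```

Because the upsert call is injected, the same helper can be exercised with a stand-in function before pointing it at a real index, and the batch size can be tuned without touching the batching logic.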
Ingestion cost
- To understand how cost is calculated for imports, see Import cost.
- To understand how cost is calculated for upserts, see Upsert cost.
- For up-to-date pricing information, see Pricing.
Data freshness
Pinecone is eventually consistent, so there can be a slight delay before new or changed records are visible to queries. You can view index stats to check data freshness.
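One way to act on this is to poll the index stats until the record count reaches the number you expect. The sketch below is a minimal example under stated assumptions: `wait_until_visible`, `get_record_count`, and the timeout and interval defaults are hypothetical names and values chosen for illustration. With the Pinecone Python SDK, `get_record_count` could plausibly be `lambda: index.describe_index_stats().total_vector_count`.

```python
import time
from typing import Callable


def wait_until_visible(
    get_record_count: Callable[[], int],  # hypothetical: reads count from index stats
    expected_count: int,
    timeout_s: float = 30.0,       # illustrative default
    poll_interval_s: float = 1.0,  # illustrative default
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the reported record count reaches expected_count.

    Returns True once the count is reached, or False if the timeout
    elapses first.
    """
    deadline = clock() + timeout_s
    while True:
        if get_record_count() >= expected_count:
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval_s)
```

A record count only reveals newly added records. An update to an existing record does not change the count, so this check cannot confirm that changed records are visible.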