

To control costs when ingesting large datasets (10,000,000+ records), use import instead of upsert.

Import from object storage

Importing from object storage is the most efficient and cost-effective method to load large numbers of records into an index. You store your data as Parquet files in object storage, integrate your object storage with Pinecone, and then start an asynchronous, long-running operation that imports and indexes your records.
This feature is in public preview and available only on Standard and Enterprise plans.
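The import flow described above can be sketched with the Pinecone Python SDK. This is a minimal sketch, not a definitive implementation: the index name, object-storage URI, and API key are placeholders you supply, and `error_mode="CONTINUE"` asks the operation to skip records that fail rather than abort the whole import.

```python
def start_bulk_import(index_name: str, uri: str, api_key: str):
    """Kick off an asynchronous bulk import of Parquet files from object
    storage. Sketch assuming the `pinecone` Python SDK; all arguments
    are placeholders.
    """
    # Imported inside the function so the sketch stays self-contained.
    from pinecone import Pinecone

    pc = Pinecone(api_key=api_key)
    index = pc.Index(index_name)
    # The call returns quickly with an import operation you can poll;
    # the import and indexing themselves run in the background.
    return index.start_import(uri=uri, error_mode="CONTINUE")
```

Because the operation is asynchronous and long-running, the call does not block until indexing finishes; you poll the returned operation to track progress.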

Upsert

For ongoing ingestion into an index, either one record at a time or in batches, use the upsert operation. Batch upserting can improve throughput and is a good option for larger numbers of records when import's current limitations rule it out.
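Batch upserting amounts to splitting your records into fixed-size chunks and sending one upsert request per chunk. A minimal sketch, assuming `index` is a Pinecone index client with an `upsert` method; the batch size of 100 is an example, not a recommended value:

```python
from itertools import islice


def batches(records, size=100):
    """Yield successive fixed-size batches from an iterable of records."""
    it = iter(records)
    while batch := list(islice(it, size)):
        yield batch


def upsert_in_batches(index, records, batch_size=100):
    """Upsert records in batches to improve throughput over
    one-record-at-a-time requests (sketch)."""
    for batch in batches(records, batch_size):
        index.upsert(vectors=batch)
```

The chunking logic is independent of Pinecone, so you can reuse `batches` with any client.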

When you only need embeddings

Import and upsert both write vectors into Pinecone. For workflows where you only need vectors from hosted models (for example, to embed offline and upsert later), call the embed operation through Pinecone Inference; it turns text into vectors without writing to an index. This differs from upsert_records on an index with integrated embedding, where each request embeds and stores records in one step. To see how embedding consumption appears in billing and usage reports, see Embedding tokens.
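The two paths can be sketched side by side. This is a hedged sketch assuming the Pinecone Python SDK's `inference.embed` and index `upsert_records` methods; the model name and namespace are examples, not requirements:

```python
def embed_only(pc, texts):
    """Standalone embedding: turn text into vectors without writing to
    an index (sketch; `pc` is a Pinecone client, the model name is an
    example hosted model)."""
    return pc.inference.embed(
        model="multilingual-e5-large",
        inputs=texts,
        parameters={"input_type": "passage"},
    )


def embed_and_store(index, records):
    """Integrated embedding: each request embeds and stores records in
    one step on an index configured with an embedding model (sketch;
    the namespace is an example)."""
    index.upsert_records("example-namespace", records)
```

With `embed_only` you keep the vectors and decide later whether and where to upsert them; with `embed_and_store` the index does both in a single request.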

Ingestion cost

  • To understand how cost is calculated for imports, see Import cost.
  • To understand how cost is calculated for upserts, see Write unit pricing.
  • For up-to-date pricing information, see Pricing.

Data freshness

Pinecone is eventually consistent, so there can be a slight delay before new or changed records are visible to queries. You can view index stats to check data freshness.
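Because writes are eventually consistent, a common pattern is to poll index stats until the reported vector count reflects your writes. A minimal sketch: `get_count` is any zero-argument callable you supply, for example `lambda: index.describe_index_stats().total_vector_count` with the Pinecone Python SDK.

```python
import time


def wait_for_vector_count(get_count, target, timeout_s=30.0, poll_s=1.0):
    """Poll an index-stats callable until the reported vector count
    reaches `target`, or give up after `timeout_s` seconds. Returns
    True if the target was reached in time (sketch)."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_count() >= target:
            return True
        time.sleep(poll_s)
    return False
```

Polling is only a freshness check for tests and pipelines; steady-state application code should simply tolerate the slight delay.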