Importing from object storage is the most efficient and cost-effective method to load large numbers of records into an index. You store your data as Parquet files in object storage, integrate your object storage with Pinecone, and then start an asynchronous, long-running operation that imports and indexes your records.
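For example, a bulk import started from the Python SDK might look like the following sketch. It assumes a recent SDK version that exposes start_import and describe_import on the index object, plus an existing S3 integration; the bucket URI, integration ID, and index name are placeholders for your own setup.

```python
from pinecone import Pinecone, ImportErrorMode

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

# Start an asynchronous, long-running import of Parquet files from object storage.
# The URI and integration ID are placeholders for your own bucket and integration.
import_operation = index.start_import(
    uri="s3://example-bucket/records/",
    integration_id="your-storage-integration-id",
    error_mode=ImportErrorMode.CONTINUE,  # keep going if individual records fail
)

# The import runs in the background; check on it by ID.
print(index.describe_import(id=import_operation.id))
```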
Upserting in batches is another efficient way to ingest large numbers of records (up to 1000 per batch). Batch upserting is also a good option if you cannot work around bulk import’s current limitations.
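A minimal batching sketch with the Python SDK is shown below. The chunking helper, index name, and vector dimension are illustrative placeholders; the records are assumed to be (id, values) tuples.

```python
import itertools
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

def chunks(iterable, batch_size=200):
    """Yield successive fixed-size batches from an iterable of records."""
    it = iter(iterable)
    batch = tuple(itertools.islice(it, batch_size))
    while batch:
        yield batch
        batch = tuple(itertools.islice(it, batch_size))

# Placeholder data: (id, values) tuples with an assumed dimension of 1536.
example_records = ((f"id-{i}", [0.1] * 1536) for i in range(100_000))

# Upsert the records in batches well under the per-request limit.
for batch in chunks(example_records, batch_size=200):
    index.upsert(vectors=batch)
```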
Use the Python SDK with gRPC extras to run data operations such as upserts and queries over gRPC rather than HTTP for a modest performance improvement.
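As a sketch, switching to the gRPC client is mostly a matter of installing the gRPC extras and swapping the client class; the index name and vectors below are placeholders, and the extras name assumes a recent SDK release.

```python
# Install the SDK with gRPC extras first, e.g.:  pip install "pinecone[grpc]"
from pinecone.grpc import PineconeGRPC

pc = PineconeGRPC(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # data operations on this index now use gRPC

# Upserts and queries use the same interface as the HTTP client.
index.upsert(vectors=[("id-1", [0.1, 0.2, 0.3])])
print(index.query(vector=[0.1, 0.2, 0.3], top_k=3))
```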
To quickly ingest data when using the Python SDK, use the upsert_from_dataframe method. The method includes retry logic and a batch_size parameter, and it performs especially well with Parquet file data sets.
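A minimal sketch of this approach is below. It assumes the gRPC client, a placeholder index name, and a local Parquet file whose columns match the fields Pinecone expects (for example, an "id" column and a "values" column of vectors).

```python
import pandas as pd
from pinecone.grpc import PineconeGRPC

pc = PineconeGRPC(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

# Load records from a Parquet file; "records.parquet" is a placeholder path,
# and the file is assumed to contain the columns the index expects.
df = pd.read_parquet("records.parquet")

# upsert_from_dataframe batches the rows and retries failed batches.
index.upsert_from_dataframe(df, batch_size=500)
```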