If you experience slow uploads or high query latencies, it might be because you are accessing Pinecone from your home network. Switch to a cloud environment instead, for example: EC2, GCE, Google Colab, GCP AI Platform Notebook, or SageMaker Notebook.
Consider deploying your application in the same environment as your Pinecone service.
If you’re batching queries, try reducing the number of queries per call to a single query vector. You can issue these single-vector calls in parallel and expect roughly the same performance as with batching, as shown in the sketch below.
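A minimal sketch of this pattern using the Pinecone Python client and a thread pool; the API key, index name, dimension, and query vectors are placeholders, and top_k is just an example value:

```python
from concurrent.futures import ThreadPoolExecutor

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")          # assumes the current Python client
index = pc.Index("example-index")              # placeholder index name

query_vectors = [[0.1] * 1536, [0.2] * 1536]   # placeholder query vectors

def run_query(vector):
    # One query vector per call; top_k and any filters depend on your use case.
    return index.query(vector=vector, top_k=10, include_values=False)

# Issue the single-vector queries concurrently instead of batching them in one call.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(run_query, query_vectors))
```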
For on-demand indexes, vector values are retrieved from object storage, so operations that return vector values (fetch requests, or queries with include_values=true) may have increased latency. If you don’t need the vector values, set include_values=false when querying, or use the query operation instead of fetch when you only need IDs or metadata. See Decrease latency for more details.
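For instance, a query that needs only IDs and metadata can skip vector values entirely. A sketch with placeholder names and values, again assuming the Pinecone Python client:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")        # placeholder index name

# Return matched IDs and metadata without pulling vector values from object storage.
results = index.query(
    vector=[0.1] * 1536,                 # placeholder query vector
    top_k=10,
    include_values=False,                # skip vector values to avoid the extra retrieval
    include_metadata=True,               # keep metadata if you need it
)

for match in results.matches:
    print(match.id, match.metadata)
```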