While Pinecone optimizes vector search for efficiency and accuracy, you can enhance it further with the features below.

Namespaces

Use namespaces to divide records within an index into separate groups. This has two main benefits:

  • Faster queries: Every query targets one namespace in an index. When you divide records into namespaces in a logical way, you speed up queries by ensuring only relevant records are scanned. The same applies to fetching records, listing record IDs, and other data operations.

  • Multitenancy: When you need to isolate data between customers, use one namespace per customer and target each customer’s writes and queries to their dedicated namespace.
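For example, here is a minimal sketch of per-tenant isolation with the Python SDK. The index name, namespace name, and four-dimensional vectors are illustrative placeholders, not values from this page:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

# Write each customer's records to that customer's namespace.
index.upsert(
    vectors=[{"id": "doc-1", "values": [0.1, 0.2, 0.3, 0.4]}],
    namespace="customer-a",
)

# Queries target a single namespace, so only customer-a's
# records are scanned here.
results = index.query(
    vector=[0.1, 0.2, 0.3, 0.4],
    top_k=3,
    namespace="customer-a",
)
```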


Metadata filtering

Every record in an index must contain an ID and a dense vector embedding. In addition, you can include metadata key-value pairs to store additional information or context. When you query the index, you can then filter by metadata to ensure only relevant records are scanned. This can reduce latency and improve the accuracy of results.

For example, if an index contains records about books, you could use a metadata field to associate each record with a genre, like "genre": "fiction" or "genre": "poetry". When you query the index, you could then use a metadata filter to limit your search to records related to a specific genre.
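As a sketch of that workflow with the Python SDK (the index name, vector dimension, and metadata values below are illustrative assumptions):

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("books-index")  # placeholder index name

# Attach metadata to each record at upsert time.
index.upsert(vectors=[{
    "id": "book-1",
    "values": [0.1, 0.2, 0.3, 0.4],
    "metadata": {"genre": "fiction"},
}])

# Filter at query time so only fiction records are considered.
results = index.query(
    vector=[0.1, 0.2, 0.3, 0.4],
    top_k=5,
    filter={"genre": {"$eq": "fiction"}},
    include_metadata=True,
)
```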


Data ingestion

When loading large numbers of records into an index, consider the following methods:

  • Import from object storage: When using a serverless index, the import operation is the most efficient and cost-effective way to ingest large numbers of records. Store your data as Parquet files in object storage, integrate your object storage with Pinecone, and then start an asynchronous, long-running operation that imports and indexes your records.

  • Upsert data in batches: When using a pod-based index, batch upsert is the most efficient way to ingest large numbers of records (up to 1000 per batch). Batch upserting is also an efficient option if you are using a serverless index and cannot work around bulk import’s current limitations.
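For the batch-upsert path, a minimal sketch with the Python SDK might look like the following; batch_upsert is a hypothetical helper, and the index name, batch size, and record shapes are assumptions:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("example-index")  # placeholder index name

def batch_upsert(records, batch_size=200):
    """Upsert records in fixed-size batches (hypothetical helper).

    `records` is a list of {"id": ..., "values": [...]} dicts;
    each request stays well under the per-batch limit.
    """
    for start in range(0, len(records), batch_size):
        index.upsert(vectors=records[start : start + batch_size])

# Example usage with placeholder records.
batch_upsert([
    {"id": f"doc-{i}", "values": [0.1, 0.2, 0.3, 0.4]}
    for i in range(1000)
])
```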


Embedding

Dense vector embeddings are the basic unit of data in Pinecone and what Pinecone was designed to store and search. A dense vector embedding is a list of numbers that represents the semantics of data such as text, images, and audio recordings. To transform data into this format, you use an embedding model.

Pinecone hosts the multilingual-e5-large embedding model, making it easy to manage your vector storage and search process on a single platform, but there are many other embedding models available.
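For instance, generating an embedding with the hosted model via the Python SDK's inference API might look like this sketch; the input text is illustrative, and the call assumes a recent SDK version that exposes pc.inference.embed:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Embed a passage with Pinecone's hosted model; use
# input_type="query" when embedding search queries instead.
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["The quick brown fox jumps over the lazy dog."],
    parameters={"input_type": "passage"},
)
print(len(embeddings[0].values))  # a 1024-dimensional dense vector
```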


Reranking

Reranking is used as part of a two-stage vector retrieval process to improve the quality of results. You first query an index for a given number of relevant results, and then you send the query and results to a reranking model. The reranking model scores the results based on their semantic relevance to the query and returns a new, more accurate ranking. This approach is one of the simplest methods for improving result quality in retrieval-augmented generation (RAG) pipelines.

Pinecone hosts the bge-reranker-v2-m3 reranking model, making it easy to manage two-stage vector retrieval on a single platform, but there are many other reranking models available.
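A minimal second-stage sketch with the Python SDK's inference API follows; the query and documents are illustrative, and the call assumes a recent SDK version that exposes pc.inference.rerank:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Stage two: score first-stage results against the original query.
reranked = pc.inference.rerank(
    model="bge-reranker-v2-m3",
    query="Tell me about the tech company Apple",
    documents=[
        {"id": "doc-1", "text": "Apples are a popular fruit."},
        {"id": "doc-2", "text": "Apple designs the iPhone and the Mac."},
    ],
    top_n=2,
    return_documents=True,
)

# Results come back sorted by relevance score, highest first.
for row in reranked.data:
    print(row.index, row.score)
```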


Hybrid search

Pinecone records can contain two types of embeddings:

  • Dense vector embeddings are used for semantic search, which returns the most similar results according to a specific distance metric.

  • Sparse vector embeddings are used for keyword search, where the relevance of text documents is computed based on the number of keyword matches, their frequency, and other factors.

Hybrid search uses these two types of embeddings in a single query to combine the strengths of both semantic and keyword searching.
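As a sketch, a hybrid query with the Python SDK passes both embeddings in one request; this assumes an index that supports sparse values (for example, one using the dotproduct metric), and the index name and vector values are placeholders:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index("hybrid-index")  # placeholder; must support sparse values

# One query combines semantic (dense) and keyword (sparse) signals.
results = index.query(
    vector=[0.1, 0.2, 0.3, 0.4],
    sparse_vector={"indices": [10, 45, 160], "values": [0.5, 0.5, 0.2]},
    top_k=5,
)
```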

