Every record in an index must contain an ID and a dense or sparse vector. In addition, you can include metadata key-value pairs to store related information or context. When you search the index, you can then include a metadata filter to limit the search to records matching the filter expression.

Search with a metadata filter

The following code searches for the 3 records that are most semantically similar to the query text and whose "category" metadata field has the value "digestive system".

Searching with text is supported only for indexes with integrated embedding.

from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# To get the unique host for an index,
# see https://docs.pinecone.io/guides/manage-data/target-an-index
index = pc.Index(host="INDEX_HOST")

filtered_results = index.search(
    namespace="example-namespace",
    query={
        "inputs": {"text": "Disease prevention"},
        "top_k": 3,
        "filter": {"category": "digestive system"},
    },
    fields=["category", "chunk_text"],
)

print(filtered_results)

Metadata filter expressions

Pinecone’s filtering query language is based on MongoDB’s query and projection operators. Pinecone currently supports a subset of those selectors:

| Filter | Description | Supported types |
| --- | --- | --- |
| $eq | Matches with metadata values that are equal to a specified value. Example: {"genre": {"$eq": "documentary"}} | Number, string, boolean |
| $ne | Matches with metadata values that are not equal to a specified value. Example: {"genre": {"$ne": "drama"}} | Number, string, boolean |
| $gt | Matches with metadata values that are greater than a specified value. Example: {"year": {"$gt": 2019}} | Number |
| $gte | Matches with metadata values that are greater than or equal to a specified value. Example: {"year": {"$gte": 2020}} | Number |
| $lt | Matches with metadata values that are less than a specified value. Example: {"year": {"$lt": 2020}} | Number |
| $lte | Matches with metadata values that are less than or equal to a specified value. Example: {"year": {"$lte": 2020}} | Number |
| $in | Matches with metadata values that are in a specified array. Example: {"genre": {"$in": ["comedy", "documentary"]}} | String, number |
| $nin | Matches with metadata values that are not in a specified array. Example: {"genre": {"$nin": ["comedy", "documentary"]}} | String, number |
| $exists | Matches with the specified metadata field. Example: {"genre": {"$exists": true}} | Number, string, boolean |
| $and | Joins query clauses with a logical AND. Example: {"$and": [{"genre": {"$eq": "drama"}}, {"year": {"$gte": 2020}}]} | - |
| $or | Joins query clauses with a logical OR. Example: {"$or": [{"genre": {"$eq": "drama"}}, {"year": {"$gte": 2020}}]} | - |
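
The operators in the table can be nested to build compound filters. The following is a minimal sketch, assuming the same index.search call shape used in the search example earlier on this page. The helper names build_genre_year_filter and search_with_filter are hypothetical, and the "genre" and "year" fields come from the table's examples rather than from any real index.

```python
from typing import Any


def build_genre_year_filter(genres: list[str], min_year: int) -> dict[str, Any]:
    """Combine $in and $gte under a top-level $and (hypothetical helper)."""
    return {
        "$and": [
            {"genre": {"$in": genres}},
            {"year": {"$gte": min_year}},
        ]
    }


def search_with_filter(
    index: Any, text: str, metadata_filter: dict[str, Any], top_k: int = 3
) -> Any:
    """Run a text search restricted by a metadata filter.

    `index` is assumed to be a Pinecone index object for an index with
    integrated embedding, created as in the search example on this page.
    """
    return index.search(
        namespace="example-namespace",
        query={
            "inputs": {"text": text},
            "top_k": top_k,
            "filter": metadata_filter,
        },
        # Assumed field names, taken from the table's examples.
        fields=["genre", "year"],
    )
```

For instance, build_genre_year_filter(["comedy", "documentary"], 2020) yields a filter that limits the search to records whose genre is comedy or documentary and whose year is 2020 or later.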

A metadata field can also hold a list of strings. For example, the following record metadata has a "genre" field with a list of strings:

{ "genre": ["comedy", "documentary"] }

This means "genre" takes on both values, so requests with any of the following filters will match the record:

{ "genre": "comedy" }

{ "genre": { "$in": ["documentary", "action"] } }

{ "$and": [{ "genre": "comedy" }, { "genre": "documentary" }] }

However, requests with the following filter will not match, because "genre" does not include "drama":

{ "$and": [{ "genre": "comedy" }, { "genre": "drama" }] }
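
The matching behavior for a list-valued field can be reproduced locally. The following is a small sketch that mimics only the semantics described here; it is not Pinecone's implementation. The function name matches is hypothetical, the sketch covers only $eq, $in, $and, $or, and the bare-value shorthand, and it assumes the shorthand {"genre": "comedy"} behaves like an $eq comparison.

```python
from typing import Any


def _as_list(value: Any) -> list:
    """Treat a scalar metadata value as a one-element list."""
    return value if isinstance(value, list) else [value]


def matches(metadata: dict[str, Any], flt: dict[str, Any]) -> bool:
    """Local illustration only: $eq, $in, $and, $or, and bare-value shorthand."""
    for key, cond in flt.items():
        if key == "$and":
            if not all(matches(metadata, clause) for clause in cond):
                return False
        elif key == "$or":
            if not any(matches(metadata, clause) for clause in cond):
                return False
        else:
            if key not in metadata:
                return False
            stored = _as_list(metadata[key])
            # Assumption: a bare value is shorthand for an $eq comparison.
            ops = cond if isinstance(cond, dict) else {"$eq": cond}
            for op, operand in ops.items():
                if op == "$eq":
                    # A list-valued field matches if it contains the value.
                    ok = operand in stored
                elif op == "$in":
                    # Matches if any listed value is among the stored values.
                    ok = any(item in stored for item in operand)
                else:
                    raise ValueError(f"operator not covered by this sketch: {op}")
                if not ok:
                    return False
    return True


record = {"genre": ["comedy", "documentary"]}
print(matches(record, {"genre": "comedy"}))
print(matches(record, {"genre": {"$in": ["documentary", "action"]}}))
print(matches(record, {"$and": [{"genre": "comedy"}, {"genre": "documentary"}]}))
print(matches(record, {"$and": [{"genre": "comedy"}, {"genre": "drama"}]}))
```

The first three calls print True and the last prints False, mirroring the matching and non-matching filters above.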

Additionally, requests with the following filters will not match because the filters are invalid: a list cannot be used as a direct field value or as the operand of $eq. They will result in a compilation error:

# INVALID QUERY:
{"genre": ["comedy", "documentary"]}
# INVALID QUERY:
{"genre": {"$eq": ["comedy", "documentary"]}}
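
One way to catch these two shapes before sending a request is a client-side check. The following is a sketch under stated assumptions: validate_filter is a hypothetical helper, it detects only the two invalid shapes shown above, and it does not reproduce Pinecone's own validation or error message.

```python
from typing import Any

LOGICAL_OPERATORS = {"$and", "$or"}


def validate_filter(flt: dict[str, Any]) -> None:
    """Raise ValueError for the two invalid shapes shown above (hypothetical check)."""
    for key, cond in flt.items():
        if key in LOGICAL_OPERATORS:
            # Recurse into each clause of $and / $or.
            for clause in cond:
                validate_filter(clause)
        elif isinstance(cond, list):
            # e.g. {"genre": ["comedy", "documentary"]}
            raise ValueError(f"{key!r}: a list is not a valid direct value")
        elif isinstance(cond, dict) and isinstance(cond.get("$eq"), list):
            # e.g. {"genre": {"$eq": ["comedy", "documentary"]}}
            raise ValueError(f"{key!r}: $eq does not accept a list")
```

To match a record that has any of several values, use $in. To require all of them, join one equality clause per value with $and, as in the matching examples above.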