Search documents - Pinecone Docs

# pip install --upgrade pinecone
import os
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
index = pc.preview.index(name="articles")

NAMESPACE = "example-namespace"

# BM25 token matching
response = index.documents.search(
    namespace=NAMESPACE,
    top_k=10,
    score_by=[{"type": "text", "field": "body", "query": "machine learning"}],
    include_fields=["title", "body", "category", "year"],
)
for match in response.matches:
    print(match._id, match._score, getattr(match, "title", ""))

# Lucene query string
response = index.documents.search(
    namespace=NAMESPACE,
    top_k=10,
    score_by=[{"type": "query_string", "query": "title:(quantum) OR body:(machine learning)"}],
    include_fields=["title", "body"],
)

# Dense vector ranking with phrase-match filter
query_vector = [0.12, 0.34, 0.56]  # replace with your actual query vector
response = index.documents.search(
    namespace=NAMESPACE,
    top_k=10,
    score_by=[{
        "type": "dense_vector",
        "field": "embedding",
        "values": query_vector,
    }],
    filter={"body": {"$match_phrase": "machine learning"}},
    include_fields=["title", "body"],
)

// Status: 200 OK
{
  "matches": [
    {
      "_id": "doc1",
      "_score": 0.8234,
      "title": "Machine learning in 2024",
      "body": "Machine learning models are revolutionizing natural language processing",
      "category": "technology",
      "year": 2024
    }
  ],
  "namespace": "example-namespace",
  "usage": { "read_units": 1 }
}

POST /namespaces/{namespace}/documents

Full-text search is in public preview and uses API version 202601-alpha. APIs may continue to evolve before general availability.

Searches schema-defined documents in a namespace. A request includes a score_by array selecting one of the following scoring types:

type: "text" — BM25 token matching on a single text field. Multi-word queries use OR-style matching (case-insensitive). For exact-phrase ranking, use query_string with quoted terms.
type: "query_string" — Lucene query syntax. Supports boolean operators, phrase prefix matching, boosting, and cross-field queries. See the query syntax reference.
type: "dense_vector" — dense vector similarity ranking against a dense_vector field.
type: "sparse_vector" — sparse vector similarity ranking against a sparse_vector field.
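As a rough illustration of the difference between the first two scoring types, the sketch below builds both clause shapes as plain dictionaries. The helper names are hypothetical and not part of the Pinecone SDK; the clause keys mirror the request samples on this page, and the quoted-phrase form is assumed to follow standard Lucene syntax.

```python
def text_clause(field: str, query: str) -> dict:
    """BM25 token matching on one field; multi-word queries match OR-style."""
    return {"type": "text", "field": field, "query": query}


def phrase_clause(field: str, phrase: str) -> dict:
    """Exact-phrase ranking: a query_string clause with the phrase in double quotes."""
    escaped = phrase.replace('"', '\\"')  # escape embedded double quotes
    return {"type": "query_string", "query": f'{field}:"{escaped}"'}


# Ranks documents containing "machine" OR "learning" in body.
or_style = text_clause("body", "machine learning")

# Ranks documents containing the exact phrase "machine learning" in body.
exact = phrase_clause("body", "machine learning")

print(or_style)
print(exact)
```

Either dictionary can be passed as the single element of score_by.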

Any scoring method can be combined with metadata filters, including the text match operators $match_phrase, $match_all, and $match_any and the logical operators $and, $or, and $not. Filters are applied before scoring, so the search only considers documents that match the filter. The scoring-only operators, phrase slop ("phrase"~N), term boosting (^N), and phrase prefix ("phrase pre"*), are available in query_string scoring but cannot be used inside filter.

include_fields defaults to [] (returns only _id and _score); use ["*"] to return all stored fields.
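The split between scoring-only operators and filter operators might look like the following request body. This is a sketch under assumptions: the $and operator is assumed to take a list of sub-filters and $eq is assumed to compare a metadata value, neither of which is shown in the samples on this page. The category field comes from the sample response.

```python
import json

# Scoring-only operators (boost ^N, slop ~N) belong in the query_string clause.
score_by = [{
    "type": "query_string",
    "query": 'title:(quantum)^2 OR body:"machine learning"~2',
}]

# The filter narrows the candidate set before scoring and uses plain text only.
# Assumption: $and takes a list of sub-filters; $eq compares a metadata value.
search_filter = {
    "$and": [
        {"body": {"$match_phrase": "machine learning"}},
        {"category": {"$eq": "technology"}},
    ]
}

request_body = {
    "top_k": 10,
    "score_by": score_by,
    "filter": search_filter,
    "include_fields": ["*"],  # all stored fields instead of only _id and _score
}

print(json.dumps(request_body, indent=2))
```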

A single search request ranks by one scoring type. Multi-field BM25 is supported: pass multiple text clauses (one per field) or a single query_string clause whose query targets several fields. In 202601-alpha every contributing field is weighted equally; there is no per-clause weight parameter. To combine BM25 ranking with dense_vector or sparse_vector ranking, either restrict the dense (or sparse) search with a text-match filter ($match_phrase, $match_all, $match_any) on the lexical field, or run separate searches and merge the results client-side.
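For the client-side merge, this page does not prescribe a method. One common choice is reciprocal rank fusion (RRF), sketched below as an assumption rather than Pinecone's recommendation. The function name, the constant k=60, and the sample ID lists are all illustrative; in practice each list would come from something like [m._id for m in response.matches].

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60, top_k=10):
    """Merge ranked ID lists; each list contributes 1 / (k + rank) per ID."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), breaking ties by ID for stable output.
    ordered = sorted(fused.items(), key=lambda item: (-item[1], item[0]))
    return ordered[:top_k]


# Hypothetical results of one BM25 search and one dense_vector search.
bm25_ids = ["doc1", "doc3", "doc2"]
dense_ids = ["doc2", "doc1", "doc4"]

merged = reciprocal_rank_fusion([bm25_ids, dense_ids])
print([doc_id for doc_id, _ in merged])
```

RRF uses only rank positions, so it sidesteps the fact that BM25 scores and vector similarity scores are on different scales.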

Filters, including text match operators, are valid only on this endpoint. The POST /namespaces/{namespace}/documents/fetch endpoint is ID-only, and POST /namespaces/{namespace}/documents/delete accepts only ids or delete_all. To act on documents matching a metadata expression, search first to get IDs, then fetch or delete by ID.
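The search-then-delete pattern can be sketched as below. Because this page does not show the SDK's delete signature, the search and delete calls are passed in as callables; delete_matching and its parameters are hypothetical names, and the stand-in objects exist only so the sketch runs without an index.

```python
from types import SimpleNamespace
from typing import Callable, List


def delete_matching(search: Callable[[], object],
                    delete_by_ids: Callable[[List[str]], None],
                    batch_size: int = 100) -> int:
    """Run one search, then delete the matched IDs in batches. Returns the count."""
    response = search()
    ids = [match._id for match in response.matches]
    for start in range(0, len(ids), batch_size):
        delete_by_ids(ids[start:start + batch_size])
    return len(ids)


# Stand-ins for the real search and delete calls.
fake_response = SimpleNamespace(matches=[
    SimpleNamespace(_id="doc1"),
    SimpleNamespace(_id="doc2"),
    SimpleNamespace(_id="doc3"),
])
deleted_batches = []

count = delete_matching(lambda: fake_response, deleted_batches.append, batch_size=2)
print(count, deleted_batches)
```

Since top_k is capped at 10000, a larger matching set would need repeated search-and-delete passes until a search returns no matches.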

Authorizations

Api-Key (string, header, required)
An API Key is required to call Pinecone APIs. Get yours from the console.

Headers

X-Pinecone-Api-Version (string, required, default: 202601-alpha)
Required date-based version header.
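For callers not using the SDK, the two headers above could be assembled as in this sketch. Several things here are assumptions: the helper name, the placeholder host, the https scheme, the Content-Type header, and in particular the trailing /search path segment, which is inferred by analogy with the documents/fetch and documents/delete paths and is not confirmed by this page. The function only builds the request pieces and does not send anything.

```python
import json
import os
from urllib.parse import quote


def build_search_request(index_host: str, namespace: str, body: dict, api_key: str) -> dict:
    """Assemble method, URL, headers, and JSON payload for a raw HTTP search call."""
    return {
        "method": "POST",
        # Assumption: the search path mirrors documents/fetch and documents/delete.
        "url": f"https://{index_host}/namespaces/{quote(namespace, safe='')}/documents/search",
        "headers": {
            "Api-Key": api_key,
            "X-Pinecone-Api-Version": "202601-alpha",
            "Content-Type": "application/json",
        },
        "data": json.dumps(body).encode("utf-8"),
    }


request = build_search_request(
    index_host="your-index-host.example",  # placeholder, not a real host
    namespace="example-namespace",
    body={
        "top_k": 10,
        "score_by": [{"type": "text", "field": "body", "query": "machine learning"}],
    },
    api_key=os.environ.get("PINECONE_API_KEY", "YOUR_API_KEY"),
)
print(request["url"])
print(request["headers"]["X-Pinecone-Api-Version"])
```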

Path Parameters

namespace (string, required)
The namespace to search.

Body

application/json

The request for the search_documents operation.

score_by (object[], required)
The list of scoring methods to use for ranking documents. Required array length: 1 to 100 elements.

top_k (integer<int32>, required)
The number of top-ranked documents to return. Required range: 1 <= x <= 10000. Example: 10

include_fields (string[])
The document fields to include in the search results. Example: ["title", "content"]

filter (object)
A metadata filter expression to restrict the documents searched.
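The body constraints above can be checked client-side before a request is sent. The following is a minimal sketch; build_search_body is a hypothetical helper, not an SDK function, and it enforces only the limits stated in this section.

```python
def build_search_body(score_by, top_k, include_fields=None, search_filter=None):
    """Build a request body, enforcing the documented length and range limits."""
    if not 1 <= len(score_by) <= 100:
        raise ValueError("score_by must contain between 1 and 100 clauses")
    if isinstance(top_k, bool) or not isinstance(top_k, int):
        raise ValueError("top_k must be an integer")
    if not 1 <= top_k <= 10000:
        raise ValueError("top_k must satisfy 1 <= top_k <= 10000")

    body = {"score_by": list(score_by), "top_k": top_k}
    # Optional fields are omitted when not supplied, leaving the server defaults.
    if include_fields is not None:
        body["include_fields"] = list(include_fields)
    if search_filter is not None:
        body["filter"] = search_filter
    return body


body = build_search_body(
    score_by=[{"type": "text", "field": "body", "query": "machine learning"}],
    top_k=10,
    include_fields=["title", "body"],
)
print(body)
```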

Response

A successful search response.

The response for the search_documents operation.

matches (object[], required)
The matching documents, ordered from most to least similar.

namespace (string, required)
The namespace that was searched. Example: "my-namespace"

usage (object, required)
Usage information for the search_documents operation. Example: { "read_units": 5 }
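When working with the raw JSON response rather than the SDK objects, the fields described above can be read as in this sketch. The payload is adapted from the sample response on this page; treating every key other than _id and _score as a requested document field is an assumption based on that sample.

```python
import json

raw = """
{
  "matches": [
    {
      "_id": "doc1",
      "_score": 0.8234,
      "title": "Machine learning in 2024",
      "category": "technology",
      "year": 2024
    }
  ],
  "namespace": "example-namespace",
  "usage": { "read_units": 1 }
}
"""

response = json.loads(raw)

results = []
for match in response["matches"]:
    # Assumption: keys other than _id and _score are the requested document fields.
    fields = {k: v for k, v in match.items() if k not in ("_id", "_score")}
    results.append((match["_id"], match["_score"], fields))

read_units = response["usage"]["read_units"]

for doc_id, score, fields in results:
    print(doc_id, score, fields.get("title", ""))
print("read units:", read_units)
```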
