Context Engine is currently in alpha, under active development. The alpha release does not include the full capabilities of Context Engine. It provides document repository functionality with basic retrieval. Advanced query planning, intelligent query rewriting, and sophisticated context expansion will be available in future releases.
1. Install the SDK
Install the specific version of the Pinecone Python SDK that includes Context Engine support.
2. Create a repository
Create a repository with a schema that defines how to index your product data. The schema specifies which fields are embedded for semantic and lexical search (descriptions, reviews). All fields are filterable by default for precise matching. Base the schema on the structure of the JSON documents you’ll be upserting, and follow these rules:
- All field paths must start with `$.` (for example, `$.description` or `$.price`).
- At least one field must be marked as embedded (`"embedded": True` in Python, `"embedded": true` in API).
- Field names can only contain letters, numbers, and underscores.
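As a rough illustration of the three rules above, the following sketch checks a set of field definitions before you create a repository. The schema layout used here (a dict mapping each field path to an options dict) and the helper name `check_schema_fields` are assumptions made for this example; they are not the documented Context Engine schema format, and the sample document is hypothetical, shaped only to match the field paths used in this guide.

```python
import re

# Hypothetical product document (illustrative values, not real catalog data).
example_document = {
    "product_info": {
        "brand": "Acme",
        "category": "footwear",
        "price": 89.99,
        "in_stock": True,
        "description": "Lightweight shoe with a grippy outsole.",
    },
    "customer_feedback": {
        "reviews": [
            {"text": "Comfortable on long runs.", "created_on": "2024-05-01"},
        ],
    },
}

# One path segment: a field name made of letters, numbers, and underscores,
# optionally followed by the [*] array wildcard.
_SEGMENT = re.compile(r"[A-Za-z0-9_]+(\[\*\])?")


def check_schema_fields(fields):
    """Return a list of rule violations for a {path: options} mapping (assumed layout)."""
    problems = []
    for path in fields:
        # Rule 1: every field path starts with "$."
        if not path.startswith("$."):
            problems.append(f"{path}: path must start with '$.'")
            continue
        # Rule 3: each field name uses only letters, numbers, and underscores.
        for segment in path[2:].split("."):
            if not _SEGMENT.fullmatch(segment):
                problems.append(f"{path}: invalid field name '{segment}'")
    # Rule 2: at least one field is marked as embedded.
    if not any(options.get("embedded") for options in fields.values()):
        problems.append("at least one field must be marked as embedded")
    return problems


fields = {
    "$.product_info.description": {"embedded": True},
    "$.product_info.price": {},
    "$.customer_feedback.reviews[*].created_on": {},
}
print(check_schema_fields(fields))  # prints []
```

Running a check like this locally can catch a malformed path before the repository is created, which matters because the schema cannot be changed afterwards in the alpha release.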
3. Upsert documents
Add JSON documents to your repository. Create a repository client using the host URL from the previous step, then upsert documents one at a time.
4. Query the repository
Now you can search your catalog using natural language queries that customers would actually use on an e-commerce site. Context Engine automatically searches across all embedded fields (descriptions and reviews) to find relevant products. You can also apply filters on specific fields like:
- Price ranges (`$.product_info.price`)
- Product categories (`$.product_info.category`)
- Availability (`$.product_info.in_stock`)
- Review dates (`$.customer_feedback.reviews[*].created_on`)
- Brands (`$.product_info.brand`)
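To make the path notation concrete, the sketch below resolves a few of these paths against a hypothetical product document in plain Python. It only illustrates which values each path addresses. It does not call Context Engine, and the `resolve` helper and sample document are assumptions made for this example rather than part of the SDK.

```python
def resolve(document, path):
    """Return every value addressed by a '$.a.b[*].c' style path."""
    if not path.startswith("$."):
        raise ValueError("path must start with '$.'")
    current = [document]
    for segment in path[2:].split("."):
        # A trailing [*] means "every element of this array".
        wildcard = segment.endswith("[*]")
        key = segment[:-3] if wildcard else segment
        next_values = []
        for node in current:
            if isinstance(node, dict) and key in node:
                value = node[key]
                if wildcard and isinstance(value, list):
                    next_values.extend(value)
                else:
                    next_values.append(value)
        current = next_values
    return current


# Hypothetical document shaped to match the filterable paths listed above.
product = {
    "product_info": {
        "price": 89.99,
        "category": "footwear",
        "in_stock": True,
        "brand": "Acme",
    },
    "customer_feedback": {
        "reviews": [
            {"created_on": "2024-05-01"},
            {"created_on": "2024-06-12"},
        ],
    },
}

print(resolve(product, "$.product_info.price"))
# prints [89.99]
print(resolve(product, "$.customer_feedback.reviews[*].created_on"))
# prints ['2024-05-01', '2024-06-12']
```

Note that the `[*]` wildcard fans out over every review, so a single document can contribute several values for one filter path.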
5. Manage repositories
Use repository management operations to monitor your product catalog, inspect the schema structure, and manage your product data infrastructure.
List repositories
View all repositories in your project.
Describe a repository
Get detailed information about a repository, including its schema.
Delete a repository
Remove a repository when you no longer need it. This operation permanently removes all documents and associated data and cannot be undone.
6. Manage documents
Manage individual products in your catalog for inventory updates, content verification, and catalog maintenance.
List documents
Retrieve a paginated list of documents in your repository. This operation returns document metadata but excludes embedded content by default to improve performance.
Fetch a document
Retrieve a specific document by its ID. This operation returns document metadata but excludes embedded content by default to improve performance.
Delete a document
Remove one or more documents from your repository. This operation permanently deletes each document along with all its associated chunks and embedded content, and it cannot be undone.
Limitations
The Pinecone Context Engine alpha release has several limitations to be aware of:
General limitations
- Pinecone Context Engine is available only on the `unstable` version of the API or through an alpha release of the Python SDK.
- The alpha release does not include the full capabilities of Context Engine. It provides document repository functionality with basic retrieval. Advanced query planning, intelligent query rewriting, and sophisticated context expansion will be available in future releases.
- Field naming: Field keys must be alphanumeric (`A-Z`, `a-z`, `0-9`, `_`) and must not start with a number or underscore. Unicode field names are not supported.
- Embedding models: Only the default embedding models are supported (`llama-text-embed-v2` for dense, `pinecone-sparse-english-v0` for sparse). Custom embedding models are not available.
- Schema immutability: Schema updates and field modifications are not supported in the alpha release.
- Required embedded field: Each schema must have at least one field marked as embedded.
- Array limitations: Arrays of arrays (nested arrays) are not supported.
- Synchronous ingestion: All document operations are synchronous, targeting 1-2 second response times.
- No partial updates: Document updates require full document replacement; partial field updates are not supported.
- No automatic schema inference: The schema must be explicitly defined during repository creation. All fields are `"filterable": true` by default unless explicitly set to `"filterable": false`. Use `"filterable": false` for fields containing large content (like binary data or images) to save metadata space and reduce costs.
- Basic chunk expansion: Uses simple chunk count strategy with no configuration options.
- No query planning: Advanced query planning and rewriting features are not available in the alpha.
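Several of the limitations above (field naming and nested arrays) can be checked on the client before a document is sent. The following is a minimal sketch of such a pre-flight check in plain Python. The function name `find_document_problems` is hypothetical, and the check reflects only the rules as stated in this list, not the service's actual validation logic.

```python
import re

# Field keys: ASCII letters, digits, and underscores only,
# and the first character must be a letter (not a number or underscore).
_FIELD_KEY = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def find_document_problems(value, path="$"):
    """Walk a JSON-like value and report keys or arrays the alpha release would reject."""
    problems = []
    if isinstance(value, dict):
        for key, child in value.items():
            child_path = f"{path}.{key}"
            if not isinstance(key, str) or not _FIELD_KEY.fullmatch(key):
                problems.append(f"{child_path}: unsupported field key")
            problems.extend(find_document_problems(child, child_path))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            item_path = f"{path}[{index}]"
            # Arrays of arrays are listed as unsupported.
            if isinstance(item, list):
                problems.append(f"{item_path}: nested arrays are not supported")
            problems.extend(find_document_problems(item, item_path))
    return problems


print(find_document_problems({"product_info": {"price": 89.99}}))
# prints []
print(find_document_problems({"_id": 1, "tags": [["a"], "b"]}))
# prints ['$._id: unsupported field key', '$.tags[0]: nested arrays are not supported']
```

Because partial updates are not supported, a check like this is most useful when run on the full replacement document rather than on a patch of changed fields.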