Pinecone Documentation
Pinecone is the leading AI infrastructure for building accurate, secure, and scalable AI applications. Use Pinecone Database to store and search vector data at scale, or start with Pinecone Assistant to get a RAG application running in minutes.
Database quickstart
Set up a fully managed vector database for high-performance semantic search
Assistant quickstart
Create an AI assistant that answers complex questions about your proprietary data
Inference
Leading embedding and reranking models hosted by Pinecone. Explore all models.
multilingual-e5
Leading multilingual embedding model
cohere-rerank-3.5
State-of-the-art reranking model for search
sparse-english-v0
Sparse vector model for keyword-style search
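For example, a minimal sketch of calling hosted models directly through the Python SDK's standalone inference API; the API key and sample inputs below are placeholders, and the hosted model IDs (such as multilingual-e5-large) may differ from the short names above:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Embed a passage with a hosted dense embedding model.
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["Apples are a good source of dietary fiber."],
    parameters={"input_type": "passage", "truncate": "END"},
)

# Rerank candidate documents against a query with a hosted reranker.
reranked = pc.inference.rerank(
    model="cohere-rerank-3.5",
    query="Which foods are high in fiber?",
    documents=[
        "Apples are a good source of dietary fiber.",
        "The Eiffel Tower is in Paris.",
    ],
    top_n=1,
)
```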
Database workflows
Use integrated embedding to upsert and search with text while Pinecone generates the vectors automatically.
Create an index
Create an index that is integrated with one of Pinecone’s hosted embedding models. Dense indexes and vectors enable semantic search, while sparse indexes and vectors enable lexical search.
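A minimal sketch with the Python SDK, assuming your own API key; the index name docs-example is a placeholder:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Create a dense index wired to a hosted embedding model.
# "field_map" tells Pinecone which record field contains the source text.
pc.create_index_for_model(
    name="docs-example",
    cloud="aws",
    region="us-east-1",
    embed={
        "model": "multilingual-e5-large",
        "field_map": {"text": "chunk_text"},
    },
)
```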
Upsert text
Upsert your source text and have Pinecone convert the text to vectors automatically. Use namespaces to partition data for faster queries and multitenant isolation between customers.
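Continuing the sketch above; the records and namespace name are placeholders:

```python
index = pc.Index("docs-example")

# Upsert raw text; the index's integrated model embeds "chunk_text" automatically.
# Namespaces partition the index, e.g. one namespace per tenant.
index.upsert_records(
    namespace="example-namespace",
    records=[
        {"_id": "rec1", "chunk_text": "Apples are a good source of dietary fiber.", "category": "nutrition"},
        {"_id": "rec2", "chunk_text": "Regular exercise can lower blood pressure.", "category": "fitness"},
    ],
)
```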
Search with text
Search the index with a text query. Again, Pinecone uses the index’s integrated model to convert the text to a vector automatically.
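For instance, continuing the sketch above (the query text is a placeholder):

```python
# Search with raw text; Pinecone embeds the query with the same integrated model.
results = index.search(
    namespace="example-namespace",
    query={
        "inputs": {"text": "Which foods are high in fiber?"},
        "top_k": 3,
    },
)
print(results)
```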
Improve relevance
Filter by metadata to limit the scope of your search, rerank results to increase search accuracy, or add lexical search to capture both semantic understanding and precise keyword matches.
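A sketch combining a metadata filter with integrated reranking, assuming the placeholder records upserted above:

```python
# Narrow the search with a metadata filter, then rerank the candidates.
results = index.search(
    namespace="example-namespace",
    query={
        "inputs": {"text": "Which foods are high in fiber?"},
        "top_k": 10,
        "filter": {"category": "nutrition"},  # only matching records are searched
    },
    rerank={
        "model": "cohere-rerank-3.5",
        "top_n": 3,
        "rank_fields": ["chunk_text"],  # the text field the reranker scores
    },
)
```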
Assistant workflow
The steps below can be completed through the Pinecone console or the Pinecone API.
Create an assistant
Create an assistant to answer questions about your documents.
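A minimal sketch with the Python SDK (requires the pinecone-plugin-assistant package); the assistant name and instructions are placeholders:

```python
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_API_KEY")

# Create an assistant; "instructions" sets its default behavior.
assistant = pc.assistant.create_assistant(
    assistant_name="example-assistant",
    instructions="Answer concisely and cite the source document.",
    timeout=30,  # wait up to 30 seconds for the assistant to be ready
)
```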
Upload documents
Upload documents to your assistant. Your assistant manages chunking, embedding, and storage for you.
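Continuing the sketch above; the file path and metadata are placeholders:

```python
assistant = pc.assistant.Assistant(assistant_name="example-assistant")

# Upload a local document; chunking, embedding, and storage are handled for you.
assistant.upload_file(
    file_path="/path/to/handbook.pdf",
    metadata={"source": "hr", "year": "2024"},  # optional, usable as a filter later
    timeout=None,  # block until the file is processed
)
```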
Chat with an assistant
Chat with your assistant and receive responses as a JSON object or as a text stream. For each chat, your assistant queries a large language model (LLM) with context from your documents to ensure the LLM provides grounded responses.
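For example, a non-streaming chat with the assistant created above (the question is a placeholder):

```python
from pinecone_plugins.assistant.models.chat import Message

# Ask a question; the assistant grounds its answer in the uploaded documents.
msg = Message(role="user", content="What is the parental leave policy?")
response = assistant.chat(messages=[msg])
print(response["message"]["content"])

# Pass stream=True to assistant.chat() to receive the response as a text stream.
```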
Evaluate answers
Evaluate the assistant’s responses for correctness and completeness.
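A hypothetical sketch against the evaluation API; the host, path, payload fields, and sample strings below are assumptions, so check the API reference for the current endpoint:

```python
import requests

# Score a response for correctness and completeness against a ground-truth answer.
resp = requests.post(
    "https://prod-1-data.ke.pinecone.io/assistant/evaluation/metrics/alignment",  # assumed host/path
    headers={"Api-Key": "YOUR_API_KEY", "Content-Type": "application/json"},
    json={
        "question": "What is the parental leave policy?",
        "answer": "Employees receive 12 weeks of paid leave.",
        "ground_truth_answer": "The policy grants 12 weeks of paid parental leave.",
    },
)
print(resp.json())  # correctness, completeness, and alignment scores
```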
Optimize performance
Use custom instructions to tailor your assistant’s behavior and responses to specific use cases or requirements. Filter by metadata associated with files to reduce latency and improve the accuracy of responses.
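A sketch of both techniques, reusing the placeholder assistant and upload metadata from the steps above:

```python
from pinecone_plugins.assistant.models.chat import Message

# Tailor default behavior with custom instructions.
pc.assistant.update_assistant(
    assistant_name="example-assistant",
    instructions="Always answer in bullet points and cite the source file.",
)

# Scope a chat to files whose metadata matches the filter.
response = assistant.chat(
    messages=[Message(role="user", content="Summarize the 2024 handbook.")],
    filter={"year": "2024"},  # matches metadata attached at upload time
)
```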
Retrieve context snippets
Retrieve context snippets to see which parts of your documents Pinecone Assistant uses to generate its responses. You can use the retrieved snippets with your own LLM, RAG application, or agentic workflow.
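For example, continuing the sketch above:

```python
# Fetch the snippets the assistant would use to ground an answer,
# for use with your own LLM, RAG application, or agentic workflow.
context = assistant.context(query="What is the parental leave policy?")
for snippet in context.snippets:
    print(snippet.content)
```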
Start building
API Reference
Comprehensive details about the Pinecone APIs, SDKs, utilities, and architecture.
Integrated Inference
Simplify vector search with integrated embedding and reranking.
Examples
Hands-on notebooks and sample apps with common AI patterns and tools.
Integrations
Pinecone’s growing number of third-party integrations.
Troubleshooting
Resolve common Pinecone issues with our troubleshooting guide.
Releases
News about features and changes in Pinecone and related tools.