Pinecone Database quickstart
This guide shows you how to set up and use Pinecone Database for high-performance similarity search.
To get started in your browser, use the Quickstart Colab notebook. To try Pinecone Database locally before creating an account, use Pinecone Local.
1. Install an SDK
Pinecone SDKs provide convenient programmatic access to the Pinecone APIs.
Install the SDK for your preferred language:
2. Get an API key
You need an API key to make calls to your Pinecone project.
Create a new API key in the Pinecone console. If you don’t have a Pinecone account, you can sign up for the free Starter plan.
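One way to keep the key out of source code is to read it from an environment variable before creating the client. The sketch below assumes the Python SDK (installed as the `pinecone` package) and an environment variable named `PINECONE_API_KEY`. The variable name and both helper functions are illustrative, not part of the SDK.

```python
import os


def load_api_key(env=None, var_name="PINECONE_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} to your Pinecone API key first.")
    return key


def make_client(api_key):
    """Create a Pinecone client (imported lazily so this file loads without the SDK)."""
    from pinecone import Pinecone  # assumes `pip install pinecone`

    return Pinecone(api_key=api_key)


# Usage, once the environment variable is set:
# pc = make_client(load_api_key())
```

Failing early with a clear message tends to be easier to debug than an authentication error on the first API call.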
3. Generate vectors
A vector embedding is a numerical representation of data that enables similarity-based search in vector databases like Pinecone. To convert data into this format, you use an embedding model.
For this quickstart, use the multilingual-e5-large embedding model hosted by Pinecone to create vector embeddings for sentences related to the word “apple”. Note that some sentences are about the tech company, while others are about the fruit.
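As a minimal sketch of this step with the Python SDK, the helper below sends each sentence to the hosted model through `pc.inference.embed`. The six sentences are placeholder sample data written for illustration, `embed_passages` is a hypothetical helper name, and `pc` is assumed to be an already initialized `Pinecone` client.

```python
# Illustrative sample data: placeholder sentences, some about the fruit and
# some about the company. Replace them with your own text.
data = [
    {"id": "vec1", "text": "Apple is a popular fruit known for its sweetness."},
    {"id": "vec2", "text": "The tech company Apple is known for the iPhone."},
    {"id": "vec3", "text": "Many people enjoy eating apples as a healthy snack."},
    {"id": "vec4", "text": "Apple Inc. designs consumer electronics and software."},
    {"id": "vec5", "text": "An apple a day keeps the doctor away, as the saying goes."},
    {"id": "vec6", "text": "Apple was founded in 1976 as a computer company."},
]


def embed_passages(pc, records, model="multilingual-e5-large"):
    """Embed each record's text as a passage using the Pinecone-hosted model."""
    return pc.inference.embed(
        model=model,
        inputs=[r["text"] for r in records],
        # "passage" marks text to be stored; "END" truncates over-long input.
        parameters={"input_type": "passage", "truncate": "END"},
    )


# Usage (requires a client and network access):
# embeddings = embed_passages(pc, data)
```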
The returned object looks like this:
4. Create an index
In Pinecone, you store data in an index.
Create a serverless index that matches the dimension (1024) and similarity metric (cosine) of the multilingual-e5-large model you used in the previous step, and choose a cloud and region for hosting the index:
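The following sketch shows one way to make index creation safe to re-run: it checks the existing index names first and creates the index only if it is missing. `ensure_index` is a hypothetical helper, and the cloud and region in the usage comment are example values, not a recommendation from this guide.

```python
def ensure_index(pc, name, spec, dimension=1024, metric="cosine"):
    """Create the index if it does not exist yet; return True when created."""
    if name in pc.list_indexes().names():
        return False
    # Dimension and metric must match the embedding model used earlier.
    pc.create_index(name=name, dimension=dimension, metric=metric, spec=spec)
    return True


# Usage (cloud and region are example values):
# from pinecone import ServerlessSpec
# ensure_index(pc, "example-index", ServerlessSpec(cloud="aws", region="us-east-1"))
```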
5. Upsert vectors
Target your index and use the upsert operation to load your vector embeddings into a new namespace. Namespaces let you partition records within an index and are essential for implementing multitenancy when you need to isolate the data of each customer/user.
In production, target an index by its unique DNS host, not by its name.
To load large amounts of data, import from object storage or upsert in large batches.
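The upsert step can be sketched as follows, assuming `data` is a list of items with `id` and `text` fields and `embeddings` is the result of the embedding step. The helper names, the namespace `example-namespace`, and the batch size of 100 are illustrative assumptions. Keeping the source text in metadata is one common choice, not a requirement.

```python
def build_records(data, embeddings):
    """Pair each source item with its embedding in the upsert record shape."""
    return [
        {"id": d["id"], "values": e["values"], "metadata": {"text": d["text"]}}
        for d, e in zip(data, embeddings)
    ]


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upsert_in_batches(index, records, namespace, batch_size=100):
    """Upsert records batch by batch; return the number of records sent."""
    sent = 0
    for batch in chunks(records, batch_size):
        index.upsert(vectors=batch, namespace=namespace)
        sent += len(batch)
    return sent


# Usage:
# index = pc.Index("example-index")  # in production: pc.Index(host="<index host>")
# upsert_in_batches(index, build_records(data, embeddings), "example-namespace")
```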
Pinecone is eventually consistent, so there can be a delay before your upserted records are available to query. Use the describe_index_stats operation to check if the current vector count matches the number of vectors you upserted (6):
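Because the count may lag behind the upsert, a short polling loop is one way to wait for it. The sketch below is a hypothetical helper, not an SDK function. It assumes the stats response exposes `total_vector_count`, and the timeout and interval defaults are arbitrary.

```python
import time


def wait_for_vector_count(index, expected, timeout=60.0, interval=2.0, sleep=time.sleep):
    """Poll index stats until at least `expected` vectors are visible."""
    deadline = time.monotonic() + timeout
    while True:
        count = index.describe_index_stats().total_vector_count
        if count >= expected:
            return count
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Saw {count} of {expected} vectors after {timeout}s")
        sleep(interval)


# Usage:
# wait_for_vector_count(index, expected=6)
```

The `sleep` parameter exists only so the loop can be exercised without real waiting.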
The response looks like this:
6. Search the index
With data in your index, let’s say you now want to search for information about “Apple” the tech company, not “apple” the fruit.
Use the multilingual-e5-large model to convert your query into a vector embedding, and then use the query operation to search for the three vectors in the index that are most semantically similar to the query vector:
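A minimal sketch of the search step, assuming `pc` is an initialized client and `index` targets the index populated earlier. `search` is a hypothetical helper, and the query text and namespace in the usage comment are examples.

```python
def search(pc, index, query_text, namespace, top_k=3, model="multilingual-e5-large"):
    """Embed the query text, then return the top_k most similar records."""
    embedding = pc.inference.embed(
        model=model,
        inputs=[query_text],
        # "query" (rather than "passage") marks the text as a search query.
        parameters={"input_type": "query"},
    )
    return index.query(
        namespace=namespace,
        vector=embedding[0]["values"],
        top_k=top_k,
        include_values=False,   # skip the raw vectors in the response
        include_metadata=True,  # return stored metadata with each match
    )


# Usage:
# results = search(pc, index, "Tell me about the tech company known as Apple.",
#                  "example-namespace")
# print(results)
```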
Notice that the response includes only sentences about the tech company, not the fruit:
7. Clean up
When you no longer need the example-index, use the delete_index operation to delete it:
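The cleanup can be sketched as a small guard around `delete_index`, so that running it twice does not fail on a missing index. `delete_index_if_exists` is a hypothetical helper name.

```python
def delete_index_if_exists(pc, name):
    """Delete the named index if present; return True when a delete was issued."""
    if name not in pc.list_indexes().names():
        return False
    pc.delete_index(name)
    return True


# Usage:
# delete_index_if_exists(pc, "example-index")
```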
For production indexes, consider enabling deletion protection.
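As a rough sketch, deletion protection on an existing index can be toggled through `configure_index` with the `deletion_protection` argument, assuming a Python SDK version that supports it. The helper name below is hypothetical.

```python
def set_deletion_protection(pc, name, enabled=True):
    """Enable or disable deletion protection on an existing index."""
    state = "enabled" if enabled else "disabled"
    pc.configure_index(name=name, deletion_protection=state)
    return state


# Usage:
# set_deletion_protection(pc, "example-index")         # protect the index
# set_deletion_protection(pc, "example-index", False)  # allow deletion again
```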
Next steps
- Learn about key features to keep in mind as you start building with Pinecone.
- Check out tutorials and sample apps for different use cases.