Setup guide
Hugging Face Inference Endpoints gives us straightforward access to model inference. Coupled with Pinecone, we can generate and index high-quality vector embeddings with ease. Let's get started by initializing an Inference Endpoint for generating vector embeddings.

Create an endpoint
We start by heading over to the Hugging Face Inference Endpoints homepage and signing up for an account if needed. Afterwards, we should find ourselves on the Inference Endpoints dashboard, where we can create a new endpoint.
Create embeddings
Each endpoint is given an Endpoint URL, which can be found on the endpoint Overview page. We need to assign this endpoint URL to the endpoint_url variable.

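For example (the URL below is a placeholder; copy your own from the Overview page):

```python
# Placeholder: replace with your endpoint's URL from the Overview page
endpoint_url = "<ENDPOINT_URL>"
```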
We also need an organization API token to authenticate requests to the endpoint. This can be found via the organization settings page (https://huggingface.co/organizations/<ORG_NAME>/settings/profile). This is assigned to the api_org variable.

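Again, the token below is a placeholder:

```python
# Placeholder: replace with your organization API token
api_org = "<API_ORG_TOKEN>"
```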
With both endpoint_url and api_org set, we can make requests to the endpoint. A successful request returns a 200 response.
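A minimal test request might look like the following sketch; the exact payload and response format depend on the model deployed behind the endpoint:

```python
import requests

# Send a single test sentence to the endpoint
res = requests.post(
    endpoint_url,
    headers={
        "Authorization": f"Bearer {api_org}",
        "Content-Type": "application/json",
    },
    json={"inputs": "hello world"},
)
print(res.status_code)  # we expect 200
```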
Next, we load the dataset that we will embed and check the dimensionality of the vectors returned by the endpoint, storing it in dim. We will need this value when creating the Pinecone index.
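As a rough sketch, assuming the datasets library and an arbitrary text dataset (the original guide's dataset choice is not shown here, so "squad" and its "context" field are purely illustrative):

```python
import requests
from datasets import load_dataset

# Illustrative dataset: any collection of text passages will do
data = load_dataset("squad", split="train[:1000]")
passages = [row["context"] for row in data]

# Embed one passage to find the model's output dimensionality.
# Note: the response format depends on the deployed model; a sentence-transformers
# endpoint typically returns one embedding per input.
res = requests.post(
    endpoint_url,
    headers={
        "Authorization": f"Bearer {api_org}",
        "Content-Type": "application/json",
    },
    json={"inputs": passages[0]},
)
dim = len(res.json()[0])
print(dim)
```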
Create a Pinecone index
With our endpoint and dataset ready, all that we're missing is a vector database. For this, we initialize our connection to Pinecone, which requires a free API key.
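A sketch using the current Pinecone Python client (the original guide may have used an older client version; the API key below is a placeholder):

```python
from pinecone import Pinecone

# Placeholder: use your own Pinecone API key
pc = Pinecone(api_key="<PINECONE_API_KEY>")
```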
Now we create a new index called 'hf-endpoints'. The name isn't important, but the dimension must align with our endpoint model's output dimensionality (we found this in dim above), and the metric should suit the model (cosine is typically okay, but not for all models).
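Continuing the sketch with the same client (the serverless cloud and region are assumptions; adjust them to your project):

```python
from pinecone import ServerlessSpec

pc.create_index(
    name="hf-endpoints",
    dimension=dim,      # must match the endpoint model's output dimensionality
    metric="cosine",    # check that cosine is appropriate for your model
    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
)
# connect to the new index
index = pc.Index("hf-endpoints")
```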
Create and index embeddings
Now we have all of our components ready: endpoint, dataset, and Pinecone. Let's go ahead and create our dataset embeddings and index them within Pinecone.
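One way this loop might look, continuing the earlier sketches (the batch size, ID scheme, and metadata fields are illustrative choices, and the response format again depends on the deployed model):

```python
import requests

batch_size = 64
for i in range(0, len(passages), batch_size):
    batch = passages[i : i + batch_size]
    # create embeddings for the batch via the Inference Endpoint
    res = requests.post(
        endpoint_url,
        headers={
            "Authorization": f"Bearer {api_org}",
            "Content-Type": "application/json",
        },
        json={"inputs": batch},
    )
    embeddings = res.json()
    # build (id, vector, metadata) tuples and upsert them to Pinecone
    ids = [str(i + j) for j in range(len(batch))]
    metadata = [{"text": text} for text in batch]
    index.upsert(vectors=list(zip(ids, embeddings, metadata)))
```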
Clean up
Shut down the endpoint by navigating to the Inference Endpoints Overview page and selecting Delete endpoint. Then delete the Pinecone index.
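With the client used in the sketches above, this would look something like:

```python
pc.delete_index("hf-endpoints")
```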
Once the index is deleted, you cannot use it again.