You can use the Inference API to generate embeddings for text data, such as queries or passages.

This feature is in public preview and is not recommended for production usage.

Before you begin

Ensure you have a Pinecone account and an API key.

The Inference API is a stand-alone service. You can store your generated embeddings in a Pinecone vector database, but you are not required to do so.

1. Install the Python package

To use the Inference API via the Python client, upgrade the client and install the pinecone-plugin-inference package as follows:

pip install --upgrade pinecone-client pinecone-plugin-inference
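After installing, initialize the client. This is a minimal sketch; it assumes your API key is stored in the PINECONE_API_KEY environment variable:

import os
from pinecone import Pinecone

# Assumes the API key is exported as PINECONE_API_KEY.
pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])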

2. Generate embeddings

To generate embeddings, use the inference.embed operation. Specify a supported embedding model and provide input data and any model-specific parameters.

The following example generates embeddings for a text passage using the multilingual-e5-large model. The input sentence is illustrative; input_type (either "query" or "passage") and truncate are parameters specific to this model:
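import os
from pinecone import Pinecone

pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])

# Embed a passage with multilingual-e5-large. The input text is illustrative.
embeddings = pc.inference.embed(
    model="multilingual-e5-large",
    inputs=["The quick brown fox jumped over the lazy dog."],
    parameters={"input_type": "passage", "truncate": "END"}
)

print(embeddings)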

The example above returns an EmbeddingsList object containing the embedding data:

EmbeddingsList(
  model='multilingual-e5-large',
  data=[
    {'values': [0.02117919921875, -0.0093994140625, ..., -0.0384521484375, 0.016326904296875]}
  ],
  usage={'total_tokens': 16}
)
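If you want to store the results, you can upsert the values into a Pinecone index. The following is a minimal sketch; the index name example-index and record ID vec1 are hypothetical, and the index must already exist with dimension 1024 to match the model's output:

# Hypothetical index; it must be created with dimension 1024,
# the output dimension of multilingual-e5-large.
index = pc.Index("example-index")

# Upsert the first embedding from the EmbeddingsList shown above.
index.upsert(vectors=[
    {"id": "vec1", "values": embeddings.data[0]["values"]}
])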