Use the Spark-Pinecone connector to efficiently create, ingest, and update vector embeddings at scale with Databricks and Pinecone.

Install the Spark-Pinecone connector

  1. Install the Spark-Pinecone connector as a library.
  2. Configure the library as follows:
    1. Select File path/S3 as the Library Source.

    2. Enter the S3 URI for the Pinecone assembly JAR file:

      s3://pinecone-jars/1.1.0/spark-pinecone-uberjar.jar

      Databricks platform users must use the Pinecone assembly JAR listed above to ensure that the proper dependencies are installed.

    3. Click Install.

Batch upsert

To batch upsert embeddings to Pinecone, follow the batch upsert guide on the Databricks integration page.
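
As a rough illustration of what a batch upsert can look like in PySpark, here is a minimal sketch. The format string, the `pinecone.apiKey` and `pinecone.indexName` option keys, and the schema are assumptions based on the connector's public examples, so check them against the connector version you installed. The helper names (`pinecone_options`, `common_schema`, `batch_upsert`) are hypothetical and exist only for this example.

```python
# Sketch only: the format string, option keys, and schema below are assumptions
# based on the connector's public examples; verify them for your version.
PINECONE_FORMAT = "io.pinecone.spark.pinecone.Pinecone"


def pinecone_options(api_key, index_name):
    """Build the connector options for a write (hypothetical helper)."""
    return {"pinecone.apiKey": api_key, "pinecone.indexName": index_name}


def common_schema():
    """Return the DataFrame schema the connector is assumed to expect."""
    # Imported lazily so the helper above stays usable without PySpark installed.
    from pyspark.sql.types import (
        ArrayType, FloatType, LongType, StringType, StructField, StructType,
    )
    return StructType([
        StructField("id", StringType(), False),
        StructField("namespace", StringType(), True),
        StructField("values", ArrayType(FloatType(), False), False),
        StructField("metadata", StringType(), True),  # JSON-encoded string
        StructField("sparse_values", StructType([
            StructField("indices", ArrayType(LongType(), False), False),
            StructField("values", ArrayType(FloatType(), False), False),
        ]), True),
    ])


def batch_upsert(spark, input_path, api_key, index_name):
    """Read JSON Lines records from input_path and append them to an index."""
    df = spark.read.schema(common_schema()).json(input_path)
    (
        df.write
        .format(PINECONE_FORMAT)
        .options(**pinecone_options(api_key, index_name))
        .mode("append")
        .save()
    )
```

In a Databricks notebook the `spark` session is already defined, so a call would pass it in together with an input path, an API key, and an index name of your own.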

Stream upsert

The connector can also upsert embeddings to Pinecone from a streaming source.
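
A stream upsert might be sketched with Spark Structured Streaming as follows. As with any connector-specific code, the format string and the `pinecone.*` option keys are assumptions based on the connector's public examples. `checkpointLocation` is the standard Spark streaming option. The helper names (`stream_options`, `stream_upsert`) are hypothetical, and `schema` is assumed to be a `StructType` matching the record layout your connector version expects.

```python
# Sketch only: the format string and the pinecone.* option keys are assumptions
# based on the connector's public examples; verify them for your version.
PINECONE_FORMAT = "io.pinecone.spark.pinecone.Pinecone"


def stream_options(api_key, index_name, checkpoint_dir):
    """Build the options for a streaming write (hypothetical helper)."""
    return {
        "pinecone.apiKey": api_key,
        "pinecone.indexName": index_name,
        # Standard Structured Streaming option for tracking query progress.
        "checkpointLocation": checkpoint_dir,
    }


def stream_upsert(spark, input_dir, schema, api_key, index_name, checkpoint_dir):
    """Start a streaming query that appends new JSON records to an index."""
    # File-based streaming sources need an explicit schema.
    records = spark.readStream.schema(schema).json(input_dir)
    query = (
        records.writeStream
        .format(PINECONE_FORMAT)
        .options(**stream_options(api_key, index_name, checkpoint_dir))
        .outputMode("append")
        .start()
    )
    return query
```

The function returns the running query, so the caller can decide whether to block on it with `query.awaitTermination()` or stop it later with `query.stop()`.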
