
# Chat through an OpenAI-compatible interface

> Chat with an assistant. This endpoint is based on the widely adopted OpenAI Chat Completion API.

It is useful if you need inline citations or OpenAI-compatible responses, but has limited functionality compared to the standard chat interface.

For guidance and examples, see [Chat with an assistant](https://docs.pinecone.io/guides/assistant/chat-with-assistant).

<RequestExample>
  ```bash curl | Default theme={null}
  PINECONE_API_KEY="YOUR_API_KEY"
  ASSISTANT_NAME="example-assistant"

  curl "https://prod-1-data.ke.pinecone.io/assistant/chat/$ASSISTANT_NAME/chat/completions" \
    -H "Api-Key: $PINECONE_API_KEY" \
    -H "Content-Type: application/json" \
    -H "X-Pinecone-Api-Version: 2026-04" \
    -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the maximum height of a red pine?"
      }
    ]
  }'
  ```

  ```bash curl | Streaming theme={null}
  PINECONE_API_KEY="YOUR_API_KEY"
  ASSISTANT_NAME="example-assistant"

  curl "https://prod-1-data.ke.pinecone.io/assistant/chat/$ASSISTANT_NAME/chat/completions" \
    -H "Api-Key: $PINECONE_API_KEY" \
    -H "Content-Type: application/json" \
    -H "X-Pinecone-Api-Version: 2026-04" \
    -d '{
    "messages": [
      {
        "role": "user",
        "content": "What is the maximum height of a red pine?"
      }
    ],
    "stream": true
  }'
  ```
</RequestExample>

<ResponseExample>
  ```JSON Default response theme={null}
  {"chat_completion":
    {
      "id":"chatcmpl-9OtJCcR0SJQdgbCDc9JfRZy8g7VJR",
      "choices":[
        {
          "finish_reason":"stop",
          "index":0,
          "message":{
            "role":"assistant",
            "content":"The maximum height of a red pine (Pinus resinosa) is up to 25 meters."
          }
        }
      ],
      "model":"my_assistant"
    }
  }
  ```

  ```text Streaming response theme={null}
  {
    'id': '000000000000000009de65aa87adbcf0',
    'choices': [
      {
        'index': 0,
        'delta': {
          'role': 'assistant',
          'content': 'The'
        },
        'finish_reason': None
      }
    ],
    'model': 'gpt-4o-2024-05-13'
  }

  ...

  {
    'id': '00000000000000007a927260910f5839',
    'choices': [
      {
        'index': 0,
        'delta': {
          'role': '',
          'content': 'The'
        },
        'finish_reason': None
      }
    ],
    'model': 'gpt-4o-2024-05-13'
  }

  ...

  {
    'id': '00000000000000007a927260910f5839',
    'choices': [
      {
        'index': 0,
        'delta': {
          'role': None,
          'content': None
        },
        'finish_reason': 'stop'
      }
    ],
    'model': 'gpt-4o-2024-05-13'
  }
  ```
</ResponseExample>
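
The response examples above can be consumed with a few lines of Python. The following is a minimal sketch, not an official client: the helper names `extract_answer` and `join_stream` are hypothetical, and it assumes the streamed chunks have already been decoded into dictionaries shaped like the ones shown. Because the default example nests the completion under `chat_completion` while the schema lists `id`, `choices`, and `model` at the top level, the sketch accepts both shapes.

```python
from typing import Any, Iterable, Optional


def extract_answer(response: dict[str, Any]) -> str:
    """Return the assistant's text from a non-streaming response."""
    # Accept both the wrapped shape from the example and the flat schema shape.
    completion = response.get("chat_completion", response)
    return completion["choices"][0]["message"]["content"]


def join_stream(chunks: Iterable[dict[str, Any]]) -> tuple[str, Optional[str]]:
    """Concatenate delta content from decoded chunks for choice index 0.

    Returns the assembled text and the last non-null finish_reason seen.
    """
    parts: list[str] = []
    finish_reason: Optional[str] = None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue
            delta = choice.get("delta") or {}
            content = delta.get("content")
            if content:  # the final chunk carries content None
                parts.append(content)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason
```

A `finish_reason` other than `stop` (for example `length`) indicates that the assembled text may be incomplete, so callers may want to check it before using the result.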


## OpenAPI

````yaml https://raw.githubusercontent.com/pinecone-io/pinecone-api/refs/heads/main/2026-04/assistant_data_2026-04.oas.yaml POST /chat/{assistant_name}/chat/completions
openapi: 3.0.3
info:
  title: Pinecone assistant data plane API
  description: >-
    Pinecone Assistant Engine is a context engine to store and retrieve relevant
    knowledge from millions of documents at scale. This API supports
    interactions with assistants.
  contact:
    name: Pinecone Support
    url: https://support.pinecone.io
    email: support@pinecone.io
  license:
    name: Apache 2.0
    url: https://www.apache.org/licenses/LICENSE-2.0
  version: 2026-04
servers:
  - url: https://{assistant_host}
    variables:
      assistant_host:
        default: unknown
        description: The host of the created assistant
security:
  - ApiKeyAuth: []
tags:
  - name: Manage Assistants
    description: Actions that manage Assistants
paths:
  /chat/{assistant_name}/chat/completions:
    post:
      tags:
        - Manage Assistants
      summary: Chat through an OpenAI-compatible interface
      description: >-
        Chat with an assistant. This endpoint is based on the widely adopted
        OpenAI Chat Completion API.


        It is useful if you need inline citations or OpenAI-compatible
        responses, but has limited functionality compared to the standard chat
        interface.


        For guidance and examples, see [Chat with an
        assistant](https://docs.pinecone.io/guides/assistant/chat-with-assistant).
      operationId: chat_completion_assistant
      parameters:
        - in: header
          name: X-Pinecone-Api-Version
          description: Required date-based version header
          required: true
          schema:
            default: 2026-04
            type: string
          style: simple
        - in: path
          name: assistant_name
          description: The name of the assistant to chat with.
          required: true
          schema:
            type: string
          example: test-assistant
          style: simple
      requestBody:
        description: >-
          The desired configuration to chat with an assistant through an
          OpenAI-compatible interface.
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/SearchCompletions'
        required: true
      responses:
        '200':
          description: Chat request successful.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ChatCompletionModel'
            text/event-stream:
              schema:
                $ref: '#/components/schemas/StreamChatCompletionChunkModel'
        '400':
          description: Bad request. The request body included invalid request parameters.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              examples:
                files-validation-error:
                  summary: Validation error on ingest.
                  value:
                    error:
                      code: INVALID_ARGUMENT
                      message: >-
                        Uploaded file can only currently be either a pdf or txt
                        file
                    status: 400
        '401':
          description: 'Unauthorized. Possible causes: Invalid API key.'
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              examples:
                unauthorized:
                  summary: Unauthorized
                  value:
                    error:
                      code: UNAUTHENTICATED
                      message: Invalid API key.
                    status: 401
        '404':
          description: Assistant not found.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              examples:
                assistant-not-found:
                  summary: Assistant not found.
                  value:
                    error:
                      code: NOT_FOUND
                      message: Assistant "example-assistant" not found.
                    status: 404
        '500':
          description: Internal server error.
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ErrorResponse'
              examples:
                internal-server-error:
                  summary: Internal server error
                  value:
                    error:
                      code: UNKNOWN
                      message: Internal server error
                    status: 500
components:
  schemas:
    SearchCompletions:
      description: Represents a request to chat with an assistant.
      type: object
      properties:
        messages:
          description: >-
            The list of messages sent to the assistant, used for context
            retrieval and generating response with the LLM.
          type: array
          items:
            $ref: '#/components/schemas/MessageModel'
        stream:
          description: >-
            If `false`, the assistant returns a single JSON response. If `true`,
            the assistant returns a stream of responses.
          default: false
          type: boolean
        model:
          description: The large language model used to generate responses.
          default: gpt-4o
          x-enum:
            - gpt-4o
            - gpt-4.1
            - o4-mini
            - claude-sonnet-4-5
            - gemini-2.5-pro
          type: string
        temperature:
          description: >-
            Controls the randomness of the model's output: lower values make
            responses more deterministic, while higher values increase
            creativity and variability. If the model does not support a
            temperature parameter, the parameter will be ignored.
          default: 0
          type: number
          format: float
        filter:
          example:
            genre:
              $ne: documentary
          description: >-
            Optional metadata-based filter to restrict which documents are
            retrieved for the assistant's response context.
          type: object
      required:
        - messages
    ChatCompletionModel:
      description: Describes the response format of a chat request.
      type: object
      properties:
        id:
          description: A unique identifier for this chat response.
          type: string
        choices:
          description: A list of chat completion choices.
          type: array
          items:
            $ref: '#/components/schemas/ChoiceModel'
        model:
          description: >-
            The name or identifier of the model used to generate this chat
            response.
          type: string
        usage:
          $ref: '#/components/schemas/UsageModel'
    StreamChatCompletionChunkModel:
      description: Describes the format of a single chunk in a streamed chat response.
      type: object
      properties:
        id:
          description: A unique identifier for this chat response.
          type: string
        choices:
          description: A list of chat completion choices.
          type: array
          items:
            $ref: '#/components/schemas/ChoiceChunkModel'
        model:
          description: >-
            The name or identifier of the model used to generate this chat
            response.
          type: string
    ErrorResponse:
      example:
        error:
          code: TOO_MANY_REQUESTS
          message: Too many get or list assistant requests, try again later
        status: 429
      description: The response shape used for all error responses.
      type: object
      properties:
        status:
          example: 500
          description: The HTTP status code of the error.
          type: integer
        error:
          example:
            code: INVALID_ARGUMENT
            message: 'Invalid region: Valid options are us, eu'
          description: Detailed information about the error that occurred.
          type: object
          properties:
            code:
              description: The status code associated with the error.
              x-enum:
                - OK
                - UNKNOWN
                - INVALID_ARGUMENT
                - DEADLINE_EXCEEDED
                - QUOTA_EXCEEDED
                - NOT_FOUND
                - ALREADY_EXISTS
                - PERMISSION_DENIED
                - UNAUTHENTICATED
                - RESOURCE_EXHAUSTED
                - FAILED_PRECONDITION
                - ABORTED
                - OUT_OF_RANGE
                - UNIMPLEMENTED
                - INTERNAL
                - UNAVAILABLE
                - DATA_LOSS
                - FORBIDDEN
                - TOO_MANY_REQUESTS
              type: string
            message:
              example: Message content cannot be empty
              description: A message providing details about the error.
              type: string
            details:
              description: >-
                Additional information about the error. This field is not
                guaranteed to be present.
              type: object
          required:
            - code
            - message
      required:
        - status
        - error
    MessageModel:
      description: Describes the format of a message in a chat.
      type: object
      properties:
        role:
          description: >-
            The role of the message author. It can be `user`, `assistant`, or
            `system`.
          type: string
        content:
          description: The text content of the message.
          type: string
    ChoiceModel:
      description: Describes a single choice in a chat completion response.
      type: object
      properties:
        finish_reason:
          description: >-
            Indicates why the chat response generation stopped. This signals the
            end of the response.

            - `stop`: The model finished generating the response.  

            - `length`: Generation was cut off because the maximum number of
            tokens allowed was reached.

            - `content_filter`: Generation stopped because content was blocked
            by content filtering rules (for example, content that contains
            hate speech or violent material).

            - `tool_calls`: Generation stopped because a tool call was
            triggered.
          x-enum:
            - stop
            - length
            - content_filter
            - tool_calls
          type: string
        index:
          description: The index of this choice in the list of returned choices.
          type: integer
        message:
          $ref: '#/components/schemas/MessageModel'
    UsageModel:
      description: >-
        Describes the token usage associated with interactions with an
        assistant.
      type: object
      properties:
        prompt_tokens:
          description: >-
            For chat interactions, the number of tokens in the LLM request
            (message, context snippets, and system prompt).

            For context retrieval, the number of tokens in the LLM request used
            to generate search queries from the messages, plus the tokens in the
            retrieved context snippets.
          type: integer
        completion_tokens:
          description: >-
            For chat interactions, the number of tokens in the assistant's
            response.  

            For context retrieval, this is always 0.
          type: integer
        total_tokens:
          description: >-
            The total number of tokens used, equal to the sum of `prompt_tokens`
            and `completion_tokens`.
          type: integer
    ChoiceChunkModel:
      description: Describes a single choice in a streamed chat completion chunk.
      type: object
      properties:
        finish_reason:
          description: >-
            Indicates why the chat response generation stopped. This signals the
            end of the response.

            - `stop`: The model finished generating the response.  

            - `length`: Generation was cut off because the maximum number of
            tokens allowed was reached.

            - `content_filter`: Generation stopped because content was blocked
            by content filtering rules (for example, content that contains
            hate speech or violent material).

            - `tool_calls`: Generation stopped because a tool call was
            triggered.
          x-enum:
            - stop
            - length
            - content_filter
            - tool_calls
          type: string
        index:
          description: The index of this choice in the list of returned choices.
          type: integer
        delta:
          description: Chat completion message
          type: object
          properties:
            role:
              description: >-
                The role of the message author. It can be `user`, `assistant`,
                or `system`.
              type: string
            content:
              description: The textual content of this partial message.
              type: string
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: Api-Key
      description: Pinecone API Key

````
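
To show how the pieces of the spec fit together, here is a minimal Python sketch that builds (but does not send) a request for this endpoint using only the standard library. It is an illustration under assumptions, not an official client: `build_chat_request` and `describe_error` are hypothetical helper names, the default base URL is copied from the curl example and may differ for your assistant's host, and rejecting an empty `messages` list is a choice made by this sketch rather than a documented rule.

```python
import json
import urllib.parse
import urllib.request
from typing import Any, Optional

# Assumption: base URL taken from the curl example; use your assistant's host.
DEFAULT_BASE_URL = "https://prod-1-data.ke.pinecone.io/assistant"
API_VERSION = "2026-04"


def build_chat_request(
    api_key: str,
    assistant_name: str,
    messages: list[dict[str, str]],
    *,
    stream: bool = False,
    model: Optional[str] = None,
    temperature: Optional[float] = None,
    metadata_filter: Optional[dict[str, Any]] = None,
    base_url: str = DEFAULT_BASE_URL,
) -> urllib.request.Request:
    """Build a POST request matching the SearchCompletions schema."""
    if not messages:
        # `messages` is the only required property in the schema.
        raise ValueError("messages is required")
    body: dict[str, Any] = {"messages": messages, "stream": stream}
    # Optional properties are omitted so the server-side defaults apply.
    if model is not None:
        body["model"] = model
    if temperature is not None:
        body["temperature"] = temperature
    if metadata_filter is not None:
        body["filter"] = metadata_filter
    name = urllib.parse.quote(assistant_name, safe="")
    url = f"{base_url}/chat/{name}/chat/completions"
    headers = {
        "Api-Key": api_key,
        "Content-Type": "application/json",
        "X-Pinecone-Api-Version": API_VERSION,
    }
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(url, data=data, headers=headers, method="POST")


def describe_error(payload: dict[str, Any]) -> str:
    """Format an ErrorResponse body as 'status code: message'."""
    error = payload.get("error", {})
    return f"{payload.get('status')} {error.get('code')}: {error.get('message')}"
```

The returned object can be passed to `urllib.request.urlopen`. On a 4xx or 5xx status, `urlopen` raises `urllib.error.HTTPError`, whose body can be read, decoded as JSON, and passed to `describe_error`.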