Chat with an assistant. This endpoint is based on the OpenAI Chat Completions API, a widely adopted standard.
It is useful if you need inline citations or OpenAI-compatible responses, but it offers limited functionality compared to the standard chat interface.
For guidance and examples, see Chat with an assistant.
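As a rough sketch of what a request body for this endpoint might look like, the snippet below builds an OpenAI-compatible payload using the parameters described in this reference (`messages`, `stream`, `model`, `temperature`). The model identifier and message content are illustrative assumptions, not confirmed values.

```python
import json

# Hypothetical request body for the OpenAI-compatible chat endpoint.
# Field names mirror the parameters described in this reference;
# the model name and message content are illustrative only.
payload = {
    "messages": [
        {"role": "user", "content": "What is the maximum height of a red pine?"}
    ],
    "stream": False,     # False -> single JSON response; True -> stream of responses
    "model": "gpt-4o",   # assumed model identifier
    "temperature": 0.2,  # ignored if the model does not support it
}

body = json.dumps(payload)
print(body)
```

The serialized `body` would be sent as the JSON request body, with the Pinecone API key and the date-based version header supplied as HTTP headers.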
Pinecone API Key
Required date-based version header
The name of the assistant to chat with.
The configuration for chatting with an assistant through an OpenAI-compatible interface.
Represents a request to chat with an assistant.
The list of messages sent to the assistant, used for context retrieval and for generating a response with the LLM.
If false, the assistant returns a single JSON response. If true, the assistant returns a stream of responses.
The large language model used to generate responses.
Controls the randomness of the model's output: lower values make responses more deterministic, while higher values increase creativity and variability. If the model does not support a temperature parameter, the parameter will be ignored.
Optional metadata-based filter to restrict which documents are retrieved for the assistant's response context.
{ "genre": { "$ne": "documentary" } }
Search request successful.
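A hedged sketch of how a metadata filter like the one above might be attached to a chat request body; the `filter` field name is an assumption based on the parameter described here, and the message content is illustrative.

```python
import json

# Sketch: attaching the example metadata filter to a chat request body.
# The "$ne" operator excludes documents whose "genre" metadata field
# equals "documentary" from the retrieved response context.
request = {
    "messages": [{"role": "user", "content": "Summarize the feature films."}],
    "filter": {"genre": {"$ne": "documentary"}},
}
print(json.dumps(request))
```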
Describes the response format of a chat request.