Chat with an assistant. This endpoint is based on the OpenAI Chat Completion API, a widely adopted standard.
It is useful if you need inline citations or OpenAI-compatible responses, but has limited functionality compared to the standard chat interface.
For guidance and examples, see Chat with an assistant.
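Because the request and response shapes mirror the OpenAI Chat Completion API, a plain HTTP call is enough to chat with an assistant. The sketch below is a minimal example using the requests library; the host URL, request path, and assistant name are placeholders based on the OpenAI-compatible shape described here, so substitute the values for your own assistant.

```python
# Minimal sketch of calling the OpenAI-compatible chat endpoint over plain HTTP.
# ASSISTANT_HOST, the request path, and "example-assistant" are placeholders;
# replace them with your assistant's actual host and name.
import os
import requests

ASSISTANT_HOST = "https://<your-assistant-host>"  # assumption: environment-specific host
ASSISTANT_NAME = "example-assistant"

response = requests.post(
    f"{ASSISTANT_HOST}/assistant/chat/{ASSISTANT_NAME}/chat/completions",
    headers={
        "Api-Key": os.environ["PINECONE_API_KEY"],  # Pinecone API key
        "Content-Type": "application/json",
    },
    json={
        "messages": [
            {"role": "user", "content": "What is the maximum height of a red pine?"}
        ],
        "stream": False,
        "model": "gpt-4o",
    },
)
response.raise_for_status()
# The response follows the OpenAI Chat Completion format.
print(response.json()["choices"][0]["message"]["content"])
```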
Pinecone API Key
The name of the assistant to chat with.
The desired configuration for chatting with the assistant.
The list of messages to send to the assistant.
If false, the assistant will return a single JSON response. If true, the assistant will return a stream of responses.
The large language model to use for answer generation.
Allowed values: gpt-4o, gpt-4.1, o4-mini, claude-3-5-sonnet, claude-3-7-sonnet, gemini-2.5-pro.
Controls the randomness of the model's output: lower values make responses more deterministic, while higher values increase creativity and variability. If the model does not support a temperature parameter, the parameter will be ignored.
Optionally filter which documents can be retrieved using the following metadata fields.
{ "genre": { "$ne": "documentary" } }