Chat with an assistant and get back citations in structured form.
This is the recommended way to chat with an assistant, as it offers more functionality and control over the assistant’s responses and references than the OpenAI-compatible chat interface.
For guidance and examples, see Chat with an assistant.
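As a minimal sketch of what a request to this endpoint can look like, the snippet below builds a chat payload and POSTs it with Python's standard library. The host URL and exact path are assumptions for illustration; check your assistant's data-plane host in the Pinecone console before use.

```python
# Hedged sketch of a chat request to the assistant chat endpoint.
# ASSISTANT_HOST and the /assistant/chat/{name} path are assumptions;
# verify them against your own Pinecone project settings.
import json
import urllib.request

ASSISTANT_HOST = "https://prod-1-data.ke.pinecone.io"  # assumed host


def build_chat_payload(messages, model="gpt-4o", stream=False):
    """Build the JSON body for a chat request (parameters described below)."""
    return {
        "messages": messages,  # list of {"role": ..., "content": ...} dicts
        "model": model,        # LLM used for answer generation
        "stream": stream,      # False -> single JSON response
    }


def chat(assistant_name, api_key, messages):
    """Send a non-streaming chat request and return the parsed response."""
    payload = build_chat_payload(messages)
    req = urllib.request.Request(
        f"{ASSISTANT_HOST}/assistant/chat/{assistant_name}",
        data=json.dumps(payload).encode(),
        headers={"Api-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Build (but do not send) an example payload.
payload = build_chat_payload(
    [{"role": "user", "content": "What is the refund policy?"}]
)
print(payload)
```

In practice the official Pinecone SDKs wrap this call; the raw-HTTP form above is only meant to make the parameter list that follows concrete.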
Pinecone API Key
The name of the assistant to chat with.
The desired configuration for chatting with the assistant.
The list of messages to send to the assistant in the chat request.
If false, the assistant will return a single JSON response. If true, the assistant will return a stream of responses.
The large language model to use for answer generation. Available options: gpt-4o, gpt-4.1, o4-mini, claude-3-5-sonnet, claude-3-7-sonnet, gemini-2.5-pro.
Controls the randomness of the model's output: lower values make responses more deterministic, while higher values increase creativity and variability. If the model does not support a temperature parameter, the parameter will be ignored.
Optionally filter which documents can be retrieved using the following metadata fields.
Example: { "genre": { "$ne": "documentary" } }
If true, the assistant will be instructed to return a JSON response. Cannot be used with streaming.
If true, the assistant will be instructed to return highlights from the referenced documents that support its response.
Controls the context snippets sent to the LLM.
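The retrieval options above can be combined in a single request body. The sketch below assembles one; the field names for the nested context options (`top_k`, `snippet_size`) are assumptions and should be checked against the API reference.

```python
# Hedged sketch of a request body combining the retrieval options above.
# The "context_options" sub-fields are assumptions, not confirmed names.
body = {
    "messages": [{"role": "user", "content": "Summarize the 2024 films."}],
    "model": "gpt-4o",
    "temperature": 0.2,                              # more deterministic output
    "filter": {"genre": {"$ne": "documentary"}},     # metadata filter example
    "json_response": False,                          # cannot be combined with streaming
    "include_highlights": True,                      # return supporting highlights
    "context_options": {"top_k": 16, "snippet_size": 2048},  # assumed field names
    "stream": False,
}
print(sorted(body))
```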
Chat request successful.
The ChatModel describes the response format of a chat request from the citation API.
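To make the structured-citation response concrete, here is a hedged sketch of walking a ChatModel-style response. The sample dictionary is illustrative only, not real API output; the field names (`message`, `citations`, `references`, `file`, `pages`) follow the documented shape but should be verified against the API reference.

```python
# Illustrative ChatModel-style response; not real API output.
sample = {
    "message": {"role": "assistant", "content": "Refunds are allowed within 30 days."},
    "citations": [
        {
            "position": 28,  # character offset in the answer the citation supports
            "references": [
                {"file": {"name": "policy.pdf"}, "pages": [2]},
            ],
        }
    ],
}


def cited_files(response):
    """Collect unique file names referenced across all citations."""
    names = []
    for citation in response.get("citations", []):
        for ref in citation.get("references", []):
            name = ref.get("file", {}).get("name")
            if name and name not in names:
                names.append(name)
    return names


print(cited_files(sample))  # → ['policy.pdf']
```

Parsing the citations out of the structured response like this is the main advantage this endpoint has over the OpenAI-compatible chat interface, which returns only the answer text.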