The `content` parameter in the request cannot be empty.

`signed_url` provides temporary, read-only access to the relevant file. Anyone with the link can access the file, so treat it as sensitive data. The link expires in one hour.

The `content` parameter in the request cannot be empty.

- `"role":"assistant"`, which indicates that the assistant is responding to the user’s message.
- A `content` field (e.g., `"content":"The"`), which is part of the assistant’s streamed response to the user’s message.
- `"finish_reason":"stop"`, which indicates that the assistant has finished responding to the user’s message.

Use the `json_response` parameter to instruct the assistant to return the response as JSON key-value pairs. This is useful if you need to parse the response programmatically.
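As a rough illustration of handling both response styles, the sketch below joins streamed content pieces until a chunk reports `"finish_reason":"stop"`, and parses a JSON-mode reply with the standard `json` module. The chunk and response dictionaries are hypothetical and reduced to the fields named in this section; it assumes streamed text sits under `delta.content` and the JSON reply is a string under `message.content`. Real payloads may carry additional fields.

```python
import json


def collect_stream(chunks):
    """Join streamed content pieces until a chunk reports finish_reason == "stop"."""
    parts = []
    for chunk in chunks:
        # Assumption: content chunks carry their text under delta.content.
        delta = chunk.get("delta") or {}
        if "content" in delta:
            parts.append(delta["content"])
        if chunk.get("finish_reason") == "stop":
            break
    return "".join(parts)


# Hypothetical chunks, reduced to the fields described in this section.
sample_chunks = [
    {"role": "assistant"},
    {"delta": {"content": "The"}},
    {"delta": {"content": " answer"}},
    {"finish_reason": "stop"},
]
streamed_text = collect_stream(sample_chunks)

# Hypothetical JSON-mode response: message.content holds a JSON string.
json_mode_response = {
    "message": {
        "role": "assistant",
        "content": '{"title": "Example", "year": 2024}',
    }
}
parsed = json.loads(json_mode_response["message"]["content"])
```

Accumulating the pieces in a list and joining once at the end avoids repeated string concatenation on long responses.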
`stream` parameter.

- `message.content` for the default chat response
- `delta.content` for the streaming chat response
- `message.content` for the JSON response

- `gpt-4o` (default)
- `gpt-4.1`
- `o4-mini`
- `claude-3-5-sonnet`
- `claude-3-7-sonnet`
- `gemini-2.5-pro`

`model` parameter in the request:
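A minimal sketch of a request body that selects a model, assuming the chat request is a JSON object with `messages` and `model` keys. The helper `build_chat_body` and the sample question are hypothetical, and the `"user"` role is an assumption. The model names are the ones listed in this section.

```python
import json

# Model names listed in this section; gpt-4o is the stated default.
SUPPORTED_MODELS = (
    "gpt-4o",
    "gpt-4.1",
    "o4-mini",
    "claude-3-5-sonnet",
    "claude-3-7-sonnet",
    "gemini-2.5-pro",
)


def build_chat_body(content, model="gpt-4o"):
    """Hypothetical helper: build a chat request body for one user message."""
    if not content:
        # The content parameter in the request cannot be empty.
        raise ValueError("content cannot be empty")
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model}")
    return {
        "messages": [{"role": "user", "content": content}],
        "model": model,
    }


body = build_chat_body("Summarize the uploaded report.", model="gpt-4.1")
print(json.dumps(body, indent=2))
```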
`messages` object.

In the following example, the `messages` object includes prior messages that are necessary for interpreting the newest message.
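One way to carry conversation history is sketched here, assuming each entry in `messages` is an object with `role` and `content` keys. The helper `add_turn` and the sample conversation are hypothetical. The last question ("Who wrote it?") only makes sense alongside the earlier turns, which is why they are sent together.

```python
def add_turn(history, role, content):
    """Return a new history list with one more message appended."""
    if not content:
        # The content parameter in the request cannot be empty.
        raise ValueError("content cannot be empty")
    return history + [{"role": role, "content": content}]


history = []
history = add_turn(history, "user", "Which document covers onboarding?")
history = add_turn(history, "assistant", "The onboarding guide covers it.")
history = add_turn(history, "user", "Who wrote it?")

# The full history is sent so the newest message can be interpreted in context.
body = {"messages": history}
```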
`"resource": "encyclopedia"`.
`2025-04` and later.

`top_k * snippet_size`. These parameters can be adjusted by setting `context_options` in the request:

- `snippet_size`: Controls the maximum size of a snippet (default is 2048 tokens). Note that snippet size can vary and, in rare cases, may be larger than the set `snippet_size`. Snippet size controls the amount of context the model is given for each chunk of text.
- `top_k`: Controls the maximum number of context snippets sent to the LLM (default is 16). `top_k` controls the diversity of information sent to the model.

`top_k` and `snippet_size` can help manage token consumption.
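To make the arithmetic concrete, the sketch below computes `top_k * snippet_size` for the stated defaults and for a custom `context_options` object. It assumes the product is an approximate ceiling on context tokens (snippets can occasionally exceed `snippet_size`, as noted above). The helper `context_budget` and the custom values are hypothetical.

```python
DEFAULT_TOP_K = 16           # default number of context snippets
DEFAULT_SNIPPET_SIZE = 2048  # default snippet size, in tokens


def context_budget(top_k=DEFAULT_TOP_K, snippet_size=DEFAULT_SNIPPET_SIZE):
    """Approximate ceiling on context tokens: top_k * snippet_size."""
    return top_k * snippet_size


# Hypothetical request body that lowers top_k and raises snippet_size.
body = {
    "messages": [{"role": "user", "content": "Summarize the uploaded report."}],
    "context_options": {"top_k": 10, "snippet_size": 2500},
}

default_budget = context_budget()                          # 16 * 2048
custom_budget = context_budget(**body["context_options"])  # 10 * 2500
```

With these values, the custom settings send fewer but larger snippets and a smaller total than the defaults.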
`2025-04` and later.

`temperature` parameter in the request. If a model does not support a temperature parameter, the parameter is ignored.
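A small sketch of adding the parameter to a request body, assuming it is a top-level `temperature` key. The helper `with_temperature` and the value `0.2` are illustrative only; this section does not state a valid range, so none is enforced here.

```python
def with_temperature(body, temperature=None):
    """Return a copy of the body, adding temperature only when one is given."""
    updated = dict(body)
    if temperature is not None:
        updated["temperature"] = temperature
    return updated


base_body = {"messages": [{"role": "user", "content": "Summarize the uploaded report."}]}
tuned_body = with_temperature(base_body, temperature=0.2)
```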
`2025-04` and later.

`citation` object. The object includes a reference to the document that the assistant used to generate the response. Additionally, you can include highlights, which are the specific parts of the document that the assistant used to generate the response, by setting the `include_highlights` parameter to `true` in the request:
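A minimal sketch of such a request body, assuming `include_highlights` is a top-level key next to `messages`. The sample message is hypothetical. Python's `True` serializes to the JSON literal `true` expected in the request.

```python
import json

# Hypothetical request body asking for highlights alongside citations.
body = {
    "messages": [{"role": "user", "content": "Summarize the uploaded report."}],
    "include_highlights": True,
}

# json.dumps converts Python's True into JSON's true.
payload = json.dumps(body)
```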