This feature is in early access and is not intended for production usage.
How it works
1. You provide a query and a list of search results.
2. Pinecone prompts an LLM to evaluate the results.
3. Pinecone returns an evaluation for each result.
Generate evaluations
To evaluate the relevance of search results, use the eval API endpoint with the following request parameters:
Parameter | Description |
---|---|
query.inputs.text | The query text. |
eval.fields | The field in each search result to evaluate. |
eval.mode | The mode of the prompt sent to the LLM. Accepted values: "search" or "rag". This determines how the LLM evaluates and scores the relevance of results. For more details, see Relevance scores. |
hits | The search results to evaluate. |
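
As a rough illustration of how the parameters in the table fit together, the sketch below assembles a request body in Python. It rests on several assumptions that this page does not confirm: that dotted names such as query.inputs.text map to nested JSON objects, that eval.fields takes a list of field names, and that each hit is an object with an _id and a fields map. The helper `build_eval_request` and the field name `chunk_text` are hypothetical. The endpoint URL and authentication headers are not given in this section, so the sketch stops at producing the JSON.

```python
import json


def build_eval_request(query_text, hits, field="chunk_text", mode="search"):
    """Assemble a request body from the parameters in the table above.

    Assumption: dotted parameter names are nested JSON keys.
    """
    if mode not in ("search", "rag"):
        raise ValueError('mode must be "search" or "rag"')
    return {
        "query": {"inputs": {"text": query_text}},
        "eval": {"fields": [field], "mode": mode},
        "hits": hits,
    }


# Hypothetical search results; the hit layout here is an assumption.
hits = [
    {"_id": "doc1", "fields": {"chunk_text": "Paris is the capital of France."}},
    {"_id": "doc2", "fields": {"chunk_text": "Bananas are rich in potassium."}},
]

body = build_eval_request("What is the capital of France?", hits)
print(json.dumps(body, indent=2))
```

Validating the mode value before sending keeps an obvious typo from reaching the service, since the table lists only "search" and "rag" as accepted values.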
Include evaluation details
To include more detailed scoring and justification in the response, set eval.debug to true. The response then includes a score and justification for each result. For details on how the LLM evaluates relevance, see Relevance scores.
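A minimal sketch of working with evaluation details in Python might look like the following. It assumes that eval.debug sits next to the other eval settings as a nested JSON key, and that each evaluated result exposes `_id`, `score`, and `justification` keys. The response layout, the helpers `enable_debug` and `summarize`, and the sample data are all hypothetical and are not taken from real API output.

```python
import copy


def enable_debug(body):
    """Return a copy of a request body with eval.debug set to true."""
    updated = copy.deepcopy(body)
    updated.setdefault("eval", {})["debug"] = True
    return updated


def summarize(evaluated_hits):
    """Format one line per evaluated hit: id, score, justification."""
    return [
        f'{hit["_id"]}: score={hit["score"]} ({hit["justification"]})'
        for hit in evaluated_hits
    ]


request_body = enable_debug({"eval": {"fields": ["chunk_text"], "mode": "search"}})

# Made-up evaluations for illustration only, not real API output.
sample_evaluations = [
    {"_id": "doc1", "score": 3, "justification": "Directly answers the query."},
    {"_id": "doc2", "score": 0, "justification": "Off topic."},
]

for line in summarize(sample_evaluations):
    print(line)
```

Copying the body before modifying it leaves the original request reusable for a run without debug output.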
Set a relevance threshold
By default, any result that the LLM gives a relevance score of 2 (moderately relevant) or greater is considered relevant. You can change this by setting the eval.relevance_threshold parameter. For example, with a threshold of 3 (highly relevant), only results that meet the threshold are marked "relevant": true in the response.
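The thresholding rule described above can be mirrored locally, for example to re-label stored scores without calling the API again. This is only a sketch of the rule as stated in the text (a result is relevant when its score is at or above the threshold, with a default of 2), not the service's implementation; `mark_relevant` and the sample scores are hypothetical.

```python
DEFAULT_THRESHOLD = 2  # default relevance threshold stated in the text


def mark_relevant(scores, threshold=DEFAULT_THRESHOLD):
    """Map each result id to True when its score meets the threshold."""
    return {hit_id: score >= threshold for hit_id, score in scores.items()}


# Hypothetical scores keyed by result id.
scores = {"doc1": 3, "doc2": 2, "doc3": 1}

print(mark_relevant(scores))               # default threshold of 2
print(mark_relevant(scores, threshold=3))  # stricter threshold of 3
```

With these sample scores, raising the threshold from 2 to 3 changes doc2 from relevant to not relevant, while doc1 and doc3 are unaffected.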
Relevance scores
The relevance scores assigned to search results depend on the mode parameter in the request.
When the mode parameter is set to "search", the LLM is prompted to evaluate each result as a direct response to the query and to give each result a score between 0 and 3 based on the following criteria:
Score | Description |
---|---|
3 | Highly relevant: Passage precisely addresses the core query, provides comprehensive and directly applicable information, contains minimal irrelevant content, and delivers factually accurate insights. |
2 | Moderately relevant: Passage addresses a substantial portion of the query but may miss some elements, provides useful information that lacks some depth or comprehensiveness, and contains only minor irrelevant details. |
1 | Partially relevant: Passage touches on query-related aspects but lacks depth or covers only a small part, contains notable irrelevant content, or requires additional context to be useful. |
0 | Not relevant: Passage fails to address the query, contains primarily irrelevant or off-topic content, or provides no meaningful insight for the query. |
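
When post-processing evaluations, it can help to translate the numeric scores from the table above into their labels. The sketch below does this for "search" mode only; the labels come from the table, while `label_for`, `tally`, and the sample scores are hypothetical.

```python
from collections import Counter

# Labels for "search" mode, taken from the table above.
SEARCH_MODE_LABELS = {
    3: "Highly relevant",
    2: "Moderately relevant",
    1: "Partially relevant",
    0: "Not relevant",
}


def label_for(score):
    """Return the label for a search-mode score, rejecting unknown values."""
    if score not in SEARCH_MODE_LABELS:
        raise ValueError(f"score must be an integer from 0 to 3, got {score!r}")
    return SEARCH_MODE_LABELS[score]


def tally(scores):
    """Count how many results fall under each label."""
    return Counter(label_for(score) for score in scores)


# Hypothetical scores for five results.
counts = tally([3, 2, 2, 0, 1])
for label, count in counts.items():
    print(f"{label}: {count}")
```

A tally like this gives a quick view of how a result set is distributed across the four relevance levels.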