
How to Evaluate your LLM Response


You can use Command A to evaluate natural language responses that cannot be easily scored with hand-written rules.

Prompt

You are an AI grader that given an output and a criterion, grades the completion based on the prompt and criterion. Below is a prompt, a completion, and a criterion with which to
grade the completion. You need to respond according to the criterion instructions.
## Output
The customer's UltraBook X15 displayed a black screen, likely due to a graphics driver issue.
Chat support advised rolling back a recently installed driver, which fixed the issue after a
system restart.
## Criterion
Rate the output text with a score between 0 and 1. 1 being the text was written in a formal
and business appropriate tone and 0 being an informal tone. Respond only with the score.

Output

0.8
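
Because the criterion asks the model to respond only with the score, the reply arrives as text such as "0.8". A minimal sketch of turning that reply into a number is shown here; `parse_score` is a hypothetical helper, not part of the Cohere SDK, and it assumes the grader returns a bare number with nothing else around it.

```python
def parse_score(reply: str) -> float:
    """Convert the grader's text reply into a float in [0, 1].

    Hypothetical helper: assumes the reply is a bare number, as the
    criterion requests. Anything else raises ValueError.
    """
    text = reply.strip()
    try:
        score = float(text)
    except ValueError:
        raise ValueError(f"expected a bare number, got: {text!r}") from None
    # Reject values outside the range requested in the criterion
    # (this comparison also rejects NaN).
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside the range [0, 1]")
    return score


print(parse_score("0.8"))
```

Failing loudly on an unexpected reply is a design choice for this sketch; a production grader might retry the request or log the raw reply instead.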

API Request

PYTHON
import cohere

# Create a v2 client; replace the placeholder with your API key.
co = cohere.ClientV2(api_key="<YOUR API KEY>")

# Send the grading prompt as a single user message.
response = co.chat(
    model="command-a-plus-05-2026",
    messages=[
        {
            "role": "user",
            "content": """
You are an AI grader that given an output and a criterion, grades the completion based on
the prompt and criterion. Below is a prompt, a completion, and a criterion with which to grade
the completion. You need to respond according to the criterion instructions.

## Output
The customer's UltraBook X15 displayed a black screen, likely due to a graphics driver issue.
Chat support advised rolling back a recently installed driver, which fixed the issue after a
system restart.

## Criterion
Rate the output text with a score between 0 and 1. 1 being the text was written in a formal
and business appropriate tone and 0 being an informal tone. Respond only with the score.
""",
        }
    ],
)

# The grader's reply comes back as text, for example "0.8".
print(response.message.content[0].text)
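
To grade many outputs against different criteria, the prompt can be assembled from a template rather than written out each time. The sketch below reuses the wording of the example prompt; `GRADER_TEMPLATE` and `build_grader_prompt` are hypothetical names introduced for illustration, not part of the Cohere SDK.

```python
# Template based on the example grading prompt, with placeholders for the
# text to grade and the grading criterion. Names here are illustrative.
GRADER_TEMPLATE = """\
You are an AI grader that given an output and a criterion, grades the completion based on
the prompt and criterion. Below is a prompt, a completion, and a criterion with which to grade
the completion. You need to respond according to the criterion instructions.

## Output
{output}

## Criterion
{criterion}
"""


def build_grader_prompt(output: str, criterion: str) -> str:
    """Fill the grading template with one output and one criterion."""
    return GRADER_TEMPLATE.format(
        output=output.strip(), criterion=criterion.strip()
    )


prompt = build_grader_prompt(
    output="Chat support advised rolling back a recently installed driver.",
    criterion="Rate the output text with a score between 0 and 1 for formal tone. "
    "Respond only with the score.",
)
print(prompt)
```

The returned string can be passed as the "content" of the user message in the `co.chat` call.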