Labeler (via Model)

Attaches a free-form label to each sample or solver output using a model. Use this to tag outputs with a label such as language, topic, or sentiment for filtering, analysis, or building labelled datasets. For pass/fail scoring, use the Model As A Judge Classifier instead.

Output

label: The label assigned to the sample by the model.

Label Validation

Use valid_labels to restrict the model to a fixed set of allowed labels. If your model supports structured outputs, enable use_structured_outputs for more reliable label extraction.
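The two settings can be combined. The following is a minimal sketch, assuming the labeler model supports structured outputs; the key and the sentiment labels are illustrative and not taken from a real task.

```yaml
scorers:
  - type: "labeler_via_model"
    key: "sentiment_labeler"            # hypothetical key, chosen for this sketch
    model_key: "<< config.judge_model >>"
    user_prompt: >
      <TEXT>{{ model_output }}</TEXT>
      Label:
    # Illustrative label set; the model is restricted to these values.
    valid_labels:
      - "POSITIVE"
      - "NEGATIVE"
      - "NEUTRAL"
    # Only enable this if the labeler model supports structured outputs.
    use_structured_outputs: true
```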

Examples

Example: Language Detection. Labels each model output with its language for QA purposes; valid_labels restricts the model to a fixed set of language codes.

...
definition:
  ...
  scorers:
    - type: "labeler_via_model"
      key: "language_labeller"
      purpose: "qa"
      model_key: "<< config.judge_model >>"
      system_prompt: >
        You are a language detection assistant.
        Respond with exactly one of these labels:
        - 'ENGLISH': if the text is entirely in English
        - 'OTHER': if the text is in any other language or mixed languages
      user_prompt: >
        <TEXT>{{ solver_output.output }}</TEXT>
        Label:
      valid_labels:
        - "ENGLISH"
        - "OTHER"

Configuration

Properties

type Literal "labeler_via_model" required

The type of the scorer.

model_key Key, TemplateValue required

The model to be used as the labeler.

system_prompt string, TemplateValue

The system prompt given to the labeler model. The prompt can refer to the following variables dynamically (using {{ }} syntax):

In all scenarios:

sample: Sample attributes (ex: {{ sample.answer }})

If the task has a solver:

model_output: The last model output (ex: {{ model_output }})
messages: The full list of input/output messages (ex: {{ messages[0]['content'] }})
input_prompt: The message contents of the last model output (only for chat completion tasks) (ex: {{ input_prompt }})

Default: You are a helpful assistant and will be used to label the output of another model or a dataset sample.

user_prompt string, TemplateValue required

The user prompt given to the labeler model. The prompt can refer to the following variables dynamically (using {{ }} syntax):

In all scenarios:

sample: Sample attributes (ex: {{ sample.answer }})

If the task has a solver:

model_output: The last model output (ex: {{ model_output }})
messages: The full list of input/output messages (ex: {{ messages[0]['content'] }})
input_prompt: The message contents of the last model output (only for chat completion tasks) (ex: {{ input_prompt }})
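As a sketch of how these variables might be combined in a task that has a solver, the prompts below pass the first message and the last model output to the labeler. The key, the topic labels, and the prompt wording are assumptions made for illustration.

```yaml
scorers:
  - type: "labeler_via_model"
    key: "topic_labeler"                # hypothetical key
    model_key: "<< config.judge_model >>"
    system_prompt: >
      You are a topic labeling assistant.
      Respond with exactly one of these labels: 'CODE', 'MATH', 'OTHER'.
    user_prompt: >
      <REQUEST>{{ messages[0]['content'] }}</REQUEST>
      <RESPONSE>{{ model_output }}</RESPONSE>
      Label:
    valid_labels:
      - "CODE"
      - "MATH"
      - "OTHER"
```

Because messages and model_output are only listed for tasks with a solver, a scorer applied directly to dataset samples would use the sample variable instead.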

valid_labels array[string, TemplateValue]

The list of valid labels. To allow any label, use an empty list (default).

Default: []
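For free-form labeling, valid_labels can simply be left out so that the default empty list applies. The sketch below labels dataset samples directly; the key and prompt wording are hypothetical, and it assumes the sample has an answer attribute, as in the variable examples above.

```yaml
scorers:
  - type: "labeler_via_model"
    key: "free_form_topic_labeler"      # hypothetical key
    purpose: "qa"
    model_key: "<< config.judge_model >>"
    system_prompt: >
      You are a labeling assistant.
      Respond with a single short topic label and nothing else.
    user_prompt: >
      <TEXT>{{ sample.answer }}</TEXT>
      Label:
    # valid_labels is omitted, so the default [] applies and any label is accepted.
```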

use_structured_outputs boolean

Whether to use structured outputs. It is recommended to enable this if the model supports it.

Default: False

purpose ScorerPurpose

The purpose of this scorer.

score: The scorer is used to score the solver output or the dataset sample.
qa: The scorer is used to do QA over the solver output or the dataset sample.

Default: score

key string

Unique identifier assigned to the entity in AI GO!.

Default: None

display_name string

The display name of the scorer.

Default: None

metrics array[PythonMetricTemplate, BinaryClassificationMetricTemplate, MulticlassClassificationMetricTemplate, MeanMetricTemplate, MaxMetricTemplate, MinMetricTemplate, StdDevMetricTemplate, FrequencyMetricTemplate, RecallMetricTemplate, PrecisionMetricTemplate, F1ScoreMetricTemplate]

The metrics associated with this scorer, which will produce per-task metrics.

Default: None