String Equality (Multiple Choice)
Scores each sample by checking whether the first character of the model output matches the correct multiple-choice key, producing an is_correct score of 1.0 (match) or 0.0 (no match). Use this for standard MCQA tasks where the model responds with a single letter such as A, B, C, or D. For free-form answers, use String Equality or the Model Scorer instead.
Output
is_correct: 1.0 if the predicted choice matches the ground-truth key, 0.0 otherwise.
Validity Checking
If the model output does not unambiguously match any of the provided choices,
the sample is marked incorrect and the completion_validity metadata field
is set to INVALID. Only single-character choices are supported.
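The behaviour described above can be approximated with a small Python sketch. It is not the scorer's actual implementation: the function name `score_mcqa`, the `McqaResult` type, and the `"VALID"` value are assumptions made for illustration (the documentation only names `INVALID`), and the sketch treats "the first character is one of the choices" as a stand-in for the unambiguous-match check.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class McqaResult:
    is_correct: float          # 1.0 on a match, 0.0 otherwise
    completion_validity: str   # "INVALID" is documented; "VALID" is assumed here


def score_mcqa(output: str, ground_truth_choice: str,
               choices: Sequence[str]) -> McqaResult:
    """Hypothetical re-implementation of first-character MCQA scoring."""
    # Only the first character is compared. No whitespace is stripped here,
    # because the documentation does not say whether the real scorer trims.
    predicted = output[:1]
    if predicted not in list(choices):
        # Output does not start with any known choice: incorrect and invalid.
        return McqaResult(0.0, "INVALID")
    is_correct = 1.0 if predicted == ground_truth_choice else 0.0
    return McqaResult(is_correct, "VALID")
```

Under these assumptions, `score_mcqa("B. Because ...", "B", ["A", "B", "C"])` is scored as correct, while `score_mcqa("The answer is B", "B", ["A", "B", "C"])` is marked incorrect and `INVALID` because the output starts with `T`.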
Examples
Example: Multiple Choice QA. Shows a task where the dataset provides the correct answer letter and the list of valid single-character choices.
...
definition:
  ...
  scorers:
    - type: "string_equals_mcqa"
      ground_truth_choice: "{{ sample.<< config.answer_column >> }}"
      choices: "{{ sample.choices }}"
      metrics:
        - type: "mean"
          field: "is_correct"
          name: "MCQA Accuracy"

| prompt | answer | choices |
| :------------------------------------------------- | :----- | :-------------- |
| What is the correct answer? A. ..., B. ..., C. ... | B | ['A', 'B', 'C'] |
| What is the correct answer? A. ..., B. ..., C. ... | C      | ['A', 'B', 'C'] |

The choices field expects single-character choices. Multi-character choices (e.g., 'yes' or 'no') are not supported.
Configuration
Properties
type Literal "string_equals_mcqa" required
The type of the scorer.
ground_truth_choice string, TemplateValue
Jinja template that produces the ground truth choice.
The ground-truth choice can be:
- A hard-coded string (e.g. "A")
- A reference to a sample field (e.g. "{{ sample.correct_answer }}")
- Derived from sample data (e.g. "{{ sample.choices[sample.correct_index] }}")
- A mix of the above (e.g. "{{ sample.answer_key | upper }}")
sample represents the current row of the dataset (with a field for every dataset column).
The template should produce a single-character choice (e.g., "B").
Default: None
choices string, TemplateValue
Jinja template that produces the list of choices.
The choices can be:
- A hard-coded list (e.g. ["A", "B", "C", "D"])
- A reference to a sample field (e.g. "{{ sample.answer_choices }}")
- Derived from sample data (e.g. "{{ sample.options | map(attribute='key') | list }}")
sample represents the current row of the dataset (with a field for every dataset column).
The template should produce a list of single-character choices (e.g., ["A", "B", "C", "D"]). The result can be a JSON string or a Python list.
Default: None
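Because the rendered choices may be either a JSON string or a Python list, it can help to check a dataset column offline before running a task. The helper below is a hypothetical pre-flight check, not part of the scorer: the name `normalize_choices` is invented, and the fallback to `ast.literal_eval` for single-quoted lists such as `['A', 'B', 'C']` is an assumption about what input is acceptable.

```python
import ast
import json
from typing import List, Union


def normalize_choices(rendered: Union[str, list]) -> List[str]:
    """Hypothetical check that rendered choices are single characters."""
    if isinstance(rendered, str):
        try:
            # Preferred form: a JSON array such as '["A", "B", "C", "D"]'.
            values = json.loads(rendered)
        except json.JSONDecodeError:
            # Assumed fallback: a Python-style literal such as "['A', 'B']".
            values = ast.literal_eval(rendered)
    else:
        values = list(rendered)
    if not isinstance(values, list):
        raise ValueError("choices must be a list")
    # The scorer documents support for single-character choices only.
    bad = [v for v in values if not (isinstance(v, str) and len(v) == 1)]
    if bad:
        raise ValueError(f"only single-character choices are supported, got {bad!r}")
    return values
```

With this sketch, a column holding `["yes", "no"]` raises a `ValueError` early instead of surfacing later as a run of `INVALID` samples.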
purpose ScorerPurpose
The purpose of this scorer.
- score: The scorer is used to score the solver output or the dataset sample.
- qa: The scorer is used to do QA over the solver output or the dataset sample.
Default: score
key string
Unique identifier assigned to the entity in AI GO!.
Default: None
display_name string
The display name of the scorer.
Default: None
metrics array[PythonMetricTemplate, BinaryClassificationMetricTemplate, MulticlassClassificationMetricTemplate, MeanMetricTemplate, MaxMetricTemplate, MinMetricTemplate, StdDevMetricTemplate, FrequencyMetricTemplate, RecallMetricTemplate, PrecisionMetricTemplate, F1ScoreMetricTemplate]
The metrics associated with this scorer, which will produce per-task metrics.
Default: None
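The example configuration earlier on this page attaches a mean metric to the is_correct field and names it "MCQA Accuracy". As a rough illustration of what that aggregation amounts to, the sketch below averages per-sample scores; the function name `mcqa_accuracy` is hypothetical and the platform's own metric implementation may differ (for example in how it handles empty inputs).

```python
from statistics import fmean
from typing import Iterable


def mcqa_accuracy(is_correct_scores: Iterable[float]) -> float:
    """Mean of per-sample is_correct values (each 1.0 or 0.0)."""
    # fmean raises StatisticsError on an empty input, so an empty task
    # fails loudly rather than reporting an accuracy of 0.
    return fmean(is_correct_scores)
```

Since every sample contributes 1.0 or 0.0, the mean is the fraction of samples answered correctly, with `INVALID` samples counted as incorrect.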
ground_truth_choice_field string, TemplateValue
Column in the dataset that contains the ground truth choice. The column is expected to contain single character choices (e.g. 'B'). The first character of the model's output message content is string matched (string equality) against this value to evaluate a sample.
This field is deprecated and will be removed in future versions. Use 'ground_truth_choice' with a Jinja template instead (e.g., '{{ sample.field_name }}').
Default: None
choices_field string, TemplateValue
Column in the dataset that contains the choices (e.g., a column containing ['A', 'B', 'C', 'D']).
This field is deprecated and will be removed in future versions. Use 'choices' with a Jinja template instead (e.g., '{{ sample.field_name }}').
Default: None
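Migrating from the deprecated `ground_truth_choice_field` and `choices_field` properties follows the pattern given in their descriptions: wrap the column name in a `{{ sample.<column> }}` template. The sketch below applies that rewrite to a scorer config loaded as a Python dict. The function name `migrate_scorer_config` is hypothetical, and the sketch assumes column names are valid attribute names (a name containing spaces or dashes would need a different template form).

```python
def migrate_scorer_config(scorer: dict) -> dict:
    """Hypothetical helper: replace deprecated *_field keys with templates."""
    migrated = dict(scorer)  # shallow copy, the input dict is left untouched
    renames = (
        ("ground_truth_choice_field", "ground_truth_choice"),
        ("choices_field", "choices"),
    )
    for old_key, new_key in renames:
        column = migrated.pop(old_key, None)
        # Keep an explicitly configured template if one is already present.
        if column is not None and new_key not in migrated:
            migrated[new_key] = "{{ sample." + column + " }}"
    return migrated
```

For example, a scorer with `ground_truth_choice_field: "answer"` and `choices_field: "choices"` would be rewritten to use `ground_truth_choice: "{{ sample.answer }}"` and `choices: "{{ sample.choices }}"`.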
