Multiclass Classification
Computes a set of multiclass classification metrics by comparing each sample's predicted label with its ground-truth label across all samples. Use this when your scorer outputs both a prediction and a ground-truth label per sample and the label space has more than two classes. Class labels are inferred automatically from the data, so you do not need to enumerate them. For two-class problems, use Binary Classification instead.
Output
Multiple metrics per class and overall:
accuracy: Fraction of samples correctly classified across all classes.
{class}_precision, {class}_recall: Per-class metrics.
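To make the output names concrete, here is a minimal Python sketch of how accuracy and per-class precision and recall could be computed from paired labels. It is an illustration under stated assumptions, not the product's implementation: the function name multiclass_metrics is hypothetical, and returning 0.0 when a class has no predictions or no ground-truth samples is an assumed convention.

```python
from collections import Counter


def multiclass_metrics(gt_labels, pred_labels):
    """Return accuracy plus {class}_precision and {class}_recall per class.

    Hypothetical helper for illustration only.
    """
    if len(gt_labels) != len(pred_labels):
        raise ValueError("gt_labels and pred_labels must have the same length")
    if not gt_labels:
        raise ValueError("at least one sample is required")

    # Classes are taken from every label seen on either side.
    classes = sorted(set(gt_labels) | set(pred_labels), key=str)

    gt_count = Counter(gt_labels)      # samples that truly belong to each class
    pred_count = Counter(pred_labels)  # samples predicted as each class
    true_pos = Counter(gt for gt, pred in zip(gt_labels, pred_labels) if gt == pred)

    metrics = {"accuracy": sum(true_pos.values()) / len(gt_labels)}
    for cls in classes:
        # Assumed convention: report 0.0 when the denominator is zero.
        metrics[f"{cls}_precision"] = (
            true_pos[cls] / pred_count[cls] if pred_count[cls] else 0.0
        )
        metrics[f"{cls}_recall"] = (
            true_pos[cls] / gt_count[cls] if gt_count[cls] else 0.0
        )
    return metrics


gt = ["positive", "neutral", "negative", "positive"]
pred = ["positive", "negative", "negative", "neutral"]
result = multiclass_metrics(gt, pred)
print(result)
```

In this toy run, two of four samples match, so accuracy is 0.5; "positive" has precision 1.0 and recall 0.5, while "negative" has precision 0.5 and recall 1.0.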
Class Inference
Class labels are derived from the values found in field_gt and field_pred
at runtime. No configuration is needed to specify the label set in advance.
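As a rough illustration of runtime class inference, the sketch below collects the label set from the two configured fields. The function name infer_classes and the list-of-dicts shape of the scores are assumptions made for this example; the actual internal representation is not documented here.

```python
def infer_classes(scores, field_gt="gt_label", field_pred="pred_label"):
    """Collect every distinct label seen in either field (hypothetical helper)."""
    labels = set()
    for row in scores:
        labels.add(row[field_gt])
        labels.add(row[field_pred])
    # Sort by string form so the order is stable even with mixed label types.
    return sorted(labels, key=str)


# Assumed shape: one dict of scores per sample.
scores = [
    {"gt_label": "positive", "pred_label": "positive"},
    {"gt_label": "neutral", "pred_label": "negative"},
    {"gt_label": "positive", "pred_label": "mixed"},
]
classes = infer_classes(scores)
print(classes)
```

Under this reading, a label that appears only in the predictions (here "mixed") still becomes a class, so an unexpected prediction value would surface as its own per-class metrics.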
Examples
Example: Sentiment Classification. A labeler assigns one of three sentiment labels to each response. The multiclass classification metric compares the assigned labels against a ground-truth sentiment column in the dataset.
scorers:
  - type: python
    compute_scores_snippet: !include compute_scores.py
    # Produces 'gt_label' and 'pred_label' scores
metrics:
  - type: multiclass-classification
    field_gt: gt_label
    field_pred: pred_label
Configuration
Properties
type Literal "multiclass-classification" required
The type of the metric.
field_gt string, TemplateValue required
The field in the scores containing the ground-truth answer.
field_pred string, TemplateValue required
The field in the scores containing the predicted answer.
key string
Unique identifier assigned to the entity in AI GO!.
Default: None
