F1 Score
Computes the F1 score, the harmonic mean of precision and recall, as 2·TP / (2·TP + FP + FN) from per-sample true positive (TP), false positive (FP), and false negative (FN) counts. Use this with scorers that produce counts, for example a scorer that counts how many items in the model's output were correct and how many were not. Pair with Precision and Recall to see the individual components.
Output
A single metric named f1_score by default, or the value of name if provided. Value is in the [0, 1] range; 1.0 means perfect precision and recall.
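The formula can be checked by hand with a small Python sketch. This is an illustration only, not the product's implementation: the function names are hypothetical, and returning 0.0 when all three counts are zero is an assumption, since the behavior for an empty denominator is not specified here.

```python
def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """Return 2*TP / (2*TP + FP + FN) for the given counts."""
    denominator = 2 * tp + fp + fn
    if denominator == 0:
        # Assumption: nothing was expected and nothing was predicted.
        # The documented behavior for this case is not stated.
        return 0.0
    return 2 * tp / denominator


def harmonic_mean_f1(tp: int, fp: int, fn: int) -> float:
    """Same score, computed as the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# 3 correct items, 1 spurious item, 2 missed items:
# precision = 3/4, recall = 3/5, F1 = 6/9
example_f1 = f1_from_counts(tp=3, fp=1, fn=2)
```

Both functions agree on the same counts, which is why the count-based form can be used without computing precision and recall first.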
Examples
Example: Function Call Overlap. A Python scorer checks which required function calls the model made. F1 balances precision and recall into a single overlap score.
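As a rough sketch of the counting logic such a scorer might contain, the Python below compares the required calls with the calls the model made. The function name, its arguments, and the sample call names are hypothetical, and the way a real scorer snippet receives its inputs and returns its fields is not specified here. Only the three output field names are taken from the configuration in this example.

```python
from typing import Dict, Iterable


def count_call_overlap(
    required_calls: Iterable[str], made_calls: Iterable[str]
) -> Dict[str, int]:
    """Count overlap between required and actual function calls.

    Assumption: each call is identified by name and duplicates are ignored.
    """
    required = set(required_calls)
    made = set(made_calls)
    return {
        # Required calls that the model actually made.
        "num_true_positives": len(required & made),
        # Calls the model made that were not required.
        "num_false_positives": len(made - required),
        # Required calls the model did not make.
        "num_false_negatives": len(required - made),
    }


counts = count_call_overlap(
    required_calls=["get_weather", "get_time", "send_email"],
    made_calls=["get_weather", "send_email", "delete_file"],
)
```

With these sample inputs there are 2 true positives, 1 false positive, and 1 false negative, so the F1 formula gives 4 / 6.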
scorers:
  - type: python
    compute_scores_snippet: !include "function_overlap_scorer.py"
    # scorer returns: num_true_positives, num_false_positives, num_false_negatives
metrics:
  - type: precision
    num_true_positives_field: num_true_positives
    num_false_positives_field: num_false_positives
    name: Call Precision
  - type: recall
    num_true_positives_field: num_true_positives
    num_false_negatives_field: num_false_negatives
    name: Call Recall
  - type: f1_score
    num_true_positives_field: num_true_positives
    num_false_positives_field: num_false_positives
    num_false_negatives_field: num_false_negatives
    name: Call F1
Configuration
Properties
type Literal "f1_score"
Type
Default:f1_score
num_true_positives_field string required
The field that contains the number of true positives.
num_false_positives_field string required
The field that contains the number of false positives.
num_false_negatives_field string required
The field that contains the number of false negatives.
name string
The name given to the metric value. If not specified, it is f1_score.
Default:None
key string
Unique identifier assigned to the entity in AI GO!.
Default:None
