Continuously monitor the performance of your AI-enabled processes.
binary: Simple yes/no or true/false evaluations. Useful for checking if an output meets a specific criterion.
continuous: Numeric scores within a defined range. Useful for rating quality on a scale.
categorical: Fixed set of categories or labels. Useful for classifying outputs into distinct categories.
ordinal: Ordered categories. Useful when categories have a natural order.
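To make the four label types concrete, here is a minimal Python sketch of how a value might be checked against each type. The names `LabelType`, `is_valid_value`, and `ordinal_rank`, along with the `value_range` and `categories` parameters, are hypothetical and chosen for illustration; they are not taken from any SDK.

```python
from enum import Enum
from typing import Optional, Sequence, Tuple, Union

Value = Union[bool, int, float, str]


class LabelType(Enum):
    """Hypothetical enum mirroring the four label types described above."""
    BINARY = "binary"
    CONTINUOUS = "continuous"
    CATEGORICAL = "categorical"
    ORDINAL = "ordinal"


def is_valid_value(
    label_type: LabelType,
    value: Value,
    *,
    value_range: Optional[Tuple[float, float]] = None,  # assumed: inclusive bounds
    categories: Optional[Sequence[str]] = None,         # assumed: allowed labels
) -> bool:
    """Return True if `value` fits the given label type."""
    if label_type is LabelType.BINARY:
        return isinstance(value, bool)
    if label_type is LabelType.CONTINUOUS:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return False
        if value_range is None:
            return True
        low, high = value_range
        return low <= value <= high
    # Categorical and ordinal values must both come from the fixed set.
    return categories is not None and value in categories


def ordinal_rank(value: str, categories: Sequence[str]) -> int:
    """Position of `value` in an ordered category list (lowest first)."""
    return list(categories).index(value)
```

In this sketch the only difference between categorical and ordinal labels is that the order of `categories` carries meaning for ordinal ones, which is what `ordinal_rank` exposes.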
binary for clear yes/no criteria
continuous for nuanced quality assessments
categorical for distinct, unordered classifications
ordinal when categories have a natural progression
Evaluators. Here’s an example of creating an evaluator:
output_name: Name of the specific output being evaluated (optional)
field_path: JSON path to a specific field within the output (optional)
value: The evaluation value; must conform to the evaluation label’s type
created_by_evaluator_id: ID of the evaluator if created automatically
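As a rough illustration of how these fields fit together, the following Python sketch models an evaluation record. The `Evaluation` class and its `is_automatic` property are hypothetical, as are the example path and evaluator ID; only the four field names come from the list above.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Evaluation:
    """Hypothetical record holding the fields listed above."""
    value: Union[bool, int, float, str]            # must match the label's type
    output_name: Optional[str] = None              # optional: which output was evaluated
    field_path: Optional[str] = None               # optional: JSON path within that output
    created_by_evaluator_id: Optional[str] = None  # set only for automatic evaluations

    @property
    def is_automatic(self) -> bool:
        """True when an evaluator, rather than a person, created the record."""
        return self.created_by_evaluator_id is not None


# An automatic binary evaluation of one field of a named output.
auto_eval = Evaluation(
    value=True,
    output_name="answer",
    field_path="$.summary",                   # illustrative JSON path
    created_by_evaluator_id="evaluator-123",  # illustrative ID
)

# A manual continuous evaluation with the optional fields left unset.
manual_eval = Evaluation(value=0.8)
```

Keeping the optional fields as `None` by default reflects that only `value` is required in the list above.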