nucleus.validate.scenario_test_evaluation#

Data types for Scenario Test Evaluation results.

ScenarioTestEvaluation

The results and attributes of an evaluation of a scenario test.

ScenarioTestEvaluationStatus

The job status of a scenario test evaluation.

ScenarioTestItemEvaluation

Dataset item-level results of an evaluation of a scenario test.

class nucleus.validate.scenario_test_evaluation.ScenarioTestEvaluation#

The results and attributes of an evaluation of a scenario test.

id#

The ID of this scenario test evaluation.

Type:

str

scenario_test_id#

The ID of the associated scenario test.

Type:

str

eval_function_id#

The ID of the associated evaluation function.

Type:

str

model_id#

The ID of the associated model.

Type:

str

status#

The status of the evaluation job.

Type:

str

result#

The numerical result of the evaluation.

Type:

Optional[float]

passed#

Whether the scenario test passed.

Type:

bool

item_evals#

The individual results for each dataset item.

Type:

List[ScenarioTestItemEvaluation]

connection#

The connection to the Nucleus API.

Type:

Connection
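As a minimal sketch of how the attributes above might be read together, the following uses stand-in dataclasses that mirror the documented fields instead of importing the real classes. The names `EvaluationStandIn`, `ItemEvalStandIn`, and `summarize` are hypothetical and not part of the Nucleus API, the IDs and the status string `"completed"` are made-up sample values, and the `connection` attribute is omitted. How an evaluation object is obtained from the client is not covered here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ItemEvalStandIn:
    """Hypothetical stand-in mirroring the documented item-level fields."""
    evaluation_id: str
    scenario_test_id: str
    eval_function_id: str
    dataset_item_id: str
    result: Optional[float]
    passed: bool


@dataclass
class EvaluationStandIn:
    """Hypothetical stand-in mirroring the documented evaluation fields."""
    id: str
    scenario_test_id: str
    eval_function_id: str
    model_id: str
    status: str
    result: Optional[float]
    passed: bool
    item_evals: List[ItemEvalStandIn] = field(default_factory=list)


def summarize(evaluation) -> str:
    """Build a one-line summary; `result` is Optional[float], so handle None."""
    result_text = "n/a" if evaluation.result is None else f"{evaluation.result:.3f}"
    n_items = len(evaluation.item_evals)
    n_passed = sum(1 for item in evaluation.item_evals if item.passed)
    return (
        f"evaluation {evaluation.id}: status={evaluation.status}, "
        f"result={result_text}, passed={evaluation.passed}, "
        f"items passed={n_passed}/{n_items}"
    )


# Sample data only; every ID and value below is invented for illustration.
example = EvaluationStandIn(
    id="eval_1",
    scenario_test_id="test_1",
    eval_function_id="fn_1",
    model_id="model_1",
    status="completed",
    result=0.875,
    passed=True,
    item_evals=[
        ItemEvalStandIn("eval_1", "test_1", "fn_1", "item_a", 0.9, True),
        ItemEvalStandIn("eval_1", "test_1", "fn_1", "item_b", None, False),
    ],
)
print(summarize(example))
```

Because `summarize` only reads attributes, it should work with any object that exposes the fields documented above, under the assumption that they behave as plain attributes.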

class nucleus.validate.scenario_test_evaluation.ScenarioTestEvaluationStatus#

The job status of a scenario test evaluation.

name()#

The name of the Enum member.

value()#

The value of the Enum member.
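The sketch below illustrates `name` and `value` on a stand-in enum, assuming ScenarioTestEvaluationStatus behaves like a standard Python `Enum`, as the descriptions above suggest. `StatusStandIn` and its members are hypothetical; the real member names and values are not listed in this reference. On a standard `Enum`, `name` and `value` are read as attributes, without call parentheses.

```python
from enum import Enum


class StatusStandIn(Enum):
    """Hypothetical stand-in; members are assumptions, not the real ones."""
    PENDING = "pending"
    COMPLETED = "completed"


# Look up a member from its value, e.g. a status string stored elsewhere.
status = StatusStandIn("completed")

print(status.name)   # the member's name as a string
print(status.value)  # the member's underlying value
```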

class nucleus.validate.scenario_test_evaluation.ScenarioTestItemEvaluation#

Dataset item-level results of an evaluation of a scenario test. Note that this class is immutable.

evaluation_id#

The ID of the associated scenario test evaluation.

Type:

str

scenario_test_id#

The ID of the associated scenario test.

Type:

str

eval_function_id#

The ID of the associated evaluation function.

Type:

str

dataset_item_id#

The ID of the dataset item of this evaluation.

Type:

str

result#

The numerical result of the evaluation on this item.

Type:

Optional[float]

passed#

Whether the result was sufficient to pass the test for this item.

Type:

bool
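To illustrate what immutability means in practice, here is a minimal sketch using a frozen dataclass as a stand-in. `FrozenItemEval` and `failing_item_ids` are hypothetical names, the sample IDs are invented, and the mechanism the real class uses to enforce immutability is an assumption here.

```python
from dataclasses import FrozenInstanceError, dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FrozenItemEval:
    """Hypothetical stand-in mirroring the documented item-level fields."""
    evaluation_id: str
    scenario_test_id: str
    eval_function_id: str
    dataset_item_id: str
    result: Optional[float]
    passed: bool


def failing_item_ids(item_evals) -> List[str]:
    """Collect the dataset item IDs whose evaluation did not pass."""
    return [item.dataset_item_id for item in item_evals if not item.passed]


# Sample data only; every ID and value below is invented for illustration.
items = [
    FrozenItemEval("eval_1", "test_1", "fn_1", "item_a", 0.92, True),
    FrozenItemEval("eval_1", "test_1", "fn_1", "item_b", 0.31, False),
]
print(failing_item_ids(items))

# Assigning to a field of a frozen dataclass instance raises an error.
try:
    items[0].passed = False
    mutated = True
except FrozenInstanceError:
    mutated = False
print("mutated:", mutated)
```

Since the items cannot be modified in place, derived views such as the list of failing IDs are built as new objects rather than by editing the evaluations.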