mlflow.evaluation
class mlflow.evaluation.Assessment(name: str, source: Optional[mlflow.entities.assessment_source.AssessmentSource] = None, value: Optional[Union[bool, float, str]] = None, rationale: Optional[str] = None, metadata: Optional[dict] = None, error_code: Optional[str] = None, error_message: Optional[str] = None)

Note
Experimental: This class may change or be removed in a future release without warning.
Assessment data associated with an evaluation result.
Assessment is an enriched output from the evaluation that provides more context, such as the rationale, source, and metadata for the evaluation result.
Example:
from mlflow.evaluation import Assessment

assessment = Assessment(
    name="answer_correctness",
    value=0.5,
    rationale="The answer is partially correct.",
)
classmethod from_dictionary(assessment_dict: dict) → mlflow.evaluation.assessment.Assessment

Create an Assessment object from a dictionary.
- Parameters
assessment_dict (dict) – Dictionary containing assessment information.
- Returns
The Assessment object created from the dictionary.
- Return type
Assessment
to_dictionary() → dict

Convert the Assessment object to a dictionary.
- Returns
The Assessment object represented as a dictionary.
- Return type
dict
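A minimal sketch of round-tripping an Assessment through its dictionary form, assuming from_dictionary and to_dictionary are inverses as their signatures suggest:

from mlflow.evaluation import Assessment

assessment = Assessment(
    name="answer_correctness",
    value=0.5,
    rationale="The answer is partially correct.",
)

# Serialize to a plain dictionary, then reconstruct an equivalent Assessment.
assessment_dict = assessment.to_dictionary()
restored = Assessment.from_dictionary(assessment_dict)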
class mlflow.evaluation.AssessmentSource(source_type: str, source_id: str, metadata: Optional[dict] = None)

Note
Experimental: This class may change or be removed in a future release without warning.
Source of an assessment (human, LLM as a judge with GPT-4, etc.).
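For illustration, a sketch of attaching a source to an assessment. The "HUMAN" source type string and the reviewer identifier are assumptions, not values documented on this page:

from mlflow.evaluation import Assessment, AssessmentSource

# "HUMAN" is an assumed source type string; source_id identifies the reviewer.
source = AssessmentSource(source_type="HUMAN", source_id="reviewer@example.com")

assessment = Assessment(
    name="answer_correctness",
    value=1.0,
    source=source,
    rationale="Verified manually.",
)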
classmethod from_dictionary(source_dict: dict) → mlflow.entities.assessment_source.AssessmentSource

Create an AssessmentSource object from a dictionary.
- Parameters
source_dict (dict) – Dictionary containing assessment source information.
- Returns
The AssessmentSource object created from the dictionary.
- Return type
AssessmentSource

to_dictionary() → dict

Convert the AssessmentSource object to a dictionary.
- Returns
The AssessmentSource object represented as a dictionary.
- Return type
dict
class mlflow.evaluation.AssessmentSourceType(source_type: str)

Note
Experimental: This class may change or be removed in a future release without warning.
class mlflow.evaluation.Evaluation(inputs: dict, outputs: Optional[dict] = None, inputs_id: Optional[str] = None, request_id: Optional[str] = None, targets: Optional[dict] = None, error_code: Optional[str] = None, error_message: Optional[str] = None, assessments: Optional[list] = None, metrics: Optional[Union[dict, list]] = None, tags: Optional[dict] = None)

Note
Experimental: This class may change or be removed in a future release without warning.
Evaluation result data.
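A minimal sketch of constructing an Evaluation with an attached assessment; the constructor parameters come from the signature above, while the input and output field names and tag values are illustrative assumptions:

from mlflow.evaluation import Assessment, Evaluation

evaluation = Evaluation(
    inputs={"question": "What is MLflow?"},           # assumed input payload shape
    outputs={"answer": "MLflow is an ML platform."},   # assumed output payload shape
    assessments=[
        Assessment(
            name="answer_correctness",
            value=0.9,
            rationale="The answer is correct but terse.",
        )
    ],
    tags={"model": "my-model"},  # illustrative tag
)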
classmethod from_dictionary(evaluation_dict: dict)

Create an Evaluation object from a dictionary.
- Parameters
evaluation_dict (dict) – Dictionary containing evaluation information.
- Returns
The Evaluation object created from the dictionary.
- Return type
Evaluation
to_dictionary() → dict

Convert the Evaluation object to a dictionary.
- Returns
The Evaluation object represented as a dictionary.
- Return type
dict
mlflow.evaluation.log_evaluations(*, evaluations: list, run_id: Optional[str] = None) → list

Note
Experimental: This function may change or be removed in a future release without warning.
Logs one or more evaluations to an MLflow Run.
- Parameters
evaluations (List[Evaluation]) – List of one or more MLflow Evaluation objects.
run_id (Optional[str]) – ID of the MLflow Run to log the evaluations to. If unspecified, the current active run is used, or a new run is started.
- Returns
The logged Evaluation objects.
- Return type
List[EvaluationEntity]
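A usage sketch: logging a single Evaluation to an explicit run. The run handling uses the standard mlflow.start_run API, and the evaluation field values are illustrative assumptions:

import mlflow
from mlflow.evaluation import Assessment, Evaluation, log_evaluations

evaluation = Evaluation(
    inputs={"question": "What is MLflow?"},
    outputs={"answer": "MLflow is an ML platform."},
    assessments=[Assessment(name="answer_correctness", value=0.9)],
)

# Log the evaluation against a specific run; omitting run_id would use the
# active run or start a new one.
with mlflow.start_run() as run:
    logged = log_evaluations(evaluations=[evaluation], run_id=run.info.run_id)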