examples/evaluation/README.md
The examples in this directory demonstrate how to use the `mlflow.evaluate()` API. Specifically,
they show how to evaluate a PyFunc model on a specified dataset using the built-in default evaluator
and specified extra metrics, logging the resulting metrics and artifacts to MLflow Tracking.
They also show how to specify validation thresholds on the resulting metrics to validate the quality
of your model. See the full list of examples below:
- `evaluate_on_binary_classifier.py` evaluates an xgboost `XGBClassifier` model on the dataset loaded by
  `shap.datasets.adult`.
- `evaluate_on_multiclass_classifier.py` evaluates a scikit-learn `LogisticRegression` model on a dataset
  generated by `sklearn.datasets.make_classification`.
- `evaluate_on_regressor.py` evaluates a scikit-learn `LinearRegression` model on the dataset loaded by
  `sklearn.datasets.load_diabetes`.
- `evaluate_with_custom_metrics.py` evaluates a scikit-learn `LinearRegression` model with a custom metric
  function on the dataset loaded by `sklearn.datasets.load_diabetes`.
- `evaluate_with_custom_metrics_comprehensive.py` evaluates a scikit-learn `LinearRegression` model with a
  comprehensive list of custom metric functions on the dataset loaded by `sklearn.datasets.load_diabetes`.
- `evaluate_with_model_validation.py` trains both a candidate xgboost `XGBClassifier` model and a baseline
  `DummyClassifier` model on the dataset loaded by `shap.datasets.adult`. Then, it validates the candidate
  model against specified thresholds on both built-in and extra metrics, and against the dummy model.

Install the prerequisites:

```
pip install scikit-learn xgboost 'shap>=0.40' matplotlib
```
Run the examples in this directory with Python:

```
python evaluate_on_binary_classifier.py
python evaluate_on_multiclass_classifier.py
python evaluate_on_regressor.py
python evaluate_with_custom_metrics.py
python evaluate_with_custom_metrics_comprehensive.py
python evaluate_with_model_validation.py
```