Track ML experiments with MLFlow
Access this AI accelerator on GitHub
Experimentation is an essential part of any machine learning developer’s day-to-day work. For time series projects, the number of parameters and settings to tune to achieve the best model forms, in itself, a vast search space.
Many of the experiments in time series use cases are common and repeatable, so tracking them and logging their results is a task worth streamlining. Manual errors and time constraints can lead to the selection of suboptimal models, leaving better candidates unexplored.
Integrating the DataRobot API, Papermill, and MLFlow automates machine learning experimentation, making it easier to run, more robust, and simpler to share.
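For example, inside the experiment notebook each run's settings and accuracy metrics can be recorded with MLFlow's tracking API. This is a minimal sketch; the experiment name, parameter names, and metric values shown are illustrative assumptions, not values from the accelerator.

```python
import mlflow

# Illustrative values; the accelerator's actual parameters and metrics may differ.
params = {"feature_derivation_window": -35, "forecast_distance": 7}
mae = 12.3  # accuracy metric computed earlier in the notebook

mlflow.set_experiment("ts-experiments")          # hypothetical experiment name
with mlflow.start_run(run_name="fdw_-35_fd_7"):  # one run per parameter permutation
    mlflow.log_params(params)                    # record the settings used
    mlflow.log_metric("MAE", mae)                # record the resulting accuracy
```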
As illustrated below, you use the orchestration notebook to design and run the experiment notebook, with the parameter permutations handled automatically by DataRobot. At the end of the experiments, a copy of the experiment notebook, including its outputs, is available for each permutation for collaboration and reference.
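A minimal sketch of that orchestration pattern is shown below, using Papermill to execute the experiment notebook once per parameter permutation. The notebook filenames and parameter grid are assumptions for illustration.

```python
import itertools
import papermill as pm

# Hypothetical parameter grid; the accelerator defines its own permutations.
windows = [-14, -35, -70]
distances = [7, 14]

for fdw, fd in itertools.product(windows, distances):
    # Each execution produces a separate copy of the experiment notebook
    # with its outputs preserved for collaboration and reference.
    pm.execute_notebook(
        "experiment_notebook.ipynb",               # assumed input notebook name
        f"runs/experiment_fdw{fdw}_fd{fd}.ipynb",  # output copy per permutation
        parameters={
            "feature_derivation_window": fdw,
            "forecast_distance": fd,
        },
    )
```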
You can review the dependencies for the accelerator.
This accelerator covers the following activities:
- Acquiring a training dataset.
- Building a new DataRobot project.
- Deploying a recommended model.
- Scoring via Spark using DataRobot's exportable Java Scoring Code.
- Scoring via DataRobot's Prediction API.
- Reporting monitoring data to the MLOps agent framework in DataRobot.
- Writing results back to a new table.
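As a rough illustration of the project-building and deployment steps listed above, the sketch below uses the DataRobot Python client. The dataset path, target, and deployment label are assumptions, time series settings are omitted for brevity, and the remaining steps (Spark scoring, the Prediction API, MLOps agent reporting, and write-back) are covered in the accelerator notebooks.

```python
import datarobot as dr

# Connect to DataRobot (endpoint and token are placeholders).
dr.Client(endpoint="https://app.datarobot.com/api/v2", token="<API_TOKEN>")

# Build a new project from a training dataset (hypothetical file and target);
# datetime partitioning and other time series settings are omitted here.
project = dr.Project.create(sourcedata="training_data.csv",
                            project_name="ts-accelerator-demo")
project.set_target(target="sales")
project.wait_for_autopilot()

# Retrieve the recommended model and deploy it.
recommended = dr.ModelRecommendation.get(project.id).get_model()
deployment = dr.Deployment.create_from_learning_model(
    model_id=recommended.id,
    label="ts-accelerator-demo deployment",  # assumed label
)
print(deployment.id)
```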