End-to-end ML workflow with Databricks

Access this AI accelerator on GitHub

DataRobot exposes an API that lets data scientists build fully automated workflows in the coding environment of their choice. This accelerator shows how to pair DataRobot with the Spark-backed notebook environment provided by Databricks.

In this notebook, you'll see how data acquired and prepared in a Databricks notebook can be used to train a collection of models in DataRobot. You'll then deploy the recommended model and use DataRobot's exportable Scoring Code to generate predictions on the Databricks Spark cluster.
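
As a rough illustration of the train-and-deploy part of that workflow, the following is a minimal sketch using the DataRobot Python client, not the accelerator's actual code. It assumes the `datarobot` package (version 3.x) is installed on the cluster and that the training data fits in memory as a pandas DataFrame. The function name `train_and_deploy`, the project name, and the deployment label are hypothetical, and taking the first available prediction server is an assumption that may not suit every installation.

```python
def train_and_deploy(training_df, target, endpoint, token):
    """Train models with Autopilot and deploy the recommended one.

    training_df: a pandas DataFrame (in Databricks, e.g. spark_df.toPandas()).
    target:      name of the column to predict.
    endpoint:    DataRobot API endpoint URL for your installation.
    token:       DataRobot API token.
    """
    # Imported lazily so this sketch can be loaded without the client installed.
    import datarobot as dr

    # Configure the global client used by the calls below.
    dr.Client(endpoint=endpoint, token=token)

    # Upload the data and start Autopilot in quick mode.
    project = dr.Project.create(
        sourcedata=training_df,
        project_name="databricks-accelerator-sketch",  # hypothetical name
    )
    project.analyze_and_model(
        target=target,
        mode=dr.AUTOPILOT_MODE.QUICK,
        worker_count=-1,  # use all available modeling workers
    )
    project.wait_for_autopilot()

    # Fetch the model DataRobot recommends for deployment.
    model = dr.ModelRecommendation.get(project.id).get_model()

    # Assumption: the first listed prediction server is acceptable.
    prediction_server = dr.PredictionServer.list()[0]

    return dr.Deployment.create_from_learning_model(
        model_id=model.id,
        label="databricks-accelerator-sketch",  # hypothetical label
        description="Deployed from a Databricks notebook (sketch)",
        default_prediction_server_id=prediction_server.id,
    )
```

In a Databricks notebook, this might be called with the result of converting a Spark table to pandas, together with a target column name and credentials read from a secret scope; all of those values depend on your environment.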

This accelerator notebook covers the following activities:

  • Acquiring a training dataset.
  • Building a new DataRobot project.
  • Deploying a recommended model.
  • Scoring via Spark using DataRobot's exportable Java Scoring Code.
  • Scoring via DataRobot's Prediction API.
  • Reporting monitoring data to the MLOps agent framework in DataRobot.
  • Writing results back to a new table.
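
To make the Prediction API step in the list above more concrete, here is a minimal sketch that only builds the HTTP request for a deployment, using the Python standard library. It is an illustration under stated assumptions rather than the accelerator's code: the helper name `build_prediction_request` is hypothetical, the host, deployment ID, and credentials are placeholders, and the `DataRobot-Key` header is assumed to be needed only on installations that issue one.

```python
import json
import urllib.request


def build_prediction_request(prediction_server, deployment_id, api_token, rows,
                             datarobot_key=None):
    """Build (but do not send) a Prediction API request for a deployment.

    prediction_server: base URL of the prediction server (placeholder value).
    deployment_id:     ID of the deployed model.
    api_token:         DataRobot API token.
    rows:              list of dicts, one per record to score.
    datarobot_key:     optional key; assumed to be required only on some installs.
    """
    url = (
        f"{prediction_server.rstrip('/')}"
        f"/predApi/v1.0/deployments/{deployment_id}/predictions"
    )
    headers = {
        "Content-Type": "application/json; charset=UTF-8",
        "Authorization": f"Bearer {api_token}",
    }
    if datarobot_key:
        headers["DataRobot-Key"] = datarobot_key

    # The records are sent as a JSON array in the request body.
    body = json.dumps(rows).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Example with placeholder values; nothing is sent over the network here.
request = build_prediction_request(
    prediction_server="https://example.datarobot.com",
    deployment_id="DEPLOYMENT_ID",
    api_token="API_TOKEN",
    rows=[{"feature_a": 1.5, "feature_b": "x"}],
    datarobot_key="DATAROBOT_KEY",
)
```

Passing the resulting object to `urllib.request.urlopen` would send the request and return a JSON response to parse. Keeping request construction separate from sending makes the URL and headers easy to inspect before any credentials leave the notebook.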

Updated September 28, 2023