
Agent use cases

Reference the monitoring use cases below for examples of how to apply the MLOps agent.

Report metrics

If your prediction environment cannot be network-connected to DataRobot, you can instead use MLOps agent reporting in a disconnected manner.

  1. In the prediction environment, configure the MLOps library to use the FILESYSTEM spooler type. The MLOps library will report metrics into its configured directory, for example, /disconnected/predictions_dir.

  2. Run the agent on a machine that is network-connected to DataRobot.

  3. Configure the agent to use the FILESYSTEM spooler type and receive its input from a local directory. For example: /connected/predictions_dir.

  4. Migrate the contents of /disconnected/predictions_dir from the prediction environment to /connected/predictions_dir in the connected environment (see the sketch below).
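
For illustration only, the pieces might be wired together as follows. The host name is hypothetical, and the exact environment variable names and agent channel keys can vary by MLOps library and agent version, so verify them against the quickstart guide and sample configuration in the agent tarball.

    # Disconnected prediction environment: point the MLOps library at a local spool directory
    export MLOPS_SPOOLER_TYPE=FILESYSTEM
    export MLOPS_FILESYSTEM_DIRECTORY=/disconnected/predictions_dir

    # Periodically copy the spooled data to the connected machine (hypothetical host name)
    rsync -av /disconnected/predictions_dir/ user@connected-host:/connected/predictions_dir/

    # Connected machine: agent channel config reads from the local directory
    type: "FS_SPOOL"
    details: {name: "filesystem", directory: "/connected/predictions_dir" }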

Reports for Scoring Code models

You can also use MLOps agent reporting to send monitoring metrics to DataRobot for downloaded Scoring Code models. Reference an example of this use case in the tarball examples/java/CodeGenExample.
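
For orientation only, the reporting calls have roughly the following shape (shown here in Scala; the class and method names are assumptions based on the tarball examples and should be verified against CodeGenExample, and the reported counts are placeholders):

    import com.datarobot.mlops.MLOps

    // Initialize reporting against the external deployment that represents the downloaded
    // Scoring Code model. The spool channel (for example, FILESYSTEM) is configured
    // separately, as in the sketch in the previous section.
    val mlops = MLOps.getInstance()
      .setDeploymentId(sys.env("MLOPS_DEPLOYMENT_ID"))
      .setModelId(sys.env("MLOPS_MODEL_ID"))
      .init()

    // ... score rows with the downloaded Scoring Code JAR ...

    // Report throughput and latency for the scoring run; feature and prediction values
    // can be reported with the library's predictions-data call (see CodeGenExample).
    val numPredictions: Long = 100   // placeholder
    val scoringTimeMs: Long = 250    // placeholder
    mlops.reportDeploymentStats(numPredictions, scoringTimeMs)
    mlops.shutdown()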

Monitor a Spark environment

A common use case for the MLOps agent is monitoring scoring in Spark environments, where scoring happens in Spark and you want to report the predictions and features to DataRobot. Because Spark typically runs on a multi-node setup, it is difficult to use the spool file channel in MLOps: a shared, consistent file system is uncommon in Spark installations.

To work around this, use a network-based channel like RabbitMQ or AWS SQS. These channels support multiple writers and one or more readers.

The following example outlines how to set up agent monitoring on a Spark system using the MLOps Spark Util module, which provides a way to report scoring results from the Spark framework. Reference the documentation for the MLOpsSparkUtils module in the MLOps Java examples directory, examples/java/SparkUtilsExample/.

The Spark example's source code performs three steps:

  1. Scores data with the given scoring JAR file and delivers the results in a DataFrame.
  2. Merges the features DataFrame and the prediction results into a single DataFrame.
  3. Calls the mlops_spark_utils.MLOpsSparkUtils.reportPredictions helper to report the predictions using the merged DataFrame.

You can use mlops_spark_utils.MLOpsSparkUtils.reportPredictions to report predictions generated by any model as long as the function retrieves the data via a DataFrame.
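
As a minimal sketch of that call (the argument list shown here is illustrative; the authoritative signature and a complete working example are in examples/java/SparkUtilsExample/):

    import org.apache.spark.sql.DataFrame

    // Illustrative only: verify the exact reportPredictions signature against SparkUtilsExample.
    def reportScoringResults(mergedDF: DataFrame, deploymentId: String, modelId: String): Unit = {
      // Use a network-based channel (RabbitMQ here) so every Spark worker reaches the same queue.
      val channelConfig =
        "output_type=rabbitmq;rabbitmq_url=amqp://localhost;rabbitmq_queue_name=spark_example"

      mlops_spark_utils.MLOpsSparkUtils.reportPredictions(
        mergedDF,            // features and predictions in one DataFrame
        deploymentId,        // MLOPS_DEPLOYMENT_ID of the external deployment
        modelId,             // MLOPS_MODEL_ID of the external model package
        channelConfig,       // spooler configuration string
        null,                // label column, if available (placeholder)
        Array("prediction")  // prediction column name(s) in mergedDF (placeholder)
      )
    }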

This example uses RabbitMQ as the communication channel and includes the channel setup. Because Spark is a distributed framework, DataRobot requires a network-based channel like RabbitMQ or AWS SQS so that the Spark workers can send monitoring data to the same channel regardless of which node each worker runs on.

Spark prerequisites

The following steps outline the prerequisites necessary to execute the Spark monitoring use case.

  1. Run a spooler (RabbitMQ in this example) in a container:

    docker run -d -p 15672:15672 -p 5672:5672 --name rabbit-test-for-medium rabbitmq:3-management

    This command also runs the RabbitMQ management console; access it via a web browser at http://localhost:15672.

  2. Configure and start the MLOps agent.

    • Follow the quickstart guide provided in the agent tarball.
    • Set up the agent to communicate with RabbitMQ.
    • Edit the agent channel config to match the following:

      type: "RABBITMQ_SPOOL"
      details: {name: "rabbit", queueUrl: "amqp://localhost:5672", queueName: "spark_example" }
      
  3. If you are using mvn, install the datarobot-mlops JAR into your local mvn repository before testing the examples by running the install_jar_into_maven.sh script, located in the examples/java/ directory.

  4. Use make to create the needed Java JAR files.

  5. Set your JAVA_HOME environment variable, for example:

    export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)

  6. Install Spark locally.

    • Download Spark 2.4.5 (built for Hadoop 2.7) onto a local machine.
    • Unarchive the tarball: tar xvf ~/Downloads/spark-2.4.5-bin-hadoop2.7.tgz
    • In the created spark-2.4.5-bin-hadoop2.7 directory, start the Spark cluster:
      sbin/start-master.sh -i localhost
      sbin/start-slave.sh -i localhost -c 8 -m 2G spark://localhost:7077
      
    • Ensure your installation is successful:

      bin/spark-submit --class org.apache.spark.examples.SparkPi --master spark://localhost:7077 --num-executors 1 --driver-memory 512m --executor-memory 512m --executor-cores 1 examples/jars/spark-examples_2.11-2.4.5.jar 10
      

Spark use case

After meeting the prerequisites outlined above, run the Spark example.

  1. Create the model package and initialize the deployment:

    create_deployment.sh

    Alternatively, use the DataRobot UI to create an external model package and deploy it.

  2. Set the environment variables for the deployment ID and model ID returned when you created the deployment by copying and pasting them into your shell:

    export MLOPS_DEPLOYMENT_ID=<deployment_id>; export MLOPS_MODEL_ID=<model_id>

  3. Generate predictions and report statistics to DataRobot:

    run_example.sh

    If the Spark bin directory is not on your path, point the script at it explicitly:

    env SPARK_BIN=/opt/ml/spark-2.4.5-bin-hadoop2.7/bin ./run_example.sh

  4. If you want to change the spooler type (the communication channel between the Spark job and the MLOps agent):

    • Edit the Scala code under src/main/scala/com/datarobot/dr_mlops_spark/Main.scala

    • Modify the following line to contain the required channel configuration:

      val channelConfig = "output_type=rabbitmq;rabbitmq_url=amqp://localhost;rabbitmq_queue_name=spark_example"
      
    • Recompile the code by running make.

