Management agent installation and configuration
The MLOps agent .tar file contains all artifacts required to run the management agent. You can run the management agent in either of the following configurations:
- Inside a container.
- On a host machine, as a standalone process.
- To build and install the management agent container, run the following commands to unpack the tarball in a suitable location and build the container image:

      tar -zxf datarobot_mlops_package-*.tar.gz
      cd datarobot_mlops_package-*/
      cd tools/bosun_docker/
      make build

  This tags the management agent image with the appropriate `version` tag and the `latest` tag.

- To build the management agent image and run the container such that the management agent is configurable from the command line, run the following:

      tar -zxf datarobot_mlops_package-*.tar.gz
      cd datarobot_mlops_package-*/
      cd tools/bosun_docker/
      make run
- Enter the `mlopsUrl`, the `apiToken`, and the ID of the prediction environment to monitor:

      Generate MLOps Management-Agent configuration file.
      Enter DataRobot App URL (e.g. https://app.datarobot.com): <https://<MLOPS_HOST>>
      Enter DataRobot API Token: <MLOPS_API_TOKEN>
      Enter DataRobot Prediction Environment ID: <MLOPS_PREDICTION_ENVIRONMENT_ID>

By default, the management agent uses the filesystem plugin. To use a different plugin, select it in the management agent configuration file, then mount that file and the plugin's configuration file into the container.
For example, you can use the following commands to run the management agent with the Kubernetes plugin:

    cd datarobot_mlops_package-*/
    # Bind mounts need absolute host paths, so prefix the conf files with ${PWD}.
    docker run -it \
      -v "${PWD}/conf/mlops.bosun.conf.yaml:/opt/datarobot/mlops/bosun/conf/mlops.bosun.conf.yaml" \
      -v "${PWD}/conf/plugin.k8s.conf.yaml:/opt/datarobot/mlops/bosun/conf/plugin.k8s.conf.yaml" \
      datarobot/mlops-management-agent

- To install and run the management agent on the host machine, Python 3.7+ and Java 11 must be installed on the system. Then, create a Python virtual environment and install the management agent plugins into it:

      mkdir /opt/management-agent-demo
      cd /opt/management-agent-demo
      python3 -m venv .venv
      source .venv/bin/activate
      tar -zxf datarobot_mlops_package-*.tar.gz
      cd datarobot_mlops_package-*/
      pip install lib/datarobot_mlops-*-py2.py3-none-any.whl
      pip install lib/datarobot_mlops_connected_client-*-py3-none-any.whl
      pip install lib/datarobot_bosun-*-py3-none-any.whl

- Configure the management agent by modifying the configuration file:

      <your-chosen-editor> ./conf/mlops.bosun.conf.yaml

- Start the management agent:

      ./bin/start-bosun.sh
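
Before running the host installation steps above, it can help to confirm the stated prerequisites (Python 3.7+ and Java 11). The following is a minimal sketch, not part of the MLOps package: the helper names `version_ge`, `java_major`, and `check_prereqs` are hypothetical, and the parsing assumes `java -version` prints a quoted version string such as `"11.0.20"`.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of the MLOps package.

# version_ge A B: succeed when dotted numeric version A is >= B.
version_ge() {
  vg_a=$1 vg_b=$2
  while [ -n "$vg_b" ]; do
    vg_x=${vg_a%%.*} vg_y=${vg_b%%.*}
    [ "${vg_x:-0}" -gt "$vg_y" ] && return 0
    [ "${vg_x:-0}" -lt "$vg_y" ] && return 1
    # Drop the leading component of each version and compare the rest.
    case $vg_a in *.*) vg_a=${vg_a#*.} ;; *) vg_a= ;; esac
    case $vg_b in *.*) vg_b=${vg_b#*.} ;; *) vg_b= ;; esac
  done
  return 0
}

# java_major VERSION: print the major version (handles legacy "1.8" style).
java_major() {
  jm_v=$1
  case $jm_v in 1.*) jm_v=${jm_v#1.} ;; esac
  printf '%s\n' "${jm_v%%[!0-9]*}"
}

# check_prereqs: report the detected Python and Java versions.
check_prereqs() {
  py=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])' 2>/dev/null) ||
    { echo "python3 not found" >&2; return 1; }
  version_ge "$py" 3.7 || { echo "Python $py is older than 3.7" >&2; return 1; }
  java_raw=$(java -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)
  [ -n "$java_raw" ] || { echo "java not found" >&2; return 1; }
  [ "$(java_major "$java_raw")" = 11 ] ||
    echo "warning: expected Java 11, found $java_raw" >&2
  echo "Python $py and Java $java_raw detected"
}
```

Sourcing this file and calling `check_prereqs` prints the detected versions, or returns a non-zero status if Python is missing or too old or Java cannot be found. A Java major version other than 11 only produces a warning, since this sketch does not know whether newer releases are supported.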
- To configure the management agent on the host machine, edit the management agent configuration file, `conf/mlops.bosun.conf.yaml`:

  - Update the values for `mlopsUrl` and `apiToken`.
  - Verify that `<BOSUN_VENV_PATH>` points to the virtual environment created during installation (e.g., `/opt/management-agent-demo/.venv`).
  - Specify the Prediction Environment ID at `<MLOPS_PREDICTION_ENVIRONMENT_ID>`.
  - Uncomment the appropriate `command:` line in the `predictionEnvironments` section to use the correct plugin. Ensure you comment out the `command:` line for any unused plugins.
  - (Optional) You may need to edit the configuration file for the plugin you're using. For more information, see Configure management agent plugins.

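
The step of uncommenting one plugin `command:` line and commenting out the others can also be scripted. The sketch below is an assumption-laden illustration rather than a tool shipped with the agent: `select_plugin` is a hypothetical helper, and it assumes each plugin line has the form `command: "... bosun-plugin-runner --plugin <name> ..."` on a single line, as in the sample configuration file.

```shell
#!/bin/sh
# Sketch only: select_plugin is an illustrative helper, not part of the MLOps package.
# Usage: select_plugin <path to mlops.bosun.conf.yaml> <plugin name>
# A copy of the original file is kept next to it with a .bak suffix.
select_plugin() {
  sp_conf=$1 sp_plugin=$2
  # 1st expression: comment out every active bosun-plugin-runner command line.
  # 2nd expression: uncomment the line for the requested plugin.
  # The modelConnector "command:" line does not mention bosun-plugin-runner,
  # so it is left untouched.
  sed -E -i.bak \
    -e 's|^([[:space:]]*)command:(.*bosun-plugin-runner --plugin )|\1# command:\2|' \
    -e "s|^([[:space:]]*)# command:(.*bosun-plugin-runner --plugin ${sp_plugin} )|\\1command:\\2|" \
    "$sp_conf"
}
```

For example, `select_plugin conf/mlops.bosun.conf.yaml docker` would leave only the Docker plugin line active. The plugin name is used inside a regular expression, so names containing characters that are special to `sed` would need escaping.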
mlops.bosun.conf.yaml

    # This file contains configuration for the Management Agent

    # Items marked "Required" must be set. Other settings can use the defaults set below.

    # Required. URL to the DataRobot MLOps service.
    mlopsUrl: "https://<MLOPS_HOST>"

    # Required. DataRobot API token.
    apiToken: "<MLOPS_API_TOKEN>"

    # When true, verify SSL certificates when connecting to DR app. When false, SSL verification will not be
    # performed. It is highly recommended to keep this config variable as true.
    verifySSL: true

    # Whether to run management agent as the workload coordinator. The default value is true.
    isCoordinator: true

    # Whether to run management agent as worker. The default value is true.
    isWorker: true

    # When true, start a REST server. This will provide several API endpoints (worker health check enables)
    serverMode: false

    # The port to use for the above REST server
    serverPort: "12345"

    # The url where to reach REST server, will be use by external configuration services
    serverAddress: "http://localhost"

    # Specify the configuration service. This is 'internal' by default and the
    # workload coordinator and worker are expected to run in the same JVM.
    # When run in high availability mode, the configuration needs to be provided by
    # a service such as Consul.
    configurationService:
      tag: "tag"
      type: "internal"
      connectionDetail: ""

    # Path to write Bosun stats
    statsPath: "/tmp/management-agent-stats.json"

    # HTTP client timeout in milliseconds (30sec timeout).
    httpTimeout: 30000

    # Number of times the agent will retry sending a request to the MLOps service after it receives a failure.
    httpRetry: 3

    # Number of active workers to process management agent commands
    numActionWorkers: 2

    # Timeout in seconds processing active commands, eg. launch, stop, replaceModel
    actionWorkerTimeoutSec: 300

    # Timeout in seconds for requesting status of PE and the deployment
    statusWorkerTimeoutSec: 300

    # How often (in seconds) status worker should update DR MLOps about the status of PE and deployments
    statusUpdateIntervalSec: 120

    # How often (in seconds) to poll MLOps service for new deployment / PE Actions
    mlopsPollIntervalSec: 60

    # Optional: Plugins directory in which all required plugin jars can be found.
    # If you are only using external commands to run plugin actions then there is
    # no need to use this option.
    # pluginsDir: "../plugins/"

    # Model Connector configuration
    modelConnector:
      type: "native"
      # Scratch place to work on, default "/tmp"
      scratchDir: "/tmp"
      # Config file for private / secret configuration, management agent will not read this file, just
      # forward the filename in configuration, optional
      secretsConfigFile: "/tmp/secrets.conf"
      # Python command that implements model connector.
      # mcrunner is installed as part the bosun python package. You should either
      # set your PATH to include the location of mcrunner, or provide the full path.
      command: "<BOSUN_VENV_PATH>/bin/mcrunner"

    # prediction environments this service will monitor
    predictionEnvironments:
      # This Prediction Environment ID matches the one in DR MLOps service
      - id: "<MLOPS_PREDICTION_ENVIRONMENT_ID>"
        type: "ExternalCommand"
        platform: "os"

        # Enable monitoring for this plugin, so that the MLOps information
        # (viz, url and token) can be forwarded to plugin, default: False
        # enableMonitoring: true

        # Provide the command to run the plugin:
        # You can either fix PATH to point to where bosun-plugin-runner is located, or
        # you can provide the full path below.
        # The filesystem plugin used in the example below if one of the built in plugins provided
        # by the bosun-plugin-runner
        command: "<BOSUN_VENV_PATH>/bin/bosun-plugin-runner --plugin filesystem --private-config <CONF_PATH>/plugin.filesystem.conf.yaml"

        # The following example will run the docker plugin
        # (one of the built in plugins provided by bosun-plugin runner)
        # command: "<BOSUN_VENV_PATH>/bin/bosun-plugin-runner --plugin docker --private-config <CONF_PATH>/plugin.docker.conf.yaml"

        # The following example will run the kubernetes plugin
        # (one of the built in plugins provided by bosun-plugin runner)
        # WARNING: this plugin is currently considered ALPHA maturity; please consult your account representative if you
        # are interested in trying it.
        # command: "<BOSUN_VENV_PATH>/bin/bosun-plugin-runner --plugin k8s --private-config <CONF_PATH>/plugin.k8s.conf.yaml"

        # If your plugin was installed as a python module (using pip), you can provide the name
        # of the module that contains the plugin class. For example --plugin sample_plugin.my_plugin
        # command: "<BOSUN_VENV_PATH>/bin/bosun-plugin-runner --plugin sample_plugin.my_plugin --private-config <CONF_PATH>/my_config.yaml"

        # If your plugin is in a directory, you can provide the name of the plugin as the path to the
        # file that contains your plugin. For example: --plugin sample_plugin/my_plugin.py
        # command: "<BOSUN_VENV_PATH>/bin/bosun-plugin-runner --plugin sample_plugin/my_plugin.py --private-config <CONF_PATH>/my_config.yaml"

        # Note: you can control the plugin logging via the --log-config option of bosun-plugin-runner

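
Because the sample file uses angle-bracket placeholders such as `<MLOPS_HOST>` and `<BOSUN_VENV_PATH>`, a quick check for placeholders left in active (uncommented) lines can catch an incomplete edit before the agent is started. This is a minimal sketch under the assumption that every placeholder is an upper-case name in angle brackets; `unresolved_placeholders` and `check_config` are hypothetical helper names, not commands provided by the agent.

```shell
#!/bin/sh
# Sketch only: helper names are illustrative, not part of the MLOps package.

# unresolved_placeholders FILE: list each <UPPER_CASE> placeholder that still
# appears on a line that is not a YAML comment.
unresolved_placeholders() {
  grep -v '^[[:space:]]*#' "$1" | grep -o '<[A-Z_][A-Z_]*>' | sort -u
}

# check_config FILE: fail with a message if any placeholder is still active.
check_config() {
  cc_left=$(unresolved_placeholders "$1")
  if [ -n "$cc_left" ]; then
    echo "unresolved placeholders in $1:" >&2
    echo "$cc_left" >&2
    return 1
  fi
}
```

Running `check_config conf/mlops.bosun.conf.yaml` after editing returns a non-zero status and lists the leftover names if, for example, `<CONF_PATH>` was not replaced in the active `command:` line. Placeholders in commented-out plugin lines are ignored.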
- To run the management agent natively in Docker, first build the `datarobot/mlops-management-agent` image from the MLOps agent tarball:

      make build -C tools/bosun_docker

- Configure the management agent in Docker, with its configuration mounted to the default directory or to a custom location:

  - To run the management agent with the filesystem plugin and with the configuration mounted to the default directory:

        docker run \
          -v /path/to/mlops.bosun.conf.yaml:/opt/datarobot/mlops/bosun/conf/mlops.bosun.conf.yaml \
          -v /path/to/plugin.filesystem.conf.yaml:/opt/datarobot/mlops/bosun/conf/plugin.filesystem.conf.yaml \
          datarobot/mlops-management-agent

  - To run the management agent with the filesystem plugin and with the agent configuration mounted to a custom location:

        docker run \
          -v /path/to/mlops.bosun.conf.yaml:/var/tmp/mlops.bosun.conf.yaml \
          -v /path/to/plugin.filesystem.conf.yaml:/opt/datarobot/mlops/bosun/conf/plugin.filesystem.conf.yaml \
          -e MLOPS_AGENT_CONFIG_YAML=/var/tmp/mlops.bosun.conf.yaml \
          datarobot/mlops-management-agent

  - To use the Docker-based plugin while also running the management agent in a Docker container, include a few extra options and mount the entire config directory, since there are multiple files to modify:

        docker run \
          -v ${PWD}/conf/:/opt/datarobot/mlops/bosun/conf/ \
          -v /tmp:/tmp \
          -v /var/run/docker.sock:/var/run/docker.sock \
          --user root \
          --network bosun \
          datarobot/mlops-management-agent:latest
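
The `--network bosun` option in the last command presumes that a Docker network named `bosun` already exists; the text above does not show how it is created. As a hedged sketch, assuming a default user-defined network is sufficient for the Docker plugin, a helper along these lines could create it only when it is missing. `ensure_network` is a hypothetical name; `docker network inspect` and `docker network create` are standard Docker CLI subcommands.

```shell
#!/bin/sh
# Sketch only: ensure_network is an illustrative helper, and the assumption that
# a default user-defined network is enough for the Docker plugin is not
# confirmed by the text above.
# Usage: ensure_network [name]   (name defaults to "bosun")
ensure_network() {
  en_name=${1:-bosun}
  # inspect exits non-zero when the network does not exist, so create it then.
  docker network inspect "$en_name" >/dev/null 2>&1 ||
    docker network create "$en_name"
}
```

Calling `ensure_network bosun` before the `docker run` command is safe to repeat, because the network is only created when the inspect call fails.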