GitHub Actions for custom models
Availability information
GitHub Actions for custom models is a premium feature. Contact your DataRobot representative or administrator for information on enabling the feature.
The custom models action manages custom inference models and their associated deployments in DataRobot via GitHub CI/CD workflows. These workflows allow you to create or delete models and deployments and to modify their settings. Metadata defined in YAML files tells the custom models action which models and deployments to control. Most YAML files for this action can reside in any folder within your custom model's repository. The action searches the repository for YAML files, collects them, and tests each one against a schema to determine whether it contains the entities used in these workflows. For more information, see the custom-models-action repository.
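To make the search, collect, and test behavior more concrete, the following is a minimal Python sketch. It is not the action's implementation. It assumes a file counts as a model or deployment definition when it has the corresponding top-level ID key, whereas the real action validates against a full schema. The names `classify_definition` and `collect_definitions` are hypothetical, and a regular expression stands in for a YAML parser to keep the sketch dependency-free.

```python
import re
from pathlib import Path
from typing import Dict, List, Optional

# Identifying keys taken from the definition examples in this guide.
MODEL_KEY = "user_provided_model_id"
DEPLOYMENT_KEY = "user_provided_deployment_id"

# Matches unindented `key:` entries only (assumption: simple block-style YAML).
TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][\w-]*)[ \t]*:", re.MULTILINE)


def classify_definition(yaml_text: str) -> Optional[str]:
    """Return 'deployment', 'model', or None for unrelated YAML."""
    keys = set(TOP_LEVEL_KEY.findall(yaml_text))
    # A deployment definition also carries the model ID, so check it first.
    if DEPLOYMENT_KEY in keys:
        return "deployment"
    if MODEL_KEY in keys:
        return "model"
    return None


def collect_definitions(repo_root: Path) -> Dict[str, List[Path]]:
    """Walk the repository and group YAML files by definition type."""
    found: Dict[str, List[Path]] = {"model": [], "deployment": []}
    for path in sorted(repo_root.rglob("*")):
        if path.is_file() and path.suffix in {".yaml", ".yml"}:
            kind = classify_definition(path.read_text(encoding="utf-8"))
            if kind is not None:
                found[kind].append(path)
    return found
```

Calling `collect_definitions(Path("."))` from the repository root returns the model and deployment definition files found in any folder. Other YAML files, such as the workflow file itself, are ignored because they carry neither key.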
GitHub Actions quickstart
This quickstart example uses a Python Scikit-Learn model template from the datarobot-user-model repository. To set up a custom models action that will create a custom inference model and deployment in DataRobot from a custom model repository in GitHub, take the following steps:
- In the .github/workflows directory of your custom model repository, create a YAML file (with any filename) containing the following:

    name: Workflow CI/CD

    on:
      pull_request:
        branches: [ master ]
      push:
        branches: [ master ]

      # Allows you to run this workflow manually from the Actions tab
      workflow_dispatch:

    jobs:
      datarobot-custom-models:
        # Run this job on any action of a PR, but skip the job upon merging to the main branch. This
        # will be taken care of by the push event.
        if: ${{ github.event.pull_request.merged != true }}

        runs-on: ubuntu-latest

        steps:
          - uses: actions/checkout@v3
            with:
              fetch-depth: 0

          - name: DataRobot Custom Models Step
            id: datarobot-custom-models-step
            uses: datarobot-oss/custom-models-action@v1.6.0
            with:
              api-token: ${{ secrets.DATAROBOT_API_TOKEN }}
              webserver: https://app.datarobot.com/
              branch: master
              allow-model-deletion: true
              allow-deployment-deletion: true

  Configure the following fields:
  - branches: Provide the name of your repository's main branch (usually either master or main) for pull_request and push. If you created your repository in GitHub, you likely need to update these fields to main. While master and main are the most common branch names, you can target any branch; for example, you could run the workflow on a release branch or a test branch.
  - api-token: Provide a value for the DATAROBOT_API_TOKEN variable by creating an encrypted secret for GitHub Actions containing your DataRobot API key. Alternatively, you can set the token string directly in this field; however, this method is highly discouraged because your API key is extremely sensitive data. If you use this method, anyone who has access to your repository can access your API key.
  - webserver: Provide your DataRobot webserver value here if it isn't the default DataRobot US server (https://app.datarobot.com/).
  - branch: Provide the name of your repository's main branch (usually either master or main). If you created your repository in GitHub, you likely need to update this field to main. While master and main are the most common branch names, you can target any branch; for example, you could run the workflow on a release branch or a test branch.
- Commit the workflow YAML file and push it to the remote. After you complete this step, any push to the remote (or merged pull request) triggers the action.
- In the folder for your DataRobot custom model, add a model definition YAML file (e.g., model.yaml) containing the following YAML and update the field values according to your model's characteristics:

    user_provided_model_id: user/model-unique-id-1
    target_type: Regression
    settings:
      name: My Awesome GitHub Model 1 [GitHub CI/CD]
      target_name: Grade 2014

    version:
      # Make sure this environment ID exists in your system.
      # This one is the '[DataRobot] Python 3 Scikit-Learn Drop-In' environment
      model_environment_id: 5e8c889607389fe0f466c72d

  Configure the following fields:
  - user_provided_model_id: Provide any descriptive and unique string value. DataRobot recommends following a naming pattern, such as <user>/<model-unique-id>.

    Note
    By default, this ID will reside in a unique namespace, the GitHub repository ID. Alternatively, you can configure the namespace as an input argument to the custom models action.

  - target_type: Provide the correct target type for your custom model.
  - target_name: Provide the correct target name for your custom model.
  - model_environment_id: Provide the DataRobot execution environment required for your custom model. You can find these environments in the DataRobot application under Model Registry > Custom Model Workshop > Environments.
- In any directory in your repository, add a deployment definition YAML file (with any filename) containing the following YAML:

    user_provided_deployment_id: user/my-awesome-deployment-id
    user_provided_model_id: user/model-unique-id-1

  Configure the following fields:
  - user_provided_deployment_id: Provide any descriptive and unique string value. DataRobot recommends following a naming pattern, such as <user>/<deployment-unique-id>.

    Note
    By default, this ID will reside in a unique namespace, the GitHub repository ID. Alternatively, you can configure the namespace as an input argument to the custom models action.

  - user_provided_model_id: Provide the exact user_provided_model_id you set in the model definition YAML file.
- Commit these changes and push to the remote, then:
  - Navigate to your custom model repository in GitHub and click the Actions tab. You'll notice that the action is being executed.
  - Navigate to the DataRobot application. You'll notice that a new custom model was created along with an associated deployment. This action can take a few minutes.
Warning
Creating two commits (or merging two pull requests) in quick succession can result in a ResourceNotFoundError. For example, you add a model definition with a training dataset, make a commit, and push to the remote. Then, you immediately delete the model definition, make a commit, and push to the remote. The training data upload action may begin after model deletion, resulting in an error. To avoid this scenario, wait for an action's execution to complete before pushing new commits or merging new pull requests to the remote repository.
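The quickstart links a deployment definition to a model definition only through a matching user_provided_model_id, so a typo in either file breaks that link. The following Python sketch shows one way to check the link locally before pushing. It is an illustration under stated assumptions, not part of the custom models action: the helpers `top_level_value` and `unmatched_deployments` are hypothetical, and the regular expression only handles simple unindented `key: value` lines without trailing comments.

```python
import re
from typing import List, Optional


def top_level_value(yaml_text: str, key: str) -> Optional[str]:
    """Read an unindented `key: value` pair without a YAML parser."""
    pattern = rf"^{re.escape(key)}[ \t]*:[ \t]*(.+?)[ \t]*$"
    match = re.search(pattern, yaml_text, re.MULTILINE)
    return match.group(1) if match else None


def unmatched_deployments(
    model_texts: List[str], deployment_texts: List[str]
) -> List[Optional[str]]:
    """Return the IDs of deployments that reference no known model ID."""
    model_ids = {top_level_value(t, "user_provided_model_id") for t in model_texts}
    model_ids.discard(None)  # ignore model files with no readable ID

    missing = []
    for text in deployment_texts:
        if top_level_value(text, "user_provided_model_id") not in model_ids:
            missing.append(top_level_value(text, "user_provided_deployment_id"))
    return missing
```

An empty result means every deployment definition points at a model definition that exists in the inputs you passed. Any returned ID identifies a deployment file whose user_provided_model_id should be rechecked.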
Access commit information in DataRobot
After your workflow creates a model and a deployment in DataRobot, you can access the commit information from the model's version info and the deployment's overview:
- In the Model Registry, click Custom Model Workshop.
- On the Models tab, click a GitHub-sourced model from the list and then click the Versions tab.
- Under Manage Versions, click the version you want to view the commit for.
- Under Version Info, find the Git Commit Reference and then click the commit hash (or commit ID) to open the commit that created the current version.

- In the Model Registry, on the Registered Models tab, click a GitHub-sourced model package from the list.
- On the Info tab, review the model information provided by your workflow, find the Git Commit Reference, and then click the commit hash (or commit ID) to open the commit that created the current model package.

- In the Deployments inventory, click a GitHub-sourced deployment from the list.
- On the deployment's Overview tab, review the model and deployment information provided by your workflow.
- In the Content group box, find the Git Commit Reference and click the commit hash (or commit ID) to open the commit that created the deployment.