Enterprise database integrations¶
Enterprise database integrations are deprecated (end-of-life) starting with release v7.3 (December 13, 2021 for cloud users). After deprecation, integrations should not be used to generate predictions with deployments.
To enable integration with a variety of enterprise databases, DataRobot provides a “self-service” platform to write predictions to databases. This allows you to select a data source to make predictions, define a schedule on which data is scored, and receive the results of predictions—all from a DataRobot deployment.
DataRobot supports prediction integrations for Microsoft SQL, Snowflake, Tableau, and Qlik.
Note that prediction integration jobs cannot be configured on an MLOps-only instance for deployments that do not have an associated modeling project. In this case, you can instead make Batch Predictions with a JDBC adapter.
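For reference, the Batch Prediction alternative can be sketched with the DataRobot Python client. This is a minimal sketch, assuming an existing JDBC data connection and stored credentials; the IDs, schema, and table names below are placeholders, not values from this document.

```python
# Sketch: intake/output settings for a Batch Prediction job with a JDBC
# adapter. DATASTORE_ID and CREDENTIAL_ID are placeholders for an existing
# data connection and saved credentials.
intake_settings = {
    "type": "jdbc",
    "data_store_id": "DATASTORE_ID",
    "credential_id": "CREDENTIAL_ID",
    "schema": "public",
    "table": "scoring_input",
}
output_settings = {
    "type": "jdbc",
    "data_store_id": "DATASTORE_ID",
    "credential_id": "CREDENTIAL_ID",
    "schema": "public",
    "table": "scoring_output",
    "statement_type": "insert",  # how rows are written to the destination
}

# With the client configured, submit the job and wait for it to finish:
# import datarobot as dr
# dr.Client(endpoint="https://app.datarobot.com/api/v2", token="YOUR_API_TOKEN")
# job = dr.BatchPredictionJob.score(
#     deployment="DEPLOYMENT_ID",
#     intake_settings=intake_settings,
#     output_settings=output_settings,
# )
# job.wait_for_completion()
```

The client/score calls are shown commented out because they require live credentials; the settings dictionaries are the portion specific to the JDBC adapter.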
Prediction integration jobs can only be configured by deployment Owners.
Once you configure prediction jobs, you can view them all from the Prediction Jobs list. Note that multiple jobs can be created for each deployment, regardless of the platform used, provided that the input data's schema matches the data used to create the deployment.
Predictions with enterprise databases¶
Data sources can be used to write predictions to Microsoft SQL, Snowflake, and Tableau databases. From a deployment's Predictions > Integrations tab, select the tile for the database to which you want to write predictions.
Once chosen, select the account credentials used to access the database. If you are not using saved credentials, select Use different account and enter new credentials to access the data source. After selecting the data source and account, click Next.
Mark the checkbox to include:
Prediction Explanations. You can indicate the number of explanations you want to return for each outcome.
Prediction errors. Errors are returned in the prediction_status column of the prediction response.
Select the destination features. These are the columns from the data source that, once selected, write to the destination alongside predictions. The destination is the table that DataRobot returns containing predictions, Prediction Explanations, and destination features. Check the features in the "Available Features" column that you want to include, or check "Select All" to include every feature. Then click the right-facing arrow to add them as destination features. To remove any features, check the features in the "Selected output features" column, and click the left-facing arrow.
Among the selected destination features, one feature—a feature that contains a unique ID in each row—serves as a primary key and is returned with the prediction results. The source data must also contain the primary key feature, with corresponding unique IDs, allowing you to match up and compare the prediction results with the source data. You can do so by using join statements on your database platform.
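As a minimal illustration of matching results back to source rows on a shared primary key, the following sketch performs the equivalent of an inner JOIN using pandas. The column names (`record_id`, `amount`, `prediction`) are hypothetical, not taken from any particular deployment.

```python
import pandas as pd

# Hypothetical source rows, keyed by "record_id" (the primary key feature).
source = pd.DataFrame({
    "record_id": [101, 102, 103],
    "amount": [25.0, 40.5, 13.2],
})

# Hypothetical prediction results written by the integration job, containing
# the same primary key alongside the prediction and status columns.
predictions = pd.DataFrame({
    "record_id": [101, 102, 103],
    "prediction": [0.12, 0.87, 0.45],
    "prediction_status": ["OK", "OK", "OK"],
})

# Equivalent to: SELECT ... FROM source JOIN predictions USING (record_id)
joined = source.merge(predictions, on="record_id", how="inner")
print(joined.shape)  # (3, 4)
```

On the database side, the same matching is a standard JOIN on the primary key column.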
Select the destination for predictions.
For Microsoft SQL and Snowflake databases:
Choose to use a new or existing data destination, and whether to use the same credentials provided for your data source or a new account.
Complete the subsequent fields:
| Field | Description |
|---|---|
| Username and Password | Provide the credentials required to access the destination. Check the box to save the credentials for future use. |
| Account name | Select the name that represents the provided credentials that you previously saved for future logins. After entering the credentials and account name, click Save and sign in. |
| Schema | Provide the name of a schema to use as the data source. Once selected, the schema's database displays below the schema name. Be sure that the database has been configured for your data connection before proceeding. |
| Destination table name | Select a table from the dropdown to contain the predictions and associated metadata. Alternatively, select Create new table in the dropdown and enter the name of a table you wish to use. |
For Tableau databases:

| Field | Description |
|---|---|
| Tableau URL | Enter the URL of the server that the data source connects to. |
| Username and Password | Provide the credentials required to access the Tableau server. |
| Site Name | Indicate the name of the specific site to connect to on the Tableau server. If connecting to a hosted server, leave this field empty; these servers use the "Default" site. If connecting to a cloud-based Tableau server, there is no "Default" site, so you must indicate a specific site name. |
When you have completed the fields, click Next. DataRobot tests that the data connection is successful before proceeding.
Define when and how the prediction jobs will run.
Name the prediction job.
Choose whether to run the integration automatically. If toggled on, you can select the schedule on which the job runs from the dropdown (hourly, daily, weekly, etc.). If disabled, the job configuration is saved and is run manually from the Prediction Jobs page.
DataRobot provides a code snippet containing the commands and identifiers needed to submit Qlik data for scoring. This code can be used with the Prediction API.
From a deployment's Predictions > Integrations tab, select the Qlik tile.
To use the Qlik code snippet, follow the sample and make the necessary changes to integrate the model into your production application via the API. Enable the checkbox to include Prediction Explanations (1) alongside the prediction results.
Copy the sample code (2) and modify as necessary. Once modified, your snippet is ready for use with the Prediction API.
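The shape of such a Prediction API call can be sketched as follows. This is a hedged sketch, not the snippet DataRobot generates: the prediction server host, deployment ID, token, and DataRobot key are placeholders, and you should copy the real endpoint and headers from your deployment's snippet.

```python
# Sketch of building a scoring request against a DataRobot prediction server.
# All credential values below are placeholders.
import json
import urllib.request

PREDICTION_SERVER = "https://example.orm.datarobot.com"
DEPLOYMENT_ID = "YOUR_DEPLOYMENT_ID"
API_TOKEN = "YOUR_API_TOKEN"
DATAROBOT_KEY = "YOUR_DATAROBOT_KEY"

def build_request(rows, max_explanations=0):
    """Build a prediction request; rows is a list of feature dicts."""
    url = (
        f"{PREDICTION_SERVER}/predApi/v1.0/deployments/"
        f"{DEPLOYMENT_ID}/predictions"
    )
    if max_explanations:  # request Prediction Explanations with each result
        url += f"?maxExplanations={max_explanations}"
    return urllib.request.Request(
        url,
        data=json.dumps(rows).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_TOKEN}",
            "DataRobot-Key": DATAROBOT_KEY,
        },
    )

req = build_request([{"feature_a": 1, "feature_b": "x"}], max_explanations=3)
# urllib.request.urlopen(req) would submit the rows for scoring.
```

The request is built but not sent, since sending requires live credentials; in production you would use the exact URL and headers from the generated snippet.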
View prediction jobs¶
You can view a list of prediction integration jobs set up for all deployments from the deployments inventory. Navigate to Deployments > Prediction Jobs. Note that you can also view job-specific and deployment-specific histories.
The following table describes the information and actions available from the Prediction Jobs list.
| Column | Description |
|---|---|
| Job Name | The name of the prediction integration job configured for a deployment. |
| Last Successful Run | The time and date recorded for the last successful prediction response for the prediction job. |
| Last Failed Run | The time and date recorded for the last failed prediction response for the prediction job. |
| Integration Type | The database platform to which the deployment writes predictions. |
| Schedule | The cadence at which the prediction job runs (monthly, weekly, daily, etc.). If the job is designated to run manually, it returns the status "Disabled." Click the pencil icon to edit a job's schedule. |
| Status | The current status of each prediction job. Read below for more information on what each status represents. |
| Delete | Select the trash can icon in this column to delete a specific prediction job. |
The Status summary in the Prediction Jobs list provides an at-a-glance indication of each prediction job's state. To view more detailed information for a prediction job, click it to open the job run history.
Interpret the color indicators for each prediction job as follows:
| Indicator | Meaning |
|---|---|
| Green | The last prediction job ran successfully. |
| Red | The last prediction job failed. |
| Yellow | The prediction job is in progress. When viewing the job run history, an hourglass icon displays instead. |
| Gray | The prediction job has not yet run. |
Select an individual prediction job from the Prediction Jobs list to view its history of prediction requests. You can also access the individual job histories from a deployment's Integrations tab.
The job history details:
- start and end times for each job
- how many seconds each job took to run
- number of rows scored
- number of failed rows
- job status
Click the dropdown under Show Log to view the recorded progress of each job. This is useful for troubleshooting failed prediction job runs and identifying the specific errors that cause failures.
To view all prediction jobs configured for a single deployment, navigate to the deployment's Integrations tab. All existing jobs display below the tiles.
The deployment-specific jobs list details:
- job names, with an icon to represent the data source used
- when a job was created
- the date/time of the last run
- job status
- the job schedule
- job actions—pause/resume a job run or delete an existing job