DataRobot Prime tab¶
Contact your DataRobot representative for information on enabling DataRobot Prime.
DataRobot Prime optimizes prediction models for use outside of the DataRobot application, which can provide multiple benefits. Once created, you can export these models as a Python module or a Java class, and run the exported script. DataRobot Prime also supports models using feature lists that contain var type transformations.
You can build a DataRobot Prime model for most models on the Leaderboard. There are, however, some situations in which models cannot be approximated.
Creating a DataRobot Prime model¶
DataRobot Prime makes predictions using the number of features it has determined to be the optimal balance against the project's original metric. To create a DataRobot Prime model:
Process your dataset using any of the modeling modes.
Expand the model you want to approximate on the Leaderboard; click the DataRobot Prime tab.
On the resulting screen, click RUN DATAROBOT PRIME. The modeling job is added to the Worker Queue, and you receive a success message:
When the job completes, the new DataRobot Prime model is available in the Leaderboard. The description below the model name contains the name and model number of the parent model, as well as the number of rules used in the downloadable code.
Expand the new DataRobot Prime model and click the DataRobot Prime tab to view a graph (explained here) of 10 rule count options plotted against the resulting metric score for each:
Changing the rule count¶
Initially, DataRobot approximates a model based on the best rule count choice. There are reasons why you may want to change the rule count, however. To use a different rule count:
- Determine, from the graph, the number of rules in your chosen selection.
- Select the new rule count by clicking the associated radio button.
- Confirm the new model request by clicking CONTINUE. When you click, DataRobot generates a new DataRobot Prime model, with the new rule count, and adds the entry to the Leaderboard.
Exporting your DataRobot Prime model¶
Once you are satisfied with the performance of your DataRobot Prime model, you can generate and download production code to make predictions.
Downloading production code¶
To download production code:
Using the Select Language dropdown in the bottom left corner, choose either Python or Java.
When using the generated source code in Python, you must specify the encoding if you are using a character set other than UTF-8.
Click Generate and Download Code. If this is the first time you are generating code for the model, DataRobot launches a Prime Validation job to test and verify the integrity of the source code it is generating. You can monitor the job progress in the Worker Queue:
When testing completes, DataRobot displays a message indicating whether validation passed or failed and provides a button to download the code:
To download DataRobot Prime model code for production use, click DOWNLOAD GENERATED CODE and browse to a save location. You can now use the code outside of DataRobot to make predictions.
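The encoding note above applies whenever the scoring data is not UTF-8. The following is a minimal sketch, not DataRobot's generated code: it uses only the Python standard library, and the `read_rows` helper, the sample file, and the choice of `latin-1` are all assumptions made for illustration.

```python
import csv
import os
import tempfile


def read_rows(path, encoding="utf-8"):
    """Read a CSV file into a list of dicts, decoding with the given encoding."""
    with open(path, newline="", encoding=encoding) as handle:
        return list(csv.DictReader(handle))


# Hypothetical input: a small Latin-1 encoded file standing in for scoring data.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False,
                                 encoding="latin-1", newline="") as tmp:
    tmp.write("city,amount\nM\xe1laga,10\n")
    sample_path = tmp.name

# Pass the encoding explicitly because the file is not UTF-8.
rows = read_rows(sample_path, encoding="latin-1")
os.remove(sample_path)
print(rows[0]["city"])
```

Rows decoded this way could then be handed to the exported prediction code. How the exported module itself accepts an encoding is not shown here.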
Using debugging information¶
When creating code, DataRobot tries to predict each row and, if an exception or error occurs, records the error in the code output (stderr). Search for these messages to verify the integrity of your production code data or if you encounter problems when trying to run the production code.
For example, where "healthy" production code returns this:

    def predict_dataframe(ds):
        return ds.apply(predict, axis=1)

Code that encountered an error returns something similar to this:

    def predict_dataframe(ds):
        try:
            return ds.apply(predict, axis=1)
        except TypeError as e:
            sys.stderr.write('Error processing column: ' + unicode(e) + '\n')
            os._exit(1)

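One way to search stderr for these messages is to run the exported script in a child process and filter its error output. The sketch below is illustrative only: `find_error_lines` is a hypothetical helper, and `fake_script` is a stand-in that mimics the error handler shown above rather than real DataRobot Prime output.

```python
import subprocess
import sys

# Text written by the error handler in the example above.
ERROR_MARKER = "Error processing column"


def find_error_lines(stderr_text, marker=ERROR_MARKER):
    """Return the stderr lines that contain the marker."""
    return [line for line in stderr_text.splitlines() if marker in line]


# Stand-in for the exported script: writes one error line to stderr, exits with 1.
fake_script = (
    "import sys\n"
    "sys.stderr.write('Error processing column: bad value\\n')\n"
    "sys.exit(1)\n"
)

result = subprocess.run([sys.executable, "-c", fake_script],
                        capture_output=True, text=True)
errors = find_error_lines(result.stderr)
print(result.returncode, errors)
```

A non-zero return code together with a non-empty `errors` list suggests that a row failed to score and the input data should be checked.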
Using a DataRobot Prime model¶
This section provides additional details on DataRobot Prime models as well as tips in the event validation fails.
Reasons to use DataRobot Prime¶
DataRobot Prime supports the model transparency goals of DataRobot by providing:
- Generated model and scoring code.
- A coefficients model to verify data integrity.
- Multiple language support.
- DataRobot integration into systems that can’t necessarily communicate with the DataRobot environment (for example, for privacy reasons).
- Proof of performance as evidenced by the Prime model also placing in the Leaderboard.
- Low-latency scoring without the API call overhead. For example, if you run a real-time, low-latency scoring platform built on GLMs, custom code, or rule-based systems in a fast language like C++ or Java, DataRobot's Prime code export lets you score directly on that platform without the API call-time overhead.
Exploring the DataRobot Prime model¶
To view the graph of rule count options plotted against the resulting metric score for each, expand the DataRobot Prime model on the Leaderboard and click the DataRobot Prime tab:
The following table describes the elements of the DataRobot Prime tab page for existing Prime models:
|Element|Description|
|---|---|
|(1)|Displays the metric used in the original project build.|
|Rule count options (2)|Lists the 10 rule count options, and their associated metric values, available for the model. Click a radio button to begin the build of a new model with a different rule count.|
|Language selection (3)|Provides a mechanism for choosing the language for your downloadable code.|
|Code generation link (4)|Begins the code generation (and, ultimately, download) process for exporting your DataRobot Prime model.|
Why to change the rule count¶
Initially, DataRobot approximates a model based on the best rule count choice. You may learn from the graph, however, that there is a better rule count choice and so you can change the rule count to simplify or add detail to your model. For example, a particular rule count may have fewer rules than the best selection, while only suffering a small score penalty.
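The trade-off described above can be sketched as a simple selection rule: take the fewest rules whose score stays within a chosen tolerance of the best score. This is an illustration only. The `pick_rule_count` helper, the tolerance, and the sample scores are assumptions, and DataRobot's own choice of the "best" rule count may follow different logic.

```python
def pick_rule_count(options, tolerance, lower_is_better=True):
    """Return the smallest rule count whose score is within `tolerance` of the best.

    options: list of (rule_count, metric_score) pairs read off the graph.
    """
    scores = [score for _, score in options]
    best = min(scores) if lower_is_better else max(scores)
    acceptable = [
        count for count, score in options
        if abs(score - best) <= tolerance
    ]
    return min(acceptable)


# Hypothetical values, not taken from any real project (lower score is better).
options = [(5, 0.310), (12, 0.294), (25, 0.291), (60, 0.290)]
chosen = pick_rule_count(options, tolerance=0.005)
print(chosen)
```

With these made-up numbers, the 12-rule option is selected because its score is within the tolerance of the 60-rule best while using far fewer rules.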
When you change the rule count, DataRobot builds a new DataRobot Prime model and adds it to the Leaderboard. Any previous DataRobot Prime models built from the blueprint remain available. Note that you must generate and download code for each model individually.
You may have applied a var type transformation (a change from the type DataRobot detected and assigned to a type of your own choosing) to a feature and then created a feature list that uses the transformed feature. You can create a DataRobot Prime model from such a feature list. If you execute the generated code on a dataset that does not contain the transformed feature, the DataRobot Prime model returns the same results as the internal prediction results. Because transformations allow you to define a "NaN" value, DataRobot replaces invalid values in the generated code with the value you defined.
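The replacement of invalid values can be pictured with a short sketch. It is not the code DataRobot generates. The `to_numeric` helper and the `-1.0` replacement value are hypothetical and only illustrate the described behavior for a transformation to a numeric type.

```python
import math


def to_numeric(raw, nan_value=math.nan):
    """Convert a raw value to float, substituting nan_value when conversion fails."""
    try:
        return float(raw)
    except (TypeError, ValueError):
        return nan_value


# Hypothetical feature values, with -1.0 as the user-defined "NaN" value.
raw_values = ["3", "4.5", "n/a", None]
converted = [to_numeric(value, nan_value=-1.0) for value in raw_values]
print(converted)
```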
DataRobot Prime does not support user-defined, log, square, or power transformations. Specifically, you can use the following var type transformations:
If validation fails¶
Although rare, DataRobot may return an error message when it runs validation in response to a request to generate code. DataRobot reports the error type in the message it returns. Note that even with an error message, you can still download the code. It is best to email DataRobot Customer Support describing the issue for further assistance. There are two reasons for failure:
- Predictions from the generated code were not close enough to the predictions from the DataRobot Prime model. In this case, the generated code can still be run.
- The generated code could not run due to issues such as problem data or an out-of-memory error. In this case, the generated code probably will not run. If the issue is with the data, the code most likely will not run. If it is a memory error, the code may still run on a local machine that has more memory than the workers that tried to validate it.
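To investigate the first kind of failure locally, you could compare predictions from the downloaded code against predictions from the DataRobot Prime model row by row. The sketch below assumes both sets of predictions are available as plain lists. The `mismatched_rows` helper and the tolerance values are assumptions, since the threshold DataRobot uses during validation is not stated here.

```python
import math


def mismatched_rows(generated, reference, rel_tol=1e-6, abs_tol=1e-9):
    """Return indices where the two prediction lists differ beyond the tolerances."""
    if len(generated) != len(reference):
        raise ValueError("prediction lists must have the same length")
    return [
        i for i, (g, r) in enumerate(zip(generated, reference))
        if not math.isclose(g, r, rel_tol=rel_tol, abs_tol=abs_tol)
    ]


# Hypothetical predictions: the last row drifts from the reference value.
reference = [0.12, 0.57, 0.93]
generated = [0.12, 0.57, 0.95]
bad = mismatched_rows(generated, reference)
print(bad)
```

The indices returned point at the rows worth including when you describe the issue to Customer Support.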
You can re-run the validation if you feel circumstances may return a different result. To re-run a validation job:
- Delete the DataRobot Prime model.
- Run the model again (either by re-approximating the original model or generating a new model from the DataRobot Prime tab graph).
- Click Generate and Download Code to run the validation job again.
If validation still fails, click the link in the modal where the failure is indicated. DataRobot opens your email client and populates a message with the DataRobot Customer Support recipient, a subject line, and message content to help Support assist you in debugging the issue. You can add any additional information, if you choose.