Prediction reference¶
DataRobot supports many methods of making predictions, including the DataRobot UI and APIs—for example, Python, R, and REST. The prediction methods you use depend on factors like the size of your prediction data, whether you're validating a model prior to deployment or using and monitoring it in production, whether you need immediate low-latency predictions, or whether you want to schedule batch prediction jobs. This page collects considerations, limits, and other helpful information to reference before making predictions.
File size limits¶
Note
Prediction file size limits vary for Self-Managed AI Platform installations, where limits are configurable.
Prediction method | Details | File size limit |
---|---|---|
Leaderboard predictions | To make predictions on a non-deployed model using the UI, expand the model on the Leaderboard and select Predict > Make Predictions. Upload prediction data from a local file, URL, data source, or the AI Catalog. You can also make predictions using the modeling predictions API, also called the "V2 predictions API." Use this API to test predictions using your modeling workers on small datasets (see the first example below this table). Predictions can be limited to 100 requests per user, per hour, depending on your DataRobot package. | 1 GB |
Batch predictions (UI) | To make batch predictions using the UI, deploy a model and navigate to the deployment's Make Predictions tab (requires MLOps). | 5 GB |
Batch predictions (API) | The Batch Prediction API is optimized for high throughput and provides production-grade connectivity options: in addition to pushing data through the API, you can connect to the AI Catalog, cloud storage, databases, or data warehouses (requires MLOps; see the second example below this table). | Unlimited |
Prediction API (real-time) | To make real-time predictions on a deployed model, use the Prediction API (see the third example below this table). | 50 MB |
Prediction monitoring | While the Batch Prediction API isn't limited to a specific file size, prediction monitoring is still subject to an hourly rate limit. | 100 MB/hour |
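For reference, here is a minimal sketch of Leaderboard predictions through the modeling predictions API, using the DataRobot Python client. The endpoint, token, `PROJECT_ID`, `MODEL_ID`, and file path are placeholder assumptions; substitute your own values.

```python
import datarobot as dr

dr.Client(token="YOUR_API_TOKEN", endpoint="https://app.datarobot.com/api/v2")

project = dr.Project.get("PROJECT_ID")
model = dr.Model.get(project.id, "MODEL_ID")

# Upload the scoring data to the project (counts toward the 1 GB limit).
dataset = project.upload_dataset("./to_score.csv")

# Queue predictions on your modeling workers and wait for the result.
predict_job = model.request_predictions(dataset.id)
predictions = predict_job.get_result_when_complete()  # returns a DataFrame
print(predictions.head())
```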
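The Batch Prediction API is also exposed through the Python client. A minimal sketch, assuming a local CSV file and a placeholder deployment ID; other intake and output types (AI Catalog, cloud storage, JDBC) follow the same pattern:

```python
import datarobot as dr

dr.Client(token="YOUR_API_TOKEN", endpoint="https://app.datarobot.com/api/v2")

# Score a local file against a deployment and stream the results back to
# disk. Swap the intake/output settings to use the AI Catalog, cloud
# storage, or a database instead of local files.
dr.BatchPredictionJob.score_to_file(
    "DEPLOYMENT_ID",
    "./input.csv",        # intake: local file (no size limit for this API)
    "./predictions.csv",  # output: local file
)
```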
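Real-time scoring is a plain HTTPS request to the deployment's prediction endpoint. A sketch using `requests`; the host name, keys, and deployment ID are placeholders, and whether the `DataRobot-Key` header is required depends on your installation:

```python
import requests

API_KEY = "YOUR_API_KEY"
DATAROBOT_KEY = "YOUR_DATAROBOT_KEY"  # required on some managed prediction servers
URL = (
    "https://example.dynamic.orm.datarobot.com"  # your prediction server host
    "/predApi/v1.0/deployments/DEPLOYMENT_ID/predictions"
)

# Send up to 50 MB of CSV per request.
with open("to_score.csv", "rb") as f:
    response = requests.post(
        URL,
        data=f,
        headers={
            "Content-Type": "text/csv; charset=UTF-8",
            "Authorization": f"Bearer {API_KEY}",
            "DataRobot-Key": DATAROBOT_KEY,
        },
    )
response.raise_for_status()
print(response.json())
```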
Monitor model health¶
Whichever prediction method you use, DataRobot allows you to deploy a model and monitor its prediction output and performance over a selected time period.
A critical part of the model management process is identifying when a model starts to deteriorate and addressing it quickly. Once trained, models can make predictions on new data that you provide. However, prediction data changes over time—businesses expand to new cities, new products enter the market, policy or processes change—any number of changes can occur. This can lead to data drift, the term for newer data moving away from the original training data, which can degrade prediction performance or make it unreliable over time.
Use the MLOps deployment dashboard to analyze a model's performance metrics: prediction response time, model health, accuracy, data drift analysis, and more. When a model deteriorates, the common remedy is to retrain a new model. Deployments allow you to replace models without re-deploying them, so you don't need to change your code, and DataRobot can track and represent the entire history of a model used for a particular use case (see the sketch below).
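For example, replacing a deteriorating model in place might look like the following sketch with the Python client; the deployment and model IDs are placeholders, and the replacement-reason enum value is an assumption to adjust to your situation:

```python
import datarobot as dr
from datarobot.enums import MODEL_REPLACEMENT_REASON

dr.Client(token="YOUR_API_TOKEN", endpoint="https://app.datarobot.com/api/v2")

deployment = dr.Deployment.get("DEPLOYMENT_ID")

# Swap in the retrained model; the deployment ID, endpoint, and monitoring
# history stay intact, so calling code doesn't change.
deployment.replace_model("NEW_MODEL_ID", reason=MODEL_REPLACEMENT_REASON.DATA_DRIFT)
```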
Avoiding common mistakes¶
The section on dataset guidelines provides important information about DataRobot's dataset requirements. In addition, consider:
- Under-trained models. The most common prediction mistake is to use models in production without retraining them beyond the initial training set. Best practice suggests the following workflow (see the first sketch after this list):

    1. Select the best model based on the validation set.
    2. Retrain the best model, including the validation set.
    3. Unlock holdout, and use the holdout to validate that the retrained model performs as well as you expect.

    Note that this does not apply if you are using the model DataRobot selects as "Recommended for Deployment." DataRobot automates all three of these steps for the recommended model and trains it to 100% of the data.
- File encoding issues. Be certain that you properly format your data to avoid prediction errors. For example, unquoted newline characters and commas in CSV files often cause problems. JSON can be a better choice for data that contains large amounts of text because JSON is more standardized than CSV. CSV can be faster than JSON, but only when it is properly formatted (see the second sketch after this list).
- Insufficient cores. When making predictions, keep the number of threads or processes less than or equal to the number of prediction worker cores you have, and make synchronous requests. That is, the number of concurrent predictions should generally not exceed the number of prediction worker cores on your dedicated prediction server(s) (see the third sketch after this list). If you are not sure how many prediction cores you have, contact DataRobot Support.
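A minimal sketch of the three retraining steps with the Python client, assuming a project whose optimization metric is lower-is-better (such as LogLoss) and a 20% holdout, so training to 80% includes the validation partition:

```python
import datarobot as dr
from datarobot.models.modeljob import wait_for_async_model_creation

dr.Client(token="YOUR_API_TOKEN", endpoint="https://app.datarobot.com/api/v2")
project = dr.Project.get("PROJECT_ID")

# 1. Select the best model by validation score (assumes lower is better).
candidates = [m for m in project.get_models()
              if m.metrics[project.metric]["validation"] is not None]
best = min(candidates, key=lambda m: m.metrics[project.metric]["validation"])

# 2. Retrain it on training + validation data (80% here, leaving the holdout).
job_id = best.train(sample_pct=80)
retrained = wait_for_async_model_creation(project.id, job_id)

# 3. Unlock holdout and confirm the retrained model still performs as expected.
project.unlock_holdout()
retrained = dr.Model.get(project.id, retrained.id)  # refresh metrics
print(retrained.metrics[project.metric]["holdout"])
```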
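For the file-formatting pitfall, quoting non-numeric fields when you write the CSV protects embedded newlines and commas. A sketch with Python's standard `csv` module; the column names are hypothetical:

```python
import csv

rows = [
    {"review_text": "Line one\nline two, with a comma", "stars": 4},
]

# QUOTE_NONNUMERIC wraps every text field in quotes, so embedded newlines
# and commas parse as data rather than as row/column breaks.
with open("to_score.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(
        f, fieldnames=["review_text", "stars"], quoting=csv.QUOTE_NONNUMERIC
    )
    writer.writeheader()
    writer.writerows(rows)
```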
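And for concurrency, a sketch of capping parallel requests at the number of prediction worker cores, assuming four cores, pre-split input chunks, and the same placeholder endpoint and headers as in the real-time example above:

```python
from concurrent.futures import ThreadPoolExecutor

import requests

PREDICTION_WORKER_CORES = 4  # assumption: confirm your count with DataRobot Support
URL = ("https://example.dynamic.orm.datarobot.com"
       "/predApi/v1.0/deployments/DEPLOYMENT_ID/predictions")
HEADERS = {
    "Content-Type": "text/csv; charset=UTF-8",
    "Authorization": "Bearer YOUR_API_KEY",
    "DataRobot-Key": "YOUR_DATAROBOT_KEY",
}

def score(chunk_path):
    # Each worker issues a synchronous request and blocks until it returns.
    with open(chunk_path, "rb") as f:
        response = requests.post(URL, data=f, headers=HEADERS)
    response.raise_for_status()
    return response.json()

chunks = [f"chunk_{i}.csv" for i in range(10)]  # hypothetical pre-split input

# Never run more concurrent requests than you have prediction worker cores.
with ThreadPoolExecutor(max_workers=PREDICTION_WORKER_CORES) as pool:
    results = list(pool.map(score, chunks))
```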
Warning
When performing predictions, the positive class has multiple possible representations that DataRobot can choose from: the original positive class as written in the dataset, a user-specified choice in the frontend, or the positive class as provided by the prediction set. Currently, DataRobot's internal rules for choosing among these are not obvious, which can lead to automation issues like `str("1.0")` being returned as the positive class instead of `int(1)`. This issue is being fixed by standardizing the internal ruleset in a future release.
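Until that standardization ships, a defensive workaround in automation code is to normalize labels before comparing them, rather than assuming a particular type. A minimal sketch:

```python
def normalize_label(label):
    """Map "1", "1.0", 1, and 1.0 to one canonical value before comparing."""
    try:
        return float(label)
    except (TypeError, ValueError):
        return str(label)

# str("1.0") and int(1) now compare equal after normalization.
assert normalize_label("1.0") == normalize_label(1)
```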
Prediction speed¶
- Model scoring speed. Scoring time differs by model, and not all models are fast enough for "real-time" scoring. Before going to production with a model, verify that the model you select is fast enough for your needs. Use the Speed vs. Accuracy tab to display model scoring time.
- Understanding the model cache. A dedicated prediction server scores quickly because of its in-memory model cache. As a result, the first few requests using a new model may be slower because the model must first be retrieved (see the first sketch after this list).
- Computing predictions with Prediction Explanations. Computing predictions with XEMP Prediction Explanations requires significantly more operations than computing predictions alone. Expect longer runtimes, although actual speed is model-dependent. Reducing the number of features used, or avoiding blenders and text variables, may increase speed. These increased computation costs do not apply to SHAP Prediction Explanations (see the second sketch after this list).
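If first-request latency matters, one option is to prime the cache with a small throwaway request right after deploying or replacing a model. A sketch with placeholder host, keys, and a hypothetical two-column schema:

```python
import requests

URL = ("https://example.dynamic.orm.datarobot.com"
       "/predApi/v1.0/deployments/DEPLOYMENT_ID/predictions")
HEADERS = {
    "Content-Type": "text/csv; charset=UTF-8",
    "Authorization": "Bearer YOUR_API_KEY",
    "DataRobot-Key": "YOUR_DATAROBOT_KEY",
}

# One small request forces the server to load the model into its cache,
# so later latency-sensitive requests skip the retrieval cost.
warmup_row = "feature_a,feature_b\n1,2\n"  # hypothetical schema
requests.post(URL, data=warmup_row, headers=HEADERS).raise_for_status()
```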
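To gauge the explanation overhead before committing, you can benchmark a batch job with and without explanations. A sketch using the Python client, where `max_explanations=3` and the file paths are illustrative:

```python
import datarobot as dr

dr.Client(token="YOUR_API_TOKEN", endpoint="https://app.datarobot.com/api/v2")

# Same batch scoring call as before, but requesting up to three Prediction
# Explanations per row; expect longer runtimes than a plain scoring job
# on the same data.
dr.BatchPredictionJob.score_to_file(
    "DEPLOYMENT_ID",
    "./input.csv",
    "./predictions_with_explanations.csv",
    max_explanations=3,
)
```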