# Add training data to a custom model

> Add training data to a custom model - How to assign training data to a custom model in the Custom
> Model Workshop.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.558153+00:00` (UTC).

## Primary page

- [Add training data to a custom model](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/custom-models/custom-model-workshop/custom-model-training-data.html): Full documentation for this topic (HTML).

## Related documentation

- [Classic UI documentation](https://docs.datarobot.com/en/docs/classic-ui/index.html): Linked from this page.
- [MLOps](https://docs.datarobot.com/en/docs/classic-ui/mlops/index.html): Linked from this page.
- [Deployment](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/index.html): Linked from this page.
- [Prepare custom models for deployment](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/custom-models/index.html): Linked from this page.
- [Custom Model Workshop](https://docs.datarobot.com/en/docs/classic-ui/mlops/deployment/custom-models/custom-model-workshop/index.html): Linked from this page.
- [unstructured custom inference models](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html): Linked from this page.
- [assemble a custom model](https://docs.datarobot.com/en/docs/workbench/nxt-registry/nxt-model-workshop/nxt-create-custom-model.html): Linked from this page.
- [data drift](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/data-drift.html): Linked from this page.
- [accuracy](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/deploy-accuracy.html): Linked from this page.

## Documentation content

# Add training data to a custom model

To enable feature drift tracking for a model deployment, you must add training data by assigning it to a model version. For [unstructured custom inference models](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html), you must upload the training and holdout datasets separately, and these datasets cannot include a partition column.

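As a rough illustration of the unstructured-model requirement above, the sketch below separates one labeled dataset into a training set and a holdout set and drops the partition column from both. The function name, the `partition` column name, and the choice to fold non-holdout rows into the training set are assumptions made for this example; they are not part of DataRobot's tooling.

```python
def split_training_and_holdout(rows, partition_column="partition", holdout_label="H"):
    """Split rows (a list of dicts) into training and holdout lists.

    The partition column is removed from every output row, because the
    separately uploaded datasets cannot include one.
    Assumption: every row not labeled as holdout belongs to the training set.
    """
    training, holdout = [], []
    for row in rows:
        # Copy the row without the partition column.
        features = {key: value for key, value in row.items() if key != partition_column}
        if row.get(partition_column) == holdout_label:
            holdout.append(features)
        else:
            training.append(features)
    return training, holdout
```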
> [!WARNING] File size warning
> The file size limit for custom model training data uploaded to DataRobot is 1.5GB.

> [!WARNING] Considerations for training data prediction rows count
> Training data uploaded to a custom model is used to compute Feature Impact, drift baselines, and Prediction Explanation previews. To perform these calculations, DataRobot automatically splits the uploaded training data into partitions for training, validation, and holdout (i.e., T/V/H) in a 60/20/20 ratio. Alternatively, you can manually provide a partition column in the training dataset to assign predictions, row-by-row, to the training (`T`), validation (`V`), or holdout (`H`) partitions.
> 
> Prediction Explanations require 100 rows in the validation partition. If you don't define your own partitioning, the provided training dataset must therefore contain a minimum of 500 rows, because the automatic validation partition is 20% of the data. If the training data and partition ratio (defined automatically or manually) result in a validation partition containing fewer than 100 rows, Prediction Explanations are not calculated. You can still register and deploy the model, and the deployment can make predictions, but if you request predictions with explanations, the deployment returns an error.

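The row-count arithmetic in the warning above can be checked before uploading. The following is a minimal sketch, assuming the 60/20/20 split is applied with simple integer rounding; the exact rounding DataRobot uses is not stated here, and the function names are hypothetical.

```python
# Default automatic split described above, as whole percentages.
DEFAULT_SPLIT = {"T": 60, "V": 20, "H": 20}
MIN_VALIDATION_ROWS = 100  # rows Prediction Explanations require in validation


def estimate_partition_sizes(total_rows, split=DEFAULT_SPLIT):
    """Estimate rows per partition (assumes floor rounding, which is a guess)."""
    return {name: total_rows * pct // 100 for name, pct in split.items()}


def min_rows_for_explanations(validation_pct=DEFAULT_SPLIT["V"]):
    """Smallest dataset whose validation partition reaches the 100-row minimum."""
    # Integer ceiling division of (100 rows * 100%) by the validation percentage.
    return -(-MIN_VALIDATION_ROWS * 100 // validation_pct)


def supports_explanations(total_rows, split=DEFAULT_SPLIT):
    """True if the estimated validation partition has at least 100 rows."""
    return estimate_partition_sizes(total_rows, split)["V"] >= MIN_VALIDATION_ROWS
```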
To assign training data to a custom model version:

1. In Model Registry > Custom Model Workshop, in the Models list, select the model you want to add training data to.
2. On the Assemble tab, next to Datasets:
3. In the Add Training Data (or Change Training Data) dialog box, click and drag a training dataset file into the Training Data box, or click Choose file and do either of the following:

   > [!NOTE] Include features required for scoring
   > The columns in a custom model's training data indicate which features are included in scoring requests to the deployed custom model; therefore, once training data is available, any features not included in the training dataset aren't sent to the model. Available as a preview feature, when you [assemble a custom model](https://docs.datarobot.com/en/docs/workbench/nxt-registry/nxt-model-workshop/nxt-create-custom-model.html) in the NextGen experience, you can disable this behavior using the Column filtering setting.
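4. (Optional) Specify the column name containing partitioning info for your data (based on training/validation/holdout partitioning). If you plan to deploy the custom model and monitor its [data drift](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/data-drift.html) and [accuracy](https://docs.datarobot.com/en/docs/classic-ui/mlops/monitor/deploy-accuracy.html), specify the holdout partition in the column to establish an accuracy baseline.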
5. When the upload is complete, click Add Training Data.

   > [!NOTE] Training data assignment error
   > If the training data assignment fails, an error message appears in the new custom model version under Datasets. While this error is active, you can't create a model package to deploy the affected version. To resolve the error and deploy the model package, reassign training data to create a new version, or create a new version and then assign training data.
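
If you choose to supply your own partition column in step 4, the dataset needs a column whose values mark each row as `T`, `V`, or `H`. The sketch below shows one possible way to prepare such a file with the Python standard library. The column name `partition`, the function names, and the shuffled 60/20/20 assignment are assumptions for illustration only; any column name can be entered in the dialog box.

```python
import csv
import random


def add_partition_column(rows, column="partition", seed=0):
    """Return copies of rows (list of dicts) labeled T, V, or H, roughly 60/20/20."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)  # shuffle so labels are not tied to row order
    n_train = len(rows) * 60 // 100
    n_valid = len(rows) * 20 // 100
    labeled = [dict(row) for row in rows]  # copy so the input is left unchanged
    for rank, index in enumerate(order):
        if rank < n_train:
            labeled[index][column] = "T"
        elif rank < n_train + n_valid:
            labeled[index][column] = "V"
        else:
            labeled[index][column] = "H"
    return labeled


def write_csv(rows, path):
    """Write the labeled rows to a CSV file that can be uploaded as training data."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

For example, calling `add_partition_column` on 1,000 rows labels 600 rows `T`, 200 rows `V`, and 200 rows `H`, which keeps the validation partition above the 100-row minimum for Prediction Explanations.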