# Smart Downsampling

> Smart Downsampling - Smart Downsampling is a technique to reduce total dataset size by reducing the
> size of the majority class, enabling you to build models faster without sacrificing accuracy.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.597678+00:00` (UTC).

## Primary page

- [Smart Downsampling](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/adv-opt/smart-ds.html): Full documentation for this topic (HTML).

## Sections on this page

- [When to downsample](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/adv-opt/smart-ds.html#when-to-downsample): In-page section heading.
- [Conditions for Smart Downsampling](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/adv-opt/smart-ds.html#conditions-for-smart-downsampling): In-page section heading.
- [Enable Smart Downsampling](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/adv-opt/smart-ds.html#enable-smart-downsampling): In-page section heading.

## Related documentation

- [Classic UI documentation](https://docs.datarobot.com/en/docs/classic-ui/index.html): Linked from this page.
- [Modeling](https://docs.datarobot.com/en/docs/classic-ui/modeling/index.html): Linked from this page.
- [Build models](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/index.html): Linked from this page.
- [Advanced options](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/adv-opt/index.html): Linked from this page.
- [LogLoss](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/opt-metric.html#loglossweighted-logloss): Linked from this page.
- [different calculation](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/ts-reference/ts-feature-lists.html#zero-inflated-models): Linked from this page.
- [Random Partitioning](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/adv-opt/partitioning.html#random-partitioning-random): Linked from this page.
- [anomaly models](https://docs.datarobot.com/en/docs/classic-ui/modeling/special-workflows/unsupervised/anomaly-detection.html): Linked from this page.
- [modeling mode](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/build-basic/model-data.html#set-the-modeling-mode): Linked from this page.

## Documentation content

# Smart Downsampling

Smart Downsampling is a technique that reduces total dataset size by reducing the size of the majority class, enabling you to build models faster without sacrificing accuracy. When enabled, all analysis and model building are based on the new dataset size after Smart Downsampling.

When setting the downsampling percentage rate, you are specifying the size of the majority class after Smart Downsampling. For example, a 70% Smart Downsampling rate would downsample a majority class of 100 rows to 70 rows.
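The arithmetic behind the rate can be sketched as follows. This is an illustration only: the function name and the rounding of fractional row counts are assumptions, not documented DataRobot behavior.

```python
def rows_after_downsampling(majority_rows: int, minority_rows: int, rate_percent: float) -> dict:
    """Apply a majority-class downsampling rate and report the resulting sizes."""
    if not 0 < rate_percent <= 100:
        raise ValueError("rate_percent must be in the range (0, 100]")
    # Assumption: fractional row counts are rounded to the nearest integer.
    kept_majority = round(majority_rows * rate_percent / 100)
    return {
        "majority": kept_majority,
        "minority": minority_rows,  # only the majority class is reduced
        "total": kept_majority + minority_rows,
    }


# A 70% rate applied to a majority class of 100 rows keeps 70 of them.
print(rows_after_downsampling(majority_rows=100, minority_rows=10, rate_percent=70))
```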

## When to downsample

There are two types of problems that benefit from Smart Downsampling:

- Imbalanced classification: A problem in which one of the two target classes occurs far more frequently than the other. For example, a direct mail response dataset might consist of negative responses on 99.5% of the records and positive responses on only 0.5%.
- Zero-inflated regression: A problem in which the value zero appears in more than 50% of the dataset. A common example is insurance claim data, where 90% of policies may generate zero loss while the other 10% generate claims of various amounts.

In both cases, DataRobot first downsamples the majority class to make the classes balanced, then adds a weight so that the effect of the resulting dataset mimics the original balance of the classes. The applicable optimization metric indicates that the classes are weighted.
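The downsample-then-reweight idea can be sketched in plain Python. This is a minimal illustration under stated assumptions, not DataRobot's implementation: the function name `downsample_with_weights` is hypothetical, and the weighting scheme (original majority count divided by retained majority count) is one common way to preserve the original class balance.

```python
import random


def downsample_with_weights(rows, is_majority, rate_percent, seed=0):
    """Keep rate_percent of the majority rows and return (row, weight) pairs."""
    rng = random.Random(seed)
    majority = [row for row in rows if is_majority(row)]
    minority = [row for row in rows if not is_majority(row)]

    keep = round(len(majority) * rate_percent / 100)
    if keep < 1:
        raise ValueError("rate_percent leaves no majority rows")

    kept = rng.sample(majority, keep)
    # Assumed scheme: each retained majority row stands in for
    # len(majority) / keep original rows; minority rows keep weight 1.
    weight = len(majority) / keep
    return [(row, 1.0) for row in minority] + [(row, weight) for row in kept]


# Hypothetical target column: 900 zeros (majority) and 100 ones (minority).
target = [0] * 900 + [1] * 100
sample = downsample_with_weights(target, lambda y: y == 0, rate_percent=50)

weighted_majority = sum(w for y, w in sample if y == 0)
print(len(sample), weighted_majority)
```

Summing the weights of the retained majority rows recovers the original majority count, which is why a weighted metric computed on the smaller dataset can approximate the same metric on the full one.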

## Conditions for Smart Downsampling

Consider the following when using Smart Downsampling:

- The dataset must be larger than 500MB.
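- The target variable must take only two values (binary classification) or it must be numeric with more than 50% of values being exactly zero (zero-boosted regression). With time series projects, modeling with many zeros uses a [different calculation](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/ts-reference/ts-feature-lists.html#zero-inflated-models).
- You cannot select [Random Partitioning](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/adv-opt/partitioning.html#random-partitioning-random) (it is automatically disabled when you enable Smart Downsampling).
- DataRobot will not create [anomaly models](https://docs.datarobot.com/en/docs/classic-ui/modeling/special-workflows/unsupervised/anomaly-detection.html) when Smart Downsampling is enabled.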
- Once enabled, the selected downsampling percentage rate cannot result in the majority class becoming smaller than the minority class.

If the conditions are not met, you cannot enable the feature. The Smart Downsampling option displays a message indicating that the current target is not a binary classification or zero-boosted regression problem.
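Before uploading data, you might pre-check a target column against the conditions above. The sketch below is an assumption-laden approximation: the helper names are hypothetical, and the checks only mirror the wording of the list (two distinct values, or numeric with more than 50% exact zeros), not DataRobot's own validation.

```python
from numbers import Real


def smart_downsampling_problem_type(target):
    """Return the qualifying problem type for a target column, or None."""
    values = list(target)
    if len(set(values)) == 2:
        return "binary classification"
    numeric = all(isinstance(v, Real) and not isinstance(v, bool) for v in values)
    if numeric and values and sum(v == 0 for v in values) / len(values) > 0.5:
        return "zero-boosted regression"
    return None


def minimum_rate_percent(majority_rows, minority_rows):
    """Lowest rate that keeps the majority class at least as large as the minority."""
    return 100 * minority_rows / majority_rows


print(smart_downsampling_problem_type(["yes", "no", "no"]))
print(smart_downsampling_problem_type([0, 0, 0, 12.5, 3]))
print(smart_downsampling_problem_type([1, 2, 3, 0]))
print(minimum_rate_percent(majority_rows=800, minority_rows=200))
```

The last helper reflects the final condition in the list: with 800 majority rows and 200 minority rows, any rate below 25% would shrink the majority class below the minority class.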

When you use simple (binary) classification, DataRobot downsamples the majority class. When you use regression, DataRobot downsamples the zero values.

Smart Downsampling is selected by default when both of the following conditions are met:

- The majority class is at least twice the size of the minority class.
- The dataset is larger than 500MB.
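The default-selection rule can be expressed as a small predicate. This sketch covers only the binary classification case; the function name and the `dataset_mb` parameter are hypothetical, and how DataRobot measures dataset size is not specified here.

```python
from collections import Counter


def selected_by_default(target_labels, dataset_mb):
    """Approximate the default rule for a binary target."""
    counts = Counter(target_labels).most_common()  # sorted by count, largest first
    if len(counts) != 2:
        return False  # not a binary target; outside the scope of this sketch
    (_, majority_rows), (_, minority_rows) = counts
    return majority_rows >= 2 * minority_rows and dataset_mb > 500


labels = ["no"] * 200 + ["yes"] * 100
print(selected_by_default(labels, dataset_mb=750))
print(selected_by_default(labels, dataset_mb=400))
```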

## Enable Smart Downsampling

Enable Smart Downsampling and specify a sampling percentage from the Advanced options link on the Data page:

1. Import a dataset or open a project for which models have not yet been built and enter a target variable that results in a binary classification or zero-boosted regression problem.
2. Click the Show advanced options link and select the Smart Downsampling option.
3. Toggle Downsample Data to ON.
4. By typing in the box or using the slider, enter the majority class downsampling percentage rate. Note the following:
5. Scroll to the top of the page, choose a [modeling mode](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/build-basic/model-data.html#set-the-modeling-mode), and click Start to begin modeling.
6. When model building is complete, select Models from the toolbar. The Leaderboard displays an icon indicating that model results are based on downsampling.
7. Click the icon for a report of the downsampling results.

In the example report, `readmitted=true`, the minority class, was not modified by downsampling. The majority class, `readmitted=false`, was reduced by 25%; that is, 75% of the majority class was retained.
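The percentages in such a report follow directly from the class counts before and after downsampling. The sketch below uses made-up row counts purely for illustration; the function name and dictionary layout are hypothetical and do not describe the report's actual format.

```python
def downsampling_report(rows_before, rows_after):
    """Compute retained and reduced percentages for each class."""
    report = {}
    for label, original in rows_before.items():
        retained = 100 * rows_after[label] / original
        report[label] = {
            "retained_percent": retained,
            "reduced_percent": 100 - retained,
        }
    return report


# Hypothetical counts chosen so the majority class is reduced by 25%.
before = {"readmitted=false": 6000, "readmitted=true": 4000}
after = {"readmitted=false": 4500, "readmitted=true": 4000}

for label, stats in downsampling_report(before, after).items():
    print(label, stats)
```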
