# Assess data quality with EDA

> Assess data quality with EDA - How DataRobot performs Exploratory Data Analysis (EDA) and how to
> assess the quality of your data at each stage of EDA.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.536452+00:00` (UTC).

## Primary page

- [Assess data quality with EDA](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html): Full documentation for this topic (HTML).

## Sections on this page

- [Stages of EDA](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html#stages-of-eda): In-page section heading.
- [Load and view your dataset](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html#load-and-view-your-dataset): In-page section heading.
- [Assess after EDA1](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html#assess-after-eda1): In-page section heading.
- [Assess after EDA2](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html#assess-after-eda2): In-page section heading.
- [Investigate feature importance](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html#investigate-feature-importance): In-page section heading.
- [Related reading](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html#related-reading): In-page section heading.

## Related documentation

- [Classic UI documentation](https://docs.datarobot.com/en/docs/classic-ui/index.html): Linked from this page.
- [Data](https://docs.datarobot.com/en/docs/classic-ui/data/index.html): Linked from this page.
- [Analyze data](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/index.html): Linked from this page.
- [Outliers](https://docs.datarobot.com/en/docs/reference/data-ref/data-quality-ref.html#outliers): Linked from this page.
- [Exploratory Data Analysis](https://docs.datarobot.com/en/docs/reference/data-ref/eda-explained.html#eda1): Linked from this page.
- [Fast EDA](https://docs.datarobot.com/en/docs/classic-ui/data/import-data/large-data/fast-eda.html#fast-eda-application): Linked from this page.
- [modeling mode](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/build-basic/model-data.html#set-the-modeling-mode): Linked from this page.
- [How common data quality issues are detected and surfaced in the Data Quality Assessment.](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/data-quality.html): Linked from this page.

## Documentation content

# Assess data quality with EDA

Learn how DataRobot performs Exploratory Data Analysis (EDA) and how to assess the quality of your data at each stage of EDA: EDA1 and EDA2.

Preparing your data is an iterative process. Even if you clean and prep your training data prior to uploading it to DataRobot, you can still improve its quality by assessing features during EDA.

The sample dataset featured on this page contains patient data. The goal is to predict the likelihood of patient readmission to the hospital. The target feature is `readmitted`.

## Stages of EDA

During EDA, DataRobot performs a Data Quality Assessment. The assessment provides information about the data quality issues that are relevant to your current stage of model building. The two EDA stages are described below.

**EDA1:**
EDA1 (data ingest) occurs after you upload your data. EDA1 assesses the All Features list and detects issues like:

- Outliers
- Inliers
- Excess zeros
- Disguised missing values
- Inconsistent gaps in time series projects

For more information on EDA1, see [Exploratory Data Analysis](https://docs.datarobot.com/en/docs/reference/data-ref/eda-explained.html#eda1).
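To make a few of the EDA1 issue types concrete, here is a minimal sketch of how such checks could be approximated by hand. It is not DataRobot's implementation: the IQR rule, the placeholder set `DISGUISED_MISSING`, and the sample columns are all assumptions chosen for illustration.

```python
import statistics

# Hypothetical placeholder values that often stand in for "missing".
DISGUISED_MISSING = {"", "?", "n/a", "na", "none", "null", "-999", "9999"}


def excess_zero_fraction(values):
    """Return the fraction of numeric values that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


def disguised_missing_count(values):
    """Count entries that match a known placeholder for a missing value."""
    return sum(1 for v in values if str(v).strip().lower() in DISGUISED_MISSING)


def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR] (a common rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical columns loosely modeled on a patient dataset.
num_lab_procedures = [41, 44, 39, 47, 43, 40, 45, 42, 400]
num_emergency_visits = [0, 0, 0, 1, 0, 0, 2, 0, 0, 0]
weight = ["72", "?", "80", "?", "?", "65"]

print("outliers:", iqr_outliers(num_lab_procedures))
print("zero fraction:", excess_zero_fraction(num_emergency_visits))
print("disguised missing:", disguised_missing_count(weight))
```

Running checks like these before upload does not replace the Data Quality Assessment, but it can help you anticipate which features are likely to be flagged.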

**EDA2:**
Once you click Start on the Data page, DataRobot performs another round of EDA. During this stage, DataRobot detects [target leakage](https://docs.datarobot.com/en/docs/reference/data-ref/data-quality-ref.html#target-leakage) and non-linear correlations between the features and the target, which helps you analyze [feature importance](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/assess-data-quality-eda.html#investigate-feature-importance). EDA2 reports on the selected feature list. If a feature list is not selected, EDA2 reports on the default All Features list.

For more information on EDA2, see [Exploratory Data Analysis](https://docs.datarobot.com/en/docs/reference/data-ref/eda-explained.html#eda2).
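As a rough intuition for target leakage, the sketch below asks how well a single feature predicts the target on its own, and flags features that predict it almost perfectly. This is a simplified stand-in rather than DataRobot's detection method. The function name, the `LEAKAGE_THRESHOLD` cutoff, and both sample features are hypothetical.

```python
from collections import Counter, defaultdict


def single_feature_accuracy(feature, target):
    """Accuracy of guessing the majority target value for each feature value."""
    by_value = defaultdict(Counter)
    for x, y in zip(feature, target):
        by_value[x][y] += 1
    correct = sum(counts.most_common(1)[0][1] for counts in by_value.values())
    return correct / len(target)


readmitted = [1, 0, 0, 1, 0, 1, 0, 0]

# Hypothetical feature recorded only after the outcome is known (leaky).
readmission_ward = ["B", "none", "none", "A", "none", "B", "none", "none"]
# Hypothetical feature known at admission time (not leaky).
admission_type = ["urgent", "elective", "urgent", "elective",
                  "urgent", "urgent", "elective", "elective"]

LEAKAGE_THRESHOLD = 0.95  # hypothetical cutoff, chosen for illustration

for name, column in [("readmission_ward", readmission_ward),
                     ("admission_type", admission_type)]:
    score = single_feature_accuracy(column, readmitted)
    flag = "possible leakage" if score >= LEAKAGE_THRESHOLD else "ok"
    print(f"{name}: {score:.3f} ({flag})")
```

Note that this naive score also reaches 1.0 for any column with a unique value per row, such as an ID, so a real check would need to account for cardinality.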


## Load and view your dataset

As soon as you load your dataset, DataRobot creates a new project and performs an initial EDA, generating summary statistics based on a sample of your data. View the progress in the Worker Queue on the right.

Once you import your data, click Explore the data or scroll down to see the features in your dataset.

DataRobot displays the features and provides summary information and statistics.

|  | Label | Description |
| --- | --- | --- |
| (1) | Var Type | The data type DataRobot identifies for the feature during EDA, for example, Numeric, Categorical, Boolean, Image, Text, and special feature types like Date. |
| (2) | Unique | The number of unique values for the feature. |
| (3) | Missing | The number of missing values for the feature. |
| (4) | Mean, Std Dev, Median, Min, Max | DataRobot calculates these statistics for numerical features. |
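The statistics in the table can be reproduced for a single numeric column with the Python standard library. This is only a sketch under stated assumptions: `summarize` and the sample column are hypothetical, `None` stands for a missing value, missing values are excluded from the unique count, and the sample standard deviation is used. DataRobot's exact conventions may differ, and it computes these values on a sample of your data.

```python
import statistics


def summarize(values):
    """Summarize one numeric column; None marks a missing value."""
    present = [v for v in values if v is not None]
    return {
        "Unique": len(set(present)),
        "Missing": len(values) - len(present),
        "Mean": statistics.mean(present),
        "Std Dev": statistics.stdev(present),  # sample standard deviation
        "Median": statistics.median(present),
        "Min": min(present),
        "Max": max(present),
    }


# Hypothetical numeric feature with two missing entries.
time_in_hospital = [1, 3, 2, None, 3, 14, None, 5]

summary = summarize(time_in_hospital)
for label, value in summary.items():
    print(f"{label}: {value}")
```

A large gap between the mean and the median, as in this sample column, is often a first hint that a feature contains outliers worth inspecting.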

## Assess after EDA1

EDA1 helps you catch data issues before you start modeling.

1. Above your feature list and to the right, click View info. The Data Quality Assessment dropdown menu displays.
   Tip: The Data Quality Assessment provides the following issue status flags:
   - Warning: Attention or action required.
   - Informational: No action required.
   - No issue.
2. (Optional) Click Filter affected features by type of issue detected and select particular issues to search for.
3. Scroll down to locate the features with issues. If a feature has an issue, the issue flag displays in the Data Quality column. Hover over the flag to view the type of issue.
4. Click a feature that displays an issue flag, then use tools such as the Histogram, Frequent Values, and Feature Associations to explore further.

## Assess after EDA2

EDA2 kicks off after you set your target and start the modeling process.

1. Under What would you like to predict, enter your target feature.
   Modeling modes: You can keep the mode set to the default, Quick Autopilot, or you can select a different [modeling mode](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/build-basic/model-data.html#set-the-modeling-mode). You can also customize your modeling settings.
2. Click Start. DataRobot performs a number of processing steps. Monitor the steps in the Worker Queue. As soon as DataRobot finishes analyzing features, you can take a look at feature importance. DataRobot continues with blueprint generation.

## Investigate feature importance

The importance bars show the degree to which a feature is correlated with the target. Importance is calculated using an algorithm that measures the information content of the variable. This calculation is done independently for each feature in the dataset.

Investigate feature importance to determine which features are most useful for building accurate models and which features you can remove from your training data.

1. In the Data tab, scroll down to the feature list.
2. Take a look at the Importance column. The green bars indicate how closely a feature is related to the target. You might want to remove features that are unrelated to the target.
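To illustrate scoring each feature independently against the target, the sketch below uses mutual information between a discrete feature and the target. This page does not name the algorithm behind the Importance column, so mutual information is only an illustrative stand-in for a measure of information content. The feature names and values are hypothetical.

```python
import math
from collections import Counter


def mutual_information(feature, target):
    """Mutual information (in bits) between two discrete sequences."""
    n = len(target)
    joint = Counter(zip(feature, target))
    px = Counter(feature)
    py = Counter(target)
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


readmitted = [1, 1, 0, 0, 1, 0, 1, 0]

# Hypothetical features: one tracks the target, one is unrelated to it.
features = {
    "prior_inpatient": ["yes", "yes", "no", "no", "yes", "no", "yes", "no"],
    "gender": ["F", "M", "F", "M", "F", "M", "M", "F"],
}

# Each feature is scored on its own, without reference to the others.
scores = {name: mutual_information(col, readmitted) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(f"{name}: {scores[name]:.3f} bits")
```

Because each score is computed in isolation, two redundant features can both rank highly; a low score suggests a candidate for removal, not a guarantee that the feature is useless in combination with others.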

## Related reading

To learn more about the topics discussed on this page, see:

- How DataRobot performs each stage of Exploratory Data Analysis (EDA).
- [How common data quality issues are detected and surfaced in the Data Quality Assessment.](https://docs.datarobot.com/en/docs/classic-ui/data/analyze-data/data-quality.html)
- The checks DataRobot runs for potential data quality issues.
