Insights

The Insights tab provides graphical representations of model details.

Member tabs:

  • Activation Maps: Visualizes the areas of images that a model uses when making predictions. (Source: Training data)
  • Anomaly Detection: Provides a summary table of anomalous results sorted by score. (Source: The most anomalous training data rows, those with the highest scores)
  • Category Cloud: Visualizes the relevancy of a collection of categories from summarized categorical features. (Source: Training data)
  • Hotspots: Indicates simple rules with high predictive performance. (Source: Training data)
  • Image Embeddings: Displays a projection of images onto a two-dimensional space defined by similarity. (Source: Training data)
  • Text Mining: Visualizes the relevancy of words and short phrases. (Source: Training data)
  • Tree-based Variable Importance: Provides a ranking of the most important variables in a model. (Source: Training data)
  • Variable Effects: Illustrates the magnitude and direction of a feature's effect on a model's predictions. (Source: Validation data)
  • Word Cloud: Visualizes variable keyword relevancy. (Source: Training data)

Note

The particular insights that display depend on the model type, which in turn depends on the project type. In other words, not all insights listed above (and described below) are available for every project.

The following sections describe each of the Insight options. For all insights, use the dropdown menus to switch between the insights available for the project (1) and the models available for that insight (2). Use the Export button to download the data as PNG, CSV, or ZIP, depending on the insight.

Tree-Based Variable Importance

The Tree-Based Variable Importance chart shows the sorted relative importance of all key variables driving a specific model. This view accumulates all the Importance charts for models in the project to make it easier to compare these charts across models. Change the Sort by dropdown to list features by ranked importance or alphabetically (2).

Note

The chart is only available for tree/forest models (for example, Gradient Boosted Trees Classifier or Random Forest).

The chart shows the relative importance of all key features making up the model. The importance of each feature is calculated relative to the most important feature for predicting the target. To calculate, DataRobot sets the relative importance of the most important feature to 100%, and all other features are a percentage relative to the top feature.
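The normalization described above can be sketched in a few lines. The raw importance scores below are invented for illustration; in practice they would come from the tree model's split statistics.

```python
# Hypothetical raw importance scores (e.g., total split gain per feature).
raw_importance = {"credit_score": 120.0, "income": 60.0, "age": 20.0}

# Pin the most important feature to 100% and scale the rest against it.
top_score = max(raw_importance.values())
relative_importance = {
    feature: 100.0 * score / top_score
    for feature, score in raw_importance.items()
}
print(relative_importance)  # credit_score -> 100.0, income -> 50.0, age -> ~16.7
```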

Consider the following when interpreting the chart:

  • Sometimes relative importance can be very useful, especially when a particular feature appears to be significantly more important for predictions than all other features. In that case, it is worth checking whether the values of that variable depend on the response; if they do, you may want to exclude the feature when training the model. Also note that not all models have a Coefficients chart; for those models, the Importance graph is the only way to visualize a feature's impact on the model.

  • If a feature is included in only one model out of the dozens that DataRobot builds, it may not be that important. Excluding it from the feature set can optimize model building and future predictions.

  • It is useful to compare how feature importance changes for the same model with different feature lists. Sometimes the features recognized as important on a reduced dataset differ substantially from the features recognized on the full feature set.

Variable Effects

While Tree-Based Variable Importance shows the relevancy of different variables to the model, the Variable Effects chart shows the impact of each variable on the prediction outcome.

Use this chart to compare the impact of a feature across different Constant Spline models. It is useful for verifying that the relative rank of feature importance does not vary wildly across models. If one model regards a feature as very important with a positive effect while another shows a negative effect, it is worth double-checking both the dataset and the model.

With Variable Effects, you can:

  • Click Variable Effects to display the relative rank of features.
  • Use the Sort by dropdown to sort values by impact (Feature Coefficients) or alphabetically (Feature Name).

Tip

Variable Effects are only available for full Autopilot models built using Constant Splines during preprocessing. To see the impact of each variable in the prediction outcomes for other model types, use the Coefficients tab.

Text-based insights

To help assess variable keyword relevancy, DataRobot provides both Text Mining and Word Cloud insights. If you expected to see one of these text models but do not, check the Log tab for error messages to help understand why the models may be missing.

One common reason that text models are not built is that DataRobot removes single-character "words" during model building. It does this because such words are typically uninformative (e.g., "a" or "I"). A side effect of this removal is that single-digit numbers are also removed; in other words, DataRobot removes "1" and "2" as well as "a" and "I". This is common practice in text mining (for example, the Sklearn Tfidf Vectorizer "selects tokens of 2 or more alphanumeric characters").

This can be an issue if you have encoded words as numbers (which some organizations do to anonymize data). For example, if you use "1 2 3" instead of "john jacob schmidt" and "1 4 3" instead of "john jingleheimer schmidt," DataRobot removes the single digits and both texts become empty strings. If DataRobot cannot find any words for a feature of type text (because all of its values are single digits), it raises an error.

If you need a workaround to avoid the error, here are two simple solutions:

  • Start numbering at 10 (e.g., "11 12 13" and "11 14 13").
  • Add a single letter to each ID (e.g., "x1 x2 x3" and "x1 x4 x3").
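The single-character filtering and both workarounds can be reproduced with the default token pattern used by the Sklearn Tfidf Vectorizer. This is a sketch using Python's re module, not DataRobot's actual preprocessing code:

```python
import re

# Default sklearn TfidfVectorizer token pattern: tokens of 2+ word characters.
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def tokenize(text):
    return TOKEN_PATTERN.findall(text)

# Single-digit "words" are dropped entirely, leaving nothing to model:
print(tokenize("1 2 3"))     # []
# Workaround 1: start numbering at 10.
print(tokenize("11 14 13"))  # ['11', '14', '13']
# Workaround 2: prefix each ID with a letter.
print(tokenize("x1 x2 x3"))  # ['x1', 'x2', 'x3']
```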

Text Mining insights

The Text Mining chart displays the most relevant words and short phrases in any variables detected as text. Text variables often contain words that are highly indicative of the response.

The most important words and phrases are shown in the text mining chart, ranked by their coefficient value (which indicates how strongly the word or phrase is correlated with the target). This ranking lets you compare the strength of the presence of these words and phrases. The side-by-side comparison shows how individual words can be used in numerous, and sometimes counterintuitive, ways, with many different implications for the response.

With Text Mining you can:

  • Display text strings with a positive effect (red) and a negative effect (blue).
  • Use the Sort by dropdown (1) to sort values by impact (Feature Coefficients) or alphabetically (Feature Name).
  • For multiclass projects, use the Select Class dropdown (2) to choose the specific class you want to see text mining insights for.

Word Cloud insights

Word Cloud displays the most relevant words and short phrases in word cloud format. Text variables often contain words that are highly indicative of the response. Use Word Cloud on either the Insights page or the Leaderboard; operationally, both versions behave the same. Use the Leaderboard tab to view a word cloud while investigating an individual model, and the Insights page to access and compare the word clouds for a project. Additionally, word clouds are available for multimodal datasets (i.e., datasets that mix images, text, categorical features, etc.); in that case, a word cloud is displayed for all text from the data.

Note

The Word Cloud for a model is based on the data used to train that model, not on the entire dataset. For example, a model trained on a 32% sample size will result in a Word Cloud that reflects those same 32% of rows.

Word clouds are supported in the following model types:

  • Binary classification: All variants of ElasticNet Classifier (linear family models) with the exception of TinyBERT ElasticNet classifier and FastText ElasticNet classifier.
  • Multiclass: Stochastic Gradient Descent
  • Regression: Ridge Regressor, ElasticNet Regressor, Lasso Regressor

Click Word Cloud to display the chart:

  • Text strings are displayed in a color spectrum from blue to red, with blue indicating a negative effect and red indicating a positive effect.
  • Text strings that appear more frequently are displayed in a larger font size, and those that appear less frequently are displayed in smaller font sizes.

With a Word Cloud, you can:

  • Mouse over a word to display the coefficient value (1) specific to that word.
  • For multiclass projects, use the Select Class dropdown (2) to choose the specific class you want to see the word cloud for.
  • Check the Filter Stop Words box (3) to remove stop words (commonly used terms that can be excluded from searches) from the display.

See the note in Text-based insights for a description of how DataRobot handles single-character "words."

Hotspot insights

Hot and cold spots represent simple rules with high predictive performance. These rules are good predictors and can easily be translated and implemented as business rules.

Hotspot insights are available only when you have:

  • A RuleFit Classification or Regression model that is trained on the training dataset only (not trained on the validation set or the holdout set)
  • At least one numeric or categorical column
  • Fewer than 100K rows

DataRobot uses the rules created by the RuleFit model to produce the hotspots plot in the Insights tab. Each spot corresponds to a rule.

  • The size of the spot indicates the number of observations that follow the rule.
  • The color of the rule indicates the relative difference between the average target value for the group defined by the rule and the overall population.

This difference is also represented as a ratio known as the Mean Relative to Target (MRT) Ratio, which is the ratio of the average target value for the subgroup defined by the rule to the average target value of the overall population. High values of MRT—i.e., red dots or “hotspots”—indicate groups with higher target values, whereas low values of MRT (blue dots or “coldspots”) indicate groups with lower target values.

For example (the subgroup average divided by the overall average): if the average readmission rate across your dataset is 40%, but for people with 10+ inpatient procedures it is 80%, then the MRT is 2.00. That does not mean that people with 10+ inpatient procedures are twice as likely to be readmitted; rather, it means the rule is twice as good at capturing positive instances as guessing at random using the overall sample mean.
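The MRT arithmetic from the readmission example is simply the subgroup mean divided by the overall mean:

```python
def mrt_ratio(subgroup_mean, overall_mean):
    """Mean Relative to Target: subgroup average target / overall average target."""
    return subgroup_mean / overall_mean

# Readmission example: 40% overall rate, 80% for people with 10+ inpatient procedures.
mrt = mrt_ratio(subgroup_mean=0.80, overall_mean=0.40)
print(mrt)  # 2.0
```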

Rules also exist for categorical features. These rules include expressions such as x <= 0.5 or x > 0.5, which represent x = 0 (“No”) or x = 1 (“Yes”) for a given category, respectively.

For example, consider a dataset that looks at admitted hospital patients. The categorical feature Medical Specialty identifies the specialty of the physician that attends to a patient (cardiology, surgery, etc.). This feature is included in the rule MEDICAL_SPECIALTY-Surgery-General <= 0.5, which captures all the rows in the dataset where the medical specialty of the attending physician is not “Surgery-General”.
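A sketch of why the threshold reads that way, using an invented three-row dataset: one-hot encoding turns the category into a 0/1 indicator, so a <= 0.5 split separates "not this category" from "this category".

```python
# Invented example rows; the column name mirrors the rule in the text.
specialties = ["Surgery-General", "Cardiology", "InternalMedicine"]

# One-hot indicator for MEDICAL_SPECIALTY-Surgery-General: 1 if the
# attending physician's specialty is Surgery-General, otherwise 0.
indicator = [1 if s == "Surgery-General" else 0 for s in specialties]

# The rule "MEDICAL_SPECIALTY-Surgery-General <= 0.5" matches indicator == 0,
# i.e., every row where the specialty is NOT Surgery-General.
matches_rule = [x <= 0.5 for x in indicator]
print(matches_rule)  # [False, True, True]
```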

With Hotspots, you can:

  • Click Hotspots to display the chart:

    • Hotspot rules are displayed in a color spectrum from blue to red, with blue indicating a negative effect (“cold”), and red indicating a positive effect (“hot”).
    • Rules with a larger positive or negative effect are displayed in deeper shades of red or blue, and those with a smaller magnitude are displayed in lighter shades.
    • Hover over a spot to display details.
    • Hotspot values are also displayed in a table beneath the image.
    • The Observation % displayed in the table is calculated using data from the validation partition.

  • Display only hotspots or coldspots by clicking Hot/Cold, then selecting or clearing the Hot and Cold check boxes.

Category Cloud insights

The Category Cloud tab becomes available for summarized categorical features after the modeling process completes. This is the same word cloud that is available from the Category Cloud tab on the Data page. On the Insights page you can compare word clouds for a project's categorically-based models; from the Data page you can more easily compare clouds across features. Note that the Category Cloud is not created when using a multiclass target.

The Category Cloud displays the keys most relevant to their corresponding feature in word cloud format. Keys are displayed in a color spectrum from blue to red, with blue indicating a negative effect and red indicating a positive effect. Keys that appear more frequently are displayed in a larger font size, and those that appear less frequently are displayed in smaller font sizes.

Check the Filter stop words box to remove stop words (commonly used terms that can be excluded from searches) from the display. Removing these words can improve interpretability if the words are not informative to the Auto-Tuned Summarized Categorical Model.

Mouse over a key to display the coefficient value specific to that key. Note that the names of keys are truncated to 20 characters when displayed in the cloud. Hover over a key to read its full name (displayed with the information to the left of the cloud), which is limited to 100 characters.


Updated November 16, 2021