# Modeling algorithms

> Modeling algorithms - Provides a list of the supervised and unsupervised modeling algorithms
> DataRobot supports.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-05-01T23:10:48.115562+00:00` (UTC).

## Primary page

- [Modeling algorithms](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html): Full documentation for this topic (HTML).

## Sections on this page

- [Pre-processing tasks](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#pre-processing): In-page section heading.
- [Categorical](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#categorical): In-page section heading.
- [Numerical](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#numerical): In-page section heading.
- [Geospatial](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#geospatial): In-page section heading.
- [Images](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#images): In-page section heading.
- [Text models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#text-models): In-page section heading.
- [Generalized Linear Models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#generalized-linear-models): In-page section heading.
- [Linear or additive models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#linear-or-additive-models): In-page section heading.
- [Generalized Linear Models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#generalized-linear-models_1): In-page section heading.
- [Support Vector Machines](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#support-vector-machines): In-page section heading.
- [Generalized Additive Models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#generalized-additive-models): In-page section heading.
- [Tree-based models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#tree-based-models): In-page section heading.
- [Deep learning and foundational models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#deep-learning-and-foundational-models): In-page section heading.
- [Time series-specific models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#time-series-specific-models): In-page section heading.
- [Unsupervised models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#unsupervised-models): In-page section heading.
- [Anomaly detection models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#anomaly-detection-models): In-page section heading.
- [Clustering models](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#clustering-models): In-page section heading.
- [Other model types](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/model-list.html#other-model-types): In-page section heading.

## Related documentation

- [Reference documentation](https://docs.datarobot.com/en/docs/reference/index.html): Linked from this page.
- [Predictive AI reference](https://docs.datarobot.com/en/docs/reference/pred-ai-ref/index.html): Linked from this page.
- [blueprint](https://docs.datarobot.com/en/docs/api/reference/public-api/blueprints.html): Linked from this page.
- [model repository](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/build-basic/repository.html): Linked from this page.
- [Composable ML](https://docs.datarobot.com/en/docs/classic-ui/modeling/special-workflows/cml/index.html): Linked from this page.

## Documentation content

# Modeling algorithms

DataRobot supports a comprehensive library of pre- and post-processing (modeling) steps, which combine to make up the model [blueprint](https://docs.datarobot.com/en/docs/api/reference/public-api/blueprints.html). Which steps are run, or are available in the [model repository](https://docs.datarobot.com/en/docs/classic-ui/modeling/build-models/build-basic/repository.html), depends on the dataset. This range of combinations allows DataRobot to confidently create a Leaderboard of your best modeling options. Examples of this flexibility include logistic regression with and without PCA as a pre-processor, and random forests with and without a greedy search for interaction terms.

The implication is that DataRobot likely runs each model in the lists below two to five times, each time with different pre-processing and/or variable selection. The following sections list the relevant algorithms:

- Pre-processing
- Linear or additive models
- Tree-based models
- Deep learning and foundational models
- Time series-specific models
- Unsupervised models
- Other model types
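
As a rough illustration of why one algorithm can appear several times, the sketch below pairs each model with each pre-processing option. The task names are hypothetical labels rather than DataRobot identifiers, and a plain Cartesian product is an assumption made for illustration; the actual set of blueprints depends on the dataset.

```python
from itertools import product

# Hypothetical task names for illustration only; not DataRobot identifiers.
PREPROCESSORS = ["none", "pca"]
MODELS = ["logistic_regression", "random_forest"]


def candidate_blueprints(preprocessors, models):
    """Pair every pre-processing option with every model."""
    return [
        {"preprocessing": pre, "model": model}
        for pre, model in product(preprocessors, models)
    ]


blueprints = candidate_blueprints(PREPROCESSORS, MODELS)
for blueprint in blueprints:
    print(blueprint)
```

Two pre-processing options and two models give four candidates, so each model is trained twice in this toy setup.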

## Pre-processing tasks

#### Categorical

- Bühlmann credibility estimates for high-cardinality features
- Categorical embedding
- Category count
- One-hot encoding
- Ordinal encoding of categorical variables
- Univariate credibility estimates with L2
- Efficient, sparse one-hot encoding for extremely high cardinality categorical variables
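
To make the credibility idea concrete, here is a minimal sketch of a credibility-style encoding: each category's target mean is shrunk toward the global mean, with more shrinkage for rarely seen categories. The function name and the fixed smoothing constant `k` are assumptions for illustration; this is not DataRobot's implementation.

```python
from collections import defaultdict


def credibility_encode(categories, targets, k=5.0):
    """Map each category to a target mean shrunk toward the global mean.

    k is an assumed smoothing constant: larger values pull small
    categories harder toward the global mean.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1

    encoding = {}
    for category, n in counts.items():
        z = n / (n + k)  # credibility weight, between 0 and 1
        encoding[category] = z * (sums[category] / n) + (1 - z) * global_mean
    return encoding


cats = ["a", "a", "a", "a", "a", "b"]
ys = [1, 1, 1, 1, 1, 0]
enc = credibility_encode(cats, ys, k=5.0)
print(enc)
```

Category `b` has a single observation, so its encoded value stays close to the global mean instead of dropping to 0. In practice, an encoding like this should be fit on training data only to avoid target leakage.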

#### Numerical

- Binning of numerical variables
- Constant splines
- Missing values imputed
- Numeric data cleansing
- Partial Principal Components Analysis
- Truncated Singular Value Decomposition
- Normalizer
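
The sketch below shows one common form of missing-value imputation: fill gaps with the median of the observed values and keep a flag recording which entries were missing. The median strategy and the function name are assumptions; the list above does not say which strategy a given blueprint uses.

```python
import math
from statistics import median


def impute_with_indicator(values):
    """Fill missing entries (None or NaN) with the median of observed values.

    Returns (imputed_values, missing_flags).
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    observed = [v for v in values if not is_missing(v)]
    fill = median(observed)
    imputed = [fill if is_missing(v) else v for v in values]
    flags = [1 if is_missing(v) else 0 for v in values]
    return imputed, flags


imputed, flags = impute_with_indicator([3.0, None, 1.0, float("nan"), 8.0])
print(imputed)
print(flags)
```

Keeping the flag column lets a downstream model learn from the fact that a value was missing, which matters when data is not missing at random.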

#### Geospatial

- Geospatial Location Converter
- Spatial Neighborhood Featurizer

#### Images

- Greyscale Downscaled Image Featurizer
- No Post Processing
- OpenCV detect largest rectangle
- OpenCV image featurizer
- Pre-trained multi-level global average pooling image featurizer

#### Text models

- Character / word n-grams
- Pretrained byte-pair encoders (best of both worlds for char-grams and n-grams)
- Stopword removal
- TF-IDF scaling (optional sublinear scaling and binormal separation scaling)
- Hashing vectorizers for big data
- Cosine similarity between pairs of text columns (on datasets with 2+ text columns)
- Support for all languages, including English, Japanese, Chinese, Korean, French, Spanish, Portuguese, Arabic, Ukrainian, Klingon, Elvish, Esperanto, etc.
- Unsupervised fastText models
- Linear n-gram models (character/word n-grams + TF-IDF + penalized linear/logistic regression)
- SVD n-gram models (n-grams + TF-IDF + SVD)
- Naive Bayes weighted SVM
- TinyBERT / RoBERTa / MiniLM embedding models
- Text CNNs
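
Several of the items above (word n-grams, TF-IDF with sublinear scaling, and cosine similarity between pairs of text columns) can be combined in a small standard-library sketch. The function names, the whitespace tokenizer, and the smoothed IDF formula are assumptions chosen for illustration, not a description of DataRobot's tasks.

```python
import math
from collections import Counter


def word_ngrams(text, n=1):
    """Split on whitespace and return lower-cased word n-grams."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, n=1, sublinear=True):
    """Return one sparse TF-IDF vector (a dict) per document."""
    grams = [Counter(word_ngrams(doc, n)) for doc in docs]
    # Document frequency: iterating a Counter yields each term once per doc.
    doc_freq = Counter(term for gram in grams for term in gram)
    n_docs = len(docs)
    # Smoothed IDF, one common variant.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in doc_freq.items()}
    vectors = []
    for gram in grams:
        vector = {}
        for term, count in gram.items():
            tf = 1 + math.log(count) if sublinear else count
            vector[term] = tf * idf[term]
        vectors.append(vector)
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Two text columns; compute one similarity feature per row.
titles = ["red running shoes", "wireless mouse"]
descriptions = ["lightweight red shoes for running", "mechanical keyboard"]
vectors = tfidf_vectors(titles + descriptions)
half = len(titles)
similarities = [cosine(vectors[i], vectors[half + i]) for i in range(half)]
print(similarities)
```

The first row shares three words across its two columns and gets a clearly positive similarity; the second row shares none and scores zero.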

#### Generalized Linear Models

- NA imputation (methods for missing at random and missing not at random), standardization, ridit transform
- Search for best transformations
- Efficient, sparse one-hot encoding for extremely high cardinality categorical variables

## Linear or additive models

#### Generalized Linear Models

- Penalty: L1 (Lasso), L2 (Ridge), ElasticNet, None (Logistic Regression)
- Distributions: Binomial, Gaussian, Poisson, Tweedie, Gamma, Huber
- Special Cases: 2-stage model (Binomial + Gaussian) for zero-inflated regression
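
For orientation, the penalties listed above can be written as a single objective. This is the standard elastic-net formulation, not a statement of DataRobot's exact parameterization; the symbols are introduced here for illustration: $\beta$ for the coefficients, $\ell$ for the log-likelihood of the chosen distribution, $\lambda \ge 0$ for the penalty strength, and $\alpha \in [0, 1]$ for the mixing weight.

```latex
\hat{\beta} = \arg\min_{\beta}\;
  -\frac{1}{n}\sum_{i=1}^{n} \ell\!\left(y_i,\, x_i^{\top}\beta\right)
  + \lambda\left(\alpha\,\lVert\beta\rVert_1
  + \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2\right)
```

Setting $\alpha = 1$ gives the L1 (Lasso) penalty, $\alpha = 0$ gives the L2 (Ridge) penalty, values in between give ElasticNet, and $\lambda = 0$ corresponds to no penalty.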

#### Support Vector Machines

- Penalty: L1 (Lasso), L2 (Ridge), ElasticNet, None
- Kernel: Linear, Nyström RBF, RBF
- liblinear and libsvm

#### Generalized Additive Models

- GAM
- GA2M

## Tree-based models

- Decision Tree (or CART)
- Random Forest
- ExtraTrees (or Extremely Randomized Forests)
- Gradient Boosted Trees (or GBM: Binomial, Gaussian, Poisson, Tweedie, Gamma, Huber)
- Extreme Gradient Boosted Trees (or XGBoost: Binomial, Gaussian, Poisson)
- LightGBM
- AdaBoost
- RuleFit

## Deep learning and foundational models

- Keras MLPs with residual connections, adaptive learning rates and adaptive batch sizes
- Keras self-normalizing MLPs with residual connections
- Keras neural architecture search MLPs using hyperband
- DeepCTR
- Pretrained CNNs for images using foundational models (especially EfficientNet)
- Pretrained + fine-tuned CNNs for images
- Image augmentation
- Pretrained TinyBERT models for text
- Keras Text CNNs
- fastText models for text

## Time series-specific models

- LSTMs
- DeepAR models
- AutoARIMA
- ETS, aka exponential smoothing
- TBATS
- Prophet
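
As a small example of the exponential smoothing family, the sketch below implements simple exponential smoothing, the most basic member (no trend, no seasonality). The function name and the default `alpha` are assumptions for illustration; full ETS models also estimate trend and seasonal components and choose their parameters from the data.

```python
def simple_exponential_smoothing(series, alpha=0.5):
    """Return the smoothed level after each observation.

    alpha controls how quickly older observations are discounted;
    it must lie in (0, 1].
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    level = float(series[0])
    levels = [level]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
        levels.append(level)
    return levels


levels = simple_exponential_smoothing([10, 12, 14], alpha=0.5)
forecast = levels[-1]  # the one-step-ahead forecast is the last level
print(levels, forecast)
```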

## Unsupervised models

#### Anomaly detection models

- Isolation Forest
- Local Outlier Factor
- One Class SVM
- Double Median Absolute Deviation
- Mahalanobis Distance
- Anomaly Detection Blenders
- Keras Deep Autoencoder
- Keras Deep Variational Autoencoder
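
To show what a double median absolute deviation score looks like, the sketch below computes a separate MAD for the points below and above the median, so that skewed data is not judged by a single spread. The function name and the scale factor `1.4826` (which makes the MAD comparable to a standard deviation for normal data) are assumptions for illustration, not DataRobot's implementation.

```python
from statistics import median


def double_mad_scores(values, scale=1.4826):
    """Score each value by its distance from the median, in units of the
    MAD computed on its own side of the median."""
    m = median(values)
    left_mad = scale * median([abs(v - m) for v in values if v <= m])
    right_mad = scale * median([abs(v - m) for v in values if v >= m])

    scores = []
    for v in values:
        deviation = abs(v - m)
        mad = left_mad if v < m else right_mad
        if deviation == 0:
            scores.append(0.0)
        elif mad == 0:
            # Every other point on this side sits at the median.
            scores.append(float("inf"))
        else:
            scores.append(deviation / mad)
    return scores


data = [1, 2, 3, 4, 5, 6, 100]
scores = double_mad_scores(data)
print(scores)
```

A common rule of thumb flags points whose score exceeds a threshold such as 3; here only the value 100 would be flagged.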

#### Clustering models

- K-means
- HDBSCAN

## Other model types

- Eureqa (proprietary genetic algorithm for symbolic regression)
- K-Nearest Neighbors (three distances)
- Partial least squares (used for blenders)
- Isotonic Regression (used for calibrating predictions from other models)
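
To illustrate how isotonic regression can calibrate predictions, the sketch below uses the pool-adjacent-violators approach: observed outcomes are ordered by the raw model score, and neighbouring values are averaged until the sequence is non-decreasing. The function name and the toy data are assumptions for illustration, not DataRobot's implementation.

```python
def isotonic_fit(y):
    """Return the non-decreasing sequence closest to y in least squares."""
    # Each block stores [sum, count]; adjacent blocks are merged while
    # the earlier block's mean exceeds the later block's mean.
    blocks = []
    for value in y:
        blocks.append([float(value), 1])
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count

    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted


# Outcomes (0 or 1) already sorted by the raw score of another model.
outcomes_sorted_by_score = [0, 1, 0, 1, 1]
calibrated = isotonic_fit(outcomes_sorted_by_score)
print(calibrated)
```

The fitted values can be read as calibrated probabilities for each score position: the out-of-order pair in the middle is pooled into a shared value of 0.5.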

Click a [blueprint](https://docs.datarobot.com/en/docs/api/reference/public-api/blueprints.html) node to access full model documentation. Using [Composable ML](https://docs.datarobot.com/en/docs/classic-ui/modeling/special-workflows/cml/index.html), you can build models that best suit your needs using built-in tasks and custom Python/R code.
