

Blueprints

During the course of building predictive models, DataRobot runs several different versions of each algorithm and tests thousands of possible combinations of data preprocessing and parameter settings. (Many of the models use DataRobot proprietary approaches to data preprocessing.) The result of this testing is provided in the Blueprints tab. A blueprint represents the high-level end-to-end procedure for fitting the model, including any preprocessing steps, algorithms, and post-processing.

What is the difference between a model and a blueprint?

A modeling algorithm fits a model to data, and that fitting step is just one component of a blueprint. A blueprint represents the high-level, end-to-end procedure for fitting the model, including any preprocessing, modeling, and post-processing steps.

To view a graphical representation of a blueprint, click a model in the Leaderboard.

Availability information

The level of detail displayed in a blueprint depends on the features enabled for your organization. If your organization does not currently display full, uncensored blueprints, contact your DataRobot representative for information on changing the access level.

Each blueprint has a few key sections.

Section | Description
Data | The incoming data, separated into each type (categorical, numeric, text, image, geospatial, etc.).
Transformations | The tasks that perform transformations on the data (for example, Missing values imputed). Different columns in the dataset require different types of preparation and transformation. For example, some algorithms benefit from subtracting the mean and dividing by the standard deviation of the input data, but this would not make sense for text input data. The first step in the execution of a blueprint is to identify data types that belong together so they can be processed separately.
Model(s) | The model(s) making predictions or possibly supplying stacked predictions to a subsequent model.
Post-processing | Any post-processing steps, such as Calibration.
Prediction | The data being sent as the final predictions.
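
The type separation and standardization mentioned under Transformations can be illustrated with a minimal sketch. This is not DataRobot's implementation: the helper names `split_by_type` and `standardize`, the sample columns, and the crude type check are all hypothetical and serve only to show why numeric and text columns are handled separately.

```python
from statistics import mean, pstdev

def split_by_type(columns):
    """Group column names by a crude type check (hypothetical helper)."""
    groups = {"numeric": [], "text": []}
    for name, values in columns.items():
        is_numeric = all(isinstance(v, (int, float)) for v in values)
        groups["numeric" if is_numeric else "text"].append(name)
    return groups

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical sample data: one numeric column and one text column.
columns = {
    "age": [20, 30, 40],
    "review": ["good", "bad", "fine"],
}

groups = split_by_type(columns)
# Only the numeric group is standardized; the operation is undefined for text.
scaled_age = standardize(columns["age"])
```

After standardization the numeric column has a mean of zero and a standard deviation of one, while the text column is left for a different branch of processing.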

Each blueprint has nodes and edges (i.e., connections). A node takes in data, performs an operation, and outputs the data in its new form. An edge represents the flow of data from one node to another.

When a single node receives two edges, it is receiving two sets of columns, which are stacked horizontally. That is, the column count of the incoming data is the sum of the column counts of the two sets, and the row count remains the same.
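
As a rough illustration of horizontal stacking, the sketch below joins two row-major tables column-wise. The `hstack` helper and the sample values are hypothetical; they only demonstrate that the column counts add up while the row count stays the same.

```python
def hstack(left, right):
    """Concatenate two row-major tables column-wise (hypothetical helper)."""
    if len(left) != len(right):
        raise ValueError("both inputs must have the same number of rows")
    return [l_row + r_row for l_row, r_row in zip(left, right)]

# Hypothetical outputs of two upstream nodes, each with 3 rows.
numeric_out = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]  # 3 rows x 2 columns
text_out = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]        # 3 rows x 3 columns

combined = hstack(numeric_out, text_out)
n_rows, n_cols = len(combined), len(combined[0])
```

Here `combined` has 3 rows and 2 + 3 = 5 columns.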

When a single node outputs two edges, two copies of its output data are sent to other nodes. Those other nodes are further data transformations or models.
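
One way to picture this structure is as a small directed graph. The sketch below is an assumption-laden illustration, not DataRobot's internal representation: the node names are hypothetical, and Python's standard `graphlib` module is used only to derive an order in which every node runs after the nodes that feed it.

```python
from graphlib import TopologicalSorter

# Each key is a node; its value lists the nodes whose output it receives
# (its incoming edges). All node names are hypothetical.
edges = {
    "numeric_data": [],
    "impute": ["numeric_data"],
    "scale": ["impute"],         # receives one copy of the imputed data
    "bin": ["impute"],           # receives another copy of the imputed data
    "model": ["scale", "bin"],   # two incoming edges: columns are stacked
    "prediction": ["model"],
}

def consumers(node):
    """Return the nodes that receive the given node's output."""
    return sorted(n for n, inputs in edges.items() if node in inputs)

# An execution order in which every node follows all of its inputs.
order = list(TopologicalSorter(edges).static_order())
```

In this sketch, `consumers("impute")` lists two nodes, which corresponds to two edges leaving a single node, and `model` lists two inputs, which corresponds to two edges arriving at a single node.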

Click a blueprint node to display additional information, including access to model documentation.


Updated October 20, 2021