Python modeling workflow overview¶
This code example outlines how to use DataRobot's Python API client to train and experiment with models. It also offers ideas for integrating DataRobot with other products via the API.
Specifically, you will:
- Create a project and run Autopilot.
- Experiment with feature lists, modeling algorithms, and hyperparameters.
- Choose the best model.
- Perform an in-depth evaluation of the selected model.
- Deploy a model into production in a few lines of code.
Data used for this example¶
This walkthrough uses a synthetic dataset that illustrates a credit card company’s anti-money laundering (AML) compliance program, with the intent of detecting the following money-laundering scenarios:
- A customer spends on the card, but overpays their credit card bill and seeks a cash refund for the difference.
- A customer receives credits from a merchant without offsetting transactions, and either spends the money or requests a cash refund from the bank.
A rule-based engine is in place to produce an alert when it detects potentially suspicious activity consistent with the scenarios above. The engine triggers an alert whenever a customer requests a refund of any amount. Small refund requests are included because they could be a money launderer’s way of testing the refund mechanism or trying to establish refund requests as a normal pattern for their account.
The target feature is SAR (suspicious activity report). It indicates whether or not the alert resulted in an SAR after manual review by investigators, which makes this project a binary classification problem. The unit of analysis is an individual alert, so the model is built at the alert level. Each alert receives a score ranging from 0 to 1, indicating the probability that the alert leads to an SAR. The data consists of a mixture of numeric, categorical, and text features.
Setup¶
Import libraries¶
import datarobot as dr
from datarobot_bp_workshop import Visualize, Workshop
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
import time
import warnings
import graphviz
import plotly.express as px
import seaborn as sns
warnings.filterwarnings("ignore")
w = Workshop()
# Widen pandas display settings so .head() output is easier to read
pd.options.display.width = 0
pd.options.display.max_columns = 200
pd.options.display.max_rows = 2000
sns.set_theme(style="darkgrid")
Connect to DataRobot¶
Read more about different options for connecting to DataRobot from the client.
# If the config file is not in the default location described in the API Quickstart guide, '~/.config/datarobot/drconfig.yaml', then you will need to call
# dr.Client(config_path='path-to-drconfig.yaml')
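If you are not using a config file, you can instead pass the endpoint and an API token directly. A minimal sketch (the endpoint URL and token below are placeholders to replace with your own values):
# Alternatively, connect by passing credentials explicitly (placeholder values):
# dr.Client(
#     endpoint="https://app.datarobot.com/api/v2",
#     token="YOUR_API_TOKEN",
# )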
Select a training dataset¶
# To read from a local file, uncomment and use:
# df = pd.read_csv('./data/DR_Demo_AML_Alert.csv')
# To read from an S3 bucket:
df = pd.read_csv("https://s3.amazonaws.com/datarobot_public_datasets/DR_Demo_AML_Alert.csv")
df.head()
# To view the target distribution:
df_target_summary = df["SAR"].value_counts().rename_axis("SAR").reset_index(name="Count")
ax = sns.barplot(x="SAR", y="Count", data=df_target_summary)
for index, row in df_target_summary.iterrows():
    ax.text(index, row.Count, int(row.Count), color="black", ha="center")
plt.show()
Create a project and train models with Autopilot¶
# Create a project by uploading data. This will take a few minutes.
project = dr.Project.create(
sourcedata=df,
project_name="DR_Demo_API_alert_AML_{}".format(pd.datetime.now().strftime("%Y-%m-%d %H:%M")),
)
# Set the project's target and initiate Autopilot in Quick mode.
project.analyze_and_model(target="SAR", worker_count=-1)
# Wait for Autopilot to finish. You can set verbosity to 0 if you do not wish to see progress updates.
project.wait_for_autopilot(verbosity=1)
# Open the project's Leaderboard to monitor progress in the UI.
project.open_in_browser()
Retrieve and review results from the Leaderboard¶
def get_top_of_leaderboard(project, verbose=True):
# A helper method to assemble a dataframe with Leaderboard results and print a summary:
leaderboard = []
for m in project.get_models():
leaderboard.append(
[
m.blueprint_id,
m.featurelist.id,
m.id,
m.model_type,
m.sample_pct,
m.metrics["AUC"]["validation"],
m.metrics["AUC"]["crossValidation"],
]
)
leaderboard_df = pd.DataFrame(
columns=[
"bp_id",
"featurelist",
"model_id",
"model",
"pct",
"validation",
"cross_validation",
],
data=leaderboard,
)
    if verbose:
# Print a Leaderboard summary:
print("Unique blueprints tested: " + str(len(leaderboard_df["bp_id"].unique())))
print("Feature lists tested: " + str(len(leaderboard_df["featurelist"].unique())))
print("Models trained: " + str(len(leaderboard_df)))
print("Blueprints in the project repository: " + str(len(project.get_blueprints())))
        # Print the essential information for top models, sorted by cross-validation accuracy:
print("\n\nTop models on the Leaderboard:")
leaderboard_top = (
leaderboard_df[leaderboard_df["pct"] == 64]
.sort_values(by="cross_validation", ascending=False)
.head()
.reset_index(drop=True)
)
display(leaderboard_top.drop(columns=["bp_id", "featurelist"], inplace=False))
# Show blueprints of top models:
for index, m in leaderboard_top.iterrows():
Visualize.show_dr_blueprint(dr.Blueprint.get(project.id, m["bp_id"]))
return leaderboard_top
leaderboard_top = get_top_of_leaderboard(project)
Unique blueprints tested: 15
Feature lists tested: 4
Models trained: 28
Blueprints in the project repository: 81

Top models on the Leaderboard:
|   | model_id | model | pct | validation | cross_validation |
|---|---|---|---|---|---|
| 0 | 61ae672ab9e0c15c325b05b2 | RandomForest Classifier (Gini) | 64.0 | 0.94577 | 0.945790 |
| 1 | 61ae73aa2a2a4649c04e2c73 | RandomForest Classifier (Gini) | 64.0 | 0.94598 | 0.945712 |
| 2 | 61ae69df4e24ec75154c24ee | AVG Blender | 64.0 | 0.94573 | 0.945318 |
| 3 | 61ae65a640d62771a45b0594 | eXtreme Gradient Boosted Trees Classifier with... | 64.0 | 0.94675 | 0.945166 |
| 4 | 61ae65a540d62771a45b0592 | RandomForest Classifier (Gini) | 64.0 | 0.94542 | 0.944690 |
Experiment to get better results¶
When you run a project using Autopilot, DataRobot first creates blueprints based on the characteristics of your data and puts them in the Repository. It then chooses a subset of these to train; when training completes, these are the blueprints you'll find on the Leaderboard. After the Leaderboard is populated, it can be useful to train some of the blueprints that DataRobot skipped. For example, you can try a more complex Keras blueprint such as Keras Residual AutoInt Classifier using Training Schedule (3 Attention Layers with 2 Heads, 2 Layers: 100, 100 Units). In some cases, you may want to directly access a trained model through the API client and retrain it with a different feature list or tune its hyperparameters.
Find blueprints trained for the project from the Repository¶
blueprints = project.get_blueprints()
# After retrieving the blueprints, you can search for a specific blueprint
# In the example below, search for all models that have "Gradient" in their name
models_to_run = []
for blueprint in blueprints:
if "Gradient" in blueprint.model_type:
models_to_run.append(blueprint)
models_to_run
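The snippet above only collects matching blueprints; to train one of them, pass it to project.train(), which returns a model job ID. A minimal sketch, assuming at least one "Gradient" blueprint was found:
# Train the first matching blueprint from the Repository and wait for the result
if models_to_run:
    model_job_id = project.train(models_to_run[0], sample_pct=64)
    new_model = dr.ModelJob.get(project.id, model_job_id).get_result_when_complete()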
Define and train a custom blueprint¶
If you wish to create a custom blueprint rather than use one from the Repository, use the following snippet. You can read more about composing custom blueprints via code in DataRobot's Blueprint Workshop documentation.
pdm3 = w.Tasks.PDM3(w.TaskInputs.CAT)
pdm3.set_task_parameters(cm=50000, sc=10)
ndc = w.Tasks.NDC(w.TaskInputs.NUM)
rdt5 = w.Tasks.RDT5(ndc)
ptm3 = w.Tasks.PTM3(w.TaskInputs.TXT)
ptm3.set_task_parameters(d2=0.2, mxf=20000, d1=5, n="l2", id=True)
kerasc = w.Tasks.KERASC(rdt5, pdm3, ptm3)
kerasc.set_task_parameters(
always_use_test_set=1,
epochs=4,
hidden_batch_norm=1,
hidden_units="list(64)",
hidden_use_bias=0,
learning_rate=0.03,
use_training_schedule=1,
)
# Check task documentation:
# kerasc.documentation()
kerasc_blueprint = w.BlueprintGraph(kerasc, name="A Custom Keras BP (1 Layer: 64 Units)").save()
kerasc_blueprint.show()
kerasc_blueprint.train(project_id=project.id, sample_pct=64)
Training requested! Blueprint Id: 4f5c40cbacfa89b3e37dc2f6d5c169a2
Name: 'A Custom Keras BP (1 Layer: 64 Units)'
Input Data: Categorical | Numeric | Text
Tasks: One-Hot Encoding | Numeric Data Cleansing | Smooth Ridit Transform | Matrix of word-grams occurrences | Keras Neural Network Classifier
Train a model with a different feature list¶
# Select a model from the Leaderboard:
model = dr.Model.get(project=project.id, model_id=leaderboard_top.iloc[0]["model_id"])
# Retrieve Feature Impact:
feature_impact = model.get_or_request_feature_impact()
# Create a feature list using the top 25 features based on feature impact:
feature_list = [f["featureName"] for f in feature_impact[:25]]
new_list = project.create_featurelist("new_feat_list", feature_list)
# Retrain the model using the new feature list; retrain() returns a ModelJob
retrain_job = model.retrain(featurelist_id=new_list.id)
retrained_model = retrain_job.get_result_when_complete()
Tune hyperparameters for a model¶
tune = model.start_advanced_tuning_session()
# Get available task names, and the available parameter names
# for a task that exists on this model
tasks = tune.get_task_names()
tune.get_parameter_names(tasks[1])
# Adjust this section as required; task and parameter names, as well as acceptable values, vary by model
tune.set_parameter(task_name=tasks[1], parameter_name="n_estimators", value=200)
job = tune.run()
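tune.run() returns a model job, so you can block until the tuned model is ready:
# Wait for the tuning job to finish and retrieve the resulting model
tuned_model = job.get_result_when_complete()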
Select the best model¶
# View the top models on the Leaderboard
leaderboard_top = get_top_of_leaderboard(project)
Unique blueprints tested: 15
Feature lists tested: 4
Models trained: 28
Blueprints in the project repository: 81

Top models on the Leaderboard:

|   | model_id | model | pct | validation | cross_validation |
|---|---|---|---|---|---|
| 0 | 61ae672ab9e0c15c325b05b2 | RandomForest Classifier (Gini) | 64.0 | 0.94577 | 0.945790 |
| 1 | 61ae73aa2a2a4649c04e2c73 | RandomForest Classifier (Gini) | 64.0 | 0.94598 | 0.945712 |
| 2 | 61ae69df4e24ec75154c24ee | AVG Blender | 64.0 | 0.94573 | 0.945318 |
| 3 | 61ae65a640d62771a45b0594 | eXtreme Gradient Boosted Trees Classifier with... | 64.0 | 0.94675 | 0.945166 |
| 4 | 61ae65a540d62771a45b0592 | RandomForest Classifier (Gini) | 64.0 | 0.94542 | 0.944690 |
# Select the top model based on cross-validation accuracy (AUC)
top_model = dr.Model.get(project=project.id, model_id=leaderboard_top.iloc[0]["model_id"])
In-depth model evaluation¶
Retrieve and plot the ROC curve¶
roc = top_model.get_roc_curve("validation")
df_roc = pd.DataFrame(roc.roc_points)
dr_dark_blue = "#08233F"
dr_roc_green = "#03c75f"
white = "#ffffff"
fig = plt.figure(figsize=(8, 8))
axes = fig.add_subplot(1, 1, 1, facecolor=dr_dark_blue)
plt.scatter(df_roc.false_positive_rate, df_roc.true_positive_rate, color=dr_roc_green)
plt.plot(df_roc.false_positive_rate, df_roc.true_positive_rate, color=dr_roc_green)
plt.plot([0, 1], [0, 1], color=white, alpha=0.25)
plt.title("ROC curve")
plt.xlabel("False Positive Rate (Fallout)")
plt.xlim([0, 1])
plt.ylabel("True Positive Rate (Sensitivity)")
plt.ylim([0, 1])
plt.show()
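Because each entry in roc_points carries per-threshold metrics, you can also scan the curve for an operating point, for example the threshold that maximizes F1. A minimal sketch using the df_roc frame built above:
# Find the scoring threshold that maximizes F1 on the validation partition
best_row = df_roc.loc[df_roc["f1_score"].idxmax()]
print("Best F1 = {:.3f} at threshold = {:.3f}".format(best_row["f1_score"], best_row["threshold"]))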
Retrieve and plot Feature Impact¶
max_num_features = 15
# Retrieve Feature Impact
feature_impacts = top_model.get_or_request_feature_impact()
# Plot permutation-based Feature Impact
feature_impacts.sort(key=lambda x: x["impactNormalized"], reverse=True)
FeatureImpactDF = pd.DataFrame(
[
{"Impact Normalized": f["impactNormalized"], "Feature Name": f["featureName"]}
for f in feature_impacts[:max_num_features]
]
)
FeatureImpactDF["X axis"] = FeatureImpactDF.index
g = sns.lmplot(x="Impact Normalized", y="X axis", data=FeatureImpactDF, fit_reg=False)
sns.barplot(y=FeatureImpactDF["Feature Name"], x=FeatureImpactDF["Impact Normalized"])
Retrieve and plot Feature Effects¶
feature_effects = top_model.get_or_request_feature_effect(source="validation")
max_features = 5
for f in feature_effects.feature_effects[:max_features]:
plt.figure(figsize=(9, 6))
d = pd.DataFrame(f["partial_dependence"]["data"])
if f["feature_type"] == "numeric":
d = d[d["label"] != "nan"]
d["label"] = pd.to_numeric(d["label"])
sns.lineplot(x="label", y="dependence", data=d).set_title(
f["feature_name"] + ": importance=" + str(round(f["feature_impact_score"], 2))
)
else:
sns.scatterplot(x="label", y="dependence", data=d).set_title(
f["feature_name"] + ": importance=" + str(round(f["feature_impact_score"], 2))
)
Score data before deployment¶
# Use training data to test how the model makes predictions
test_data = df.head(50)
dataset_from_file = project.upload_dataset(test_data)
predict_job_1 = top_model.request_predictions(dataset_from_file.id)
predictions = predict_job_1.get_result_when_complete()
display(predictions.head())
|   | row_id | prediction | positive_probability | prediction_threshold | class_0.0 | class_1.0 |
|---|---|---|---|---|---|---|
| 0 | 0 | 0.0 | 0.000192 | 0.5 | 0.999808 | 0.000192 |
| 1 | 1 | 0.0 | 0.214922 | 0.5 | 0.785078 | 0.214922 |
| 2 | 2 | 0.0 | 0.256123 | 0.5 | 0.743877 | 0.256123 |
| 3 | 3 | 0.0 | 0.000051 | 0.5 | 0.999949 | 0.000051 |
| 4 | 4 | 0.0 | 0.215951 | 0.5 | 0.784049 | 0.215951 |
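To review the scores next to the original rows, you can join the predictions back to the scored sample by position (a sketch; it assumes the row order of the 50-row sample is preserved):
# Attach the predicted probability to the rows that were scored
scored = test_data.reset_index(drop=True).join(predictions["positive_probability"])
scored[["SAR", "positive_probability"]].head()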
Compute Prediction Explanations¶
# Prepare prediction explanations
pe_job = dr.PredictionExplanationsInitialization.create(project.id, top_model.id)
pe_job.wait_for_completion()
# Compute prediction explanations with default parameters
pe_job2 = dr.PredictionExplanations.create(
project.id,
top_model.id,
dataset_from_file.id,
max_explanations=3,
threshold_low=0.1,
threshold_high=0.5,
)
pe = pe_job2.get_result_when_complete()
display(pe.get_all_as_dataframe().head())
Deploy a model¶
After identifying the best-performing model, you can deploy it and use DataRobot's REST API to make HTTP requests that return predictions. You can also configure batch jobs to write predictions back into your environment of choice.
Once deployed, access monitoring capabilities such as:
- Service health
- Prediction accuracy
- Model retraining
# Use the model ID from the previous steps, or copy and paste a model ID from the UI:
model_id = top_model.id
prediction_server_id = dr.PredictionServer.list()[0].id
deployment = dr.Deployment.create_from_learning_model(
model_id,
label="New Deployment",
description="A new deployment",
default_prediction_server_id=prediction_server_id,
)
deployment
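With the deployment in place, you can exercise the capabilities listed above. A minimal sketch that scores a CSV through the batch prediction API and retrieves service health statistics (the file paths are hypothetical placeholders):
# Score a local CSV against the deployment (hypothetical paths):
dr.BatchPredictionJob.score_to_file(
    deployment.id,
    "./data_to_score.csv",
    "./predictions_out.csv",
)
# Retrieve service health statistics for the deployment
service_stats = deployment.get_service_stats()
print(service_stats.metrics)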