# Unsupervised Projects (Clustering)

> Unsupervised Projects (Clustering) - Learn how to create and work with unsupervised clustering
> projects in DataRobot.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.289395+00:00` (UTC).

## Primary page

- [Unsupervised Projects (Clustering)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html): Full documentation for this topic (HTML).

## Sections on this page

- [Create unsupervised projects](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#create-unsupervised-projects): In-page section heading.
- [Unsupervised clustering project metric](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#unsupervised-clustering-project-metric): In-page section heading.
- [Retrieve information about clusters](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#retrieve-information-about-clusters): In-page section heading.
- [Work with cluster insights](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#work-with-cluster-insights): In-page section heading.
- [Work with clusters](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#work-with-clusters): In-page section heading.
- [Clustering classes reference](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#clustering-classes-reference): In-page section heading.
- [ClusteringModel](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#clusteringmodel): In-page section heading.
- [class datarobot.models.model.ClusteringModel](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusteringModel): In-page section heading.
- [compute_insights(max_wait=600)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusteringModel.compute_insights): In-page section heading.
- [property insights: List\[ClusterInsight\]](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusteringModel.insights): In-page section heading.
- [property clusters: List\[Cluster\]](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusteringModel.clusters): In-page section heading.
- [update_cluster_names(cluster_name_mappings)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusteringModel.update_cluster_names): In-page section heading.
- [update_cluster_name(current_name, new_name)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusteringModel.update_cluster_name): In-page section heading.
- [Cluster](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#cluster): In-page section heading.
- [class datarobot.models.model.Cluster](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.Cluster): In-page section heading.
- [classmethod list(project_id, model_id)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.Cluster.list): In-page section heading.
- [classmethod update_multiple_names(project_id, model_id, cluster_name_mappings)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.Cluster.update_multiple_names): In-page section heading.
- [classmethod update_name(project_id, model_id, current_name, new_name)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.Cluster.update_name): In-page section heading.
- [ClusterInsight](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#clusterinsight): In-page section heading.
- [class datarobot.models.model.ClusterInsight](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusterInsight): In-page section heading.
- [classmethod compute(project_id, model_id, max_wait=600)](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/unsupervised_clustering.html#datarobot.models.model.ClusterInsight.compute): In-page section heading.

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [Developer learning](https://docs.datarobot.com/en/docs/api/dev-learning/index.html): Linked from this page.
- [Python API client user guide](https://docs.datarobot.com/en/docs/api/dev-learning/python/index.html): Linked from this page.
- [Modeling](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/index.html): Linked from this page.
- [Specialized workflows](https://docs.datarobot.com/en/docs/api/dev-learning/python/modeling/spec/index.html): Linked from this page.
- [Model](https://docs.datarobot.com/en/docs/api/reference/sdk/datarobot-models.html#datarobot.models.Model): Linked from this page.
- [ClientError](https://docs.datarobot.com/en/docs/api/reference/sdk/errors.html#datarobot.errors.ClientError): Linked from this page.

## Documentation content

# Unsupervised Projects (Clustering)

Use clustering when data is not labelled and the problem can be interpreted as grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar to each other than to those in other groups (clusters).
It is a common task in data exploration when finding groups and similarities is needed.

## Create unsupervised projects

To create an unsupervised project, set `unsupervised_mode` to `True` when setting the target.
To specify clustering, set `unsupervised_type` to `CLUSTERING`.
When setting the modeling mode is required, clustering supports either `AUTOPILOT_MODE.COMPREHENSIVE` for DataRobot-run Autopilot or `AUTOPILOT_MODE.MANUAL` for user control of which models/parameters to use.

Example:

```
from datarobot import Project
from datarobot.enums import UnsupervisedTypeEnum
from datarobot.enums import AUTOPILOT_MODE

project = Project.create("dataset.csv", project_name="unsupervised clustering")
project.analyze_and_model(
    unsupervised_mode=True,
    mode=AUTOPILOT_MODE.COMPREHENSIVE,
    unsupervised_type=UnsupervisedTypeEnum.CLUSTERING,
)
```

You can optionally specify an explicit list of cluster counts for Autopilot to try.
To do this, pass a list of integers to the optional `autopilot_cluster_list` parameter of the `analyze_and_model()` method.

```
project.analyze_and_model(
    unsupervised_mode=True,
    mode=AUTOPILOT_MODE.COMPREHENSIVE,
    unsupervised_type=UnsupervisedTypeEnum.CLUSTERING,
    autopilot_cluster_list=[7, 9, 11, 15, 19],
)
```
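Before passing `autopilot_cluster_list`, it can help to normalize the candidate cluster counts. The following is a minimal sketch: `normalize_cluster_counts` is a hypothetical helper, not part of the DataRobot client, and the lower bound of 2 is an assumption for illustration rather than a documented limit.

```python
def normalize_cluster_counts(counts, minimum=2):
    """Return sorted, de-duplicated integer cluster counts.

    `minimum=2` is an assumption for illustration, not a documented DataRobot limit.
    """
    cleaned = set()
    for value in counts:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("cluster counts must be integers, got {!r}".format(value))
        if value < minimum:
            raise ValueError("cluster count {} is below {}".format(value, minimum))
        cleaned.add(value)
    return sorted(cleaned)


cluster_list = normalize_cluster_counts([15, 7, 9, 7, 11, 19])
print(cluster_list)  # [7, 9, 11, 15, 19]
```

The resulting list can then be passed as `autopilot_cluster_list=cluster_list` in the `analyze_and_model()` call shown above.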

You can also create the project and start modeling in one step using the `Project.start()` method.
By default, this method uses `AUTOPILOT_MODE.COMPREHENSIVE`.

```
from datarobot import Project
from datarobot.enums import UnsupervisedTypeEnum

project = Project.start(
    "dataset.csv",
    unsupervised_mode=True,
    project_name="unsupervised clustering project",
    unsupervised_type=UnsupervisedTypeEnum.CLUSTERING,
)
```

## Unsupervised clustering project metric

Unsupervised clustering projects use the `Silhouette Score` metric to rank models; the metric is not used to optimize them.
It measures how similar each object is to the other objects in its own cluster compared with objects in other clusters.
Values range from -1 to 1, and higher values indicate better-separated clusters.
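To make the metric concrete, here is a small sketch of the standard silhouette coefficient in plain Python. It assumes Euclidean distance and the textbook definition (`a` is the mean distance to points in the same cluster, `b` is the mean distance to the nearest other cluster); the source does not state how DataRobot computes its `Silhouette Score`, so treat this as an illustration of the concept only. The function needs at least two clusters.

```python
import math


def silhouette_score(points, labels):
    """Mean silhouette coefficient using Euclidean distance (textbook definition)."""

    def mean_distance(i, members):
        return sum(math.dist(points[i], points[j]) for j in members) / len(members)

    # Group point indices by cluster label.
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # Convention: a point alone in its cluster scores 0.
            scores.append(0.0)
            continue
        a = mean_distance(i, own)
        b = min(
            mean_distance(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denominator = max(a, b)
        scores.append(0.0 if denominator == 0 else (b - a) / denominator)
    return sum(scores) / len(scores)


# Two tight, well-separated groups on a line.
points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels = ["A", "A", "B", "B"]
print(round(silhouette_score(points, labels), 4))  # 0.8997
```

A score close to 1, as here, means points sit much closer to their own cluster than to the neighboring one.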

## Retrieve information about clusters

For a trained model, you can retrieve information about clusters along with standard model information.
To do this, when training completes, retrieve a model and view basic clustering information:

> - `n_clusters`: number of clusters for the model.
> - `is_n_clusters_dynamically_determined`: how the clustering model picks the number of clusters.

Here is a code snippet that retrieves the number of clusters for a model:

```
from datarobot import ClusteringModel
model = ClusteringModel.get(project_id, model_id)
print("{} clusters found".format(model.n_clusters))
```
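Both attributes can be combined into a one-line summary. This sketch uses a hypothetical helper, `describe_clustering`, and assumes `is_n_clusters_dynamically_determined` is a boolean; a `SimpleNamespace` stands in for a real `ClusteringModel` so the snippet runs without a DataRobot connection.

```python
from types import SimpleNamespace


def describe_clustering(model):
    """Summarize the cluster count and (assumed boolean) selection flag of a model."""
    if model.is_n_clusters_dynamically_determined:
        how = "chosen by the model"
    else:
        how = "fixed in advance"
    return "{} clusters ({})".format(model.n_clusters, how)


# Stand-in object; in practice pass the result of ClusteringModel.get(project_id, model_id).
stand_in = SimpleNamespace(n_clusters=3, is_n_clusters_dynamically_determined=True)
print(describe_clustering(stand_in))  # 3 clusters (chosen by the model)
```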

You can retrieve more details about clusters and their data using cluster insights.

## Work with cluster insights

You can compute cluster insights to better understand clusters and their characteristics.
This process will perform calculations and return detailed information about each feature and its importance, as well as a detailed per-cluster breakdown.

To compute and retrieve cluster insights, use the `ClusteringModel` and its `compute_insights` method.
The method starts the cluster insights compute job, waits for its completion for the number of seconds specified in the optional parameter `max_wait` (default: 600), and returns results when insights are ready.

If insights are already computed, access them using the `insights` property of the `ClusteringModel` class.

```
from datarobot import ClusteringModel
model = ClusteringModel.get(project_id, model_id)
insights = model.compute_insights()
```

This call, with `max_wait` specified, starts the computation and waits up to the given number of seconds:

```
from datarobot import ClusteringModel
model = ClusteringModel.get(project_id, model_id)
insights = model.compute_insights(max_wait=60)
```

If computation fails to finish before `max_wait` expires, the method raises an `AsyncTimeoutError`.
You can retrieve cluster insights after the computation job finishes.
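A timeout can be handled explicitly instead of letting the exception propagate. The sketch below uses a hypothetical helper, `try_compute_insights`, that takes the exception class as an argument so it stays independent of the client; `AsyncTimeoutError` is assumed to be importable from `datarobot.errors`, alongside `ClientError`.

```python
def try_compute_insights(model, timeout_error, max_wait=60):
    """Return computed insights, or None if the wait time runs out.

    `timeout_error` is the exception class to treat as a timeout, for example
    datarobot.errors.AsyncTimeoutError (import location is an assumption).
    """
    try:
        return model.compute_insights(max_wait=max_wait)
    except timeout_error:
        print("Cluster insights not ready after {} seconds".format(max_wait))
        return None


# Usage sketch (requires a configured DataRobot client):
# from datarobot import ClusteringModel
# from datarobot.errors import AsyncTimeoutError
# model = ClusteringModel.get(project_id, model_id)
# insights = try_compute_insights(model, AsyncTimeoutError, max_wait=60)
```

Returning `None` keeps the caller in control of whether to retry with a larger `max_wait` or to stop.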

To retrieve cluster insights already computed:

```
from datarobot import ClusteringModel
model = ClusteringModel.get(project_id, model_id)
for insight in model.insights:
    print(insight)
```
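Once insights are available, a common next step is to rank features by impact. This is a minimal sketch using the `feature_name` and `feature_impact` attributes listed in the `ClusterInsight` reference; `top_features` is a hypothetical helper, and it assumes every insight has a numeric `feature_impact`.

```python
def top_features(insights, n=5):
    """Return the n (feature_name, feature_impact) pairs with the highest impact."""
    ranked = sorted(insights, key=lambda insight: insight.feature_impact, reverse=True)
    return [(insight.feature_name, insight.feature_impact) for insight in ranked[:n]]


# Usage sketch (requires a configured DataRobot client):
# for name, impact in top_features(model.insights, n=3):
#     print("{}: {:.3f}".format(name, impact))
```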

## Work with clusters

By default, DataRobot names clusters “Cluster 1”, “Cluster 2”, …, “Cluster N”.
You can retrieve these names and change them as needed.
If you retrieve clusters before computing insights, they contain only names.
After insight computation completes, each cluster also holds the percentage of data that it represents.

For example:

```
from datarobot import ClusteringModel
model = ClusteringModel.get(project_id, model_id)

# helper function
def print_summary(name, percent):
    if not percent:
        percent = "?"
    print("'{}' holds {} % of data".format(name, percent))

for cluster in model.clusters:
    print_summary(cluster.name, cluster.percent)
model.compute_insights()
print("-- Cluster insights computation finished --")
for cluster in model.clusters:
    print_summary(cluster.name, cluster.percent)
```

For a model with three clusters, the code snippet will output:

```
'Cluster 1' holds ? % of data
'Cluster 2' holds ? % of data
'Cluster 3' holds ? % of data
-- Cluster insights computation finished --
'Cluster 1' holds 27.1704180064 % of data
'Cluster 2' holds 36.9131832797 % of data
'Cluster 3' holds 35.9163987138 % of data
```
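The same `name` and `percent` attributes can be used to order clusters by size. In this sketch, `clusters_by_size` is a hypothetical helper, and it assumes `percent` is `None` until insights are computed (the example above only shows that it is falsy).

```python
def clusters_by_size(clusters):
    """Return clusters with a known percent, largest first."""
    known = [cluster for cluster in clusters if cluster.percent is not None]
    return sorted(known, key=lambda cluster: cluster.percent, reverse=True)


# Usage sketch (requires a configured DataRobot client):
# for cluster in clusters_by_size(model.clusters):
#     print("{}: {:.1f} %".format(cluster.name, cluster.percent))
```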

Use the following methods of the `ClusteringModel` class to change cluster names:
  - `update_cluster_names` - changes multiple cluster names using a list of (current name, new name) tuples
  - `update_cluster_name` - changes one cluster name

After the update, each method returns a list of clusters with the changed names.

For example:

```
from datarobot import ClusteringModel
model = ClusteringModel.get(project_id, model_id)

# update multiple
cluster_name_mappings = [
    ("Cluster 1", "AAA"),
    ("Cluster 2", "BBB"),
    ("Cluster 3", "CCC")
]
clusters = model.update_cluster_names(cluster_name_mappings)

# update single
clusters = model.update_cluster_name("CCC", "DDD")
```
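If the renames are kept in a dictionary, they can be converted into the list of tuples that `update_cluster_names` expects. This is a sketch with a hypothetical helper, `build_name_mappings`; the checks for empty and duplicate new names are assumptions about what a sensible mapping looks like, not documented server rules.

```python
def build_name_mappings(renames):
    """Convert {current_name: new_name} into [(current_name, new_name), ...]."""
    new_names = list(renames.values())
    for new_name in new_names:
        if not isinstance(new_name, str) or not new_name.strip():
            raise ValueError("new cluster names must be non-empty strings")
    if len(set(new_names)) != len(new_names):
        raise ValueError("new cluster names must be unique")
    return list(renames.items())


mappings = build_name_mappings({"Cluster 1": "AAA", "Cluster 2": "BBB"})
print(mappings)  # [('Cluster 1', 'AAA'), ('Cluster 2', 'BBB')]
```

The result can be passed directly as `model.update_cluster_names(mappings)`.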

## Clustering classes reference

### ClusteringModel

### class datarobot.models.model.ClusteringModel

ClusteringModel extends the [Model](https://docs.datarobot.com/en/docs/api/reference/sdk/datarobot-models.html#datarobot.models.Model) class.
It provides properties and methods specific to clustering projects.

#### compute_insights(max_wait=600)

Compute and retrieve cluster insights for the model.
This method waits for the cluster insights job to complete and returns the results when it finishes.
If computation takes longer than the specified `max_wait`, an exception is raised.

- Parameters:
- max_wait ( int ) – Maximum number of seconds to wait before giving up.
- Return type: List of ClusterInsight
- Raises:
- ClientError – Server rejected creation due to client error. Most likely cause is bad project_id or model_id .
- AsyncFailureError – If any of the responses from the server are unexpected
- AsyncProcessUnsuccessfulError – If the cluster insights computation has failed or was cancelled.
- AsyncTimeoutError – If the cluster insights computation did not resolve in time

#### property insights : List[ClusterInsight]

Return actual list of cluster insights if already computed.

- Return type: List of ClusterInsight

#### property clusters : List[Cluster]

Return actual list of Clusters.

- Return type: List of Cluster

#### update_cluster_names(cluster_name_mappings)

Change many cluster names at once based on a list of name mappings.

- Parameters: cluster_name_mappings ( List of tuples ) – Cluster name mappings, each consisting of the current cluster name and the new cluster name. Example:

```
cluster_name_mappings = [
    ("current cluster name 1", "new cluster name 1"),
    ("current cluster name 2", "new cluster name 2")]
```

#### update_cluster_name(current_name, new_name)

Change cluster name from current_name to new_name.

- Parameters:
- current_name ( str ) – Current cluster name.
- new_name ( str ) – New cluster name.
- Return type: List of Cluster
- Raises: datarobot.errors.ClientError – Server rejected update of cluster names.

### Cluster

### class datarobot.models.model.Cluster

Representation of a single cluster.

- Variables:
- name ( str ) – Current cluster name
- percent ( float ) – Percent of data contained in the cluster. This value is reported after cluster insights are computed for the model.

#### classmethod list(project_id, model_id)

Retrieve a list of clusters in the model.

- Parameters:
- project_id ( str ) – ID of the project that the model is part of.
- model_id ( str ) – ID of the model.
- Return type: List of clusters

#### classmethod update_multiple_names(project_id, model_id, cluster_name_mappings)

Update many cluster names at once based on a list of name mappings.

- Parameters:
- project_id ( str ) – ID of the project that the model is part of.
- model_id ( str ) – ID of the model.
- cluster_name_mappings ( List of tuples ) – Cluster name mappings, each consisting of the current and new name for a cluster. Example: `cluster_name_mappings = [("current cluster name 1", "new cluster name 1"), ("current cluster name 2", "new cluster name 2")]`
- Return type: List of clusters
- Raises:
- datarobot.errors.ClientError – Server rejected update of cluster names.
- ValueError – Invalid cluster name mapping provided.

#### classmethod update_name(project_id, model_id, current_name, new_name)

Change cluster name from current_name to new_name.

- Parameters:
- project_id ( str ) – ID of the project that the model is part of.
- model_id ( str ) – ID of the model.
- current_name ( str ) – Current cluster name
- new_name ( str ) – New cluster name
- Return type: List of Cluster

### ClusterInsight

### class datarobot.models.model.ClusterInsight

Holds data on all insights related to a feature, as well as a breakdown per cluster.

- Parameters:
- feature_name ( str ) – Name of a feature from the dataset.
- feature_type ( str ) – Type of feature.
- insights ( List[ClusterInsight] ) – List provides information regarding the importance of a specific feature in relation to each cluster. Results help understand how the model is grouping data and what each cluster represents.
- feature_impact ( float ) – Impact of a feature ranging from 0 to 1.

#### classmethod compute(project_id, model_id, max_wait=600)

Starts computation of cluster insights for the model and, if successful, returns the computed ClusterInsights.
This method allows the calculation to continue for a specified time and, if it is not complete by then, cancels the request.

- Parameters:
- project_id ( str ) – ID of the project to begin creation of cluster insights for.
- model_id ( str ) – ID of the project model to begin creation of cluster insights for.
- max_wait ( int ) – Maximum number of seconds to wait before canceling the request.
- Return type: List[ClusterInsight]
- Raises:
- ClientError – Server rejected creation due to client error. Most likely cause is bad project_id or model_id .
- AsyncFailureError – If any of the responses from the server are unexpected.
- AsyncProcessUnsuccessfulError – If the cluster insights computation failed or was cancelled.
- AsyncTimeoutError – If the cluster insights computation did not resolve within the specified time limit (max_wait).
