Segmented analysis

Segmented analysis helps you identify operational issues with the training data and prediction requests for a deployment. DataRobot supports drill-down analysis of service health, data drift, and accuracy statistics by filtering them by segment attribute and segment value. Use the guidelines below to configure and view segmented analysis.

Configure segmented analysis

To use segmented analysis for service health, data drift, and accuracy, you must enable the following deployment settings:

Note

Only the deployment owner can configure these settings.

View segmented analysis

If you have enabled segmented analysis for your deployment and have made predictions, you can view statistics by segment. By default, statistics for a deployment are displayed without any segmentation. Two dropdown menus control segmented analysis: Segment Attribute and Segment Value.

Service health

Segmented analysis for service health uses the same fixed segment attributes for every deployment. The segment attributes represent the different ways in which prediction requests can be viewed. A segment value is a single value of the selected segment attribute that is present in one or more prediction requests. The available segment values depend on the segment attribute applied:

| Segment Attribute | Description | Segment Value | Example |
|---|---|---|---|
| DataRobot-Consumer | Segments prediction requests by the users of a deployment that have made prediction requests. | Each segment value is the email address of a user. | Segment Attribute: DataRobot-Consumer; Value: username@datarobot.com |
| DataRobot-Host-IP | Segments prediction requests by the IP address of the prediction server used to make prediction requests. | Each segment value is a unique IP address. | Segment Attribute: DataRobot-Host-IP; Value: 168.212.226.204 |
| DataRobot-Remote-IP | Segments prediction requests by the IP address of a caller (the machine used to make prediction requests). | Each segment value is a unique IP address. | Segment Attribute: DataRobot-Remote-IP; Value: 63.211.546.231 |

Select a segment attribute, then select a segment value for that attribute. When both are selected, the Service health tab automatically refreshes to display the statistics for the selected segment value.
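To make the effect of the two dropdowns concrete, the sketch below filters a small, made-up request log by one segment attribute and value, then recomputes a few service health statistics. It is a conceptual illustration only: the record fields `response_ms` and `error`, the sample data, and the function `service_health_for_segment` are assumptions for this example, not DataRobot's API or storage schema.

```python
from statistics import mean

# Hypothetical request log. Field names and values are illustrative only
# and do not reflect how DataRobot stores prediction requests.
REQUESTS = [
    {"DataRobot-Consumer": "ana@example.com", "DataRobot-Host-IP": "10.0.0.1",
     "DataRobot-Remote-IP": "192.0.2.10", "response_ms": 100, "error": False},
    {"DataRobot-Consumer": "ana@example.com", "DataRobot-Host-IP": "10.0.0.2",
     "DataRobot-Remote-IP": "192.0.2.10", "response_ms": 200, "error": True},
    {"DataRobot-Consumer": "bob@example.com", "DataRobot-Host-IP": "10.0.0.1",
     "DataRobot-Remote-IP": "192.0.2.20", "response_ms": 300, "error": False},
    {"DataRobot-Consumer": "bob@example.com", "DataRobot-Host-IP": "10.0.0.1",
     "DataRobot-Remote-IP": "192.0.2.20", "response_ms": 400, "error": False},
]


def service_health_for_segment(requests, attribute=None, value=None):
    """Summarize requests, optionally restricted to one segment value.

    With attribute=None, no segmentation is applied (the default view).
    """
    if attribute is not None:
        requests = [r for r in requests if r.get(attribute) == value]
    total = len(requests)
    if total == 0:
        return {"requests": 0, "error_rate": None, "mean_response_ms": None}
    return {
        "requests": total,
        "error_rate": sum(r["error"] for r in requests) / total,
        "mean_response_ms": mean(r["response_ms"] for r in requests),
    }


# Unsegmented view, then the view for a single consumer.
overall = service_health_for_segment(REQUESTS)
ana = service_health_for_segment(REQUESTS, "DataRobot-Consumer", "ana@example.com")
```

Selecting a different attribute, such as DataRobot-Host-IP, changes only which key the filter compares against. The statistics themselves are computed the same way on the smaller set of requests.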

Segment availability

The segment values that appear in the Segment Value dropdown menu are not dependent on the selected time range, monitoring type, or model ID.

Data drift and accuracy

Segmented analysis for data drift and accuracy supports custom attributes in addition to the fixed attributes available for every deployment. The segment attributes represent the different ways in which the data can be viewed. A segment value is a single value of the selected segment attribute that is present in one or more prediction requests. The available segment values depend on the segment attribute applied:

| Segment Attribute | Description | Segment Value | Example |
|---|---|---|---|
| DataRobot-Consumer | Segments prediction requests by the users of a deployment that have made prediction requests. | Each segment value is the email address of a user. | Segment Attribute: DataRobot-Consumer; Value: username@datarobot.com |
| Custom attribute | Segments based on a column in the training data that is indicated when configuring segmented analysis. For example, if your training data includes a "Country" column, you could select it as a custom attribute and segment the data by individual countries (which make up the segment values for the custom attribute). | Based on the segment attribute you provide. | Segment Attribute: "Country"; Value: "Spain" |
| None | Displays the data drift statistics without any segmentation. | All (no segmentation applied). | N/A |

Select a segment attribute, and then select a segment value for that attribute. When both are selected, the Data Drift tab automatically refreshes to display the statistics for the selected segment value.
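As a rough illustration of segmenting by a custom attribute, the sketch below treats a "Country" column as the segment attribute, lists its segment values, and compares a feature's mean between training and scoring data within each segment. The difference in means is a deliberately simple stand-in for a real drift metric, and the `age` feature, the sample rows, and both function names are hypothetical. None of this reflects how DataRobot computes data drift.

```python
from statistics import mean

# Hypothetical rows. "Country" plays the role of the custom segment attribute
# and "age" is an arbitrary numeric feature chosen for the example.
TRAINING = [
    {"Country": "Spain", "age": 30},
    {"Country": "Spain", "age": 40},
    {"Country": "France", "age": 50},
    {"Country": "France", "age": 60},
]
SCORING = [
    {"Country": "Spain", "age": 45},
    {"Country": "Spain", "age": 55},
    {"Country": "France", "age": 55},
]


def segment_values(rows, attribute):
    """Return the sorted distinct values of a segment attribute."""
    return sorted({row[attribute] for row in rows if attribute in row})


def mean_shift_by_segment(training, scoring, attribute, feature):
    """For each segment value, return scoring mean minus training mean.

    Segment values with no scoring rows are skipped.
    """
    shifts = {}
    for value in segment_values(training, attribute):
        train_vals = [r[feature] for r in training if r.get(attribute) == value]
        score_vals = [r[feature] for r in scoring if r.get(attribute) == value]
        if score_vals:
            shifts[value] = mean(score_vals) - mean(train_vals)
    return shifts


countries = segment_values(TRAINING, "Country")
shifts = mean_shift_by_segment(TRAINING, SCORING, "Country", "age")
```

In this toy data the overall shift hides the fact that only one country's values moved, which is the kind of pattern a per-segment view is meant to surface.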

Segment availability

The segment values that appear in the Segment Value dropdown menu are not dependent on the selected time range, monitoring type, or model ID.


Updated July 2, 2024