Segmented analysis
Segmented analysis identifies operational issues with training and prediction data requests for a deployment. DataRobot enables drill-down analysis of service health, data drift, and accuracy statistics by filtering them by segment attribute and segment value. Use the guidelines below to configure and view segmented analysis.
Configure segmented analysis
To use segmented analysis for service health, data drift, and accuracy, you must enable the following deployment settings:
- Enable target monitoring (required to enable data drift and accuracy tracking)
- Enable feature drift tracking (required to enable data drift tracking)
- Track attributes for segmented analysis of training data and predictions (required to enable segmented analysis for service health, data drift, and accuracy)
Note
Only the deployment owner can configure these settings.
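The dependencies between the three settings can be sketched as a small check. This is an illustrative model only, not DataRobot code: the `DeploymentSettings` class, its field names, and `available_segmented_views` are hypothetical names invented here, and the rules simply restate the parenthetical requirements listed above.

```python
from dataclasses import dataclass


@dataclass
class DeploymentSettings:
    # Hypothetical flags mirroring the three deployment settings listed above.
    target_monitoring: bool = False
    feature_drift_tracking: bool = False
    track_segment_attributes: bool = False


def available_segmented_views(settings: DeploymentSettings) -> set[str]:
    """Return the views that could show segmented statistics for these settings."""
    views: set[str] = set()
    # Attribute tracking is required for every kind of segmented analysis.
    if not settings.track_segment_attributes:
        return views
    views.add("service_health")
    # Target monitoring is required for both accuracy and data drift.
    if settings.target_monitoring:
        views.add("accuracy")
        # Data drift additionally requires feature drift tracking.
        if settings.feature_drift_tracking:
            views.add("data_drift")
    return views


if __name__ == "__main__":
    print(available_segmented_views(DeploymentSettings(True, True, True)))
```

With only attribute tracking enabled, the sketch reports service health as the sole segmented view, which matches the requirements as stated.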
View segmented analysis
If you have enabled segmented analysis for your deployment and have made predictions, you can access various statistics by segment. By default, statistics for a deployment are displayed without any segmentation. Two dropdown menus control segmented analysis: Segment Attribute and Segment Value.
Service health
Segmented analysis for service health uses the same fixed segment attributes for every deployment. The segment attributes represent the different ways in which prediction requests can be viewed. A segment value is a single value of the selected segment attribute that is present in one or more prediction requests. The available segment values depend on the segment attribute applied:
Segment Attribute | Description | Segment Value | Example |
---|---|---|---|
DataRobot-Consumer | Segments prediction requests by the users of a deployment that have made prediction requests. | Each segment value is the email address of a user. | Segment Attribute: DataRobot-Consumer Value: username@datarobot.com |
DataRobot-Host-IP | Segments prediction requests by the IP address of the prediction server used to make prediction requests. | Each segment value is a unique IP address. | Segment Attribute: DataRobot-Host-IP Value: 168.212.226.204 |
DataRobot-Remote-IP | Segments prediction requests by the IP address of a caller (the machine used to make prediction requests). | Each segment value is a unique IP address. | Segment Attribute: DataRobot-Remote-IP Value: 63.211.546.231 |
Select a segment attribute, then select a segment value for that attribute. When both are selected, the Service health tab automatically refreshes to display the statistics for the selected segment value.
Segment availability
The segment values that appear in the Segment Value dropdown menu are not dependent on the selected time range, monitoring type, or model ID.
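As a conceptual illustration of what selecting a segment attribute and value does, the sketch below filters a small in-memory request log and summarizes the matching requests. It is not how DataRobot computes service health: the record layout, the `response_ms` field, the sample addresses, and the `service_health_for_segment` helper are all assumptions made for this example. Only the three attribute names come from the table above.

```python
from statistics import mean

# The fixed segment attributes available for service health.
FIXED_ATTRIBUTES = ("DataRobot-Consumer", "DataRobot-Host-IP", "DataRobot-Remote-IP")

# Hypothetical request log; field layout and values are illustrative only.
requests = [
    {"DataRobot-Consumer": "username@datarobot.com", "DataRobot-Host-IP": "192.0.2.10",
     "DataRobot-Remote-IP": "198.51.100.7", "response_ms": 120},
    {"DataRobot-Consumer": "username@datarobot.com", "DataRobot-Host-IP": "192.0.2.11",
     "DataRobot-Remote-IP": "198.51.100.7", "response_ms": 80},
    {"DataRobot-Consumer": "other@datarobot.com", "DataRobot-Host-IP": "192.0.2.10",
     "DataRobot-Remote-IP": "203.0.113.4", "response_ms": 200},
]


def service_health_for_segment(records, attribute, value):
    """Summarize only the requests whose segment attribute equals the chosen value."""
    if attribute not in FIXED_ATTRIBUTES:
        raise ValueError(f"Unknown service health segment attribute: {attribute}")
    selected = [r for r in records if r[attribute] == value]
    return {
        "total_requests": len(selected),
        # No requests in the segment means there is nothing to average.
        "mean_response_ms": mean(r["response_ms"] for r in selected) if selected else None,
    }


if __name__ == "__main__":
    print(service_health_for_segment(requests, "DataRobot-Consumer", "username@datarobot.com"))
```

Changing either the attribute or the value changes which requests are included, which is the same effect as the tab refreshing after a new dropdown selection.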
Data drift and accuracy
Segmented analysis for data drift and accuracy supports custom attributes in addition to the fixed attributes available for every deployment. The segment attributes represent the different ways in which the data can be viewed. A segment value is a single value of the selected segment attribute that is present in one or more prediction requests. The available segment values depend on the segment attribute applied:
Segment Attribute | Description | Segment Value | Example |
---|---|---|---|
DataRobot-Consumer | Segments prediction requests by the users of a deployment that have made prediction requests. | Each segment value is the email address of a user. | Segment Attribute: DataRobot-Consumer Value: username@datarobot.com |
Custom attribute | Segments based on a column in the training data that is indicated when configuring segmented analysis. For example, if your training data includes a "Country" column, you could select it as a custom attribute and segment the data by individual countries (which make up the segment values for the custom attribute). | Based on the segment attribute you provide. | Segment Attribute: "Country" Value: "Spain" |
None | Displays the data drift statistics without any segmentation. | All (no segmentation applied). | N/A |
Select a segment attribute, then select a segment value for that attribute. When both are selected, the Data Drift or Accuracy tab you are viewing automatically refreshes to display the statistics for the selected segment value.
Segment availability
The segment values that appear in the Segment Value dropdown menu are not dependent on the selected time range, monitoring type, or model ID.
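The relationship between a custom attribute, its segment values, and the None option can be sketched with plain Python. This is a toy model under stated assumptions, not DataRobot's implementation: the sample rows, the `amount` column, and the `segment_values` and `select_segment` helpers are hypothetical. The "Country" attribute and "Spain" value follow the example in the table above.

```python
# Hypothetical prediction rows; "Country" plays the role of a custom attribute.
rows = [
    {"Country": "Spain", "amount": 10.0},
    {"Country": "France", "amount": 25.0},
    {"Country": "Spain", "amount": 40.0},
]


def segment_values(data, attribute):
    """List the distinct values that a Segment Value dropdown could offer."""
    if attribute is None:
        # With no segment attribute, the only choice is the unsegmented view.
        return ["All"]
    return sorted({row[attribute] for row in data})


def select_segment(data, attribute=None, value=None):
    """Return the rows belonging to one segment, or every row when unsegmented."""
    if attribute is None:
        return list(data)
    return [row for row in data if row[attribute] == value]


if __name__ == "__main__":
    print(segment_values(rows, "Country"))
    print(select_segment(rows, "Country", "Spain"))
```

The helpers do not filter by time range, consistent with the note that segment value availability does not depend on the selected time range.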