# Define custom metrics
The `MetricBase` class provides an interface for defining custom metrics. Four additional default classes can help you create custom metrics: `ModelMetricBase`, `DataMetricBase`, `LLMMetricBase`, and `SklearnMetric`.
## Create a metric base
In `MetricBase`, define the types of data a metric requires; the custom metric inherits that definition:
```python
import numpy as np
import pandas as pd


class MetricBase(object):
    def __init__(
        self,
        name: str,
        description: str = None,
        need_predictions: bool = False,
        need_actuals: bool = False,
        need_scoring_data: bool = False,
        need_training_data: bool = False,
    ):
        self.name = name
        self.description = description
        self._need_predictions = need_predictions
        self._need_actuals = need_actuals
        self._need_scoring_data = need_scoring_data
        self._need_training_data = need_training_data
```
In addition, you must implement the scoring and reduction methods in `MetricBase`:

- Scoring (`score`): Uses the initialized data types to calculate a metric.
- Reduction (`reduce_func`): Reduces multiple values in the same `TimeBucket` to one value.
```python
def score(
    self,
    scoring_data: pd.DataFrame,
    predictions: np.array,
    actuals: np.array,
    fit_ctx=None,
    metadata=None,
) -> float:
    raise NotImplementedError


def reduce_func(self) -> callable:
    return np.mean
```
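For instance, here is a minimal sketch of a metric built directly on `MetricBase`, one that needs only predictions. The class name, threshold, and import path are illustrative assumptions, not part of the library:

```python
import numpy as np
import pandas as pd

from dmm.metric import MetricBase  # assumed import path; adjust to your installation


class HighPredictionRate(MetricBase):
    """Illustrative metric: the fraction of predictions above a threshold."""

    def __init__(self, threshold: float = 0.8):
        super().__init__(
            name="high_prediction_rate",
            description="Share of predictions above the threshold",
            need_predictions=True,
        )
        self.threshold = threshold

    def score(
        self,
        scoring_data: pd.DataFrame = None,
        predictions: np.array = None,
        actuals: np.array = None,
        fit_ctx=None,
        metadata=None,
    ) -> float:
        # Fraction of predictions strictly above the configured threshold.
        return float(np.mean(np.asarray(predictions) > self.threshold))
```

The inherited `reduce_func` (`np.mean`) then averages these per-bucket values, which is usually the right aggregation for a rate metric.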
## Create metrics calculated with predictions and actuals
`ModelMetricBase` is the base class for metrics that require actuals and predictions for metric calculation.
```python
class ModelMetricBase(MetricBase):
    def __init__(
        self, name: str, description: str = None, need_training_data: bool = False
    ):
        super().__init__(
            name=name,
            description=description,
            need_scoring_data=False,
            need_predictions=True,
            need_actuals=True,
            need_training_data=need_training_data,
        )

    def score(
        self,
        prediction: np.array,
        actuals: np.array,
        fit_context=None,
        metadata=None,
        scoring_data=None,
    ) -> float:
        raise NotImplementedError
```
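As a sketch, a mean-absolute-error metric on this base might look like the following. The class itself is illustrative, and its `score` signature mirrors the one shown above, which can vary between library versions:

```python
import numpy as np

from dmm.metric import ModelMetricBase  # assumed import path; adjust to your installation


class MeanAbsoluteError(ModelMetricBase):
    """Illustrative metric: mean absolute difference between predictions and actuals."""

    def __init__(self):
        super().__init__(
            name="mean_absolute_error",
            description="Average absolute difference between predictions and actuals",
        )

    def score(
        self,
        prediction: np.array,
        actuals: np.array,
        fit_context=None,
        metadata=None,
        scoring_data=None,
    ) -> float:
        # Elementwise absolute error, averaged over the batch.
        return float(np.mean(np.abs(np.asarray(actuals) - np.asarray(prediction))))
```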
## Create metrics calculated with scoring data
`DataMetricBase` is the base class for metrics that require scoring data for metric calculation.
```python
class DataMetricBase(MetricBase):
    def __init__(
        self, name: str, description: str = None, need_training_data: bool = False
    ):
        super().__init__(
            name=name,
            description=description,
            need_scoring_data=True,
            need_predictions=False,
            need_actuals=False,
            need_training_data=need_training_data,
        )

    def score(
        self,
        scoring_data: pd.DataFrame,
        fit_ctx=None,
        metadata=None,
        predictions=None,
        actuals=None,
    ) -> float:
        raise NotImplementedError
```
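For example, a data-quality metric on this base could be sketched as follows; the class and its logic are illustrative, not part of the library:

```python
import pandas as pd

from dmm.metric import DataMetricBase  # assumed import path; adjust to your installation


class MissingValueRate(DataMetricBase):
    """Illustrative metric: fraction of missing cells in the scoring data."""

    def __init__(self):
        super().__init__(
            name="missing_value_rate",
            description="Share of missing values across all scoring-data cells",
        )

    def score(
        self,
        scoring_data: pd.DataFrame,
        fit_ctx=None,
        metadata=None,
        predictions=None,
        actuals=None,
    ) -> float:
        # isna() marks missing cells; the mean over all cells is the missing rate.
        return float(scoring_data.isna().to_numpy().mean())
```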
## Create LLM metrics
`LLMMetricBase` is the base class for LLM metrics that require scoring data and predictions for metric calculation, otherwise known as prompts (the user input) and completions (the LLM response).
```python
class LLMMetricBase(MetricBase):
    def __init__(
        self, name: str, description: str = None, need_training_data: bool = False
    ):
        super().__init__(
            name=name,
            description=description,
            need_scoring_data=True,
            need_predictions=True,
            need_actuals=False,
            need_training_data=need_training_data,
        )

    def score(
        self,
        scoring_data: pd.DataFrame,
        predictions: np.array,
        fit_ctx=None,
        metadata=None,
        actuals=None,
    ) -> float:
        raise NotImplementedError
```
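For instance, an illustrative completion-length metric on this base might look like the sketch below; the class is an assumption for demonstration. Prompts arrive in `scoring_data` and completions in `predictions`:

```python
import numpy as np
import pandas as pd

from dmm.metric import LLMMetricBase  # assumed import path; adjust to your installation


class CompletionWordCount(LLMMetricBase):
    """Illustrative metric: average word count of LLM completions."""

    def __init__(self):
        super().__init__(
            name="completion_word_count",
            description="Average number of words per completion",
        )

    def score(
        self,
        scoring_data: pd.DataFrame,  # the prompts
        predictions: np.array,       # the completions
        fit_ctx=None,
        metadata=None,
        actuals=None,
    ) -> float:
        # Whitespace-split word count, averaged over the batch of completions.
        return float(np.mean([len(str(c).split()) for c in predictions]))
```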
## Create Sklearn metrics
To accelerate the implementation of custom metrics, you can use ready-made, proven metrics from Sklearn. To create a custom metric, provide the name of an Sklearn metric and use the `SklearnMetric` class as the base class. For example:
```python
from dmm.metric.sklearn_metric import SklearnMetric


class MedianAbsoluteError(SklearnMetric):
    """
    Metric that calculates the median absolute error of the difference
    between predictions and actuals.
    """

    def __init__(self):
        super().__init__(
            metric="median_absolute_error",
        )
```
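The same pattern should carry over to other Sklearn metrics; the `metric` string appears to name the corresponding function in `sklearn.metrics`, though that mapping is an assumption here. An illustrative second wrapper:

```python
from dmm.metric.sklearn_metric import SklearnMetric


class MeanSquaredError(SklearnMetric):
    """Illustrative wrapper around sklearn's mean_squared_error."""

    def __init__(self):
        super().__init__(
            metric="mean_squared_error",
        )
```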
## Report custom metric values
The classes described above provide the custom metric definitions. Use the `CustomMetric` interface to retrieve the metadata of an existing custom metric in DataRobot and to report data to that custom metric. Initialize the metric by providing the parameters explicitly (`metric_id`, `deployment_id`, `model_id`, `dr.Client()`):
```python
from dmm.custom_metric import CustomMetric

cm = CustomMetric.from_id(
    metric_id=METRIC_ID,
    deployment_id=DEPLOYMENT_ID,
    model_id=MODEL_ID,
    client=CLIENT,
)
```
You can also define these parameters as environment variables:
| Parameter | Environment variable |
|---|---|
| `metric_id` | `os.environ["CUSTOM_METRIC_ID"]` |
| `deployment_id` | `os.environ["DEPLOYMENT_ID"]` |
| `model_id` | `os.environ["MODEL_ID"]` |
| `dr.Client()` | `os.environ["BASE_URL"]` and `os.environ["DATAROBOT_ENDPOINT"]` |
```python
from dmm.custom_metric import CustomMetric

# With the environment variables above set, no arguments are needed.
cm = CustomMetric.from_id()
```
Optionally, specify batch mode (`is_batch=True`):
```python
from dmm.custom_metric import CustomMetric

cm = CustomMetric.from_id(is_batch=True)
```
The `report` method submits custom metric values to a custom metric defined in DataRobot. To use this method, report a DataFrame in the shape of the output from the metric evaluator.
```python
print(aggregated_metric_per_time_bucket.to_string())
#                     timestamp  samples  median_absolute_error
# 1  01/06/2005 14:00:00.000000        2                  0.001

response = cm.report(df=aggregated_metric_per_time_bucket)
print(response.status_code)
# 202
```
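If you compute values yourself instead of using the metric evaluator, you can assemble a DataFrame of the same shape before reporting. A minimal sketch, assuming the column names match the evaluator output shown above:

```python
import pandas as pd

# One row per time bucket: timestamp, sample count, and the metric value.
aggregated_metric_per_time_bucket = pd.DataFrame(
    {
        "timestamp": ["01/06/2005 14:00:00.000000"],
        "samples": [2],
        "median_absolute_error": [0.001],
    }
)

response = cm.report(df=aggregated_metric_per_time_bucket)
```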
The `dry_run` parameter determines whether the custom metric values transfer is a dry run (the values aren't saved in the database) or a production data transfer. This parameter is `False` by default (the values are saved).
```python
response = cm.report(df=aggregated_metric_per_time_bucket, dry_run=True)
print(response.status_code)
# 202
```