If your custom model doesn't use a target type supported by DataRobot, you can create an unstructured model. Unstructured models can use arbitrary (i.e., unstructured) data for input and output, allowing you to deploy and monitor models regardless of the target type. This characteristic of unstructured models gives you more control over how you read the data from a prediction request and response; however, it requires precise coding to assemble correctly. You must implement custom hooks to process the unstructured input data and generate a valid response.
Compare the characteristics and capabilities of the two types of custom models below:
Model type: Structured
Characteristics: Uses a target type known to DataRobot (e.g., regression, binary classification, multiclass, and anomaly detection).
Model type: Unstructured
Characteristics: Not required to conform to a request/response schema. Accepts unstructured input and output data.
Capabilities: Limited deployment capabilities. Doesn't support data drift and accuracy statistics, challenger models, or humility rules. Doesn't accept training data after deployment.
Inference models support unstructured mode, where input and output are not verified and can be almost anything. It is your responsibility to verify correctness. For assembly instructions specific to unstructured custom inference models, refer to the model templates for Python and R provided in the DRUM documentation.
Data format
When working with unstructured models, DataRobot supports data as a text or binary file.
The load_model() hook is executed only once at the beginning of the run to load one or more trained objects from multiple artifacts. It is not required when there is a single artifact in one of the supported formats; it is only required when a trained object is stored in an artifact that uses an unsupported format or when multiple artifacts are used.
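As a rough illustration of the multi-artifact case, the sketch below assumes the hook receives the model directory as a single `code_dir` argument (the signature used in the DRUM templates, not stated in this section) and that the directory holds two hypothetical artifacts, `vocab.json` and `weights.pkl`. It is one possible layout, not a required one.

```python
import json
import os
import pickle


def load_model(code_dir):
    """Load several trained objects once, at the start of the run.

    Assumptions: `code_dir` is the directory containing the artifacts,
    and both file names below are hypothetical.
    """
    with open(os.path.join(code_dir, "vocab.json"), "r", encoding="utf8") as f:
        vocab = json.load(f)
    with open(os.path.join(code_dir, "weights.pkl"), "rb") as f:
        weights = pickle.load(f)
    # Bundle everything into one object so the scoring hook can use it.
    return {"vocab": vocab, "weights": weights}
```

Returning a single dictionary keeps all loaded objects together, so a scoring hook only has to deal with one model argument.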
The score_unstructured() hook defines the output of a custom estimator and returns predictions on input data. Do not use this hook for transform models.
The hook can also return a tuple of data (str or bytes) and kwargs (dict[str, str]), where kwargs = {"mimetype": "users/mimetype", "charset": "users/charset"} supplies the mimetype and charset for the Content-Type response header.
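A minimal sketch of the tuple form follows. It assumes the hook is called as `score_unstructured(model, data, **kwargs)`, which matches the DRUM templates but is not spelled out in this section, and the JSON payload it builds is purely illustrative.

```python
import json


def score_unstructured(model, data, **kwargs):
    """Return a JSON body together with Content-Type hints.

    Assumptions: the (model, data, **kwargs) signature, and a made-up
    payload that only reports the size of the input.
    """
    size = len(data)  # characters for str input, bytes for bytes input
    body = json.dumps({"input_size": size})
    # The second tuple element carries the mimetype and charset
    # to be used for the Content-Type response header.
    return body, {"mimetype": "application/json", "charset": "utf8"}
```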
The score_unstructured hook receives a data parameter, which can be of either str or bytes type.
You can use type-checking methods to verify types:
Python: isinstance(data, str) or isinstance(data, bytes)
R: is.character(data) or is.raw(data)
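The Python check can be used to branch inside the hook, roughly as sketched here. The `(model, data, **kwargs)` signature is an assumption taken from the DRUM templates, and the word-count and byte-count "predictions" are placeholders for real model logic.

```python
def score_unstructured(model, data, **kwargs):
    """Handle text and binary input differently (illustrative logic only)."""
    if isinstance(data, str):
        # Text input: count whitespace-separated words.
        return str(len(data.split()))
    if isinstance(data, bytes):
        # Binary input: report the payload length in bytes.
        return str(len(data))
    # Anything else is unexpected for an unstructured request.
    raise TypeError(f"expected str or bytes, got {type(data).__name__}")
```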
DataRobot uses the Content-Type header to determine the type to cast data to. The Content-Type header can be provided in a request or in the --content-type CLI argument.
The Content-Type header format is type/subtype;parameter (e.g., text/plain;charset=utf8). The following rules apply:
If charset is not defined, the default utf8 charset is used; otherwise, the provided charset is used to decode data.
If Content-Type is not defined, the incoming kwargs={"mimetype": "text/plain", "charset": "utf8"}, so data is treated as text, decoded using the utf8 charset, and passed as str.
If mimetype starts with text/ or application/json, data is treated as text, decoded using the provided charset, and passed as str.
For all other mimetype values, data is treated as binary and passed as bytes.
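The rules above can be restated as a small helper. This is an illustrative re-implementation for reasoning about what a hook will receive, not DataRobot's actual code; the function name `cast_request_data` is hypothetical.

```python
def cast_request_data(raw, content_type=None):
    """Apply the casting rules to a raw request body (bytes).

    Returns (data, kwargs), where data is str or bytes and kwargs holds
    the resolved mimetype and charset. Hypothetical helper, not a DRUM API.
    """
    # Defaults used when Content-Type (or its charset parameter) is missing.
    mimetype, charset = "text/plain", "utf8"
    if content_type:
        parts = [p.strip() for p in content_type.split(";")]
        mimetype = parts[0].lower()
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "charset" and value:
                charset = value.strip()
    kwargs = {"mimetype": mimetype, "charset": charset}
    # Text-like mimetypes are decoded to str; everything else stays bytes.
    if mimetype.startswith("text/") or mimetype.startswith("application/json"):
        return raw.decode(charset), kwargs
    return raw, kwargs
```

For example, a body sent with `image/png` reaches the hook unchanged as bytes, while the same call without a Content-Type is decoded as utf8 text.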
return data: str: The data is treated as text, the default Content-Type="text/plain;charset=utf8" header is set in the response, and data is encoded and sent using the utf8 charset.
return data: bytes: The data is treated as binary, the default Content-Type="application/octet-stream;charset=utf8" header is set in the response, and data is sent as-is.
return data, kwargs: If the mimetype value is missing in kwargs, the default mimetype is set according to the data type (str -> text/plain, bytes -> application/octet-stream). If the charset value is missing, the default utf8 charset is set; then, if the data is of type str, it is encoded using the resolved charset and sent.
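The three return forms can be modeled with one helper, sketched below. It is an illustrative restatement of the rules, not DataRobot's implementation, and `build_response` is a hypothetical name.

```python
def build_response(result):
    """Turn a hook's return value into (body_bytes, content_type_header).

    Hypothetical helper that mirrors the response rules described above.
    """
    if isinstance(result, tuple):
        data, kwargs = result
    else:
        data, kwargs = result, {}
    kwargs = kwargs or {}
    # Missing mimetype: pick the default that matches the data type.
    default_mimetype = "text/plain" if isinstance(data, str) else "application/octet-stream"
    mimetype = kwargs.get("mimetype") or default_mimetype
    # Missing charset: fall back to utf8.
    charset = kwargs.get("charset") or "utf8"
    # Only str data is encoded; bytes are sent as-is.
    body = data.encode(charset) if isinstance(data, str) else data
    return body, f"{mimetype};charset={charset}"
```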
You can use datarobot_drum.RuntimeParameters in your code (e.g., custom.py) to read runtime parameters delivered to the executed custom model. The runtime parameters should be defined in the DataRobot UI. Below is a simple example of how to read string and credential runtime parameters:
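A minimal sketch of such a read, assuming `RuntimeParameters.get()` looks a parameter up by name and that a credential comes back as a dictionary-like value whose fields depend on the credential type. The parameter names `TARGET_URL` and `AWS_CREDENTIAL` are hypothetical and would need to match what is defined for the model.

```python
def load_model(code_dir):
    """Read runtime parameters while loading the model (illustrative only)."""
    # Imported inside the hook so this file can be loaded where DRUM is
    # not installed; a top-level import works equally well in custom.py.
    from datarobot_drum import RuntimeParameters

    # Hypothetical names: a string parameter and a credential parameter.
    target_url = RuntimeParameters.get("TARGET_URL")
    s3_credential = RuntimeParameters.get("AWS_CREDENTIAL")

    # Keep the values with the model so later hooks can reach them.
    return {"target_url": target_url, "credential": s3_credential}
```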