Assemble unstructured custom models

If your custom model doesn't use a target type supported by DataRobot, you can create an unstructured model. Unstructured models can use arbitrary (i.e., unstructured) data for input and output, allowing you to deploy and monitor models regardless of the target type. This characteristic of unstructured models gives you more control over how you read the data from a prediction request and response; however, it requires precise coding to assemble correctly. You must implement custom hooks to process the unstructured input data and generate a valid response.

Compare the characteristics and capabilities of the two types of custom models below:

Model type: Structured

Characteristics:
  • Uses a target type known to DataRobot (e.g., regression, binary classification, multiclass, and anomaly detection).
  • Required to conform to a request/response schema.
  • Accepts structured input and output data.
Capabilities:
  • Full deployment capabilities.
  • Accepts training data after deployment.

Model type: Unstructured

Characteristics:
  • Uses a custom target type, unknown to DataRobot.
  • Not required to conform to a request/response schema.
  • Accepts unstructured input and output data.
Capabilities:
  • Limited deployment capabilities. Doesn't support data drift and accuracy statistics, challenger models, or humility rules.
  • Doesn't accept training data after deployment.

Inference models support an unstructured mode, in which input and output are not verified and can be almost anything. It is your responsibility to verify correctness. For assembly instructions specific to unstructured custom inference models, see the Python and R model templates in the DRUM documentation.

Data format

When working with unstructured models, DataRobot supports data as a text or binary file.

Unstructured custom model hooks

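Include any necessary hooks in a file called custom.py for Python models or custom.R for R models, alongside your model artifacts in your model folder. The hooks available for unstructured models are init(), load_model(), and score_unstructured().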

Type annotations in hook signatures

The following hook signatures are written with Python 3 type annotations. The Python types match the following R types:

Python type | R type | Description
None | NULL | Nothing
str | character | String
bytes | raw | Raw bytes
dict | list | A list of key/value pairs.
tuple | list | A list of data.
Any | An R object | The deserialized model.
*args, **kwargs | ... | These are not types; they are placeholders for additional positional and keyword arguments.


The init hook is executed only once at the beginning of the run to allow the model to load libraries and additional files for use in other hooks.

init(**kwargs) -> None 

init() input

Input parameter | Description
**kwargs | Additional keyword arguments. code_dir provides a link, passed through the --code_dir parameter, to the folder where the model code is stored.


# Python (custom.py)
def init(code_dir, **kwargs):
    global g_code_dir
    g_code_dir = code_dir

# R (custom.R)
init <- function(...) {
    # Load libraries and additional files here.
}

init() output

The init() hook does not return anything.
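As a minimal sketch of how init() might load an additional file for use in other hooks, the example below reads an optional JSON file from the code directory into module-level state. The file name settings.json and the g_settings variable are hypothetical and used only for illustration; they are not part of DRUM.

```python
import json
import os

# Module-level state shared with the other hooks (hypothetical name).
g_settings = {}


def init(**kwargs):
    """Runs once at startup; loads an optional settings file from code_dir."""
    code_dir = kwargs.get("code_dir", ".")
    # "settings.json" is a hypothetical file name used only for illustration.
    settings_path = os.path.join(code_dir, "settings.json")
    if os.path.exists(settings_path):
        with open(settings_path, encoding="utf8") as f:
            g_settings.update(json.load(f))
```

Because the hook returns nothing, anything it loads has to be kept in module-level state (as here) for later hooks to read.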



The load_model() hook is executed only once at the beginning of the run to load one or more trained objects from multiple artifacts. It is required only when a trained object is stored in an artifact that uses an unsupported format, or when multiple artifacts are used. The load_model() hook is not required when there is a single artifact in one of the supported formats:

  • Python: .pkl, .pth, .h5, .joblib
  • Java: .mojo
  • R: .rds
load_model(code_dir: str) -> Any 

load_model() input

Input parameter | Description
code_dir A link, passed through the --code_dir parameter, to the directory where the model artifact and additional code are provided.


# Python (custom.py)
import os
import joblib

def load_model(code_dir):
    model_path = "model.pkl"
    return joblib.load(os.path.join(code_dir, model_path))

# R (custom.R)
load_model <- function(input_dir) {
    readRDS(file.path(input_dir, "model_name.rds"))
}

load_model() output

The load_model() hook returns the deserialized model (a trained object).
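For the multiple-artifact case, a load_model() hook could look roughly like the sketch below. It assumes two pickled artifacts; the file names vectorizer.pkl and classifier.pkl and the dictionary layout of the returned object are hypothetical choices for illustration.

```python
import os
import pickle


def load_model(code_dir: str):
    """Load two pickled artifacts and return them as one object."""
    # Both file names are hypothetical placeholders for your own artifacts.
    with open(os.path.join(code_dir, "vectorizer.pkl"), "rb") as f:
        vectorizer = pickle.load(f)
    with open(os.path.join(code_dir, "classifier.pkl"), "rb") as f:
        classifier = pickle.load(f)
    # The returned object is what score_unstructured() receives as `model`.
    return {"vectorizer": vectorizer, "classifier": classifier}
```

Returning a single container keeps the hook signature unchanged while still giving the scoring hook access to every loaded artifact.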



The score_unstructured() hook defines the output of a custom estimator and returns predictions on input data. Do not use this hook for transform models.

score_unstructured(model: Any, data: str/bytes, **kwargs: Dict[str, Any]) -> str/bytes [, Dict[str, str]] 


score_unstructured() input

Input parameter | Description
data | Data represented as str or bytes, depending on the provided mimetype.
model | A trained object loaded from the artifact by DataRobot or loaded through the load_model hook.
**kwargs | Additional keyword arguments, which include the following keys:
  • mimetype: str: Indicates the nature and format of the data, taken from the request Content-Type header or the --content-type CLI argument in batch mode.
  • charset: str: Indicates the encoding for text data, taken from the request Content-Type header or the --content-type CLI argument in batch mode.
  • query: dict: Parameters passed as query parameters in an HTTP request or through the --query CLI argument in batch mode.
  • headers: dict: Request headers passed in the HTTP request.


# Python (custom.py)
def score_unstructured(model, data, query, **kwargs):
    # data is bytes for binary mimetypes and str for text mimetypes.
    text_data = data.decode("utf8") if isinstance(data, bytes) else data
    text_data = text_data.strip()
    words_count = model.predict(text_data)
    return str(words_count)

# R (custom.R); stri_conv comes from stringi and str_count from stringr.
score_unstructured <- function(model, data, query, ...) {
    kwargs <- list(...)

    if (is.raw(data)) {
        data_text <- stri_conv(data, "utf8")
    } else {
        data_text <- data
    }
    count <- str_count(data_text, " ") + 1
    ret <- toString(count)
    ret
}


The score_unstructured() hook should return:

  • A single value: return data, where data is str or bytes.
  • A tuple: return data, kwargs, where data is str or bytes and kwargs is a dict[str, str] such as {"mimetype": "users/mimetype", "charset": "users/charset"}. Use kwargs to set the mimetype and charset of the Content-Type response header.
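To illustrate the tuple form, here is a minimal sketch of a score_unstructured() hook that returns JSON text together with a matching mimetype and charset. The word-count logic and the word_count key are invented for the example, and the model argument is unused.

```python
import json


def score_unstructured(model, data, **kwargs):
    """Count words in the input and return the result as JSON text."""
    # data arrives as str or bytes depending on the request mimetype.
    if isinstance(data, bytes):
        text = data.decode(kwargs.get("charset") or "utf8")
    else:
        text = data
    result = {"word_count": len(text.split())}
    # The second tuple element sets the Content-Type of the response.
    return json.dumps(result), {"mimetype": "application/json", "charset": "utf8"}
```

Returning only json.dumps(result) would also be valid; the response would then fall back to the default content type for str data.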

Unstructured model considerations

Incoming data type resolution

The score_unstructured hook receives a data parameter, which can be of either str or bytes type.

You can use type-checking methods to verify types:

  • Python: isinstance(data, str) or isinstance(data, bytes)

  • R: is.character(data) or is.raw(data)

DataRobot uses the Content-Type header to determine the type to cast data to. The Content-Type header can be provided in a request or in the --content-type CLI argument. The Content-Type header format is type/subtype;parameter (e.g., text/plain;charset=utf8). The following rules apply:

  • If charset is defined, it is used to decode the data; otherwise, the default utf8 charset is used.

  • If Content-Type is not defined, the incoming kwargs={"mimetype": "text/plain", "charset":"utf8"}, so data is treated as text, decoded using the utf8 charset, and passed as str.

  • If mimetype starts with text/ or application/json, data is treated as text, decoded using the provided charset, and passed as str.

  • For all other mimetype values, data is treated as binary and passed as bytes.
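The rules above can be sketched as a small function. This is an illustration of the documented behavior under simplified header parsing, not DataRobot's actual implementation; the function name resolve_incoming is hypothetical.

```python
def resolve_incoming(raw: bytes, content_type=None):
    """Mimic the documented rules; returns (data, kwargs)."""
    if not content_type:
        # No Content-Type: treat the payload as utf8 text.
        mimetype, charset = "text/plain", "utf8"
    else:
        parts = [p.strip() for p in content_type.split(";")]
        mimetype = parts[0].lower()
        charset = "utf8"  # default when no charset parameter is given
        for p in parts[1:]:
            key, _, value = p.partition("=")
            if key.strip().lower() == "charset" and value:
                charset = value.strip()
    is_text = mimetype.startswith("text/") or mimetype.startswith("application/json")
    # Text payloads are decoded to str; everything else stays as bytes.
    data = raw.decode(charset) if is_text else raw
    return data, {"mimetype": mimetype, "charset": charset}
```

For example, calling resolve_incoming(payload, "application/octet-stream") leaves the payload as bytes, which is the case the isinstance(data, bytes) check in a hook is meant to catch.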

Outgoing data and kwargs parameters

As mentioned above, score_unstructured can return:

  • A single data value: return data.

  • A tuple of data and additional parameters: return data, {"mimetype": "some/type", "charset": "some_charset"}.

Server mode

In server mode, the following rules apply:

  • return data: str: The data is treated as text, the default Content-Type="text/plain;charset=utf8" header is set in response, and data is encoded and sent using the utf8 charset.

  • return data: bytes: The data is treated as binary, the default Content-Type="application/octet-stream;charset=utf8" header is set in response, and data is sent as-is.

  • return data, kwargs: If the mimetype value is missing in kwargs, the default mimetype is set according to the data type (text/plain for str, application/octet-stream for bytes). If the charset value is missing, the default utf8 charset is set. Then, if the data is of type str, it is encoded using the resolved charset and sent.
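As a rough sketch of the server-mode rules, the function below turns a hook's return value into a response body and a Content-Type header value. The name build_response is hypothetical and the code only illustrates the documented defaults; it is not the server's actual implementation.

```python
def build_response(ret):
    """Return (body_bytes, content_type_header) per the server-mode rules."""
    if isinstance(ret, tuple):
        data, kwargs = ret
    else:
        data, kwargs = ret, {}
    kwargs = kwargs or {}
    # Default mimetype depends on whether the hook returned str or bytes.
    default_mimetype = "text/plain" if isinstance(data, str) else "application/octet-stream"
    mimetype = kwargs.get("mimetype") or default_mimetype
    charset = kwargs.get("charset") or "utf8"
    # str data is encoded with the resolved charset; bytes are sent as-is.
    body = data.encode(charset) if isinstance(data, str) else data
    return body, f"{mimetype};charset={charset}"
```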

Batch mode

The best way to debug in batch mode is to provide an output file with the --output argument. The returned data is written to a file according to the type of data returned:

  • str data is written to a text file using the charset returned in kwargs, or utf8 by default.

  • bytes data is written to a binary file.

The returned kwargs are not shown in batch mode, but you can still print them during debugging.
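The batch-mode behavior can be pictured with the following sketch, which writes a hook's return value to a file. The function name write_batch_output is hypothetical, and the code is an illustration of the rules above rather than the tool's own logic.

```python
def write_batch_output(ret, path):
    """Write a hook's return value to `path` following the batch-mode rules."""
    data, kwargs = ret if isinstance(ret, tuple) else (ret, {})
    if isinstance(data, str):
        # Text output: use the returned charset, falling back to utf8.
        charset = (kwargs or {}).get("charset") or "utf8"
        with open(path, "w", encoding=charset) as f:
            f.write(data)
    else:
        # Binary output is written unchanged.
        with open(path, "wb") as f:
            f.write(data)
```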


You can use datarobot_drum.RuntimeParameters in your code (e.g., in custom.py) to read runtime parameters delivered to the executed custom model. The runtime parameters should be defined in the DataRobot UI. Below is a simple example of how to read string and credential runtime parameters:

from datarobot_drum import RuntimeParameters

def load_model(code_dir):
    # String runtime parameter.
    target_url = RuntimeParameters.get("TARGET_URL")
    # Credential runtime parameter.
    s3_creds = RuntimeParameters.get("AWS_CREDENTIAL")

Updated May 3, 2023