# Assemble unstructured custom models

> Assemble unstructured custom models - Unstructured models can use arbitrary data for input and
> output, allowing you to deploy and monitor models regardless of the target type.

This Markdown file sits beside the HTML page at the same path (with a `.md` suffix). It summarizes the topic and lists links for tools and LLM context.

Companion generated at `2026-04-24T16:03:56.257897+00:00` (UTC).

## Primary page

- [Assemble unstructured custom models](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html): Full documentation for this topic (HTML).

## Sections on this page

- [Unstructured custom model hooks](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#unstructured-custom-model-hooks): In-page section heading.
- [init()](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#init): In-page section heading.
- [init() input](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#init-input): In-page section heading.
- [init() example](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#init-example): In-page section heading.
- [init() output](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#init-output): In-page section heading.
- [load_model()](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#load-model): In-page section heading.
- [load_model() input](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#load-model-input): In-page section heading.
- [load_model() example](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#load-model-example): In-page section heading.
- [load_model() output](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#load-model-output): In-page section heading.
- [score_unstructured()](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#score-unstructured): In-page section heading.
- [score_unstructured() input](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#score-input): In-page section heading.
- [score_unstructured() examples](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#score-unstructured-examples): In-page section heading.
- [score_unstructured() output](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#score-unstructured-output): In-page section heading.
- [Unstructured model considerations](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#unstructured-model-considerations): In-page section heading.
- [Incoming data type resolution](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#incoming-data-type-resolution): In-page section heading.
- [Outgoing data and kwargs parameters](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#outgoing-data-and-kwargs-parameters): In-page section heading.
- [Server mode](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#server-mode): In-page section heading.
- [Batch mode](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#batch-mode): In-page section heading.
- [Auxiliaries](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#auxiliaries): In-page section heading.

## Related documentation

- [Developer documentation](https://docs.datarobot.com/en/docs/api/index.html): Linked from this page.
- [Code-first tools](https://docs.datarobot.com/en/docs/api/code-first-tools/index.html): Linked from this page.
- [DataRobot User Models](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/index.html): Linked from this page.
- [structured input and output data](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/structured-custom-models.html#structured-custom-model-requirements): Linked from this page.

## Documentation content

# Assemble unstructured custom models

If your custom model doesn't use a target type supported by DataRobot, you can create an unstructured model. Unstructured models can use arbitrary (i.e., unstructured) data for input and output, allowing you to deploy and monitor models regardless of the target type. This gives you more control over how you read the data from a prediction request and how you build the response; however, it requires precise coding to assemble correctly. You must implement [custom hooks to process the unstructured input data](https://docs.datarobot.com/en/docs/api/code-first-tools/drum/unstructured-custom-models.html#unstructured-custom-model-hooks) and generate a valid response.

Compare the characteristics and capabilities of the two types of custom models below:

| Model type | Characteristics | Capabilities |
| --- | --- | --- |
| Structured | Uses a target type known to DataRobot (e.g., regression, binary classification, multiclass, and anomaly detection). Required to conform to a request/response schema. Accepts structured input and output data. | Full deployment capabilities. Accepts training data after deployment. |
| Unstructured | Uses a custom target type, unknown to DataRobot. Not required to conform to a request/response schema. Accepts unstructured input and output data. | Limited deployment capabilities. Doesn't support data drift and accuracy statistics, challenger models, or humility rules. Doesn't accept training data after deployment. |

Inference models support unstructured mode, where input and output are not verified and can be almost anything. It is your responsibility to verify their correctness. For assembly instructions specific to unstructured custom inference models, reference the model templates for [Python](https://github.com/datarobot/datarobot-user-models/tree/master/model_templates/python3_unstructured) and [R](https://github.com/datarobot/datarobot-user-models/tree/master/model_templates/r_unstructured) provided in the DRUM documentation.

> [!NOTE] Data format
> When working with unstructured models, DataRobot supports data as a text or binary file.

## Unstructured custom model hooks

Include any necessary hooks in a file called `custom.py` for Python models or `custom.R` for R models alongside your model artifacts in your model folder.

> [!WARNING] Include all required custom model code in hooks
> Custom model hooks are callbacks passed to the custom model. All code required by the custom model must be in a custom model hook—the custom model can't access any code provided outside a defined custom model hook. In addition, you can't modify the input arguments of these hooks as they are predefined.

### init()

The `init` hook is executed only once at the beginning of the run to allow the model to load libraries and additional files for use in other hooks.

```
init(**kwargs) -> None
```

#### init() input

| Input parameter | Description |
| --- | --- |
| `**kwargs` | Additional keyword arguments. `code_dir` provides a link, passed through the `--code_dir` parameter, to the folder where the model code is stored. |

#### init() example

**Python:**
```
def init(code_dir):
    global g_code_dir
    g_code_dir = code_dir
```

**R:**
```
init <- function(...) {
    library(brnn)
    library(glmnet)
}
```


#### init() output

The `init()` hook does not return anything.

### load_model()

The `load_model()` hook is executed only once at the beginning of the run to load one or more trained objects from multiple artifacts. It is only required when a trained object is stored in an artifact that uses an unsupported format or when multiple artifacts are used. The `load_model()` hook is not required when there is a single artifact in one of the supported formats:

- Python: `.pkl`, `.pth`, `.h5`, `.joblib`
- Java: `.mojo`
- R: `.rds`

```
load_model(code_dir: str) -> Any
```

#### load_model() input

| Input parameter | Description |
| --- | --- |
| `code_dir` | A link, passed through the `--code_dir` parameter, to the directory where the model artifact and additional code are provided. |

#### load_model() example

**Python:**
```
import os
import joblib

def load_model(code_dir):
    # Load the serialized artifact from the model folder.
    model_path = "model.pkl"
    model = joblib.load(os.path.join(code_dir, model_path))
    return model
```

**R:**
```
load_model <- function(input_dir) {
    readRDS(file.path(input_dir, "model_name.rds"))
}
```


#### load_model() output

The `load_model()` hook returns a trained object (of any type).
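
When a model is split across several artifacts, `load_model()` can bundle them into a single returned object. The following is a minimal sketch of that case, not a prescribed layout: the file names `vectorizer.pkl` and `classifier.pkl` and the dictionary keys are hypothetical.

```python
import os
import pickle


def load_model(code_dir):
    """Load two artifacts and return them together as one object."""
    # Hypothetical file names; substitute the artifacts in your model folder.
    with open(os.path.join(code_dir, "vectorizer.pkl"), "rb") as f:
        vectorizer = pickle.load(f)
    with open(os.path.join(code_dir, "classifier.pkl"), "rb") as f:
        classifier = pickle.load(f)
    # The returned object is what score_unstructured() receives as `model`.
    return {"vectorizer": vectorizer, "classifier": classifier}
```

Returning a dictionary keeps each artifact addressable by name in the scoring hook; a tuple or a small wrapper class would work equally well.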

### score_unstructured()

The `score_unstructured()` hook defines the output of a custom estimator and returns predictions on input data. Do not use this hook for transform models.

```
score_unstructured(model: Any, data: str/bytes, **kwargs: Dict[str, Any]) -> str/bytes [, Dict[str, str]]
```

#### score_unstructured() input

| Input parameter | Description |
| --- | --- |
| `data` | Data represented as `str` or `bytes`, depending on the provided `mimetype`. |
| `model` | A trained object loaded from the artifact by DataRobot or loaded through the `load_model` hook. |
| `**kwargs` | Additional keyword arguments, which contain the following keys. `mimetype: str` indicates the nature and format of the data, taken from the request `Content-Type` header or the `--content-type` CLI argument in batch mode. `charset: str` indicates the encoding for text data, taken from the request `Content-Type` header or the `--content-type` CLI argument in batch mode. `query: dict` contains the parameters passed as query parameters in an HTTP request or through the `--query` CLI argument in batch mode. `headers: dict` contains the request headers passed in the HTTP request. |

#### score_unstructured() examples

**Python:**
The following example processes text input, decodes bytes if necessary, and returns a prediction as a string:

```
def score_unstructured(model, data, query, **kwargs):
    text_data = data.decode("utf8") if isinstance(data, bytes) else data
    text_data = text_data.strip()
    words_count = model.predict(text_data)
    return str(words_count)
```

```
curl -X POST "$DATAROBOT_ENDPOINT/api/v2/deployments/<deploymentId>/predictionsUnstructured/" \
  -H "Authorization: Bearer $DATAROBOT_API_TOKEN" \
  -H "Content-Type: text/plain" \
  -d "This is sample text input"

# Expected response:
5
```

The following example demonstrates parsing JSON input and returning JSON output with the appropriate `Content-Type` header:

```
import json

def load_model(code_dir):
    """Required when no model artifact (.pkl, .h5, etc.) is present."""
    return True

def score_unstructured(model, data, query, **kwargs):
    """Parse JSON input and return JSON output with Content-Type header."""
    # Parse JSON input
    input_data = json.loads(data) if data else {}

    # Your inference logic here
    result = {
        "input": input_data,
        "prediction": "your_prediction_here"
    }

    # Return JSON response with Content-Type header
    return json.dumps(result), {"mimetype": "application/json"}
```

```
curl -X POST "$DATAROBOT_ENDPOINT/api/v2/deployments/<deploymentId>/predictionsUnstructured/" \
  -H "Authorization: Bearer $DATAROBOT_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"key": "value", "test": 123}'

# Expected response:
{"input": {"key": "value", "test": 123}, "prediction": "your_prediction_here"}
```

**R:**
```
score_unstructured <- function(model, data, query, ...) {
    kwargs <- list(...)

    if (is.raw(data)) {
        data_text <- stri_conv(data, "utf8")
    } else {
        data_text <- data
    }
    count <- str_count(data_text, " ") + 1
    ret = toString(count)
    ret
}
```


#### score_unstructured() output

The `score_unstructured()` hook should return:

- A single value: `return data`, where `data` is `str` or `bytes`.
- A tuple: `return data, kwargs`, where `data` is `str` or `bytes` and `kwargs` is a `dict[str, str]` that can include `{"mimetype": "users/mimetype", "charset": "users/charset"}` to build the `Content-Type` response header from individual components.
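
As an illustration of the tuple form, the sketch below reads the documented `kwargs` keys and returns both a body and the response metadata. The fields placed in `result` are hypothetical and exist only to make the inputs visible; this is not a required response shape.

```python
import json


def score_unstructured(model, data, **kwargs):
    """Echo request metadata and return JSON with explicit response metadata."""
    # Documented keys; .get() keeps the sketch tolerant of missing ones.
    mimetype = kwargs.get("mimetype")
    charset = kwargs.get("charset")
    query = kwargs.get("query") or {}

    # Hypothetical payload, used here only to show what the hook received.
    result = {
        "received_type": type(data).__name__,  # "str" or "bytes"
        "mimetype": mimetype,
        "charset": charset,
        "query": query,
    }
    # The second element supplies the components of the Content-Type header.
    return json.dumps(result), {"mimetype": "application/json", "charset": "utf8"}
```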

## Unstructured model considerations

### Incoming data type resolution

The `score_unstructured` hook receives a `data` parameter, which can be of either `str` or `bytes` type.

You can use type-checking methods to verify types:

- Python: `isinstance(data, str)` or `isinstance(data, bytes)`
- R: `is.character(data)` or `is.raw(data)`

DataRobot uses the `Content-Type` header to determine the type to cast `data` to. The `Content-Type` header can be provided in a request or in the `--content-type` CLI argument. The `Content-Type` header format is `type/subtype;parameter` (e.g., `text/plain;charset=utf8`). The following rules apply:

- If `charset` is not defined, the default `utf8` charset is used; otherwise, the provided charset is used to decode data.
- If `Content-Type` is not defined, then incoming `kwargs={"mimetype": "text/plain", "charset":"utf8"}`, so data is treated as text, decoded using the `utf8` charset, and passed as `str`.
- If `mimetype` starts with `text/` or `application/json`, data is treated as text, decoded using the provided charset, and passed as `str`.
- For all other `mimetype` values, data is treated as binary and passed as `bytes`.
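
The rules above can be sketched as a small function, which may help when reasoning about what type `data` will have for a given header. This is an illustration only, not DRUM's implementation; the name `resolve_incoming` is hypothetical.

```python
def resolve_incoming(raw, content_type=None):
    """Return (data, kwargs) for raw request bytes and an optional header value."""
    # Defaults used when the header, or its charset parameter, is absent.
    mimetype, charset = "text/plain", "utf8"
    if content_type:
        parts = [p.strip() for p in content_type.split(";")]
        mimetype = parts[0].lower()
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "charset" and value:
                charset = value.strip()
    kwargs = {"mimetype": mimetype, "charset": charset}
    # Text-like types are decoded to str; everything else stays bytes.
    if mimetype.startswith("text/") or mimetype.startswith("application/json"):
        return raw.decode(charset), kwargs
    return raw, kwargs
```

For example, `resolve_incoming(b"abc", "image/png")` leaves the payload as `bytes`, while the same call without a header yields the `str` `"abc"`.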

### Outgoing data and kwargs parameters

As mentioned above, `score_unstructured` can return:

- A single data value: `return data`.
- A tuple (data and additional parameters): `return data, {"mimetype": "some/type", "charset": "some_charset"}`.

#### Server mode

In server mode, the following rules apply:

- `return data: str`: The data is treated as text, the default `Content-Type="text/plain;charset=utf8"` header is set in the response, and data is encoded and sent using the `utf8` charset.
- `return data: bytes`: The data is treated as binary, the default `Content-Type="application/octet-stream;charset=utf8"` header is set in the response, and data is sent as-is.
- `return data, kwargs`: If the `mimetype` value is missing in `kwargs`, the default `mimetype` is set according to the data type (`str` -> `text/plain`, `bytes` -> `application/octet-stream`). If the `charset` value is missing, the default `utf8` charset is set; then, if the data is of type `str`, it is encoded using the resolved `charset` and sent.
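
To make the defaults concrete, the following sketch applies the server-mode rules to a hook's return value and produces the response body and `Content-Type` header. It is an illustration under the rules listed above, not the server's actual code; `build_response` is a hypothetical name.

```python
def build_response(data, kwargs=None):
    """Return (body_bytes, content_type_header) for a hook's return value."""
    kwargs = kwargs or {}
    # The default mimetype depends on whether the hook returned str or bytes.
    default_mimetype = (
        "text/plain" if isinstance(data, str) else "application/octet-stream"
    )
    mimetype = kwargs.get("mimetype", default_mimetype)
    charset = kwargs.get("charset", "utf8")
    # Only str data is encoded; bytes are sent as-is.
    body = data.encode(charset) if isinstance(data, str) else data
    return body, f"{mimetype};charset={charset}"
```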

#### Batch mode

The best way to debug in batch mode is to provide an `--output` file. The returned data is written to a file according to the type of data returned:

- `str` data is written to a text file using the default `utf8` charset or the charset returned in `kwargs`.
- `bytes` data is written to a binary file.

The returned `kwargs` are not shown in batch mode, but you can still print them during debugging.
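
The batch-mode behavior can be pictured with the sketch below, which writes a hook's return value to an output path according to its type. It only illustrates the rules described here; `write_batch_output` is a hypothetical helper, not part of DRUM.

```python
def write_batch_output(path, data, kwargs=None):
    """Write str data as text (honoring charset) and bytes data as binary."""
    charset = (kwargs or {}).get("charset", "utf8")
    if isinstance(data, str):
        # newline="" writes the text without translating line endings.
        with open(path, "w", encoding=charset, newline="") as f:
            f.write(data)
    else:
        with open(path, "wb") as f:
            f.write(data)
```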

### Auxiliaries

You can use `datarobot_drum.RuntimeParameters` in your code (e.g., `custom.py`) to read runtime parameters delivered to the executed custom model. The runtime parameters should be defined in the DataRobot UI. Below is a simple example of how to read string and credential runtime parameters:

```
from datarobot_drum import RuntimeParameters

def load_model(code_dir):
    target_url = RuntimeParameters.get("TARGET_URL")
    s3_creds = RuntimeParameters.get("AWS_CREDENTIAL")
    ...
```
