Scoring at the command line

The following sections provide syntax for scoring at the command line.

Command line options

Each option below lists whether it is required, its default value, and a description.

--help
Required: No. Default: Disabled.
Prints all of the available options, as well as some model metadata.

--input=<value>
Required: Yes. Default: None.
Defines the source of the input data. Valid values are:
  • --input=- to read the input from standard input.
  • --input=/path/to/input/csv/input.csv to read the input data from a file.

--output=<value>
Required: Yes. Default: None.
Sets how the results are output. Valid values are:
  • --output=- to write the results to standard output.
  • --output=/path/to/output/csv/output.csv to save the results to a file. The output file always contains the same number of rows as the original file, and they are always in the same order. Note that for files smaller than 1GB, you can specify the output file to be the same as the input file, which replaces the input with the scored file.

--encoding=<value>
Required: No. Default: The system default encoding.
Sets the charset encoding used to read file content. Use one of the canonical names for the java.io and java.lang APIs. If this option is not set, the tool can detect UTF-8 and UTF-16 byte order marks (BOMs).

--delimiter=<value>
Required: No. Default: , (comma)
Specifies the delimiter symbol used in CSV files to split values between columns. Note: use --delimiter=";" to set the semicolon as the delimiter (; is a reserved symbol in bash/shell, so it must be quoted).

--passthrough_columns
Required: No. Default: None.
Sets the input columns to include in the results file. For example, if the flag contains a set of columns (e.g., column1,column2), the output will contain only the predictive column(s), column1, and column2. To include all original columns, use All. The resulting file contains the columns in the same order and format as the input, using the delimiter set by the --delimiter parameter. If this parameter is not specified, the command returns only the prediction column(s).

--chunk_size=<value>
Required: No. Default: min(1MB, {file_size}/{cores_number})
"Slices" the initial dataset into chunks that are scored in sequence as separate asynchronous tasks. In most cases, the default value produces the best performance. Use bigger chunks to score very fast models and smaller chunks to score very slow models.

--workers_number=<value>
Required: No. Default: The number of logical cores.
Specifies the number of workers that can process chunks of work concurrently. By default, the value matches the number of logical cores, which produces the best performance.

--log_level=<value>
Required: No. Default: INFO
Sets the level of information output to the console. Available options are INFO, DEBUG, and TRACE.

--pred_name=<value>
Required: No. Default: DR_Score
For regression projects, this field sets the name of the prediction column in the output file. In classification projects, the prediction labels are the same as the class labels.

--buffer_size=<value>
Required: No. Default: 1000
Controls the size of the asynchronous task queue. Set it to a smaller value if you experience OutOfMemoryException errors while using this tool. This is an advanced parameter.

--config=<value>
Required: No. Default: The .jar file directory.
Sets the location of the batch.properties file, which stores all config parameters in a single file. If you place the file in the same directory as the .jar, you do not need to set this parameter. To place batch.properties in another directory, set this parameter to the path of the target directory.

--with_explanations
Required: No. Default: Disabled.
Turns on prediction explanation computations.

--max_codes=<value>
Required: No. Default: 3
Sets the maximum number of explanations to compute.

--threshold_low=<value>
Required: No. Default: Null
Sets the low threshold for prediction rows to be included in the explanations.

--threshold_high=<value>
Required: No. Default: Null
Sets the high threshold for prediction rows to be included in the explanations.

--enable_mlops
Required: No. Default: Enabled.
Initializes an MLOps instance for tracking scores.

--dr_token=<value>
Required: Yes, if --enable_mlops is set. Default: None.
Specifies the authorization token for monitoring agent requests.

--disable_agent
Required: No. Default: Enabled.
When --enable_mlops is enabled, sets whether to allow offline tracking.
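
For example, the following command is a sketch of a basic scoring run (the .jar file name matches the example model used later on this page; Iris.csv, column1, and column2 stand in for your own input file and column names):

java -jar 5cd071deef881f011a334c2f.jar csv --input=Iris.csv --output=Iris_out.csv --pred_name=DR_Score --passthrough_columns=column1,column2

To also compute up to five prediction explanations per row, the same command could add --with_explanations --max_codes=5.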
Time series options

--forecast_point=<value>
Required: No. Default: None.
The formatted date from which to forecast.

--date_format=<value>
Required: No. Default: None.
The date format to use for output.

--predictions_start_date=<value>
Required: No. Default: None.
The timestamp that indicates when to start calculating predictions.

--predictions_end_date=<value>
Required: No. Default: None.
The timestamp that indicates when to stop calculating predictions.

--with_intervals
Required: No. Default: None.
Turns on prediction interval calculations.

--interval_length=<value>
Required: No. Default: None.
The interval length, as an integer from 1 to 99.

--time_series_batch_processing
Required: No. Default: Disabled.
Enables performance-optimized batch processing for time series models.
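
As a sketch, a time series scoring command might combine these options as follows; the file names, forecast point, and interval length are illustrative assumptions, and the date format matches the example value shown in the batch properties table below:

java -jar 5cd071deef881f011a334c2f.jar csv --input=sales.csv --output=sales_out.csv --forecast_point="2024-01-01T00:00:00.000000Z" --date_format="yyyy-MM-dd'T'HH:mm:ss.SSSSSS'Z'" --with_intervals --interval_length=80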

Note

For more information, see Scoring Code usage examples.

Batch properties file

You can configure the batch.properties file to change the default values for the command line options above. This simplifies command line scoring, since a bash command with too many options can be difficult to read. In addition, some command line options depend on your scoring environment and would otherwise be duplicated across commands; to avoid this duplication, save those parameters to the batch.properties file and reuse them.

The following properties are available in the batch.properties file; each maps to the listed command line option (some entries show example values):

Batch property → Command line option

com.datarobot.predictions.batch.encoding → --encoding
com.datarobot.predictions.batch.passthrough.columns → --passthrough_columns
com.datarobot.predictions.batch.chunk.size=150 → --chunk_size
com.datarobot.predictions.batch.workers.number= → --workers_number
com.datarobot.predictions.batch.log.level=INFO → --log_level
com.datarobot.predictions.batch.pred.name=PREDICTION → --pred_name
com.datarobot.predictions.batch.buffer.size=1000 → --buffer_size
com.datarobot.predictions.batch.enable.mlops=false → --enable_mlops
com.datarobot.predictions.batch.disable.agent → --disable_agent
com.datarobot.predictions.batch.max.file.size=1000000000 → no option mapping. To read from and write to the same file, this property sets the maximum original file size, allowing the command line interface to read it all into memory before scoring.

Time series parameters:
com.datarobot.predictions.batch.forecast.point= → --forecast_point
com.datarobot.predictions.batch.date.format=yyyy-MM-dd'T'HH:mm:ss.SSSSSS'Z' → --date_format
com.datarobot.predictions.batch.start.timestamp= → --predictions_start_date
com.datarobot.predictions.batch.end.timestamp= → --predictions_end_date
com.datarobot.predictions.batch.with.interval → --with_intervals
com.datarobot.predictions.batch.interval_length → --interval_length
com.datarobot.predictions.batch.time.series.batch.proccessing → --time_series_batch_processing
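
For example, a minimal batch.properties file might look like the following sketch (the property names are taken from the table above; the values are illustrative, not recommendations):

com.datarobot.predictions.batch.log.level=DEBUG
com.datarobot.predictions.batch.pred.name=PREDICTION
com.datarobot.predictions.batch.workers.number=4
com.datarobot.predictions.batch.buffer.size=500

With this file saved in the same directory as the .jar (or referenced with --config), you can omit --log_level, --pred_name, --workers_number, and --buffer_size from the command line.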

Increase Java heap memory

Depending on the model's binary size, you may have to increase the Java virtual machine (JVM) heap memory size. If you receive an OutOfMemoryError: Java heap space message while scoring your model, increase the Java heap size with the -Xmx flag (for example, java -Xmx1024m), adjusting the value as necessary to allocate sufficient memory for the process. To guarantee scoring result consistency and a non-zero exit code in case of an error, run the application with the -XX:+ExitOnOutOfMemoryError flag.

The following example increases heap memory to 2GB:

java -XX:+ExitOnOutOfMemoryError -Xmx2g -Dlog4j2.formatMsgNoLookups=true -jar 5cd071deef881f011a334c2f.jar csv --input=Iris.csv --output=Iris_out.csv

Updated December 3, 2024