JAR structure

Once you have downloaded the Scoring Code JAR package to your machine, you'll see that it has a well-organized structure:

Root directory

The root directory contains a set of .so and .jnilib files. These contain compiled Java Native Interface (JNI) code for the LAPACK and BLAS libraries. When a JAR is launched, it first attempts to locate these libraries in the OS. If they are found, model scoring is significantly faster. If they are not found, Scoring Code falls back to a slower Java implementation.
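The "try native, otherwise fall back to Java" behavior can be pictured with a minimal sketch. This is not DataRobot's loader: the class name, the method names, and the library name `hypothetical_blas` are assumptions made for illustration. Only `System.loadLibrary` and `UnsatisfiedLinkError` are standard Java.

```java
/** Illustrative sketch only; the names here are hypothetical, not DataRobot's loader. */
final class NativeFallbackSketch {

    /** Returns true if the named native library could be loaded from the OS. */
    static boolean tryLoadNative(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return true;
        } catch (UnsatisfiedLinkError | SecurityException e) {
            // Library not found on java.library.path, or loading is not permitted.
            return false;
        }
    }

    /** Picks the fast native backend when available, otherwise the Java one. */
    static String selectBackend(String libraryName) {
        return tryLoadNative(libraryName) ? "native" : "java";
    }

    /** Plain Java dot product, standing in for the slower fallback code path. */
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        // "hypothetical_blas" is a made-up library name used only for this demo.
        System.out.println("backend: " + selectBackend("hypothetical_blas"));
        System.out.println(dot(new double[] {1, 2, 3}, new double[] {4, 5, 6}));
    }
}
```

In this sketch a missing library is not an error. The caller simply takes the slower path, which matches the fallback described above.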

com.github.fommil package

The com.github.fommil package contains the Java-side of LAPACK and BLAS native interfaces.

dr<model_ID> package

The dr<model_ID> package contains a set of binary files with parameters for individual nodes of a DataRobot model (blueprint). While these parameters are not human-readable, you can still inspect their values by debugging the readParameters(DRDataInputStream dis) methods in the classes that implement the nodes of the model. These classes are located in the com.datarobot.prediction.dr<model_ID> package.
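To show why a breakpoint inside a readParameters method exposes the values, here is a minimal sketch of a node that decodes its parameters from a binary stream. It uses the standard java.io.DataInputStream rather than DRDataInputStream, whose API is not described here. The class name and the layout (an int count followed by doubles) are invented for illustration and are not the actual Scoring Code format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical node; the binary layout below is an assumption for this sketch. */
final class ParameterReadSketch {
    double[] weights = new double[0];

    /** Reads an int count followed by that many doubles (made-up layout). */
    void readParameters(DataInputStream in) throws IOException {
        int count = in.readInt();
        weights = new double[count];
        for (int i = 0; i < count; i++) {
            weights[i] = in.readDouble(); // a breakpoint here shows each decoded value
        }
    }

    /** Writes values in the same made-up layout, so the demo has something to read. */
    static byte[] encode(double... values) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(values.length);
            for (double v : values) {
                out.writeDouble(v);
            }
        }
        return bytes.toByteArray();
    }

    /** Encodes the values, then decodes them through readParameters. */
    static double[] roundTrip(double... values) {
        try {
            ParameterReadSketch node = new ParameterReadSketch();
            node.readParameters(new DataInputStream(new ByteArrayInputStream(encode(values))));
            return node.weights;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (double w : roundTrip(0.5, -1.25, 3.0)) {
            System.out.println(w);
        }
    }
}
```

The bytes themselves are opaque, but once the reading method has run, the decoded fields are ordinary Java values that a debugger can display.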

com.datarobot.prediction package

The com.datarobot.prediction package contains commonly used Java interfaces inside of a Scoring Code JAR. To maintain backward compatibility, it contains both current and deprecated versions of the interfaces. The deprecated interfaces are Predictor, MulticlassPredictor, and Row.

com.datarobot.prediction.dr<model_ID> package

The com.datarobot.prediction.dr<model_ID> package contains the classes that implement the model (blueprint) as well as some utility code.

To understand the model, start with the BP.java class. This class manages data flow through the model. The raw data comes into the DP.java class, where feature conversion and transformation operations take place. The preprocessed data then goes into each of the V<number> classes, where the actual steps of model execution take place. All of these classes use three main utility classes:

  • BaseDataStructure defines a unified container for data.

  • DRDataInputStream reads binary parameters from the package dr<model_ID>.

  • BaseVertex contains actual implementations of machine learning algorithms and utility functions.

DRModel defines the low-level implementation of the model API. The classes RegressionPredictorImpl and ClassificationPredictorImpl are top-level APIs built on top of DRModel. We highly recommend using these classes rather than DRModel directly. For more information about these interfaces, see the javadoc (linked from the Downloads tab) and the section Backward-compatible Java API.
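The data flow described for BP, DP, and the V<number> classes (raw data, then preprocessing, then a sequence of execution steps) can be pictured with a small sketch. Everything in it is hypothetical: the class name, the NaN-to-zero preprocessing, and the two example steps are stand-ins chosen for illustration, not the generated Scoring Code classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical pipeline; it mirrors only the overall shape of the data flow. */
final class DataFlowSketch {

    /** Stand-in for the preprocessing stage: replaces NaN with 0. */
    static double[] preprocess(double[] raw) {
        double[] out = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Double.isNaN(raw[i]) ? 0.0 : raw[i];
        }
        return out;
    }

    /** Example step: multiplies every value by 2. */
    static double[] scaleByTwo(double[] in) {
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = in[i] * 2.0;
        }
        return out;
    }

    /** Example step: collapses the values into a single sum. */
    static double[] sum(double[] in) {
        double total = 0.0;
        for (double v : in) {
            total += v;
        }
        return new double[] {total};
    }

    /** Two made-up steps playing the role of the execution stages. */
    static List<UnaryOperator<double[]>> exampleSteps() {
        List<UnaryOperator<double[]>> steps = new ArrayList<>();
        steps.add(DataFlowSketch::scaleByTwo);
        steps.add(DataFlowSketch::sum);
        return steps;
    }

    /** Orchestrator: preprocesses the raw row, then applies each step in order. */
    static double[] run(double[] raw, List<UnaryOperator<double[]>> steps) {
        double[] data = preprocess(raw);
        for (UnaryOperator<double[]> step : steps) {
            data = step.apply(data);
        }
        return data;
    }

    public static void main(String[] args) {
        double[] result = run(new double[] {1.0, Double.NaN, 3.0}, exampleSteps());
        System.out.println(result[0]); // prints 8.0
    }
}
```

When reading the generated code, following a single row through the orchestrating class in the same way is a reasonable way to see which step produces which intermediate value.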

com.datarobot.prediction.drmatrix package

The com.datarobot.prediction.drmatrix package contains implementations of common matrix operations on dense and sparse matrices.

com.datarobot.prediction.engine and com.datarobot.prediction.io packages

The com.datarobot.prediction.engine and com.datarobot.prediction.io packages contain high-performance scoring logic that enables each Scoring Code JAR to be used as a command line scoring tool for CSV files.

Differences between source and binary JARs

The following table describes the differences between the source and binary download options.

| Files | Binary .jar | Source .jar |
|---|---|---|
| Native .so and .jnilib files for BLAS and LAPACK libraries | Yes | No |
| com.github.fommil for BLAS and LAPACK libraries | Yes | No |
| dr<model_ID> (binary parameters for nodes of the model) | Yes | Yes |
| com.datarobot.prediction | Yes | No |
| com.datarobot.prediction.dr<model_ID> | Yes | Yes |
| com.datarobot.prediction.drmatrix | Yes | No |
| com.datarobot.prediction.engine | Yes | No |
| com.datarobot.prediction.io | Yes | No |

DataRobot provides “source” .jar files for download to simplify model inspection. With the “source” download option, you get only the code that directly implements the model. It is the same code as in the “binary” .jar, but stripped of all dependencies.
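One way to check which of these components a downloaded file contains is to list its entries by directory. The sketch below uses only the standard java.util.jar API. The class name and the "(root)" label are choices made for this example, and the path to the JAR is supplied by you.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Enumeration;
import java.util.Map;
import java.util.TreeMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Counts the files in a JAR per directory, to compare binary and source downloads. */
final class JarLayoutSketch {

    /** Maps "com/example/X.class" to "com/example"; entries without a slash map to "(root)". */
    static String directoryOf(String entryName) {
        int slash = entryName.lastIndexOf('/');
        return slash < 0 ? "(root)" : entryName.substring(0, slash);
    }

    /** Returns a sorted map from directory name to the number of files it holds. */
    static Map<String, Integer> countByDirectory(String jarPath) {
        Map<String, Integer> counts = new TreeMap<>();
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                if (!entry.isDirectory()) {
                    counts.merge(directoryOf(entry.getName()), 1, Integer::sum);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java JarLayoutSketch <path-to-jar>");
            return;
        }
        countByDirectory(args[0])
                .forEach((dir, n) -> System.out.println(dir + ": " + n + " file(s)"));
    }
}
```

Run against both downloads, the output for the source .jar should show only the directories marked "Yes" in the Source .jar column.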


Updated May 31, 2022