How-to: Evaluate a regression model
This walkthrough uses machine learning to identify how different survey responses predict developer salaries. Think of this in the context of a Human Resources department determining the salary of an individual based on the experience needed for the position. Because the model needs to predict a number, this is a regression problem.
Assets for download
To follow this walkthrough, download the datasets that will be used to train and evaluate a regression model below. The first is the training dataset, which will be used to build the model. The second is the test dataset, which will be used to generate predictions.
Important
Follow the steps detailed in the Introduction to data analysis in DataRobot walkthrough to upload the dataset and prepare it for modeling.
Stack Overflow survey data
Stack Overflow runs an annual survey that captures the feedback of thousands of developers. The survey collects an array of information, including favorite technologies, preferences for job types, and even salaries.
The data from this version of the survey:
- Was collected in 2019.
- Is anonymized and published online.
- Contains over 90,000 responses.
- Consists of many different information types (such as text and categoricals).
- Is more than just a few hundred rows.
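Before uploading, it can help to check locally which columns are numeric and which are text or categorical. The following is a minimal sketch using only the Python standard library. The sample rows are invented for illustration, and the column names other than CompTotal (`Country`, `YearsCode`, `LanguageWorkedWith`) are assumptions about the downloaded file, which may differ.

```python
import csv
import io

# Invented stand-in for the first rows of the training file. To inspect real
# data, pass an open file handle (opened with newline="") instead of io.StringIO(SAMPLE).
SAMPLE = """Country,YearsCode,LanguageWorkedWith,CompTotal
United States,10,Python;SQL,95000
Germany,4,Java;Kotlin,60000
India,NA,JavaScript,1200000
"""

def is_number(value):
    """Return True if the cell parses as a float."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def summarize_columns(handle, missing=("", "NA")):
    """Label each column 'numeric' or 'text/categorical' from its non-missing cells."""
    reader = csv.DictReader(handle)
    cells = {name: [] for name in reader.fieldnames}
    for row in reader:
        for name, value in row.items():
            if value not in missing:
                cells[name].append(value)
    return {
        name: "numeric" if values and all(is_number(v) for v in values) else "text/categorical"
        for name, values in cells.items()
    }

summary = summarize_columns(io.StringIO(SAMPLE))
for name, kind in summary.items():
    print(f"{name}: {kind}")
```

This is only a rough local check. DataRobot performs its own feature type detection when the dataset is uploaded, and its results may not match this heuristic.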
Building a model
Now that the data has been uploaded and analyzed, it is time to build a model.
The steps in this section will build a model that can be used to predict the salary amount, which is indicated by the CompTotal feature.
- Click Data actions > Start modeling.
- In the Set up new experiment window, specify CompTotal in the Target feature field.
- Leave the remaining fields at their defaults and click Next >.

Note
For more details on the additional settings, see Start modeling setup.

- Leave all partitioning fields at their defaults and click Start modeling.
- DataRobot begins building the models.
- After a few moments, the Model Leaderboard appears and indicates the training progress.
Model build time
Model build time can vary depending on the size of the dataset. When it completes, the Workers pane displays No jobs currently running.
For details on how to assess the various models after they are built, see Compare models.
Model evaluation and interpretation
Now that a set of models is ready for analysis, select the top model and explore its details. DataRobot flags the most accurate model as Prepared for deployment in the Model Leaderboard.
Click the model to view more detailed information about it. Use the tabs in the Details pane to explore various insights, as highlighted below.
These tabs provide a quick overview of the available evaluation metrics. Click Explanations > Individual Prediction Explanations, and then click Compute to have DataRobot generate prediction explanations for the rows in the dataset.
As seen in the graph above, the model shows the expected salary range based on the features in the dataset. The table below the graph provides a sample of five predictions from the model as an example of its results. Click one of the predictions to see its details.
For more details on how to evaluate a model and explanations for what each insight means, see Evaluate with model insights.
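To make the idea of regression accuracy concrete, the sketch below computes three commonly used regression metrics (MAE, RMSE, and R squared) in plain Python. The salary values are invented, and the metric DataRobot uses to rank models on the Leaderboard may be a different one, so treat this as an illustration of the concept and not as a description of DataRobot's internals.

```python
import math

def regression_metrics(actual, predicted):
    """Return MAE, RMSE, and R^2 for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("actual and predicted must be non-empty and the same length")
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    # Mean absolute error: average size of the miss, in salary units.
    mae = sum(abs(e) for e in errors) / n
    # Root mean squared error: penalizes large misses more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    # R^2: share of the variance in the actual values explained by the predictions.
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    r2 = 1 - ss_res / ss_tot if ss_tot else float("nan")
    return {"MAE": mae, "RMSE": rmse, "R2": r2}

# Invented salaries, for illustration only.
actual = [50000, 60000, 70000, 80000]
predicted = [52000, 58000, 73000, 79000]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

MAE and RMSE are expressed in the same units as the target (here, salary), which makes them easy to interpret, while R squared is unitless and easier to compare across datasets.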
Make predictions with the model
When the most accurate model has been identified and selected, it can be used to make predictions.
- Click Model actions > Make predictions.
- In the Make Predictions window, specify the dataset to use for predictions. In this case, use the test dataset by clicking Choose file > Upload a local file. Browse to the files downloaded in the Assets for download section and select the test_set_usd.csv file.
- Once the new data is uploaded and processed, click Compute and download predictions to generate the predictions. This process can take some time, depending on the size of the dataset.
For a deeper dive into making predictions, see Make predictions.
Review the results
Once the predictions are successfully generated, review them to see how well the model performed by opening the downloaded predictions file using a spreadsheet application. Alternatively, the predictions can be viewed in DataRobot by clicking Workbench and selecting your Use Case from the table.
Once the uploaded file registers, click the new dataset to view the predictions.
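As an alternative to a spreadsheet, the downloaded predictions can be reviewed with a short script. The sketch below joins a predictions file to the actual values and lists the rows with the largest absolute error. The data is invented, and the column names `row_id` and `Prediction`, as well as the presence of CompTotal in the test file, are assumptions; check the headers of your own files and adjust accordingly.

```python
import csv
import io

# Invented stand-ins for the two files. The column names `row_id` and
# `Prediction`, and the presence of `CompTotal` in the test file, are assumptions.
PREDICTIONS_CSV = """row_id,Prediction
0,61000
1,98000
2,45000
"""
TEST_CSV = """row_id,CompTotal
0,60000
1,120000
2,47000
"""

def largest_errors(pred_handle, actual_handle, top=2):
    """Join the two files on row_id and return the rows with the largest absolute error."""
    actuals = {row["row_id"]: float(row["CompTotal"]) for row in csv.DictReader(actual_handle)}
    rows = []
    for row in csv.DictReader(pred_handle):
        if row["row_id"] in actuals:
            predicted = float(row["Prediction"])
            actual = actuals[row["row_id"]]
            rows.append((row["row_id"], actual, predicted, abs(predicted - actual)))
    # Sort by absolute error, largest first, and keep the top rows.
    return sorted(rows, key=lambda r: r[3], reverse=True)[:top]

worst = largest_errors(io.StringIO(PREDICTIONS_CSV), io.StringIO(TEST_CSV))
for row_id, actual, predicted, error in worst:
    print(f"row {row_id}: actual={actual:.0f} predicted={predicted:.0f} abs_error={error:.0f}")
```

To use real files, replace the `io.StringIO(...)` arguments with file handles opened with `newline=""`. Rows with unusually large errors are often a good starting point for investigating data quality issues or features the model handles poorly.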












