Triage insurance claims

This page outlines a use case that assesses claim complexity and severity as early as possible to optimize claim routing, ensure the appropriate level of attention, and improve claimant communications. It is captured below as a UI-based walkthrough. It is also available as a Jupyter notebook that you can download and execute.

Business problem

Claim payments and claim adjustment are typically an insurance company’s largest expenses. For long-tail lines of business, such as workers’ compensation (which covers medical expenses and lost wages for injured workers), the true cost of a claim may not be known for many years until it is paid in full. However, claim adjustment activities start as soon as the insurer is made aware of a claim.

Typically, when an employee is injured at work (Accident Date), the employer (the insured) files a claim with its insurance company (Report Date), and a claim record is created in the insurer's claim system with all information available about the claim at the time of reporting. The claim is then assigned to a claim adjuster. This assignment could be purely random or based on roughly defined business rules. During the life cycle of a claim, the assignment may be re-evaluated multiple times and the claim reassigned to a different adjuster.

This process, however, has costly consequences:

  • It is well-known in insurance that 20% of claims account for 80% of the total claim payouts. Randomly assigning claims wastes resources.

  • Early intervention is critical to optimal claim results. Without the appropriate assignment of resources as early as possible, seemingly mild claims can become substantial.

  • Claims of low severity and complexity must wait to be processed alongside all other claims, often leading to a poor customer experience.

  • A typical claim adjuster can receive several hundred new claims every month, in addition to any existing open claims. When a claim adjuster is overloaded, it is unlikely they can process every assigned claim. If too much time passes, the claimant is more likely to obtain an attorney to assist in the process, driving up the cost of the claim unnecessarily.

Solution value

  • Challenge: Help insurers assess claim complexity and severity as early as possible so that:

    • Claims of low severity and low complexity are routed to straight-through processing (STP), avoiding the wait and improving the customer experience.
    • Claims of high complexity get the required attention of experienced claim adjusters and nurse case managers.
    • The improved communication between claimants and the insured leads to minimized attorney involvement.
    • The transfer of knowledge between experienced and junior adjusters is improved.
  • Desired Outcome

    • Reduce loss adjustment expenses by more efficiently allocating claim resources.
    • Reduce claim costs by effectively assigning nurse case managers and experienced adjusters to the claims that they can impact the most.
  • How can DataRobot help?

    • Machine learning models that use claim- and policy-level attributes at First Notice of Loss (FNOL) can help you understand the complicated relationship between claim severity and various policy attributes at an early stage of a claim's life cycle. Model predictions are used to rank new claims from least severe to most severe. The business can set thresholds based on its definitions of low, medium, and high severity, on the volume of claims that claim adjusters have the bandwidth to handle, or on a combination of the two. Use these thresholds and the model predictions to route claims efficiently.
Topic | Description
Use case type | Insurance / Claim Triage
Target audience | Claim adjusters
Metrics / KPIs | False positive/negative rate; total expense savings (in terms of both labor and more accurate adjudication of claims); customer satisfaction
Sample dataset | Download here

Problem framing

A machine learning model learns complex patterns from historically observed data. Those patterns can be used to make predictions on new data. In this use case, historical insurance claim data is used to build the model. When a new claim is reported, the model makes a prediction on it.

Depending on how the problem is framed, the prediction can have different meanings. The goal of this claim triage use case is to have a model evaluate the workers' compensation claim severity as early as possible, ideally at the moment a claim is reported (the first notice of loss, or FNOL). The target feature is related to the total payment for a claim and the modeling unit is each individual claim.

When the total payment for a claim is treated as the target, the use case is framed as a regression problem because you are predicting a quantity. The predicted total payment can then be compared with thresholds for low and high severity claims defined by business need, which classifies each claim as low-, medium-, or high-severity.

Alternatively, you can frame this use case as a classification problem. To do so, apply the aforementioned thresholds to the total claim payment first and convert it to a categorical feature with levels "Low", "Medium" and "High". You can then build a classification model that uses this categorical variable as the target. The model instead predicts the probability a claim is going to be low-, medium- or high-severity.
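The conversion from total payment to a categorical target can be sketched in a few lines of Python. This is a minimal illustration, not part of the original walkthrough: the threshold values and the helper name `severity_class` are assumptions, and a real project would use thresholds agreed with the business.

```python
from bisect import bisect_right

# Hypothetical business thresholds in dollars (assumptions for illustration).
LOW_MAX = 5_000     # payments below this are "Low"
HIGH_MIN = 20_000   # payments at or above this are "High"
LABELS = ["Low", "Medium", "High"]

def severity_class(total_payment: float) -> str:
    """Map a claim's total payment to a severity label."""
    # bisect_right returns 0, 1, or 2 depending on the band the payment falls in.
    return LABELS[bisect_right([LOW_MAX, HIGH_MIN], total_payment)]

incurred = [320.0, 7_800.0, 45_000.0, 5_000.0]
classes = [severity_class(x) for x in incurred]
print(classes)
```

The same function works for either framing: apply it to the historical total payment to build a classification target, or apply it to a regression model's predicted payment to classify a new claim.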

Regardless of how the problem is framed, the ultimate goal is to route a claim appropriately.

ROI estimation

For this use case, direct return on investment (ROI) comes from improved claim handling results and expense savings. Indirect ROI stems from improved customer experience which in turn increases customer loyalty. The steps below focus on the direct ROI calculation based on the following assumptions:

  • 10,000 claims every month
  • Category I: 30% (3000) of claims are routed to straight-through processing (STP)
  • Category II: 60% (6000) of claims are handled normally
  • Category III: 10% (1000) of claims are handled by experienced claim adjusters
  • Average Category I claim severity is $250 without the model; $275 with the model
  • Average Category III claim severity is $10,000 without the model; $9,500 with the model
  • Saved labor: 3 full-time employees with an average annual salary of $65,000

Total annual ROI = $65,000 × 3 + [3,000 × ($250 − $275) + 1,000 × ($10,000 − $9,500)] × 12 = $5,295,000
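The arithmetic in the formula can be checked with a short script. The variable names are illustrative assumptions; the numbers are taken directly from the formula above.

```python
# Inputs taken from the ROI formula (monthly claim counts, dollar amounts).
stp_claims = 3_000                          # claims routed to STP each month
stp_sev_without, stp_sev_with = 250, 275    # average STP claim severity
reduced_claims = 1_000                      # claims whose severity drops each month
sev_without, sev_with = 10_000, 9_500       # average severity for those claims
saved_fte, annual_salary = 3, 65_000        # labor savings

labor_savings = saved_fte * annual_salary
monthly_claim_savings = (
    stp_claims * (stp_sev_without - stp_sev_with)   # negative: STP claims cost slightly more
    + reduced_claims * (sev_without - sev_with)
)
total_annual_roi = labor_savings + 12 * monthly_claim_savings
print(f"${total_annual_roi:,}")
```

Keeping the inputs as named variables makes it easy to rerun the estimate with your own claim volumes and severities.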

Working with data

The sample data for this use case is a synthetic dataset from a workers' compensation insurer's claims database, organized at the individual claim level. Most claim databases in an insurance company contain transactional data, i.e., one claim may have multiple records in the claims database. When a claim is first reported, it is entered in the claims system along with the initial information about it. Depending on the insurer's practice, a case reserve may be set up. The case reserve is then adjusted when claim payments are made or when additional information indicates a need to change it.

Policy-level information can be predictive as well. This type of information includes class, industry, job description, employee tenure, size of the employer, and whether there is a return-to-work program. Policy attributes should be joined with the claims data to form the modeling dataset, although they are not included in this example.

When it comes to claim triage, insurers would like to know as early as possible how severe a claim potentially is, ideally at the moment a claim is reported (FNOL). However, an accurate estimate of a claim's severity may not be feasible at FNOL due to insufficient information. Therefore, in practice, a series of claim triage models are needed to predict the severity of a claim at different stages of that claim's life cycle, e.g., FNOL, 30 days, 60 days, 90 days, etc.

For each of the models, the goal is to predict the severity of a claim; therefore, the target feature is the total payment on a claim. The features included in the training data are the claim attributes and policy attributes at different snapshots. For example, for an FNOL model, features are limited to what is known about a claim at FNOL. For insurers still using legacy systems that may not record the true FNOL data, FNOL is often approximated with a snapshot taken between 0 and 30 days after reporting.

Features overview

The following table outlines the prominent features in the sample training dataset.

Feature Name | Data Type | Description | Data Source
ReportingDelay | Numeric | Number of days between the accident date and report date | Claims
AccidentHour | Numeric | Time of day that the accident occurred | Claims
Age | Numeric | Age of claimant | Claims
WeeklyRate | Numeric | Weekly salary | Claims
Gender | Categorical | Gender of the claimant | Claims
MaritalStatus | Categorical | Whether the claimant is married or not | Claims
HoursWorkedPerWeek | Numeric | The usual number of hours worked per week by the claimant | Claims
DependentChildren | Numeric | Claimant's number of dependent children | Claims
DependentsOther | Numeric | Claimant's number of dependents who are not children | Claims
PartTimeFullTime | Numeric | Whether the claimant works part time or full time | Claims
DaysWorkedPerWeek | Numeric | Number of days per week worked by the claimant | Claims
DateOfAccident | Date | Date that the accident occurred | Claims
ClaimDescription | Text | Text description of the accident and injury | Claims
ReportedDay | Numeric | Day of the week that the claim was reported to the insurer | Claims
InitialCaseEstimate | Numeric | Initial case estimate set by claim staff | Claims
Incurred | Numeric | Target: final cost of the claim (all payments made by the insurer) | Claims
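Several of the features above (ReportingDelay, AccidentHour, ReportedDay) are derived from raw timestamps. The sketch below shows one way to compute them; the function name, the ISO timestamp inputs, and the Monday-as-zero encoding of ReportedDay are assumptions, since the sample dataset does not document how these fields were produced.

```python
from datetime import datetime

def derive_features(accident_ts: str, report_ts: str) -> dict:
    """Derive date-based features from ISO-formatted accident and report timestamps."""
    accident = datetime.fromisoformat(accident_ts)
    report = datetime.fromisoformat(report_ts)
    return {
        # Whole days between the accident date and the report date.
        "ReportingDelay": (report.date() - accident.date()).days,
        # Hour of day (0-23) at which the accident occurred.
        "AccidentHour": accident.hour,
        # Day of week of the report; 0 = Monday ... 6 = Sunday (assumed encoding).
        "ReportedDay": report.weekday(),
    }

features = derive_features("2023-03-06T14:30:00", "2023-03-10T09:00:00")
print(features)
```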

Data preparation

The example data is organized at the claim level; each row is a claim record, with all the claim attributes taken at FNOL. The target variable, Incurred, is the total payment for a claim at the time it is closed, so the data contains no open claims.

A workers’ compensation insurance carrier’s claim database is usually stored at the transaction level. That is, a new record is created for each change to a claim, such as partial claim payments and reserve changes. This use case snapshots the claim (and all the attributes related to the claim) when it is first reported and then again when the claim is closed (to obtain the target, the total payment). As noted earlier, policy-level attributes such as class, industry, job description, employee tenure, employer size, and the presence of a return-to-work program can also be predictive and should be joined with the claims data to form the modeling dataset.
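Rolling transaction-level records up to one target value per closed claim can be sketched as follows. The record layout, the transaction type names, and the `closed_claims` set are hypothetical; a real claims system will have its own schema.

```python
from collections import defaultdict

# Hypothetical transaction records: (claim_id, transaction_type, amount).
transactions = [
    ("C1", "payment", 1_200.0),
    ("C1", "reserve_change", 5_000.0),
    ("C1", "payment", 800.0),
    ("C2", "payment", 300.0),
    ("C3", "payment", 150.0),   # C3 is still open, so it is excluded below
]
closed_claims = {"C1", "C2"}

# Target: total payments per closed claim (reserve changes are not payments).
incurred = defaultdict(float)
for claim_id, kind, amount in transactions:
    if kind == "payment" and claim_id in closed_claims:
        incurred[claim_id] += amount

print(dict(incurred))
```

Restricting the rollup to closed claims keeps the target consistent with the sample data, where Incurred is the final cost of a claim.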

Data evaluation

Once the modeling data is uploaded to DataRobot, EDA produces a brief summary of the data, including descriptions of feature type, summary statistics for numeric features, and the distribution of each feature. A data quality assessment helps ensure that only appropriate data is used in the modeling process. Navigate to the Data tab to learn more about your data.

Exploratory Data Analysis

Click each feature to see details such as summary statistics (min, max, mean, std) for numeric features, or a histogram that represents the relationship of the feature with the target.

DataRobot automatically performs data quality checks. In this example, it has detected outliers for the target feature. Click Show Outliers to view them all (outliers are common in insurance claims data). To avoid bias introduced by outliers, a common practice is to cap the target, for example at the 95th percentile. This cap is especially important for linear models.
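Capping the target outside of DataRobot can be sketched with the Python standard library. The helper name and the synthetic data are assumptions, and the "inclusive" percentile method is one of several reasonable definitions.

```python
import statistics

def cap_at_percentile(values, pct=95):
    """Cap values at the given percentile; returns (capped_values, cap)."""
    # With n=100, quantiles() returns the 1st..99th percentile cut points.
    cap = statistics.quantiles(values, n=100, method="inclusive")[pct - 1]
    return [min(v, cap) for v in values], cap

# Synthetic claim costs with one extreme outlier (illustrative only).
incurred = [500.0 * i for i in range(1, 100)] + [2_000_000.0]
capped, cap = cap_at_percentile(incurred)
print(cap, max(capped))
```

Compute the cap on the training data only and reuse the same value when preparing later data, so that the target definition stays consistent.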

Feature Associations

Use the Feature Associations tab to visualize the correlations between each pair of the input features. For example, in the plot below, the features DaysWorkedPerWeek and PartTimeFullTime (top-left corner) have strong associations and are therefore "clustered" together. Each color block in this matrix is a cluster.

Modeling and insights

After modeling completes, you can begin interpreting the model results.

Feature Impact

Feature Impact reveals the association between each feature and the model target (the key drivers of the model). Feature Impact ranks features based on feature importance, from the most important to the least important, and also shows the relative importance of those features. In the example below, InitialCaseEstimate is the most important feature for this model, followed by ClaimDescription, WeeklyRate, Age, HoursWorkedPerWeek, etc.

This example indicates that features after MaritalStatus contribute little to the model. For example, Gender has minimal contribution to the model, indicating that claim severity doesn't vary by the gender of the claimant. If you create a new feature list that excludes Gender (and other features less impactful than MaritalStatus) and includes only the most impactful features, the model accuracy should not be significantly affected. A natural next step is to create a new feature list with only the top features and rerun the model. DataRobot automatically creates a new feature list, "DR Reduced Features", by including features that have a cumulative feature impact of 95%.
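The idea of a cumulative-impact cutoff can be illustrated as follows. This is a sketch of one plausible reading of "cumulative feature impact of 95%", not DataRobot's actual selection logic, and the impact scores are made up for the example.

```python
def top_features_by_cumulative_impact(impact: dict, coverage: float = 0.95) -> list:
    """Keep the highest-impact features until they cover `coverage` of total impact."""
    total = sum(impact.values())
    selected, running = [], 0.0
    for name, score in sorted(impact.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(name)
        running += score
        if running / total >= coverage:
            break
    return selected

# Hypothetical impact scores, not values from the sample project.
impact = {
    "InitialCaseEstimate": 60,
    "ClaimDescription": 20,
    "WeeklyRate": 10,
    "Age": 6,
    "Gender": 3,
    "ReportedDay": 1,
}
reduced = top_features_by_cumulative_impact(impact)
print(reduced)
```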

Partial Dependence plot

Once you know which features are important to the model, it is useful to know how each feature affects predictions. This can be seen in Feature Effects and in particular a model's partial dependence plot. In the example below, notice the partial dependence for the WeeklyRate feature. You can observe that claimants with lower weekly pay have lower claim severity, while claimants with higher weekly pay have higher claim severity.
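Conceptually, partial dependence is computed by forcing one feature to a fixed value for every row, averaging the predictions, and repeating across a grid of values. The sketch below uses a toy linear function as a stand-in for a trained model; the coefficients and data are hypothetical.

```python
def partial_dependence(predict, rows, feature, grid):
    """Average prediction when `feature` is set to each grid value for every row."""
    curve = []
    for value in grid:
        preds = [predict({**row, feature: value}) for row in rows]
        curve.append((value, sum(preds) / len(preds)))
    return curve

# Toy stand-in for a trained model (hypothetical coefficients).
def toy_model(row):
    return 1_000 + 8 * row["WeeklyRate"] + 50 * row["Age"]

rows = [{"WeeklyRate": 400, "Age": 30}, {"WeeklyRate": 900, "Age": 50}]
curve = partial_dependence(toy_model, rows, "WeeklyRate", [200, 600, 1000])
print(curve)
```

An upward-sloping curve, as in this toy case, corresponds to the pattern described above: higher weekly pay is associated with higher predicted severity.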

Prediction Explanations

When a claims adjuster sees a low prediction for a claim, they are likely to initially ask what the drivers are behind such a low prediction. The Prediction Explanation insight, provided at an individual prediction level, can help claim adjusters understand how a prediction is made, increasing confidence in the model. By default, DataRobot provides the top three explanations for each prediction, but you can request up to 10 explanations. Model predictions and explanations can be downloaded as a CSV and you can control which predictions are populated in the CSV by specifying the thresholds for high and low predictions.

The graph below shows the top three explanations for the 3 highest and lowest predictions. The graph shows that, generally, high predictions are associated with older claimants and higher weekly salary, while the low predictions are associated with a lower weekly salary.

Word Cloud

The feature ClaimDescription is an unstructured text field. DataRobot builds text mining models on textual features, and the output from those text mining models is used as input to subsequent modeling processes. Below is a Word Cloud for ClaimDescription, which shows the keywords parsed out by DataRobot. The size of a word indicates how frequently it appears in the data: strain appears very often, while fractured does not appear as often. Color indicates severity: both strain and fractured (red words) are associated with high severity claims, while finger and eye (blue words) are associated with low severity claims.

Evaluate accuracy

The following insights help evaluate accuracy.

Lift Chart

The Lift Chart shows how effective the model is at differentiating lowest risks (on the left) from highest risks (on the right). In the example below, the blue curve represents the average predicted claim cost, and the orange curve indicates the average actual claim cost. The upward slope indicates the model has effectively differentiated the claims of low severity (close to 0) on the left and those of high severity (~45K) on the right. The fact that the actual values (orange curve) closely track the predicted values (blue curve) tells you that the model fits the data well.

Note that DataRobot only displays lift charts on validation or holdout partitions.
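The calculation behind a lift chart can be sketched as follows: sort claims by predicted value, split them into equal-sized bins, and compare the average predicted and actual values in each bin. The function name and sample numbers are illustrative assumptions, and the sketch assumes there are at least as many rows as bins.

```python
def lift_table(predicted, actual, n_bins=10):
    """Return (avg_predicted, avg_actual) per bin, ordered from lowest to highest prediction."""
    pairs = sorted(zip(predicted, actual), key=lambda pair: pair[0])
    size = len(pairs) / n_bins
    table = []
    for b in range(n_bins):
        chunk = pairs[round(b * size): round((b + 1) * size)]
        avg_pred = sum(p for p, _ in chunk) / len(chunk)
        avg_act = sum(a for _, a in chunk) / len(chunk)
        table.append((avg_pred, avg_act))
    return table

predicted = [400.0, 100.0, 300.0, 200.0]
actual = [390.0, 120.0, 350.0, 180.0]
table = lift_table(predicted, actual, n_bins=2)
print(table)
```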

Post-processing

A prediction for claim severity can be used for multiple different applications, requiring different post-processing steps for each. Primary insurers may use the model predictions for claim triage, initial case reserve determination, or reinsurance reporting. For example, for claim triage at FNOL, the model prediction can be used to determine where the claim should be routed. A workers’ compensation carrier may decide:

  • All claims with predicted severity under $5000 go to straight-through processing (STP).
  • Claims between $5000 and $20,000 go through the standard process.
  • Claims over $20,000 are assigned a nurse case manager.
  • Claims over $500,000 are also reported to a reinsurer, if applicable.
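The routing rules above translate directly into a small function. The function name and route labels are assumptions; how a claim of exactly $5000 or $20,000 is treated is also an assumption, since the rules above do not say.

```python
def route_claim(predicted_severity: float) -> list:
    """Return the handling routes for a claim based on its predicted severity in dollars."""
    if predicted_severity < 5_000:
        routes = ["straight-through processing"]
    elif predicted_severity <= 20_000:
        routes = ["standard process"]
    else:
        routes = ["nurse case manager"]
    # Very large claims are additionally reported to a reinsurer, if applicable.
    if predicted_severity > 500_000:
        routes.append("reinsurance reporting")
    return routes

print(route_claim(1_200.0))
print(route_claim(750_000.0))
```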

Another carrier may decide to send 40% of claims to STP, 55% to the regular process, and 5% to a nurse case manager, and then derive the dollar thresholds from those proportions. These thresholds can be programmed into the business process so that claims go through the predesigned pipeline once reported and then get routed appropriately. Note that companies with STP should carefully design their claim monitoring procedures to ensure unexpected claim activities are captured.
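When routing is defined by proportions rather than dollar amounts, the thresholds can be derived from the percentiles of historical predictions. The sketch below assumes a list of past model predictions is available; the function name and the synthetic data are illustrative.

```python
import statistics

def thresholds_from_shares(predictions, stp_pct=40, regular_pct=55):
    """Return (stp_cut, nurse_cut): roughly stp_pct% of predictions fall below stp_cut,
    and the remaining top share lies above nurse_cut."""
    # With n=100, quantiles() returns the 1st..99th percentile cut points.
    cuts = statistics.quantiles(predictions, n=100, method="inclusive")
    return cuts[stp_pct - 1], cuts[stp_pct + regular_pct - 1]

# Synthetic historical predictions (illustrative only).
historical_predictions = [100.0 * i for i in range(1, 101)]
stp_cut, nurse_cut = thresholds_from_shares(historical_predictions)
print(stp_cut, nurse_cut)
```

Because the prediction distribution shifts over time, thresholds derived this way need to be recomputed periodically to keep the intended shares.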

To test these different assumptions, design one or more A/B tests and run them in sequence or in parallel. Set the significance level (the p-value threshold) and the desired statistical power before the tests start, because together they determine the number of observations required before a test can be stopped. In designing the test, think carefully about the drivers of profitability. Ideally you want to allocate resources based on the change they can effect, not just on the cost of the claim. For example, fatality claims are relatively costly but not complex, and so can often be assigned to a very junior claims handler. Finally, at the end of the A/B tests, you can identify the best combination based on the profit of each test.
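A rough sample-size calculation for such a test can be sketched with the standard two-sample formula for comparing means, n = 2 × ((z_α/2 + z_β) × σ / δ)² per arm. This assumes approximately normal sampling distributions, which is only a rough fit for skewed claim costs; the effect size and standard deviation below are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(effect, std_dev, alpha=0.05, power=0.80):
    """Claims needed in each arm to detect a mean difference of `effect` (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)            # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) * std_dev / effect) ** 2)

# Hypothetical inputs: detect a $500 change in average severity, std dev $4,000.
n = sample_size_per_arm(effect=500, std_dev=4_000)
print(n)
```

Smaller expected effects or noisier claim costs increase the required sample size quickly, which is worth knowing before committing to a test duration.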

Predict and deploy

You can use the DataRobot UI or REST API to deploy a model, depending on how ready it is to be put into production. However, before the model is fully integrated into production, a pilot may be beneficial for:

  • Testing the model performance using new claims data.
  • Monitoring unexpected scenarios so a formal monitoring process can be designed or modified accordingly.
  • Increasing the end users’ confidence in using the model outputs to assist business decision making.

Once stakeholders feel comfortable about the model and also the process, integration of the model with production systems can maximize the value of the model. The outputs from the model can be customized to meet the needs of claim management.

Decision process

Deploy the selected model into your desired decision environment to embed the predictions into your regular business decisions. Insurance companies often have a separate system for claims management. For this particular use case, it may be in the best interest of the users to integrate the model with the claims management system, and with visualization tools such as Power BI or Tableau.

If a model is integrated within an insurer’s claim management system, FNOL staff can record all the available information in the system when a new claim is reported. The model can then be run in the background to evaluate the ultimate severity. The estimated severity can help suggest initial case reserves and the appropriate route for further claim handling (i.e., STP, regular claim adjusting, or experienced claim adjusters, possibly with nurse case manager involvement and/or reinsurance reporting).

Carriers will want to include rules-based decisions as well, to capture decisions that are driven by considerations other than ultimate claim severity.

Most carriers do not set initial reserves for STP claims. For those claims beyond STP, you can use model predictions to set initial reserves at the first notice of loss. Claims adjusters and nurse case managers will only be involved for claims over certain thresholds. The reinsurance reporting process may benefit from the model predictions as well; instead of waiting for claims to develop to very high severity, the reporting process may start at FNOL. Reinsurers will certainly appreciate the timely reporting of high severity claims, which will further improve the relationship between primary carriers and reinsurers.

Decision stakeholders

Consider the following to serve as decision stakeholders:

  • Claims management team
  • Claims adjusters
  • Reserving actuaries

Model monitoring

Carriers implementing a claims severity model usually have strictly defined business rules to ensure abnormal activities are captured before they get out of control. Abnormal behavior (for example, abnormally high predictions or too many missing inputs) can trigger manual reviews. Use the performance monitoring capabilities, especially service health, data drift, and accuracy, to produce and distribute regular reports to stakeholders.
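A rules-based review trigger of the kind described above might look like the following sketch. The limits, the function name, and the use of `None` to mark a missing input are assumptions for illustration; actual limits should come from the carrier's business rules.

```python
def review_reasons(features: dict, prediction: float,
                   max_prediction: float = 250_000, max_missing: int = 3) -> list:
    """Return the reasons, if any, that a scored claim should be sent for manual review."""
    reasons = []
    if prediction > max_prediction:
        reasons.append("abnormally high prediction")
    # Count inputs that were not available when the claim was scored.
    missing = sum(1 for value in features.values() if value is None)
    if missing > max_missing:
        reasons.append("too many missing inputs")
    return reasons

claim = {"Age": 45, "WeeklyRate": None, "Gender": None,
         "MaritalStatus": None, "HoursWorkedPerWeek": None}
flags = review_reasons(claim, prediction=310_000.0)
print(flags)
```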

Implementation considerations

A claim severity model at FNOL should be one of a series of models built to monitor claim severity over time. Besides the FNOL model, build separate models at different stages of a claim (e.g., 30 days, 90 days, 180 days) to leverage the additional information available and further evaluate the claim severity. Over time, additional information arrives about medical treatments, diagnoses, and missed work, allowing for improved accuracy as a claim matures.

No-Code AI Apps

Consider building a custom application where stakeholders can interact with the predictions and record the outcomes of the investigation. Once the model is deployed, predictions can be consumed for use in the decision process. For example, this No-Code AI App is an easily shareable, AI-powered application using a no-code interface:

Notebook demo

See the notebook version of this accelerator here.


Updated February 1, 2024