Poisson optimization metric
Hello team, other than setting up the project as a regression problem, do we have any suggestions for the following?
- Does DR have any plans to explicitly support modeling count data?
- How does DR suggest customers model counts if they want to use the platform?
This has come up a few times (including in another project yesterday, where the response was a count, and they ignored this and just uploaded and modeled it as regression).
The real question I’m asking is whether any of the blueprints that hint at modeling counts actually model counts. I think the XGB + Poisson loss ones do. Also, GLM-based blueprints (like elastic net) naturally support Poisson/NB distributions, but I wasn’t sure whether DataRobot supports those.
Use Poisson as the project metric! DataRobot has great support for count data. You don’t need to worry about log-transforming the target: we handle the link function for you.
We have Poisson GLMs, Poisson XGBoost, and Poisson neural networks for modeling count data! They work great!
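To make the link-function point concrete, here is a minimal Python sketch of the log link and of Poisson deviance, the usual loss behind a Poisson metric. This is not DataRobot's implementation; the function names `mean_from_linear_predictor` and `poisson_deviance` are made up for illustration.

```python
import math


def mean_from_linear_predictor(eta):
    """Log link (illustrative): the model works on eta = log(mu),
    so the predicted mean mu = exp(eta) is always positive.
    The raw counts themselves are never log-transformed."""
    return math.exp(eta)


def poisson_deviance(y_true, y_pred):
    """Mean Poisson deviance: 2 * mean(y * log(y / mu) - (y - mu)).
    The y * log(y / mu) term is taken as 0 when y == 0."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if mu <= 0:
            raise ValueError("predicted means must be strictly positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(y_true)


# Hypothetical counts and predicted means, for illustration only.
counts = [0, 4]
predicted_means = [1.0, 2.0]
score = poisson_deviance(counts, predicted_means)
```

A perfect prediction gives a deviance of 0, and the score grows as predicted means move away from the observed counts. Zero counts are handled directly, which is one reason to prefer this over log-transforming the target and fitting an ordinary regression.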
We also support weights, offsets, and exposure for projects that model counts (e.g., projects using Poisson loss).
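As a rough illustration of how exposure and weights enter a count model, the sketch below fits an intercept-only Poisson model in plain Python. It assumes the common convention that exposure enters as a log(exposure) offset with its coefficient fixed at 1. The names `fit_rate` and `predict_count` and the claims data are hypothetical and are not part of the platform.

```python
import math


def fit_rate(counts, exposures, weights=None):
    """Intercept-only Poisson model with a log link and a log(exposure) offset.
    The maximum-likelihood rate is sum(w * y) / sum(w * e);
    the function returns the intercept b0 = log(rate)."""
    if weights is None:
        weights = [1.0] * len(counts)
    weighted_events = sum(w * y for w, y in zip(weights, counts))
    weighted_exposure = sum(w * e for w, e in zip(weights, exposures))
    if weighted_events <= 0 or weighted_exposure <= 0:
        raise ValueError("need positive total events and exposure")
    return math.log(weighted_events / weighted_exposure)


def predict_count(b0, exposure):
    """Expected count = exp(b0 + offset), where offset = log(exposure)."""
    return math.exp(b0 + math.log(exposure))


# Hypothetical data: claims observed over different policy durations (years).
claims = [0, 2, 1, 3]
policy_years = [0.5, 2.0, 1.0, 2.5]

b0 = fit_rate(claims, policy_years)          # log of claims per policy-year
expected_two_years = predict_count(b0, 2.0)  # expected claims over 2 years
```

Because the offset has a fixed coefficient of 1, doubling the exposure doubles the expected count, so the model is effectively learning a rate rather than a raw count.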
I bet that just loading the data into our platform and hitting start will do the trick 9/10 times. Sometimes, based on EDA of the target, the recommended optimization metric will already be set for you.
Robot 2 thoughts?