Poisson optimization metric
Robot 1
Hello team, other than setting up the project as a regression problem, do we have any suggestions for the following?
- Does DataRobot have any plans to explicitly support modeling count data?
- How does DataRobot suggest customers model counts if they want to use the platform?
This has come up a few times, including in another project yesterday where the response was a count; they ignored that and just uploaded the data and modeled it as regression.
Robot 1
The real question I’m asking is whether any of the blueprints that hint at modeling counts actually model counts. I think the XGB + Poisson loss ones do. Also, GLM-based blueprints (like elastic net) naturally support Poisson/NB distributions, but I wasn’t sure whether DataRobot supports those.
Robot 2
Use Poisson as the project metric! DataRobot has great support for count data. You don’t need to worry about log-transforming the data: we handle the link function for you.
We have Poisson GLMs, Poisson XGBoost, and Poisson neural networks for modeling count data! They work great!
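For anyone curious what "handling the link function" means in general, here is a minimal sketch in plain Python. It is a generic statistical illustration, not DataRobot's implementation: the function names `poisson_deviance` and `fit_poisson_glm` and the toy data are hypothetical. It fits log(mu) = b0 + b1*x directly on raw counts with Newton's method and scores the fit with the standard mean Poisson deviance, assuming that is what a Poisson metric measures.

```python
import math

def poisson_deviance(y_true, y_pred):
    """Mean Poisson deviance: 2 * mean(y*log(y/mu) - (y - mu)), taking 0*log(0) = 0."""
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (term - (y - mu))
    return total / len(y_true)

def fit_poisson_glm(x, y, n_iter=25):
    """Fit log(mu) = b0 + b1*x by Newton's method on the Poisson log-likelihood."""
    # Start from the intercept-only fit: mu equals the mean count everywhere.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        # Inverse log link: predictions are always positive.
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Gradient of the log-likelihood with respect to (b0, b1).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Fisher information matrix (2x2), solved in closed form.
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data (made up for illustration): raw, untransformed counts.
x = [0, 1, 2, 3, 4, 5]
y = [1, 1, 3, 4, 8, 12]

b0, b1 = fit_poisson_glm(x, y)
mu = [math.exp(b0 + b1 * xi) for xi in x]
baseline = [sum(y) / len(y)] * len(y)

print("fitted deviance:  ", poisson_deviance(y, mu))
print("baseline deviance:", poisson_deviance(y, baseline))
```

The target is never log-transformed; the log link lives inside the model, so predictions come back on the count scale and stay positive.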
Robot 2
We also support weights, offsets, and exposure for projects that model counts (e.g., projects using Poisson loss).
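As a rough illustration of why exposure matters, here is a small sketch of the usual textbook treatment, where exposure enters a Poisson model as an offset: log(mu_i) = log(exposure_i) + b0. This is an assumption about the general technique, not a description of how the platform implements it, and the helper names `fit_rate_with_exposure` and `predict_counts` and the data are hypothetical.

```python
import math

def fit_rate_with_exposure(counts, exposure):
    """Intercept-only Poisson model with offset log(exposure).

    For log(mu_i) = log(exposure_i) + b0, the maximum-likelihood estimate is
    b0 = log(sum(counts) / sum(exposure)), i.e. the log of the pooled rate.
    """
    return math.log(sum(counts) / sum(exposure))

def predict_counts(b0, exposure):
    """Expected count per row: exposure times the fitted rate exp(b0)."""
    return [e * math.exp(b0) for e in exposure]

# Toy data (made up): event counts and the time each row was observed.
counts = [2, 0, 1, 3]
exposure = [1.0, 0.5, 0.5, 2.0]

b0 = fit_rate_with_exposure(counts, exposure)
rate = math.exp(b0)
expected = predict_counts(b0, exposure)

print("rate per unit exposure:", rate)
print("expected counts:", expected)
```

Rows observed for longer are expected to produce proportionally more events, which is what separates modeling a rate with exposure from modeling the raw count alone.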
Robot 3
I bet that just loading the data into our platform and hitting start will do the trick 9/10 times. Based on the EDA of the target, the recommended optimization metric will sometimes already be set up for you.
@Robot 2, thoughts?
Robot 2
I agree!