Description
LightGBM 2.3.1 has some parameters and features that are missing from Microsoft.ML.LightGBM. One such feature is refitting / re-training:
- https://lightgbm.readthedocs.io/en/latest/pythonapi/lightgbm.Booster.html#lightgbm.Booster.refit
- How does refit work? microsoft/LightGBM#1473
- https://docs.microsoft.com/en-us/dotnet/machine-learning/how-to-guides/retrain-model-ml-net
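For context, my understanding of the linked refit discussion is that refit keeps the tree structure and only updates the leaf values, blending each old leaf value with the value fitted on the new data via `decay_rate` (the Python `Booster.refit` has a `decay_rate` parameter, default 0.9). Below is a minimal pure-Python sketch of that blending step only. The helper name `blend_leaf_values` and the plain-list representation of leaf values are my own assumptions for illustration, not LightGBM API.

```python
from typing import List, Sequence


def blend_leaf_values(
    old_values: Sequence[float],
    new_values: Sequence[float],
    decay_rate: float = 0.9,
) -> List[float]:
    """Blend per-leaf outputs the way refit is described to work.

    Hypothetical helper, not part of LightGBM. Assumes the update rule
        leaf = decay_rate * old_leaf + (1 - decay_rate) * new_leaf
    where new_leaf is the value the same leaf would get from the new data.
    """
    if not 0.0 <= decay_rate <= 1.0:
        raise ValueError("decay_rate must be in [0, 1]")
    if len(old_values) != len(new_values):
        raise ValueError("old and new leaf value lists must have the same length")
    # The tree structure is unchanged; only the leaf outputs move.
    return [
        decay_rate * old + (1.0 - decay_rate) * new
        for old, new in zip(old_values, new_values)
    ]


# decay_rate=1.0 keeps the old model; decay_rate=0.0 uses only the new data.
blended = blend_leaf_values([1.0, 2.0], [3.0, 4.0], decay_rate=0.5)
```

With the real library, the corresponding call would be `booster.refit(data, label, decay_rate=...)` on an existing `lightgbm.Booster`; as far as I can tell there is nothing equivalent in Microsoft.ML.LightGBM.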
Re-fitting is missing from all tree-based trainers in Microsoft.ML. It has been requested at least in #6010, albeit for FastTree, and it would also be useful for some of my use cases. Another useful feature would be GPU support, which is mentioned in multiple issues, e.g. #452.
There was also a PR to upgrade LightGBM to the 3.x versions, but it was decided not to upgrade at this time. The upgrade requirements are listed in issue #5447. .NET might be missing out on the improvements, and the current version throws "Bad allocation" errors with some combinations of hyperparameters.
In my experiments, FastTree often outperforms LightGBM. Based on the sentiment in online discussions about LightGBM, this should not be the case. LightGBM is considered a high-performing algorithm and is under continued development, which is why I think the LightGBM integration should be important for the performance of Microsoft.ML.
Finally, there may be some misconfiguration that prevents Microsoft.ML.LightGBM from getting results similar to the Python interface with the same hyperparameters (links in #6064, sample code: https://github.com/torronen/lightgbm-comparison).
Is the intent to have a complete LightGBM interface in .NET, or is it better to use Python for advanced use cases (and e.g. export to ONNX from Python)? Is there a roadmap or an estimated priority for upgrading LightGBM?