Incorrect model returned by LightGBM calibration for multiclass with softmax

https://github.com/dotnet/machinelearning/blob/5123aeedb91a435236e8b94ba0f17a6d15a20db5/src/Microsoft.ML.LightGBM/LightGbmMulticlassTrainer.cs#L104

The model returned here is correct only when the objective type is `multiclassova`, because the `OvaPredictor` created here always normalises the `_numClass` probabilities so that they sum to one:
https://github.com/dotnet/machinelearning/blob/5123aeedb91a435236e8b94ba0f17a6d15a20db5/src/Microsoft.ML.StandardLearners/Standard/MultiClass/Ova.cs#L575

However, when the objective type is `multiclass` (i.e., softmax), the `OvaPredictor` should instead be given an array of predictors that return the _raw_ scores from the trees (i.e., the result of `CreateBinaryPredictor` rather than the `FeatureWeightsCalibratedPredictor`). Rather than normalising the outputs of these predictors, it should compute the softmax of the raw scores, as the LightGBM implementation does:
https://github.com/Microsoft/LightGBM/blob/087d30623aef4b50592cf528eb50c1c55baeaab8/src/objective/multiclass_objective.hpp#L115
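To illustrate why the two transforms are not interchangeable, here is a minimal Python sketch (not the ML.NET or LightGBM code). It assumes the calibrated per-class predictors behave like a plain sigmoid on the raw score, which is a simplification; the function names and the sample scores are made up for illustration.

```python
import math


def softmax(raw_scores):
    """Softmax over raw per-class scores (what the `multiclass` objective expects)."""
    # Subtract the max for numerical stability; this does not change the result.
    m = max(raw_scores)
    exps = [math.exp(s - m) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid_then_normalise(raw_scores):
    """Assumed stand-in for the OVA path: per-class sigmoid, then normalise to sum to one."""
    probs = [1.0 / (1.0 + math.exp(-s)) for s in raw_scores]
    total = sum(probs)
    return [p / total for p in probs]


# Hypothetical raw scores for one example with three classes.
raw_scores = [2.0, 0.0, -1.0]

softmax_probs = softmax(raw_scores)
ova_probs = sigmoid_then_normalise(raw_scores)

print("softmax:             ", [round(p, 4) for p in softmax_probs])
print("sigmoid + normalise: ", [round(p, 4) for p in ova_probs])
```

Both outputs sum to one and rank the classes in the same order here, but the probabilities themselves differ substantially, so any metric that depends on the probability values (log-loss, for instance) would be affected.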

This kind of bug would be easily detected by unit tests that compare the output of the managed ML.NET GBDTs against the original native implementation (i.e., by calling `LGBM_BoosterPredictForMat` on the Booster object used to calibrate the model).
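The core of such a test is just an element-wise comparison within a tolerance. A rough Python sketch of that check follows; the helper names, the tolerance, and the sample numbers are all hypothetical placeholders, and in a real test the two inputs would come from the managed predictor and from `LGBM_BoosterPredictForMat` respectively.

```python
def max_abs_difference(managed_rows, native_rows):
    """Largest absolute difference between two sets of per-class predictions."""
    if len(managed_rows) != len(native_rows):
        raise ValueError("row counts differ")
    worst = 0.0
    for managed, native in zip(managed_rows, native_rows):
        if len(managed) != len(native):
            raise ValueError("class counts differ")
        for m, n in zip(managed, native):
            worst = max(worst, abs(m - n))
    return worst


def predictions_match(managed_rows, native_rows, tolerance=1e-6):
    """True when every managed prediction is within `tolerance` of the native one."""
    return max_abs_difference(managed_rows, native_rows) <= tolerance


# Placeholder values standing in for real model output (one row, three classes).
native_rows = [[0.84, 0.11, 0.05]]
managed_ok = [[0.84, 0.11, 0.05]]
managed_bad = [[0.53, 0.30, 0.17]]

print(predictions_match(managed_ok, native_rows))
print(predictions_match(managed_bad, native_rows))
```

Comparing probabilities rather than only the predicted class matters here, since a wrong output transform can leave the arg-max unchanged while still producing incorrect probabilities.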
