Incorrect model returned by LightGBM calibration for multiclass with softmax · Issue #1424 · dotnet/machinelearning · GitHub
Incorrect model returned by LightGBM calibration for multiclass with softmax #1424

@mjmckp

return OvaPredictor.Create(Host, predictors);

The model returned here is only correct if the objective type is multiclassova, because the OvaPredictor created here always just normalises the _numClass probabilities so that they sum to one.

However, when the objective type is multiclass (i.e., softmax), the OvaPredictor should instead be given an array of predictors that return the raw scores from each of the trees (i.e., the result of CreateBinaryPredictor rather than the FeatureWeightsCalibratedPredictor). Instead of normalising the outputs of these predictors, it should compute the softmax function of the raw scores, as is done in the LightGBM implementation:
https://github.com/Microsoft/LightGBM/blob/087d30623aef4b50592cf528eb50c1c55baeaab8/src/objective/multiclass_objective.hpp#L115
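To make the discrepancy concrete, here is a minimal numeric sketch in Python (not ML.NET or LightGBM code). It assumes the calibrated per-class predictors behave like a unit-slope sigmoid applied to the raw score, which may not match the actual calibrator; the function names and the raw scores are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(x):
    # Assumption: the per-class calibrator acts like a unit-slope sigmoid.
    return 1.0 / (1.0 + math.exp(-x))


def ova_normalise(raw_scores):
    """Per-class probabilities rescaled to sum to one (the behaviour described for multiclassova)."""
    probs = [sigmoid(s) for s in raw_scores]
    total = sum(probs)
    return [p / total for p in probs]


def softmax(raw_scores):
    """Softmax over the raw scores (what the multiclass objective expects)."""
    m = max(raw_scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in raw_scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical raw scores for a three-class problem.
raw = [2.0, 0.0, -1.0]
ova = ova_normalise(raw)
soft = softmax(raw)

print("normalised sigmoids:", ova)
print("softmax:            ", soft)
```

Both outputs sum to one and rank the classes in the same order, but the probabilities themselves differ, so any metric that depends on the probability values (log-loss, for example) would be affected.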

This kind of bug would be easily detected if there were unit tests comparing the results of evaluating the managed ML.NET GBDTs against the original native implementation (i.e., calling LGBM_BoosterPredictForMat on the Booster object used to calibrate the model).
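The shape of such a parity check could look roughly like the following Python sketch. The helper names and the tolerance are hypothetical; in a real test, `native` would hold the output of LGBM_BoosterPredictForMat and `managed` the output of the ML.NET predictor for the same rows, whereas here both are made-up values.

```python
def max_abs_diff(managed, native):
    """Largest element-wise difference between two row-major prediction matrices."""
    if len(managed) != len(native):
        raise ValueError("row counts differ")
    worst = 0.0
    for row_m, row_n in zip(managed, native):
        if len(row_m) != len(row_n):
            raise ValueError("class counts differ")
        for a, b in zip(row_m, row_n):
            worst = max(worst, abs(a - b))
    return worst


def assert_parity(managed, native, tol=1e-6):
    """Fail if any managed prediction deviates from the native one by more than tol."""
    diff = max_abs_diff(managed, native)
    assert diff <= tol, f"managed and native predictions differ by {diff}"


# Made-up predictions for two rows and three classes.
native = [[0.84, 0.11, 0.05], [0.20, 0.70, 0.10]]
managed_ok = [[0.84, 0.11, 0.05], [0.20, 0.70, 0.10]]
managed_bad = [[0.53, 0.30, 0.17], [0.20, 0.70, 0.10]]

assert_parity(managed_ok, native)  # passes silently
```

Calling `assert_parity(managed_bad, native)` raises an `AssertionError`, which is the failure such a test would have surfaced for the softmax case.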
