Book Review: Ensemble Methods: Foundations and Algorithms

Ensemble Methods:
Foundations and Algorithms
by Zhi-Hua Zhou (ISBN 978-1-4398-3003-1)

REVIEWED BY DIRK VAN DEN POEL

Ensemble methods train multiple learners and then combine them for use. They have become a hot topic in academia since the 1990s and are enjoying increased attention in industry. This is mainly due to their generalization ability, which is often much stronger than that of single/base learners. Ensemble methods are able to boost weak learners, which are only slightly better than random guessing, into strong learners, which can make very accurate predictions.

Zhi-Hua Zhou's "Ensemble Methods: Foundations and Algorithms" starts off in Chapter 1 with a brief introduction to the basics, discussing nomenclature and the basic classifiers, including naive Bayes, SVM, k-NN, decision trees, etc.

The real ensemble content kicks off with a discussion of Boosting (Chapter 2), followed by Bagging (Chapter 3). These two chapters form the heart of the book; hence they discuss the topic in detail. The boosting chapter explains the basic idea, which starts by fitting one learner and then correcting its "mistakes" in subsequent learners. AdaBoost, the best-known representative of these residual-decreasing methods, is explained in depth in Chapter 2; it is an example of a sequential ensemble method. Error bounds of the final combined learner are discussed based on the errors of its weak base learners. Mostly, the book first explains the binary classification problem and then ventures into multi-class extensions (one-versus-all, one-versus-one approaches), in this case also for multiclass AdaBoost. It is well known that the algorithm suffers from noisy data; hence, the remainder of the chapter mainly focuses on how the algorithm can be made less vulnerable to noisy data.

Chapter 3 details the Bagging idea (Bootstrap AGGregatING), which is a parallel ensemble method and lends itself ideally to parallel computing. Bagging uses bootstrap sampling (i.e., composing a new dataset of the same size by sampling with replacement from the base dataset). It builds on the idea that the combination of independent base learners leads to a substantial decrease of errors, and therefore we want to obtain base learners that are as independent as possible. The bootstrapping brings a nice side-benefit: thanks to sampling with replacement, about 37% of the base dataset remains unused, i.e., out-of-bag validation performance can be computed to assess the quality of the learner. Talking about bagging would not be complete without Random Forest, Breiman's random tree ensemble, which can also be found in the book.

Chapter 4 talks about combination methods, which form the basis for achieving strong generalization ability. The author starts with the most prominent forms of combination: averaging (simple, weighted, etc.) for regression, and voting (majority, weighted, plurality, etc.) for classification. Next comes stacking (also known as constructing a meta-learner); the idea of stacking is to train the first-level learners on the original training dataset, and then generate a new dataset for training the second-level (meta) learner, where the outputs of the first-level learners are regarded as input features. The author then goes on to discuss a number of other combination methods: algebraic methods, the Behavior Knowledge Space (BKS) method, and the decision template method.

Diversity is the foundation on which the performance of ensembles is built. Hence, the book devotes an entire chapter (5) to this topic, providing a wealth of information on diversity measures.

Chapter 6 is devoted to ensemble pruning: instead of using all learners, why not use a subset of them? Generally, it is better to retain some accurate learners together with some not-that-good but complementary learners. The author discusses ordering-based pruning, clustering-based pruning, and optimization-based pruning.

In Chapter 7 the book discusses clustering ensembles. These aim to improve clustering quality, clustering robustness, etc., although their original motivation was to enable knowledge reuse and distributed computing. The author discusses similarity-based methods, graph-based methods, relabeling-based methods, and transformation-based methods.

Finally, Chapter 8 discusses advanced topics such as semi-supervised learning with ensembles, active learning, and class-imbalance learning.
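The boosting idea summarized above (fit one learner, then up-weight its mistakes for the next) can be sketched as a minimal AdaBoost over decision stumps. This is an illustrative reconstruction, not the book's pseudo-code; the exhaustive stump search and the 1e-12 floor are choices of the sketch:

```python
import numpy as np

def adaboost_train(X, y, n_rounds=20):
    """AdaBoost with decision stumps; labels y must be in {-1, +1}."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)          # start with uniform instance weights
    ensemble = []                    # list of (alpha, feature, threshold, polarity)
    for _ in range(n_rounds):
        # Exhaustively pick the stump with the lowest weighted error.
        best = (np.inf, 0, 0.0, 1)
        for j in range(d):
            for t in np.unique(X[:, j]):
                for pol in (1, -1):
                    pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
                    err = w[pred != y].sum()
                    if err < best[0]:
                        best = (err, j, t, pol)
        err, j, t, pol = best
        if err >= 0.5:               # weak learner no better than chance: stop
            break
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))  # learner weight
        ensemble.append((alpha, j, t, pol))
        pred = np.where(pol * (X[:, j] - t) >= 0, 1, -1)
        w = w * np.exp(-alpha * y * pred)   # up-weight the misclassified instances
        w = w / w.sum()
        if err < 1e-12:              # a perfect stump leaves nothing to correct
            break
    return ensemble

def adaboost_predict(ensemble, X):
    """Weighted vote of the stumps: sign of the alpha-weighted score."""
    score = np.zeros(len(X))
    for alpha, j, t, pol in ensemble:
        score += alpha * np.where(pol * (X[:, j] - t) >= 0, 1, -1)
    return np.sign(score)
```

The sequential reweighting is the defining trait: each round's weights depend on the previous round's errors, which is why boosting, unlike bagging, does not parallelize across rounds.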
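The "about 37% unused" side-benefit of bootstrap sampling follows from each instance being missed by one bootstrap sample with probability (1 - 1/n)^n, which tends to 1/e ≈ 0.368. A quick sketch (the sample size n and the seed are arbitrary choices for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000                      # size of a hypothetical base dataset

# One bootstrap sample: n draws with replacement from the n instances.
# In bagging, each such sample trains one base learner.
idx = rng.integers(0, n, size=n)

# Fraction of the base dataset never drawn: the out-of-bag portion,
# usable as a validation set for the learner trained on this sample.
oob_fraction = 1 - np.unique(idx).size / n
print(oob_fraction)              # close to 1/e ~ 0.368
```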
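The averaging and voting combiners of Chapter 4 fit in a few lines; the helper names here are mine, not the book's:

```python
import numpy as np
from collections import Counter

def average_combine(preds, weights=None):
    """Weighted averaging for regression: one row of predictions per base learner."""
    preds = np.asarray(preds, dtype=float)
    if weights is None:                       # simple averaging by default
        weights = np.full(len(preds), 1.0 / len(preds))
    return weights @ preds

def plurality_vote(labels):
    """Plurality voting for classification: most common label per instance."""
    labels = np.asarray(labels)               # one row of labels per base learner
    return [Counter(labels[:, i]).most_common(1)[0][0]
            for i in range(labels.shape[1])]
```

Weighted voting and weighted averaging differ only in whether the combined quantities are class labels or real-valued outputs; majority voting is the two-class special case of plurality voting.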
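The stacking data flow (first-level outputs become the meta-learner's input features) can be sketched as below. The two fixed "first-level learners" and the least-squares meta-learner are stand-ins chosen for the sketch; in practice the meta-training outputs are typically generated out-of-sample (e.g., via cross-validation) rather than on the same data the first-level learners were fit on:

```python
import numpy as np

# Hypothetical first-level "learners": fixed transforms standing in for
# trained models, just to illustrate the data flow.
level1 = [lambda X: X[:, 0], lambda X: X.mean(axis=1)]

def stack_features(X):
    # The first-level outputs form the new dataset for the meta-learner.
    return np.column_stack([h(X) for h in level1])

rng = np.random.default_rng(1)
X_train = rng.normal(size=(200, 3))
y_train = 2 * X_train[:, 0] + X_train.mean(axis=1)   # toy target

Z = stack_features(X_train)                  # second-level training data
meta_w, *_ = np.linalg.lstsq(Z, y_train, rcond=None) # fit the meta-learner

def stacked_predict(X):
    return stack_features(X) @ meta_w
```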

IEEE Intelligent Informatics Bulletin December 2012 Vol.13 No.1



In real-world applications, in addition to attaining good accuracy, the comprehensibility of the learned model is also important, and an ensemble aggregates multiple models. Among my favorite parts of the book is a discussion of alternative ways to achieve this objective, e.g., reduction of the ensemble to a single model.

It is always exciting to read a new book by a prominent researcher in the field, and Zhi-Hua Zhou's book certainly qualifies. Discussion in the book starts from a theoretical foundation, but the author also includes many references to successful applications, which makes it a good book for both the researcher and the practitioner. Moreover, this book is not written from a single point of view, but rather includes the perspectives of pattern recognition, data mining, and (to a lesser extent) statistics.

Important algorithms/approaches are presented in pseudo-code, which facilitates understanding. The author does not just provide the math, but also a clear explanation of the reasoning behind it. The discussion starts with the basic algorithm and then introduces a number of improvements that have been published in leading scientific journals. At the end of each chapter there is always a "further readings" section providing hints for further study of the literature.

What did I miss in this book? Some of the statistical methods (logistic regression), references to software, and hybrid ensembles. These should be seen as suggestions for a second edition, rather than as real problems. A book is always a compromise: unlike a website, a book has to be balanced, which means one cannot provide asymmetric depth across the different topics.

In sum, this book deserves a special place in my library. It is well written, provides a very clear explanation of the different ensemble approaches, including the intuition behind why some of the algorithms work so well, and, most of all, it offers a comprehensive overview of the alternative approaches (as opposed to the academic papers, where this material lies scattered across thousands of (small) contributions).

THE BOOK:

ZHI-HUA ZHOU (2012), ENSEMBLE METHODS: FOUNDATIONS AND ALGORITHMS, 236 P. BOCA RATON, FL: CHAPMAN & HALL/CRC. ISBN: 978-1-4398-3003-1

ABOUT THE REVIEWER:

DIRK VAN DEN POEL
Marketing Analytics at Faculty of Economics and Business Administration, Ghent University, Belgium. Contact him at: dirk.vandenpoel@ugent.be

