Represents a weighted ensemble of models.
#include <shark/Models/Ensemble.h>
Public Member Functions | |
std::string | name () const |
Returns the name of the object. | |
void | addModel (ModelType const &model, double weight=1.0) |
Adds a new model to the ensemble. | |
void | clearModels () |
Removes all models from the ensemble. | |
std::size_t | numberOfModels () const |
Returns the number of models. | |
std::remove_pointer< ModelType >::type & | model (std::size_t i) |
Returns a reference to the i-th model. | |
std::remove_pointer< ModelType >::type const & | model (std::size_t i) const |
Returns a const reference to the i-th model. | |
double const & | weight (std::size_t i) const |
Returns a const reference to the weight of the i-th model. | |
double & | weight (std::size_t i) |
Returns a mutable reference to the weight of the i-th model. | |
double | sumOfWeights () const |
Returns the total sum of weights used for averaging. | |
Represents a weighted ensemble of models.
In an ensemble, each model computes a response for an input independently. The responses are then pooled to form a single label. The hope is that the models in an ensemble do not make the same kinds of errors, so that the averaged response is more reliable. An example of this is AdaBoost, where a series of weak models is trained and weighted to create one final prediction.
There are two orthogonal aspects to consider in the Ensemble: the pooling function, which is chosen based on the output type of the ensemble models, and the mapping of the output of the pooling function to the model output.
If the ensemble consists of models returning vectors, pooling is implemented using weighted averaging. If the models return class labels, those are first transformed into a one-hot encoding before averaging. Thus the output can be interpreted as the probability of a class label when picking a member of the ensemble randomly with probability proportional to its weight.
The final mapping to the output is based on the OutputType template parameter, which by default is the same as the output type of the model. If it is unsigned int, the Ensemble is treated as a Classifier whose decision function is the result of the pooling function (i.e. the class with maximum response in the weighted average is chosen). In this case, Ensemble is derived from Classifier<>. Otherwise the weighted average is returned.
Note that there is an algorithm design decision to make for classifiers: we can either let each member of the Ensemble predict a class label and then pool the labels as described above, or we can create an ensemble of decision functions and weight them into one decision function that produces the class label. These approaches can lead to different results. For example, if the underlying models produce class probabilities, the class with the largest average probability might not be the same as the class with the most votes from the individual models.
Models are added using addModel. The ModelType can be either a concrete model type like LinearModel<>, in which case a copy of each added model is stored, or a pointer type, for example AbstractModel<...>*, in which case only pointers are stored and all added models must outlive the ensemble. This also entails differences in serialization. In the first case, the ensemble can be serialized completely without any setup. In the second case, the models must be constructed and added before deserializing.
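The pooling of class labels described above can be sketched in a few lines. This is a standalone illustration, not Shark's implementation: the function name `poolLabels` and its parameters are hypothetical, and dividing by the sum of weights is an assumption based on the probability interpretation given in the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of Shark: pools the class labels predicted by
// the ensemble members into a weighted average of their one-hot encodings.
std::vector<double> poolLabels(
    std::vector<unsigned int> const& labels,  // one predicted label per member
    std::vector<double> const& weights,       // one positive weight per member
    std::size_t numClasses
) {
    assert(labels.size() == weights.size());
    std::vector<double> pooled(numClasses, 0.0);
    double sumOfWeights = 0.0;
    for (std::size_t i = 0; i != labels.size(); ++i) {
        assert(labels[i] < numClasses);
        assert(weights[i] > 0.0);
        // Adding weights[i] to entry labels[i] is the same as adding
        // weights[i] times the one-hot encoding of labels[i].
        pooled[labels[i]] += weights[i];
        sumOfWeights += weights[i];
    }
    // Assumption: normalize so that the entries sum to one.
    if (sumOfWeights > 0.0) {
        for (double& p : pooled) p /= sumOfWeights;
    }
    return pooled;
}
```

With labels {0, 1, 1} and weights {2, 1, 1}, both classes receive a pooled value of 0.5: the single member voting for class 0 carries as much weight as the two members voting for class 1.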
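To make the difference between the two approaches concrete, the following standalone sketch compares them on per-model class probabilities. It does not use Shark; the names `argMax`, `classByPooledLabels` and `classByPooledDecisionFunctions` are hypothetical and chosen only for this illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

using Vector = std::vector<double>;

// Index of the largest entry; the first one wins on ties.
std::size_t argMax(Vector const& v) {
    assert(!v.empty());
    return static_cast<std::size_t>(
        std::distance(v.begin(), std::max_element(v.begin(), v.end())));
}

// Approach 1: every member first picks its own most likely class,
// then the weighted votes are pooled and the class with most votes wins.
std::size_t classByPooledLabels(std::vector<Vector> const& probs, Vector const& weights) {
    assert(!probs.empty() && probs.size() == weights.size());
    Vector votes(probs.front().size(), 0.0);
    for (std::size_t i = 0; i != probs.size(); ++i) {
        assert(probs[i].size() == votes.size());
        votes[argMax(probs[i])] += weights[i];
    }
    return argMax(votes);
}

// Approach 2: the class probabilities themselves are pooled with the weights,
// and the class with the largest pooled value wins. Dividing by the sum of
// weights is omitted because it does not change the arg max.
std::size_t classByPooledDecisionFunctions(std::vector<Vector> const& probs, Vector const& weights) {
    assert(!probs.empty() && probs.size() == weights.size());
    Vector pooled(probs.front().size(), 0.0);
    for (std::size_t i = 0; i != probs.size(); ++i) {
        assert(probs[i].size() == pooled.size());
        for (std::size_t c = 0; c != pooled.size(); ++c)
            pooled[c] += weights[i] * probs[i][c];
    }
    return argMax(pooled);
}
```

For three equally weighted models with class probabilities (0.9, 0.1), (0.4, 0.6) and (0.4, 0.6), pooling labels selects class 1 with two votes against one, while pooling the probabilities selects class 0, since 1.7 exceeds 1.3.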
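The ownership difference between value and pointer model types can be illustrated with a small standalone container. This is a sketch under assumptions, not Shark's code: `ToyModel` and `ToyEnsemble` are hypothetical names, and only the member signatures listed on this page (addModel, numberOfModels, model, sumOfWeights) are mirrored.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for a model; not a Shark class.
struct ToyModel {
    double offset;
};

// Minimal sketch of a container that stores ModelType either by value
// (ModelType = ToyModel) or as a non-owning pointer (ModelType = ToyModel*).
template <class ModelType>
class ToyEnsemble {
public:
    using Model = typename std::remove_pointer<ModelType>::type;

    void addModel(ModelType const& model, double weight = 1.0) {
        assert(weight > 0.0);
        m_models.push_back(model);  // copies the model, or only the pointer
        m_weights.push_back(weight);
    }
    std::size_t numberOfModels() const { return m_models.size(); }
    Model& model(std::size_t i) { return deref(m_models[i]); }
    double sumOfWeights() const {
        double sum = 0.0;
        for (double w : m_weights) sum += w;
        return sum;
    }

private:
    // Overloads that give uniform access to stored values and stored pointers.
    static Model& deref(Model& m) { return m; }
    static Model& deref(Model* m) { return *m; }

    std::vector<ModelType> m_models;
    std::vector<double> m_weights;
};
```

If a `ToyModel` is added to a `ToyEnsemble<ToyModel>` and to a `ToyEnsemble<ToyModel*>` and then modified, only the pointer-based ensemble sees the change. This is also why, in the pointer case, the referenced object has to stay alive for as long as the ensemble is used.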
Definition at line 252 of file Ensemble.h.
void addModel (ModelType const &model, double weight=1.0) |
inline |
Adds a new model to the ensemble.
model | the new model |
weight | the weight of the model; must be > 0 |
Definition at line 261 of file Ensemble.h.
References shark::Ensemble< ModelType, OutputType >::model(), and shark::Ensemble< ModelType, OutputType >::weight().
Referenced by shark::RFTrainer< unsigned int >::train(), and shark::RFTrainer< RealVector >::train().
void clearModels () |
inline |
Removes all models from the ensemble.
Definition at line 266 of file Ensemble.h.
Referenced by shark::RFTrainer< unsigned int >::train(), and shark::RFTrainer< RealVector >::train().
std::remove_pointer< ModelType >::type & model (std::size_t i) |
inline |
Returns a reference to the i-th model.
i | model index. |
Definition at line 278 of file Ensemble.h.
Referenced by shark::Ensemble< ModelType, OutputType >::addModel().
std::remove_pointer< ModelType >::type const & model (std::size_t i) const |
inline |
Returns a const reference to the i-th model.
i | model index. |
Definition at line 284 of file Ensemble.h.
std::string name () const |
inline virtual |
Returns the name of the object.
Reimplemented from shark::INameable.
Reimplemented in shark::RFClassifier< LabelType >.
Definition at line 254 of file Ensemble.h.
std::size_t numberOfModels () const |
inline |
Returns the number of models.
Definition at line 271 of file Ensemble.h.
double sumOfWeights () const |
inline |
Returns the total sum of weights used for averaging.
Definition at line 303 of file Ensemble.h.
|
inline |
Returns the weight of the i-th model.
i | model index. |
Definition at line 298 of file Ensemble.h.
|
inline |
Returns the weight of the i-th model.
i | model index. |
Definition at line 291 of file Ensemble.h.
Referenced by shark::Ensemble< ModelType, OutputType >::addModel().