Kernel Target Alignment - a measure of alignment of a kernel Gram matrix with labels. More...
#include <shark/ObjectiveFunctions/KernelTargetAlignment.h>
Public Member Functions | |
KernelTargetAlignment (LabeledData< InputType, LabelType > const &dataset, AbstractKernelFunction< InputType > *kernel, bool centering=true) | |
Construction of the Kernel Target Alignment (KTA) from a kernel object. | |
std::string | name () const |
From INameable: return the class name. | |
SearchPointType | proposeStartingPoint () const |
Return the current kernel parameters as a starting point for an optimization run. | |
std::size_t | numberOfVariables () const |
Accesses the number of variables. | |
double | eval (RealVector const &input) const |
Evaluate the (centered, negative) Kernel Target Alignment (KTA). | |
ResultType | evalDerivative (const SearchPointType &input, FirstOrderDerivative &derivative) const |
Compute the derivative of the KTA as a function of the kernel parameters. | |
Public Member Functions inherited from shark::AbstractObjectiveFunction< RealVector, double > | |
const Features & | features () const |
virtual void | updateFeatures () |
bool | hasValue () const |
returns whether this function can calculate it's function value | |
bool | hasFirstDerivative () const |
returns whether this function can calculate the first derivative | |
bool | hasSecondDerivative () const |
returns whether this function can calculate the second derivative | |
bool | canProposeStartingPoint () const |
returns whether this function can propose a starting point. | |
bool | isConstrained () const |
returns whether this function can return | |
bool | hasConstraintHandler () const |
returns whether this function can return | |
bool | canProvideClosestFeasible () const |
Returns whether this function can calculate thee closest feasible to an infeasible point. | |
bool | isThreadSafe () const |
Returns true, when the function can be usd in parallel threads. | |
bool | isNoisy () const |
Returns true, when the function can be usd in parallel threads. | |
AbstractObjectiveFunction () | |
Default ctor. | |
virtual | ~AbstractObjectiveFunction () |
Virtual destructor. | |
virtual void | init () |
void | setRng (random::rng_type *rng) |
Sets the Rng used by the objective function. | |
virtual bool | hasScalableDimensionality () const |
virtual void | setNumberOfVariables (std::size_t numberOfVariables) |
Adjusts the number of variables if the function is scalable. | |
virtual std::size_t | numberOfObjectives () const |
virtual bool | hasScalableObjectives () const |
virtual void | setNumberOfObjectives (std::size_t numberOfObjectives) |
Adjusts the number of objectives if the function is scalable. | |
std::size_t | evaluationCounter () const |
Accesses the evaluation counter of the function. | |
AbstractConstraintHandler< SearchPointType > const & | getConstraintHandler () const |
Returns the constraint handler of the function if it has one. | |
virtual bool | isFeasible (const SearchPointType &input) const |
Tests whether a point in SearchSpace is feasible, e.g., whether the constraints are fulfilled. | |
virtual void | closestFeasible (SearchPointType &input) const |
If supported, the supplied point is repaired such that it satisfies all of the function's constraints. | |
ResultType | operator() (SearchPointType const &input) const |
Evaluates the function. Useful together with STL-Algorithms like std::transform. | |
virtual ResultType | evalDerivative (SearchPointType const &input, SecondOrderDerivative &derivative) const |
Evaluates the objective function and calculates its gradient. | |
Public Member Functions inherited from shark::INameable | |
virtual | ~INameable () |
Additional Inherited Members | |
Public Types inherited from shark::AbstractObjectiveFunction< RealVector, double > | |
enum | Feature |
List of features that are supported by an implementation. More... | |
typedef RealVector | SearchPointType |
typedef double | ResultType |
typedef boost::mpl::if_< std::is_arithmetic< double >, SearchPointType, RealMatrix >::type | FirstOrderDerivative |
typedef TypedFlags< Feature > | Features |
This statement declares the member m_features. See Core/Flags.h for details. | |
typedef TypedFeatureNotAvailableException< Feature > | FeatureNotAvailableException |
Protected Member Functions inherited from shark::AbstractObjectiveFunction< RealVector, double > | |
void | announceConstraintHandler (AbstractConstraintHandler< SearchPointType > const *handler) |
helper function which is called to announce the presence of an constraint handler. | |
Protected Attributes inherited from shark::AbstractObjectiveFunction< RealVector, double > | |
Features | m_features |
std::size_t | m_evaluationCounter |
Evaluation counter, default value: 0. | |
AbstractConstraintHandler< SearchPointType > const * | m_constraintHandler |
random::rng_type * | mep_rng |
Kernel Target Alignment - a measure of alignment of a kernel Gram matrix with labels.
The Kernel Target Alignment (KTA) was originally proposed in the paper:
On Kernel-Target Alignment. N. Cristianini, J. Shawe-Taylor, A. Elisseeff, J. Kandola. Innovations in Machine Learning, 2006.
Here we provide a version with centering of the features as proposed in the paper:
Two-Stage Learning Kernel Algorithms. C. Cortes, M. Mohri, A. Rostamizadeh. ICML 2010.
The kernel target alignment is defined as where K is the kernel Gram matrix of the data and y is the vector of
\[ \hat A = \frac{\langle K, y y^T \rangle}{\sqrt{\langle K, K \rangle \cdot \langle y y^T, y y^T \rangle}} \]
+1/-1 valued labels. The outer product \(y y^T\) corresponds to an ideal Gram matrix corresponding to a kernel that maps the two classes each to a single point, thus minimizing within-class distance for fixed inter-class distance. The inner products denote the Frobenius product of matrices: http://en.wikipedia.org/wiki/Matrix_multiplication#Frobenius_product
In kernel-based learning, the kernel Gram matrix \(K\) is of the form
\[ K_{i,j} = k(x_i, x_j) = \langle \phi(x_i), \phi(x_j) \rangle \]
for a Mercer kernel function k and inputs \(x_i, x_j\). In this version of the KTA we use centered feature vectors. Let
\[ \psi(x_i) = \phi(x_i) - \frac{1}{\ell} \sum_{j=1}^{\ell} \phi(x_j) \]
denote the centered feature vectors, then the centered Gram matrix \(K^c\) is given by
\[ K^c_{i,j} = \langle \psi(x_i), \psi(x_j) \rangle = K_{i,j} - \frac{1}{\ell} \sum_{n=1}^\ell K_{i,n} + K_{j,n} + \frac{1}{\ell^2} \sum_{m,n=1}^\ell K_{n,m} \]
The alignment measure computed by this class is the exact same formula for \( \hat A \), but with \(K^c\) plugged in in place of \(K\).
KTA measures the Frobenius inner product between a kernel Gram matrix and this ideal matrix. The interpretation is that KTA measures how well a given kernel fits a classification problem. The actual measure is invariant under kernel rescaling. In Shark, objective functions are minimized by convention. Therefore the negative alignment \(- \hat A\) is implemented. The measure is extended for multi-class problems by using prototype vectors instead of scalar labels.
The following properties of KTA are important from a model selection point of view: it is relatively fast and easy to compute, it is differentiable w.r.t. the kernel function, and it is independent of the actual classifier.
Definition at line 95 of file KernelTargetAlignment.h.
|
inline |
Construction of the Kernel Target Alignment (KTA) from a kernel object.
Definition at line 101 of file KernelTargetAlignment.h.
References shark::AbstractObjectiveFunction< RealVector, double >::CAN_PROPOSE_STARTING_POINT, shark::AbstractObjectiveFunction< RealVector, double >::HAS_FIRST_DERIVATIVE, shark::AbstractObjectiveFunction< RealVector, double >::HAS_VALUE, shark::LabeledData< InputT, LabelT >::labels(), shark::AbstractObjectiveFunction< RealVector, double >::m_features, shark::LabeledData< InputT, LabelT >::numberOfElements(), and SHARK_RUNTIME_CHECK.
|
inlinevirtual |
Evaluate the (centered, negative) Kernel Target Alignment (KTA).
See the class description for more details on this computation.
Reimplemented from shark::AbstractObjectiveFunction< RealVector, double >.
Definition at line 140 of file KernelTargetAlignment.h.
References shark::IParameterizable< VectorType >::setParameterVector().
|
inlinevirtual |
Compute the derivative of the KTA as a function of the kernel parameters.
It holds:
\[ \langle K^c, K^c \rangle = \langle K,K \rangle -2 \ell \langle k,k \rangle + mk^2 \ell^2 \\ (\langle K^c, K^c \rangle )' = 2 \langle K,K' \rangle -4\ell \langle k, \frac{1}{\ell} \sum_j K'_ij \rangle +2 \ell^2 mk \sum_ij 1/(\ell^2) K'_ij \\ = 2 \langle K,K' \rangle -4 \langle k, \sum_j K'_ij \rangle + 2 mk \sum_ij K_ij \\ = 2 \langle K,K' \rangle -4 \langle k u^T, K' \rangle + 2 mk \langle u u^T, K' \rangle \\ = 2\langle K -2 k u^T + mk u u^T, K' \rangle ) \\ \langle Y, K^c \rangle = \langle Y, K \rangle - 2 n \langle y, k \rangle + n^2 my mk \\ (\langle Y , K^c \rangle )' = \langle Y -2 y u^T + my u u^T, K' \rangle \]
now the derivative is computed from this values in a second sweep over the data: we get:
\[ \hat A' = 1/\langle K^c,K^c \rangle ^{3/2} (\langle K^c,K^c \rangle (\langle Y,K^c \rangle )' - 0.5*\langle Y, K^c \rangle (\langle K^c , K^c \rangle )') \\ = 1/\langle K^c,K^c \rangle ^{3/2} \langle \langle K^c,K^c \rangle (Y -2 y u^T + my u u^T)- \langle Y, K^c \rangle (K -2 k u^T+ mk u u^T),K' \rangle \\ = 1/\langle K^c,K^c \rangle ^{3/2} \langle W,K' \rangle \]
reordering rsults in
\[ W= \langle K^c,K^c \rangle Y - \langle Y, K^c \rangle K \\ - 2 (\langle K^c,K^c \rangle y - \langle Y, K^c \rangle k) u^T \\ + (\langle K^c,K^c \rangle my - \langle Y, K^c \rangle mk) u u^T \]
where \( K' \) is the derivative of K with respct of the kernel parameters.
Reimplemented from shark::AbstractObjectiveFunction< RealVector, double >.
Definition at line 166 of file KernelTargetAlignment.h.
References shark::LabeledData< InputT, LabelT >::batch(), shark::batchSize(), shark::AbstractKernelFunction< InputTypeT >::createState(), shark::AbstractKernelFunction< InputTypeT >::eval(), shark::LabeledData< InputT, LabelT >::numberOfBatches(), shark::IParameterizable< VectorType >::numberOfParameters(), shark::IParameterizable< VectorType >::setParameterVector(), SHARK_CRITICAL_REGION, SHARK_PARALLEL_FOR, and shark::AbstractKernelFunction< InputTypeT >::weightedParameterDerivative().
|
inlinevirtual |
From INameable: return the class name.
Reimplemented from shark::INameable.
Definition at line 124 of file KernelTargetAlignment.h.
|
inlinevirtual |
Accesses the number of variables.
Implements shark::AbstractObjectiveFunction< RealVector, double >.
Definition at line 133 of file KernelTargetAlignment.h.
References shark::IParameterizable< VectorType >::numberOfParameters().
|
inlinevirtual |
Return the current kernel parameters as a starting point for an optimization run.
Reimplemented from shark::AbstractObjectiveFunction< RealVector, double >.
Definition at line 128 of file KernelTargetAlignment.h.