Module Ridge

operalib.ridge implements Operator-Valued Kernel ridge regression.

class operalib.ridge.OVKDecomposableRidge(input_kernel='Gauss', A=None, lbda=1e-05, gamma=None, theta=0.7, period='autocorr', autocorr_params=None)[source]

Operator-Valued kernel ridge regression.

Operator-Valued kernel ridge regression (OVKRR) combines ridge regression (linear least squares with l2-norm regularization) with the (OV)kernel trick. It learns a linear function in the space induced by the respective kernel and the data. For non-linear kernels, this corresponds to a non-linear function in the original space.

This is a simplified version of OVKRidge handling only the decomposable kernel. This restriction allows the optimality condition to be rewritten as a Sylvester system of equations, reducing the complexity to

\mathcal{O}(n^3) + \mathcal{O}(p^3)

where n is the number of training points and p the number of outputs.
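
As a minimal sketch (not operalib's implementation; the helper name and the exact scaling of lbda are assumptions), the Sylvester-type optimality condition K C A + lbda * C = Y of a decomposable kernel k(x, x') A can be solved at this cost with one eigendecomposition per factor:

import numpy as np

def solve_decomposable_ridge(K, A, Y, lbda):
    # K: (n, n) scalar Gram matrix, A: (p, p) symmetric output operator, Y: (n, p) targets.
    d, U = np.linalg.eigh(K)               # O(n^3)
    s, V = np.linalg.eigh(A)               # O(p^3)
    Yt = U.T @ Y @ V                       # rotate targets into the eigenbases
    Ct = Yt / (np.outer(d, s) + lbda)      # elementwise solve of the diagonalized system
    return U @ Ct @ V.T                    # rotate the dual coefficients back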

Optimization problem solved for learning:

\min_{h \in \mathcal{H}} \quad \frac{\lambda}{2} \|h\|_{\mathcal{H}}^2
+ \frac{1}{np} \sum_{i=1}^n \|y_i - h(x_i)\|_{\mathcal{Y}}^2
+ \frac{\lambda_m}{2} \sum_{i,j=1}^n W_{ij} \|h(x_i) - h(x_j)\|_{\mathcal{Y}}^2

Examples

>>> import operalib as ovk
>>> import numpy as np
>>> n_samples, n_features, n_targets = 10, 5, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples, n_targets)
>>> X = rng.randn(n_samples, n_features)
>>> clf = ovk.OVKDecomposableRidge('Gauss', lbda=1.0)
>>> clf.fit(X, y)
OVKDecomposableRidge(...)
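
A hedged continuation of the example above (the shape follows from the API; the fitted values themselves depend on the random data):

>>> y_pred = clf.predict(X)
>>> y_pred.shape
(10, 5)
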
Attributes:
dual_coef_ : array, shape = [n_samples, n_targets]

Weight vector(s) in kernel space.

linop_ : callable

Callable that maps the training points X to the Gram matrix (represented as a LinearOperator).

A_ : array, shape = [n_targets, n_targets]

Set when the linear operator used by the decomposable kernel is left at its default (None).

period_ : float

Set when the period of the first periodic kernel is ‘autocorr’.

Methods

fit(X, y) Fit OVK ridge regression model.
get_params([deep]) Get parameters for this estimator.
predict(X) Predict using the OVK ridge model.
score(X, y[, sample_weight]) Returns the coefficient of determination R^2 of the prediction.
set_params(**params) Set the parameters of this estimator.
fit(X, y)[source]

Fit OVK ridge regression model.

Parameters:
X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training data.

y : {array-like}, shape = [n_samples] or [n_samples, n_targets]

Target values. numpy.NaN for missing targets (semi-supervised learning).

Returns:
self : returns an instance of self.
get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

predict(X)[source]

Predict using the OVK ridge model.

Parameters:
X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Samples.

Returns:
C : {array}, shape = [n_samples] or [n_samples, n_targets]

Returns predicted values.

score(X, y, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.
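
For instance, computed exactly as defined above (a small illustration with made-up numbers, not output of the estimator):

>>> import numpy as np
>>> y_true = np.array([3.0, -0.5, 2.0, 7.0])
>>> y_pred = np.array([2.5, 0.0, 2.0, 8.0])
>>> u = ((y_true - y_pred) ** 2).sum()
>>> v = ((y_true - y_true.mean()) ** 2).sum()
>>> print(round(1 - u / v, 4))
0.9486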

Parameters:
X : array-like, shape = (n_samples, n_features)

Test samples.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True values for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns:
score : float

R^2 of self.predict(X) wrt. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
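
For instance (a hedged sketch: operalib estimators follow the scikit-learn estimator API, and the step name 'ridge' is just an assumption for illustration):

>>> import operalib as ovk
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.preprocessing import StandardScaler
>>> pipe = Pipeline([('scale', StandardScaler()),
...                  ('ridge', ovk.OVKDecomposableRidge('Gauss'))])
>>> pipe.set_params(ridge__lbda=0.1)  # reaches the nested estimator's lbda parameter
Pipeline(...)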

Returns:
self
class operalib.ridge.OVKRidge(ovkernel='DGauss', lbda=1e-05, lbda_m=0.0, A=None, gamma=None, gamma_m=None, theta=0.7, period='autocorr', autocorr_params=None, solver='L-BFGS-B', solver_params=None)[source]

Operator-Valued kernel ridge regression.

Operator-Valued kernel ridge regression (OVKRR) combines ridge regression (linear least squares with l2-norm regularization) with the (OV)kernel trick. It learns a linear function in the space induced by the respective kernel and the data. For non-linear kernels, this corresponds to a non-linear function in the original space.

Let n be the number of training points and p the number of outputs. This algorithm has a per-iteration complexity of

\mathcal{O}(pn^2) + \mathcal{O}(p^2n)

for the decomposable kernel. Hence, when the number of outputs is large, the solver OVKDecomposableRidge should be faster.
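
A rough sketch of where that count comes from (an illustration, not operalib's solver): applying the decomposable operator to the current dual coefficients amounts to two ordinary matrix products.

>>> import numpy as np
>>> n, p = 200, 10
>>> rng = np.random.RandomState(0)
>>> K = rng.randn(n, n); K = K @ K.T   # (n, n) scalar Gram matrix
>>> A = np.eye(p)                      # (p, p) output operator
>>> C = rng.randn(n, p)                # current dual coefficients
>>> HC = (K @ C) @ A                   # O(p n^2) for K @ C, then O(p^2 n) for the product with A
>>> HC.shape
(200, 10)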

The form of the model learned by OVKRR is identical to support vector regression (SVR). However, different loss functions are used: OVKRR uses squared error loss while support vector regression uses epsilon-insensitive loss, both combined with l2 regularization. In contrast to SVR, fitting an OVKRR model can be done in closed form and is typically faster for medium-sized datasets. On the other hand, the learned model is non-sparse and thus slower at prediction time than SVR, which learns a sparse model for epsilon > 0.

Optimization problem solved for learning:

\min_{h \in \mathcal{H}} \quad \frac{\lambda}{2} \|h\|_{\mathcal{H}}^2
+ \frac{1}{np} \sum_{i=1}^n \|y_i - h(x_i)\|_{\mathcal{Y}}^2
+ \frac{\lambda_m}{2} \sum_{i,j=1}^n W_{ij} \|h(x_i) - h(x_j)\|_{\mathcal{Y}}^2
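
Written out directly (a hedged sketch; the function name, the NaN handling, and the exact constants are assumptions, not operalib's internals), the objective value for predictions H = h(X) reads:

import numpy as np

def ovkrr_objective(H, Y, norm_h_sq, W, lbda, lbda_m):
    # H, Y: (n, p) predictions and targets; norm_h_sq stands for ||h||_H^2,
    # which depends on the kernel and the dual coefficients; W: (n, n)
    # similarity matrix of the manifold term.
    n, p = Y.shape
    data_fit = np.nansum((Y - H) ** 2) / (n * p)               # NaN targets drop out
    pair_sq = ((H[:, None, :] - H[None, :, :]) ** 2).sum(-1)   # ||h(x_i) - h(x_j)||^2
    return (0.5 * lbda * norm_h_sq
            + data_fit
            + 0.5 * lbda_m * (W * pair_sq).sum())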

Examples

>>> import operalib as ovk
>>> import numpy as np
>>> n_samples, n_features, n_targets = 10, 5, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples, n_targets)
>>> X = rng.randn(n_samples, n_features)
>>> clf = ovk.OVKRidge('DGauss', lbda=1.0)
>>> clf.fit(X, y)
OVKRidge(...)
Attributes:
dual_coef_ : array, shape = [n_samples, n_targets]

Weight vector(s) in kernel space.

linop_ : callable

Callable that maps the training points X to the Gram matrix (represented as a LinearOperator).

A_ : array, shape = [n_targets, n_targets]

Set when the linear operator used by the decomposable kernel is left at its default (None).

L_ : array, shape = [n_samples_miss, n_samples_miss]

Graph Laplacian of data with missing targets (semi-supervised learning).

period_ : float

Set when the period of the first periodic kernel is ‘autocorr’.

solver_res_ : any

Raw results returned by the solver.

Methods

fit(X, y) Fit OVK ridge regression model.
get_params([deep]) Get parameters for this estimator.
predict(X) Predict using the OVK ridge model.
score(X, y[, sample_weight]) Returns the coefficient of determination R^2 of the prediction.
set_params(**params) Set the parameters of this estimator.
fit(X, y)[source]

Fit OVK ridge regression model.

Parameters:
X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Training data.

y : {array-like}, shape = [n_samples] or [n_samples, n_targets]

Target values. Use numpy.NaN for missing targets (semi-supervised learning); see the sketch below.

Returns:
self : returns an instance of self.
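
A hedged usage sketch of the semi-supervised case mentioned above (the parameter values are illustrative assumptions): rows of y set to numpy.NaN are treated as unlabeled.

>>> import operalib as ovk
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X = rng.randn(20, 5)
>>> y = rng.randn(20, 3)
>>> y[10:] = np.nan                    # the last 10 samples have no targets
>>> model = ovk.OVKRidge('DGauss', lbda=1e-2, lbda_m=1e-2, gamma_m=1.0)  # gamma_m: bandwidth of the similarity used to build W (assumed)
>>> model.fit(X, y)
OVKRidge(...)
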
get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : boolean, optional

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : mapping of string to any

Parameter names mapped to their values.

predict(X)[source]

Predict using the OVK ridge model.

Parameters:
X : {array-like, sparse matrix}, shape = [n_samples, n_features]

Samples.

Returns:
C : {array}, shape = [n_samples] or [n_samples, n_targets]

Returns predicted values.

score(X, y, sample_weight=None)

Returns the coefficient of determination R^2 of the prediction.

The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a R^2 score of 0.0.

Parameters:
X : array-like, shape = (n_samples, n_features)

Test samples.

y : array-like, shape = (n_samples) or (n_samples, n_outputs)

True values for X.

sample_weight : array-like, shape = [n_samples], optional

Sample weights.

Returns:
score : float

R^2 of self.predict(X) wrt. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Returns:
self