Module Ridge¶
operalib.ridge implements Operator-Valued Kernel ridge regression.
class operalib.ridge.OVKDecomposableRidge(input_kernel='Gauss', A=None, lbda=1e-05, gamma=None, theta=0.7, period='autocorr', autocorr_params=None)[source]¶
Operator-Valued kernel ridge regression.
Operator-Valued kernel ridge regression (OVKRR) combines ridge regression (linear least squares with l2-norm regularization) with the (OV)kernel trick. It learns a linear function in the space induced by the respective kernel and the data. For non-linear kernels, this corresponds to a non-linear function in the original space.
This is a simplified version of OVKRidge handling only the decomposable kernel. This allows the optimality condition to be rewritten as a Sylvester system of equations, reducing the complexity to

.. math::

    \mathcal{O}(n^3) + \mathcal{O}(p^3)
where n is the number of training points and p the number of outputs.
Optimization problem solved for learning:

.. math::

    \min_{h \in \mathcal{H}}~
    \frac{\lambda}{2} \|h\|_{\mathcal{H}}^2 +
    \frac{1}{np} \sum_{i=1}^n \|y_i - h(x_i)\|_{\mathcal{Y}}^2 +
    \frac{\lambda_m}{2} \sum_{i,j=1}^n W_{ij}
    \|h(x_i) - h(x_j)\|_{\mathcal{Y}}^2
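The Sylvester reduction mentioned above can be sketched in a few lines of NumPy/SciPy. This is a minimal, illustrative sketch and not the operalib implementation: it assumes an optimality condition of the form K C A + n*lbda*C = Y for a decomposable kernel k(x, x')A without the semi-supervised term (the exact constant in front of C depends on how the loss is normalized), and the function name solve_decomposable_ridge and its arguments are hypothetical.

import numpy as np
from scipy.linalg import eigh

def solve_decomposable_ridge(K, A, Y, lbda):
    """Solve K C A + n*lbda*C = Y for the dual coefficients C.

    K: (n, n) scalar Gram matrix, A: (p, p) symmetric output operator,
    Y: (n, p) targets, lbda: ridge parameter.
    """
    n = K.shape[0]
    eva_a, U = eigh(A)   # O(p^3) eigendecomposition, A = U diag(eva_a) U^T
    eva_k, V = eigh(K)   # O(n^3) eigendecomposition, K = V diag(eva_k) V^T
    Yt = V.T @ Y @ U     # rotate the targets into both eigenbases
    # In the eigenbases the system decouples entrywise:
    # (eva_k[i] * eva_a[j] + n*lbda) * Ct[i, j] = Yt[i, j]
    Ct = Yt / (np.outer(eva_k, eva_a) + n * lbda)
    return V @ Ct @ U.T  # rotate back to the original bases

Diagonalizing A and K once costs O(p^3) + O(n^3); all remaining steps are matrix products, which matches the complexity quoted above.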
Examples
>>> import operalib as ovk
>>> import numpy as np
>>> n_samples, n_features, n_targets = 10, 5, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples, n_targets)
>>> X = rng.randn(n_samples, n_features)
>>> clf = ovk.OVKRidge('DGauss', lbda=1.0)
>>> clf.fit(X, y)
OVKRidge(...)
Attributes: - dual_coef_ : array, shape = [n_samples x n_targets]
Weight vector(s) in kernel space
- linop_ : callable
Callable that maps the training points X to the Gram matrix (the Gram matrix being a LinearOperator).
- A_ : array, shape = [n_targets, n_targets]
Set when the linear operator used by the decomposable kernel is left as the default or None.
- period_ : float
Set when the period used by the first periodic kernel is ‘autocorr’.
Methods
- fit(X, y): Fit OVK ridge regression model.
- get_params([deep]): Get parameters for this estimator.
- predict(X): Predict using the OVK ridge model.
- score(X, y[, sample_weight]): Returns the coefficient of determination R^2 of the prediction.
- set_params(**params): Set the parameters of this estimator.
fit(X, y)[source]¶
Fit OVK ridge regression model.
Parameters: - X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Training data.
- y : {array-like}, shape = [n_samples] or [n_samples, n_targets]
Target values. numpy.NaN for missing targets (semi-supervised learning).
Returns: - self : returns an instance of self.
get_params(deep=True)¶
Get parameters for this estimator.
Parameters: - deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: - params : mapping of string to any
Parameter names mapped to their values.
predict(X)[source]¶
Predict using the OVK ridge model.
Parameters: - X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Samples.
Returns: - C : {array}, shape = [n_samples] or [n_samples, n_targets]
Returns predicted values.
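For instance, a hypothetical continuation of the fitted estimator clf from the Examples section above:

>>> y_pred = clf.predict(X)  # array of shape (n_samples, n_targets)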
score(X, y, sample_weight=None)¶
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
Parameters: - X : array-like, shape = (n_samples, n_features)
Test samples.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs)
True values for X.
- sample_weight : array-like, shape = [n_samples], optional
Sample weights.
Returns: - score : float
R^2 of self.predict(X) wrt. y.
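The definition above can be checked by hand on a toy prediction; the arrays below are purely illustrative:

>>> import numpy as np
>>> y_true = np.array([[3.0, 1.0], [2.0, 0.5], [4.0, 2.0]])
>>> y_pred = np.array([[2.5, 0.8], [2.0, 0.7], [3.5, 2.1]])
>>> u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
>>> v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
>>> r2 = 1 - u / v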
set_params(**params)¶
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: - self
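An illustrative call, reusing the estimator clf from the Examples section:

>>> clf.set_params(lbda=0.1)
OVKRidge(...)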
class operalib.ridge.OVKRidge(ovkernel='DGauss', lbda=1e-05, lbda_m=0.0, A=None, gamma=None, gamma_m=None, theta=0.7, period='autocorr', autocorr_params=None, solver='L-BFGS-B', solver_params=None)[source]¶
Operator-Valued kernel ridge regression.
Operator-Valued kernel ridge regression (OVKRR) combines ridge regression (linear least squares with l2-norm regularization) with the (OV)kernel trick. It learns a linear function in the space induced by the respective kernel and the data. For non-linear kernels, this corresponds to a non-linear function in the original space.
Let n be the number of training points and p the number of outputs. This algorithm has a per-iteration complexity of

.. math::

    \mathcal{O}(pn^2) + \mathcal{O}(p^2n)
for the decomposable kernel. Hence, when the number of outputs is large, the solver OVKDecomposableRidge should be faster.
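A hedged illustration of this trade-off, choosing an estimator from the number of outputs (the threshold of 50 and the helper name make_ovk_ridge are arbitrary, illustrative choices):

from operalib.ridge import OVKDecomposableRidge, OVKRidge

def make_ovk_ridge(n_targets, lbda=1e-05):
    # With many outputs, the closed-form decomposable solver avoids the
    # O(p^2 n) per-iteration cost of the generic iterative solver.
    if n_targets > 50:
        return OVKDecomposableRidge(lbda=lbda)
    return OVKRidge('DGauss', lbda=lbda)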
The form of the model learned by OVKRR is identical to support vector regression (SVR). However, different loss functions are used: OVKRR uses squared error loss while support vector regression uses epsilon-insensitive loss, both combined with l2 regularization. In contrast to SVR, fitting an OVKRR model can be done in closed form and is typically faster for medium-sized datasets. On the other hand, the learned model is non-sparse and thus slower at prediction time than SVR, which learns a sparse model for epsilon > 0.
Optimization problem solved for learning:

.. math::

    \min_{h \in \mathcal{H}}~
    \frac{\lambda}{2} \|h\|_{\mathcal{H}}^2 +
    \frac{1}{np} \sum_{i=1}^n \|y_i - h(x_i)\|_{\mathcal{Y}}^2 +
    \frac{\lambda_m}{2} \sum_{i,j=1}^n W_{ij}
    \|h(x_i) - h(x_j)\|_{\mathcal{Y}}^2
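For a decomposable kernel k(x, x')A, this objective can be written as a plain function of the dual coefficients C, as an iterative solver such as L-BFGS-B would evaluate it. The sketch below is illustrative and not the operalib code; the argument names and the dense graph-Laplacian construction are assumptions.

import numpy as np

def ovkrr_objective(C, K, A, Y, W, lbda, lbda_m):
    """Objective above for h(x) = sum_j k(x, x_j) A c_j, C of shape (n, p)."""
    n, p = Y.shape
    pred = K @ C @ A                        # h(x_i), stacked row-wise
    norm_h = np.trace(C.T @ K @ C @ A)      # ||h||_H^2
    data_fit = np.sum((Y - pred) ** 2) / (n * p)
    L = np.diag(W.sum(axis=1)) - W          # graph Laplacian of the weights W
    manifold = np.trace(pred.T @ L @ pred)  # = 0.5 * sum_ij W_ij ||h(x_i) - h(x_j)||^2
    return lbda / 2 * norm_h + data_fit + lbda_m * manifold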
Examples
>>> import operalib as ovk
>>> import numpy as np
>>> n_samples, n_features, n_targets = 10, 5, 5
>>> rng = np.random.RandomState(0)
>>> y = rng.randn(n_samples, n_targets)
>>> X = rng.randn(n_samples, n_features)
>>> clf = ovk.OVKRidge('DGauss', lbda=1.0)
>>> clf.fit(X, y)
OVKRidge(...)
Attributes: - dual_coef_ : array, shape = [n_samples x n_targets]
Weight vector(s) in kernel space
- linop_ : callable
Callable that maps the training points X to the Gram matrix (the Gram matrix being a LinearOperator).
- A_ : array, shape = [n_targets, n_targets]
Set when the linear operator used by the decomposable kernel is left as the default or None.
- L_ : array, shape = [n_samples_miss, n_samples_miss]
Graph Laplacian of data with missing targets (semi-supervised learning).
- period_ : float
Set when the period used by the first periodic kernel is ‘autocorr’.
- solver_res_ : any
Raw results returned by the solver.
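A hypothetical inspection of these attributes after fitting the estimator clf from the Examples section (the shapes follow the descriptions above):

print(clf.dual_coef_.shape)  # [n_samples x n_targets], see dual_coef_ above
print(clf.A_.shape)          # [n_targets, n_targets], set because A was left as None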
Methods
- fit(X, y): Fit OVK ridge regression model.
- get_params([deep]): Get parameters for this estimator.
- predict(X): Predict using the OVK ridge model.
- score(X, y[, sample_weight]): Returns the coefficient of determination R^2 of the prediction.
- set_params(**params): Set the parameters of this estimator.
fit(X, y)[source]¶
Fit OVK ridge regression model.
Parameters: - X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Training data.
- y : {array-like}, shape = [n_samples] or [n_samples, n_targets]
Target values. numpy.NaN for missing targets (semi-supervised learning).
Returns: - self : returns an instance of self.
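A hedged sketch of the semi-supervised usage described above: rows of y whose targets are unknown are set to numpy.NaN before fitting. The particular values of lbda_m and gamma_m are illustrative.

>>> import operalib as ovk
>>> import numpy as np
>>> rng = np.random.RandomState(0)
>>> X = rng.randn(20, 5)
>>> y = rng.randn(20, 3)
>>> y[10:] = np.nan  # targets of the last 10 samples are missing
>>> model = ovk.OVKRidge('DGauss', lbda=1.0, lbda_m=0.1, gamma_m=1.0)
>>> model.fit(X, y)
OVKRidge(...)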
get_params(deep=True)¶
Get parameters for this estimator.
Parameters: - deep : boolean, optional
If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: - params : mapping of string to any
Parameter names mapped to their values.
predict(X)[source]¶
Predict using the OVK ridge model.
Parameters: - X : {array-like, sparse matrix}, shape = [n_samples, n_features]
Samples.
Returns: - C : {array}, shape = [n_samples] or [n_samples, n_targets]
Returns predicted values.
score(X, y, sample_weight=None)¶
Returns the coefficient of determination R^2 of the prediction.
The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an R^2 score of 0.0.
Parameters: - X : array-like, shape = (n_samples, n_features)
Test samples.
- y : array-like, shape = (n_samples) or (n_samples, n_outputs)
True values for X.
- sample_weight : array-like, shape = [n_samples], optional
Sample weights.
Returns: - score : float
R^2 of self.predict(X) wrt. y.
set_params(**params)¶
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Returns: - self