Double RKHS PLS (rkhs_xy): Theory and Usage

Frédéric Bertrand

Cedric, Cnam, Paris
frederic.bertrand@lecnam.net

2025-11-26

Overview

We implement a double RKHS variant of PLS, in which both the input and the output spaces are endowed with reproducing kernels, yielding Gram matrices \(K_X\) and \(K_Y\).

We use the centered Grams \(\tilde K_X = H K_X H\) and \(\tilde K_Y = H K_Y H\), where \(H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^\top\).
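For concreteness, here is a minimal base-R sketch of the centering step; the RBF helper rbf_gram and its bandwidth gamma are illustrative assumptions for this vignette, not part of the bigPLSR API.

# Hypothetical RBF Gram: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
rbf_gram <- function(X, gamma = 0.5) {
  exp(-gamma * as.matrix(dist(X))^2)
}

# Centering matrix H = I - (1/n) 1 1^T, applied on both sides of a Gram
center_gram <- function(K) {
  n <- nrow(K)
  H <- diag(n) - matrix(1 / n, n, n)
  H %*% K %*% H
}

X    <- matrix(rnorm(20 * 3), 20, 3)
Kx_c <- center_gram(rbf_gram(X))  # \tilde K_X = H K_X H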

Operator and Latent Directions

Following the spirit of Kernel PLS Regression II (IEEE TNNLS, 2019), we avoid explicit square roots and form the SPD surrogate operator \[\mathcal{M}\, v = (K_X+\lambda_x I)^{-1} \; K_X \; K_Y \; K_X \; (K_X+\lambda_x I)^{-1}\, v,\] with a small ridge \(\lambda_x > 0\) for stability. We compute the first \(A\) orthonormal latent directions \(T = [t_1,\dots,t_A]\) via power iteration with Gram–Schmidt orthogonalization on \(\mathcal{M}\).
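As an illustration of this step, here is a minimal base-R sketch of power iteration with Gram–Schmidt re-orthogonalization on \(\mathcal{M}\); the function name latent_directions, the dense solve() for the ridge inverse, and the convergence test are our assumptions, not the package's backend.

latent_directions <- function(Kx, Ky, ncomp, lambda_x = 1e-6,
                              n_iter = 500, tol = 1e-10) {
  n    <- nrow(Kx)
  Rinv <- solve(Kx + lambda_x * diag(n))      # (K_X + lambda_x I)^{-1}
  M    <- Rinv %*% Kx %*% Ky %*% Kx %*% Rinv  # SPD surrogate operator
  Tmat <- matrix(0, n, 0)
  for (a in seq_len(ncomp)) {
    v <- rnorm(n); v <- v / sqrt(sum(v^2))
    for (it in seq_len(n_iter)) {
      w <- as.vector(M %*% v)
      if (ncol(Tmat) > 0)                     # Gram-Schmidt against earlier t's
        w <- as.vector(w - Tmat %*% crossprod(Tmat, w))
      w <- w / sqrt(sum(w^2))
      if (1 - abs(sum(w * v)) < tol) { v <- w; break }
      v <- w
    }
    Tmat <- cbind(Tmat, v)
  }
  Tmat
}

# Usage with the centered Gram from above (linear output kernel, assumed):
# Ky_c <- center_gram(tcrossprod(Y))
# Tmat <- latent_directions(Kx_c, Ky_c, ncomp = 2)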

We then solve a small regression in the latent space: \[C = (T^\top T)^{-1} (T^\top \tilde Y), \qquad \tilde Y = Y - \mathbf{1}\,\bar y^\top,\] and form dual coefficients \[\alpha = U\, C, \qquad U = (K_X+\lambda_x I)^{-1} T,\] so that training predictions satisfy \[\hat Y = \tilde K_X\, \alpha + \mathbf{1}\,\bar y^\top.\]
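Continuing the sketch (Tmat from the block above; the helper fit_dual is ours and only mirrors the formulas, not the package internals):

fit_dual <- function(Kx, Tmat, Y, lambda_x = 1e-6) {
  n    <- nrow(Kx)
  ybar <- colMeans(Y)
  Yc   <- sweep(Y, 2, ybar)                            # \tilde Y = Y - 1 ybar^T
  C    <- solve(crossprod(Tmat), crossprod(Tmat, Yc))  # (T^T T)^{-1} T^T \tilde Y
  U    <- solve(Kx + lambda_x * diag(n), Tmat)         # (K_X + lambda_x I)^{-1} T
  list(alpha = U %*% C, ybar = ybar)                   # alpha = U C
}

# Training predictions: \hat Y = \tilde K_X alpha + 1 ybar^T
# dual <- fit_dual(Kx_c, Tmat, Y)
# Yhat <- Kx_c %*% dual$alpha + rep(1, nrow(Kx_c)) %o% dual$ybar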

Centering for Prediction

Given new inputs \(X_*\), define the cross-Gram \[K_* = K(X_*, X).\] To apply the training centering to \(K_*\), use \[\tilde K_* = K_* - \mathbf{1}\,\bar k_X^\top - \bar k_*\,\mathbf{1}^\top + \mu_X,\] where:

- \(\bar k_X = \frac{1}{n}\mathbf{1}^\top K_X\) is the column-mean vector of the (uncentered) training Gram,
- \(\mu_X = \frac{1}{n^2}\,\mathbf{1}^\top K_X \mathbf{1}\) is its grand mean,
- \(\bar k_*\) is the row-mean vector of \(K_*\) (computed at prediction time).

Predictions then follow the familiar dual form: \[\hat Y_* = \tilde K_*\,\alpha + \mathbf{1}_*\,\bar y^\top.\]
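A matching sketch of the prediction path, assuming K_train is the uncentered training Gram \(K_X\) and K_star the cross-Gram \(K(X_*, X)\); the helper name predict_dual is ours.

predict_dual <- function(K_star, K_train, alpha, ybar) {
  m      <- nrow(K_star)              # number of new points
  n      <- ncol(K_star)              # number of training points
  kbar_X <- colMeans(K_train)         # column-mean vector of the training Gram
  mu_X   <- mean(K_train)             # grand mean of the training Gram
  kbar_s <- rowMeans(K_star)          # row means of the cross-Gram
  K_c <- K_star -
    matrix(kbar_X, m, n, byrow = TRUE) -  # 1 kbar_X^T
    matrix(kbar_s, m, n) +                # kbar_* 1^T
    mu_X
  K_c %*% alpha + rep(1, m) %o% ybar      # \hat Y_* = \tilde K_* alpha + 1 ybar^T
}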

Practical Notes

Minimal Example

library(bigPLSR)
set.seed(42)
n <- 60; p <- 6; m <- 2
X <- matrix(rnorm(n * p), n, p)
Y <- cbind(sin(X[, 1]) + 0.4 * X[, 2]^2,
           cos(X[, 3]) - 0.3 * X[, 4]^2) +
  matrix(rnorm(n * m, sd = 0.05), n, m)
op <- options(
  bigPLSR.rkhs_xy.kernel_x = "rbf",
  bigPLSR.rkhs_xy.gamma_x  = 0.5,
  bigPLSR.rkhs_xy.kernel_y = "linear",
  bigPLSR.rkhs_xy.lambda_x = 1e-6,
  bigPLSR.rkhs_xy.lambda_y = 1e-6
)
on.exit(options(op), add = TRUE)
fit  <- pls_fit(X, Y, ncomp = 3, algorithm = "rkhs_xy", backend = "arma")
Yhat <- predict(fit, X)
mean((Y - Yhat)^2)
#> [1] 2.619847e-12

References

• Rosipal, R., & Trejo, L. J. (2001). Kernel Partial Least Squares Regression in Reproducing Kernel Hilbert Space. Journal of Machine Learning Research, 2, 97–123. doi:10.5555/944733.944741.
• Kernel PLS Regression II: Kernel Partial Least Squares Regression by Projecting Both Independent and Dependent Variables into Reproducing Kernel Hilbert Space. IEEE Transactions on Neural Networks and Learning Systems (2019). doi:10.1109/TNNLS.2019.2932014.

