Suppose we have a functional response variable\(y_m(t), \ m=1,\dots,M\), a functionalcovariate\(x_m(t)\) and also a set of\(p=2\) scalar covariates\(\textbf{u}_m = (u_{m0},u_{m1})^\top\).
A Gaussian process functional regression (GPFR) model used in thisexample is defined by
\(y_m(t) = \mu_m(t) + \tau_m(x_m(t)) +\varepsilon_m(t)\),
where\(\mu_m(t) = \textbf{u}_m^\top\boldsymbol{\beta}(t)\) is the mean function model acrossdifferent curves and\(\tau_m(x_m(t))\)is a Gaussian process with zero mean and covariance function\(k_m(\boldsymbol{\theta}|x_m(t))\). That is,\(\tau_m(x_m(t))\) defines thecovariance structure of\(y_m(t)\) forthe different data points within the same curve.
The error term can be assumed to be\(\varepsilon_m(t) \sim N(0,\sigma_\varepsilon^2)\), where the noise variance\(\sigma_\varepsilon^2\) can be estimated asa hyperparameter of the Gaussian process.
In the example below, the training data consist of\(M=20\) realisations on\([-4,4]\) with\(n=50\) points for each curve. We assumeregression coefficient functions\(\beta_0(t)=1\),\(\beta_1(t)=\sin((0.5 t)^3)\), scalarcovariates\(u_{m0} \sim N(0,1)\) and\(u_{m1} \sim N(10,5^2)\) and afunctional covariate\(x_m(t) = \exp(t) +v\), where\(v \sim N(0,0.1^2)\). The term\(\tau_m(x_m(t))\) is a zero mean Gaussianprocess with exponential covariance kernel and\(\sigma_\varepsilon^2 = 1\).
We also simulate an\((M+1)\)threalisation which is used to assess predictions obtained by the modelestimated by using the training data of size\(M\). The\(y_{M+1}(t)\) and\(x_{M+1}(t)\) curves are observed on equallyspaced\(60\) time points on\([-4,4]\).
set.seed(100)M<-20n<-50p<-2# number of scalar covariateshp<-list('pow.ex.v'=log(10),'pow.ex.w'=log(1),'vv'=log(1))## Training data: M realisations -----------------tt<-seq(-4,4,len=n)b<-sin((0.5*tt)^3)scalar_train<-matrix(NA, M, p)t_train<-matrix(NA, M, n)x_train<-matrix(NA, M, n)response_train<-matrix(NA, M, n)for(iin1:M){ u0<-rnorm(1) u1<-rnorm(1,10,5) x<-exp(tt)+rnorm(n,0,0.1) Sigma<-cov.pow.ex(hyper = hp,input = x,gamma =1)diag(Sigma)<-diag(Sigma)+exp(hp$vv) y<- u0+u1*b+mvrnorm(n=1,mu=rep(0,n),Sigma=Sigma) scalar_train[i,]<-c(u0,u1) t_train[i,]<- tt x_train[i,]<- x response_train[i,]<- y}## Test data (M+1)-th realisation ------------------n_new<-60t_new<-seq(-4,4,len=n_new)b_new<-sin((0.5*t_new)^3)u0_new<-rnorm(1)u1_new<-rnorm(1,10,5)scalar_new<-cbind(u0_new, u1_new)x_new<-exp(t_new)+rnorm(n_new,0,0.1)Sigma_new<-cov.pow.ex(hyper = hp,input = x_new,gamma =1)diag(Sigma_new)<-diag(Sigma_new)+exp(hp$vv)response_new<- u0_new+ u1_new*b_new+mvrnorm(n=1,mu=rep(0,n_new),Sigma=Sigma_new)The estimation of mean and covariance functions in the GPFR model isdone using thegpfr function:
a1<-gpfr(response = response_train,time = tt,uReg = scalar_train,fxReg =NULL,gpReg = x_train,fyList =list(nbasis =23,lambda =0.0001),uCoefList =list(list(lambda =0.0001,nbasi =23)),Cov ='pow.ex',gamma =1,fitting = T)Note that the estimated covariance function hyperparameters aresimilar to the true values:
To visualise all the realisations of the training data:
To visualise three realisations of the training data:
The in-sample fit using mean function from FR model only can beseen:
The GPFR model fit to the training data is visualised by using:
If\(y_{M+1}(t)\) is observed overall the domain of\(t\), the Type Iprediction can be seen:
b1<-gpfrPredict(a1,testInputGP = x_new,testTime = t_new,uReg = scalar_new,fxReg =NULL,gpReg =list('response'= response_new,'input'= x_new,'time'= t_new))plot(b1,type ='prediction',colourTrain ='pink')lines(t_new, response_new,type ='b',col =4,pch =19,cex =0.6,lty =3,lwd =2)If we assume that\(y_{M+1}(t)\) isonly partially observed, we can obtain Type I predictions via:
b2<-gpfrPredict(a1,testInputGP = x_new,testTime = t_new,uReg = scalar_new,fxReg =NULL,gpReg =list('response'= response_new[1:20],'input'= x_new[1:20],'time'= t_new[1:20]))plot(b2,type ='prediction',colourTrain ='pink')lines(t_new, response_new,type ='b',col =4,pch =19,cex =0.6,lty =3,lwd =2)Type II prediction, which is made by not including any informationabout\(y_{M+1}(t)\), is visualisedbelow.
b3<-gpfrPredict(a1,testInputGP = x_new,testTime = t_new,uReg = scalar_new,fxReg =NULL,gpReg =NULL)plot(b3,type ='prediction',colourTrain ='pink')lines(t_new, response_new,type='b',col =4,pch =19,cex =0.6,lty =3,lwd =2)