Author manuscript; available in PMC: 2021 Apr 26.

Smooth Interpolation of Covariance Matrices and Brain Network Estimation

Lipeng Ning1
1Department of Psychiatry, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115 USA.

Email:lning@bwh.harvard.edu

Issue date 2019 Aug.

Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

PMCID: PMC8074851  NIHMSID: NIHMS1063649  PMID:33907337
The publisher's version of this article is available at IEEE Trans Automat Contr.

Abstract

We propose an approach to use the state covariance of autonomous linear systems to track time-varying covariance matrices of nonstationary time series. Following concepts from Riemannian geometry, we investigate three types of covariance paths obtained by using different quadratic regularizations of system matrices. The first quadratic form induces the geodesics based on the Hellinger–Bures metric related to optimal mass transport (OMT) theory and quantum mechanics. The second type of quadratic form leads to the geodesics based on the Fisher–Rao metric from information geometry. In the process, we introduce a weighted-OMT interpretation of the Fisher–Rao metric for multivariate Gaussian distributions. A main contribution of this work is the introduction of the third type of covariance paths, which are steered by system matrices with rotating eigenspaces. The three types of covariance paths are compared using two examples with synthetic data and real data from resting-state functional magnetic resonance imaging, respectively.

Index Terms—: Brain networks, functional magnetic resonance imaging, information theory, optimal control, optimal mass transport (OMT), Riemannian metric, system identification

I. Introduction

THE problem of tracking changes and deformations of positive-definite matrices is relevant to a wide spectrum of scientific applications, including computer vision, sensor arrays, and diffusion tensor imaging (see, e.g., [1]–[7]). A key motivation behind the present work is a neuroscience application on understanding functional brain connectivity using resting-state functional magnetic resonance imaging (rsfMRI) data. Specifically, rsfMRI is a widely used neuroimaging modality, which acquires a sequence of three-dimensional image volumes to understand brain functions and activities [8]. The standard approach for analyzing functional connections between brain regions is to compute the correlation coefficients between the underlying rsfMRI time-series data. Consequently, the whole-brain functional network is characterized by the covariance matrix of a multivariate time series obtained from different brain regions [9], [10]. It has recently been observed that functional connectivity fluctuates over time [11], [12], implying that the static covariance matrix is too simplistic to capture the full extent of time-varying brain activities. Thus, there is an urgent need for new computational tools for understanding dynamic functional brain networks using nonstationary rsfMRI data. The aim of this work is to use control-theoretic approaches to develop models for smooth paths of time-varying covariance matrices.

We consider a possibly nonstationary zero-mean time series {x_τ; τ ∈ ℝ} taking values in ℝ^n. We assume that the temporal change of the probability distributions p_t(x) of x_t is much slower than the rate of measurements. Therefore, the instantaneous covariance matrix

P_t := E_{p_t}(xx') = ∫_{ℝ^n} xx' p_t(x) dx

can be estimated by using sample covariance matrices computed from discrete measurements of the time series. Assume that two covariance matrices P_0 and P_1 at t = 0, 1 are known. Then, geodesics connecting P_0 and P_1 on the manifold of positive-definite matrices provide natural structures to model time-varying covariance matrices P_t on the time interval t ∈ [0, 1]. As generalizations of straight lines in Euclidean space, geodesics are paths of the shortest distance connecting the start to the finish on a curved manifold. The path length is measured by Riemannian metrics, which are quadratic forms of the tangent matrix Ṗ_t. Several Riemannian metrics have been investigated to derive geodesics for covariance matrices. For instance, the geodesic based on the Fisher–Rao metric for Gaussian distributions [13]–[16] from the theory of information geometry is given by

P_t^{info} = P_0^{1/2} (P_0^{-1/2} P_1 P_0^{-1/2})^t P_0^{1/2}.    (1)

Another example would be the Wasserstein-2 metric for Gaussian probability density functions [17]–[20], which induces the following geodesic:

P_t^{omt} = ((1 − t)P_0^{1/2} + tP_1^{1/2}Û)((1 − t)P_0^{1/2} + tP_1^{1/2}Û)'    (2)

where P_t^{1/2} denotes the unique positive-definite square root of P_t, and

Û = P_1^{-1/2} P_0^{-1/2} (P_0^{1/2} P_1 P_0^{1/2})^{1/2}

is an orthogonal matrix. These Riemannian metrics will be explained in more detail in the following sections.
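For concreteness, the two geodesics (1) and (2) can be evaluated with standard matrix functions. The following sketch (in Python with NumPy/SciPy, which is an assumption — the paper's own experiments use MATLAB) implements both closed forms:

```python
import numpy as np
from scipy.linalg import sqrtm, fractional_matrix_power

def info_geodesic(P0, P1, t):
    """Fisher-Rao geodesic (1): P0^{1/2} (P0^{-1/2} P1 P0^{-1/2})^t P0^{1/2}."""
    R = sqrtm(P0).real                      # unique positive-definite square root
    Rinv = np.linalg.inv(R)
    M = fractional_matrix_power(Rinv @ P1 @ Rinv, t).real
    return R @ M @ R

def omt_geodesic(P0, P1, t):
    """OMT (Wasserstein-2) geodesic (2) with the orthogonal matrix U-hat."""
    R0, R1 = sqrtm(P0).real, sqrtm(P1).real
    S = sqrtm(R0 @ P1 @ R0).real            # (P0^{1/2} P1 P0^{1/2})^{1/2}
    U = np.linalg.inv(R1) @ np.linalg.inv(R0) @ S
    L = (1 - t) * R0 + t * R1 @ U
    return L @ L.T
```

Both functions reduce to P_0 at t = 0 and reach P_1 at t = 1, while interpolating differently in between.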

In this paper, we combine concepts from Riemannian geometry and autonomous systems to develop smooth covariance paths. Specifically, we consider a geodesic P_t as the state covariance of the following autonomous linear system:

ẋ_t = A_t x_t    (3)

with A_t ∈ ℝ^{n×n}. We analyze the linear systems that steer P_t along different geodesics. Based on (3), the state covariance evolves according to

Ṗ_t = A_t P_t + P_t A_t'.    (4)

The matrix A_t can be considered as a noncommutative division of Ṗ_t by P_t, scaled by a factor of 1/2. In this paper, we consider Riemannian metrics on the manifold of positive-definite matrices expressed as quadratic forms of A_t. To this end, we consider covariance paths that are the solutions to

min_{P_t, A_t} { ∫_0^1 f(A_t) dt | Ṗ_t = A_t P_t + P_t A_t', P_0, P_1 given }    (5)

where f(A_t) is a positive quadratic function of A_t, which may depend on P_t. Note that the optimal system matrix A_t may be asymmetric, which could provide insight into the directed dependence between the underlying variables.

The rest of this paper is organized as follows. In Section II, we revisit the optimal mass transport (OMT)-based geodesics P_t^{omt} and introduce the corresponding quadratic form f(A_t). Section III focuses on the quadratic forms that lead to the Fisher–Rao-based geodesics P_t^{info}. We also introduce a fluid-mechanics interpretation of the Fisher–Rao metric, which provides a point of contact between OMT and information geometry. In Section IV, we investigate the optimal solutions to (5) corresponding to a family of quadratic functions f(A_t), which are weighted square norms of the symmetric and antisymmetric parts of A_t. In Section V, we compare the three types of covariance paths using two examples based on synthetic data and real data from rsfMRI, respectively. Section VI includes the discussion and conclusions.

For notations, Sym^n, Sym_+^n, and Sym_{++}^n denote the sets of symmetric, positive-semidefinite, and positive-definite matrices of size n × n, respectively. Small boldface letters, e.g., x, v, represent column vectors. Capital letters, e.g., P, A, denote matrices. Regular small letters, e.g., w, h, are for scalars or scalar-valued functions.

II. Mass-Transport-Based Covariance Paths

A. On Optimal Mass Transport

Let p_0(x) and p_1(x) denote two probability density functions on ℝ^n. The Wasserstein-2 metric between the two, denoted by w_2(p_0, p_1), is defined as the square root of the optimal value of

inf_{m(x,y) ≥ 0} ∫_{ℝ^n × ℝ^n} ‖x − y‖_2^2 m(x, y) dx dy,
s.t. ∫_{ℝ^n} m(x, y) dx = p_1(y),  ∫_{ℝ^n} m(x, y) dy = p_0(x)

where m(x, y) represents a probability density function on the joint space ℝ^n × ℝ^n with the marginals specified by p_0 and p_1 [17], [18]. A fluid-mechanics interpretation of w_2(p_0, p_1)^2 was introduced in [21] and [22], which provided a Riemannian structure on the manifold of probability density functions. To introduce this formulation, we consider the following continuity equation:

∂p_t(x)/∂t + ∇_x·(p_t(x)v_t(x)) = 0    (6)

where v_t(x) represents a time-varying velocity field and p_t(x) represents the time-varying density function. Then, w_2(p_0, p_1)^2 is equal to (see [22])

inf_{p_t, v_t} { ∫_0^1 E_{p_t}(‖v_t(x)‖_2^2) dt | ṗ_t + ∇_x·(p_t v_t) = 0 }.    (7)

The optimal solution of p_t(x) is the geodesic on the manifold of probability density functions that connects the endpoints p_0(x) and p_1(x).

B. Hellinger–Bures Metric

In the special case when p_0(x) and p_1(x) are zero-mean Gaussian probability density functions with

p_i(x) = det(2πP_i)^{-1/2} e^{-(1/2) x' P_i^{-1} x},  for i = 0, 1

the corresponding geodesic p_t(x) at any fixed time t is also zero-mean Gaussian with the corresponding covariance matrix given by P_t^{omt} [19], [20]. The geodesic distance is equal to the Wasserstein-2 metric w_2(p_0, p_1), which also induces the following distance measure on covariance matrices:

w_2(p_0, p_1) = d_{w2}(P_0, P_1) := ‖P_0^{1/2} − P_1^{1/2}Û‖_F,    (8)

where ∥·∥F denotes the Frobenius norm of a matrix.

The covariance path P_t^{omt} is also equal to the geodesic induced by the Hellinger–Bures metric from quantum mechanics [23], [24] on the manifold of positive-definite matrices. In particular, let Δ ∈ Sym^n represent a tangent vector at P ∈ Sym_{++}^n. Then, the Hellinger–Bures metric takes the form

g_{P,Bures}(Δ) = tr(ΔM)

where M is the unique matrix in Sym^n that satisfies (1/2)(PM + MP) = Δ (see, e.g., [25] and [26]). The trajectory P_t^{omt} in (2) is the shortest path connecting P_0 and P_1. Thus, it satisfies

P_t^{omt} = argmin_{P_t} ∫_0^1 g_{P_t,Bures}(Ṗ_t) dt    (9)

with a given pair of endpoints P_0, P_1 ∈ Sym_{++}^n (see, e.g., [23]). The geodesic distance, also known as the Bures distance, is equal to d_{w2}(P_0, P_1).

The Hellinger–Bures metric was originally proposed in quantum mechanics to compare density matrices, which are positive-definite matrices whose traces are equal to one. Density matrices are noncommutative analogues of probability vectors. In the commutative case, when P is restricted to be a diagonal matrix whose diagonal entries form a probability vector, the Hellinger–Bures metric g_{P,Bures}(Δ) is equal to the Fisher information metric, which will be discussed in detail in Section III. A generalization of the Hellinger–Bures metric was also introduced in [27] for multivariate spectral densities.
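Numerically, the defining relation (1/2)(PM + MP) = Δ is a continuous-time Lyapunov equation, so g_{P,Bures}(Δ) can be evaluated with an off-the-shelf solver, and the Bures distance follows from (8). A minimal Python/SciPy sketch (the function names are illustrative, not from the paper):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, sqrtm

def bures_metric(P, Delta):
    """Hellinger-Bures metric g_{P,Bures}(Delta) = tr(Delta M),
    where M solves (1/2)(P M + M P) = Delta."""
    M = solve_continuous_lyapunov(P, 2.0 * Delta)   # solves P M + M P = 2 Delta
    return np.trace(Delta @ M)

def bures_distance(P0, P1):
    """Closed-form d_{w2}(P0, P1) from (8); the squared norm expands to
    tr(P0) + tr(P1) - 2 tr((P0^{1/2} P1 P0^{1/2})^{1/2})."""
    R0 = sqrtm(P0).real
    S = sqrtm(R0 @ P1 @ R0).real
    return np.sqrt(np.trace(P0) + np.trace(P1) - 2.0 * np.trace(S))
```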

C. Hellinger–Bures-Based Linear Systems

Here, we present an alternative expression of (9) using the linear system in (3). For this purpose, we define

f_{P_t}(A_t) := tr(A_t P_t A_t')

which is equal to E_{p_t}(‖ẋ_t‖^2) if ẋ_t is given by (3). The following theorem relates the geodesics P_t^{omt} to a linear system.

Theorem 1: Given P_0, P_1 ∈ Sym_{++}^n, let f_P be defined as above, and let P_t^{omt}, A_t^{omt} be a pair of minimizers of

min_{P_t, A_t} { ∫_0^1 f_{P_t}(A_t) dt | Ṗ_t = A_t P_t + P_t A_t', P_0, P_1 given }.    (10)

Then, P_t^{omt} is equal to (2) and

A_t^{omt} = Q(tQ − I)^{-1}    (11)

where Q = I − P_0^{-1/2}(P_0^{1/2} P_1 P_0^{1/2})^{1/2} P_0^{-1/2}. The optimal value of the objective function in (10) is equal to d_{w2}(P_0, P_1)^2.

Proof: The optimization problem (10) appears as a special case of (7) with the additional constraint that the velocity field v_t(x) = A_t x. Thus, d_{w2}(P_0, P_1)^2 is a lower bound of (10), and we only need to show that A_t^{omt} satisfies

Ṗ_t^{omt} = A_t^{omt} P_t^{omt} + P_t^{omt} (A_t^{omt})'.    (12)

To this end, we rewrite (2) as P_t^{omt} = (I − tQ)P_0(I − tQ)'. Taking the derivative of P_t^{omt} gives

Ṗ_t^{omt} = −QP_0(I − tQ)' − (I − tQ)P_0Q' = Q(tQ − I)^{-1} P_t + P_t Q(tQ − I)^{-1}.

Since all the eigenvalues of Q are smaller than 1, the matrix tQ − I is invertible for all t ∈ [0, 1]. Therefore, (12) holds. Moreover, we have

tr(A_t^{omt} P_t^{omt} (A_t^{omt})') = tr(QP_0Q) = ‖P_1^{1/2}Û − P_0^{1/2}‖_F^2 = d_{w2}(P_0, P_1)^2

which completes the proof. ■

We note that the matrix A_t^{omt} is symmetric. Moreover, the matrices A_{t_1}^{omt} and A_{t_2}^{omt} commute for any t_1, t_2. Therefore, the eigenspace of A_t^{omt} is fixed on the interval t ∈ [0, 1].
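As a sanity check on Theorem 1, one can integrate (4) forward with A_t given by (11) and verify that the state covariance indeed arrives at P_1. A rough forward-Euler sketch in Python (the step size and integration scheme are illustrative choices, not from the paper):

```python
import numpy as np
from scipy.linalg import sqrtm

def omt_system_path(P0, P1, n_steps=2000):
    """Integrate dP/dt = A_t P + P A_t' with A_t = Q (tQ - I)^{-1} from (11)."""
    n = len(P0)
    R0 = sqrtm(P0).real
    R0inv = np.linalg.inv(R0)
    S = sqrtm(R0 @ P1 @ R0).real
    Q = np.eye(n) - R0inv @ S @ R0inv       # Q from Theorem 1
    P = P0.copy()
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        A = Q @ np.linalg.inv(t * Q - np.eye(n))
        P = P + dt * (A @ P + P @ A.T)      # forward-Euler step of (4)
    return P
```

Up to the discretization error of the Euler scheme, the returned matrix matches P_1.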

The results from Theorem 1 can be further extended to obtain the optimal solutions corresponding to the following objective function:

f_{P_t}^W(A_t) = E_{p_t}(‖ẋ_t‖_W^2) = tr(W A_t P_t A_t')    (13)

where ẋ_t = A_t x_t, W ∈ Sym_{++}^n, and ‖x‖_W^2 := x'Wx. By applying a change of variables, we define

P_{W,t} := W^{1/2} P_t W^{1/2}    (14)
A_{W,t} := W^{1/2} A_t W^{-1/2}.    (15)

Thus, f_{P_t}^W(A_t) = tr(A_{W,t} P_{W,t} A_{W,t}') = f_{P_{W,t}}(A_{W,t}). Moreover, if (4) holds, then

Ṗ_{W,t} = A_{W,t} P_{W,t} + P_{W,t} A_{W,t}'.

Therefore, if A_t^{omt} is the optimal system matrix that steers P_{W,0} to P_{W,1} with respect to f_P(A) given by Theorem 1, then W^{-1/2} A_t^{omt} W^{1/2} is the optimal solution with respect to f_P^W(A). In the following section, we investigate a further extension of f_P^W(·) by using a time-dependent weighting matrix, which provides a point of contact between OMT and information geometry.

III. Information-Geometry-Based Covariance Paths

A. Fisher–Rao Metric

For two probability density functions p(x) and p̂(x) on ℝ^n, the Kullback–Leibler (KL) divergence

d_{KL}(p‖p̂) := ∫_{ℝ^n} p log(p/p̂) dx    (16)

represents a well-established notion of distance between the two [28], [29]. If p̂ = p + δ with δ representing a small perturbation, then the quadratic term of the Taylor expansion of d_{KL}(p‖p + δ) in terms of δ is the Fisher information metric

g_{p,Fisher}(δ) = ∫ (δ^2/p) dx.

For a probability distribution p(x, θ) parameterized by a vector θ, the corresponding metric is referred to as the Fisher–Rao metric and is given by

g_{θ,Rao}(δ_θ) = δ_θ' E[(∂ log p/∂θ)(∂ log p/∂θ)'] δ_θ.

If p(x) is a zero-mean Gaussian probability density function parameterized by a covariance matrix P, then the metric becomes

g_{P,Rao}(Δ) = tr(P^{-1}ΔP^{-1}Δ).

Given P_0, P_1 ∈ Sym_{++}^n, the geodesic in (1) is equal to the solution of

P_t^{info} = argmin_{P_t} ∫_0^1 g_{P_t,Rao}(Ṗ_t) dt    (17)

which is the shortest path connecting P_0 and P_1. The corresponding path length is equal to (see, e.g., [30, Th. 6.1.6])

d_{info}(P_0, P_1) = ‖log(P_0^{-1/2} P_1 P_0^{-1/2})‖_F.    (18)
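The distance (18) is straightforward to evaluate with a matrix logarithm. A short Python sketch (an illustration, not the paper's code):

```python
import numpy as np
from scipy.linalg import sqrtm, logm

def fisher_rao_distance(P0, P1):
    """Geodesic distance (18): || log(P0^{-1/2} P1 P0^{-1/2}) ||_F."""
    Rinv = np.linalg.inv(sqrtm(P0).real)
    # Rinv @ P1 @ Rinv is positive definite, so the logarithm is real symmetric
    L = logm(Rinv @ P1 @ Rinv).real
    return np.linalg.norm(L, 'fro')
```

For the commuting pair P_0 = diag(1, 2), P_1 = diag(2, 1) used later in Section V, this evaluates to √2·log 2.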

B. Fisher–Rao-Metric-Based Linear Systems

Following (5), we will introduce positive quadratic forms f(A) such that the corresponding optimal covariance path is equal to P_t^{info}. One choice for the quadratic form is given by

f_P^{info,1}(A) := g_{P,Rao}(AP + PA') = 2 tr(AA + P^{-1}APA')    (19)

which satisfies f_P^{info,1}(A) ≥ 0 for all A ∈ ℝ^{n×n} and P ∈ Sym_{++}^n.

Theorem 2: Given P_0, P_1 ∈ Sym_{++}^n, let f_P^{info,1}(·) be defined as above, and define P_t^{info}, A^{info} as a pair of minimizers of

min_{P_t, A_t} { ∫_0^1 f_{P_t}^{info,1}(A_t) dt | Ṗ_t = A_t P_t + P_t A_t', P_0, P_1 given }.    (20)

Then, the optimal solution P_t^{info} is equal to (1), and

A^{info} := (1/2) P_0^{1/2} log(P_0^{-1/2} P_1 P_0^{-1/2}) P_0^{-1/2}.    (21)

Moreover, the optimal value of the objective function is equal to dinfo(P0, P1)2.

Proof: We rewrite (1) as

P_t^{info} = P_0^{1/2}(P_0^{-1/2} P_1 P_0^{-1/2})^t P_0^{1/2} = P_0^{1/2} e^{(t/2) log(P_0^{-1/2}P_1P_0^{-1/2})} e^{(t/2) log(P_0^{-1/2}P_1P_0^{-1/2})} P_0^{1/2} = e^{A^{info}t} P_0 e^{(A^{info})'t}

where the last equation is obtained using e^{XYX^{-1}} = Xe^YX^{-1}. Taking the derivative of P_t^{info} gives

Ṗ_t^{info} = A^{info} P_t^{info} + P_t^{info} (A^{info})'    (22)

which completes the proof. ■

Note that the metric d_{info}(·) is invariant with respect to congruence transforms, i.e., d_{info}(P_0, P_1) = d_{info}(TP_0T', TP_1T') for any invertible matrix T. If A^{info} is the optimal solution of (20), then TA^{info}T^{-1} is the optimal solution corresponding to the pair TP_0T', TP_1T'.
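Theorem 2 says that the Fisher–Rao geodesic is generated by the constant system matrix (21); numerically, one can form A^{info} and check that e^{A}P_0e^{A'} reproduces P_1. A small Python sketch under the same assumptions as before:

```python
import numpy as np
from scipy.linalg import sqrtm, logm, expm

def info_system_matrix(P0, P1):
    """Constant system matrix (21): (1/2) P0^{1/2} log(P0^{-1/2} P1 P0^{-1/2}) P0^{-1/2}."""
    R = sqrtm(P0).real
    Rinv = np.linalg.inv(R)
    return (0.5 * R @ logm(Rinv @ P1 @ Rinv) @ Rinv).real

# The state covariance e^{At} P0 e^{A't} of the LTI system x' = A x
# then travels along (1) and reaches P1 at t = 1.
```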

C. Weighted-Mass-Transport View

Since the map from A_t to Ṗ_t in (4) is not injective, the quadratic forms of A_t that lead to the geodesic P_t^{info} are not unique. Here, we provide an alternative quadratic form, which reveals an interesting relation between OMT and the Fisher–Rao metric. For this purpose, we define

f_{P_t}^{info,2}(A_t) := 4E_{p_t}(‖ẋ_t‖_{P_t^{-1}}^2) = 4 tr(P_t^{-1} A_t P_t A_t')    (23)

which is a special form of (13) with the weighting matrix W = P_t^{-1}, scaled by a factor of 4. The following lemma draws the relation between f_P^{info,2}(A) and f_P^{info,1}(A).

Lemma 1: Let f_P^{info,1}(·) and f_P^{info,2}(·) be defined as in (19) and (23), respectively. Then, we have

f_P^{info,2}(A) ≥ f_P^{info,1}(A),  ∀P ∈ Sym_{++}^n, A ∈ ℝ^{n×n}.

Proof: Take the difference

f_P^{info,2}(A) − f_P^{info,1}(A) = 2 tr(P^{-1}APA' − AA) = tr((P^{-1/2}AP^{1/2} − P^{1/2}A'P^{-1/2})(P^{-1/2}AP^{1/2} − P^{1/2}A'P^{-1/2})')

which is nonnegative. ■

We note that if P^{-1/2}AP^{1/2} is symmetric, then f_P^{info,1}(A) is equal to f_P^{info,2}(A). This gives rise to the following proposition, in parallel to Theorem 2.
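Lemma 1 can be spot-checked numerically by evaluating both quadratic forms on random matrices. A short Python sketch (illustrative only; the helper names are not from the paper):

```python
import numpy as np

def f_info1(P, A):
    """Quadratic form (19): 2 tr(AA + P^{-1} A P A')."""
    Pinv = np.linalg.inv(P)
    return 2.0 * np.trace(A @ A + Pinv @ A @ P @ A.T)

def f_info2(P, A):
    """Quadratic form (23): 4 tr(P^{-1} A P A')."""
    Pinv = np.linalg.inv(P)
    return 4.0 * np.trace(Pinv @ A @ P @ A.T)
```

For random positive-definite P and arbitrary A, f_info2 always dominates f_info1, in agreement with the lemma.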

Proposition 1: Given P_0, P_1 ∈ Sym_{++}^n, the paths P_t^{info} and A^{info} given by (1) and (21), respectively, are the unique pair of minimizers of

min_{P_t, A_t} { ∫_0^1 f_{P_t}^{info,2}(A_t) dt | Ṗ_t = A_t P_t + P_t A_t', P_0, P_1 given }    (24)

with f_{P_t}^{info,2}(A_t) defined as in (23).

Proof: From Lemma 1, we have

∫_0^1 f_{P_t}^{info,2}(A_t) dt ≥ ∫_0^1 f_{P_t}^{info,1}(A_t) dt ≥ d_{info}(P_0, P_1)^2    (25)

for any feasible pair of P_t and A_t. It is straightforward to verify that the above inequalities become equalities for the given P_t^{info} and A^{info}. Therefore, the proposition holds. ■

D. Fluid-Mechanics Interpretation

Note that f_{P_t}^{info,2}(A_t) is a special case of f_{P_t}^W(A) in (13) when W = 4P_t^{-1}. It is equal to

f_{P_t}^{info,2}(A_t) = 4E_{p_t}(‖v_t(x)‖_{P_t^{-1}}^2)

with the velocity field given by v_t(x) = A_t x. Thus, if the initial distribution p_0(x) is Gaussian, so is p_t(x) for all t ≥ 0. Proposition 1 implies that, among all the trajectories that connect two Gaussian probability density functions p_0 and p_1, the lowest weighted-mass-transport cost is obtained by Gaussian density functions whose covariance matrices are equal to P_t^{info}. But this optimal trajectory is obtained under the constraint of linear velocity fields. Next, we remove this constraint and show that the trajectory is still optimal.

Theorem 3: Given two zero-mean Gaussian probability density functions p_0(x), p_1(x) on ℝ^n with covariance matrices P_0, P_1 ∈ Sym_{++}^n, respectively, define p_t^{info}(x), v_t^{info}(x) as the minimizer of

inf_{p_t, v_t} { 4 ∫_0^1 E_{p_t}(‖v_t(x)‖_{P_t^{-1}}^2) dt | ṗ_t + ∇_x·(p_t v_t) = 0 }.    (26)

Then, p_t^{info}(x) is zero-mean Gaussian whose covariance matrix is equal to P_t^{info} in (1), and v_t^{info}(x) = A^{info}x almost surely, with A^{info} given by (21). Moreover, the optimal value is equal to d_{info}(P_0, P_1)^2.

Proof: First, we define

[V_t, C_t; C_t', P_t] := E_{p_t}([v_t(x); x][v_t(x); x]').    (27)

Then, applying integration by parts, we obtain

Ṗ_t = ∫_{ℝ^n} xx' ṗ_t(x) dx = −∫_{ℝ^n} xx' ∇_x·(p_t(x)v_t(x)) dx = C_t + C_t'.    (28)

Therefore, the optimization problem

min_{C_t, V_t, P_t} { 4 ∫_0^1 tr(P_t^{-1}V_t) dt | [V_t, C_t; C_t', P_t] ∈ Sym_+^{2n}, Ṗ_t = C_t + C_t' }    (29)

provides a lower bound of (26) because the higher-order moments of the probability density functions are not considered. On the other hand, (24) provides an upper bound of (26) because the velocity field is constrained to satisfy the linear system. We next show that the two bounds coincide.

Note that the Schur complement V_t − C_t P_t^{-1} C_t' belongs to Sym_+^n. Thus, the optimal V_t of (29) satisfies V_t = C_t P_t^{-1} C_t'. Therefore, (29) is equal to

min_{C_t, P_t ∈ Sym_+^n} { 4 ∫_0^1 tr(P_t^{-1} C_t P_t^{-1} C_t') dt | Ṗ_t = C_t + C_t' }.    (30)

Note that the constraint P_t ∈ Sym_+^n is automatically satisfied due to the inverse-barrier objective function. Therefore, we drop this constraint in the following analysis.

The optimization problem (30) can be viewed as an optimal control problem, with C_t being the matrix-valued control. Then, we derive the optimal solution using Pontryagin's minimum principle (see, e.g., [31, Ch. 2]). A necessary condition for the optimal solution is that it must annihilate the variation of the Hamiltonian

h_1(C_t, P_t, Π_t) := 4 tr(P_t^{-1} C_t P_t^{-1} C_t') + tr(Π_t(C_t + C_t'))

with respect to the control C_t. Here, Π_t is a symmetric matrix representing the Lagrange multiplier, i.e., the costate. By setting the partial derivative of h_1(·) with respect to C_t to zero, we obtain

C_t = −(1/4) P_t Π_t P_t    (31)

which implies that the optimal C_t is symmetric. Therefore, C_t = (1/2)Ṗ_t, and the objective function in (30) becomes tr(P_t^{-1} Ṗ_t P_t^{-1} Ṗ_t) = g_{P_t,Rao}(Ṗ_t). Thus, the theorem directly follows from (17). For completeness, we finish the proof based on the Hamiltonian in the following.

The costate Π_t must satisfy Π̇_t = −∂h_1/∂P_t, which gives rise to

Π̇_t = 8 P_t^{-1} C_t P_t^{-1} C_t P_t^{-1}.    (32)

Then, substituting (31) into (28) and (32), we obtain

Ṗ_t = −(1/2) P_t Π_t P_t
Π̇_t = (1/2) Π_t P_t Π_t.    (33)

Note that Ṗ_tΠ_t + P_tΠ̇_t = 0 for all t. Hence, P_tΠ_t is constant. We set

−(1/4) P_t Π_t = A.    (34)

Thus, (31) becomes C_t = AP_t = P_tA'. Multiplying both sides by P_t^{-1/2} gives P_t^{-1/2} C_t P_t^{-1/2} = P_t^{-1/2} A P_t^{1/2}, which is symmetric for all t. Substituting (34) into (33) gives

Ṗ_t = A P_t + P_t A'.

Therefore, we have

P_t = e^{At} P_0 e^{A't}.

Multiplying both sides by P_0^{-1/2} gives

P_0^{-1/2} P_t P_0^{-1/2} = P_0^{-1/2} e^{At} P_0 e^{A't} P_0^{-1/2} = e^{2P_0^{-1/2}AP_0^{1/2}t}.

Setting t = 1 and solving for A, we obtain

A = (1/2) P_0^{1/2} log(P_0^{-1/2} P_1 P_0^{-1/2}) P_0^{-1/2}

which is equal to A^{info}. Furthermore, from (22), the corresponding P_t is equal to P_t^{info}. Then, the optimal covariance matrix in (27) is singular and has rank n. Thus, the optimal velocity field v_t(x) is equal to A^{info}x almost surely, implying that the corresponding p_t(x) is Gaussian. Therefore, the theorem is proved. ■

Note that the system matrix A^{info} is constant. Thus, both A^{info} and A_t^{omt} have fixed eigenspaces. Next, we introduce a different quadratic form of A_t, which leads to system matrices with rotating eigenspaces.

IV. Rotation-Linear-System-Based Covariance Paths

A. Weighted-Least-Squares Cost Functions

Note that if X is an antisymmetric matrix, i.e., X' = −X, then e^{Xt} is a rotation matrix. Consequently, if the system matrix A is antisymmetric, then the state covariance matrix has rotating eigenspaces. In this regard, we decompose

A = A_s + A_a

where A_s := (1/2)(A + A') and A_a := (1/2)(A − A') are the symmetric and antisymmetric parts of A, respectively. Then, we define the following weighted-least-squares (WLS) function:

f_ϵ(A) := ‖A_s‖_F^2 + ϵ‖A_a‖_F^2    (35)

where the scalar ϵ > 0 weighs the relative significance of the symmetric and antisymmetric parts of A. If A satisfies Ṗ = AP + PA' for a given pair Ṗ and P, then f_ϵ(A) can be considered as a quadratic form of the noncommutative division of Ṗ by P, similar to the Fisher–Rao metric. In fact, for scalar-valued covariances, f_ϵ(A) is proportional to the Fisher–Rao metric.

Following (5), we consider the optimal solution to

min_{P_t, A_t} { ∫_0^1 f_ϵ(A_t) dt | Ṗ_t = A_t P_t + P_t A_t', P_0, P_1 given }    (36)

for a given pair of endpoints P_0, P_1 ∈ Sym_{++}^n and a scalar ϵ > 0.

B. Optimal Covariance Paths

To introduce the solution to (36), we define

T_{ϵ,t}(A) := e^{(1+ϵ)A_a t} e^{(A_s − ϵA_a)t}.    (37)

The next lemma shows that T_{ϵ,t}(·) is equal to the state transition matrix of a linear time-varying system.

Lemma 2: Given A ∈ ℝ^{n×n} and a scalar ϵ > 0, define

A_{ϵ,t} := e^{(1+ϵ)A_a t} A e^{−(1+ϵ)A_a t}.

Then, Ṫ_{ϵ,t}(A) = A_{ϵ,t} T_{ϵ,t}(A).

Proof:

Ṫ_{ϵ,t}(A) = e^{(1+ϵ)A_a t}((1+ϵ)A_a)e^{(A_s − ϵA_a)t} + e^{(1+ϵ)A_a t}(A_s − ϵA_a)e^{(A_s − ϵA_a)t} = e^{(1+ϵ)A_a t} A e^{(A_s − ϵA_a)t} = A_{ϵ,t} T_{ϵ,t}(A). ■

The following corollary is a direct result of Lemma 2.

Corollary 1: Given P_0 ∈ Sym_{++}^n, A ∈ ℝ^{n×n}, and a scalar ϵ > 0, define

P_{ϵ,t} := T_{ϵ,t}(A) P_0 T_{ϵ,t}(A)'.

Then, the following equation holds:

Ṗ_{ϵ,t} = A_{ϵ,t} P_{ϵ,t} + P_{ϵ,t} A_{ϵ,t}'.    (38)
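Corollary 1 can be verified numerically: with T_{ϵ,t}(·) from (37), the path T_{ϵ,t}(A)P_0T_{ϵ,t}(A)' satisfies (38) with the rotating system matrix A_{ϵ,t}. A Python sketch (the finite-difference check is an illustrative choice, not from the paper):

```python
import numpy as np
from scipy.linalg import expm

def T_eps(A, eps, t):
    """State transition matrix (37): e^{(1+eps) A_a t} e^{(A_s - eps A_a) t}."""
    As, Aa = 0.5 * (A + A.T), 0.5 * (A - A.T)
    return expm((1 + eps) * Aa * t) @ expm((As - eps * Aa) * t)

def P_eps(A, P0, eps, t):
    """Covariance path of Corollary 1: T P0 T'."""
    T = T_eps(A, eps, t)
    return T @ P0 @ T.T
```

Note also that at ϵ = −1 the transition matrix collapses to e^{At}, which recovers the Fisher–Rao family discussed below Theorem 4.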

The solution to (36) is presented as follows.

Theorem 4: Given P_0, P_1 ∈ Sym_{++}^n and a scalar ϵ > 0, suppose there exists a matrix Π_0 ∈ Sym^n such that the covariance path

P_{ϵ,t}^{wls} = T_{ϵ,t}(A_0) P_0 T_{ϵ,t}(A_0)'    (39)

with T_{ϵ,t}(·) given by (37) and

A_0 = −(1/2)(P_0Π_0 + Π_0P_0) − (1/(2ϵ))(Π_0P_0 − P_0Π_0)    (40)

satisfies P_{ϵ,1}^{wls} = P_1 at t = 1. Then, P_{ϵ,t}^{wls} is a minimizer of (36). The corresponding optimal A_t is equal to

A_{ϵ,t}^{wls} = e^{(1+ϵ)A_a t} A_0 e^{−(1+ϵ)A_a t}    (41)

where A_a denotes the antisymmetric part of A_0.

Proof: Consider (36) as an optimal control problem, with A_t being the matrix-valued control. Then, the Hamiltonian is

h_2(A_t, P_t, Π_t) = (1/4)‖A_t + A_t'‖_F^2 + (ϵ/4)‖A_t − A_t'‖_F^2 + tr(Π_t(A_t P_t + P_t A_t')) = ((1+ϵ)/2) tr(A_t'A_t) + ((1−ϵ)/2) tr(A_tA_t) + tr(Π_t(A_t P_t + P_t A_t')).

The costate Π_t must satisfy Π̇_t = −∂h_2/∂P_t, which gives rise to

Π̇_t = −Π_t A_t − A_t' Π_t.    (42)

Moreover, the partial derivative of h_2(·) with respect to the control A_t vanishes, which leads to

(A_t + A_t') + ϵ(A_t − A_t') + 2Π_t P_t = 0.    (43)

Solving (43) for A_t, we obtain

A_t = −(1/2)(P_tΠ_t + Π_tP_t) − (1/(2ϵ))(Π_tP_t − P_tΠ_t).    (44)

Then, substituting (44) into (4) and (42), respectively, we obtain

Ṗ_t = −(1 − 1/ϵ) P_tΠ_tP_t − (1/2 + 1/(2ϵ))(Π_tP_t^2 + P_t^2Π_t)
Π̇_t = (1 − 1/ϵ) Π_tP_tΠ_t + (1/2 + 1/(2ϵ))(Π_t^2P_t + P_tΠ_t^2).

Next, it can be verified that d(Π_tP_t − P_tΠ_t)/dt = 0. Thus, the antisymmetric part of A_t, which is equal to

(A_t)_a = −(1/(2ϵ))(Π_tP_t − P_tΠ_t)

is constant and is denoted by A_a. Taking the derivative of the symmetric part

(A_t)_s = (1/2)(A_t + A_t') = −(1/2)(P_tΠ_t + Π_tP_t)

gives

d(A_t)_s/dt = −(1/2) d(P_tΠ_t)/dt − (1/2) d(Π_tP_t)/dt = ((1+ϵ)/(2ϵ))(Π_tP_t^2Π_t − P_tΠ_t^2P_t) = (1+ϵ)(A_a(A_t)_s − (A_t)_sA_a).

Since A_a is constant, the solution to the above equation is equal to

(A_t)_s = e^{(1+ϵ)A_a t} A_s e^{−(1+ϵ)A_a t}.

Therefore, the optimal A_t has the form

A_t = e^{(1+ϵ)A_a t} A_s e^{−(1+ϵ)A_a t} + A_a = e^{(1+ϵ)A_a t} A e^{−(1+ϵ)A_a t}

with A = A_a + A_s being the initial value of A_t. Next, we define a new variable

P̂_t := e^{−(1+ϵ)A_a t} P_t e^{(1+ϵ)A_a t}    (45)

whose derivative is equal to

dP̂_t/dt = e^{−(1+ϵ)A_a t}(A_tP_t + P_tA_t')e^{(1+ϵ)A_a t} − (1+ϵ)A_a P̂_t + P̂_t (1+ϵ)A_a = (A_s − ϵA_a)P̂_t + P̂_t(A_s + ϵA_a).

Thus, the solution for P̂_t is

P̂_t = e^{(A_s − ϵA_a)t} P_0 e^{(A_s + ϵA_a)t}.

Substituting this solution into (45), we obtain that the optimal P_t has the form

P_t = e^{(1+ϵ)A_a t} e^{(A_s − ϵA_a)t} P_0 e^{(A_s + ϵA_a)t} e^{−(1+ϵ)A_a t}.

In a similar way, we define

Π̂_t := e^{−(1+ϵ)A_a t} Π_t e^{(1+ϵ)A_a t}.

Then, we have

dΠ̂_t/dt = −(A_s + ϵA_a)Π̂_t − Π̂_t(A_s − ϵA_a)

whose solution gives

Π_t = e^{(1+ϵ)A_a t} e^{−(A_s + ϵA_a)t} Π_0 e^{−(A_s − ϵA_a)t} e^{−(1+ϵ)A_a t}.

If (40) holds, then A_s + ϵA_a = −Π_0P_0. It is straightforward to show that (44) holds for all t > 0 with the provided expressions for A_t, P_t, and Π_t. In this case, f_ϵ(A_t) = ‖A_s‖_F^2 + ϵ‖A_a‖_F^2 for all t. Therefore, if A is equal to A_0 in (40), then the proposed trajectories A_{ϵ,t}^{wls} and P_{ϵ,t}^{wls} are local minimizers, which completes the proof. ■

It is interesting to note that the geodesics P_t^{info} are special cases of the covariance paths in (39) when ϵ = −1, in which case T_{ϵ,t}(A) = e^{At}. The existence and uniqueness of the covariance paths that satisfy the conditions in Theorem 4 for ϵ > 0 will be discussed elsewhere [32].

Next, we provide an upper bound on the optimal value of (36) for all ϵ > 0. For this purpose, we define

Â = log(P_0^{-1/2}(P_0^{1/2} P_1 P_0^{1/2})^{1/2} P_0^{-1/2})

which is symmetric. The corresponding covariance path

P_t = (P_0^{-1/2}(P_0^{1/2}P_1P_0^{1/2})^{1/2}P_0^{-1/2})^t P_0 (P_0^{-1/2}(P_0^{1/2}P_1P_0^{1/2})^{1/2}P_0^{-1/2})^t    (46)

is a feasible solution to (36). Therefore, the following proposition holds.

Proposition 2: Given P_0, P_1 ∈ Sym_{++}^n, the optimal value of (36) for any ϵ > 0 is not larger than ‖log(P_0^{-1/2}(P_0^{1/2}P_1P_0^{1/2})^{1/2}P_0^{-1/2})‖_F^2.

As ϵ → ∞, the symmetric matrix Â becomes the optimal solution. Moreover, if P_0 and P_1 commute, then P_t in (46) is equal to P_t^{info}.

V. Examples

A. Interpolating Covariance Matrices

In this example, we highlight the difference between P_{ϵ,t}^{wls} and the other two types of trajectories, i.e., P_t^{omt} and P_t^{info}, using the following two matrices as the endpoints:

P_0 = [1, 0; 0, 2],  P_1 = [2, 0; 0, 1].    (47)

Applying (1) and (2), we obtain

P_t^{omt} = [(1 + (√2 − 1)t)^2, 0; 0, (√2 + (1 − √2)t)^2]
P_t^{info} = [2^t, 0; 0, 2^{1−t}]

which are both diagonal. On the other hand, if ϵ = 0, there are infinitely many antisymmetric matrices A_0^{wls} of the following form:

A_0^{wls} = [0, ±(2k+1)π/2; ∓(2k+1)π/2, 0]

with k being an arbitrary integer. All of these matrices are able to steer P_0 to P_1, and the corresponding objective function f_0(A_0^{wls}) is equal to zero. The covariance path P_{0,t}^{wls} is equal to

[1 + sin^2((2k+1)(π/2)t), ±(1/2)sin((2k+1)πt); ±(1/2)sin((2k+1)πt), 1 + cos^2((2k+1)(π/2)t)]

which is not diagonal. Covariance paths corresponding to nonzero ϵ will be discussed elsewhere [32]. Therefore, the path P_{ϵ,t}^{wls} could better track the rotation of energy between different variables via their correlations, for a suitable choice of ϵ.
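The rotating path of this example can be reproduced directly, since for ϵ = 0 the optimal path is generated by a constant antisymmetric matrix. A Python sketch (the helper name wls_path is hypothetical):

```python
import numpy as np
from scipy.linalg import expm

P0 = np.diag([1.0, 2.0])
P1 = np.diag([2.0, 1.0])

def wls_path(t, k=0, sign=1.0):
    """Rotation path P_{0,t}^{wls} = e^{At} P0 e^{A't} with antisymmetric A,
    where A rotates by the angle sign*(2k+1)*(pi/2)*t."""
    w = sign * (2 * k + 1) * np.pi / 2
    A = np.array([[0.0, w], [-w, 0.0]])
    R = expm(A * t)                      # planar rotation by w*t
    return R @ P0 @ R.T
```

At t = 1 the rotation by (2k+1)π/2 swaps the two eigendirections, steering diag(1, 2) exactly to diag(2, 1), while the off-diagonal entry oscillates as (1/2)sin((2k+1)πt) in between.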

B. Smooth Covariance Paths for rsfMRI Analysis

We investigate an application of the proposed covariance paths to fitting noisy sample covariance matrices from an rsfMRI dataset. In the following, we provide detailed descriptions of the data, the method, and the experimental results.

1). Data:

The sample covariance matrices are computed using an rsfMRI dataset from the Human Connectome Project [33]. This dataset consists of 1200 rsfMRI image volumes measured in a 15-min time window. The provided data have already been processed by the ICA-FIX method [34] and are further processed using global signal regression, as suggested in [35]. Then, we apply the label map from [36] to separate the brain cortical surface into seven nonoverlapping regions. The data sequences from each region are averaged into a one-dimensional time series, providing a seven-dimensional time series sampled at 1200 time points. The same dataset and preprocessing method were used in our earlier work [37]. Next, we normalize each dimension of the time series by its standard deviation. The normalized time series is denoted by {x_τ; τ = 1, …, 1200}. Moreover, we split the entire sequence into ten equal-length segments and compute the corresponding sample covariance matrices as

P̃_k = (1/120) Σ_{i=1}^{120} x_{120k+i} x_{120k+i}',  for k = 0, …, 9.

Then, the time scale is changed so that P̃_{t_k} is equal to P̃_k, with t_0 = 0 and t_9 = 1. The color arrays in the first row of Fig. 1 illustrate several representative P̃_{t_k} at t = 0, 1/3, 2/3, 1, respectively. These figures show that P̃_{t_k} has significant fluctuations, which is consistent with the observations from [11] and [12]. The main goal of this proof-of-concept experiment is to use the proposed covariance paths to fit these sample covariances and compare their differences. The neuroscience aspects of this experiment will not be discussed in this paper.
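The segmentation step above can be sketched as follows (a hypothetical helper for illustration; it assumes a generic (T × n) data array rather than the HCP file format):

```python
import numpy as np

def segment_covariances(X, n_segments=10):
    """Split a (T x n) time series into equal-length segments and compute the
    per-segment sample covariances P~_k = (1/m) sum_i x_{mk+i} x_{mk+i}'."""
    T, n = X.shape
    m = T // n_segments
    X = X / X.std(axis=0)                 # normalize each dimension, as in the paper
    return [X[k * m:(k + 1) * m].T @ X[k * m:(k + 1) * m] / m
            for k in range(n_segments)]
```

For the data described above (T = 1200, n = 7, ten segments), this produces ten 7 × 7 symmetric matrices.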

Fig. 1.

The first row illustrates the sample covariance matrices between seven brain regions computed from different segments of an rsfMRI dataset from a human brain. The directed networks in the second row illustrate the estimated system matrices corresponding to the proposed WLS trajectories.

2). Method:

We solve optimization problems of the following form:

min_{P_t ∈ 𝒫} Σ_{k=0}^K ‖P_{t_k} − P̃_{t_k}‖_F^2    (48)

to obtain smooth paths that fit the measurements, where K = 9 and 𝒫 represents a suitable set of smooth paths. Based on the results from the previous sections, we propose three sets of parametric models for the smooth paths, which are described in the following.

Based on Theorem 1, we define

𝒫^{omt} := { P_t | P_t = (I − tQ)P_0(I − tQ)', P_0 ∈ Sym_{++}^n, Q ∈ ℝ^{n×n} }

where n = 7 in this example. Note that the matrix Q could be asymmetric, so that 𝒫^{omt} contains the OMT-based geodesics in the form of (2). We use this more general family of covariance paths in order to obtain better fitting results. It is also clear from Theorem 1 that a path P_t ∈ 𝒫^{omt} is the state covariance of a linear time-varying system with A_t = −Q(I − Qt)^{-1}. We apply the fminsdp function in MATLAB to obtain an optimal solution. The initial values for P_0 and Q are set to P̃_0 and the zero matrix, respectively. The same initial values and optimization algorithm are used to solve the subsequent optimization problems. The corresponding optimal path is denoted by P̂_t^{omt}.

Following Theorem 2, we define the second set of smooth paths as

𝒫^{info} = { P_t | P_t = e^{At}P_0e^{A't}, P_0 ∈ Sym_{++}^n, A ∈ ℝ^{n×n} }

which includes the geodesics P_t^{info}. The optimal path in this set is denoted by P̂_t^{info}. Clearly, a trajectory P_t ∈ 𝒫^{info} is equal to the state covariance of a linear time-invariant system.

Based on Theorem 4, we define the set

𝒫^{ϵ,wls} = { P_t | P_t = T_{ϵ,t}(A)P_0T_{ϵ,t}(A)', P_0 ∈ Sym_{++}^n, A ∈ ℝ^{n×n} }

for a given ϵ > 0. The corresponding optimal paths are denoted by P̂_{ϵ,t}^{wls}. This set includes all the trajectories that are solutions of (36). A trajectory in 𝒫^{ϵ,wls} is equal to the state covariance of a linear time-varying system with the system matrices expressed in the form e^{(1+ϵ)A_a t} A e^{−(1+ϵ)A_a t}. The system matrices corresponding to P̂_{ϵ,t}^{wls} are denoted by Â_{ϵ,t}^{wls}. The parameter ϵ is then searched over a discrete set in [0, 100] to minimize the fitting error. Based on the fitting results, we set the value of ϵ to 20.
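The fitting problem (48) over 𝒫^{info} can be sketched as a smooth unconstrained least-squares problem. The sketch below is an assumption-laden Python substitute for the paper's MATLAB/fminsdp setup: P_0 is parameterized by a triangular factor so it stays positive semidefinite, and a generic BFGS solver stands in for fminsdp.

```python
import numpy as np
from scipy.linalg import expm
from scipy.optimize import minimize

def fit_info_path(P_tilde, t_grid, n):
    """Least-squares fit (48) over the family P^info = {e^{At} P0 e^{A't}}.
    P0 = L L' with L lower triangular (hypothetical parameterization,
    not from the paper)."""
    ntri = n * (n + 1) // 2
    def unpack(z):
        L = np.zeros((n, n))
        L[np.tril_indices(n)] = z[:ntri]
        A = z[ntri:].reshape(n, n)
        return L, A
    def loss(z):
        L, A = unpack(z)
        P0 = L @ L.T
        return sum(np.linalg.norm(expm(A * t) @ P0 @ expm(A * t).T - Pk, 'fro')**2
                   for t, Pk in zip(t_grid, P_tilde))
    # initialize at L = I (P0 = I) and A = 0; the paper instead initializes
    # P0 at the first sample covariance
    z0 = np.concatenate([np.eye(n)[np.tril_indices(n)], np.zeros(n * n)])
    res = minimize(loss, z0, method='BFGS')
    L, A = unpack(res.x)
    return L @ L.T, A, res.fun
```

The same pattern applies to 𝒫^{omt} and 𝒫^{ϵ,wls} by swapping in the corresponding path parameterizations.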

3). Results:

Fig. 2 illustrates the fitting results for six representative entries of P̃_{t_k}. The black stars represent the noisy measurements. The blue, green, and red plots represent the estimated paths P̂_t^{omt}, P̂_t^{info}, and P̂_{ϵ,t}^{wls}, respectively. P̂_t^{omt} and P̂_t^{info} are very similar to each other. Clearly, P̂_{ϵ,t}^{wls} has more oscillations, which better fits the fluctuations in the measurements. The normalized square errors (Σ_{k=0}^K ‖P̂_{t_k} − P̃_{t_k}‖_F^2)/(Σ_{k=0}^K ‖P̂_{t_k}‖_F^2) corresponding to P̂_t^{omt}, P̂_t^{info}, and P̂_{ϵ,t}^{wls} are equal to 0.1683, 0.1671, and 0.1543, respectively. Thus, P̂_{ϵ,t}^{wls} has the smallest fitting error. The overall relatively large residual is partly due to the low signal-to-noise ratio of rsfMRI data [38]. Therefore, the corresponding system matrices Â_{ϵ,t}^{wls} could better explain the dynamic interdependence between brain regions. The directed networks in the second row of Fig. 1 illustrate the matrices Â_{ϵ,t}^{wls} at t = 0, 1/3, 2/3, 1, respectively. The red and blue colors represent positive and negative values, respectively. The edge widths are weighted by the absolute values of the corresponding entries. To simplify the visualization, edges with weights smaller than 0.15 are not displayed.

Fig. 2.

Black stars in each image panel illustrate the noisy sample covariance matrices at different time points. The blue, green, and red lines are the fitted curves using the proposed three sets of smooth paths.

VI. Discussion

In this paper, we have investigated a framework for deriving covariance paths on the Riemannian manifolds of positive-definite matrices by using quadratic forms of system matrices to regularize the path lengths. We have analyzed three types of quadratic forms and the corresponding covariance paths. The first and second quadratic forms lead to the well-known geodesics derived from the Hellinger–Bures metric from OMT and the Fisher–Rao metric from information geometry, respectively. In the process, we have derived a fluid-mechanics interpretation of the Fisher–Rao metric in Theorem 3, which provides an interesting weighted-OMT view of the Fisher–Rao metric. Another contribution of this paper is the introduction of the third type of quadratic form, which gives rise to a general family of covariance paths that are steered by system matrices with rotating eigenspaces.

We note that similar types of trajectories of positive-definite matrices with rotating eigenspaces have been investigated from different angles in [39]–[41]. This work is also developed along lines similar to [42] and [43], which focus on the optimal steering of state covariances via linear systems using external input. However, the approach for deriving covariance paths used in this paper differs from this earlier work.

In a proof-of-concept example, we have applied the three types of smooth paths to fit noisy sample covariance matrices from an rsfMRI dataset. The goal of this experiment is to understand directed interactions among brain regions via the estimated system matrices. As expected, the rotation-system-based covariance path has the best performance in terms of fitting error. Therefore, the corresponding system matrices could provide a data-driven tool for understanding the structured fluctuations of functional brain activities. The spatiotemporally varying features of functional brain networks could potentially provide useful information for understanding the pathologies of mental disorders. In future work, we will explore the proposed covariance paths to analyze data from other neuroimaging modalities, such as electroencephalography and magnetoencephalography.

Acknowledgment

The author would like to thank T. Georgiou and Y. Rathi for insightful discussions. The author would also like to thank the anonymous reviewers for their valuable comments and suggestions to improve the quality of this paper.

This work was supported under Grant R21MH116352, Grant R21MH115280 (PI: Ning), Grant R01MH097979 (PI: Rathi), Grant R01MH111917 (PI: Rathi), and Grant R01MH074794 (PI: Westin).

Biography


Lipeng Ning received the B.Sc. and M.Sc. degrees in control science and engineering from the Beijing Institute of Technology, Beijing, China, in 2006 and 2008, respectively, and the Ph.D. degree in electrical and computer engineering from the University of Minnesota, Minneapolis, MN, USA, in November 2013.

He is currently a Faculty Member with the Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA. He is interested in the application of mathematics in neuroimaging and neuroscience research. His current research focuses on developing neuroimaging techniques to improve the diagnosis and treatment of brain diseases.

References

  • [1] Porikli F, Tuzel O, and Meer P, "Covariance tracking using model update based on Lie algebra," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., Jun. 2006, vol. 1, pp. 728–735.
  • [2] Wu Y et al., "Real-time probabilistic covariance tracking with efficient model update," IEEE Trans. Image Process., vol. 21, no. 5, pp. 2824–2837, May 2012.
  • [3] Yang JF and Kaveh M, "Adaptive eigensubspace algorithms for direction or frequency estimation and tracking," IEEE Trans. Acoust., Speech, Signal Process., vol. 36, no. 2, pp. 241–251, Feb. 1988.
  • [4] Jiang X, Ning L, and Georgiou TT, "Distances and Riemannian metrics for multivariate spectral densities," IEEE Trans. Autom. Control, vol. 57, no. 7, pp. 1723–1735, Jul. 2012.
  • [5] Lenglet C, Rousson M, Deriche R, and Faugeras O, "Statistics on the manifold of multivariate normal distributions: Theory and application to diffusion tensor MRI processing," J. Math. Imag. Vis., vol. 25, no. 3, pp. 423–444, Oct. 2006.
  • [6] Dryden IL, Koloydenko A, and Zhou D, "Non-Euclidean statistics for covariance matrices, with applications to diffusion tensor imaging," Ann. Appl. Statist., vol. 3, no. 3, pp. 1102–1123, 2009.
  • [7] Hao X, Whitaker RT, and Fletcher PT, Adaptive Riemannian Metrics for Improved Geodesic Tracking of White Matter. Berlin, Germany: Springer, 2011, pp. 13–24.
  • [8] Biswal B, Yetkin FZ, Haughton VM, and Hyde JS, "Functional connectivity in the motor cortex of resting human brain using echo-planar MRI," Magn. Reson. Med., vol. 34, no. 4, pp. 537–541, 1995.
  • [9] Buckner RL, Krienen FM, and Yeo BTT, "Opportunities and limitations of intrinsic functional connectivity MRI," Nature Neurosci., vol. 16, pp. 832–837, 2013.
  • [10] Smith SM et al., "Functional connectomics from resting-state fMRI," Trends Cogn. Sci., vol. 17, no. 12, pp. 666–682, 2013.
  • [11] Chang C and Glover GH, "Time-frequency dynamics of resting-state brain connectivity measured with fMRI," NeuroImage, vol. 50, no. 1, pp. 81–98, 2010.
  • [12] Preti MG, Bolton TA, and Ville DVD, "The dynamic functional connectome: State-of-the-art and perspectives," NeuroImage, vol. 160, pp. 41–54, 2017.
  • [13] Rao C, "Information and the accuracy attainable in the estimation of statistical parameters," Bull. Calcutta Math. Soc., vol. 37, pp. 81–91, 1945.
  • [14] Amari S-I and Nagaoka H, Methods of Information Geometry. Providence, RI, USA: Amer. Math. Soc., 2000.
  • [15] Cencov N, Statistical Decision Rules and Optimal Inference. Providence, RI, USA: Amer. Math. Soc., 1982.
  • [16] Kass R and Vos P, Geometrical Foundations of Asymptotic Inference. New York, NY, USA: Wiley, 1997.
  • [17] Villani C, Topics in Optimal Transportation. Providence, RI, USA: Amer. Math. Soc., 2003.
  • [18] Rachev S and Rüschendorf L, Mass Transportation Problems, Vol. I and II (Probability and Its Applications). New York, NY, USA: Springer, 1998.
  • [19] Knott M and Smith CS, "On the optimal mapping of distributions," J. Optim. Theory Appl., vol. 43, no. 1, pp. 39–49, May 1984.
  • [20] Takatsu A, "Wasserstein geometry of Gaussian measures," Osaka J. Math., vol. 48, pp. 1005–1026, 2011.
  • [21] Jordan R, Kinderlehrer D, and Otto F, "The variational formulation of the Fokker–Planck equation," SIAM J. Math. Anal., vol. 29, no. 1, pp. 1–17, 1998.
  • [22] Benamou J-D and Brenier Y, "A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem," Numer. Math., vol. 84, no. 3, pp. 375–393, Jan. 2000.
  • [23] Ning L, Jiang X, and Georgiou T, "On the geometry of covariance matrices," IEEE Signal Process. Lett., vol. 20, no. 8, pp. 787–790, Aug. 2013.
  • [24] Bhatia R, Jain T, and Lim Y, "On the Bures–Wasserstein distance between positive definite matrices," Expo. Math., 2018.
  • [25] Uhlmann A, "The metric of Bures and the geometric phase," in Quantum Groups and Related Topics: Proceedings of the First Max Born Symposium, Gielerak R, Lukierski J, and Popowicz Z, Eds. Dordrecht, The Netherlands: Springer, 1992, p. 267.
  • [26] Petz D, "Geometry of canonical correlation on the state space of a quantum system," J. Math. Phys., vol. 35, pp. 780–795, 1994.
  • [27] Ferrante A, Pavon M, and Ramponi F, "Hellinger versus Kullback–Leibler multivariate spectrum approximation," IEEE Trans. Autom. Control, vol. 53, no. 4, pp. 954–967, May 2008.
  • [28] Kullback S and Leibler RA, "On information and sufficiency," Ann. Math. Statist., vol. 22, no. 1, pp. 79–86, 1951.
  • [29] Cover T and Thomas J, Elements of Information Theory. Hoboken, NJ, USA: Wiley-Interscience, 2008.
  • [30] Bhatia R, Positive Definite Matrices. Princeton, NJ, USA: Princeton Univ. Press, 2007.
  • [31] Schättler H and Ledzewicz U, Geometric Optimal Control: Theory, Methods and Examples (Interdisciplinary Applied Mathematics). New York, NY, USA: Springer, 2012.
  • [32] Ning L, "Regularization of covariance matrices on Riemannian manifolds using linear systems," May 2018, arXiv:1805.11699.
  • [33] Essen DV et al., "The human connectome project: A data acquisition perspective," NeuroImage, vol. 62, no. 4, pp. 2222–2231, 2012.
  • [34] Smith SM et al., "Resting-state fMRI in the human connectome project," NeuroImage, vol. 80, pp. 144–168, 2013.
  • [35] Fox M, Zhang D, Snyder A, and Raichle M, "The global signal and observed anticorrelated resting state brain networks," J. Neurophysiol., vol. 101, pp. 3270–3283, 2009.
  • [36] Yeo BT et al., "The organization of the human cerebral cortex estimated by intrinsic functional connectivity," J. Neurophysiol., vol. 106, pp. 1125–1165, 2011.
  • [37] Ning L and Rathi Y, "A dynamic regression approach for frequency-domain partial coherence and causality analysis of functional brain networks," IEEE Trans. Med. Imag., vol. 37, no. 9, pp. 1957–1969, Sep. 2018.
  • [38] Murphy K, Bodurka J, and Bandettini PA, "How long to scan? The relationship between fMRI temporal signal to noise ratio and necessary scan duration," NeuroImage, vol. 34, no. 2, pp. 565–574, 2007.
  • [39] Ning L, Georgiou TT, and Tannenbaum A, "On matrix-valued Monge–Kantorovich optimal mass transport," IEEE Trans. Autom. Control, vol. 60, no. 2, pp. 373–382, Feb. 2015.
  • [40] Yamamoto K, Chen Y, Ning L, Georgiou TT, and Tannenbaum A, "Regularization and interpolation of positive matrices," IEEE Trans. Autom. Control, vol. 63, no. 4, pp. 1208–1212, Apr. 2018.
  • [41] Chen Y, Georgiou T, and Tannenbaum A, "Matrix optimal mass transport: A quantum mechanical approach," IEEE Trans. Autom. Control, vol. 63, no. 8, pp. 2612–2619, Aug. 2018.
  • [42] Chen Y, Georgiou TT, and Pavon M, "Optimal steering of a linear stochastic system to a final probability distribution, part I," IEEE Trans. Autom. Control, vol. 61, no. 5, pp. 1158–1169, May 2016.
  • [43] Chen Y, Georgiou TT, and Pavon M, "Optimal steering of a linear stochastic system to a final probability distribution, part II," IEEE Trans. Autom. Control, vol. 61, no. 5, pp. 1170–1180, May 2016.
