Precision matrix estimation requires selecting an appropriate regularization parameter $\lambda$ to balance sparsity (number of edges) against model fit (likelihood), and a mixing parameter $\alpha$ to trade off element-wise (individual-level) and block-wise (group-level) penalties.
In a Gaussian graphical model (GGM), the data matrix $X_{n \times d}$ consists of $n$ independent and identically distributed observations $X_1, \ldots, X_n$ drawn from $N_d(\mu, \Sigma)$. Let $\Omega = \Sigma^{-1}$ denote the precision matrix, and define the empirical covariance matrix as $S = n^{-1} \sum_{i=1}^n (X_i-\bar{X})(X_i-\bar{X})^\top$. Up to an additive constant, the negative log-likelihood (nll) for $\Omega$ simplifies to $$\mathrm{nll}(\Omega) = \frac{n}{2}\left[-\log\det(\Omega) + \mathrm{tr}(S\Omega)\right].$$ The edge set $E(\Omega)$ is determined by the non-zero off-diagonal entries: an edge $(i, j)$ is included if and only if $\omega_{ij} \neq 0$ for $i < j$. The number of edges is therefore given by $|E(\Omega)|$.
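To make these quantities concrete, the following minimal Python sketch (the function names and the zero-threshold are illustrative, not from any particular package) computes the empirical covariance $S$, the negative log-likelihood $\mathrm{nll}(\Omega)$, and the edge count $|E(\Omega)|$ for a given precision matrix.

```python
import numpy as np

def empirical_covariance(X):
    """S = n^{-1} * sum_i (X_i - Xbar)(X_i - Xbar)^T."""
    Xc = X - X.mean(axis=0)
    return Xc.T @ Xc / X.shape[0]

def nll(Omega, S, n):
    """nll(Omega) = (n/2)[-log det(Omega) + tr(S Omega)]; Omega assumed positive definite."""
    _, logdet = np.linalg.slogdet(Omega)
    return 0.5 * n * (-logdet + np.trace(S @ Omega))

def edge_count(Omega, tol=1e-8):
    """|E(Omega)|: number of non-zero off-diagonal entries omega_ij with i < j."""
    return int((np.abs(np.triu(Omega, k=1)) > tol).sum())
```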
Several information criteria select among candidate estimates by penalizing the negative log-likelihood with the number of edges:
$$\hat{\Omega}_{\mathrm{AIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + 2\,|E(\Omega)|\right\}.$$
$$\hat{\Omega}_{\mathrm{BIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + \log(n)\,|E(\Omega)|\right\}.$$
$$\hat{\Omega}_{\mathrm{EBIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + \log(n)\,|E(\Omega)| + 4\xi\log(d)\,|E(\Omega)|\right\},$$
where $\xi \in [0, 1]$ is a tuning parameter. Setting $\xi = 0$ reduces the EBIC to the classic BIC.
$$\hat{\Omega}_{\mathrm{HBIC}} = \arg\min_{\Omega}\left\{2\,\mathrm{nll}(\Omega) + \log[\log(n)]\log(d)\,|E(\Omega)|\right\}.$$
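For illustration, the self-contained sketch below (function name and zero-threshold are illustrative) evaluates all four criteria for a single candidate $\Omega$; in practice each criterion is minimized over the candidate estimates produced along the tuning-parameter path.

```python
import numpy as np

def information_criteria(Omega, S, n, d, xi=0.5):
    """AIC, BIC, EBIC, and HBIC scores for one candidate precision matrix."""
    _, logdet = np.linalg.slogdet(Omega)                  # Omega assumed positive definite
    val = n * (-logdet + np.trace(S @ Omega))             # 2 * nll(Omega)
    k = int((np.abs(np.triu(Omega, k=1)) > 1e-8).sum())   # |E(Omega)|
    return {
        "AIC":  val + 2 * k,
        "BIC":  val + np.log(n) * k,
        "EBIC": val + np.log(n) * k + 4 * xi * np.log(d) * k,
        "HBIC": val + np.log(np.log(n)) * np.log(d) * k,
    }
```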
Figure 1 illustrates the $K$-fold cross-validation procedure used to tune the parameters $\lambda$ and $\alpha$. The notation $\#\lambda$ and $\#\alpha$ denotes the number of candidate values considered for $\lambda$ and $\alpha$, respectively, forming a grid of $\#\lambda \times \#\alpha$ total parameter combinations. In each of the $K$ iterations, the negative log-likelihood loss is evaluated for every parameter combination, yielding $K$ performance values per combination. The optimal parameter pair is selected as the one achieving the lowest average loss across the $K$ iterations.
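A rough Python sketch of this grid search is given below; `fit_precision(S, lam, alpha)` is a placeholder for whichever penalized precision-matrix estimator is being tuned, and `lambdas` and `alphas` are the candidate grids, so the double loop covers all $\#\lambda \times \#\alpha$ combinations.

```python
import numpy as np

def cv_select(X, lambdas, alphas, fit_precision, K=5, seed=0):
    """Pick (lambda, alpha) with the lowest average held-out negative log-likelihood."""
    n = X.shape[0]
    folds = np.random.default_rng(seed).permutation(n) % K   # random fold assignment
    loss = np.zeros((len(lambdas), len(alphas)))
    for k in range(K):
        train, test = X[folds != k], X[folds == k]
        S_train = np.cov(train, rowvar=False, bias=True)
        S_test = np.cov(test, rowvar=False, bias=True)
        for i, lam in enumerate(lambdas):
            for j, alpha in enumerate(alphas):
                Omega = fit_precision(S_train, lam, alpha)
                _, logdet = np.linalg.slogdet(Omega)
                # held-out negative log-likelihood loss (up to constants)
                loss[i, j] += -logdet + np.trace(S_test @ Omega)
    # dividing by K gives the average loss; it does not change the argmin
    i_best, j_best = np.unravel_index(np.argmin(loss / K), loss.shape)
    return lambdas[i_best], alphas[j_best]
```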