Semiparametric regression

From Wikipedia, the free encyclopedia
Regression models that combine parametric and nonparametric models

In statistics, semiparametric regression includes regression models that combine parametric and nonparametric models. They are often used in situations where the fully nonparametric model may not perform well or when the researcher wants to use a parametric model but the functional form with respect to a subset of the regressors or the density of the errors is not known. Semiparametric regression models are a particular type of semiparametric modelling and, since semiparametric models contain a parametric component, they rely on parametric assumptions and may be misspecified and inconsistent, just like a fully parametric model.

Methods

Many different semiparametric regression methods have been proposed and developed. The most popular methods are the partially linear, index and varying coefficient models.

Partially linear models

A partially linear model is given by

$$Y_i = X_i'\beta + g(Z_i) + u_i, \quad i = 1, \ldots, n,$$

where $Y_i$ is the dependent variable, $X_i$ is a $p \times 1$ vector of explanatory variables, $\beta$ is a $p \times 1$ vector of unknown parameters and $Z_i \in \mathbb{R}^q$. The parametric part of the partially linear model is given by the parameter vector $\beta$ while the nonparametric part is the unknown function $g(Z_i)$. The data is assumed to be i.i.d. with $E(u_i \mid X_i, Z_i) = 0$ and the model allows for a conditionally heteroskedastic error process $E(u_i^2 \mid x, z) = \sigma^2(x, z)$ of unknown form. This type of model was proposed by Robinson (1988) and extended to handle categorical covariates by Racine and Li (2007).

This method is implemented by obtaining a $\sqrt{n}$-consistent estimator of $\beta$ and then deriving an estimator of $g(Z_i)$ from the nonparametric regression of $Y_i - X_i'\hat{\beta}$ on $Z_i$ using an appropriate nonparametric regression method.[1]
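
The two-step procedure can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not a reference implementation: the first step uses the double-residual idea associated with Robinson (1988) (regress both $Y$ and $X$ on $Z$ nonparametrically, then run least squares on the residuals), $Z$ is taken to be scalar, the smoother is a Nadaraya-Watson estimator with a Gaussian kernel, and the bandwidth `h` is fixed by hand rather than selected from the data. The function names `nw_regress` and `fit_partially_linear` and the simulated data are hypothetical.

```python
import numpy as np

def nw_regress(z_train, v_train, z_eval, h):
    """Nadaraya-Watson estimate of E[v | z] at z_eval (Gaussian kernel, bandwidth h)."""
    d = (z_eval[:, None] - z_train[None, :]) / h
    k = np.exp(-0.5 * d ** 2)                      # kernel weights, shape (n_eval, n_train)
    weights = k / k.sum(axis=1, keepdims=True)     # each row sums to one
    return weights @ v_train                       # works for 1-D or 2-D v_train

def fit_partially_linear(y, x, z, h):
    """Two-step fit of Y = X'beta + g(Z) + u with scalar Z (assumed setup)."""
    # Step 1: remove the part of Y and X explained by Z, then run OLS on the residuals.
    y_res = y - nw_regress(z, y, z, h)
    x_res = x - nw_regress(z, x, z, h)
    beta_hat, *_ = np.linalg.lstsq(x_res, y_res, rcond=None)
    # Step 2: nonparametric regression of Y - X'beta_hat on Z estimates g.
    g_hat = nw_regress(z, y - x @ beta_hat, z, h)
    return beta_hat, g_hat

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(0)
n = 500
z = rng.uniform(0.0, 1.0, n)
x = np.column_stack([rng.normal(size=n) + z, rng.normal(size=n)])
beta_true = np.array([1.5, -0.5])
y = x @ beta_true + np.sin(2 * np.pi * z) + 0.1 * rng.normal(size=n)

beta_hat, g_hat = fit_partially_linear(y, x, z, h=0.05)
print("beta_hat:", beta_hat)
```

In practice the bandwidth would be chosen by a data-driven rule such as cross-validation, and a multivariate or mixed-data kernel would be needed when $Z_i$ has more than one component.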

Index models

A single index model takes the form

$$Y = g(X'\beta_0) + u,$$

where $Y$, $X$ and $\beta_0$ are defined as earlier and the error term $u$ satisfies $E(u \mid X) = 0$. The single index model takes its name from the parametric part of the model, $X'\beta$, which is a scalar single index. The nonparametric part is the unknown function $g(\cdot)$.

Ichimura's method

The single index model method developed by Ichimura (1993) is as follows. Consider the situation in which $Y$ is continuous. Given a known form for the function $g(\cdot)$, $\beta_0$ could be estimated using the nonlinear least squares method to minimize the function

$$\sum_{i=1}^{n} \left(Y_i - g(X_i'\beta)\right)^2.$$

Since the functional form of $g(\cdot)$ is not known, it must be estimated. For a given value of $\beta$, the function

$$G(X_i'\beta) = E(Y_i \mid X_i'\beta) = E\left[g(X_i'\beta_0) \mid X_i'\beta\right]$$

can be estimated using a kernel method. Ichimura (1993) proposes estimating $g(X_i'\beta)$ with

$$\hat{G}_{-i}(X_i'\beta),$$

the leave-one-out nonparametric kernel estimator of $G(X_i'\beta)$.
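
A small sketch may help make the objective concrete. It plugs a leave-one-out Nadaraya-Watson estimate into the sum of squared residuals and minimizes over $\beta$. Several choices here are assumptions for illustration only: the Gaussian kernel, the fixed bandwidth `h`, the scale normalization $\beta = (1, b)'$ (some normalization is needed because $g$ is unknown), and a plain grid search in place of a numerical optimizer. The names `loo_nw` and `ichimura_objective` are hypothetical.

```python
import numpy as np

def loo_nw(index, y, h):
    """Leave-one-out Nadaraya-Watson estimate of E[y | index] at each sample point."""
    d = (index[:, None] - index[None, :]) / h
    k = np.exp(-0.5 * d ** 2)
    np.fill_diagonal(k, 0.0)                       # drop observation i from its own fit
    return (k @ y) / np.maximum(k.sum(axis=1), 1e-12)

def ichimura_objective(b, y, x, h):
    """Sum of squared residuals with g replaced by its leave-one-out estimate."""
    beta = np.array([1.0, b])                      # first coefficient normalized to 1 (assumption)
    index = x @ beta
    return np.sum((y - loo_nw(index, y, h)) ** 2)

# Simulated example (illustrative values only).
rng = np.random.default_rng(1)
n = 400
x = rng.normal(size=(n, 2))
b_true = 0.5
y = np.sin(x @ np.array([1.0, b_true])) + 0.1 * rng.normal(size=n)

grid = np.linspace(-2.0, 2.0, 81)
sse = np.array([ichimura_objective(b, y, x, h=0.2) for b in grid])
b_hat = grid[np.argmin(sse)]
print("b_hat:", b_hat)
```

Leaving observation $i$ out of its own fit matters here: without it, shrinking the bandwidth would drive the residuals toward zero for any $\beta$, and the objective would no longer discriminate between candidate values.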

Klein and Spady's estimator

If the dependent variable $Y$ is binary and $X_i$ and $u_i$ are assumed to be independent, Klein and Spady (1993) propose a technique for estimating $\beta$ using maximum likelihood methods. The log-likelihood function is given by

$$L(\beta) = \sum_i (1 - Y_i) \ln\left(1 - \hat{g}_{-i}(X_i'\beta)\right) + \sum_i Y_i \ln\left(\hat{g}_{-i}(X_i'\beta)\right),$$

where $\hat{g}_{-i}(X_i'\beta)$ is the leave-one-out estimator.
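
The log-likelihood above can be evaluated directly once a leave-one-out estimate of $P(Y_i = 1 \mid X_i'\beta)$ is available. The sketch below is a simplified illustration under several assumptions that are not part of the source: a Gaussian kernel with fixed bandwidth `h`, the normalization $\beta = (1, b)'$, clipping of the estimated probabilities to $[\varepsilon, 1 - \varepsilon]$ so the logarithms stay finite (a crude stand-in for the trimming used in the literature), and a grid search over `b`. The function names are hypothetical.

```python
import numpy as np

def loo_nw(index, y, h):
    """Leave-one-out kernel estimate of E[y | index]; for binary y this is P(y = 1 | index)."""
    d = (index[:, None] - index[None, :]) / h
    k = np.exp(-0.5 * d ** 2)
    np.fill_diagonal(k, 0.0)                       # exclude observation i from its own estimate
    return (k @ y) / np.maximum(k.sum(axis=1), 1e-12)

def klein_spady_loglik(b, y, x, h, eps=1e-6):
    """Semiparametric log-likelihood for a binary response at beta = (1, b)."""
    index = x @ np.array([1.0, b])
    p = np.clip(loo_nw(index, y, h), eps, 1.0 - eps)   # keep both logs finite (assumption)
    return np.sum((1.0 - y) * np.log(1.0 - p) + y * np.log(p))

# Simulated binary-response example (illustrative values only).
rng = np.random.default_rng(2)
n = 800
x = rng.normal(size=(n, 2))
b_true = 0.5
u = 0.5 * rng.normal(size=n)                       # error independent of x
y = (x @ np.array([1.0, b_true]) + u > 0).astype(float)

grid = np.linspace(-1.0, 2.0, 61)
loglik = np.array([klein_spady_loglik(b, y, x, h=0.3) for b in grid])
b_hat = grid[np.argmax(loglik)]
print("b_hat:", b_hat)
```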

Smooth coefficient/varying coefficient models

Hastie and Tibshirani (1993) propose a smooth coefficient model given by

$$Y_i = \alpha(Z_i) + X_i'\beta(Z_i) + u_i = (1, X_i') \begin{pmatrix} \alpha(Z_i) \\ \beta(Z_i) \end{pmatrix} + u_i = W_i'\gamma(Z_i) + u_i,$$

where $X_i$ is a $k \times 1$ vector, $\beta(z)$ is a vector of unspecified smooth functions of $z$, $W_i = (1, X_i')'$ and $\gamma(z) = (\alpha(z), \beta(z)')'$.

Premultiplying by $W_i$ and taking expectations conditional on $Z_i$, with $E(u_i \mid X_i, Z_i) = 0$, $\gamma(\cdot)$ may be expressed as

$$\gamma(Z_i) = \left(E[W_i W_i' \mid Z_i]\right)^{-1} E[W_i Y_i \mid Z_i].$$
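
One way to turn this expression into an estimator is to replace the two conditional expectations with kernel-weighted sample averages around a point $z$, which amounts to a kernel-weighted least squares fit. The sketch below does this for scalar $Z$; the Gaussian kernel, the fixed bandwidth `h`, the function name `smooth_coef` and the simulated data are all assumptions made for illustration.

```python
import numpy as np

def smooth_coef(z0, y, x, z, h):
    """Local-constant estimate of gamma(z0) = (alpha(z0), beta(z0)')' for scalar z."""
    w = np.column_stack([np.ones(len(y)), x])      # rows are W_i' = (1, X_i')
    k = np.exp(-0.5 * ((z - z0) / h) ** 2)         # kernel weight of each observation
    wk = w * k[:, None]
    a = wk.T @ w                                   # sample analogue of E[W W' | Z = z0]
    b = wk.T @ y                                   # sample analogue of E[W Y | Z = z0]
    return np.linalg.solve(a, b)

# Simulated example with alpha(z) = z^2 and beta(z) = sin(2*pi*z) (illustrative only).
rng = np.random.default_rng(3)
n = 1000
z = rng.uniform(0.0, 1.0, n)
x = rng.normal(size=n)
y = z ** 2 + x * np.sin(2 * np.pi * z) + 0.1 * rng.normal(size=n)

gamma_hat = smooth_coef(0.25, y, x, z, h=0.05)
print("gamma_hat at z = 0.25:", gamma_hat)         # true values are (0.0625, 1.0)
```

The normalizing constants of the two kernel averages cancel in the product, so the unnormalized weighted sums are enough.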

Notes

  1. See Li and Racine (2007) for an in-depth look at nonparametric regression methods.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Semiparametric_regression&oldid=1086588362"