How to use bife

Binary choice models with individual fixed effects

In econometrics, fixed effects binary choice models are important tools for panel data analysis. Our package provides an approach suggested by Stammann, Heiss, and McFadden (2016) to estimate logit and probit panel data models of the following form:

\[y_{it} = \mathbf{1}\left[\mathbf{x}_{it}\boldsymbol{\beta} + \alpha_{i}> \epsilon_{it}\right] \;,\]

where \(i = 1, \dots, N\) and \(t = 1, \dots, T_i\) denote different panel indices. In many applications, \(i\) represents individuals, firms, or other cross-sectional units and \(t\) represents time in a longitudinal data set. The setup is also useful in other settings, for instance if \(i\) represents ZIP code areas and \(t\) is an index of individuals.

We are primarily interested in estimating the parameters \(\boldsymbol{\beta}\), but the model also includes individual fixed effects \(\alpha_{i}\). We assume \(E(\epsilon_{it} | \mathbf{X}_{i}, \alpha_{i}) = 0\) but do not make any assumptions about the marginal distribution of \(\alpha_{i}\) or its correlation with the regressors \(\mathbf{x}_{i1},\dots,\mathbf{x}_{iT_i}\).
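The threshold model above implies a choice probability and a log-likelihood. The following is a sketch under the additional assumption, standard for logit and probit, that \(\epsilon_{it}\) is independently distributed with a known cumulative distribution function \(F\); the symbols \(F\) and \(\ell\) are introduced here for illustration only.

```latex
\[
\Pr(y_{it} = 1 \mid \mathbf{x}_{it}, \alpha_{i})
  = \Pr(\epsilon_{it} < \mathbf{x}_{it}\boldsymbol{\beta} + \alpha_{i})
  = F(\mathbf{x}_{it}\boldsymbol{\beta} + \alpha_{i}),
\qquad
F(z) =
\begin{cases}
  1 / (1 + e^{-z}) & \text{(logit)} \\
  \Phi(z)          & \text{(probit)}
\end{cases}
\]

\[
\ell(\boldsymbol{\beta}, \alpha_{1}, \dots, \alpha_{N})
  = \sum_{i=1}^{N} \sum_{t=1}^{T_i}
    \Bigl[ y_{it} \log F(\mathbf{x}_{it}\boldsymbol{\beta} + \alpha_{i})
         + (1 - y_{it}) \log\bigl(1 - F(\mathbf{x}_{it}\boldsymbol{\beta} + \alpha_{i})\bigr) \Bigr]
\]
```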

The estimator implemented in this package is based on maximum likelihood (ML) estimation of both \(\boldsymbol{\beta}\) and \(\alpha_{1}, \dots, \alpha_{N}\). It is in fact equivalent to a generalized linear model (glm()) for binomial data where the set of regressors is extended by a dummy variable for each individual. The main difference is that bife() applies a pseudo-demeaning algorithm proposed by Stammann, Heiss, and McFadden (2016) to concentrate out the fixed effects from the optimization problem.¹ Its computational costs are lower by orders of magnitude if \(N\) is reasonably large.
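To make the dummy-variable formulation concrete, here is a minimal sketch in base R on simulated data. The sample sizes, the true coefficient of 1, and all variable names are assumptions chosen for illustration; the block shows the brute-force glm() approach and not the pseudo-demeaning algorithm used inside bife().

```r
set.seed(42)

# Simulated probit panel (all values are illustrative assumptions).
n_id      <- 100   # number of individuals
n_periods <- 10    # observations per individual
id    <- rep(seq_len(n_id), each = n_periods)
alpha <- rnorm(n_id)                               # individual fixed effects
x     <- rnorm(n_id * n_periods) + 0.5 * alpha[id] # regressor correlated with alpha
y     <- as.integer(x + alpha[id] > rnorm(n_id * n_periods))
panel <- data.frame(id = factor(id), x = x, y = y)

# Individuals whose outcome never varies have no finite ML estimate of
# their fixed effect, so they are dropped before fitting.
varies <- tapply(panel$y, panel$id, function(v) length(unique(v)) > 1)
panel  <- panel[panel$id %in% names(varies)[varies], ]
panel$id <- droplevels(panel$id)

# Probit with one dummy per remaining individual (no common intercept).
fit <- glm(y ~ 0 + x + id, family = binomial(link = "probit"), data = panel)
coef(fit)["x"]
```

The design matrix here has one column per individual, which is why this approach becomes impractical as the number of individuals grows.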

It is well known that the ML estimator is not consistent as \(N \rightarrow \infty\) with \(T\) fixed. This “incidental parameters problem” can be severe if \(T\) is small. To tackle this problem, we provide an analytical bias correction for the structural parameters \(\boldsymbol{\beta}\) and the average partial effects derived by Fernández-Val (2009).² Thus this package is well suited to analyse big micro-data where \(N\) and/or \(T\) are large.
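The logic of an analytical bias correction can be summarized in a generic form. The following is a sketch for a balanced panel with \(T\) periods, in the spirit of the large-\(T\) literature cited in this document; the symbols \(\mathbf{B}\), \(\hat{\mathbf{B}}\), and \(\tilde{\boldsymbol{\beta}}\) are notation assumed here, and the model-specific expression for \(\mathbf{B}\) is not reproduced.

```latex
\[
\operatorname*{plim}_{N \rightarrow \infty} \hat{\boldsymbol{\beta}}
  = \boldsymbol{\beta} + \frac{\mathbf{B}}{T} + O(T^{-2}),
\qquad
\tilde{\boldsymbol{\beta}} = \hat{\boldsymbol{\beta}} - \frac{\hat{\mathbf{B}}}{T}
\]
```

Subtracting an estimate of the leading term removes the bias of order \(1/T\), so the remaining bias shrinks faster as \(T\) grows.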

Estimating a binary-choice model with individual effects

In the following we use an example from labor economics to demonstrate the capabilities of bife(). More precisely, we use a balanced micro panel data set from the Panel Study of Income Dynamics to analyze the intertemporal labor force participation of 1,461 married women observed for nine years. A similar empirical illustration is used in Fernández-Val (2009) and is an adaptation from Hyslop (1999).

Before we start, we briefly inspect the data set to get an idea about its structure and potential covariates.

data(psid, package = "bife")
head(psid)
##    ID LFP KID1 KID2 KID3     INCH AGE TIME
## 1:  1   1    1    1    1 58807.81  26    1
## 2:  1   1    1    0    2 41741.87  27    2
## 3:  1   1    0    1    2 51320.73  28    3
## 4:  1   1    0    1    2 48958.58  29    4
## 5:  1   1    0    1    2 53634.62  30    5
## 6:  1   1    0    0    3 50983.13  31    6

ID and TIME are individual and time-specific identifiers, LFP is an indicator equal to one if a woman is in the labor force, KID1 to KID3 are the numbers of children in certain age groups, INCH is the annual income of the husband, and AGE is the age of the woman.

First, we use a specification similar to Fernández-Val (2009) and estimate a static model of women’s labor supply where we control for unobserved individual heterogeneity (so-called individual fixed effects).

library(bife)
stat <- bife(
  LFP ~ KID1 + KID2 + KID3 + log(INCH) + AGE + I(AGE^2) | ID,
  data  = psid,
  model = "probit"
)
summary(stat)
## binomial - probit link
##
## LFP ~ KID1 + KID2 + KID3 + log(INCH) + AGE + I(AGE^2) | ID
##
## Estimates:
##             Estimate Std. error z value Pr(> |z|)
## KID1      -0.7144667  0.0562414 -12.704   < 2e-16 ***
## KID2      -0.4114554  0.0515524  -7.981  1.45e-15 ***
## KID3      -0.1298776  0.0415477  -3.126   0.00177 **
## log(INCH) -0.2417657  0.0541720  -4.463  8.08e-06 ***
## AGE        0.2319724  0.0375351   6.180  6.40e-10 ***
## I(AGE^2)  -0.0028846  0.0004989  -5.781  7.41e-09 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##
## residual deviance= 6058.88,
## null deviance= 8152.05,
## n= 5976, N= 664
##
## ( 7173 observation(s) deleted due to perfect classification )
##
## Number of Fisher Scoring Iterations: 6
##
## Average individual fixed effect= -1.121

As with glm(), the model summary provides detailed information about the coefficients and some information about the model fit (residual deviance and null deviance). Furthermore, we report statistics that are specific to fixed effects models. More precisely, we learn that only 5,976 observations out of 13,149 contribute to the identification of the structural parameters. This is indicated by the message that 7,173 observations are deleted due to perfect classification. In binary choice models, these are the observations of women who never change their labor force participation status during the nine years observed, that is, women who were either always or never in the labor force. Overall, the estimation results are based on 664 women observed for nine years.
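The sample accounting in this paragraph can be checked with a few lines of arithmetic. The inputs below are taken from the data description and the summary output; only the variable names are assumptions made for this sketch.

```r
n_women   <- 1461   # women in the balanced panel
n_years   <- 9      # periods observed per woman
n_dropped <- 7173   # observations deleted due to perfect classification

n_total <- n_women * n_years     # all observations in the panel
n_used  <- n_total - n_dropped   # observations that identify beta

women_used    <- n_used / n_years     # women whose status changes at least once
women_dropped <- n_dropped / n_years  # women whose status never changes

c(n_total = n_total, n_used = n_used,
  women_used = women_used, women_dropped = women_dropped)
```

The results match the `n= 5976, N= 664` line of the summary and imply that 797 of the 1,461 women never change their participation status.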

Because the coefficients themselves are not very meaningful, researchers are usually interested in so-called partial effects (also known as marginal or ceteris paribus effects). A commonly used statistic is the average partial effect. bife offers a post-estimation routine to estimate average partial effects and their corresponding standard errors.

apes_stat <- get_APEs(stat)
summary(apes_stat)
## Estimates:
##             Estimate Std. error z value Pr(> |z|)
## KID1      -9.278e-02  7.728e-03 -12.006   < 2e-16 ***
## KID2      -5.343e-02  7.116e-03  -7.508  5.99e-14 ***
## KID3      -1.687e-02  5.995e-03  -2.813    0.0049 **
## log(INCH) -3.140e-02  7.479e-03  -4.198  2.69e-05 ***
## AGE        3.012e-02  5.258e-03   5.729  1.01e-08 ***
## I(AGE^2)  -3.746e-04  7.015e-05  -5.340  9.29e-08 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
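For intuition, the textbook definition of an average partial effect in a probit model can be sketched in a few lines of base R. The helper names `ape_continuous` and `ape_binary` and the toy inputs are hypothetical; this illustrates the definition only and is not a description of how get_APEs() computes its estimates or standard errors.

```r
# eta holds the linear index x_it * beta + alpha_i for every observation.

# Continuous regressor: average of the density-weighted coefficient.
ape_continuous <- function(beta_k, eta) {
  beta_k * mean(dnorm(eta))
}

# Binary regressor: average difference in predicted probabilities when the
# regressor is switched from 0 (eta0) to 1 (eta1).
ape_binary <- function(eta0, eta1) {
  mean(pnorm(eta1) - pnorm(eta0))
}

# Toy inputs (assumed values, not taken from the psid example).
eta <- c(-1, 0, 1)
ape_continuous(beta_k = 0.5, eta = eta)
ape_binary(eta0 = eta, eta1 = eta + 0.5)
```

Because the normal density is at most about 0.4, the average partial effect of a continuous regressor in a probit model is always smaller in absolute value than its coefficient.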

A widespread obstacle to the use of non-linear fixed effects models in practice is the so-called incidental parameters problem (IPP), first described by Neyman and Scott (1948). Fortunately, for classical panel data sets like the one in this example, several asymptotic bias corrections that tackle the IPP already exist (see Fernández-Val and Weidner (2018) for an overview). Our package provides a post-estimation routine that applies the analytical bias correction derived by Fernández-Val (2009).

stat_bc <- bias_corr(stat)
summary(stat_bc)
## binomial - probit link
##
## LFP ~ KID1 + KID2 + KID3 + log(INCH) + AGE + I(AGE^2) | ID
##
## Estimates:
##             Estimate Std. error z value Pr(> |z|)
## KID1      -0.6308839  0.0555073 -11.366   < 2e-16 ***
## KID2      -0.3635269  0.0511325  -7.110  1.16e-12 ***
## KID3      -0.1149869  0.0413488  -2.781   0.00542 **
## log(INCH) -0.2139549  0.0536613  -3.987  6.69e-05 ***
## AGE        0.2052708  0.0373054   5.502  3.75e-08 ***
## I(AGE^2)  -0.0025520  0.0004962  -5.143  2.70e-07 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##
## residual deviance= 6058.88,
## null deviance= 8152.05,
## n= 5976, N= 664
##
## ( 7173 observation(s) deleted due to perfect classification )
##
## Number of Fisher Scoring Iterations: 6
##
## Average individual fixed effect= -0.969
apes_stat_bc <- get_APEs(stat_bc)
summary(apes_stat_bc)
## Estimates:
##             Estimate Std. error z value Pr(> |z|)
## KID1      -1.016e-01  7.582e-03 -13.394   < 2e-16 ***
## KID2      -5.852e-02  7.057e-03  -8.292   < 2e-16 ***
## KID3      -1.851e-02  5.951e-03  -3.110   0.00187 **
## log(INCH) -3.444e-02  7.376e-03  -4.669  3.03e-06 ***
## AGE        3.304e-02  5.235e-03   6.312  2.76e-10 ***
## I(AGE^2)  -4.108e-04  6.986e-05  -5.880  4.10e-09 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

While analytical bias corrections for static models are receiving more and more attention in applied work, it is not well known that they can also be used for dynamic models with fixed effects.

Before we can turn our static specification into a dynamic one, we first have to generate a lagged dependent variable.

library(data.table)
setDT(psid)
setkey(psid, ID, TIME)
psid[, LLFP := shift(LFP), by = ID]
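The effect of the grouped lag can be illustrated on a toy panel in base R. The data frame `toy` and its values are invented for this sketch, and `ave()` is used here as a stand-in for the data.table call above rather than as a replacement for it.

```r
# Toy panel with two individuals, deliberately unsorted.
toy <- data.frame(
  ID   = c(2, 1, 1, 2, 1, 2),
  TIME = c(1, 2, 1, 3, 3, 2),
  LFP  = c(0, 1, 1, 1, 0, 0)
)

# Sort by individual and time so that "previous row" means "previous period".
toy <- toy[order(toy$ID, toy$TIME), ]

# Lag LFP within each individual; the first period has no predecessor.
toy$LLFP <- ave(toy$LFP, toy$ID, FUN = function(v) c(NA, head(v, -1)))
toy
```

Each individual's first period receives a missing value. In the psid data this is one observation per woman, which is consistent with the 1,461 observations reported as deleted due to missingness in the dynamic model's summary.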

In contrast to the bias correction for static models, we additionally need to provide a bandwidth parameter (L) that is required for the estimation of spectral densities (see Hahn and Kuersteiner (2011)). Fernández-Val and Weidner (2018) suggest conducting a sensitivity analysis that tries different values for L, none larger than four.

dyn <- bife(
  LFP ~ LLFP + KID1 + KID2 + KID3 + log(INCH) + AGE + I(AGE^2) | ID,
  data  = psid,
  model = "probit"
)
dyn_bc <- bias_corr(dyn, L = 1L)
summary(dyn_bc)
## binomial - probit link
##
## LFP ~ LLFP + KID1 + KID2 + KID3 + log(INCH) + AGE + I(AGE^2) |
##     ID
##
## Estimates:
##             Estimate Std. error z value Pr(> |z|)
## LLFP       1.0025625  0.0473066  21.193   < 2e-16 ***
## KID1      -0.4741275  0.0679073  -6.982  2.91e-12 ***
## KID2      -0.1958365  0.0625921  -3.129  0.001755 **
## KID3      -0.0754042  0.0505110  -1.493  0.135482
## log(INCH) -0.1946970  0.0621143  -3.134  0.001722 **
## AGE        0.2009569  0.0477728   4.207  2.59e-05 ***
## I(AGE^2)  -0.0024142  0.0006293  -3.836  0.000125 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##
## residual deviance= 4774.57,
## null deviance= 6549.14,
## n= 4792, N= 599
##
## ( 1461 observation(s) deleted due to missingness )
## ( 6896 observation(s) deleted due to perfect classification )
##
## Number of Fisher Scoring Iterations: 6
##
## Average individual fixed effect= -1.939
apes_dyn_bc <- get_APEs(dyn_bc)
summary(apes_dyn_bc)
## Estimates:
##             Estimate Std. error z value Pr(> |z|)
## LLFP       1.826e-01  6.671e-03  27.378   < 2e-16 ***
## KID1      -7.525e-02  7.768e-03  -9.687   < 2e-16 ***
## KID2      -3.108e-02  7.239e-03  -4.294  1.76e-05 ***
## KID3      -1.197e-02  5.886e-03  -2.033     0.042 *
## log(INCH) -3.090e-02  6.992e-03  -4.419  9.91e-06 ***
## AGE        3.189e-02  5.403e-03   5.903  3.57e-09 ***
## I(AGE^2)  -3.832e-04  7.107e-05  -5.391  7.00e-08 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
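The sensitivity analysis over the bandwidth mentioned earlier could be organized as follows. This is a sketch: the helper `bandwidth_sensitivity` and its `corr` argument are hypothetical names introduced here, and the sketch assumes that `coef()` returns the structural parameter estimates of a bias-corrected fit.

```r
# Collect bias-corrected coefficients for several bandwidths, one column per L.
# `corr` defaults to bife::bias_corr and is only looked up when the function is
# called; it is an argument so that the helper can be tried without bife.
bandwidth_sensitivity <- function(fit, bandwidths = 1:4, corr = bife::bias_corr) {
  estimates <- lapply(bandwidths, function(L) coef(corr(fit, L = L)))
  out <- do.call(cbind, estimates)
  colnames(out) <- paste0("L=", bandwidths)
  out
}

# Stand-in corrector (purely illustrative) that mimics the calling convention.
fake_corr <- function(fit, L) list(coefficients = c(LLFP = 1 / L))
bandwidth_sensitivity(fit = NULL, bandwidths = 1:3, corr = fake_corr)
```

With the objects from this example, `round(bandwidth_sensitivity(dyn), 3)` would tabulate the estimates for L = 1 to 4. Estimates that change little across columns suggest that the choice of bandwidth is not driving the results.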

References

Chamberlain, Gary. 1980. “Analysis of Covariance with Qualitative Data.” The Review of Economic Studies 47 (1): 225–38.
Fernández-Val, Iván. 2009. “Fixed Effects Estimation of Structural Parameters and Marginal Effects in Panel Probit Models.” Journal of Econometrics 150 (1): 71–85.
Fernández-Val, Iván, and Martin Weidner. 2018. “Fixed Effects Estimation of Large-t Panel Data Models.” Annual Review of Economics 10 (1): 109–38.
Greene, William. 2004. “The Behaviour of the Maximum Likelihood Estimator of Limited Dependent Variable Models in the Presence of Fixed Effects.” Econometrics Journal 7 (1): 98–119.
Hahn, Jinyong, and Guido Kuersteiner. 2011. “Bias Reduction for Dynamic Nonlinear Panel Models with Fixed Effects.” Econometric Theory 27 (6): 1152–91.
Hahn, Jinyong, and Whitney Newey. 2004. “Jackknife and Analytical Bias Reduction for Nonlinear Panel Models.” Econometrica 72 (4): 1295–1319.
Hyslop, Dean R. 1999. “State Dependence, Serial Correlation and Heterogeneity in Intertemporal Labor Force Participation of Married Women.” Econometrica 67 (6): 1255–94.
Neyman, Jerzy, and Elizabeth L. Scott. 1948. “Consistent Estimates Based on Partially Consistent Observations.” Econometrica 16 (1): 1–32.
Stammann, Amrei, Florian Heiss, and Daniel McFadden. 2016. “Estimating Fixed Effects Logit Models with Large Panel Data.”

  1. The proposed pseudo-demeaning algorithm is in the spirit of Greene (2004) and Chamberlain (1980).↩︎

  2. The bias correction is a refinement of Hahn and Newey (2004) that is also applicable to dynamic models.↩︎

