Movatterモバイル変換


[0]ホーム

URL:


KARDL

Introduction

Thekardl package is an R tool for estimating symmetricand asymmetric Autoregressive Distributed Lag (ARDL) and Nonlinear ARDL(NARDL) models, designed for econometricians and researchers analyzingcointegration and dynamic relationships in time series data. It offersflexible model specifications, allowing users to include deterministicvariables, asymmetric effects for short- and long-run dynamics, andtrend components. The package supports customizable lag structures,model selection criteria (AIC, BIC, AICc, HQ), and parallel processingfor computational efficiency. Key features include:

This vignette demonstrates how to use thekardl()function to estimate an asymmetric ARDL model, perform diagnostic tests,and visualize results, using economic data from Turkey.

Installation

kardl in R can easily be installed from its CRANrepository:

install.packages("kardl")library(kardl)

Alternatively, you can use thedevtools package to loaddirectly from GitHub:

# Install required packagesinstall.packages(c("stats","msm","lmtest","nlWaldTest","car","strucchange","utils"))# Install kardl from GitHubinstall.packages("devtools")devtools::install_github("karamelikli/kardl")

Load the package:

library(kardl)

Estimating an AsymmetricARDL Model

This example estimates an asymmetric ARDL model to analyze thedynamics of exchange rate pass-through to domestic prices in Turkey,using a sample dataset (imf_example_data) with variablesfor Consumer Price Index (CPI), Exchange Rate (ER), Producer Price Index(PPI), and a COVID-19 dummy variable.

Step 1: Data Preparation

Assumeimf_example_data contains monthly data for CPI,ER, PPI, and a COVID dummy variable. We prepare the data by ensuringproper formatting and adding the dummy variable. We retrieve data fromthe IMF’s International Financial Statistics (IFS) dataset and prepareit for analysis.

Note: Theimf_example_data is a placeholder fordemonstration purposes. You should replace it with your actual dataset.The data can be loaded byreadxl or other data importfunctions.

Step 2: Define the ModelFormula

We define the model formula using R’s formula syntax, incorporatingasymmetric effects and deterministic variables. We useasym() for variables with both short- and long-runasymmetry,deterministic() for fixed effects, andtrend for a linear time trend.

# Define the model formulaMyFormula<- CPI~ ER+ PPI+asym(ER+ PPI)+deterministic(covid)+ trend

Step 3: Model Estimation

We estimate the ARDL model using differentmode settingsto demonstrate flexibility in lag selection. Thekardl()function supports various modes:"grid","grid_custom","quick", or a user-defined lagvector.

Usingmode = "grid"

The"grid" mode evaluates all lag combinations up tomaxlag and provides console feedback.

# Set model optionskardl_set(criterion ="BIC",differentAsymLag =TRUE,data=imf_example_data)# Estimate model with grid modekardl_model<-kardl(data=imf_example_data,model= MyFormula,maxlag =4,mode ="grid")# View resultskardl_model# Display model summarysummary(kardl_model)

Using User-Defined Lags

Specify custom lags to bypass automatic lag selection:

kardl_model2<-kardl(data=imf_example_data, MyFormula,mode =c(2,1,1,3,0))# View resultskardl_model2$properLag

Using All Variables

Use the. operator to include all variables except thedependent variable:

kardl_set(data=imf_example_data)kardl(model =  CPI~ .+deterministic(covid),mode ="grid")

Visualizing Lag Criteria

TheLagCriteria component contains lag combinations andtheir criterion values. We visualize these to compare model selectioncriteria (AIC, BIC, HQ).

library(dplyr)library(tidyr)library(ggplot2)# Convert LagCriteria to a data frameLagCriteria<-as.data.frame(kardl_model[["LagCriteria"]])colnames(LagCriteria)<-c("lag","AIC","BIC","AICc","HQ")LagCriteria<- LagCriteria%>%mutate(across(c(AIC, BIC, HQ), as.numeric))# Pivot to long formatLagCriteria_long<- LagCriteria%>%select(-AICc)%>%pivot_longer(cols =c(AIC, BIC, HQ),names_to ="Criteria",values_to ="Value")# Find minimum valuesmin_values<- LagCriteria_long%>%group_by(Criteria)%>%slice_min(order_by = Value)%>%ungroup()# Plotggplot(LagCriteria_long,aes(x = lag,y = Value,color = Criteria,group = Criteria))+geom_line()+geom_point(data = min_values,aes(x = lag,y = Value),color ="red",size =3,shape =8)+geom_text(data = min_values,aes(x = lag,y = Value,label = lag),vjust =1.5,color ="black",size =3.5)+labs(title ="Lag Criteria Comparison",x ="Lag Configuration",y ="Criteria Value")+theme_minimal()+theme(axis.text.x =element_text(angle =45,hjust =1))

Step 4: Long-Run Coefficients

We calculate long-run coefficients usingkardl_longrun(), which standardizes coefficients bydividing them by the negative of the dependent variable’s long-runparameter.

# Long-run coefficientsmylong<-kardl_longrun(kardl_model)mylong

Step 5: Asymmetry Test

Theasymmetrytest() function performs Wald tests toassess short- and long-run asymmetry in the model.

ast<- imf_example_data%>%kardl(CPI~ ER+ PPI+asym(ER+ PPI)+deterministic(covid)+ trend,mode =c(1,2,3,0,1))%>%asymmetrytest()ast# View long-run hypothesesast$Lhypotheses# Detailed resultssummary(ast)

Step 6: Cointegration Tests

We perform cointegration tests to assess long-term relationshipsusingpssf(),psst(),narayan(),banerjee(), andrecmt().

PSS F Bound Test

Thepssf() function tests for cointegration using thePesaran, Shin, and Smith F Bound test.

A<- kardl_model%>%pssf(case =3,signif_level ="0.05")cat(paste0("The F statistic = ", A$statistic," where k = ", A$k,"."))cat(paste0("\nWe found '", A$Cont,"' at ", A$siglvl,"."))A$criticalValuessummary(A)

PSS t Bound Test

Thepsst() function tests the significance of the laggeddependent variable’s coefficient.

A<- kardl_model%>%psst(case =3,signif_level ="0.05")cat(paste0("The t statistic = ", A$statistic," where k = ", A$k,"."))cat(paste0("\nWe found '", A$Cont,"' at ", A$siglvl,"."))A$criticalValuessummary(A)

Narayan Test

Thenarayan() function is tailored for small samplesizes.

A<- kardl_model%>%narayan(case =3,signif_level ="0.05")cat(paste0("The F statistic = ", A$statistic," where k = ", A$k,"."))cat(paste0("\nWe found '", A$Cont,"' at ", A$siglvl,"."))A$criticalValuessummary(A)

Banerjee Test

Thebanerjee() function is designed for small datasets(≤100 observations).

A<- kardl_model%>%banerjee(signif_level ="0.05")cat(paste0("The ECM parameter = ", A$coef,", k = ", A$k,", t statistic = ", A$statistic,"."))cat(paste0("\nWe found '", A$Cont,"' at ", A$siglvl,"."))A$criticalValuessummary(A)

Restricted ECM Test

Therecmt() function tests for cointegration using anError Correction Model.

recmt_model<- imf_example_data%>%recmt(MyFormula,mode ="grid_custom",case =3)recmt_model# View resultssummary(recmt_model)

Step 7: ARCH Test

Thearchtest() function checks for autoregressiveconditional heteroskedasticity in the model residuals.

arch_result<-archtest(kardl_model$finalModel$model$residuals,q =2)summary(arch_result)

Step 8: CustomizingAsymmetric Variables

We demonstrate how to customize prefixes and suffixes for asymmetricvariables usingkardl_set().

# Set custom prefixes and suffixeskardl_reset()kardl_set(AsymPrefix =c("asyP_","asyN_"),AsymSuffix =c("_PP","_NN"))kardl_custom<-kardl(data=imf_example_data, MyFormula,mode ="grid_custom")kardl_custom$properLag

Key Functions and Parameters

For detailed documentation, use?kardl,?kardl_set,?kardl_longrun,?asymmetrytest,?pssf,?psst,?narayan,?banerjee,?recmt, or?archtest.

Options

The options for the KARDL package are set by thekardl_set() function in R. The default values are set inthekardl_set list. You can change the options by using thekardl_set() function with the desired parameters. Thefollowing options are available:

Option NameDefaultDescription
dataNULLThe data to be used for the model estimation
modelNULLThe formula to be used for the model estimation
maxlag4The maximum number of lags to be considered for the modelestimation
mode“quick”The mode of the model estimation, can be “quick”, “grid”,“grid_custom” or a user-defined vector
criterion“AIC”The criterion for model selection, can be “AIC”, “BIC”, “HQ” or auser-defined function
differentAsymLagFALSEIf TRUE, the asymmetry lags will be different for positive andnegative shocks
AsymPrefixc()Prefix for asymmetry variables, default is empty
AsymSuffixc(“_POS”, “_NEG”)Suffix for asymmetry variables, default is “_POS” and “_NEG”
LongCoef“L{lag}.{varName}”Prefix for long-run coefficients, default is “L1.”
ShortCoef“L{lag}.d.{varName}”Prefix for short-run coefficients, default is “L1.d.”
batch“1/1”Batch size for parallel processing, default is “1/1”

The details of the options are as follows:

1. data

data is a data frame or time series object containingthe variables to be used in the model estimation. The default value isNULL, which means that the user must provide a dataset whencalling thekardl() function.

Details

Thedata parameter is essential for thekardl() function to perform model estimation. It shouldcontain all the variables specified in the model formula, including thedependent variable and any independent variables, asymmetric components,and deterministic variables defined in the formula. The trend will begenerated automatically if specified in the formula. Input data can bein the form of a data frame, tibble, or time series object (e.g.,ts,xts,zoo).

When providing thedata, ensure that: - The dataset isclean and free of missing values for the variables used in the model. -The variables are appropriately formatted (e.g., numeric for continuousvariables). - The time series data is ordered correctly, especially ifthe analysis involves lagged variables.

2. model

model is a formula object specifying the model to beestimated. The default value isNULL, which means that theuser must provide a model formula when calling thekardl()function.

Details

Themodel parameter defines the structure of the ARDL orNARDL model to be estimated. It should include the dependent variable onthe left side of the formula and the independent variables, asymmetriccomponents, deterministic variables, and trend (if applicable) on theright side. The formula can include: -asym(): To specifyvariables with asymmetric effects in both short- and long -run dynamics.-asymL(): To specify variables with asymmetric effectsonly in the long-run -dynamics. -asymS(): To specifyvariables with asymmetric effects only in the short-run -dynamics. -deterministic(): To include fixed dummy variables (e.g.,seasonal d -ummies, event dummies). -trend: To include alinear time trend in the model. When constructing themodelformula, ensure that: - All variables used in the formula are present inthedata provided. - The formula is syntactically correctand follows R’s formula conventions. - The use of asymmetric anddeterministic functions is appropriate for the research question anddata characteristics.

3. maxlag

maxlag is an integer value specifying the maximum numberof lags to be considered for the model estimation. The default value is4.

Details

Themaxlag parameter sets the upper limit for the numberof lags that thekardl() function will evaluate whenoptimizing the lag structure of the model. This is particularlyimportant when using modes like"grid" or"grid_custom", where the function systematically testsdifferent lag combinations up to the specified maximum. When choosing avalue formaxlag, consider the following: -DataFrequency: For monthly data, amaxlag of 4 isoften sufficient to capture short-term dynamics. For quarterly data, alowermaxlag ( e.g., 2) may be appropriate, while for dailydata, a highermaxlag (e.g., 8 or more) might be necessary.-Sample Size: A largermaxlag increasesthe number of parameters to estimate, which can be problematic withsmall sample sizes. Ensure that the sample size is adequate to supportthe number of lags being considered. -ModelComplexity: Highermaxlag values lead to morecomplex models, which may overfit the data. Balance the need forcapturing dynamics with the risk of overfitting. -ComputationalResources: Evaluating a large number of lag combinations can becomputationally intensive. Consider the available resources and timeconstraints when settingmaxlag.

4. mode

mode is a character string or numeric vector specifyingthe mode of the model estimation. The default value is"quick". The available options are:
-“quick”: This mode provides a fast estimation of themodel without optimizing the lags. It is suitable for initialexplorations or when the user has a predefined lag structure. -“grid”: This mode performs a grid search over allpossible lag combinations up to the specifiedmaxlag. Itprovides verbose output, including the lag criteria for eachcombination, and is useful for thorough lag optimization. -“grid_custom”: Similar to"grid", but withsilent execution. It is more efficient for large datasets or when theuser wants to avoid console output during the lag optimization process.-User-defined vector: The user can specify a customlag structure by providing a numeric vector (e.g.,c(1, 2, 4, 5)) or a named vector (e.g.,c(CPI = 2, ER_POS = 3, ER_NEG = 1, PPI = 3)). This allowsfor complete control over the lag selection process.

Details

Themode parameter determines how thekardl() function approaches the estimation of the ARDL orNARDL model. Each mode has its advantages and is suited to differentscenarios: - Use"quick" for rapid assessments when the lagstructure is already known or when computational speed is a priority. -Use"grid" for comprehensive lag optimization, especiallywhen the optimal lag structure is unknown. This mode is ideal forexploratory analysis and model selection. - Use"grid_custom" for efficient lag optimization withoutconsole output, particularly for large datasets or when running multiplemodels in batch mode. - Use a user-defined vector when the user hasspecific knowledge about the appropriate lags for each variable,allowing for tailored model specifications. When using"grid" or"grid_custom", ensure that themaxlag parameter is set appropriately to balance thethoroughness of the search with computational feasibility.

5. criterion

criterion is a character string specifying the criterionto be used for selecting the optimal lags. The default value is"AIC". The available options are:

Details

For detailed information on the model selection criteria used in themethods, see the documentation for themodelCriterionfunction.

The choice of the criterion can significantly impact the selected laglength and, consequently, the performance of the model. Each criterionhas its strengths and is suited to specific scenarios:

Ensure that the selected criterion aligns with the goals of youranalysis and the characteristics of your data.

Examples

Default criterion (AIC)
kardl_set(criterion ="AIC")kardl(data, MyFormula)
Using BIC for lag selection
kardl_set(criterion ="BIC")kardl(data, MyFormula)
Using AICc for small samplesizes
kardl_set(criterion ="AICc")data%>%kardl(MyFormula)
Using HQ criterion
kardl_set(criterion ="HQ")kardl(data, MyFormula)

6. differentAsymLag

differentAsymLag is a logical value (TRUEorFALSE) indicating whether positive and negativeasymmetric variables should be assigned different lags during theestimation process. The default value isFALSE, meaningthat both positive and negative components will use the same lag.

Details

Asymmetric decomposition separates a variable into its positive andnegative changes. In some models, it may be desirable to assigndifferent lags to these components to capture distinct dynamicbehaviors. SettingdifferentAsymLag = TRUE allows thefunction to optimize lags for positive and negative componentsindependently. WhendifferentAsymLag = FALSE, bothcomponents will share the same lag.

This parameter is particularly useful when:

Attention!

Examples

Usingthe same lag for positive and negative components (default)
kadrl_set(differentAsymLag =FALSE)kardl(data, MyFormula)
Assigningdifferent lags for positive and negative components
kardl_set(differentAsymLag =TRUE)kardl(data, MyFormula)

7. AsymPrefix

AsymPrefix is a character vector specifying the prefixesused for naming asymmetric variables created during positive andnegative decomposition. The default value is an empty vectorc(), indicating that no prefixes are added by default.

When specified, the prefixes are added to the beginning of variablenames to represent the positive and negative decomposition:

Details

Asymmetric decomposition is used to analyze the separate effects ofpositive and negative changes in a variable. For example, given avariableX, prefixes can be used to generatePOS_X andNEG_X for the positive and negativecomponents, respectively.

By default, no prefixes are applied (AsymPrefix = c()).However, users can define custom prefixes by providing a vector with twoelements. For example:

Attention!

Examples

Using default (no prefixes)

kardl_set(AsymPrefix =c())

Customprefixes for positive and negative decomposition

kardl_set(AsymPrefix =c("POS_","NEG_"))

Combining prefixes withsuffixes

kardl_set(AsymPrefix =c("Change_","Fall_"),AsymSuffix =c("_High","_Low"))

8. AsymSuffix

AsymSuffix is a character vector specifying the suffixesused for naming asymmetric variables created during positive andnegative decomposition. The default value isc("_POS", "_NEG"), where:

Details

The order of the suffixes is important:

Asymmetric decomposition is commonly used in models to separate theeffects of positive and negative changes in a variable. For example,given a variableX, the decomposition may result inX_POS andX_NEG to represent its positive andnegative components, respectively.

By default, the suffixes"_POS" and"_NEG"are used, but users can customize them as needed by providing a customvector. For example:

Attention!

9. LongCoef

LongCoef is a character string specifying the prefixformat for naming long-run coefficients in the model output. The defaultvalue is"L{lag}.{varName}", where:

This format generates names likeL1.X for the first lagof variableX.

Details

Long-run coefficients represent the long-term relationships betweenthe dependent variable and independent variables in an ARDL or NARDLmodel. TheLongCoef parameter allows users to customize howthese coefficients are named in the output, making it easier to identifyand interpret them. The default format"L{lag}.{varName}"is widely used and provides clear information about the lag and variableassociated with each coefficient. Users can modify the format bychanging theLongCoef string. For example:

Attention!

10. ShortCoef

ShortCoef is a character string specifying the prefixformat for naming short-run coefficients in the model output. Thedefault value is"L{lag}.d.{varName}", where:

This format generates names likeL1.d.X for the firstlag of the differenced variableX.

Details

Short-run coefficients capture the immediate effects of changes inindependent variables on the dependent variable in an ARDL or NARDLmodel. TheShortCoef parameter allows users to customizehow these coefficients are named in the output, facilitating easieridentification and interpretation. The default format"L{lag}.d.{varName}" is commonly used and provides clearinformation about the lag, differencing, and variable associated witheach coefficient. Users can modify the format by changing theShortCoef string. For example:

Attention!

11. batch

batch is a character string specifying the batch sizefor parallel processing during model estimation. The default value is"1/1", indicating that the model estimation will beexecuted as a single job without batching.

Details

Thebatch parameter is particularly useful when dealingwith large datasets or complex models that require significantcomputational resources. By specifying a batch size, users can dividethe model estimation process into smaller, more manageable segments,which can be processed in parallel. The format for thebatch parameter is"m/n", where:

For example, settingbatch = "2/4" would divide theestimation into 4 batches, with 2 batches being processedsimultaneously. This can significantly reduce computation time,especially for models with extensive lag structures or large numbers ofvariables.

Attention!

Examples

Default batch size (nobatching)
kardl_set(batch ="1/1")kardl(data, MyFormula)
Using a batch size of 2 out of4
kardl_set(batch ="2/4")kardl(data, MyFormula)
Using a batch size of 3 out of6
kardl_set(batch ="3/6")kardl(data, MyFormula)

Conclusion

Thekardl package is a versatile tool for econometricanalysis, offering robust support for symmetric and asymmetricARDL/NARDL modeling, cointegration tests, stability diagnostics, andheteroskedasticity checks. Its flexible formula specification, lagoptimization, and support for parallel processing make it ideal forstudying complex economic relationships. For more information, visithttps://github.com/karamelikli/kardl or contact theauthors athakperest@gmail.com.

Disclosures


[8]ページ先頭

©2009-2025 Movatter.jp