Exploratory Regression Shiny App

Catherine B. Hurley

2023-08-21

The Exploratory Regression Shiny App (ERSA) package consists of a collection of functions for displaying the results of a regression calculation, which are then packaged together as a shiny app function.

To use ERSA, first load the package:

library(ERSA)

Then construct a regression model of your choice.

f <- lm(Fertility ~ ., data = swiss)
exploreReg(f, swiss)

Here is a screen shot of the result:

The summary or drop1 display

The app display consists of four panels. In the top left corner is a display of the model summary t-statistics, from

f <- lm(Fertility ~ ., data = swiss)
summary(f)
##
## Call:
## lm(formula = Fertility ~ ., data = swiss)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -15.2743  -5.2617   0.5032   4.1198  15.3213
##
## Coefficients:
##                  Estimate Std. Error t value Pr(>|t|)
## (Intercept)      66.91518   10.70604   6.250 1.91e-07 ***
## Agriculture      -0.17211    0.07030  -2.448  0.01873 *
## Examination      -0.25801    0.25388  -1.016  0.31546
## Education        -0.87094    0.18303  -4.758 2.43e-05 ***
## Catholic          0.10412    0.03526   2.953  0.00519 **
## Infant.Mortality  1.07705    0.38172   2.822  0.00734 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 7.165 on 41 degrees of freedom
## Multiple R-squared:  0.7067, Adjusted R-squared:  0.671
## F-statistic: 19.76 on 5 and 41 DF,  p-value: 5.594e-10

This display (Plot1) shows the magnitude of each t-statistic. The red dashed line shows the 5% significance level. There are a few other options for this display. The display may be switched to “CI”, which shows parameter confidence intervals, or “CI stdX” for confidence intervals in standard X units. If the model contains factors with more than two levels, then better choices are “F stat” or “Adj. SS”. These give the summaries from the F value or Sum of Sq columns of the drop1 results:

drop1(f,test="F")
## Single term deletions
##
## Model:
## Fertility ~ Agriculture + Examination + Education + Catholic +
##     Infant.Mortality
##                  Df Sum of Sq    RSS    AIC F value    Pr(>F)
## <none>                        2105.0 190.69
## Agriculture       1    307.72 2412.8 195.10  5.9934  0.018727 *
## Examination       1     53.03 2158.1 189.86  1.0328  0.315462
## Education         1   1162.56 3267.6 209.36 22.6432 2.431e-05 ***
## Catholic          1    447.71 2552.8 197.75  8.7200  0.005190 **
## Infant.Mortality  1    408.75 2513.8 197.03  7.9612  0.007336 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
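The two tables above are closely linked, and the 5% threshold can be reproduced directly. The following is a minimal base R sketch, not ERSA's own code: it assumes the dashed line corresponds to the two-sided 5% critical value of the t distribution on the residual degrees of freedom, and it checks that, for terms with one degree of freedom, the drop1 F value is the square of the summary t value.

```r
# Base R only; uses the built-in swiss data from the vignette.
f <- lm(Fertility ~ ., data = swiss)

# t-statistics from the summary table, without the intercept
tstat <- coef(summary(f))[-1, "t value"]

# Assumed threshold: two-sided 5% critical value on 41 residual df
tcrit <- qt(0.975, df = df.residual(f))

# Single-term deletion F statistics, without the <none> row
d <- drop1(f, test = "F")
Fstat <- d[-1, "F value"]
names(Fstat) <- rownames(d)[-1]

# For one-degree-of-freedom terms, F equals t squared
print(all.equal(unname(tstat^2), unname(Fstat)))

# Which predictors exceed the threshold in absolute value
print(abs(tstat) > tcrit)
```

With factors of more than two levels, a term has several coefficients and hence several t values but a single F value, which is why the F stat and Adj. SS views are the better choice in that case.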

Clicking on this display removes the closest predictor; clicking again adds it back in. Once a predictor is added or dropped, the change is also reflected in the other ERSA displays.
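Outside the app, the effect of one click can be imitated with update() from base R. This is only a sketch of the equivalent model refit, under the assumption that a click amounts to dropping or restoring a single term; the choice of Examination is arbitrary and the object names are illustrative.

```r
f <- lm(Fertility ~ ., data = swiss)

# Drop one predictor, as a first click would
f_drop <- update(f, . ~ . - Examination)

# Add it back, as a second click would
f_back <- update(f_drop, . ~ . + Examination)

print(names(coef(f_drop)))
print(c(full = deviance(f), dropped = deviance(f_drop), restored = deviance(f_back)))
```

The restored model has the same residual sum of squares as the original fit, although the term now appears last in the model formula.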

When the FixedScales box near Plot1 is ticked, the axes on Plot1 remain unchanged as predictors are added or dropped. Sometimes the extent of the x-axis is not large enough, or is too large. In this case, untick the FixedScales box and the x-axis will vary to accommodate the included predictors.

The anova display

The second panel in the top right shows the results of

anova(f)
## Analysis of Variance Table
##
## Response: Fertility
##                  Df  Sum Sq Mean Sq F value    Pr(>F)
## Agriculture       1  894.84  894.84 17.4288 0.0001515 ***
## Examination       1 2210.38 2210.38 43.0516 6.885e-08 ***
## Education         1  891.81  891.81 17.3699 0.0001549 ***
## Catholic          1  667.13  667.13 12.9937 0.0008387 ***
## Infant.Mortality  1  408.75  408.75  7.9612 0.0073357 **
## Residuals        41 2105.04   51.34
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

as Plot2. Each predictor of this output is represented by a coloured slice, whose height represents the sum of squares. These sums of squares depend on the order in which predictors were entered into the model fit. The second barchart (Plot3) represents the anova table obtained by reversing the predictor order. The dropdown menus give other choices. Forwards and backwards give the order of predictors as visited by the forwards and backwards selection algorithms. By choosing order “Random”, a user can click on a slice to move it up one position, or double click to move it down.
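The dependence of sequential sums of squares on predictor order can be seen in base R by refitting with the terms reversed. This is a minimal sketch that does not use ERSA; the variable names are illustrative.

```r
f <- lm(Fertility ~ ., data = swiss)
preds <- attr(terms(f), "term.labels")

# Refit with the predictors entered in reverse order
fr <- lm(reformulate(rev(preds), response = "Fertility"), data = swiss)

# Sequential sums of squares, both reported in the original term order
ss_fwd <- anova(f)[preds, "Sum Sq"]
ss_rev <- anova(fr)[preds, "Sum Sq"]
print(data.frame(term = preds, forward = ss_fwd, reversed = ss_rev))

# The slices differ, but they always add up to the same regression sum of squares
print(all.equal(sum(ss_fwd), sum(ss_rev)))
```

The term entered last in either order has a sequential sum of squares equal to its Sum of Sq in the drop1 table, since both measure the term's contribution after all the others.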

The Parallel coordinates plot display

The lower part of the app has a parallel coordinate plot (PCP) on the lower right, and a control panel on the left. The user can choose to show one of (i) Variables, (ii) Residuals, (iii) Hatvalues or (iv) CooksD. Each axis is assigned to a variable, using the order from Plot1, Plot2 or Plot3. The residuals and hatvalues are obtained by leaving out one predictor at a time, if the selected order is Plot1, or by adding predictors in sequence if the order is Plot2 or Plot3. When residuals are plotted, the Difference option, when selected, shows differences between residual vectors rather than the residuals themselves.

Specifically, let \(e\) denote the vector of residuals from the full model and let \(e^{-j}\) denote the residual vector from the fit using all predictors except the \(j\)th. When Residuals are selected, using order from Plot1, the PCP shows \(e^{-j}\) on the \(j\)th axis, or \(e^{-j} - e\) when the Difference button is selected in Residual options. Let \(e^{j}\) be the residuals from the model including the first \(j\) predictors. When Residuals are selected, using order from Plot2 or Plot3, the PCP shows \(e^{j}\) on the \(j\)th axis, or \(e^{j} - e^{j-1}\) when the Difference button is selected in Residual options. For any of the Plot orders, when the Absolute button is selected, either absolute residuals or absolute residual differences are plotted.
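The residual vectors defined above can be computed in base R as follows. This is a sketch of the definitions, not of ERSA's implementation. One assumption is needed: the text does not define \(e^{0}\), so the intercept-only model is used for it here.

```r
f <- lm(Fertility ~ ., data = swiss)
preds <- attr(terms(f), "term.labels")
e <- residuals(f)

# e^{-j}: residuals from the fit omitting the j-th predictor (one column per j)
e_minus <- sapply(preds, function(p)
  residuals(update(f, as.formula(paste(". ~ . -", p)))))

# e^{j}: residuals from the fit on the first j predictors
e_seq <- sapply(seq_along(preds), function(j)
  residuals(lm(reformulate(preds[1:j], response = "Fertility"), data = swiss)))
colnames(e_seq) <- preds

# Difference versions: e^{-j} - e and e^{j} - e^{j-1}
d_minus <- e_minus - e
e0 <- residuals(lm(Fertility ~ 1, data = swiss))  # assumed e^{0}
d_seq <- e_seq - cbind(e0, e_seq[, -ncol(e_seq)])

# Absolute option: plot abs(e_minus), abs(d_minus), and so on
print(round(head(abs(d_minus), 3), 2))
```

Each matrix has one row per case and one column per PCP axis, so a case traces a line across the columns.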

Dragging a brush over the PCP axes will highlight cases and print information for the selected cases. Clicking on Remove Brushed will remove the highlighted cases from view. All regression models are then recalculated and the displays are updated. Clicking the All Cases button will update all displays to use the complete dataset. Double clicking on the PCP itself will un-highlight all cases.
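The refit that follows Remove Brushed can be mimicked in base R by fitting the same formula to a subset of rows. This is a sketch under stated assumptions: the "brushed" cases here are simply the two with the largest absolute full-model residuals, standing in for whatever a user might select, and the object names are illustrative.

```r
f <- lm(Fertility ~ ., data = swiss)

# Stand-in for a brush selection: the two largest absolute residuals
brushed <- names(sort(abs(residuals(f)), decreasing = TRUE))[1:2]
keep <- !(rownames(swiss) %in% brushed)

# Refit the same model without the brushed cases
f_sub <- lm(formula(f), data = swiss[keep, ])

print(brushed)
print(round(cbind(all_cases = coef(f), brushed_removed = coef(f_sub)), 3))
```

Refitting with the full swiss data again corresponds to the All Cases button.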

Individual plots

There are functions to construct static plot versions of all the plots in the Exploratory Regression shiny app.

For the Plot1 displays use

plottStats(f)
## Warning: The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as
## of ggplot2 3.3.4.
## ℹ The deprecated feature was likely used in the ERSA package.
##   Please report the issue to the authors.
## This warning is displayed once every 8 hours.
## Call `lifecycle::last_lifecycle_warnings()` to see where this warning was
## generated.

cols <- termColours(f)
plottStats(f, cols)

or

plotCIStats(f, cols)
plotCIStats(f, cols, stdunits = TRUE)
plotAnovaStats(f, cols, type = "F")
plotAnovaStats(f, cols, type = "SS")
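The numbers behind the confidence interval plots can be approximated in base R. This sketch makes two assumptions that the vignette does not confirm: that the intervals are ordinary 95% intervals as returned by confint(), and that "standard X units" means each slope is rescaled by the standard deviation of its predictor. It applies to numeric predictors only.

```r
f <- lm(Fertility ~ ., data = swiss)

# Ordinary 95% confidence intervals for the slopes (intercept dropped)
ci <- confint(f, level = 0.95)[-1, ]

# Assumed meaning of standard X units: effect of a one-sd change in each predictor
sds <- sapply(swiss[rownames(ci)], sd)
ci_std <- ci * sds   # row i is multiplied by sds[i]

print(round(ci, 3))
print(round(ci_std, 3))
```

Under this assumption, the rescaled intervals are the ones obtained by standardising the predictors before fitting, which puts all slopes on a comparable scale.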

For the Plot2 displays use

fr <- revPredOrder(f, swiss)
plotSeqSS(list(f, fr), cols, legend = TRUE)

Other orders are

fselOrder(f)
bselOrder(f)
randomPredOrder(f)
regsubsetsOrder(f)
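To illustrate what a forwards selection order means, here is a greedy sketch in base R that adds, at each step, the predictor giving the smallest residual sum of squares. The function forward_order is hypothetical and written for this example; fselOrder may use a different criterion or implementation, so the two need not agree in every case.

```r
# Hypothetical helper: greedy forward selection by residual sum of squares
forward_order <- function(response, preds, data) {
  chosen <- character(0)
  while (length(chosen) < length(preds)) {
    cand <- setdiff(preds, chosen)
    # RSS of each candidate model that adds one more predictor
    rss <- sapply(cand, function(p)
      deviance(lm(reformulate(c(chosen, p), response = response), data = data)))
    chosen <- c(chosen, cand[which.min(rss)])
  }
  chosen
}

preds <- setdiff(names(swiss), "Fertility")
ord <- forward_order("Fertility", preds, swiss)
print(ord)

# Refit with predictors in that order, ready for anova()
f_fwd <- lm(reformulate(ord, response = "Fertility"), data = swiss)
```

The first predictor chosen this way is always the one most strongly correlated with the response in absolute value.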

To plot the PCP display of the data use:

pcpPlot(swiss, f)

Cases are automatically coloured by the magnitude of the response.

To plot a PCP display of the residuals leaving out one predictor at a time use

pcpPlot(swiss, f,type="Residuals")

In residual, hatvalue and CooksD plots, cases are automatically coloured by the magnitude of full model residuals. Using the option sequential=TRUE gives the residuals adding model terms in sequence, as they appear in the supplied fit f.

Replacing “Residuals” with “Hatvalues” shows the hat values of each fit; “CooksD” works similarly and shows Cook's distances.

pcpPlot(swiss, f, type = "Hatvalues", sequential = TRUE)
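For reference, the hat values and Cook's distances of a single fit are available in base R. This sketch covers the full model only, whereas the PCP shows one such vector per axis; the manual Cook's distance formula is included to make the quantity concrete.

```r
f <- lm(Fertility ~ ., data = swiss)

h <- hatvalues(f)        # leverage of each case
cd <- cooks.distance(f)  # influence of each case on the fitted values

# Cook's distance by hand: e^2 * h / (p * s^2 * (1 - h)^2)
e <- residuals(f)
p <- length(coef(f))
s2 <- sum(e^2) / df.residual(f)
cd_manual <- e^2 * h / (p * s2 * (1 - h)^2)

print(sum(h))  # hat values sum to the number of coefficients
print(head(sort(cd, decreasing = TRUE), 3))
```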

