We assume:
This looks OK, there is no evidence against normality here
data(ravens) crowGun= ravens[ravens$treatment==1,]t.test(crowGun$Before, crowGun$After,paired=TRUE,alternative="less")#>#> Paired t-test#>#> data: crowGun$Before and crowGun$After#> t = -2.6, df = 11, p-value = 0.01235#> alternative hypothesis: true mean difference is less than 0#> 95 percent confidence interval:#> -Inf -0.335048#> sample estimates:#> mean difference#> -1.083333There is some evidence that ravens fly towards the sound ofgunshots.
Is there evidence that the response to gunshot different toresponse to other treatments?
We will use a linear model, withlocation included as ablocking factor to control for differences in raven abundance from oneplace to another, see Code Box 4.3.
library(reshape2) crowLong=melt(crowGun,measure.vars =c("Before","After"),variable.name="time",value.name="ravens")head(crowLong)#> delta site treatment trees time ravens#> 1 2 pilgrim 1 1 Before 0#> 2 1 pacific 1 1 Before 0#> 3 4 uhl hil 1 1 Before 0#> 4 1 wolff r 1 1 Before 0#> 5 0 teton p 1 1 Before 0#> 6 3 glacier 1 1 Before 2 ravenlm=lm(ravens~site+time,data=crowLong)anova(ravenlm)#> Analysis of Variance Table#>#> Response: ravens#> Df Sum Sq Mean Sq F value Pr(>F)#> site 11 55.458 5.0417 4.84 0.007294 **#> time 1 7.042 7.0417 6.76 0.024694 *#> Residuals 11 11.458 1.0417#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1To analyse, we first subset to the three treatments of interest(1=gunshot, 2=airhorn, 3=whistle):
crowAfter= ravens[ravens$treatment<=3,]crowAfter$treatment=factor(crowAfter$treatment)ft_crowAfter=lm(After~site+treatment,data=crowAfter)anova(ft_crowAfter)#> Analysis of Variance Table#>#> Response: After#> Df Sum Sq Mean Sq F value Pr(>F)#> site 11 28.667 2.6061 1.0118 0.4679#> treatment 2 10.667 5.3333 2.0706 0.1500#> Residuals 22 56.667 2.5758To check assumptions we could look at diagnostic plots:
The residual vs fits plot has a fan-shaped pattern, suggesting thattransformation might be a good idea here!
Is there an effect of distance of isolation after controlling forwet mass?
Q’s – this is a hypothesis test, for an effect onTotal ofDist after controlling forWmass.
P’s We have three variables of interest:
Total is the quantitative responseDist is a factor (0, 2 or 10)Wmass is a quantitative variable, to be use as acovariateWe can get all three variables onto the same graph using ascatterplot, with different colours for different treatment levels, asin Code Box 4.4.
data(seaweed) seaweed$Dist=factor(seaweed$Dist)plot(Total~Wmass,data=seaweed,col=Dist,xlab="Wet Mass [log scale]",ylab="Density (per gram) [log scale]")legend("topright",levels(seaweed$Dist),col=1:3,pch=1) seaweed$logTot=log(seaweed$Total) seaweed$logWmass=log(seaweed$Wmass) lmMassDist=lm(logTot~logWmass+Dist,data=seaweed)anova(lmMassDist)#> Analysis of Variance Table#>#> Response: logTot#> Df Sum Sq Mean Sq F value Pr(>F)#> logWmass 1 7.7216 7.7216 35.7165 1.975e-07 ***#> Dist 2 2.1415 1.0708 4.9528 0.01067 *#> Residuals 53 11.4582 0.2162#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1Independence can be guaranteed by randomly allocatingDist treatments to plots.
Remaining assumptions can be checked using diagnostic plots.
(Note thatplotenvelope was run with just99 iterations, to speed up computation time.)
Notice that switching the order changes the results in the ANOVAtable, and their interpretation.
lmDistMass=lm(logTot~Dist+logWmass,data=seaweed)anova(lmDistMass)#> Analysis of Variance Table#>#> Response: logTot#> Df Sum Sq Mean Sq F value Pr(>F)#> Dist 2 4.8786 2.4393 11.283 8.273e-05 ***#> logWmass 1 4.9845 4.9845 23.056 1.329e-05 ***#> Residuals 53 11.4582 0.2162#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1The correct model is...logWmass+Dist so we can look foran effect ofDist after controlling forlogWmass.
drop1(lmMassDist,test="F")#> Single term deletions#>#> Model:#> logTot ~ logWmass + Dist#> Df Sum of Sq RSS AIC F value Pr(>F)#> <none> 11.458 -83.448#> logWmass 1 4.9845 16.443 -64.861 23.0561 1.329e-05 ***#> Dist 2 2.1415 13.600 -77.681 4.9528 0.01067 *#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1Q’s – here is the research question:Doesinvertebrate density change with isolation? Does the isolation effectvary with time period?
We want to test a specific hypothesis (no effect ofDist, no interaction ofDist withTime).
P’s There are three variables involved in thisresearch question:
Total is invertebrate density (quantitative)Dist is a factor (0, 2 or 10)Time is also a factor (5 or 10 weeks)Given previous information we could also addlogWmass asanother covariate…
Comparative boxplots would be a good way to start, as in Code Box4.8.
plot(Total~interaction(Dist,Time),data=seaweed,log="y")## and as usual use xlabel, ylabel to name axespar(mfrow=c(1,2),mar=c(3,3,1,1),mgp=c(1.75,0.75,0)) ft_seaweedFact=lm(logTot~Time*Dist,data=seaweed)plotenvelope(ft_seaweedFact,n.sim=99)(Note thatplotenvelope was run with just99 iterations, to speed up computation time.)
ft_seaweedFact=lm(logTot~Time*Dist,data=seaweed)anova(ft_seaweedFact)#> Analysis of Variance Table#>#> Response: logTot#> Df Sum Sq Mean Sq F value Pr(>F)#> Time 1 0.2433 0.24333 0.8512 0.3605470#> Dist 2 5.0323 2.51613 8.8023 0.0005201 ***#> Time:Dist 2 1.4673 0.73367 2.5666 0.0866804 .#> Residuals 51 14.5784 0.28585#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1There is marginal evidence of an interaction and strong evidence of aDist main effect. So there is an effect ofDist and it could potentially vary withTime.
Note we usedTime*Dist becauseTime is notthe variable of primary interest, it isDist, and we onlyincludeTime to control for sampling time effects
library(dplyr) seaweed$Time=as.factor(seaweed$Time) by_DistTime=group_by(seaweed,Dist,Time) distTimeMeans=summarise(by_DistTime,logTotal=mean(log(Total))) distTimeMeans#> # A tibble: 6 × 3#> # Groups: Dist [3]#> Dist Time logTotal#> <fct> <fct> <dbl>#> 1 0 5 1.58#> 2 0 10 1.31#> 3 2 5 2.01#> 4 2 10 2.36#> 5 10 5 1.68#> 6 10 10 2.13library(ggplot2)library(ggthemes)#loads special themesggplot(seaweed,aes(x =factor(Dist),y = Total,colour = Time))+geom_point()+geom_line(data = distTimeMeans,aes(y =exp(logTotal),group = Time))+theme_few()+xlab("Distance of Isolation")+ylab("Total abundance [log scale]")+scale_y_log10(breaks=c(2,5,10,20))Alternatively, for a simpler plot without the data points on it,try:
interaction.plot(seaweed$Dist, seaweed$Time, ft_seaweedFact$fitted,xlab="Isolation of patch",ylab="Total density [log]",trace.label="Time")anova gone wrongdata(seaweed)ft_nofactor=lm(log(Total)~Time*Dist,data=seaweed)anova(ft_nofactor)#> Analysis of Variance Table#>#> Response: log(Total)#> Df Sum Sq Mean Sq F value Pr(>F)#> Time 1 0.2433 0.24333 0.6671 0.41771#> Dist 1 0.7164 0.71635 1.9640 0.16691#> Time:Dist 1 1.0303 1.03031 2.8248 0.09871 .#> Residuals 53 19.3313 0.36474#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 seaweed$Dist=factor(seaweed$Dist) seaweed$Time=factor(seaweed$Time) seaweed$logTot=log(seaweed$Total) ft_seaweedFact=lm(logTot~Time*Dist,data=seaweed)library(multcomp) contFact=mcp(Dist="Tukey")# telling R to compare on the Dist factor compFact=glht(ft_seaweedFact,linfct=contFact)# run multiple comparisions#> Warning in mcp2matrix(model, linfct = linfct): covariate interactions found -- default contrast might be#> inappropriate ft_seaweedMain=lm(logTot~Time+Dist,data=seaweed)# note it is + not * contrast=mcp(Dist="Tukey")# telling R to compare on the Dist factor compDistMain=glht(ft_seaweedMain,linfct=contrast)confint(compDistMain)#>#> Simultaneous Confidence Intervals#>#> Multiple Comparisons of Means: Tukey Contrasts#>#>#> Fit: lm(formula = logTot ~ Time + Dist, data = seaweed)#>#> Quantile = 2.4109#> 95% family-wise confidence level#>#>#> Linear Hypotheses:#> Estimate lwr upr#> 2 - 0 == 0 0.72650 0.28778 1.16521#> 10 - 0 == 0 0.45838 0.03889 0.87787#> 10 - 2 == 0 -0.26812 -0.70683 0.17060This approach is wasteful as it compares some pairs we are notinterested in (e.g. 2.10 vs 0.5).
td=interaction(seaweed$Dist,seaweed$Time) ft_seaweedInt=lm(logTot~td,data=seaweed)# Time*Dist as a single term contInt=mcp(td="Tukey")# telling R to compare on all Time*Dist levels compDistInt=glht(ft_seaweedInt,linfct=contInt)summary(compDistInt)#>#> Simultaneous Tests for General Linear Hypotheses#>#> Multiple Comparisons of Means: Tukey Contrasts#>#>#> Fit: lm(formula = logTot ~ td, data = seaweed)#>#> Linear Hypotheses:#> Estimate Std. Error t value Pr(>|t|)#> 2.5 - 0.5 == 0 0.4356 0.2391 1.822 0.4605#> 10.5 - 0.5 == 0 0.1013 0.2391 0.424 0.9982#> 0.10 - 0.5 == 0 -0.2643 0.2391 -1.105 0.8766#> 2.10 - 0.5 == 0 0.7852 0.2635 2.980 0.0476 *#> 10.10 - 0.5 == 0 0.5512 0.2391 2.305 0.2104#> 10.5 - 2.5 == 0 -0.3343 0.2391 -1.398 0.7272#> 0.10 - 2.5 == 0 -0.6999 0.2391 -2.927 0.0541 .#> 2.10 - 2.5 == 0 0.3496 0.2635 1.327 0.7684#> 10.10 - 2.5 == 0 0.1156 0.2391 0.483 0.9965#> 0.10 - 10.5 == 0 -0.3656 0.2391 -1.529 0.6468#> 2.10 - 10.5 == 0 0.6839 0.2635 2.596 0.1168#> 10.10 - 10.5 == 0 0.4499 0.2391 1.882 0.4240#> 2.10 - 0.10 == 0 1.0495 0.2635 3.983 0.0029 **#> 10.10 - 0.10 == 0 0.8155 0.2391 3.411 0.0151 *#> 10.10 - 2.10 == 0 -0.2340 0.2635 -0.888 0.9475#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> (Adjusted p values reported -- single-step method)This is the best approach to use if you think there is an interactionand are primarily interested in Dist.
levels(seaweed$Time)=c("five","ten")#need non-numbers for mcp to worklevels(seaweed$Dist)=c("Zero","Two","Ten") td=interaction(seaweed$Dist,seaweed$Time) ft_seaweedInt=lm(log(Total)~td,data=seaweed)# Time*Dist as a single term contDistinTime=mcp(td =c("Two.five - Zero.five = 0","Ten.five - Zero.five = 0","Ten.five - Two.five = 0","Two.ten - Zero.ten = 0","Ten.ten - Zero.ten = 0","Ten.ten - Two.ten = 0")) compDistinTime=glht(ft_seaweedInt,linfct=contDistinTime)summary(compDistinTime)#>#> Simultaneous Tests for General Linear Hypotheses#>#> Multiple Comparisons of Means: User-defined Contrasts#>#>#> Fit: lm(formula = log(Total) ~ td, data = seaweed)#>#> Linear Hypotheses:#> Estimate Std. Error t value Pr(>|t|)#> Two.five - Zero.five == 0 0.4356 0.2391 1.822 0.31173#> Ten.five - Zero.five == 0 0.1013 0.2391 0.424 0.99083#> Ten.five - Two.five == 0 -0.3343 0.2391 -1.398 0.57120#> Two.ten - Zero.ten == 0 1.0495 0.2635 3.983 0.00127 **#> Ten.ten - Zero.ten == 0 0.8155 0.2391 3.411 0.00714 **#> Ten.ten - Two.ten == 0 -0.2340 0.2635 -0.888 0.87445#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#> (Adjusted p values reported -- single-step method) lmMassDistInter=lm(logTot~log(Wmass)*Dist,data=seaweed)anova(lmMassDistInter)#> Analysis of Variance Table#>#> Response: logTot#> Df Sum Sq Mean Sq F value Pr(>F)#> log(Wmass) 1 7.7216 7.7216 35.3587 2.489e-07 ***#> Dist 2 2.1415 1.0708 4.9032 0.01128 *#> log(Wmass):Dist 2 0.3208 0.1604 0.7345 0.48475#> Residuals 51 11.1374 0.2184#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1Do the effects on plant height of latitude and rainfallinteract?
We need an interaction term in the model, but becauselat andrain are both quantitative, this is aquadratic term and other quadratic terms should enter the model too, asin Code Box 4.17.
ft_latRain2=lm(log(height)~poly(rain,lat,degree=2),data=globalPlants)summary(ft_latRain2)#>#> Call:#> lm(formula = log(height) ~ poly(rain, lat, degree = 2), data = globalPlants)#>#> Residuals:#> Min 1Q Median 3Q Max#> -3.3656 -0.9546 -0.0749 0.9775 3.1311#>#> Coefficients:#> Estimate Std. Error t value Pr(>|t|)#> (Intercept) 1.1675 0.2094 5.577 1.44e-07 ***#> poly(rain, lat, degree = 2)1.0 5.7178 2.3419 2.441 0.01603 *#> poly(rain, lat, degree = 2)2.0 -1.1417 2.0304 -0.562 0.57493#> poly(rain, lat, degree = 2)0.1 -5.0623 1.8212 -2.780 0.00628 **#> poly(rain, lat, degree = 2)1.1 -6.1984 38.1037 -0.163 0.87104#> poly(rain, lat, degree = 2)0.2 -2.3615 2.5031 -0.943 0.34729#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#>#> Residual standard error: 1.441 on 125 degrees of freedom#> Multiple R-squared: 0.2706, Adjusted R-squared: 0.2415#> F-statistic: 9.277 on 5 and 125 DF, p-value: 1.558e-07Does time from snowmelt to flowering vary with snowmelt date,beyond that explained by elevation? If so, how does it vary?
The data are available as snowmelt in the ecostats package, averagedover measurements taken at three sampling times. Answer the researchquestions using the appropriate linear model, being sure to check yourassumptions.
snowReduced=na.omit(snowmelt[,c("flow","snow","elev")])#this line not normally needed, lm can handle NA's, but seems needed because of a weird conflict with MCMCglmm ft_snow=lm(flow~elev+snow,data=snowReduced)par(mfrow=c(1,2),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))plotenvelope(ft_snow,n.sim=99)(Note thatplotenvelope was run with just99 iterations, to speed up computation time.)
This doesn’t look great, there is the occasional large value forflowering time, and a suggestion of non-linearity (note from thescatterplot that at early snowmelt dates there is a strong decreasingrelationship, but it can’t continue to increase at the same rate becausethen flowering time would become negative). Try log-transformingflowering time…
plot(flow~snow,data=snowmelt,log="y")#> Warning in xy.coords(x, y, xlabel, ylabel, log): 1 y value <= 0 omitted from logarithmic plot snowReduced$logFlow=log(snowReduced$flow) snowReduced2= snowReduced[snowReduced$logFlow>-Inf,]# look it's a bit naughty, removing the infinite value, but no biggie as only one value ft_logsnow=lm(logFlow~elev+snow,data=snowReduced2)par(mfrow=c(1,2),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))plotenvelope(ft_logsnow,n.sim=99)summary(ft_logsnow)#>#> Call:#> lm(formula = logFlow ~ elev + snow, data = snowReduced2)#>#> Residuals:#> Min 1Q Median 3Q Max#> -0.90161 -0.15794 0.02338 0.15834 1.08662#>#> Coefficients:#> Estimate Std. Error t value Pr(>|t|)#> (Intercept) 3.8805208 0.3193019 12.153 < 2e-16 ***#> elev 0.0003357 0.0001226 2.739 0.00651 **#> snow -0.0103172 0.0007551 -13.663 < 2e-16 ***#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1#>#> Residual standard error: 0.2909 on 316 degrees of freedom#> Multiple R-squared: 0.3774, Adjusted R-squared: 0.3735#> F-statistic: 95.78 on 2 and 316 DF, p-value: < 2.2e-16confint(ft_logsnow)#> 2.5 % 97.5 %#> (Intercept) 3.252294e+00 4.5087472418#> elev 9.457986e-05 0.0005768232#> snow -1.180299e-02 -0.0088315037(Note thatplotenvelope was run with just99 iterations, to speed up computation time.)
So as snowmelt date increases by 10 days, log(flowering time)decreases by between 0.12 and 0.09 (meaning about a 9-12% decrease).
(Aside: Strictly speaking,snow should not be treated asa linear variable because it is a time-of-year variable, so is acyclical variable with a period of 365. But in this case the values areall focussed over a few months towards the middle of the year so alinear approximation is OK.)
Is there an effect of the netting treatment on changes in aphidnumbers across the sampling times?
data(aphidsBACI)str(aphidsBACI)#> 'data.frame': 16 obs. of 5 variables:#> $ Plot : Factor w/ 8 levels "1","2","3","4",..: 1 2 3 4 5 6 7 8 1 2 ...#> $ Treatment: Factor w/ 2 levels "excluded","present": 2 2 2 2 1 1 1 1 2 2 ...#> $ Time : Factor w/ 2 levels "X18.06.","X15.07.": 1 1 1 1 1 1 1 1 2 2 ...#> $ counts : int 78 71 122 271 119 152 120 202 60 48 ...#> $ logcount : num 4.37 4.28 4.81 5.61 4.79 ...plot(logcount~interaction(Time,Treatment),data=aphidsBACI)OK log(count) looks fine, different treatments have similarvariation. Eyeballing the data it looks like there is a treatmenteffect, with many less aphids in the excluded group than in thecontrol…
When building a model, we need to take into account the pairing ofthe data – Ingo and friends took repeated measures in each plot, so weincludePlot as a blocking variable to account for this.This in turn means there is no need for aTreatment maineffect, because different plots received different treatments.
par(mfrow=c(1,3),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))lm_aphids=lm(logcount~Plot+Time+Treatment:Time,data=aphidsBACI)plotenvelope(lm_aphids,which=1:3,n.sim=99)(Note thatplotenvelope was run with just99 iterations, to speed up computation time.)
This all looks fine to me. The term if interest isTreatment:Time, which tells us if changes in aphids overtime varied due to the bird exclusion treatment:
anova(lm_aphids)#> Analysis of Variance Table#>#> Response: logcount#> Df Sum Sq Mean Sq F value Pr(>F)#> Plot 7 0.8986 0.1284 0.4603 0.833357#> Time 1 5.4675 5.4675 19.6038 0.004434 **#> Time:Treatment 1 0.7397 0.7397 2.6522 0.154527#> Residuals 6 1.6734 0.2789#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1There is no evidence of an interaction, meaning there is no evidenceof an effect of bird exclusion on aphid numbers. Oh :(
I guess the reason for this is that the levels of replication werelow – there were only eight plots, four in each treatment, so an effectwould have to be very strong to be statistically significant.
data(seaweed)str(seaweed)#> 'data.frame': 57 obs. of 22 variables:#> $ Size : chr "SMALL" "SMALL" "SMALL" "SMALL" ...#> $ Dist : int 0 0 0 0 0 2 2 2 2 2 ...#> $ Time : int 5 5 5 5 5 5 5 5 5 5 ...#> $ Rep : int 1 2 3 4 5 1 2 3 4 5 ...#> $ Wmass : num 15.8 13.1 25.1 29.7 17.4 13.3 4.7 6.8 7.7 14.3 ...#> $ Amph : num 75 16 14 21 55 17 9 11 28 37 ...#> $ Cope : num 0 7 37 37 0 14 6 29 21 83 ...#> $ Poly : num 17 8 2 5 1 2 1 3 6 10 ...#> $ Anem : num 2 0 0 0 0 0 0 1 0 0 ...#> $ Iso : num 0 2 1 1 15 0 12 51 19 56 ...#> $ Bival : num 3 2 1 0 0 2 0 4 3 0 ...#> $ Gast : num 6 18 33 9 9 1 10 7 4 15 ...#> $ Turb : num 1 0 0 0 0 0 0 0 0 0 ...#> $ Prawn : num 0 0 0 0 5 1 0 0 0 1 ...#> $ Urchin: num 0 0 0 0 0 0 0 0 0 0 ...#> $ Fish : num 0 0 0 0 1 0 0 0 0 0 ...#> $ Crab : num 0 1 1 0 0 0 0 1 0 0 ...#> $ Caddis: num 0 0 0 0 1 0 0 0 0 0 ...#> $ Opi : num 0 1 0 1 0 0 0 0 0 0 ...#> $ Ost : num 0 0 6 3 0 0 1 1 0 11 ...#> $ Bstar : num 0 0 0 0 0 0 0 0 0 1 ...#> $ Total : num 6.58 4.2 3.78 2.59 5 ...seaweed$logTot=log(seaweed$Total)seaweed$Dist=factor(seaweed$Dist)par(mar=c(4,7,1,1))plot(logTot~interaction(Dist,Size,Time),data=seaweed,xlab="",ylab="Total abundance [log scale]",horizontal=TRUE,col=2:4,las=1)legend("topleft",legend=paste0("Dist=",levels(seaweed$Dist)),pch=15,col=2:4,pt.cex=2)All good so far, no obvious trend, except thatDist=0seems to have lower total abundance at most times/sizes. Will analyse asa three-way ANOVA
par(mfrow=c(1,3),mar=c(3,3,1,1),mgp=c(1.75,0.75,0))ft_seaweed3=lm(log(Total)~Size*Time*Dist,data=seaweed)plotenvelope(ft_seaweed3,which=1:3,n.sim=99)anova(ft_seaweed3)#> Analysis of Variance Table#>#> Response: log(Total)#> Df Sum Sq Mean Sq F value Pr(>F)#> Size 1 0.9940 0.99399 3.5849 0.0647472 .#> Time 1 0.1923 0.19234 0.6937 0.4093064#> Dist 2 4.7737 2.38686 8.6084 0.0006829 ***#> Size:Time 1 0.0605 0.06055 0.2184 0.6425429#> Size:Dist 2 0.1800 0.09000 0.3246 0.7244968#> Time:Dist 2 1.4117 0.70583 2.5456 0.0896686 .#> Size:Time:Dist 2 1.2318 0.61592 2.2214 0.1202189#> Residuals 45 12.4772 0.27727#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1(Note thatplotenvelope was run with just99 iterations, to speed up computation time.)
From the plots, we can’t see any problems with assumptions.
The ANOVA suggests a main effect ofDist, with marginalevidence of an interaction withTime (judging from theboxplots, this is probably because of the larger values forlogTot atTime=5 in large plots).
We could reanalyse the data includingWmass as acovariate, for a three-factor ANCOVA:
ft_seaweedW3=lm(logTot~Wmass*Size*Time*Dist,data=seaweed)anova(ft_seaweedW3)#> Analysis of Variance Table#>#> Response: logTot#> Df Sum Sq Mean Sq F value Pr(>F)#> Wmass 1 6.7793 6.7793 28.2266 7.328e-06 ***#> Size 1 0.7106 0.7106 2.9585 0.094792 .#> Time 1 0.0747 0.0747 0.3109 0.580899#> Dist 2 2.8114 1.4057 5.8529 0.006675 **#> Wmass:Size 1 0.0941 0.0941 0.3917 0.535700#> Wmass:Time 1 0.0233 0.0233 0.0970 0.757459#> Size:Time 1 0.3501 0.3501 1.4579 0.235849#> Wmass:Dist 2 0.1979 0.0990 0.4121 0.665636#> Size:Dist 2 0.0886 0.0443 0.1844 0.832475#> Time:Dist 2 1.3263 0.6631 2.7611 0.077848 .#> Wmass:Size:Time 1 0.0081 0.0081 0.0337 0.855482#> Wmass:Size:Dist 2 0.1404 0.0702 0.2924 0.748389#> Wmass:Time:Dist 2 0.2767 0.1384 0.5761 0.567645#> Size:Time:Dist 2 0.3440 0.1720 0.7162 0.496020#> Wmass:Size:Time:Dist 2 0.1701 0.0850 0.3540 0.704475#> Residuals 33 7.9257 0.2402#> ---#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1We get pretty much the same results.Wmass seems to bean important covariate, but after controlling for it, there is still astrongDist main effect and not much else happening.