Expressed in plain words, colored noise is a stochastic processwherein values tend to be correlated with other values nearby in spaceor time. Colored noise in spatial data sets is said to be spatiallyautocorrelated, while colored noise in time series is said to betemporally autocorrelated. In this package we only deal with thelatter.
Expressed in math, colored noise can be modeled with the followingequation, where\(\omega\) is a randomstandard normal variable, and\(\epsilon\) is a temporally autocorrelatedvariable:
\(\epsilon_{t + 1} = k\epsilon_t +\omega_t\sqrt{1 - k^2}\)
Mathematically, we can also think of unautocorrelated stochasticityas random sine waves evenly distributed across all wavelengths, whileautocorrelated stochasticity is biased either toward sine waves of longwavelengths (red) or short wavelengths (blue).
Expressed graphically, colored noise looks like this:
In the blue noise plot above, each random number is negativelycorrelated to the one preceding it. In the red noise plot, each randomnumber is positively correlated to the one preceding it.
To show you how you can use this package to simulate and analyzecolored noise, I will present a problem for us to solve. Let’s say wewant to find out whether our estimates of temporal autocorrelation arebiased when the standard deviation of the colored noise is high. First,let’s use theraw_noise function to generate some colorednoise with high variance.
red<-colored_noise(timesteps =100,mean =0.3,sd =1.2,phi =0.5)red[1:10]#> [1] 0.28001168 0.24419633 1.19486720 -0.51629994 0.25145762 0.99853910#> [7] -0.78700971 -1.48547374 -1.12394821 -0.09313764blue<-colored_noise(timesteps =100,mean =0.3,sd =1.2,phi =-0.5)blue[1:10]#> [1] 0.3848162738 1.3793977037 -0.2213794596 -0.1328270269 0.6059131329#> [6] -0.8291852531 -0.0008303299 -0.3999374662 1.9556737736 -1.2631057905ggplot(data =NULL,aes(x =c(1:100),y = blue))+geom_line(color="blue")+theme_minimal()+theme(axis.title =element_blank())+ggtitle("Blue Noise")ggplot(data =NULL,aes(x =c(1:100),y = red))+geom_line(color="red")+theme_minimal()+theme(axis.title =element_blank())+ggtitle("Red Noise")The red noise and blue noise look the way they should. Now let’sestimate the mean, SD, and autocorrelation of these two sets of randomnumbers to see how well they match the parameters we used to generatethem.
invoke_map(list(mean, sd, autocorrelation),rep(list(list(red)),3))#> Warning: `invoke_map()` was deprecated in purrr 1.0.0.#> ℹ Please use map() + exec() instead.#> This warning is displayed once every 8 hours.#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was#> generated.#> [[1]]#> [1] -0.1847692#>#> [[2]]#> [1] 1.099992#>#> [[3]]#> [1] 0.3475821invoke_map(list(mean, sd, autocorrelation),rep(list(list(blue)),3))#> [[1]]#> [1] 0.3008564#>#> [[2]]#> [1] 1.32102#>#> [[3]]#> [1] -0.6571447The estimates for these seem to be quite accurate. But let’s trygenerating a whole lot of colored noise across a range of variance andcolor and estimating the autocorrelation of each replicate.
raw_sims<-cross_df(list(timesteps =20,phi =c(-0.5,0,0.5),mean =0.3,sd =c(0.5,0.7,0.9,1.1,1.3,1.5)))%>%rerun(.n =30,pmap(., colored_noise))#> Warning: `rerun()` was deprecated in purrr 1.0.0.#> ℹ Please use `map()` instead.#> # Previously#> rerun(30, ., pmap(., colored_noise))#>#> # Now#> map(1:30, ~ list(., pmap(., colored_noise)))#> This warning is displayed once every 8 hours.#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was#> generated.#> Warning: `cross_df()` was deprecated in purrr 1.0.0.#> ℹ Please use `tidyr::expand_grid()` instead.#> ℹ See <https://github.com/tidyverse/purrr/issues/768>.#> This warning is displayed once every 8 hours.#> Call `lifecycle::last_lifecycle_warnings()` to see where this warning was#> generated.labels<- raw_sims%>%map(1)%>%rbindlist()noise<- raw_sims%>%map(2)%>%flatten()estimates<-data.table(mu =map_dbl(noise, mean),sigma =map_dbl(noise, sd),autocorrelation =map_dbl(noise, autocorrelation))sd_range<-cbind(labels, estimates)head(sd_range)#> timesteps phi mean sd mu sigma autocorrelation#> <num> <num> <num> <num> <num> <num> <num>#> 1: 20 -0.5 0.3 0.5 0.2149788 0.4644747 -0.62111510#> 2: 20 0.0 0.3 0.5 0.2054291 0.3984204 0.02775106#> 3: 20 0.5 0.3 0.5 0.2002314 0.7034121 1.07074602#> 4: 20 -0.5 0.3 0.7 0.3342406 0.5834011 -0.60093868#> 5: 20 0.0 0.3 0.7 0.2648067 0.7611858 0.11988858#> 6: 20 0.5 0.3 0.7 0.1537005 0.7856864 0.19798068We can visualize these results by finding the mean and confidenceintervals of the autocorrelation estimates for each combination ofvariance and noise color, and plotting them out.
summ<- sd_range[, .(lower.ci = ((function(bar) {quantile(bar,probs =c(0.05,0.95))[[1]]})(autocorrelation)),upper.ci = ((function(bar) {quantile(bar,probs =c(0.05,0.95))[[2]]})(autocorrelation)),mean =mean(autocorrelation)), keyby= .(phi, sd)]ggplot(summ,aes(x = sd,y = mean))+geom_pointrange(aes(ymin = lower.ci,ymax = upper.ci,color =factor(phi)),size =0.8)+geom_hline(yintercept =0,linetype =2,color ="#C0C0C0")+geom_hline(yintercept =-0.5,linetype =2,color ="#0033FF")+geom_hline(yintercept =0.5,linetype =2,color ="#CC0000")+theme(text=element_text(size=20))+xlab("Standard Deviation")+ylab("Estimated Autocorrelation")+scale_colour_manual(values =c("#0033FF","#C0C0C0","#CC0000"))+labs(color="Noise color")+theme_light()As we can see, the variance of the colored noise does not seem toaffect the estimates of autocorrelation in any case. However, anotherinteresting pattern emerges: the estimates of autocorrelation for rednoise (phi = 0.5) are consistently biased too low. This may be becausethe number of timesteps we simulated (20) is too short to get a fullmeasurement of the long wavelengths of red noise.
Let’s explore this possibility further.
We now suspect, based on estimates of colored noise, that long timeseries are required in order to detect positive temporal autocorrelationwithout bias. However, the problem becomes even more complex when we usecolored noise to simulate populations. Now, the environmentalstochasticity we’re modeling as colored noise is compounded bydemographic stochasticity, the random drift resulting from the discreteprobability distribution of births and deaths, especially in smallpopulations.
To give an example of what I mean, let’s model a small populationwith positively autocorrelated environmental stochasticity. Thispopulation is closed (no migration), assumes constant fertility andmortality for all individuals, and can be thought of as asexuallyreproducing or female-only.
set.seed(3935)series1<-unstructured_pop(start=20,timesteps=20,survPhi=0.4,fecundPhi=0.4,survMean=0.5,survSd=0.2,fecundMean=1,fecundSd=0.7)ggplot(series1,aes(x=timestep,y=population))+geom_line()Now let’s try addressing the question of whether short time seriesbias our estimates of autocorrelation in red noise populations. We cando this by simulating populations along a range of time series lengthsand noise color. Let’s try a full range from blue noise to red noise inthe survival vital rate, and a range of time series lengths from fiveyears to a hundred years.
sims<-autocorr_sim(timesteps =seq(5,60,5),start =200,survPhi =c(-0.5,-0.25,-0.2,-0.1,0,0.1,0.2,0.25,0.5),fecundPhi =0,survMean =0.4,survSd =0.05,fecundMean =1.8,fecundSd =0.2,replicates =100)ggplot(sims[[6]],aes(x=timestep,y=population))+geom_line()Here we see one example of a population simulated by thefunction.
Now let’s estimate the autocorrelation in each of these simulatedpopulations. There is a utility functionautocorrelation inthis package that measures temporal autocorrelation at a time lag of1.
estimates<- sims%>%map(~.[, .(acf.surv =autocorrelation(est_surv,biasCorrection = F)),keyby = .(survPhi, timesteps)])%>%rbindlist()I find that this format of ggplot displays the estimates by noisecolor very nicely.
summ2<- estimates[, .(lower.ci = ((function(bar) {quantile(bar,probs =c(0.05,0.95))[[1]]})(acf.surv)),upper.ci = ((function(bar) {quantile(bar,probs =c(0.05,0.95))[[2]]})(acf.surv)),mean =mean(acf.surv)), keyby= .(survPhi, timesteps)]# Noise color values for the graph in hexadecimalnoise=c("#0033FF","#3366FF","#6699FF","#99CCFF","#FFFFFF","#FF9999","#FF6666","#FF0000","#CC0000")ggplot(summ2,aes(x=timesteps,y=mean))+geom_point(size=8,aes(color=factor(survPhi)))+# This creates confidence intervals around the autocorrelationsgeom_pointrange(aes(ymin=lower.ci,ymax=upper.ci),size=0.8)+# Adds a line for the true autocorrelation valuegeom_hline(aes(yintercept=survPhi),linetype=2)+# This facets the plots by true autocorrelation valuefacet_wrap(~ survPhi)+# This increases the font size and labels everything nicelytheme(text=element_text(size=13))+xlab("Time series lengths")+ylab("Estimated Autocorrelation")+scale_colour_manual(values=noise)+labs(color="Noise color")+ggtitle("Bias in estimates of autocorrelation of survival")+scale_y_continuous(limits=c(-1.1,1.2))We can see here that autocorrelation estimates are biased negative atshort time lengths, especially for populations with positive temporalautocorrelation - when the time series is only 5 timesteps, all of thered noise populations read as white noise populations. Whenautocorrelation is strongly negative, as in the population at -0.5,short time lengths are slightly positively biased. This matches thefindings of M. Kendall in his volumeTime-Series, where heshowed that there is a bias in autocorrelation estimates such that
\[ \hat{\phi} = \phi - \frac{1 + 3\phi}{T- 1} \] where is the autocorrelation and T is the length of thetime series.
Here is what the autocorrelation estimates look like when I turn onthe bias correction in the autocorrelation function.
summ3<- sims%>%map(~.[, .(acf.surv =autocorrelation(est_surv,biasCorrection =TRUE)),keyby = .(survPhi, timesteps)])%>%rbindlist()%>% .[, .(lower.ci = ((function(bar) {quantile(bar,probs =c(0.05,0.95),na.rm=T)[[1]]})(acf.surv)),upper.ci = ((function(bar) {quantile(bar,probs =c(0.05,0.95),na.rm=T)[[2]]})(acf.surv)),mean =mean(acf.surv)), keyby= .(survPhi, timesteps)]ggplot(summ3,aes(x=timesteps,y=mean))+geom_point(size=8,aes(color=factor(survPhi)))+geom_pointrange(aes(ymin=lower.ci,ymax=upper.ci),size=0.8)+geom_hline(aes(yintercept=survPhi),linetype=2)+facet_wrap(~ survPhi)+theme(text=element_text(size=13))+xlab("Time series lengths")+ylab("Estimated Autocorrelation")+scale_colour_manual(values=noise)+labs(color="Noise color")+ggtitle("Bias in estimates of autocorrelation of survival")+scale_y_continuous(limits=c(-1.1,1.2))#> Warning: Removed 1 rows containing missing values (`geom_segment()`).The estimates are no longer biased for short time series, though they dobecome less precise with the bias correction.