Let us begin by defining two spatial cross-sectional series\(\{x_s\}_{s \in S}\) and\(\{y_s\}_{s \in S}\), where\(S\) represents the study area.
We use:
\[\mathcal{X}_W = \{ W_i x \mid W_i \in \mathcal{W}(x, y) \}\]
\[\mathcal{Y}_W = \{ W_i y \mid W_i \in \mathcal{W}(x, y) \}\]
to denote the sets of spatial lags of\(x\) and\(y\) given by all the weighting matrices in\(\mathcal{W}(x, y)\)(that is the setof spatial dependence structures between\(x\) and\(y\)).
We say that\(\{x_s\}_{s \in S}\)does not cause\(\{y_s\}_{s \in S}\)under the spatial structures\(\mathcal{X}_W\) and\(\mathcal{Y}_W\) if
\[h_{y |\mathcal{Y}_W}(m) = h_{y | \mathcal{Y}_W,\mathcal{X}_W}(m)\] A unilateral non-parametric test can be applied to assess thespatial causality via the following null hypothesis:
\[H_0 : \{x_s\}_{s \in S} \text{ does not cause } \{y_s\}_{s \in S} \text{under the spatial structures } \mathcal{X}_W \text{ and } \mathcal{Y}_W,\]
with the following statistic:
\[\hat{\delta}(\mathcal{Y}_W, \mathcal{X}_W) = \hat{h}_{y|\mathcal{Y}_W}(m) - \hat{h}_{y | \mathcal{Y}_W,\mathcal{X}_W}(m)\]
where\(\hat{h}_*(m)\) is theestimated conditional symbolic entropy using Shannon’s entropy with\(m-1\) nearest neighbors. Thealternative is that the null hypothesis of is not true.
If\(\mathcal{X}_W\) does notcontain extra information about\(y\)then\(\hat{\delta}(\mathcal{Y}_W,\mathcal{X}_W) = 0\), otherwise, $ (_W, _W) > 0$.
\(h_{y |\mathcal{Y}_W}(m)\) measuresthe uncertainty of the distribution of symbols of\(y\), conditional to the symbols of itsspatial lag, $ _W$. Moreover,\(h_{y |\mathcal{Y}_W,\mathcal{X}_W}(m)\) measures the uncertainty of thedistribution of symbols of\(y\),conditional to the symbols of the spatial lags of\(y\),\(\mathcal{Y}_W\), and of\(x\), $ _W$. If the second variable,\(x\), indeed causes the first one then thereshould be a significant decrease in the entropy, and the statistic\(\hat{\delta}(\mathcal{Y}_W,\mathcal{X}_W)\) will take on high positive values. If there isonly a spatial correlation, but not causation, the difference betweenboth entropies will be small. The statistical significance of\(\hat{\delta}(\mathcal{Y}_W,\mathcal{X}_W)\) is assessed using spatial block bootstrap.
Load thespEDM package and the columbus OH dataset:
library(spEDM)columbus= sf::read_sf(system.file("case/columbus.gpkg",package="spEDM"))columbus## Simple feature collection with 49 features and 6 fields## Geometry type: POLYGON## Dimension: XY## Bounding box: xmin: 5.874907 ymin: 10.78863 xmax: 11.28742 ymax: 14.74245## Projected CRS: Undefined Cartesian SRS with unknown unit## # A tibble: 49 × 7## hoval inc crime open plumb discbd geom## <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <POLYGON>## 1 80.5 19.5 15.7 2.85 0.217 5.03 ((8.624129 14.23698, 8.5597 14.74245, 8…## 2 44.6 21.2 18.8 5.30 0.321 4.27 ((8.25279 14.23694, 8.282758 14.22994, …## 3 26.4 16.0 30.6 4.53 0.374 3.89 ((8.653305 14.00809, 8.81814 14.00205, …## 4 33.2 4.48 32.4 0.394 1.19 3.7 ((8.459499 13.82035, 8.473408 13.83227,…## 5 23.2 11.3 50.7 0.406 0.625 2.83 ((8.685274 13.63952, 8.677577 13.72221,…## 6 28.8 16.0 26.1 0.563 0.254 3.78 ((9.401384 13.5504, 9.434411 13.69427, …## 7 75 8.44 0.178 0 2.40 2.74 ((8.037741 13.60752, 8.062716 13.60452,…## 8 37.1 11.3 38.4 3.48 2.74 2.89 ((8.247527 13.58651, 8.2795 13.5965, 8.…## 9 52.6 17.6 30.5 0.527 0.891 3.17 ((9.333297 13.27242, 9.671007 13.27361,…## 10 96.4 13.6 34.0 1.55 0.558 4.33 ((10.08251 13.03377, 10.0925 13.05275, …## # ℹ 39 more rowsDetect spatial causality among the variablesinc,crime, andhoval :
# house value and crimesc.test(columbus,"hoval","crime",k =15)## spatial causality test## hoval -> crime: statistic = 1.114, p value = 0.549## crime -> hoval: statistic = 1.555, p value = 0.008# household income and crimesc.test(columbus,"inc","crime",k =15)## spatial causality test## inc -> crime: statistic = 0.739, p value = 0.99## crime -> inc: statistic = 0.946, p value = 0.762# household income and house valuesc.test(columbus,"inc","hoval",k =15)## spatial causality test## inc -> hoval: statistic = 1.211, p value = 0.311## hoval -> inc: statistic = 0.992, p value = 0.702Load thespEDM package and its farmland NPP data:
library(spEDM)npp= terra::rast(system.file("case/npp.tif",package ="spEDM"))# To save the computation time, we will aggregate the data by 3 timesnpp= terra::aggregate(npp,fact =3,na.rm =TRUE)npp## class : SpatRaster## size : 135, 161, 5 (nrow, ncol, nlyr)## resolution : 30000, 30000 (x, y)## extent : -2625763, 2204237, 1867078, 5917078 (xmin, xmax, ymin, ymax)## coord. ref. : CGCS2000_Albers## source(s) : memory## names : npp, pre, tem, elev, hfp## min values : 187.50, 390.3351, -47.8194, -110.1494, 0.04434316## max values : 15381.89, 23734.5330, 262.8576, 5217.6431, 42.68803711Detect spatial causality among the variablespre,tem, andnpp :
# precipitation and nppsc.test(npp,"pre","npp",k =30)## spatial causality test## pre -> npp: statistic = 0.227, p value = 0## npp -> pre: statistic = 0.16, p value = 0# temperature and nppsc.test(npp,"tem","npp",k =30)## spatial causality test## tem -> npp: statistic = 0.168, p value = 0## npp -> tem: statistic = 0.169, p value = 0# precipitation and temperaturesc.test(npp,"pre","tem",k =30)## spatial causality test## pre -> tem: statistic = 0.19, p value = 0## tem -> pre: statistic = 0.147, p value = 0