Adding new segregation indices is not much trouble. Please open an issue on GitHub to request an index to be added.
If you use the dplyr package, one pattern that works well is to use group_modify. Here, we compute the pairwise Black-White dissimilarity index for each state separately:

    library("segregation")
    library("dplyr")

    schools00 %>%
      filter(race %in% c("black", "white")) %>%
      group_by(state) %>%
      group_modify(~ dissimilarity(
        data = .x,
        group = "race",
        unit = "school",
        weight = "n"
      ))
    #> # A tibble: 3 × 3
    #> # Groups:   state [3]
    #>   state stat    est
    #>   <fct> <chr> <dbl>
    #> 1 A     D     0.706
    #> 2 B     D     0.655
    #> 3 C     D     0.704

A similar pattern also works well with data.table:

    library("data.table")

    schools00 <- as.data.table(schools00)
    schools00[
      race %in% c("black", "white"),
      dissimilarity(data = .SD, group = "race", unit = "school", weight = "n"),
      by = .(state)
    ]
    #>     state   stat       est
    #>    <fctr> <char>     <num>
    #> 1:      A      D 0.7063595
    #> 2:      B      D 0.6548485
    #> 3:      C      D 0.7042057

To compute many decompositions at once, it’s easiest to combine the data for the two time points. For instance, here’s a dplyr solution to decompose the state-specific M indices between 2000 and 2005:

    # helper function for decomposition
    diff <- function(df, group) {
      data1 <- filter(df, year == 2000)
      data2 <- filter(df, year == 2005)
      mutual_difference(data1, data2, group = "race", unit = "school", weight = "n")
    }

    # add year indicators
    schools00$year <- 2000
    schools05$year <- 2005
    combine <- bind_rows(schools00, schools05)

    combine %>%
      group_by(state) %>%
      group_modify(diff) %>%
      head(5)
    #> # A tibble: 5 × 3
    #> # Groups:   state [1]
    #>   state stat          est
    #>   <fct> <chr>       <dbl>
    #> 1 A     M1         0.409
    #> 2 A     M2         0.445
    #> 3 A     diff       0.0359
    #> 4 A     additions -0.0159
    #> 5 A     removals   0.0390

Again, here’s also a data.table solution:

    setDT(combine)
    combine[, diff(.SD), by = .(state)] %>%
      head(5)
    #>     state      stat         est
    #>    <fctr>    <char>       <num>
    #> 1:      A        M1  0.40859652
    #> 2:      A        M2  0.44454379
    #> 3:      A      diff  0.03594727
    #> 4:      A additions -0.01585879
    #> 5:      A  removals  0.03903106

How can I use tidycensus to compute segregation indices?

Here are a few examples, courtesy of Kyle Walker, the author of the tidycensus package.
First, download the data:

    library("tidycensus")

    cook_data <- get_acs(
      geography = "tract",
      variables = c(
        white = "B03002_003",
        black = "B03002_004",
        asian = "B03002_006",
        hispanic = "B03002_012"
      ),
      state = "IL",
      county = "Cook"
    )
    #> Getting data from the 2017-2021 5-year ACS

Because this data is in “long” format, it’s easy to compute segregation indices:

    # compute index of dissimilarity
    cook_data %>%
      filter(variable %in% c("black", "white")) %>%
      dissimilarity(group = "variable", unit = "GEOID", weight = "estimate")
    #>      stat       est
    #>    <char>     <num>
    #> 1:      D 0.7855711

    # compute multigroup M/H indices
    cook_data %>%
      mutual_total(group = "variable", unit = "GEOID", weight = "estimate")
    #>      stat       est
    #>    <char>     <num>
    #> 1:      M 0.5114435
    #> 2:      H 0.4089561

Producing a map of local segregation scores is also not hard:

    library("tigris")
    library("ggplot2")

    local_seg <- mutual_local(cook_data,
      group = "variable", unit = "GEOID", weight = "estimate", wide = TRUE
    )

    # download shapefile
    seg_geom <- tracts("IL", "Cook", cb = TRUE, progress_bar = FALSE) %>%
      left_join(local_seg, by = "GEOID")
    #> Retrieving data for the year 2021

    ggplot(seg_geom, aes(fill = ls)) +
      geom_sf(color = NA) +
      coord_sf(crs = 3435) +
      scale_fill_viridis_c() +
      theme_void() +
      labs(
        title = "Local segregation scores for Cook County, IL",
        fill = NULL
      )

When using mutual_difference, supply method = "shapley_detailed" to obtain two margins-adjusted local segregation scores: one from adjusting forward, the other from adjusting backward. Averaging the two gives a single margins-adjusted local segregation score:

    diff <- mutual_difference(schools00, schools05, "race", "school",
      weight = "n", method = "shapley_detailed"
    )

    diff[stat %in% c("ls_diff1", "ls_diff2"),
      .(ls_diff_adjusted = mean(est)),
      by = .(school)
    ]
    #>       school ls_diff_adjusted
    #>       <fctr>            <num>
    #>    1:   A1_3     -0.088983164
    #>    2:   A2_2     -0.044338042
    #>    3:   A2_3     -0.101696519
    #>    4:   A2_4     -0.020134162
    #>    5:   A2_6     -0.138567163
    #>   ---
    #> 1706: C164_2     -0.031329845
    #> 1707: C165_1     -0.023978101
    #> 1708: C165_3      0.003781632
    #> 1709: C166_1      0.010270713
    #> 1710: C167_1     -0.002663687
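
If you prefer not to use data.table syntax, the same averaging step can be sketched in base R. The example below is only an illustration: it does not call the segregation package, and the `toy` data frame, its school names, its values, and the `other_stat` row are all invented stand-ins for the long-format output shown above (columns `school`, `stat`, `est`).

```r
# Toy stand-in for long-format decomposition output.
# All schools and values are invented for illustration only;
# "other_stat" is a placeholder for any row that should be ignored.
toy <- data.frame(
  school = c("s1", "s1", "s1", "s2", "s2", "s2"),
  stat   = c("ls_diff1", "ls_diff2", "other_stat",
             "ls_diff1", "ls_diff2", "other_stat"),
  est    = c(0.10, 0.20, 9, -0.30, -0.10, 9)
)

# Keep only the two margins-adjusted scores (forward and backward).
adjusted <- subset(toy, stat %in% c("ls_diff1", "ls_diff2"))

# Average the two scores within each school.
ls_adjusted <- aggregate(est ~ school, data = adjusted, FUN = mean)
names(ls_adjusted)[2] <- "ls_diff_adjusted"

print(ls_adjusted)
```

With real output, replacing `toy` by the result of `mutual_difference(..., method = "shapley_detailed")` (converted with `as.data.frame()`) should give the same per-school averages as the data.table call.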