Digitizing Qualitative GIS Data with qualmap

Christopher Prener

2024-01-08

Overview

This package implements a process for converting qualitative GIS data from an exercise where respondents are asked to identify salient locations on a map. This article focuses primarily on the use of the software to digitize these data.

Motivation and Approach

Qualitative GIS outputs are notoriously difficult to work with because individuals’ conceptions of space can vary greatly from each other and from the realities of physical geography. qualmap builds on a semi-structured approach to qualitative GIS data collection. Respondents use a specially designed basemap that allows them free rein to identify geographic features of interest and makes it easy to convert their annotations into digital map features. This is facilitated by including on the basemap a series of polygons, such as neighborhood boundaries or census geography, along with an identification number that can be used by qualmap. A circle drawn on the map can therefore be easily associated with the features that it touches or contains.

qualmap provides a suite of functions for entering, validating, and creating sf objects based on these hand-drawn clusters and their associated identification numbers. Once the clusters have been created, they can be summarized and analyzed either within R or using another tool.

This approach provides an alternative both to unstructured qualitative GIS data, which are difficult to work with empirically, and to digitizing respondents’ annotations as rasters, which requires a sophisticated workflow. This semi-structured approach makes integrating qualitative GIS with existing census and administrative data simple and straightforward, which in turn allows these data to be used as measures in spatial statistical models.

Cartographica Article

An article describing qualmap’s approach to qualitative GIS has been published in Cartographica. All data associated with the article are also available on Open Science Framework, and the code is available via Open Science Framework and GitHub. Please cite the paper if you use qualmap in your work!

Installation

The easiest way to get qualmap is to install it from CRAN:

install.packages("qualmap")

You can install the development version of qualmap from GitHub with the remotes package:

# install.packages("remotes")
remotes::install_github("chris-prener/qualmap")

Note that installations that require sf to be built from source will require additional software regardless of operating system. You should check the sf package website for the latest details on installing dependencies for that package. Instructions vary significantly by operating system.

Basics

qualmap is built around a number of fundamental principles. The primary data objects created by qm_combine() are long data rather than wide. This is done to facilitate easy, consistent data management. The package also implements simple features objects using the sf package. This provides a modern interface for working with spatial data in R.

Core Verbs

qualmap implements six core verbs for working with mental map data:

  1. qm_define() - create a vector of feature id numbers that constitute a single “cluster”
  2. qm_validate() - check feature id numbers against a reference data set to ensure that the values are valid
  3. qm_preview() - plot cluster on an interactive map to ensure the feature ids have been entered correctly (the preview should match the map used as a data collection instrument)
  4. qm_create() - create a single cluster object once the data have been validated and visually inspected
  5. qm_combine() - combine multiple cluster objects together into a single tibble data object
  6. qm_summarize() - summarize the combined data object based on a single qualitative construct to prepare for mapping

The order in which these functions are listed here is the approximate order in which they should be used. Data should be defined, validated, and previewed, and then cluster objects should be created, combined, and summarized.

Main Arguments

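All of the main functions except qm_define() and qm_combine() rely on two key arguments:

  1. ref - the reference data, an sf object containing the polygons that respondents’ annotations are matched to
  2. key - the variable in ref that holds the feature id numbers that input values are matched against

Additionally, a number of the initial functions have a third essential argument:

  3. value - the vector of feature id numbers created with qm_define()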

Data Preparation

To begin, you will need a simple features object containing the polygons you will be matching respondents’ data to. Census geography polygons can be downloaded via tigris, and other polygon shapefiles can be read into R using the sf package.

Here is an example of preparing data downloaded via tigris:

library(dplyr)   # data wrangling
library(sf)      # simple features objects
library(tigris)  # access census tiger/line data

stLouis <- tracts(state = "MO", county = 510)
stLouis <- mutate(stLouis, TRACTCE = as.numeric(TRACTCE))

We download the census tract data for St. Louis and convert the TRACTCE variable to numeric format.

If you want to use your own base data instead, you can use the st_read() function from sf to bring them into R.
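As a minimal sketch of that route, the snippet below reads a shapefile with st_read(). The file used here is the North Carolina example shapefile that ships with sf, chosen only so the code runs without a download; the object name basemap is hypothetical, and you would substitute the path to your own polygon data.

```r
library(sf)

# nc.shp is an example shapefile bundled with sf; replace this with the
# path to your own base data.
path <- system.file("shape/nc.shp", package = "sf")
basemap <- st_read(path, quiet = TRUE)

# Confirm the result is an sf object and look for a variable that could
# serve as the key (a unique id printed on the paper basemap).
class(basemap)
names(basemap)
```

As with TRACTCE above, the id variable you choose as the key may need converting to numeric format before clusters are entered.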

Data Entry

Once we have a reference data set constructed, we can begin entering the tract numbers that constitute a single circle on the map or “cluster”. We use the qm_define() function to input these id numbers into a vector:

cluster1 <- qm_define(118600, 119101, 119300)

We can then use the qm_validate() function to check each value in the vector and ensure that these values all match the key variable in the reference data:

> qm_validate(ref = stLouis, key = TRACTCE, value = cluster1)
[1] TRUE

If qm_validate() returns a TRUE value, all data are matches. If it returns FALSE, at least one of the input values does not match any of the key variable values. In this case, our key is the TRACTCE variable in the sf object we created earlier.
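The kind of check described above can be illustrated in base R. This is a sketch of the idea only, not qualmap's implementation: the reference table and both cluster vectors are hypothetical stand-ins, and the setdiff() step is an extra shown here because a bare FALSE does not say which id was mistyped.

```r
# Hypothetical reference table standing in for the key column of the sf object.
ref <- data.frame(TRACTCE = c(118600, 119101, 119300, 121100, 121200))

cluster_ok  <- c(118600, 119101, 119300)
cluster_bad <- c(118600, 119110, 119300)  # 119110 is a transposed entry

# TRUE only when every entered id appears in the key column.
all(cluster_ok %in% ref$TRACTCE)
all(cluster_bad %in% ref$TRACTCE)

# List the entries that failed to match, to locate data-entry errors.
setdiff(cluster_bad, ref$TRACTCE)
```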

Once the data are validated, we can preview them interactively using qm_preview(), which will show the features identified in the given vector in red on the map:

qm_preview(ref = stLouis, key = TRACTCE, value = cluster1)

Create Cluster Object

A cluster object is a tibble data frame that is “tidy” - each feature in the reference data is a row. Cluster objects also contain metadata about the cluster itself: the respondent’s identification number from the study, a cluster identification number, and a category that describes what the cluster represents. Clusters are created using qm_create():

> cluster1_obj <- qm_create(ref = stLouis, key = TRACTCE, value = cluster1, rid = 1, cid = 1, category = "positive")
> cluster1_obj
# A tibble: 3 x 5
    RID   CID CAT      TRACTCE COUNT
* <int> <int> <chr>      <dbl> <dbl>
1     1     1 positive  119300  1.00
2     1     1 positive  118600  1.00
3     1     1 positive  119101  1.00

Combine and Summarize Multiple Clusters

Once several cluster objects have been created, they can be combined using qm_combine() to produce a tidy tibble formatted data object:

> clusters <- qm_combine(cluster1_obj, cluster2_obj, cluster3_obj)
> clusters
# A tibble: 9 x 5
    RID   CID CAT      TRACTCE COUNT
  <int> <int> <chr>      <dbl> <dbl>
1     1     1 positive  119300  1.00
2     1     1 positive  118600  1.00
3     1     1 positive  119101  1.00
4     1     2 positive  119300  1.00
5     1     2 positive  121200  1.00
6     1     2 positive  121100  1.00
7     1     3 negative  119300  1.00
8     1     3 negative  118600  1.00
9     1     3 negative  119101  1.00

Since the same census tract appears in multiple rows as part of different clusters, we need to summarize these data before we can map them. Part of qualmap’s opinionated approach revolves around clusters representing only one construct. When we summarize, therefore, we also subset our data so that they represent only one phenomenon. In the above example, there are both “positive” and “negative” clusters. We can use qm_summarize() to extract only the “positive” clusters and then summarize them so that we have one row per census tract:

> pos <- qm_summarize(ref = stLouis, key = TRACTCE, clusters = clusters,
+   category = "positive", geometry = TRUE, use.na = FALSE)
> pos
Simple feature collection with 106 features and 7 fields
geometry type:  POLYGON
dimension:      XY
bbox:           xmin: -90.32052 ymin: 38.53185 xmax: -90.16657 ymax: 38.77443
epsg (SRID):    4269
proj4string:    +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
First 10 features:
   STATEFP COUNTYFP TRACTCE       GEOID NAME          NAMELSAD positive                       geometry
1       29      510  112100 29510112100 1121 Census Tract 1121        0 POLYGON ((-90.30445 38.6328...
2       29      510  116500 29510116500 1165 Census Tract 1165        0 POLYGON ((-90.24302 38.5975...
3       29      510  110300 29510110300 1103 Census Tract 1103        0 POLYGON ((-90.24032 38.6643...
4       29      510  103700 29510103700 1037 Census Tract 1037        0 POLYGON ((-90.29877 38.6028...
5       29      510  103800 29510103800 1038 Census Tract 1038        0 POLYGON ((-90.32052 38.5941...
6       29      510  104500 29510104500 1045 Census Tract 1045        0 POLYGON ((-90.29432 38.6209...
7       29      510  106100 29510106100 1061 Census Tract 1061        0 POLYGON ((-90.29005 38.6705...
8       29      510  105500 29510105500 1055 Census Tract 1055        0 POLYGON ((-90.28601 38.6589...
9       29      510  105200 29510105200 1052 Census Tract 1052        0 POLYGON ((-90.29481 38.6473...
10      29      510  105300 29510105300 1053 Census Tract 1053        0 POLYGON ((-90.29705 38.6617...

The qm_summarize() function has options to return NA values instead of 0 values for features not included in any clusters (when use.na = TRUE), and to return a non-sf tibble of valid features instead of the sf object (when geometry = FALSE).
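To make the subset-then-count logic concrete, here is a base R sketch using a plain data frame that reproduces the nine combined rows shown earlier. It is an illustration under assumptions, not qm_summarize() itself: the real function also joins the result to the reference data, which this sketch omits, so only tracts that appear in at least one “positive” cluster are returned.

```r
# Stand-in for the combined cluster data (no geometry, plain data frame).
clusters <- data.frame(
  RID = 1L,
  CID = rep(1:3, each = 3),
  CAT = rep(c("positive", "positive", "negative"), each = 3),
  TRACTCE = c(119300, 118600, 119101,
              119300, 121200, 121100,
              119300, 118600, 119101),
  COUNT = 1
)

# Keep a single construct, then total the mentions of each tract.
pos_only <- clusters[clusters$CAT == "positive", ]
pos_counts <- aggregate(COUNT ~ TRACTCE, data = pos_only, FUN = sum)
names(pos_counts)[2] <- "positive"

pos_counts
```

Tract 119300 appears in both “positive” clusters, so it is the only tract in this toy result with a count above one.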

Mapping Summarized Data

Finally, we can use the geom_sf() geom from ggplot2 to map our summarized data, highlighting areas most discussed as being “positive” parts of St. Louis in our hypothetical study:

library(ggplot2)
library(viridis)

# pos is the sf object returned by qm_summarize() above
ggplot() +
  geom_sf(data = pos, mapping = aes(fill = positive)) +
  scale_fill_viridis()

Since qualmap outputs are sf objects, they will work with any of the spatial packages that also support sf.

Getting Help

