Source for state legislative district map tiles for openstates.org


openstates/openstates-geo


  • Tracked in our central issue repository: Geo Issues

Geo lookup endpoint (v3-district-geo)

A small part of the code here, in the /endpoint directory, comprises a Lambda function that is deployed as an AWS Lambda called v3-district-geo (in the openstates account). This function does geography-specific querying. At this point I don't remember why it is a separately deployed endpoint (maybe because API v2 was sharing it under the hood?).

Upgrading Python version/runtime + associated layer

Occasionally the Python runtime needs to be upgraded. This also involves updating the Lambda Layer that the function depends on to provide its psycopg2 dependency.

To build and create a new layer:

  • I used the aws-psycopg2 package to obtain a version of psycopg2 compiled for the AWS environment
  • Change directory to the endpoint directory
  • Create a folder called python
  • Install the dependency to the folder: pip install --target ./python aws-psycopg2
  • Package up the folder as a zip: zip -r python39awspscycopg2.zip python
  • Use the AWS console to upload the new layer (or add it as a new version to an existing layer)
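
The install and zip steps above could also be scripted. The following is a minimal sketch using only the Python standard library, assuming it is run against the endpoint directory; `build_layer_zip` and its parameters are hypothetical names for illustration and are not part of this repository. Uploading the resulting archive is still done through the AWS console.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def build_layer_zip(workdir: Path, archive_stem: str, install: bool = True) -> Path:
    """Install aws-psycopg2 into <workdir>/python and zip that folder.

    Hypothetical helper: mirrors the manual pip + zip steps.
    """
    target = workdir / "python"
    target.mkdir(parents=True, exist_ok=True)
    if install:
        # Equivalent to: pip install --target ./python aws-psycopg2
        # (uses the pip of whichever interpreter runs this script)
        subprocess.run(
            [sys.executable, "-m", "pip", "install",
             "--target", str(target), "aws-psycopg2"],
            check=True,
        )
    # Equivalent to: zip -r <archive_stem>.zip python
    archive = shutil.make_archive(
        str(workdir / archive_stem), "zip",
        root_dir=str(workdir), base_dir="python",
    )
    return Path(archive)


if __name__ == "__main__":
    print(build_layer_zip(Path("."), "python39awspscycopg2"))
```

The archive keeps `python/` as its top-level folder, matching what the manual zip command produces.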

To upgrade the Python version/runtime of the function:

  • Click the "Layers" icon in the AWS Lambda console UI
  • Click the "Edit" button
  • Choose the existing layer (associated with the previous Python runtime/version) and delete it
  • Go back to the v3-district-geo function, scroll down to "Runtime Settings" and click "Edit"
  • Change the Python version
  • Now go back to the Layers section and use "Add a Layer" to associate the layer that is compatible with that Python version

Open States Geography Processing & Server

Generate and upload map tiles for the state-level legislative district maps on openstates.org, both for state overviews and for individual legislators.

  • Source: SLDL and SLDU shapefiles from the Census's TIGER/Line database
  • Output: a single nationwide MBTiles vector tile set, uploaded to Mapbox for hosting
    • Intermediate files are also built and retained locally, stored in the data directory for debugging

Dependencies

Ensuring The Right Shape Files

We download our shapefiles from census.gov.

The organization of files within TIGER's site means that we may have to change the layout of downloaded files from year to year (in utils/tiger.py). As long as we consistently add the proper files into data/source_cache for the rest of the scripts to process, changing the initial download location shouldn't matter.

See Appendix A below on Geographic Data Sources for more context.

Re-running the script: cached data concerns

The geo generate script checks for existing files in many places and prefers to use them rather than regenerate them. That means it is easy to think you are regenerating data completely when one or more parts are actually previously generated data.

BUT there is a caveat: re-downloading source shapefiles (esp. from the US Census server) is very slow, and occasionally a source file becomes unavailable. Deleting your full ./data/ folder can thus set you back.
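
To illustrate why stale output can survive a re-run, here is a minimal sketch of the "use the file if it already exists" pattern. It is an illustration only, not the script's actual code; `generate_if_missing` is a hypothetical name.

```python
from pathlib import Path
from typing import Callable


def generate_if_missing(path: Path, build: Callable[[Path], None]) -> bool:
    """Run build(path) only when path does not exist yet.

    Returns True if the file was built, False if an existing file was reused.
    """
    if path.exists():
        # A file left over from an earlier run is reused silently,
        # even if the code that produces it has changed since.
        return False
    path.parent.mkdir(parents=True, exist_ok=True)
    build(path)
    return True
```

With a guard like this, changing the build logic has no effect until the existing file is deleted, which is what the two options below address.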

Option 1: full clean

You can manually delete ./data/ OR use the --clean-source parameter to do so when running the script. Please note that re-downloading shapefiles from source sites will be slow, and occasionally US Census/TIGER will just not serve a file.

So, before deleting, it is recommended to copy the files in ./data/* (excluding subfolders) to a backup location.
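
A backup of that kind can be sketched in a few lines. This is a hedged example: `backup_top_level_files` and the backup location are hypothetical, and it copies only the files directly inside the data folder, skipping subfolders as recommended above.

```python
import shutil
from pathlib import Path
from typing import List


def backup_top_level_files(data_dir: Path, backup_dir: Path) -> List[Path]:
    """Copy files directly inside data_dir (not its subfolders) to backup_dir."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(data_dir.iterdir()):
        if entry.is_file():
            # copy2 also preserves timestamps where the platform allows it
            copied.append(Path(shutil.copy2(entry, backup_dir / entry.name)))
    return copied


if __name__ == "__main__":
    # "./data-backup" is an arbitrary example location
    for path in backup_top_level_files(Path("./data"), Path("./data-backup")):
        print("backed up", path)
```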

Option 2: selectively delete generated files

If you want to regenerate without changing source files, these are some locations within ./data/ you can delete:

  • ./data/source_cache/: this is the unzipped contents of the zipfiles downloaded from sources
    • EXCEPT for the .geojson files in this subfolder! They are generated by convert_to_geojson(), and if one already exists, then the dependent process _merge_ids() will NOT run, and thus data in ./data/geojson/ will NOT be (re)generated. So if you are working on the _merge_ids() code, you WILL want to delete at least the .geojson files in this subdirectory.
  • Mapbox data/upload:
    • ./data/mapbox/: if geojson files exist in here, then create_tiles() will NOT regenerate them
    • ./data/cd.mbtiles and ./data/sld.mbtiles: if either of these is present, then create_tiles() will NOT regenerate it
  • ./data/boundaries/*.json: if these boundary files already exist, then _make_boundaries() will NOT regenerate them
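
The selective cleanup above can be sketched as a small helper. This is an assumption-laden illustration, not part of the repository: `find_generated` and `remove_generated` are hypothetical names, the .geojson files are assumed to sit anywhere under source_cache, and every file in the mapbox folder is treated as generated. It defaults to a dry run so nothing is deleted until you opt in.

```python
from pathlib import Path
from typing import List


def find_generated(data_dir: Path) -> List[Path]:
    """List generated files that would otherwise block regeneration."""
    candidates: List[Path] = [data_dir / "cd.mbtiles", data_dir / "sld.mbtiles"]
    source_cache = data_dir / "source_cache"
    if source_cache.is_dir():
        # Only the generated .geojson files; unzipped source files stay
        candidates.extend(source_cache.rglob("*.geojson"))
    mapbox = data_dir / "mapbox"
    if mapbox.is_dir():
        candidates.extend(mapbox.iterdir())
    boundaries = data_dir / "boundaries"
    if boundaries.is_dir():
        candidates.extend(boundaries.glob("*.json"))
    return sorted(p for p in candidates if p.is_file())


def remove_generated(data_dir: Path, dry_run: bool = True) -> List[Path]:
    """Delete the files found above; with dry_run=True only report them."""
    targets = find_generated(data_dir)
    for path in targets:
        if dry_run:
            print("would delete", path)
        else:
            path.unlink()
    return targets


if __name__ == "__main__":
    remove_generated(Path("./data"), dry_run=True)
```

Downloaded zip files and the unzipped shapefiles are left untouched, so no source data needs to be fetched again.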

National Boundary Update

config/settings.yml holds the BOUNDARY_YEAR config. This setting defines the year to apply to our US boundary template link:

f"{TIGER_ROOT}/GENZ{boundary_year}/shp/cb_{boundary_year}_us_nation_5m.zip"

We should verify/update this setting to the most recently available boundary year whenever we run geo data.
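
As a quick illustration of how the template expands, the sketch below fills it in for a given year. The `TIGER_ROOT` value shown is an assumption for the example (check the repository for the real constant), and `boundary_url` is a hypothetical helper name.

```python
# Assumed value for illustration only; the real constant lives in the repository.
TIGER_ROOT = "https://www2.census.gov/geo/tiger"


def boundary_url(boundary_year: int, tiger_root: str = TIGER_ROOT) -> str:
    """Expand the national boundary template for one BOUNDARY_YEAR."""
    return f"{tiger_root}/GENZ{boundary_year}/shp/cb_{boundary_year}_us_nation_5m.zip"


if __name__ == "__main__":
    print(boundary_url(2022))
```

Printing the URL for a candidate year and checking that it resolves is one way to verify the setting before a full run.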

Note on file naming

You'll see many files with names like sldu, sldl or cd during this process. Here is a quick layout of what those file name abbreviations mean:

  • sldu
    • State Level District Upper -> Upper Chamber District boundaries
  • sldl
    • State Level District Lower -> Lower Chamber District boundaries
  • cd
    • Congressional District -> Federal Congressional District boundaries

We do not collect boundaries for the Federal Senate because each state has the same number of senators and they are considered "at-large" (having no district boundaries beyond the entire state).

Running

There are several steps, which typically need to be run in order:

  1. Set up Poetry:
    • poetry install
  2. Make sure environment variables are set correctly:
    • DATABASE_URL: pointing at either the geo database in production or a local copy, e.g. DATABASE_URL=postgis://<user>:<password>@<db_host>/geo
    • MAPBOX_ACCESS_TOKEN: an API token for Mapbox with permissions to upload tilesets
    • AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY: AWS credentials to upload bulk versions of geo data
  3. Download and format geo data:
    • poetry run python generate-geo-data.py --run-migrations --upload-data
      • Note that this script does not fail on individual download failures. If you see failures in the run, make sure they are expected (e.g. NE/DC lower should fail, since neither has a lower chamber)
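
Before starting a long run, it can help to confirm that the variables from step 2 are actually set. The following is a minimal sketch of such a pre-flight check; `missing_env_vars` is a hypothetical helper, and nothing here implies the generate script performs this check itself.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_VARS = (
    "DATABASE_URL",
    "MAPBOX_ACCESS_TOKEN",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
)


def missing_env_vars(environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        raise SystemExit("Missing environment variables: " + ", ".join(missing))
    print("All required environment variables are set.")
```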

Setting up environment variables

There are plenty of ways to set environment variables, but a quick way to manage many of them is with an "environment file", e.g.

AWS_ACCESS_KEY_ID="user"
export AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY="test"
export AWS_SECRET_ACCESS_KEY
MAPBOX_ACCESS_TOKEN="token"
export MAPBOX_ACCESS_TOKEN
DATABASE_URL="postgis://openstates:openstates@localhost:5405/openstatesorg"
export DATABASE_URL

After that, we can easily load the file:

. env-file

Running within Docker

Instead of setting up your local environment, you can run using Docker. Using Docker Compose will still allow you to access all intermediate files from the processing, within your local data directory.

Build and run with Docker Compose. Similar to running without Docker, environment variables must be set in your local environment.

docker-compose up make-tiles

Appendix A: Geo Data Sources used by openstates-geo

openstates-geo works with shapefiles. Shapefiles can be opened with a tool called QGIS. For example, to inspect a source shapefile, such as tl_2022_01_sldl.shp, open up QGIS and navigate to the folder where that file resides. Open the file; it should appear in the main pane as a map. Use the "Select Features by Area or single click" button in the toolbar, and then select a district. Metadata should appear in the right pane.

US Census

Redistricting

During the next major sessions after a Census (e.g. 2022 was the major session for most jurisdictions after the 2020 Census), the TIGER data we rely on may be significantly "behind" reality, as this example note from 2022 indicates:

"We hold the districts used for the 2018 election until we collect the postcensal congressional and state legislative district plans for the 118th Congress and year 2022 state legislatures" (US Census CD/SLD note)

As of 2022, TIGER was still the most consistent data source for district boundaries we were able to find.

US Census: TIGER

Files in the TIGER data source are organized according to Federal Information Processing Standards (FIPS) codes. Each numeric code corresponds to a US state (or other levels). For example, 01 represents Alabama.
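
To show how a FIPS code ends up in a file name, the sketch below rebuilds the name of the example shapefile mentioned earlier (tl_2022_01_sldl.shp). The helper name and the three-state lookup table are illustrative assumptions, and only the SLDL/SLDU naming pattern is covered.

```python
# Partial lookup for illustration; a full table would cover every state.
STATE_FIPS = {"AL": "01", "AK": "02", "AZ": "04"}


def sld_shapefile_name(year: int, state: str, chamber: str) -> str:
    """Build a TIGER state legislative district shapefile name.

    chamber is "sldl" (lower) or "sldu" (upper).
    """
    if chamber not in ("sldl", "sldu"):
        raise ValueError(f"unsupported chamber: {chamber!r}")
    fips = STATE_FIPS[state.upper()]
    return f"tl_{year}_{fips}_{chamber}.shp"


if __name__ == "__main__":
    print(sld_shapefile_name(2022, "AL", "sldl"))
```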

TIGER SLDL

2022

This contains data, including shapefiles, about State Legislative Districts in Lower chambers (SLDL).

TIGER SLDU

2022

This contains data, including shapefiles, about State Legislative Districts in Upper chambers (SLDU).

TIGER CD

2022

This contains data, including shapefiles, about Congressional Districts.

Appendix B: Manually checking geographic data

As mentioned above, the open source desktop app QGIS can help to investigate and debug your expectations about both source geo files and geo files that are generated by the script. It can open files of these types:

  • .shp
  • .geojson
  • .mbtiles
