Process GTFS Static/Schedule by importing it into a PostgreSQL database.

public-transport/gtfs-via-postgres


Import GTFS Static/Schedule datasets into a PostgreSQL database, to allow for efficient querying and analysis.


  • ✅ handles daylight saving time correctly, but retains reasonable lookup performance
  • ✅ supports frequencies.txt
  • ✨ joins stop_times.txt/frequencies.txt, calendar.txt/calendar_dates.txt, trips.txt, routes.txt & stops.txt into views for straightforward data analysis (see below)
  • 🚀 is carefully optimised to let PostgreSQL's query planner do its magic, yielding quick lookups even with large datasets (see performance section)
  • ✅ validates and imports translations.txt
  • ✨ exposes (almost) all data via GraphQL using PostGraphile, and as a RESTful API using PostgREST

To work with the time-related data (stop_times etc.), gtfs-via-postgres supports two "mental models":

  • the time-unexpanded data that is almost directly taken from the GTFS Schedule data – This is useful if you want to do network analysis.
  • the time-expanded view that "applies" every trip's stop_times rows to all of its service days – This is useful for routing & queries from the traveller's perspective.

Installation

npm install -g gtfs-via-postgres

Or use npx. ✨

There are also prebuilt binaries and Docker images available.

Note: gtfs-via-postgres needs PostgreSQL >=14 to work, as it uses the WITH … AS NOT MATERIALIZED syntax. You can check your PostgreSQL server's version with psql -t -c 'SELECT version()'.

Getting Started

If you have a .zip GTFS feed, unzip it into individual files.

We're going to use the 2022-07-01 VBB feed as an example, which consists of individual files already.

```sh
wget --compression auto \
    -r --no-parent --no-directories -R .csv.gz \
    -P gtfs -N 'https://vbb-gtfs.jannisr.de/2022-07-01/'
# Downloaded 14 files in 20s.
ls -lh gtfs
# 3.3K agency.csv
#  97K calendar.csv
# 1.1M calendar_dates.csv
# 2.5K datapackage.json
#  64B frequencies.csv
# 5.9K levels.csv
# 246B license
# 8.3M pathways.csv
#  49K routes.csv
# 146M shapes.csv
# 368M stop_times.csv
# 5.0M stops.csv
# 4.7M transfers.csv
#  16M trips.csv
```

Depending on your specific setup, configure access to the PostgreSQL database via PG* environment variables:

```sh
export PGUSER=postgres
export PGPASSWORD=password
env PGDATABASE=postgres psql -c 'create database vbb_2022_02_25'
export PGDATABASE=vbb_2022_02_25
```

Note: gtfs-via-postgres generates SQL that contains the CREATE EXTENSION postgis instruction. For this to work, the PostgreSQL user you're connecting as needs the CREATE permission on the database. Also, the postgis extension must either be marked as trusted (by putting trusted = true into $(pg_config --sharedir)/extension/postgis.control), or your user must be a superuser.

Install gtfs-via-postgres and use it to import the GTFS data:

```sh
npm install -D gtfs-via-postgres
npm exec -- gtfs-to-sql --require-dependencies -- gtfs/*.csv | sponge | psql -b
# agency
# calendar
# CREATE EXTENSION
# BEGIN
# CREATE TABLE
# COPY 37
# …
# CREATE INDEX
# CREATE VIEW
# COMMIT
```

Importing will take 10s to 10m, depending on the size of the feed. On an M1 MacBook Air, importing the above feed takes about 4m; importing the 260kb 2021-10-06 Amtrak feed takes 6s.

In addition to a table for each GTFS file, gtfs-via-postgres adds these views to help with real-world analysis:

  • service_days (materialized) "applies" calendar_dates to calendar to give you all days of operation for each "service" defined in calendar.
  • arrivals_departures "applies" stop_times/frequencies to trips and service_days to give you all arrivals/departures at each stop with their absolute dates & times. It also resolves each stop's parent station ID & name.
  • connections "applies" stop_times/frequencies to trips and service_days, just like arrivals_departures, but gives you departure (at stop A) & arrival (at stop B) pairs.
  • shapes_aggregated aggregates individual shape points in shapes into a PostGIS LineString.
  • stats_by_route_date provides the number of arrivals/departures by route ID and date. – read more
  • stats_by_agency_route_stop_hour provides the number of arrivals/departures by agency ID, route ID, stop ID & hour. – read more
  • In contrast to stats_by_route_date & stats_by_agency_route_stop_hour, stats_active_trips_by_hour provides the number of currently running trips for each hour in the feed's period of time.

As an example, we're going to use the arrivals_departures view to query all absolute departures at de:11000:900120003 (S Ostkreuz Bhf (Berlin)) between 2022-03-23T12:30+01 and 2022-03-23T12:35+01:

```sql
SELECT *
FROM arrivals_departures
WHERE station_id = 'de:11000:900120003'
AND t_departure >= '2022-03-23T12:30+01'
AND t_departure <= '2022-03-23T12:35+01'
```
| route_id | route_short_name | route_type | trip_id | date | stop_sequence | t_arrival | t_departure | stop_id | stop_name | station_id | station_name |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 10148_109 | S3 | 109 | 169035756 | 2022-03-23 00:00:00 | 19 | 2022-03-23 12:31:24+01 | 2022-03-23 12:32:12+01 | de:11000:900120003:2:53 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |
| 10148_109 | S3 | 109 | 169035899 | 2022-03-23 00:00:00 | 10 | 2022-03-23 12:33:06+01 | 2022-03-23 12:33:54+01 | de:11000:900120003:3:55 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |
| 10162_109 | S7 | 109 | 169128381 | 2022-03-23 00:00:00 | 19 | 2022-03-23 12:33:54+01 | 2022-03-23 12:34:42+01 | de:11000:900120003:2:53 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |
| 10162_109 | S7 | 109 | 169128495 | 2022-03-23 00:00:00 | 9 | 2022-03-23 12:30:36+01 | 2022-03-23 12:31:24+01 | de:11000:900120003:3:55 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |
| 10223_109 | S41 | 109 | 169054370 | 2022-03-23 00:00:00 | 21 | 2022-03-23 12:30:24+01 | 2022-03-23 12:31:12+01 | de:11000:900120003:5:58 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |
| 10227_109 | S42 | 109 | 169071882 | 2022-03-23 00:00:00 | 6 | 2022-03-23 12:30:30+01 | 2022-03-23 12:31:12+01 | de:11000:900120003:5:59 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |
| 19040_100 | RB14 | 100 | 178748721 | 2022-03-23 00:00:00 | 13 | 2022-03-23 12:30:00+01 | 2022-03-23 12:30:00+01 | de:11000:900120003:1:50 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |
| 22664_2 | FEX | 2 | 178748125 | 2022-03-23 00:00:00 | 1 | 2022-03-23 12:32:00+01 | 2022-03-23 12:34:00+01 | de:11000:900120003:4:57 | S Ostkreuz Bhf (Berlin) | de:11000:900120003 | S Ostkreuz Bhf (Berlin) |

translations

There are some …_translated views (e.g. stops_translated, arrivals_departures_translated) that

  • join their respective source table with translations, so that each (translatable) field is translated in every provided language,
  • add a …_lang column for each translated column (e.g. stop_name_lang for stop_name) that indicates the language of the translation.

Assuming a dataset with translations.csv, let's query all stops with a de-CH translation, falling back to the untranslated values:

```sql
SELECT
    stop_id,
    stop_name, stop_name_lang,
    stop_url, stop_url_lang
FROM stops_translated
WHERE (stop_name_lang = 'de-CH' OR stop_name_lang IS NULL)
AND (stop_url_lang = 'de-CH' OR stop_url_lang IS NULL)
```

Usage

```
Usage:
    gtfs-to-sql [options] [--] <gtfs-file> ...
Options:
    --silent                  -s  Don't show files being converted.
    --require-dependencies    -d  Require files that the specified GTFS files depend
                                  on to be specified as well (e.g. stop_times.txt
                                  requires trips.txt). Default: false
    --ignore-unsupported      -u  Ignore unsupported files. Default: false
    --route-types-scheme          Set of route_type values to support.
                                    - basic: core route types in the GTFS spec
                                    - google-extended: Extended GTFS Route Types [1]
                                    - tpeg-pti: proposed TPEG-PTI-based route types [2]
                                    May also be a set of these schemes, separated by `,`.
                                    Default: google-extended
    --trips-without-shape-id      Don't require trips.txt items to have a shape_id.
                                    Default if shapes.txt has not been provided.
    --routes-without-agency-id    Don't require routes.txt items to have an agency_id.
    --stops-without-level-id      Don't require stops.txt items to have a level_id.
                                    Default if levels.txt has not been provided.
    --stops-location-index        Create a spatial index on stops.stop_loc for efficient
                                    queries by geolocation.
    --lower-case-lang-codes       Accept Language Codes (e.g. in feed_info.feed_lang)
                                    with a different casing than the official BCP-47
                                    language tags (as specified by the GTFS spec),
                                    by lower-casing all of them before validating.
                                    http://www.rfc-editor.org/rfc/bcp/bcp47.txt
                                    http://www.w3.org/International/articles/language-tags/
    --stats-by-route-date         Whether to generate a stats_by_route_date view
                                    letting you analyze all data per route and/or date:
                                    - none: Don't generate a view.
                                    - view: Fast generation, slow access.
                                    - materialized-view: Slow generation, fast access.
                                    Default: none
    --stats-by-agency-route-stop-hour
                                  Generate a view letting you analyze arrivals/
                                    departures per route, stop and hour.
                                    The flag works like --stats-by-route-date.
    --stats-active-trips-by-hour  Generate a view letting you analyze the number of
                                    currently running trips over time, by hour.
                                    Like --stats-by-route-date, this flag accepts
                                    none, view & materialized-view.
    --schema                      The schema to use for the database. Default: public
                                    Even when importing into a schema other than `public`,
                                    a function `public.gtfs_via_postgres_import_version()`
                                    gets created, to ensure that multiple imports into the
                                    same database are all made using the same version. See
                                    also multiple-datasets.md in the docs.
    --postgraphile                Tweak generated SQL for PostGraphile usage.
                                    https://www.graphile.org/postgraphile/
    --postgraphile-password       Password for the PostGraphile PostgreSQL user.
                                    Default: $POSTGRAPHILE_PGPASSWORD, fallback random.
    --postgrest                   Tweak generated SQL for PostgREST usage.
                                    Please combine it with --schema.
                                    https://postgrest.org/
    --postgrest-password          Password for the PostgREST PostgreSQL user `web_anon`.
                                    Default: $POSTGREST_PGPASSWORD, fallback random.
    --postgrest-query-cost-limit  Define a cost limit [1] for queries executed by PostgREST
                                    on behalf of a user. It is only enforced if
                                    pg_plan_filter [2] is installed in the database!
                                    Must be a positive float. Default: none
                                    [1] https://www.postgresql.org/docs/14/using-explain.html
                                    [2] https://github.com/pgexperts/pg_plan_filter
    --import-metadata             Create functions returning import metadata:
                                    - gtfs_data_imported_at (timestamp with time zone)
                                    - gtfs_via_postgres_version (text)
                                    - gtfs_via_postgres_options (jsonb)
Examples:
    gtfs-to-sql some-gtfs/*.txt | sponge | psql -b # import into PostgreSQL
    gtfs-to-sql -u -- some-gtfs/*.txt | gzip >gtfs.sql.gz # generate a gzipped SQL dump
[1] https://developers.google.com/transit/gtfs/reference/extended-route-types
[2] https://groups.google.com/g/gtfs-changes/c/keT5rTPS7Y0/m/71uMz2l6ke0J
```

Some notable limitations mentioned in the PostgreSQL 14 documentation on date/time types:

For timestamp with time zone, the internally stored value is always in UTC (Universal Coordinated Time, traditionally known as Greenwich Mean Time, GMT). An input value that has an explicit time zone specified is converted to UTC using the appropriate offset for that time zone.

When a timestamp with time zone value is output, it is always converted from UTC to the current timezone zone, and displayed as local time in that zone. To see the time in another time zone, either change timezone or use the AT TIME ZONE construct […].

You can run queries with date+time values in any timezone (offset) and they will be processed correctly, but the output will always be in the database timezone (offset), unless you have explicitly used AT TIME ZONE.
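The same timestamptz behaviour can be mimicked in Python, as a rough analogy (this is an illustration, not gtfs-via-postgres code): input values with an explicit offset get normalised to UTC, and rendering in another zone has to be requested explicitly, which is what AT TIME ZONE does in SQL.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# input with an explicit offset is normalised to UTC, like timestamptz
t = datetime.fromisoformat("2022-03-23T12:30:00+01:00").astimezone(timezone.utc)
print(t.isoformat())  # → 2022-03-23T11:30:00+00:00

# on output, the value is rendered in a chosen zone; converting explicitly
# corresponds to SQL's AT TIME ZONE construct
print(t.astimezone(ZoneInfo("Europe/Berlin")).isoformat())  # → 2022-03-23T12:30:00+01:00
```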

With Docker

Note: Just like the npm-installed variant, the Docker integration assumes that your GTFS dataset consists of individual files (i.e. unzipped).

Instead of installing via npm, you can use the ghcr.io/public-transport/gtfs-via-postgres Docker image:

```sh
# variant A: use Docker image just to convert GTFS to SQL
docker run --rm --volume /path/to/gtfs:/gtfs \
    ghcr.io/public-transport/gtfs-via-postgres --require-dependencies -- '/gtfs/*.csv' \
    | sponge | psql -b
```

Note: Remember to pass the /gtfs/*.csv glob as a string (with '), so that it gets evaluated inside the Docker container.

With the code above, the psql -b process will run outside of the Docker container, so your host machine needs access to PostgreSQL.

If you want to directly import the GTFS data from within the Docker container, you need to add psql to the image and run it from inside. To do that, write a new Dockerfile that extends the ghcr.io/public-transport/gtfs-via-postgres image:

```dockerfile
FROM ghcr.io/public-transport/gtfs-via-postgres
ENV PGPORT=5432 PGUSER=postgres
WORKDIR /gtfs
# pass all arguments into gtfs-via-postgres, pipe output into psql:
ENTRYPOINT ["/bin/sh", "-c", "gtfs-via-postgres $0 $@ | sponge | psql -b"]
```
```sh
# start PostgreSQL DB in another container "db"
docker run --name db -p 5432:5432 -e POSTGRES_PASSWORD=password postgis/postgis

# variant B: use Docker image to convert GTFS to SQL and import it directly
# build helper Docker image from Dockerfile
docker build -t import-gtfs .
docker run --rm --volume /path/to/gtfs:/gtfs \
    --link db -e PGHOST=db -e PGPASSWORD=password \
    import-gtfs --require-dependencies -- '/gtfs/*.csv'
```

Importing a GTFS Schedule feed continuously

postgis-gtfs-importer imports GTFS Schedule data into a PostGIS database using gtfs-via-postgres. It allows running a production service (e.g. an API) on top of programmatically re-imported data from a periodically changing GTFS feed without downtime.

Because it works as atomically as possible with PostgreSQL, it makes the import pipeline robust, even if an import fails or if simultaneous imports get started.

Exporting data efficiently

If you want to export data from the database, use the COPY command; on an M1 MacBook Air, PostgreSQL 14 can export about 500k connections rows per second.

```sh
psql -c 'COPY (SELECT * FROM connections) TO STDOUT csv HEADER' >connections.csv
```

In the nested SELECT query, you can use features like WHERE, ORDER BY and LIMIT. Because psql passes on the exported data right away, you could stream it into another process.

Querying stops by location efficiently

If you want to find stops by (geo)location, run gtfs-via-postgres with --stops-location-index. This will create a spatial index on stops.stop_loc, so that most PostGIS functions & operators make use of it.

GraphQL support

The --postgraphile flag changes the SQL generated by gtfs-via-postgres slightly, so that you get a reasonably idiomatic GraphQL API out-of-the-box when running PostGraphile v4 on it:

```sh
# import data into PostgreSQL with PostGraphile tweaks
npm exec -- gtfs-to-sql -d --postgraphile -- gtfs/*.csv | sponge | psql -b
```

In line with the intended PostGraphile usage, gtfs-via-postgres will create a PostgreSQL role/user postgraphile with read-only access to the DB. You can set the postgraphile user's password with the --postgraphile-password option, or using the $POSTGRAPHILE_PGPASSWORD environment variable; by default, it will use (and log) a random password.

gtfs-via-postgres doesn't specify PostGraphile as a regular dependency, but as a peer dependency (peerDependencies), in order to stay lightweight for users who don't need the GraphQL interface. Some versions of some package managers install unmet peer dependencies, some don't. Let's make sure that PostGraphile (and its plugins) are installed:

```sh
npm install \
    postgraphile@^4.12 \
    @graphile-contrib/pg-simplify-inflector@^6.1 \
    @graphile/postgis@^0.2.0-0
```

The serve-gtfs-via-graphql helper script configures and runs PostGraphile. With NODE_ENV=development, it also serves the GraphiQL UI:

```sh
# listens on port 3000, this can be changed using $PORT
env NODE_ENV=development npm exec -- serve-gtfs-via-graphql
```

As an example for the GraphQL API, check out the test query or open the GraphiQL UI served at localhost:3000/graphiql.

REST API support

With the --postgrest flag, gtfs-via-postgres will augment the schema with a web_anon role and some comments, so that when running PostgREST on the database, you will get a powerful REST API.

read more

more guides

The docs directory contains more instructions on how to use gtfs-via-postgres.

Correctness vs. Speed regarding GTFS Time Values

When matching time values from stop_times against dates from calendar/calendar_dates, you have to take into account that GTFS Time values can be >24h and are not relative to the beginning of the day but relative to noon - 12h. (There are a few libraries that don't do this.)

This means that, in order to determine all absolute points in time where a particular trip departs at a particular stop, you cannot just loop over all "service dates" and add the time value (as in beginning_of_date + departure_time); instead, for each date, you have to determine noon, subtract 12h and then apply the time, which might extend arbitrarily far into the following days.

Let's consider two examples:

  • A departure_time of 26:59:00 with a trip running on 2021-03-01: The time, applied to this specific date, "extends" into the following day, so it actually departs at 2021-03-02T02:59+01.
  • A departure_time of 03:01:00 with a trip running on 2021-03-28: This is when the standard -> DST switch happens in the Europe/Berlin timezone. Because the dep. time refers to noon - 12h (not to midnight), it actually happens at 2021-03-28T03:01+02, which is not 3h1m after 2021-03-28T00:00+01.
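The noon-minus-12h rule can be sketched in Python (a minimal illustration, not gtfs-via-postgres code; resolve_gtfs_time is a hypothetical helper name):

```python
from datetime import date, datetime, timedelta, timezone
from zoneinfo import ZoneInfo

def resolve_gtfs_time(service_date, gtfs_time_s, tz):
    """Apply a GTFS Time value (in seconds, may exceed 24h) to a service date."""
    # start from noon of the service day, in the feed's timezone …
    noon = datetime(service_date.year, service_date.month, service_date.day, 12, tzinfo=tz)
    # … subtract an *absolute* 12 hours (this is not always midnight!) …
    base = noon.astimezone(timezone.utc) - timedelta(hours=12)
    # … then add the GTFS Time value, also in absolute terms
    return (base + timedelta(seconds=gtfs_time_s)).astimezone(tz)

berlin = ZoneInfo("Europe/Berlin")
# 26:59:00 on 2021-03-01 "extends" into the following day:
print(resolve_gtfs_time(date(2021, 3, 1), 26*3600 + 59*60, berlin).isoformat())
# → 2021-03-02T02:59:00+01:00
# 03:01:00 on 2021-03-28 (DST switch) is *not* midnight + 3h1m:
print(resolve_gtfs_time(date(2021, 3, 28), 3*3600 + 60, berlin).isoformat())
# → 2021-03-28T03:01:00+02:00
```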

gtfs-via-postgres always prioritizes correctness over speed. Because it follows the GTFS semantics, when filtering arrivals_departures by absolute departure date+time, it cannot automatically filter service_days (which is calendar and calendar_dates combined), because even a date before the date of the desired departure time frame might still end up within, when combined with a departure_time of e.g. 27:30:00; instead, it has to consider all service_days and apply the departure_time to all of them to check if they're within the range.

However, if you determine your feed's largest arrival_time/departure_time, you can filter on date when querying arrivals_departures; this allows PostgreSQL to reduce the number of joins and calendar calculations by orders of magnitude, speeding up your queries significantly. gtfs-via-postgres provides two low-level helper functions largest_arrival_time() & largest_departure_time() for this, as well as two high-level helper functions dates_filter_min(t_min) & dates_filter_max(t_max) (see below).
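The intuition behind these helpers can be sketched in plain Python (a simplified mock-up: it assumes a feed-wide largest departure_time of 27:30:00 and ignores the noon-minus-12h subtlety; the real SQL functions derive the bound from the imported data):

```python
from datetime import date, datetime, timedelta

# assumption for this sketch: the feed's largest departure_time is 27:30:00
LARGEST_DEPARTURE = timedelta(hours=27, minutes=30)

def dates_filter_min(t_min: datetime) -> date:
    # a service date d can only yield departures up to roughly d + LARGEST_DEPARTURE,
    # so service dates before t_min - LARGEST_DEPARTURE can be skipped entirely
    return (t_min - LARGEST_DEPARTURE).date()

def dates_filter_max(t_max: datetime) -> date:
    # a service date after t_max can never yield a departure <= t_max
    return t_max.date()

print(dates_filter_min(datetime(2022, 3, 23, 12, 30)))  # → 2022-03-22
print(dates_filter_max(datetime(2022, 3, 23, 12, 35)))  # → 2022-03-23
```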

For example, when querying all absolute departures at de:11000:900120003 (S Ostkreuz Bhf (Berlin)) between 2022-03-23T12:30+01 and 2022-03-23T12:35+01 within the 2022-02-25 VBB feed, filtering by date speeds it up nicely (Apple M1, PostgreSQL 14.2):

| station_id filter | date filter | query time | nr of results |
| --- | --- | --- | --- |
| de:11000:900120003 | none | 230ms | ~574k |
| de:11000:900120003 | 2022-03-13 <= date < 2022-04-08 | 105ms | ~51k |
| de:11000:900120003 | 2022-03-23 <= date < 2022-03-24 | 55ms | ~2k |
| de:11000:900120003 | 2022-03-22 < date < 2022-03-24 | 55ms | ~2k |
| none | none | 192s | 370m |
| none | 2022-03-13 <= date < 2022-04-08 | 34s | ~35m |
| none | 2022-03-22 < date < 2022-03-24 | 2.4s | ~1523k |

Using dates_filter_min(t_min) & dates_filter_max(t_max), we can easily filter by date. When filtering by t_departure (absolute departure date+time), t_min is the lower t_departure bound, whereas t_max is the upper bound. The VBB example above can be queried like this:

```sql
SELECT *
FROM arrivals_departures
-- filter by absolute departure date+time
WHERE t_departure >= '2022-03-23T12:30+01'
AND t_departure <= '2022-03-23T12:35+01'
-- allow "cutoffs" by filtering by date
AND "date" >= dates_filter_min('2022-03-23T12:30+01') -- evaluates to 2022-03-22
AND "date" <= dates_filter_max('2022-03-23T12:35+01') -- evaluates to 2022-03-23
```

Performance

With all use cases I could think of, gtfs-via-postgres is reasonably fast. If there's a particular kind of query that you think should be faster, please open an Issue!

The following benchmarks were run with the 2022-07-01 VBB GTFS dataset (41k stops, 6m stop_times, 207m arrivals/departures) using gtfs-via-postgres@4.7.4 and PostgreSQL 14.7 on an M2 laptop running macOS 12.6.8; all measurements are in milliseconds.

For each query, the measured run times are given as avg, min, p25, p50, p75, p95, p99 & max (all in milliseconds), followed by the number of iterations.

```sql
SELECT *
FROM stops
ORDER BY ST_Distance(stop_loc::geometry, ST_SetSRID(ST_MakePoint(9.7, 50.547), 4326)) ASC
LIMIT 100
```

avg 15, min 14.982, p25 15, p50 15, p75 15, p95 15, p99 15, max 15.488 (100 iterations)

```sql
SELECT *
FROM arrivals_departures
WHERE route_short_name = 'S1'
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
```

avg 61, min 60.901, p25 61, p50 61, p75 61, p95 61, p99 62, max 61.778 (100 iterations)

```sql
SELECT *
FROM arrivals_departures
WHERE station_id = 'de:11000:900100001' -- S+U Friedrichstr. (Berlin)
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
```

avg 33, min 33.129, p25 33, p50 33, p75 33, p95 33, p99 33, max 33.342 (40 iterations)

```sql
SELECT *
FROM arrivals_departures
WHERE station_id = 'de:11000:900100001' -- S+U Friedrichstr. (Berlin)
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
AND stop_sequence = 0
```

avg 5, min 4.548, p25 5, p50 5, p75 5, p95 5, p99 5, max 4.598 (50 iterations)

```sql
SELECT *
FROM arrivals_departures
WHERE stop_id = 'de:11000:900100001::4' -- S+U Friedrichstr. (Berlin)
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
```

avg 8, min 8.038, p25 8, p50 8, p75 8, p95 8, p99 8, max 8.164 (100 iterations)

```sql
SELECT *
FROM arrivals_departures
WHERE trip_id = '168977951'
AND date > '2022-08-08' AND date <= '2022-08-09'
```

avg 2, min 1.878, p25 2, p50 2, p75 2, p95 2, p99 2, max 1.911 (100 iterations)

```sql
SELECT count(*)
FROM arrivals_departures
WHERE stop_id = 'de:11000:900100001::4' -- S+U Friedrichstr. (Berlin)
```

avg 58, min 57.485, p25 58, p50 58, p75 58, p95 58, p99 58, max 57.789 (100 iterations)

```sql
SELECT count(*)
FROM arrivals_departures
WHERE stop_id = 'definitely-non-existent'
```

avg 2, min 1.832, p25 2, p50 2, p75 2, p95 2, p99 2, max 1.876 (100 iterations)

```sql
SELECT *
FROM arrivals_departures
WHERE t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02'::timestamp with time zone)
AND date <= dates_filter_max('2022-08-09T07:30+02'::timestamp with time zone)
```

avg 6310, min 6238.819, p25 6241, p50 6262, p75 6311, p95 6503, p99 6560, max 6573.768 (10 iterations)

```sql
SELECT *
FROM arrivals_departures
WHERE t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= '2022-08-08'
AND date <= '2022-08-09'
```

avg 4931, min 4914.388, p25 4925, p50 4928, p75 4937, p95 4946, p99 4948, max 4948.689 (10 iterations)

```sql
SELECT *
FROM connections
WHERE route_short_name = 'S1'
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
```

avg 164, min 163.018, p25 163, p50 164, p75 164, p95 164, p99 165, max 166.568 (100 iterations)

```sql
SELECT *
FROM connections
WHERE from_station_id = 'de:11000:900100001' -- S+U Friedrichstr. (Berlin)
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
```

avg 59, min 58.137, p25 58, p50 58, p75 59, p95 60, p99 61, max 61.461 (40 iterations)

```sql
SELECT *
FROM connections
WHERE from_station_id = 'de:11000:900100001' -- S+U Friedrichstr. (Berlin)
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
AND from_stop_sequence = 0
```

avg 7, min 7.439, p25 7, p50 7, p75 7, p95 7, p99 7, max 7.49 (50 iterations)

```sql
SELECT *
FROM connections
WHERE from_stop_id = 'de:11000:900100001::4' -- S+U Friedrichstr. (Berlin)
AND t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02')
AND date <= dates_filter_max('2022-08-09T07:30+02')
```

avg 15, min 14.529, p25 15, p50 15, p75 15, p95 15, p99 15, max 14.698 (100 iterations)

```sql
SELECT *
FROM connections
WHERE trip_id = '168977951'
AND date > '2022-08-08' AND date <= '2022-08-09'
```

avg 3, min 2.86, p25 3, p50 3, p75 3, p95 3, p99 3, max 2.931 (100 iterations)

```sql
SELECT count(*)
FROM connections
WHERE from_stop_id = 'de:11000:900100001::4' -- S+U Friedrichstr. (Berlin)
```

avg 73, min 72.687, p25 73, p50 73, p75 73, p95 73, p99 73, max 73.35 (100 iterations)

```sql
SELECT count(*)
FROM connections
WHERE from_stop_id = 'definitely-non-existent'
```

avg 3, min 3.428, p25 3, p50 3, p75 3, p95 3, p99 4, max 3.525 (100 iterations)

```sql
SELECT *
FROM connections
WHERE t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= dates_filter_min('2022-08-09T07:10+02'::timestamp with time zone)
AND date <= dates_filter_max('2022-08-09T07:30+02'::timestamp with time zone)
ORDER BY t_departure
LIMIT 100
```

avg 13127, min 13056.841, p25 13086, p50 13125, p75 13170, p95 13194, p99 13199, max 13200.027 (7 iterations)

```sql
SELECT *
FROM connections
WHERE t_departure >= '2022-08-09T07:10+02' AND t_departure <= '2022-08-09T07:30+02'
AND date >= '2022-08-08'
AND date <= '2022-08-09'
ORDER BY t_departure
LIMIT 100
```

avg 6417, min 6237.932, p25 6346, p50 6394, p75 6512, p95 6562, p99 6570, max 6571.455 (7 iterations)

```sql
SELECT *
FROM stats_by_route_date
WHERE route_id = '17452_900' -- M4
AND date >= '2022-08-08' AND date <= '2022-08-14'
AND is_effective = true
```

avg 2862, min 2853.972, p25 2860, p50 2863, p75 2863, p95 2867, p99 2867, max 2866.798 (10 iterations)

Related Projects

There are some projects that are very similar to gtfs-via-postgres:

Node-GTFS

Node-GTFS (gtfs npm package) is widely used. It covers three use cases: importing GTFS into an SQLite DB, exporting GTFS/GeoJSON from it, and generating HTML or charts for humans.

I don't use it though because

  • it doesn't handle GTFS Time values correctly (1/2, checked on 2022-03-01)
  • it doesn't always work in a streaming/iterative way (1/2, checked on 2022-03-01)
  • it sometimes does synchronous fs calls (1/2, checked on 2022-03-01)

gtfs-sequelize

gtfs-sequelize uses sequelize.js to import a GTFS feed and query the DB.

I don't use it because

  • it doesn't handle GTFS Time values correctly (1/2, checked on 2022-03-01)
  • it doesn't provide much tooling for analyzing all arrivals/departures (checked on 2022-03-01)
  • some of its operations are quite slow, because they fetch related records of a record via JS instead of using JOINs

gtfs-sql-importer

There are several forks of the original outdated project; fitnr's fork seems to be the most recent one.

The project has a slightly different goal than gtfs-via-postgres: While gtfs-sql-importer is designed to import multiple versions of a GTFS dataset in an idempotent fashion, gtfs-via-postgres assumes that one (version of a) GTFS dataset is imported into one DB exactly once.

gtfs-via-postgres aims to provide more tools – e.g. the arrivals_departures & connections views – to help with the analysis of a GTFS dataset, whereas gtfs-sql-importer just imports the data.

other related projects

  • gtfsdb – Python library for converting GTFS files into a relational database.
  • pygtfs – A Python (2/3) library for GTFS (fork of gtfs-sql)
  • gtfspy – Public transport network analysis using Python and SQLite.
  • GTFS Kit – A Python 3.6+ tool kit for analyzing General Transit Feed Specification (GTFS) data.
  • GtfsToSql – Parses a GTFS feed into an SQL database (Java)
  • gtfs-to-sqlite – A tool for generating an SQLite database from a GTFS feed. (Java)
  • gtfs-lib – Java library & CLI for importing GTFS files into a PostgreSQL database.
  • gtfs-schema – PostgreSQL schemas for GTFS feeds. (plain SQL)
  • markusvalo/HSLtraffic – Scripts to create a PostgreSQL database for HSL GTFS-data. (plain SQL)

License

This project is dual-licensed: My (@derhuerst) contributions are licensed under the Prosperity Public License, contributions of other people are licensed as Apache 2.0.

This license allows you to use and share this software for noncommercial purposes for free and to try this software for commercial purposes for thirty days.

Personal use for research, experiment, and testing for the benefit of public knowledge, personal study, private entertainment, hobby projects, amateur pursuits, or religious observance, without any anticipated commercial application, doesn’t count as use for a commercial purpose.

Get in touch with me to buy a commercial license or read more about why I sell private licenses for my projects.

Contributing

If you have a question or need support using gtfs-via-postgres, please double-check your code and setup first. If you think you have found a bug or want to propose a feature, use the issues page.

By contributing, you agree to release your modifications under the Apache 2.0 license.
