FDSNWS-Availability implementation
EIDA/ws-availability
- WFCatalog DB (gray) - database used by the WFCatalog collector and API.
- FDSNWS-Availability API (blue) - Flask-based FDSNWS-Availability implementation.
- FDSNWS-Availability Cache (green) - Redis-based cache to store restriction information.
- FDSNWS-Availability Cacher (orange) - Python-based container to harvest and store restriction information.
- FDSNWS-Availability Update (purple) - JS script to fill the `availability` materialized view using the WFCatalog `daily_streams` and `c_segments` collections.
This implementation requires MongoDB v4.2 or higher.
Clone the https://github.com/EIDA/ws-availability repository and go to its root.

Copy `config.py.sample` to `config.py` and adjust it as needed (note that there are two sections, `RUNMODE == "production"` and `RUNMODE == "test"`; for Docker deployment use the `production` section):

```python
# WFCatalog MongoDB
MONGODB_HOST = "localhost"  # MongoDB host
MONGODB_PORT = 27017  # MongoDB port
MONGODB_USR = ""  # MongoDB user
MONGODB_PWD = ""  # MongoDB password
MONGODB_NAME = "wfrepo"  # MongoDB database name
FDSNWS_STATION_URL = "https://orfeus-eu.org/fdsnws/station/1/query"  # FDSNWS-Station endpoint to harvest restriction information from
CACHE_HOST = "localhost"  # Cache host
CACHE_PORT = 6379  # Cache port
CACHE_INVENTORY_KEY = "inventory"  # Cache key for restriction information
CACHE_INVENTORY_PERIOD = 0  # Cache invalidation period for `inventory` key; 0 = never invalidate
CACHE_RESP_PERIOD = 1200  # Cache invalidation period for API response
```

Build the containers:
```bash
docker-compose -p 'fdsnws-availability' up -d --no-deps --build
```

When the Docker stack is deployed, you will see 3 containers running:
```
$ docker ps
CONTAINER ID   IMAGE                        COMMAND                  CREATED          STATUS         PORTS                      NAMES
4e3dace01fb0   fdsnws-availability_api      "/bin/bash -c 'gunic…"   10 seconds ago   Up 5 seconds   0.0.0.0:9001->9001/tcp     fdsnws-availability-api
3c91e0d1c5e6   fdsnws-availability_cacher   "/bin/bash -c 'pytho…"   10 seconds ago   Up 5 seconds   0.0.0.0:11211->11211/tcp   fdsnws-availability-cacher
d983e64d64a8   redis:7.0-alpine             "docker-entrypoint.s…"   10 seconds ago   Up 5 seconds   0.0.0.0:6379->6379/tcp     fdsnws-availability-cache
```

You can follow the logs of the `fdsnws-availability-cacher` container to see the status of restriction information harvesting:

```
$ docker logs --follow fdsnws-availability-cacher
[2023-01-11 09:47:38 +0000] [0] [INFO] Getting inventory from FDSNWS-Station...
[2023-01-11 09:47:39 +0000] [0] [INFO] Harvesting 33 from https://orfeus-eu.org/fdsnws/station/1/query?level=network: 2M,3T,6A...
# ...
[2023-02-15 08:31:56 +0000] [0] [INFO] Completed caching inventory from FDSNWS-Station
```

Once `fdsnws-availability-cacher` has completed, the container stops. The harvested information is stored in the Redis DB served by the `fdsnws-availability-cache` container. To rebuild the cache, simply start the container again:

```bash
docker start fdsnws-availability-cacher
```

To automate the cache rebuilding process, add the following line to `cron`:

```
# Rebuild FDSNWS-Availability restriction information cache daily at 3:00 AM
0 3 * * * docker restart fdsnws-availability-cacher
```

It will harvest the restriction information again and overwrite what is stored in the Redis instance.
Materialized view
Initial build
When the stack is initially deployed, the materialized view is not yet in place. To build it, issue the following command:
```
# Script started on 2023-02-24
$ mongosh -u USER -p PASSWORD --authenticationDatabase wfrepo --eval "daysBack=365" views/main.js
Processing WFCatalog entries using networks:'^.*$', stations:'^.*$', start:'2022-03-24', end:'2023-03-24' completed!
```

It will go through the documents in `daily_streams` and `c_segments` from the last year, extract availability information and store it in the `availability` materialized view.
Daily appending
To automate appending availability information, add the following line to `cron`:

```
0 6 * * * cd ~/ws-availability/views && mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo main.js > /dev/null 2>&1
```

It will go through the documents in `daily_streams` and `c_segments` from the last day, extract availability information and append it to the `availability` materialized view. If no additional parameters are provided, the script processes data from the last day:

```
# Script started on 2023-02-24
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo main.js
Processing WFCatalog entries using networks:'^.*$', stations:'^.*$', start:'2023-03-23', end:'2023-03-24' completed!
```

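The date window printed in the log line can be reproduced with a short Python sketch. It assumes, based only on the sample output, that `end` is the day the script runs and `start` is `daysBack` days earlier, with a default of one day. The function name `processing_window` is hypothetical and not part of `main.js`.

```python
from datetime import date, timedelta


def processing_window(today: date, days_back: int = 1) -> tuple[str, str]:
    """Return (start, end) as ISO dates, mirroring the sample log output.

    Assumption: `end` is the run date and `start` is `days_back` days earlier.
    """
    if days_back < 1:
        raise ValueError("days_back must be at least 1")
    start = today - timedelta(days=days_back)
    return start.isoformat(), today.isoformat()


if __name__ == "__main__":
    run_date = date(2023, 3, 24)  # run date implied by the sample output
    print(processing_window(run_date))       # daily run
    print(processing_window(run_date, 365))  # initial build
```

A helper like this can be handy for checking which days a missed cron run left unprocessed before starting a back-processing job.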
Back-processing
Processing can also be executed on a predefined subset of data using the `networks`, `stations`, `start` and `end` parameters.

```
# Last week
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "daysBack=7;" main.js
Processing WFCatalog entries using networks:'^.*$', stations:'^.*$', start:'2023-03-17', end:'2023-03-24' completed!

# January 2023
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "start='2023-01-01'; end='2023-01-31'" main.js
Processing WFCatalog entries using networks:'^.*$', stations:'^.*$', start:'2023-01-01', end:'2023-01-31' completed!

# NL.HGN data between December 2022 and January 2023
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL'; stations='HGN'; start='2022-12-01'; end='2023-01-31'" main.js
Processing WFCatalog entries using networks:'^NL$', stations:'^HGN$', start:'2022-12-01', end:'2023-01-31' completed!

# You can also use regular expressions for `networks` and `stations` params
# Please refer to https://www.mongodb.com/docs/manual/reference/operator/query/regex/ for details

# Stations from NL network matching `G*4` template with timespan from 2023-03-01 till 2023-03-02
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL'; stations='G.*4'; start='2023-03-01'; end='2023-03-02'" main.js
Processing WFCatalog entries using networks:'^NL$', stations:'^G.*4$', start:'2023-03-01', end:'2023-03-02' completed!

# Stations from `NL` or `NA` networks with station codes `HGN` or `SABA` and timespan from 2023-03-01 till 2023-03-02
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL|NA'; stations='HGN|SABA'; start='2023-03-01'; end='2023-03-02'" main.js
Processing WFCatalog entries using networks:'^NL|NA$', stations:'^HGN|SABA$', start:'2023-03-01', end:'2023-03-02' completed!

# All stations from networks `NL` and `NA` with timespan from 2023-03-01 till 2023-03-02
$ mongosh -u USERNAME -p PASSWORD --authenticationDatabase wfrepo --eval "networks='NL|NA'; start='2023-03-01'; end='2023-03-02'" main.js
Processing WFCatalog entries using networks:'^NL|NA$', stations:'^.*$', start:'2023-03-01', end:'2023-03-02' completed!
```

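The log lines above suggest that the script wraps the `networks` and `stations` values in `^...$` before matching. Because `|` binds more loosely than the anchors, a pattern such as `^NL|NA$` reads as "starts with `NL`" or "ends with `NA`" rather than "exactly `NL` or `NA`". The Python sketch below illustrates the difference with the `re` module, which treats anchors and alternation the same way for these patterns. The helper names and the sample codes `NLX` and `XNA` are hypothetical. Grouping the alternatives yourself (for example `networks='(NL|NA)'`) is a suggestion, not something this README documents.

```python
import re


def anchor(pattern: str) -> str:
    """Wrap a pattern in ^...$ the way the log output shows."""
    return f"^{pattern}$"


def matches(pattern: str, code: str) -> bool:
    """Return True if the anchored pattern is found in `code`."""
    return re.search(anchor(pattern), code) is not None


if __name__ == "__main__":
    # Columns: code, match with 'NL|NA', match with '(NL|NA)'
    for code in ("NL", "NA", "NLX", "XNA"):
        print(code, matches("NL|NA", code), matches("(NL|NA)", code))
```

With real two-character network codes the looser match rarely matters, but for station codes of varying length a grouped pattern avoids picking up unintended prefixes or suffixes.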
Indexes
It is highly recommended to create at least the following index on the `availability` materialized view. First, log in to your MongoDB instance using `mongosh` and then execute the following commands:

```
use wfrepo;
db.availability.createIndex({ net: 1, sta: 1, loc: 1, cha: 1, ts: 1, te: 1 })
```

Validation
Now it is time to check that everything is running (remember to change the `net` query parameter). The API is exposed by default on port `9001`. Let's try to get the landing page:

```
$ curl "127.0.0.1:9001"
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="author" content="gempa GmbH" />
  <title>FDSNWS-Availability</title>
</head>
<body>
  <h1>FDSNWS Availability Web Service</h1>
  <p>
    The availability web service returns detailed time span information about
    available time series data. Please refer to
    <a href="http://www.fdsn.org/webservices">http://www.fdsn.org/webservice</a>
    for a complete service description.
  </p>
  <h2>Available URLs</h2>
  <ul>
    <li><a href="query">query</a></li>
    <li><a href="extent">extent</a></li>
    <li><a href="version">version</a></li>
    <li><a href="application.wadl">application.wadl</a></li>
  </ul>
</body>
```

GET request to the `/extent` method:

```
$ curl "127.0.0.1:9001/extent?net=NA&start=2023-02-01"
#Network Station Location Channel Quality SampleRate Earliest Latest Updated TimeSpans Restriction
NA SABA BHE D 40.0 2023-02-01T00:00:00.000000Z 2023-02-14T00:00:00.000000Z 2023-02-14T07:41:14Z 1 OPEN
NA SABA BHN D 40.0 2023-02-01T00:00:00.000000Z 2023-02-14T00:00:00.000000Z 2023-02-14T07:42:07Z 1 OPEN
NA SABA BHZ D 40.0 2023-02-01T00:00:00.000000Z 2023-02-14T00:00:00.000000Z 2023-02-14T07:41:41Z 1 OPEN
# ...
```

GET request to the `/query` method:

```
$ curl "127.0.0.1:9001/query?net=NA&start=2023-02-01"
#Network Station Location Channel Quality SampleRate Earliest Latest
NA SABA BHE D 40.0 2023-02-01T00:00:00.000000Z 2023-02-02T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-02T00:00:00.000000Z 2023-02-03T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-03T00:00:00.000000Z 2023-02-04T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-04T00:00:00.000000Z 2023-02-05T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-05T00:00:00.000000Z 2023-02-06T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-06T00:00:00.000000Z 2023-02-07T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-07T00:00:00.000000Z 2023-02-08T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-08T00:00:00.000000Z 2023-02-09T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-09T00:00:00.000000Z 2023-02-10T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-10T00:00:00.000000Z 2023-02-11T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-11T00:00:00.000000Z 2023-02-12T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-12T00:00:00.000000Z 2023-02-13T00:00:00.000000Z
NA SABA BHE D 40.0 2023-02-13T00:00:00.000000Z 2023-02-14T00:00:00.000000Z
```

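For scripted checks, the text responses shown above can be requested and parsed with the Python standard library. The sketch below is a minimal example and not part of the service. The function names are hypothetical, and the base URL assumes the default local deployment on port 9001. Rows with one field fewer than the header are assumed to have an empty location code, as in the sample output.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:9001"  # assumption: default local deployment


def build_url(method: str, **params: str) -> str:
    """Build a request URL for the `query` or `extent` method."""
    return f"{BASE_URL}/{method}?{urlencode(params)}"


def fetch(url: str, timeout: float = 30.0) -> str:
    """Download a text response (requires the service to be running)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_text(body: str) -> list[dict[str, str]]:
    """Turn a whitespace-separated text response into a list of row dicts."""
    header: list[str] = []
    rows = []
    for line in body.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # The first comment line carries the column names.
            if not header:
                header = line.lstrip("#").split()
            continue
        fields = line.split()
        if len(fields) == len(header) - 1:
            # Assumption: the missing field is an empty location code.
            fields.insert(2, "")
        rows.append(dict(zip(header, fields)))
    return rows


SAMPLE = """#Network Station Location Channel Quality SampleRate Earliest Latest
NA SABA BHE D 40.0 2023-02-01T00:00:00.000000Z 2023-02-02T00:00:00.000000Z
"""

if __name__ == "__main__":
    print(build_url("query", net="NA", start="2023-02-01"))
    for row in parse_text(SAMPLE):
        print(row["Station"], row["Channel"], row["Earliest"], row["Latest"])
```

Against a running instance, `parse_text(fetch(build_url("extent", net="NA", start="2023-02-01")))` would return one dict per channel, which makes it straightforward to assert on the `Restriction` column in a smoke test.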
Reverse proxy example config
An example of an Apache reverse proxy config:

```
# FDSNWS-Availability (Docker)
<Location /fdsnws/availability/1>
    # in order to omit CORS error
    Header add Access-Control-Allow-Origin "*"
</Location>
ProxyPass /fdsnws/availability/1 <HOST>:9001 timeout=600
ProxyPassReverse /fdsnws/availability/1 <HOST>:9001 timeout=600
```

Go to the root directory.
Copy `config.py.sample` to `config.py` and adjust it as needed.

Create the virtual environment:

```bash
python3 -m venv env
```

Activate the virtual environment:

```bash
source env/bin/activate
```

Install the dependencies:

```bash
pip install -r requirements.txt
```

Create a Redis instance (mandatory for WFCatalog-based deployment):

```bash
docker run -p 6379:6379 --name cache -d redis:7.0-alpine redis-server --save 20 1 --loglevel warning
```

Build the cache:

```bash
python3 cache.py
```

Now you can either:

- Run it:

  ```bash
  RUNMODE=test FLASK_APP=start.py flask run
  # Or with gunicorn:
  RUNMODE=test gunicorn --workers 2 --timeout 60 --bind 0.0.0.0:9001 start:app
  ```

- Debug it in VS Code (F5) after selecting the "Launch (Flask)" config.

Supported `RUNMODE` values:

- `production`
- `test`
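
As an illustration of how the two `RUNMODE` sections of `config.py` could be selected at start-up, here is a minimal sketch. It is not the project's actual `config.py`. The `load_settings` helper, the `example.org` host names and the fallback to `test` when `RUNMODE` is unset are all assumptions. Only the setting names and default values come from `config.py.sample` as quoted earlier.

```python
import os
from typing import Optional

# Values shared by both modes, taken from the sample configuration.
COMMON = {
    "MONGODB_PORT": 27017,
    "MONGODB_NAME": "wfrepo",
    "CACHE_PORT": 6379,
    "CACHE_INVENTORY_KEY": "inventory",
    "CACHE_INVENTORY_PERIOD": 0,  # 0 = never invalidate
    "CACHE_RESP_PERIOD": 1200,
}

# Per-mode overrides; the production host names are placeholders (assumptions).
PER_MODE = {
    "production": {
        "MONGODB_HOST": "mongo.example.org",
        "CACHE_HOST": "cache.example.org",
    },
    "test": {
        "MONGODB_HOST": "localhost",
        "CACHE_HOST": "localhost",
    },
}


def load_settings(runmode: Optional[str] = None) -> dict:
    """Return the settings for `runmode`, defaulting to the RUNMODE variable.

    Assumption: an unset RUNMODE falls back to "test".
    """
    mode = runmode or os.environ.get("RUNMODE", "test")
    if mode not in PER_MODE:
        raise ValueError(f"Unsupported RUNMODE: {mode!r}")
    return {**COMMON, **PER_MODE[mode]}


if __name__ == "__main__":
    print(load_settings("test")["MONGODB_HOST"])
```

Rejecting unknown modes early keeps a typo such as `RUNMODE=prod` from silently starting the service against the wrong database.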
Tests can be executed from the repository root using the following command:

```bash
PYTHONPATH=./apps/ python3 -m unittest discover tests/
```

- Move restriction information from the Redis cache directly to the `db.availability` materialized view. This would imply extending the `views/main.js` script with code that harvests this information directly from the FDSNWS-Station instance.
- Change the underlying RESIF code from logic based on a list of arrays to a list of objects/dicts, which is the native MongoDB response format, to avoid casting each object/dict to an array.
This repository has been forked from gitlab.com/resif/ws-availability. Special thanks to our colleagues at RESIF for sharing their implementation of the FDSNWS-Availability web service. 💐