Backend + Frontend Code for Civic Data Lab data project (Datenvorhaben) with all.txt and &effect GmbH
CorrelAid/all-txt-textbox
This is a read-only repository / mirror containing the frontend and backend code for the textbox prototype that was developed as part of the data project (Datenvorhaben) between all.txt and the Civic Data Lab. The code was developed by &effect data solutions GmbH based on earlier ideas and scripts by the all.txt team. The code is available under the MIT license (see LICENSE and the further pointers for details). Code and documentation are in English.
Further pointers:
- Parts of the code that are significantly based on intellectual property of the all.txt team are excluded from this open source version; the affected code is commented accordingly. This affects parts of backend/src/pipelines/gender_neutralizer.py and backend/src/database/migrations/data/gender-dictionary.yaml.
- The CI/CD setup (gitlab-ci.yml) is specific to GitLab.
Please install the following requirements to get started with the local development environment.
The PostgreSQL database and Redis cache run in Docker containers. To start the database and cache, run the following command:
docker compose up
You can stop the database and cache by pressing Ctrl + C in the terminal.
To install the current setup, execute the following commands. Please ensure that you have poetry installed to manage the backend dependencies.
cd backend
poetry install --no-root
Activate the environment with:
poetry shell
To start the backend server in reload mode, run the following commands. The backend starts on port 8000 by default. The automatically generated API documentation is available at http://localhost:8000/docs.
cd backend/src
uvicorn api:app --reload
To install the current setup, execute the following commands. Please ensure that you have Node.js installed to manage the frontend dependencies.
cd frontend
npm install
To start the frontend server in development mode, run the following commands. The frontend starts on port 3000 by default and is available at http://localhost:3000.
cd frontend
npm run dev
You can stop the frontend server by pressing Ctrl + C in the terminal.
We are using SQLAlchemy for database operations. To make changes to the database, add, update, or delete a model in the models folder. For new tables, create a new file. For example, to create a new table called User, add the following code:
from datetime import datetime

from models.base import Base
from sqlalchemy import DateTime, String, func, Uuid
from sqlalchemy.orm import Mapped, mapped_column


class User(Base):
    __tablename__ = "users"

    id: Mapped[str] = mapped_column(Uuid, primary_key=True)
    name: Mapped[str] = mapped_column(String)
    email: Mapped[str] = mapped_column(String)
    created_at: Mapped[datetime] = mapped_column(DateTime, default=func.now())
After creating the model, import it in the models/__init__.py file and export it by adding User to the __all__ list. For example, to import the User model, add the following code:
from models.user import User

__all__ = [
    ...,
    "User",
]
We are using alembic for database migrations. To create a new migration, run the command below. It automatically creates a new migration file in the database/migrations/versions folder with the latest changes in the database models. rev-id is the revision id for the migration and m is the message for the migration. Please check the database/migrations/versions folder for the latest revision id, which is the highest number in the folder.
make alembic-revision rev-id="0006" m="Create users table"
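The lookup of the next revision id can be sketched in a few lines of Python. This is only an illustration, not part of the repository: it assumes that migration files start with a zero-padded numeric revision id (for example a hypothetical `0005_create_api_keys_table.py`), and the helper names are made up for this example.

```python
import re
from pathlib import Path

# Assumption: migration files start with a zero-padded numeric revision id,
# e.g. "0005_create_api_keys_table.py". Adjust the pattern if the naming differs.
REVISION_PATTERN = re.compile(r"^(\d+)")


def next_revision_id(filenames: list[str], width: int = 4) -> str:
    """Return the revision id that follows the highest numeric prefix found."""
    numbers = [
        int(match.group(1))
        for name in filenames
        if (match := REVISION_PATTERN.match(name))
    ]
    # With no numbered files at all, numbering starts at 1.
    return str(max(numbers, default=0) + 1).zfill(width)


def next_revision_id_in(versions_dir: str) -> str:
    """Convenience wrapper that scans a versions folder on disk."""
    return next_revision_id([p.name for p in Path(versions_dir).glob("*.py")])
```

Calling `next_revision_id_in("database/migrations/versions")` would then give the value to pass as `rev-id`.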
After creating the migration, apply the migration to the database by running the following command:
make alembic-upgrade
You can also downgrade the migration by running the following command:
make alembic-downgrade
The deployment is configured using Docker Stack. The configuration is stored in the docker-swarm.yaml file. The deployment is configured to run the backend, frontend, and the Redis cache on a Docker Swarm environment. The traffic is routed through a Traefik reverse proxy. The PostgreSQL database is running as a separate service on Scaleway.
The deployment can be executed in a GitLab CI/CD pipeline and is triggered by pushing changes to the remote repository. In a first step, the pipeline automatically builds the containers for the backend and frontend and pushes the images to the GitLab registry. In a second step, the deployment is executed using the docker stack deploy command on a remote Docker context. The services are deployed on a Docker Swarm cluster that currently consists of a single node. The infrastructure setup is configured in the infrastructure repository.
Secrets are stored in the GitLab CI/CD environment variables and passed into the build environment during the second step. The secrets are configured in the remote Docker context as part of the deployment and can be accessed by the configured services. The secrets are used to store the database and Redis cache credentials.
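How a service might read such a secret can be sketched as follows. Docker Swarm mounts secrets as files under /run/secrets/<name> by default; whether the services in this project read them that way is not stated here, so treat the helper below, the environment variable fallback, and the secret names used in the usage note as assumptions.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_secret(
    name: str,
    secrets_dir: str = "/run/secrets",
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Read a secret from the Docker secrets mount, falling back to the environment."""
    env = os.environ if env is None else env
    secret_file = Path(secrets_dir) / name
    if secret_file.is_file():
        # Secret files often end with a trailing newline; strip it.
        return secret_file.read_text(encoding="utf-8").strip()
    # Assumed fallback for local development: the upper-cased name as an environment variable.
    value = env.get(name.upper())
    if value is None:
        raise KeyError(f"secret {name!r} not found in {secrets_dir} or the environment")
    return value
```

A call such as `read_secret("db_password")` (a hypothetical secret name) would return the file content inside the Swarm service and the `DB_PASSWORD` environment variable during local development.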
You can start the deployment by pushing the changes to the remote repository and executing the jobs in the GitLab CI/CD pipeline linked to the latest commit.
Important
Please update the database role of the app_api user in the PostgreSQL database after making any changes to the database. The app_api user needs to have the Read/Write role for the all_txt database. The role can be updated in the Scaleway console. Please refer to the Scaleway documentation for more information on how to update the database role.
The Rate Limiter is a dependency that limits the number of requests a client can make to the server. It is based on the FastAPI-Limiter package and uses Redis as a cache. It protects the server from abuse and from being overwhelmed by too many requests. If the rate limit is exceeded, the server returns a 429 Too Many Requests status code with a Retry-After header that indicates how long the client should wait before making another request.
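On the client side, the 429 response and its Retry-After header can be handled roughly as sketched below. The sketch is independent of any HTTP library: a response is modelled as a (status code, headers) tuple, and the function names, the retry count, and the one-second default are assumptions made for this example.

```python
import time
from typing import Callable, Mapping, Tuple

# A response is modelled as (status_code, headers) to stay HTTP-library agnostic.
Response = Tuple[int, Mapping[str, str]]


def retry_after_seconds(headers: Mapping[str, str], default: float = 1.0) -> float:
    """Parse a Retry-After header given in seconds; fall back to a default."""
    raw = headers.get("Retry-After")
    try:
        return max(0.0, float(raw)) if raw is not None else default
    except ValueError:
        # Retry-After may also be an HTTP date; this sketch does not parse that form.
        return default


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() and wait as instructed whenever the server answers 429."""
    response = send()
    for _ in range(max_attempts - 1):
        if response[0] != 429:
            break
        sleep(retry_after_seconds(response[1]))
        response = send()
    return response
```

The `sleep` parameter is injectable so that the waiting behaviour can be tested without actually pausing.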
The Rate Limiter can be bypassed by providing an API key (e.g., [redacted]). The API key is used to identify the client and to allow the client to make requests to the server. The API key is generated by the server and is unique to the client. The API key is stored in the Postgres database in the api_keys table.
Please use the classmethod create_api_key of the RateLimiterWithApiKey class to generate new API keys. The method automatically writes the key to the database. You can use the comment argument to add details about the client. Each API key is uniquely identified by its id field. The key field is the actual API secret and is stored as a hash in the database. The API key that the client receives combines a prefix, the id, and the decoded secret, separated with a dash. The client needs to store the API key in a secure place and provide it with each request in the Authorization header.
curl -X POST \
  --header 'Authorization: alltxt_11...11_aQ...5C' \
  --header 'Content-Type: application/json' \
  --data '{"text":"Lehrer"}' \
  'http://localhost:8000/'
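The key scheme described above (prefix, id, secret, with only a hash of the secret stored) can be illustrated with the following sketch. It is not the implementation of RateLimiterWithApiKey: the SHA-256 hash, the UUID-based id, and the underscore separator (taken from the example request above) are assumptions, and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets
import uuid
from typing import Tuple

PREFIX = "alltxt"  # prefix as seen in the example request above


def generate_api_key() -> Tuple[str, str, str]:
    """Return (client_key, key_id, secret_hash); only id and hash would be stored."""
    key_id = uuid.uuid4().hex  # the hex form contains no underscore
    secret = secrets.token_urlsafe(32)
    secret_hash = hashlib.sha256(secret.encode()).hexdigest()
    return f"{PREFIX}_{key_id}_{secret}", key_id, secret_hash


def parse_api_key(client_key: str) -> Tuple[str, str]:
    """Split a client key into (key_id, secret)."""
    # maxsplit=2 keeps underscores inside the secret intact.
    prefix, key_id, secret = client_key.split("_", 2)
    if prefix != PREFIX:
        raise ValueError("unexpected API key prefix")
    return key_id, secret


def verify_api_key(client_key: str, stored_hash: str) -> bool:
    """Hash the presented secret and compare it to the stored hash in constant time."""
    _, secret = parse_api_key(client_key)
    candidate = hashlib.sha256(secret.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Storing only the hash means that a leaked database table does not reveal usable keys, which is why the client has to keep its copy of the full key safe.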
You can configure the rate limiter by adding the RateLimiterWithApiKey dependency to the endpoint. You can set the number of requests and the time window in the RateLimiterWithApiKey dependency. The following example limits the number of requests to 5 requests per minute. The rate limiter is applied to the /rate-limited-endpoint endpoint.
@app.post(
    path="/rate-limited-endpoint",
    dependencies=[Depends(RateLimiterWithApiKey(times=5, minutes=1))],
)
async def run_expansive_function():
    pass
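To make the meaning of "5 requests per minute" concrete, here is a minimal in-memory sketch of a fixed-window limiter. It only illustrates the general idea: the FastAPI-Limiter package keeps its counters in Redis and its exact logic may differ, and the class below is hypothetical.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Allow at most `times` requests per client within each window of `seconds`."""

    def __init__(
        self,
        times: int,
        seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.times = times
        self.seconds = seconds
        self.clock = clock
        # client id -> (window start, requests counted in this window)
        self._windows: Dict[str, Tuple[float, int]] = {}

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        start, count = self._windows.get(client_id, (now, 0))
        if now - start >= self.seconds:
            # The previous window has expired; start a fresh one.
            start, count = now, 0
        if count >= self.times:
            return False  # a server would answer 429 at this point
        self._windows[client_id] = (start, count + 1)
        return True
```

With `FixedWindowLimiter(times=5, seconds=60)`, the sixth call to `allow` for the same client within one minute returns `False`, while other clients are unaffected.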
New API keys can be generated using the create_api_key classmethod of RateLimiterWithApiKey.
import asyncio

from dependencies import RateLimiterWithApiKey


async def main():
    key = await RateLimiterWithApiKey.create_api_key("all.txt Test API Key")
    print(key)


asyncio.run(main())