Qrlew
Qrlew (/ˈkɝlu/) is the open source library that rewrites SQL queries into privacy-preserving variants using Differential Privacy (DP).
Use Qrlew if you want to bring privacy guarantees to your SQL pipelines. It is:
- SQL-to-SQL: Qrlew turns SQL queries into differentially private SQL queries that can be executed at scale on many SQL datastores, in many SQL dialects.
- Feature-rich: Qrlew covers the broadest range of SQL queries, including JOIN and nested queries.
- Privacy-optimized: Qrlew keeps track of tight bounds and ranges throughout each step, minimizing the amount of noise needed to achieve differential privacy.
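As a rough illustration of why tight bounds matter, the sketch below propagates value ranges through an arithmetic expression and derives how much one row can change a SUM. The `Interval` class, the column names and their ranges are hypothetical and are not part of Qrlew's API. The sketch also assumes one row per individual.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed range [lo, hi] of values a column or expression can take."""
    lo: float
    hi: float

    def __add__(self, other: "Interval") -> "Interval":
        # The sum of two bounded values is bounded by the sum of the bounds.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def scale(self, k: float) -> "Interval":
        # Multiplying by a negative constant swaps the bounds.
        a, b = self.lo * k, self.hi * k
        return Interval(min(a, b), max(a, b))

    def max_abs(self) -> float:
        return max(abs(self.lo), abs(self.hi))


# Hypothetical column ranges declared by the data owner.
age = Interval(0, 120)
bonus = Interval(-10, 10)

# Range of the expression `2 * age + bonus`, derived without reading any row.
expr = age.scale(2) + bonus

# Adding or removing one row changes SUM(expr) by at most this amount.
sum_sensitivity = expr.max_abs()
```

The tighter the derived range, the smaller the sensitivity, and therefore the less noise a DP mechanism needs to add.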
The rewriting process occurs in three stages: the data practitioner's query is parsed into a Relation, which is rewritten into a DP equivalent and finally executed by the data owner, which returns the privacy-safe result.
Qrlew motivations: make differential privacy affordable for analytics use cases
Qrlew assumes the central model of differential privacy, where a trusted central organization (a hospital, insurance company or utility provider), called the data owner, collects and stores personal data in a secure database and wishes to let untrusted data practitioners run SQL queries on its data.
At a high level we pursued the following requirements:
- Ease of use for the data practitioners. The data practitioners are assumed to be data experts but not privacy experts. They should be able to express their queries using the most common dialect for data analysis: SQL.
- Ease of integration for the data owner. SQL is a common language for expressing data analysis tasks, supported by most datastores of all scales.
- Simplicity for the data owner to set up privacy protection. Differential privacy is about capping the sensitivity of a result to the addition or removal of an individual, which we call the privacy unit. Qrlew assumes that the data owner can tell whether a table is public and, if it is not, can assign exactly one privacy unit to each row of data. When there are multiple related tables, Qrlew makes it easy to define the privacy units for each table transitively.
- Simple integration with synthetic data when available. Some queries are not well suited to DP rewriting (e.g. SELECT * FROM table); in those cases Qrlew can use synthetic data as a fallback if provided.
These requirements dictated the overall query rewriting architecture and features.
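To make the idea of defining privacy units transitively more concrete, here is a minimal sketch. It assumes a hypothetical schema description in which each protected table either holds the privacy-unit column itself or points to another table through a foreign key. The `SCHEMA` layout and the `privacy_unit_path` helper are illustrative only and are not Qrlew's actual configuration format.

```python
# Hypothetical schema: table and column names are made up for illustration.
SCHEMA = {
    "users": {"privacy_unit": "id"},
    "orders": {"foreign_key": ("user_id", "users")},
    "items": {"foreign_key": ("order_id", "orders")},
    "countries": {"public": True},
}


def privacy_unit_path(table, schema=SCHEMA):
    """Return the (table, column) hops from `table` to its privacy unit.

    Public tables need no privacy unit, so None is returned for them.
    A chain that ends in a table without a privacy unit or foreign key
    raises KeyError, since the description is then incomplete.
    """
    if schema[table].get("public"):
        return None
    path, seen, current = [], set(), table
    while current not in seen:
        seen.add(current)
        spec = schema[current]
        if "privacy_unit" in spec:
            return path + [(current, spec["privacy_unit"])]
        column, target = spec["foreign_key"]
        path.append((current, column))
        current = target
    raise ValueError(f"cyclic foreign keys starting from {table!r}")


items_path = privacy_unit_path("items")
```

With this description, every row of `items` is attributed to exactly one user by following `items.order_id`, then `orders.user_id`, up to `users.id`.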
How does Qrlew work?
The Qrlew library solves the problem of running a SQL query with DP guarantees in three steps:
- The SQL query submitted by the data practitioner is parsed and converted into an intermediate representation called a Relation. This Relation is designed to ease the tracking of data type ranges or possible values, and of the privacy unit in the next step.
- The Relation is rewritten into a DP variant.
- The DP variant of the Relation can be rendered as an SQL query string in any dialect.
At the end of this process, the query string can be submitted to the data store of the data owner. The output can be shared with the data practitioner or published without worrying about privacy leakage.
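The toy example below illustrates the SQL-to-SQL idea on a SUM query, using Python's built-in sqlite3 module. It is a sketch under stated assumptions, not Qrlew's output: the `visits` table, the clipping bound, and the hand-written `REWRITTEN` query are all hypothetical. The Laplace noise is also drawn in Python for simplicity rather than inside the SQL query.

```python
import random
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (patient_id INTEGER, cost REAL);
    INSERT INTO visits VALUES (1, 120.0), (1, 80.0), (2, 4000.0), (3, 50.0);
""")

# Query as a data practitioner might write it.
ORIGINAL = "SELECT SUM(cost) FROM visits"

# Hand-written stand-in for a rewritten query: each patient's total is
# clipped to [0, :clip], so one patient can shift the sum by at most :clip.
CLIP = 500.0
REWRITTEN = """
    SELECT SUM(MIN(MAX(per_unit, 0.0), :clip))
    FROM (SELECT patient_id, SUM(cost) AS per_unit
          FROM visits GROUP BY patient_id)
"""


def laplace(scale, rng):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_sum(conn, epsilon, rng):
    """Clipped sum plus Laplace noise scaled to sensitivity / epsilon."""
    (clipped_sum,) = conn.execute(REWRITTEN, {"clip": CLIP}).fetchone()
    return clipped_sum + laplace(CLIP / epsilon, rng)


exact = conn.execute(ORIGINAL).fetchone()[0]
clipped = conn.execute(REWRITTEN, {"clip": CLIP}).fetchone()[0]
noisy = dp_sum(conn, epsilon=1.0, rng=random.Random(0))
```

Clipping bounds the influence of the one very large patient total, which is what allows the noise scale to be tied to `CLIP` rather than to the largest value in the data.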
Deep Dive into Qrlew
To learn more about Qrlew, read the Qrlew white paper.
You can cite us:
@misc{grislain2024qrlew,
  title={Qrlew: Rewriting SQL into Differentially Private SQL},
  author={Nicolas Grislain and Paul Roussel and Victoria de Sainte Agathe},
  year={2024},
  eprint={2401.06273},
  archivePrefix={arXiv},
  primaryClass={cs.DB},
  url={https://arxiv.org/pdf/2401.06273.pdf}
}
Repositories
- sqlparser-rs (public, forked from apache/datafusion-sqlparser-rs): Extensible SQL Lexer and Parser for Rust