Repository for developing the project 35 - FAIRX: Quantitative bias assessment in ELIXIR biomedical data resources - for the 2021 Elixir biohackathon
social-link-analytics-group-bsc/biohackathon-project-35
The design of AI systems for health is a grand achievement of the science and technology of our times. Nevertheless, such systems learn to perform specific tasks by processing large amounts of data that are produced and stored in large biomedical repositories. The quality and content of these data have an immense impact on what and how AI learns. If the data contain biases, such as skewed representation of certain categories or missing information, the application of AI can lead to discriminatory outcomes and propagate them into society, as we recently pointed out (Cirillo et al. NPJ Digit Med. 2020 doi:10.1038/s41746-020-0288-5).

The aim of our project is to determine the extent of biases in available demographic categories (sex, age, race) in ELIXIR biomedical data repositories, which are widely used in the community to train AI systems. We aim to quantify bias and provide recommendations on how to properly use the data to develop fair and trustworthy AI, including solutions and best practices.

We have recently received endorsement and support for this project from representatives of several ELIXIR platforms, communities and focus groups, namely the Data platform, Human Data Communities, the Diversity, Equity, & Inclusion group, the Impact group, the Industry group and Communication.
Cancer, Data Platform, Federated Human Data, Human Copy Number Variation, Machine learning, Rare Disease
Project Number: 35
EasyChair Number: 61
- Davide Cirillo, davide.cirillo@bsc.es
- Nataly Buslón, nataly.buslon@bsc.es
- Task 1. Quantification of bias in selected resources
- Task 2. Evaluation of social and ethical impact
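As an illustration of what Task 1 could involve, the sketch below summarizes one demographic field (for example, sex) by its category proportions, its share of missing values, and a balance score based on normalized Shannon entropy. This is not the project's actual method: the function name `summarize_category`, the set of placeholder strings treated as missing, and the choice of entropy as a balance measure are all assumptions made for this example.

```python
from collections import Counter
from math import log

# Assumed placeholders for missing values; real resources may use others.
MISSING = {None, "", "NA", "unknown", "not reported"}


def summarize_category(values):
    """Summarize one demographic field (hypothetical helper).

    Returns counts and proportions of observed categories, the fraction
    of missing entries, and a balance score in [0, 1].
    """
    total = len(values)
    observed = [v for v in values if v not in MISSING]
    counts = Counter(observed)
    n = len(observed)
    proportions = {k: c / n for k, c in counts.items()} if n else {}

    # Normalized Shannon entropy: 1.0 means all observed categories are
    # equally represented; 0.0 means a single category (or no data).
    if len(counts) > 1:
        entropy = -sum(p * log(p) for p in proportions.values())
        balance = entropy / log(len(counts))
    else:
        balance = 0.0

    return {
        "counts": dict(counts),
        "proportions": proportions,
        "missing_rate": (total - n) / total if total else 0.0,
        "balance": balance,
    }


if __name__ == "__main__":
    # Toy data, invented for the example only.
    sex = ["female", "male", "male", "male", "", None, "female", "male"]
    print(summarize_category(sex))
```

A low balance score or a high missing rate would flag a field for closer inspection; neither number is evidence of bias on its own, since the appropriate reference distribution depends on the study population.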
- ELIXIR data resources representatives, especially designers, developers and data miners
- Computer scientists with database skills, including development and data management
- Researchers in computational biology with a strong programming background
- Researchers in social sciences with interests in biomedicine and technology
- Data scientists with strong analytical and statistical knowledge
- Bioinformaticians with knowledge of biological data resources
- Biostatisticians with interests in bias and data mining
- Researchers and practitioners in academic or industrial fields devoted to social equity
- Nataly Buslón, subgroup spokesperson
- Gemma Holliday
- Atia Cortés
- FTP access to the dataset: http://ftp.ncbi.nlm.nih.gov/dbgap/studies
- Study Submission Guide: https://www.ncbi.nlm.nih.gov/gap/docs/submissionguide/ and https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/GetPdf.cgi?document_name=HowToSubmit.pdf
- Non-NIH funded "expectations": https://osp.od.nih.gov/wp-content/uploads/Expectations_for_Non-NIH_Funded_Submission_Requests.pdf
- Basic Requirements: https://osp.od.nih.gov/wp-content/uploads/Non-NIH-Funded_Basic_Study_Information.pdf
- Template files: https://ftp.ncbi.nlm.nih.gov/dbgap/dbGaP_Submission_Guide_Templates/Individual_Submission_Templates/
- Data Access: https://www.ncbi.nlm.nih.gov/books/NBK5294/ and https://osp.od.nih.gov/wp-content/uploads/NIH_Best_Practices_for_Controlled-Access_Data_Subject_to_the_NIH_GDS_Policy.pdf
- Quality Control Errors: https://www.ncbi.nlm.nih.gov/gap/public_utils/messages/ and, for the QC process, https://www.ncbi.nlm.nih.gov/gap/docs/submissionguide/#aqcchecks
- Davide Cirillo, subgroup spokesperson
- María Morales
- Alejandro Muñoz
- Camila Pontes
- Olivier Philippe
- API Metadata documentation: https://ega-archive.org/metadata/how-to-use-the-api
- Policy documentation: https://ega-archive.org/submission/dac/documentation
- Submitter Portal: https://ega-archive.org/submission/tools/submitter-portal
- Quality Control Reports: https://ega-archive.org/about/quality-control-reports
- Implementation of the EU General Data Protection Regulation (GDPR): https://ega-archive.org/privacy-notice
- Data Access: https://ega-archive.org/access/data-access
- Download Client V3: https://ega-archive.org/download/downloader-quickguide-APIv3
- Metadata Rest Endpoints: https://ega-archive.org/metadata/how-to-use-the-api
- Aina Jené, subgroup spokesperson
- Babita Singh
- Mauricio Moldes
- Victoria Ruiz
- Diego Saby