GWAS GUI: graphical browser for the results of whole-genome association studies with high-dimensional phenotypes

Wei Chen*

Center for Statistical Genetics, Department of Biostatistics, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA

*To whom correspondence should be addressed.

Liming Liang

Center for Statistical Genetics, Department of Biostatistics, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA

Gonçalo R. Abecasis

Center for Statistical Genetics, Department of Biostatistics, University of Michigan, 1420 Washington Heights, Ann Arbor, MI 48109, USA

Associate Editor: Martin Bishop

Bioinformatics, Volume 25, Issue 2, January 2009, Pages 284–285, https://doi.org/10.1093/bioinformatics/btn600

Received: 06 August 2008

Revision received: 03 November 2008

Accepted: 14 November 2008

Published: 20 November 2008

Abstract

Summary: We describe an interactive package that provides graphical overviews of the results of whole-genome association studies in datasets with rich multi-dimensional phenotypic information, such as global surveys of gene expression. Windows, Linux and Mac binaries are available from our website.

Availability: http://www.sph.umich.edu/csg/weich/software.html

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

1 INTRODUCTION

Recently, genome-wide association scans (GWAS) have been used to successfully dissect a variety of complex traits, ranging from discrete clinical outcomes such as asthma and diabetes (Moffatt et al., 2007; Scott et al., 2007; WTCCC, 2007) to continuous traits as diverse as height, weight, global gene expression and blood lipid levels (Dixon et al., 2007; Frayling et al., 2007; Sanna et al., 2008; Scuteri et al., 2007; Willer et al., 2008). The amount of information generated in these studies is staggering, and interpreting their results requires efficient computational tools for data analysis and visualization. This challenge is most noticeable when high-dimensional data (such as microarray gene expression data or proteomics data) are analyzed. In this case, the results of whole-genome association studies can include billions of data points (Cheung et al., 2005; Dixon et al., 2007; Moffatt et al., 2007). Realizing the full benefits of these studies requires an efficient way to share data among collaborators and with other researchers, both before and after the data are published. Here, we present a tool that facilitates interactive browsing of the results from whole-genome association studies. To illustrate the capabilities of our browser, we used it to create an interactive interface for the results of a recent genome-wide association study of global gene expression (Dixon et al., 2007). The objective of the Dixon et al. (2007) study was to build a database that would allow researchers to systematically examine potential effects of disease-associated variants on transcript expression, and our interactive browser makes it easy for many researchers to explore the data.

A diverse set of statistical methods can be used to examine the association between phenotypes of interest and single nucleotide polymorphism (SNP) data. For example, χ² test statistics, P-values, effect size estimates and their standard errors, as well as SNP-specific heritability estimates, are all commonly reported in GWAS. When there are tens of thousands of phenotypic outcomes and hundreds of thousands of SNPs, the result set is usually very large, containing several million statistics and easily totaling several gigabytes. These datasets can be integrated into specialized local databases for further investigation, but it can be challenging for researchers without extensive database or programming skills to access results. Our GWAS GUI (Graphic User Interface) is intended to provide a convenient tool for interacting with arbitrary GWAS result sets and to facilitate searches and displays of GWAS results in graph or tabular form. We hope our tool will facilitate data sharing within collaborative groups and with the public at large.
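The statistics listed above are closely related to one another. As a minimal sketch, assuming a simple Wald test with one degree of freedom (an assumption made here for illustration; the browser itself displays whatever statistics the user supplies), the χ² statistic and P-value can be recovered from an effect size estimate and its standard error. The function names are hypothetical.

```python
import math


def wald_chi_square(effect: float, std_error: float) -> float:
    """Return the 1-df Wald chi-square statistic, (effect / SE)^2."""
    if std_error <= 0:
        raise ValueError("standard error must be positive")
    return (effect / std_error) ** 2


def chi_square_1df_p_value(chi_square: float) -> float:
    """Return the upper-tail P-value of a 1-df chi-square statistic.

    For one degree of freedom, P(X > x) = erfc(sqrt(x / 2)), so only the
    standard library is needed.
    """
    if chi_square < 0:
        raise ValueError("chi-square statistic cannot be negative")
    return math.erfc(math.sqrt(chi_square / 2.0))


if __name__ == "__main__":
    # Illustrative values only, not taken from any study.
    statistic = wald_chi_square(effect=0.5, std_error=0.1)
    print(statistic, chi_square_1df_p_value(statistic))
```

Tests with more degrees of freedom, or heritability estimates from variance-component models, would need a different calculation.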

2 FEATURES OF GWAS GUI BROWSER

Our GWAS GUI browser is an interactive package that facilitates rapid browsing of whole-genome association study results. It is designed to handle thousands of phenotypes, and thus can handle very rich datasets, such as those where global surveys of gene expression are combined with genome-wide SNP data. The browser also allows users to interact with the results of simpler scans, such as scans that focus on a single discrete outcome or a small number of related traits.

To evaluate the program, we have applied it to several large datasets, including a study evaluating the association between 408 273 SNPs and the levels of 54 675 transcripts representing 20 599 known genes and assessed in lymphoblastoid cell lines from approximately 400 children (Dixon et al., 2007). After this initial evaluation, we released an early version of the program, named the mRNA by SNP browser (MRBS), when the Dixon et al. (2007) paper was published. In addition to the visualization tool, the full GWAS GUI browser includes a data preparation tool that can be used to organize tabulated results into an indexed database for rapid browsing.

There are two main browsing interfaces within our browser: (i) an interface that retrieves all results for a specific trait and (ii) an interface that retrieves all results in a specific genomic region. In either view, results can typically be retrieved almost instantaneously. In the ‘trait-centric’ view, the browser can tabulate and sort a summary of user-provided association test results (e.g. effect size, standard error, heritability estimates, test statistics and P-value) and quickly generate plots that summarize the distribution of a user-specified test statistic along the genome. Alternatively, in the ‘position-centric’ view, the browser can tabulate all significant association test statistics (using a user-defined threshold) in a target region and plot the results for multiple traits. Optionally, information such as the location of nearby genes can also be displayed (Fig. 1).

For convenience, both interfaces allow the browser to link the results to external databases chosen by the user, such as the University of California Santa Cruz (UCSC) genome browser, where users can examine the genomic context of each result in detail. When the user requests a SNP that is not included in the current dataset, linkage disequilibrium (LD) and tag information from the International HapMap Consortium can be used to suggest a backup tag-SNP. Figure 1 is an illustration of the browser interface after searching for a specific SNP position using the ‘position-centric’ view. Four SNPs of interest have been highlighted by the user in the tabular view (bottom left) and are circled in the graphical view.
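The two browsing interfaces correspond to two ways of indexing the same result table. The following is a minimal in-memory sketch of that idea, not the program's actual data structures, which are not described here. The names `Result`, `ResultIndex`, `trait_view` and `position_view` are hypothetical, and the sketch assumes that a larger statistic means stronger evidence of association.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict
from typing import NamedTuple


class Result(NamedTuple):
    trait: str        # e.g. a transcript identifier
    snp: str
    chrom: str
    position: int     # base-pair coordinate
    statistic: float  # user-supplied test statistic


class ResultIndex:
    """Hypothetical index that supports both browsing views."""

    def __init__(self, results):
        self._by_trait = defaultdict(list)
        self._by_chrom = defaultdict(list)
        for row in results:
            self._by_trait[row.trait].append(row)
            self._by_chrom[row.chrom].append(row)
        # Sort each chromosome by position so a region maps to one slice.
        self._positions = {}
        for chrom, rows in self._by_chrom.items():
            rows.sort(key=lambda row: row.position)
            self._positions[chrom] = [row.position for row in rows]

    def trait_view(self, trait):
        """All results for one trait, largest statistic first."""
        rows = self._by_trait.get(trait, [])
        return sorted(rows, key=lambda row: row.statistic, reverse=True)

    def position_view(self, chrom, start, end, threshold):
        """Results for any trait in [start, end] at or above the threshold."""
        positions = self._positions.get(chrom, [])
        rows = self._by_chrom.get(chrom, [])
        lo = bisect_left(positions, start)
        hi = bisect_right(positions, end)
        hits = [row for row in rows[lo:hi] if row.statistic >= threshold]
        return sorted(hits, key=lambda row: row.statistic, reverse=True)
```

Sorting positions once and bisecting keeps each region query proportional to the number of results in the region rather than to the size of the whole dataset, which is one plausible way to achieve near-instant retrieval.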

Fig. 1. An illustration of the GWAS GUI browser interface. This example demonstrates how to display the results for a specific region. Several large statistics have been highlighted with blue circles by selecting the corresponding rows. The top transcripts ordered by maximum statistic within the region are tabulated in the right panel.

3 EXAMPLES OF APPLICATION

Allowing large groups of scientists to browse and interact with the results of large multi-dimensional GWAS can be extremely helpful. For example, prior to the publication of the Dixon et al. (2007) gene expression paper, we used an early version of our browser to share preliminary results with several colleagues. This led to the observation that SNPs in an intergenic region on chromosome 5p13 that were associated with Crohn's disease were also associated with transcript levels of PTGER4, suggesting that PTGER4 may be the primary candidate gene for Crohn's disease on chromosome 5. The Crohn's-associated SNPs are >200 kb away from the nearest annotated gene. The result is published and described in detail elsewhere (Libioulle et al., 2007). Since then, many others have browsed our results, leading to several potential links between SNPs, human disease and mRNA transcript levels.

The current version of the GWAS GUI browser program is not restricted to gene-expression data, but is intended as a general tool that provides graphical overviews of whole-genome association study results for arbitrary phenotypes. The extended program allows users to load their own data files, test statistics and genomic annotation files into the browser in a standardized text format. Generally, the traits can be any outcomes of interest, such as case–control indicators, expression values and many other continuous or categorical measurements. Arbitrary meta-data about each trait can be tracked and displayed. We expect that the browser will be particularly helpful when multiple related traits are studied. In this setting, the browser simplifies the initial comparison of signals for different related traits in regions of interest.

4 IMPLEMENTATION

The GWAS GUI browser program was implemented in C++ using the Qt4 toolkit (open-source version 4.4, Trolltech Inc.). It has been tested on Windows, Linux and Mac workstations. The system requirements depend on the size of the input datasets, which can range from a dataset examining a single trait and hundreds of thousands of genetic markers to large-scale genome-wide gene-expression datasets with tens of thousands of traits and markers. On a modern Windows workstation, the initial indexing of a set of results generated by PLINK (Purcell et al., 2007), MERLIN (Chen et al., 2007) or another whole-genome analysis tool and including approximately 300 000 SNPs requires ∼200 MB of RAM and 5–10 min of computing time. After indexing, opening the same dataset and browsing the data should be nearly instantaneous and require only 60 MB of RAM.
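The pattern described above, a slow one-time indexing pass followed by fast, low-memory browsing, can be illustrated with a byte-offset index. This is a generic sketch and not the program's actual index format, which is not specified here. It assumes a whitespace-delimited results file with a header row and a column named `SNP` (as in PLINK-style association output, to the best of our understanding). The function names `build_offset_index` and `fetch_rows` are hypothetical.

```python
from collections import defaultdict


def build_offset_index(path, key_column="SNP"):
    """Scan the results file once and record where each key's rows start.

    Returns the header column names and a dict mapping each key value to the
    byte offsets of its lines. Only keys and offsets are kept in memory.
    """
    index = defaultdict(list)
    with open(path, "rb") as handle:
        header = handle.readline().decode().split()
        column = header.index(key_column)
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            fields = line.split()
            if len(fields) > column:
                index[fields[column].decode()].append(offset)
    return header, dict(index)


def fetch_rows(path, header, offsets):
    """Jump straight to the recorded offsets and parse only those lines."""
    rows = []
    with open(path, "rb") as handle:
        for offset in offsets:
            handle.seek(offset)
            values = handle.readline().decode().split()
            rows.append(dict(zip(header, values)))
    return rows
```

After the index is built (and, in a fuller version, saved to disk), a lookup reads only the requested lines instead of rescanning the whole file, so query time and memory use no longer grow with the size of the result set.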

Funding: National Human Genome Research Institute; National Heart Lung and Blood Institute.

Conflict of Interest: G.R.A. is a Pew Scholar of the Biomedical Sciences and is supported by the Pew Charitable Trusts.

REFERENCES

Chen et al. Family-based association tests for genome-wide association scans. Am. J. Hum. Genet. 2007, pg. 913–926.

Cheung et al. Mapping determinants of human gene expression by regional and genome-wide association. Nature 2005, vol. 437, pg. 1365–1369.

Dixon et al. A genome-wide association study of global gene expression. Nat. Genet. 2007, pg. 1202–1207.

Frayling et al. A common variant in the FTO gene is associated with body mass index and predisposes to childhood and adult obesity. Science 2007, vol. 316, pg. 889–894.

Libioulle et al. Novel Crohn disease locus identified by genome-wide association maps to a gene desert on 5p13.1 and modulates expression of PTGER4. PLoS Genet. 2007, pg. e58.

Moffatt et al. Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature 2007, vol. 448, pg. 470–473.

Purcell et al. PLINK: a toolset for whole-genome association and population-based linkage analysis. Am. J. Hum. Genet. 2007, pg. 559–575.

Sanna et al. Common variants in the GDF5-UQCC region are associated with variation in human height. Nat. Genet. 2008, pg. 198–203.

Scott et al. A genome-wide association study of type 2 diabetes in Finns detects multiple susceptibility variants. Science 2007, vol. 316, pg. 1341–1345.

Scuteri et al. Genome-wide association scan shows genetic variants in the FTO gene are associated with obesity-related traits. PLoS Genet. 2007, pg. 1200–1210.

Wellcome Trust Case Control Consortium. Genome-wide association study of 14 000 cases of seven common diseases and 3000 shared controls. Nature 2007, vol. 447, pg. 661–678.

Willer et al. Newly identified loci that influence lipid concentrations and risk of coronary artery disease. Nat. Genet. 2008, pg. 161–169.

Issue Section: APPLICATIONS NOTE > Genetics and population analysis
