Genome Biol. 2001;2(9):RESEARCH0034.
doi: 10.1186/gb-2001-2-9-research0034.

Functional associations of proteins in entire genomes by means of exhaustive detection of gene fusions


A J Enright et al. Genome Biol. 2001.

Abstract

Background: It has recently been shown that the detection of gene fusion events across genomes can be used for predicting functional associations of proteins, including physical interaction or complex formation. To obtain such predictions we have made an exhaustive search for gene fusion events within 24 available completely sequenced genomes.

Results: Each genome was used as a query against the remaining 23 complete genomes to detect gene fusion events. Using an improved, fully automatic protocol, a total of 7,224 single-domain proteins that are components of gene fusions in other genomes were detected, many of which were identified for the first time. The total number of predicted pairwise functional associations is 39,730 for all genomes. Component pairs were identified by virtue of their similarity to 2,365 multidomain composite proteins. We also show for the first time that gene fusion is a complex evolutionary process with a number of contributory factors, including paralogy, genome size and phylogenetic distance. On average, 9% of genes in a given genome appear to code for single-domain, component proteins predicted to be functionally associated. These proteins are detected by an additional 4% of genes that code for fused, composite proteins.

Conclusions: These results provide an exhaustive set of functionally associated genes and also delineate the power of fusion analysis for the prediction of protein interactions.
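The core idea of fusion detection can be illustrated with a minimal sketch. It is not the authors' protocol: it assumes similarity hits are already available as (query, subject, start, end) tuples, and the names `find_fusions`, `hits` and `similar_pairs` are hypothetical. Two query proteins that match non-overlapping regions of the same protein in another genome, and that are not similar to each other, are reported as candidate components of that composite.

```python
from collections import defaultdict
from itertools import combinations


def find_fusions(hits, similar_pairs=frozenset()):
    """Return candidate (component_a, component_b, composite) triples.

    hits: iterable of (query, subject, start, end); start/end give the
          matched region on the subject (candidate composite) protein.
    similar_pairs: set of frozenset({a, b}) for query proteins that are
          similar to each other; such pairs are skipped.
    """
    by_subject = defaultdict(list)
    for query, subject, start, end in hits:
        by_subject[subject].append((query, start, end))

    fusions = []
    for subject, matches in by_subject.items():
        for (a, a_start, a_end), (b, b_start, b_end) in combinations(matches, 2):
            if a == b or frozenset((a, b)) in similar_pairs:
                continue
            # Components must align to distinct regions of the composite.
            if a_end < b_start or b_end < a_start:
                fusions.append((*sorted((a, b)), subject))
    return fusions


# Hypothetical hits of three query proteins against one protein elsewhere.
hits = [
    ("geneA", "fusedX", 1, 200),
    ("geneB", "fusedX", 220, 450),
    ("geneC", "fusedX", 150, 300),  # overlaps the regions of both A and B
]
print(find_fusions(hits))  # [('geneA', 'geneB', 'fusedX')]
```

Grouping hits by subject keeps the pairwise comparison local to each candidate composite, which matters when every genome is queried against all the others.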


Figures

Figure 1
Z-scores for component proteins. The graph illustrates the Z-score (blue bars) distribution and its cumulative sum (step function, with red rectangles) between components, for all detected fusion events (66,406 in total). The Z-score is a statistical measure of similarity for each pair of components. Components that have a Z-score similarity of less than 10, and both exhibit similarity to the same composite protein are detected as fusion events. In general, fusion events where the Z-score between components is less than 3 (marked by a vertical line) result in fewer false-positive fusion detections.
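The two cut-offs in this caption can be read as a simple filter. The sketch below assumes each candidate event already carries a precomputed Z-score between its two components; how that score is computed is not covered here, and the function and variable names are hypothetical.

```python
def classify_by_z(events, detect_cutoff=10.0, strict_cutoff=3.0):
    """Split candidate events by the Z-score between their components.

    events: iterable of (component_a, component_b, z_score).
    Returns (detected, strict): pairs with z < detect_cutoff, and the
    subset with z < strict_cutoff (expected to hold fewer false positives).
    """
    detected, strict = [], []
    for a, b, z in events:
        if z < detect_cutoff:
            detected.append((a, b))
            if z < strict_cutoff:
                strict.append((a, b))
    return detected, strict


# Hypothetical candidate events with made-up Z-scores.
events = [("a", "b", 1.5), ("c", "d", 7.0), ("e", "f", 12.0)]
detected, strict = classify_by_z(events)
print(detected)  # [('a', 'b'), ('c', 'd')]
print(strict)    # [('a', 'b')]
```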
Figure 2
Correlation of gene expression between component pairs. The graph illustrates the distributions of average correlation values of gene expression between component pairs (blue bars) and randomly selected pairs (gray bars), above a threshold value of 0.5. Inset: Distributions of average correlation values for both predicted and random associations (vertical line indicates the cut-off value of 0.5).
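As an illustration of the comparison in this figure, the sketch below scores component pairs by the correlation of their expression profiles and keeps those above the 0.5 cut-off. The caption does not say which correlation measure was used, so Pearson correlation is an assumption here, and the names `pearson`, `pairs_above` and `expr` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / (sd_x * sd_y)


def pairs_above(expr, pairs, threshold=0.5):
    """Keep the pairs whose expression profiles correlate above threshold."""
    return [(a, b) for a, b in pairs if pearson(expr[a], expr[b]) > threshold]


# Hypothetical expression profiles across four conditions.
expr = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.5],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
print(pairs_above(expr, [("g1", "g2"), ("g1", "g3")]))
```

Running the same function on randomly drawn pairs gives the background distribution against which the predicted pairs are compared.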
Figure 3
Numbers of component and composite proteins. Absolute number of (a) component and (b) composite proteins as individual cases (blue bars) and protein families (green bars), by species. Species name abbreviations as in Table 1. Data for C. elegans and D. melanogaster are clipped (1,973 and 1,981 components, 567 and 559 composites, respectively).
Figure 4
Numbers of component and composite proteins relative to genome size. Relative numbers of (a) components and (b) composites per species, as individual cases (blue bars) and protein families (green bars), normalized by total genome size (number of ORFs). Species name abbreviations as in Table 1. Average values per genome are 9% for components and 4% for composites.
Figure 5
Neighbor-joining dendrogram representing the phylogenetic proximity of each of the 24 species in terms of detected gene fusion events. The distance measure is derived from the count of composite families (see Materials and methods). Scale bar is set to indicate a distance of 10 (ranging from 0 to 100). Species name abbreviations as in Table 1. Only bootstrap values less than 100 are shown.


