Movatterモバイル変換


[0]ホーム

URL:


Skip to main page content
U.S. flag

An official website of the United States government

Dot gov

The .gov means it’s official.
Federal government websites often end in .gov or .mil. Before sharing sensitive information, make sure you’re on a federal government site.

Https

The site is secure.
Thehttps:// ensures that you are connecting to the official website and that any information you provide is encrypted and transmitted securely.

NIH NLM Logo
Log inShow account info
Access keysNCBI HomepageMyNCBI HomepageMain ContentMain Navigation
pubmed logo
Advanced Clipboard
User Guide

Full text links

BioMed Central full text link BioMed Central Free PMC article
Full text links

Actions

Share

.2005 May 2:6:110.
doi: 10.1186/1471-2105-6-110.

Ab initio identification of putative human transcription factor binding sites by comparative genomics

Affiliations

Ab initio identification of putative human transcription factor binding sites by comparative genomics

D Corà et al. BMC Bioinformatics..

Abstract

Background: Understanding transcriptional regulation of gene expression is one of the greatest challenges of modern molecular biology. A central role in this mechanism is played by transcription factors, which typically bind to specific, short DNA sequence motifs usually located in the upstream region of the regulated genes. We discuss here a simple and powerful approach for the ab initio identification of these cis-regulatory motifs. The method we present integrates several elements: human-mouse comparison, statistical analysis of genomic sequences and the concept of coregulation. We apply it to a complete scan of the human genome.

Results: By using the catalogue of conserved upstream sequences collected in the CORG database we construct sets of genes sharing the same overrepresented motif (short DNA sequence) in their upstream regions both in human and in mouse. We perform this construction for all possible motifs from 5 to 8 nucleotides in length and then filter the resulting sets looking for two types of evidence of coregulation: first, we analyze the Gene Ontology annotation of the genes in the set, searching for statistically significant common annotations; second, we analyze the expression profiles of the genes in the set as measured by microarray experiments, searching for evidence of coexpression. The sets which pass one or both filters are conjectured to contain a significant fraction of coregulated genes, and the upstream motifs characterizing the sets are thus good candidates to be the binding sites of the TF's involved in such regulation. In this way we find various known motifs and also some new candidate binding sites.

Conclusion: We have discussed a new integrated algorithm for the "ab initio" identification of transcription factor binding sites in the human genome. The method is based on three ingredients: comparative genomics, overrepresentation, different types of coregulation. The method is applied to a full-scan of the human genome, giving satisfactory results.

PubMed Disclaimer

Figures

Figure 1
Figure 1
Flow-chart of the algorithm
Figure 2
Figure 2
Histogram of the distance from the TSS of the motifs found significant by the algorithm (see text).
See this image and copyright information in PMC

Similar articles

See all similar articles

Cited by

See all "Cited by" articles

References

    1. Wassermann WW, Sandelin A. Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet. 2004;5:276–87. doi: 10.1038/nrg1315. - DOI - PubMed
    1. Pennacchio LA, Rubin EM. Genomic strategies to identify mammalian regulatory sequences. Nat Rev Genet. 2001;2:100–109. doi: 10.1038/35052548. - DOI - PubMed
    1. Hardison R. Conserved non-coding sequences are reliable guides to regulatory elements. Trends Genet. 2000;16:369–372. doi: 10.1016/S0168-9525(00)02081-3. - DOI - PubMed
    1. Duret L, Dorkeld F, Gautier C. Strong conservation of non-coding sequences during vertebrates evolution: potential involvement in post-transcriptional regulation of gene expression. Nucleic Acid Res. 1993;21:2315–2322. - PMC - PubMed
    1. Loots GG, Locksley RM, Blankespoor CM, Wang ZE, Miller W, Rubin EM, Frazer KA. Identification of a coordinate regulator of interleukins 4, 13, and 5 by cross-species sequence comparisons. Science. 2000;288:136–140. doi: 10.1126/science.288.5463.136. - DOI - PubMed

Publication types

MeSH terms

Substances

Related information

LinkOut - more resources

Full text links
BioMed Central full text link BioMed Central Free PMC article
Cite
Send To

NCBI Literature Resources

MeSHPMCBookshelfDisclaimer

The PubMed wordmark and PubMed logo are registered trademarks of the U.S. Department of Health and Human Services (HHS). Unauthorized use of these marks is strictly prohibited.


[8]ページ先頭

©2009-2025 Movatter.jp