Laurence A. Moran
Larry Moran is a Professor Emeritus in the Department of Biochemistry at the University of Toronto. You can contact him by looking up his email address on the University of Toronto website.
Sandwalk

Disclaimer
Some readers of this blog may be under the impression that my personal opinions represent the official position of Canada, the Province of Ontario, the City of Toronto, the University of Toronto, the Faculty of Medicine, or the Department of Biochemistry. All of these institutions, plus every single one of my colleagues, students, friends, and relatives, want you to know that I do not speak for them. You should also know that they don't speak for me.
What's in Your Genome?
90% of your genome is junk
Principles of Biochemistry 5th edition

Themes
Quotations
The old argument of design in nature, as given by Paley, which formerly seemed to me to be so conclusive, fails, now that the law of natural selection has been discovered. We can no longer argue that, for instance, the beautiful hinge of a bivalve shell must have been made by an intelligent being, like the hinge of a door by man. There seems to be no more design in the variability of organic beings and in the action of natural selection, than in the course which the wind blows.
Charles Darwin (c1880)
Although I am fully convinced of the truth of the views given in this volume, I by no means expect to convince experienced naturalists whose minds are stocked with a multitude of facts all viewed, during a long course of years, from a point of view directly opposite to mine. It is so easy to hide our ignorance under such expressions as "plan of creation," "unity of design," etc., and to think that we give an explanation when we only restate a fact. Any one whose disposition leads him to attach more weight to unexplained difficulties than to the explanation of a certain number of facts will certainly reject the theory.
Charles Darwin (1859)
Science reveals where religion conceals. Where religion purports to explain, it actually resorts to tautology. To assert that "God did it" is no more than an admission of ignorance dressed deceitfully as an explanation...
Peter Atkins
Essays and Articles
- What's in Your Genome
- Evolution Is a Fact
- Just-So Stories
- What Is a Gene?
- The Central Dogma of Molecular Biology
- Theistic Evolution: The Fallacy of the Middle Ground
- Why I'm not a Darwinist
- Evolution by Accident
- Evolution and Abiogenesis
- Macroevolution
- Random Genetic Drift
- Michael Denton and Molecular Clocks
- The Modern Synthesis of Genetics and Evolution
- Evolution Is a Fact and a Theory
- What Is Evolution?
- Have Humans Stopped Evolving?
- Mammalian Gene Families: Humans and Chimps Differ by 6%
- Michael Behe's Criticism of Biochemistry Textbooks
- O157:H7 Outbreak in Taco Bell Restaurants
- Calico Cats
- The HSP70 Sequence Database
Sandwalk Archive
- Stephen Jay Gould (1983) p.335
The first commandment for all versions of NOMA might be summarized by stating: "Thou shalt not mix the magisteria by claiming that God directly ordains important events in the history of nature by special interference knowable only through revelation and not accessible to science." In common parlance, we refer to such special interference as "miracle"—operationally defined as a unique and temporary suspension of natural law to reorder the facts of nature by divine fiat.
Stephen Jay Gould (1999) p.84
Quotations
My own view is that conclusions about the evolution of human behavior should be based on research at least as rigorous as that used in studying nonhuman animals. And if you read the animal behavior journals, you'll see that this requirement sets the bar pretty high, so that many assertions about evolutionary psychology sink without a trace.
Jerry Coyne
Why Evolution Is True
I once made the remark that two things disappeared in 1990: one was communism, the other was biochemistry and that only one of them should be allowed to come back.
Sydney Brenner
TIBS Dec. 2000
It is naïve to think that if a species' environment changes the species must adapt or else become extinct.... Just as a changed environment need not set in motion selection for new adaptations, new adaptations may evolve in an unchanging environment if new mutations arise that are superior to any pre-existing variations
Douglas Futuyma
One of the most frightening things in the Western world, and in this country in particular, is the number of people who believe in things that are scientifically false. If someone tells me that the earth is less than 10,000 years old, in my opinion he should see a psychiatrist.
Francis Crick
There will be no difficulty in computers being adapted to biology. There will be luddites. But they will be buried.
Sydney Brenner
An atheist before Darwin could have said, following Hume: 'I have no explanation for complex biological design. All I know is that God isn't a good explanation, so we must wait and hope that somebody comes up with a better one.' I can't help feeling that such a position, though logically sound, would have left one feeling pretty unsatisfied, and that although atheism might have been logically tenable before Darwin, Darwin made it possible to be an intellectually fulfilled atheist
Richard Dawkins
Another curious aspect of the theory of evolution is that everybody thinks he understands it. I mean philosophers, social scientists, and so on. While in fact very few people understand it, actually as it stands, even as it stood when Darwin expressed it, and even less as we now may be able to understand it in biology.
Jacques Monod
The false view of evolution as a process of global optimizing has been applied literally by engineers who, taken in by a mistaken metaphor, have attempted to find globally optimal solutions to design problems by writing programs that model evolution by natural selection.
Richard Lewontin
Principles of Biochemistry: International Edition

Biochemistry 2nd ed. (1994)




16 comments :
Yay, that's the number that I have been using.
Hi Larry,
It's actually a tough question (though recall that even the human chromosome number was debated for some time). The issue is that sequencing is actually a very bad way to estimate total genome size. On the other hand, our other methods (e.g., Feulgen densitometry, flow cytometry) are all relative estimates made using a standard of "known" (i.e., generally accepted) genome size. Human is more often a standard than a study subject, and we use 3.5pg (= 3.4Gb) in the database simply because that was widely used in the past and we can easily correct all the estimates for other species based on it if we use a single value for each standard. I suspect 3.5pg is a bit high, but you're absolutely correct that we don't entirely know how much has been missed in the sequencing programs. In any case, the human genome is very average in size for a mammal at 3.2-3.4Gb-ish.
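The picogram-to-gigabase conversion mentioned above (3.5pg = 3.4Gb) can be checked with a short sketch. The conversion factor of roughly 0.978 x 10^9 base pairs per picogram is an assumption brought in from common genome-size practice, not a figure stated in this thread, and the function names are made up for illustration.

```python
# Assumed conversion factor (not given in the thread):
# 1 pg of double-stranded DNA is roughly 0.978e9 base pairs.
BP_PER_PG = 0.978e9


def pg_to_gb(picograms: float) -> float:
    """Convert a DNA mass in picograms to gigabases (1 Gb = 1e9 bp)."""
    return picograms * BP_PER_PG / 1e9


def gb_to_pg(gigabases: float) -> float:
    """Convert a genome size in gigabases back to picograms."""
    return gigabases * 1e9 / BP_PER_PG


if __name__ == "__main__":
    # The database standard for human quoted in the comment above.
    print(f"3.5 pg is about {pg_to_gb(3.5):.2f} Gb")
    # The lower end of the 3.2-3.4 Gb range, expressed as a mass.
    print(f"3.2 Gb is about {gb_to_pg(3.2):.2f} pg")
```

Under that assumed factor, 3.5 pg rounds to the 3.4 Gb quoted in the comment, and a 3.2 Gb genome corresponds to a somewhat smaller mass, which is consistent with the suspicion that 3.5 pg is a bit high.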
[Part 1 of 2]
Forgive my ignorance, but I am trying to make sense of different descriptions of the human genome, and different descriptions of the information needed to fully specify a large mammal, like a man.
You talk about 3.5 Gb for the genome, and, giving Ray Kurzweil the benefit of the doubt, 50 million bytes after lossless compression.
If someone made extravagant claims about a computer program that runs on some unknown hardware and unknown OS, I would be unamused if they handed me a thumb-drive containing the compressed binary executable, and nothing more. This single file would demonstrate nothing.
I would demand the original source code, the specification for the code (including the business decisions the code is meant to automate, at the very least), some documentation demonstrating that I can move back and forth between points in the specification and the source code lines encoding that part of the specification, and the code for the automated tests (so an automated test can demonstrate what changes to the code will still keep it within specification, at the very least).
And maybe the same for some of the libraries and hardware, perhaps even the full specification of the libraries, OS, and hardware if they are all very novel, quite unlike any I have worked with before.
So there would be a dramatic explosion of information needed, moving from the binary executable to a bare minimum specification of a computer program as defined above.
[Part 2 of 2]
In the debate between PZ and Kurzweil, PZ makes this point:
http://scienceblogs.com/pharyngula/2010/08/ray_kurzweil_does_not_understa.php
"""
Let me give you a few specific examples of just how wrong Kurzweil's calculations are. Here are a few proteins that I plucked at random from the NIH database; all play a role in the human brain.
First up is RHEB (Ras Homolog Enriched in Brain). It's a small protein, only 184 amino acids, which Kurzweil pretends can be reduced to about 12 bytes of code in his simulation. Here's the short description.
MTOR (FRAP1; 601231) integrates protein translation with cellular nutrient status and growth signals through its participation in 2 biochemically and functionally distinct protein complexes, MTORC1 and MTORC2. MTORC1 is sensitive to rapamycin and signals downstream to activate protein translation, whereas MTORC2 is resistant to rapamycin and signals upstream to activate AKT (see 164730). The GTPase RHEB is a proximal activator of MTORC1 and translation initiation. It has the opposite effect on MTORC2, producing inhibition of the upstream AKT pathway (Mavrakis et al., 2008).
Got that? You can't understand RHEB until you understand how it interacts with three other proteins, and how it fits into a complex regulatory pathway.
"""
I am inclined to grant PZ the point, and say his understanding of the immensity of the task outstrips Kurzweil's understanding.
Would the explosion of information needed to move from the complete genome to the complete specification of a large mammal be on the same order as the explosion of information needed to move from the binary executable to a bare minimum specification of a computer program as defined above? Did I capture the gist of it, or am I hopelessly mistaken?
How do genome size estimates based on sequencing handle repetitive regions that are likely to collapse into a single contig? Do they look at read depth across those regions (i.e., similar to Eichler et al.'s early identification of segmental duplicates)?
It's a small protein, only 184 amino acids, which Kurzweil pretends can be reduced to about 12 bytes of code in his simulation.
This should qualify for the prize in most retarded understanding of biology by a non-biologist. Kurzweil has probably gone senile from popping close to a thousand pills every week in an effort to live forever.
184 aa protein contains so much information that we at present cannot handle it! The folding problem can in principle be reduced to a decryption task and at present we cannot predict protein folding with any degree of reliability without cheating. "12 bytes"!
One can pretend to not pay attention to this information because, supposedly, folded state is encoded in the sequence. Not quite! Without correct interactions with the rest of the cell, folding typically fails. (Ask anyone expressing mammalian proteins in bacteria).
And then, even if one disregards folding, this 184 aa protein interacts with >15,000 other proteins (and carbohydrates, and nucleic acids, and various small molecules). Yep, many thousands. Most of these interactions are weak and fleeting, playing no major role, but in aggregate they all matter because in sum total they make up a cell. An imaginary experiment that wiped out all of the "non-specific" interactions would most likely result in a non-functional cell or, at the very least, a very different cell.
The size of the human genome is difficult to estimate. Different individuals can have large scale sequence differences so the size of the genome in one individual will differ in size from the genome in another individual.
The original assembly was meant to be a haploid representation of the euchromatic genome. The GRC is now trying to represent large scale structural diversity, so some regions are now represented by >1 path.
The number you use in the post represents gap estimates (including heterochromatic gaps) as well as the sequence from the alternate alleles. There are 2.86 Gb of sequenced bases in the Primary assembly (the non-redundant haploid representation of the assembly). The GRC is working on trying to represent segmental duplication, and while there is still work to do the representation is good in some regions. Regions that are being worked on are being tracked by the GRC and are available from the GRC website.
grc says,
The number you use in the post represents gap estimates (including heterochromatic gaps) as well as the sequence from the alternate alleles. There are 2.86 Gb of sequenced bases in the Primary assembly (the non-redundant haploid representation of the assembly).
Thanks. I'm not an expert on the human genome sequence but neither am I uninformed. If I'm having trouble figuring out the size of the human genome, doesn't that suggest a problem?
Why don't you have a clear and concise answer to the question on the NCBI website? The number I quoted (3,156,105,057 bp) is given as "Total Sequenced Bases in Assembly."
Are you now telling me that this is a lie because it includes gap estimates?
Could you tell me what the current estimate of genome size is and how it breaks down into actual sequence (2.86 Gb?) and estimates of missing sequence? Is it 3.16 Gb?
BTW, who are you?
Larry,
My name is Deanna Church and I work with the GRC. The GRC is tasked with producing the human (and mouse and zebrafish) reference assemblies. I wasn't trying to suggest you were lying, it is just that assembly statistics can be complicated. If you look at the GRC statistics page:
http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/human/data/index.shtml
We do give an explanation of how the stats are calculated. Although I think you make a good point in that we should list the Total bases in the assembly and the Total bases in the Primary Assembly (which is meant to represent the haploid genome). We are working on publishing a paper describing our efforts but in reviewing our web page, I see we could certainly add some additional explanatory text and examples of both alternate loci and PATCHES. I'm certainly open to any suggestions that will make the site clearer!
I would like to have a copy of the human genome on DVD. LZW compressed if possible. How many DVDs would it take?
Less than 1 DVD, even without any compression. There's ~3.2 Gbp (giga-basepairs) of information. Even if you use a whole byte for each, that's only 3.2 GB (gigabytes) of data, and a DVD holds anything from 4.7 to 8.5 GB depending on whether you're talking single-layer or dual-layer.
You can bring that down by a factor of 4 with more efficient encoding (you only need two bits to encode each base pair, not a whole byte). With even minimally effective compression, it would easily fit on a single CD with room to spare.
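The arithmetic in this answer can be sketched as follows. The 3.2 Gbp figure and the 4.7 GB single-layer DVD capacity come from the comment above; the nominal 700 MiB CD capacity and the names used here are assumptions for illustration.

```python
GENOME_BP = 3.2e9                 # approximate genome size used in the thread
DVD_SINGLE_LAYER_BYTES = 4.7e9    # single-layer DVD capacity quoted above
CD_BYTES = 700 * 2**20            # nominal 700 MiB CD (assumed capacity)


def storage_bytes(n_bases: float, bits_per_base: int) -> float:
    """Bytes needed to store n_bases at a fixed number of bits per base."""
    return n_bases * bits_per_base / 8


# One whole byte (8 bits) per base, as in the naive encoding.
one_byte_per_base = storage_bytes(GENOME_BP, 8)

# Four possible bases (A, C, G, T) need only 2 bits each.
two_bits_per_base = storage_bytes(GENOME_BP, 2)

if __name__ == "__main__":
    print(f"8 bits/base: {one_byte_per_base / 1e9:.1f} GB")
    print(f"2 bits/base: {two_bits_per_base / 1e6:.0f} MB")
    print("fits on one DVD uncompressed:",
          one_byte_per_base < DVD_SINGLE_LAYER_BYTES)
    print("fits on one CD at 2 bits/base without compression:",
          two_bits_per_base < CD_BYTES)
```

At 2 bits per base the total comes to about 800 MB, slightly more than the assumed CD capacity, so the CD claim does depend on at least some compression on top of the 2-bit encoding.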
M.Gaber
I'm not a biologist but would like to ask this theoretical question. In the future, if I have the complete human genome for a particular man on a DVD or CD or whatever, is it possible (theoretically) to reproduce this genome, put it in a human egg, and get a baby clone of this man?
I have a feeling that something is missing. How can human complexity be reduced to less than 3.2 GB!
Yes.
Mohamad Gaber,
I think it's because those 3.2 GB of information produce molecules that interact in complex ways in a living system. There are feedback loops among molecules that cause different portions of that information to be expressed at different times over the life of the organism, creating, essentially, an unlimited level of complexity. Basically, it's not how much information is in the genome, but how that information is used (regulatory sequences play a large role in that).
It's not Gigabytes (GB) but Giga bits. In computers there are 8 bits per byte, but computer bits are binary where DNA has 3 base pairs. So the human genome would take up 573MB (3.2*1.5*1000*1000*1000/8/1024/1024). If that doesn't seem like a lot then remember that you could store 300 million written pages in that same space.
Post a Comment