Molecular evolution describes how inherited DNA and/or RNA change over evolutionary time, and the consequences of this for proteins and other components of cells and organisms. Molecular evolution is the basis of phylogenetic approaches to describing the tree of life. Molecular evolution overlaps with population genetics, especially on shorter timescales. Topics in molecular evolution include the origins of new genes, the genetic nature of complex traits, the genetic basis of adaptation and speciation, the evolution of development, and patterns and processes underlying genomic changes during evolution.
The history of molecular evolution starts in the early 20th century with comparative biochemistry, and the use of "fingerprinting" methods such as immune assays, gel electrophoresis, and paper chromatography in the 1950s to explore homologous proteins.[1][2] The advent of protein sequencing allowed molecular biologists to create phylogenies based on sequence comparison, and to use the differences between homologous sequences as a molecular clock to estimate the time since the most recent common ancestor.[3][1] The surprisingly large amount of molecular divergence within and between species inspired the neutral theory of molecular evolution in the late 1960s.[4][5][6] Neutral theory also provided a theoretical basis for the molecular clock, although this is not needed for the clock's validity. After the 1970s, nucleic acid sequencing allowed molecular evolution to reach beyond proteins to highly conserved ribosomal RNA sequences, the foundation of a reconceptualization of the early history of life.[1] The Society for Molecular Biology and Evolution was founded in 1982.[citation needed]

Molecular phylogenetics uses DNA, RNA, or protein sequences to resolve questions in systematics, i.e. about their correct scientific classification from the point of view of evolutionary history. The result of a molecular phylogenetic analysis is expressed in a phylogenetic tree. Phylogenetic inference is conducted using data from DNA sequencing. These sequences are aligned to identify which sites are homologous. A substitution model describes what patterns are expected to be common or rare. Sophisticated computational inference is then used to generate one or more plausible trees.[citation needed]
Some phylogenetic methods account for variation among sites and among tree branches. Different genes, e.g. hemoglobin vs. cytochrome c, generally evolve at different rates.[7] These rates are relatively constant over time (e.g., hemoglobin does not evolve at the same rate as cytochrome c, but hemoglobins from humans, mice, etc. do have comparable rates of evolution), although rapid evolution along one branch can indicate increased directional selection on that branch.[8] Purifying selection causes functionally important regions to evolve more slowly, and amino acid substitutions involving similar amino acids occur more often than dissimilar substitutions.[7]
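As a minimal sketch of what a substitution model does, the example below applies the Jukes–Cantor (JC69) correction, one of the simplest such models, to two aligned sequences. JC69 is chosen here only for illustration (the text does not name a particular model); the function names and the toy sequences are hypothetical.

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Keep only columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(p: float) -> float:
    """Expected substitutions per site under JC69, given observed difference p.

    Assumes equal base frequencies and equal rates for all substitutions.
    """
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 distance is undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


# Toy alignment (hypothetical data): 1 difference in 10 sites.
p = p_distance("ACGTACGTAC", "ACGTACGTAA")
d = jc69_distance(p)
print(f"observed p = {p:.3f}, JC69 distance = {d:.4f}")
```

The corrected distance is slightly larger than the observed proportion of differences because the model allows for multiple substitutions at the same site.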


Gene duplication can produce multiple homologous proteins (paralogs) within the same species. Phylogenetic analysis of proteins has revealed how proteins evolve and change their structure and function over time.[9][10]
For example, ribonucleotide reductase (RNR) has evolved a multitude of structural and functional variants. Class I RNRs use a ferritin subunit and differ by the metal they use as cofactors. In class II RNRs, the thiyl radical is generated using an adenosylcobalamin cofactor and these enzymes do not require additional subunits (as opposed to class I, which do). In class III RNRs, the thiyl radical is generated using S-adenosylmethionine bound to a [4Fe-4S] cluster. That is, within a single family of proteins numerous structural and functional mechanisms can evolve.[11]
In a proof-of-concept study, Bhattacharya and colleagues converted myoglobin, a non-enzymatic oxygen storage protein, into a highly efficient Kemp eliminase using only three mutations. This demonstrates that only a few mutations are needed to radically change the function of a protein.[12] Directed evolution is the attempt to engineer proteins using methods inspired by molecular evolution.
Change at one locus begins with a new mutation, which might become fixed due to some combination of natural selection, genetic drift, and gene conversion.[citation needed]

Mutations are permanent, transmissible changes to the genetic material (DNA or RNA) of a cell or virus. Mutations result from errors in DNA replication during cell division and from exposure to radiation, chemicals, other environmental stressors, viruses, or transposable elements. When point mutations to just one base-pair of the DNA fall within a region coding for a protein, they are characterized by whether they are synonymous (do not change the amino acid sequence) or non-synonymous. Other types of mutations modify larger segments of DNA and can cause duplications, insertions, deletions, inversions, and translocations.[13]
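The synonymous versus non-synonymous distinction can be sketched in a few lines. The example below assumes the standard genetic code (other codes exist, e.g. in mitochondria) and DNA codons written with T rather than U; the function name is hypothetical.

```python
# Standard genetic code in the conventional compact layout, with bases
# ordered T, C, A, G at each codon position; '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_point_mutation(codon: str, position: int, new_base: str) -> str:
    """Classify a single-base change within one codon (position is 0, 1 or 2)."""
    codon = codon.upper()
    mutant = codon[:position] + new_base.upper() + codon[position + 1:]
    if mutant == codon:
        raise ValueError("new base is identical to the original base")
    same_product = CODON_TABLE[codon] == CODON_TABLE[mutant]
    return "synonymous" if same_product else "non-synonymous"


print(classify_point_mutation("GGT", 2, "C"))  # glycine codon to glycine codon
print(classify_point_mutation("GAG", 1, "T"))  # glutamate codon to valine codon
```

In this sketch a change that creates a stop codon is reported as non-synonymous, since the encoded product changes.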
The distribution of rates for diverse kinds of mutations is called the "mutation spectrum" (see App. B of [14]). Mutations of different types occur at widely varying rates. Point mutation rates for most organisms are very low, roughly 10⁻⁹ to 10⁻⁸ per site per generation,[15] though some viruses have higher mutation rates on the order of 10⁻⁶ per site per generation.[16] Transitions (A ↔ G or C ↔ T) are more common than transversions (purine (adenine or guanine) ↔ pyrimidine (cytosine or thymine, or in RNA, uracil)).[17] Perhaps the most common type of mutation in humans is a change in the length of a short tandem repeat (e.g., the CAG repeats underlying various disease-associated mutations). Such STR mutations may occur at rates on the order of 10⁻³ per generation.[18]
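The transition/transversion distinction is easy to express in code. The sketch below assumes a DNA alphabet (A, C, G, T); the function names and the list of substitutions are hypothetical, for illustration only.

```python
PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def substitution_class(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base DNA substitution."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid DNA substitution: {ref}>{alt}")
    # A transition stays within one chemical class; a transversion switches class.
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(substitutions) -> float:
    """Ratio of transitions to transversions in an iterable of (ref, alt) pairs."""
    classes = [substitution_class(ref, alt) for ref, alt in substitutions]
    transversions = classes.count("transversion")
    if transversions == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return classes.count("transition") / transversions


# Hypothetical observations: two transitions and one transversion.
print(ts_tv_ratio([("A", "G"), ("C", "T"), ("A", "T")]))
```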
Different frequencies of different types of mutations can play an important role in evolution via bias in the introduction of variation (arrival bias), contributing to parallelism, trends, and differences in the navigability of adaptive landscapes.[19][20] Mutation bias makes systematic or predictable contributions to parallel evolution.[14] Since the 1960s, genomic GC content has been thought to reflect mutational tendencies.[21][22] Mutational biases also contribute to codon usage bias.[23] Although such hypotheses are often associated with neutrality, recent theoretical and empirical results have established that mutational tendencies can influence both neutral and adaptive evolution via bias in the introduction of variation (arrival bias).[citation needed]
Selection can occur when an allele confers greater fitness, i.e. greater ability to survive or reproduce, on the average individual that carries it. A selectionist approach emphasizes, for example, that biases in codon usage are due at least in part to the ability of even weak selection to shape molecular evolution.[24]
Selection can also operate at the gene level at the expense of organismal fitness, resulting in intragenomic conflict. This is because there can be a selective advantage for selfish genetic elements in spite of a host cost. Examples of such selfish elements include transposable elements, meiotic drivers, and selfish mitochondria.[citation needed]
Selection can be detected using methods such as the Ka/Ks ratio and the McDonald–Kreitman test. Rapid adaptive evolution is often found for genes involved in intragenomic conflict, sexual antagonistic coevolution, and the immune system.[citation needed]
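As a rough sketch of the McDonald–Kreitman logic, the example below compares the ratio of non-synonymous to synonymous changes among polymorphisms within a species with the same ratio among fixed differences between species. The neutrality index and the α estimate (the estimated fraction of non-synonymous divergence driven by positive selection) are commonly used summaries; the function name and all counts are hypothetical, and a real analysis would also test the 2×2 table for significance.

```python
def mk_summary(dn: int, ds: int, pn: int, ps: int) -> tuple[float, float]:
    """Summarize a McDonald-Kreitman table.

    dn, ds: fixed non-synonymous / synonymous differences between species
    pn, ps: non-synonymous / synonymous polymorphisms within a species
    Returns (neutrality_index, alpha).
    """
    if dn == 0 or ps == 0:
        raise ValueError("dn and ps must be non-zero for these ratios")
    # Neutrality index: (pn / ps) / (dn / ds), written to avoid nested division.
    neutrality_index = (pn * ds) / (ps * dn)
    # alpha = 1 - NI; positive values suggest an excess of non-synonymous divergence.
    alpha = 1.0 - neutrality_index
    return neutrality_index, alpha


# Hypothetical counts, for illustration only.
ni, alpha = mk_summary(dn=20, ds=40, pn=5, ps=50)
print(f"neutrality index = {ni:.2f}, alpha = {alpha:.2f}")
```

Under strict neutrality the two ratios are expected to be equal, giving a neutrality index near 1 and α near 0.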
Genetic drift is the change of allele frequencies from one generation to the next due to stochastic effects of random sampling in finite populations. These effects can accumulate until a mutation becomes fixed in a population. For neutral mutations, the rate of fixation per generation is equal to the mutation rate per replication. A relatively constant mutation rate thus produces a constant rate of change per generation (molecular clock).[citation needed]
Slightly deleterious mutations with a selection coefficient less than a threshold value of 1 / the effective population size can also fix. Many genomic features have been ascribed to accumulation of nearly neutral detrimental mutations as a result of small effective population sizes.[25] With a smaller effective population size, a larger variety of mutations will behave as if they are neutral due to inefficiency of selection.[citation needed]
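The equality between fixation rate and mutation rate follows from a short argument: in a diploid population of N individuals, about 2Nμ new neutral mutations arise per generation, and each fixes with probability equal to its starting frequency, 1/(2N), so the product is μ. The sketch below checks the second ingredient with a simple Wright-Fisher simulation. The model choice, function names, and parameter values are assumptions made for illustration.

```python
import random


def neutral_allele_fixes(n_copies: int, initial_count: int, rng: random.Random) -> bool:
    """Simulate neutral drift until the allele is lost or fixed.

    n_copies is the number of gene copies in the population (2N for diploids).
    Each generation, every copy is drawn at random from the previous generation.
    """
    count = initial_count
    while 0 < count < n_copies:
        freq = count / n_copies
        # Binomial sampling of the next generation, one gene copy at a time.
        count = sum(rng.random() < freq for _ in range(n_copies))
    return count == n_copies


def fixation_fraction(n_copies: int, initial_count: int, trials: int, seed: int = 0) -> float:
    """Fraction of independent replicate populations in which the allele fixes."""
    rng = random.Random(seed)
    fixed = sum(neutral_allele_fixes(n_copies, initial_count, rng) for _ in range(trials))
    return fixed / trials


# A neutral allele starting at frequency 0.5 should fix in roughly half of the runs.
print(fixation_fraction(n_copies=20, initial_count=10, trials=2000))
```

Starting the allele at a single copy (initial_count=1) gives a fixation fraction near 1/n_copies, though many more trials are needed for a stable estimate.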
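The 1/Ne threshold can be illustrated with Kimura's diffusion approximation for the fixation probability of a new mutation. This formula is not given in the text above and is used here as an assumption, further simplified by setting the effective population size equal to the census size N and assuming additive (genic) selection; the function name and parameter values are hypothetical.

```python
import math


def fixation_probability(s: float, n: int) -> float:
    """Approximate fixation probability of a new mutation in a diploid population.

    s: selection coefficient (negative for deleterious mutations)
    n: population size, assumed equal to the effective population size
    Uses (1 - exp(-2s)) / (1 - exp(-4ns)), with the neutral limit 1/(2n) at s = 0.
    """
    if s == 0:
        return 1.0 / (2 * n)
    try:
        # expm1 keeps precision when s is very close to zero.
        return math.expm1(-2.0 * s) / math.expm1(-4.0 * n * s)
    except OverflowError:
        # Strongly deleterious relative to 1/n: fixation is effectively impossible.
        return 0.0


n = 1000
neutral = fixation_probability(0.0, n)
for s in (-1e-6, -1e-3, -1e-2):
    ratio = fixation_probability(s, n) / neutral
    print(f"s = {s:g}: fixation probability is {ratio:.3g} times the neutral value")
```

When |s| is far below 1/n the mutation fixes almost as often as a neutral one; once |s| exceeds 1/n the probability falls off steeply, which is the sense in which a smaller population lets a wider range of mutations behave as effectively neutral.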
Gene conversion occurs during recombination, when nucleotide damage is repaired using a homologous genomic region as a template. It can be a biased process, i.e. one allele may have a higher probability of being the donor than the other in a gene conversion event. In particular, GC-biased gene conversion tends to increase the GC-content of genomes, particularly in regions with higher recombination rates.[26] There is also evidence for GC bias in the mismatch repair process.[27] It is thought that this may be an adaptation to the high rate of methyl-cytosine deamination, which can lead to C→T transitions.[citation needed]
The dynamics of biased gene conversion resemble those of natural selection, in that a favored allele will tend to increase exponentially in frequency when rare.[citation needed]
Genome size is influenced by the amount of repetitive DNA as well as the number of genes in an organism. Some organisms, such as most bacteria, Drosophila, and Arabidopsis, have particularly compact genomes with little repetitive content or non-coding DNA. Other organisms, like mammals or maize, have large amounts of repetitive DNA, long introns, and substantial spacing between genes. The C-value paradox refers to the lack of correlation between organism 'complexity' and genome size. Explanations for the so-called paradox are two-fold. First, repetitive genetic elements can comprise large portions of the genome for many organisms, thereby inflating DNA content of the haploid genome. Repetitive genetic elements are often descended from transposable elements.[citation needed]
Secondly, the number of genes is not necessarily indicative of the number of developmental stages or tissue types in an organism. An organism with few developmental stages or tissue types may have large numbers of genes that influence non-developmental phenotypes, inflating gene content relative to developmental gene families.[citation needed]
Neutral explanations for genome size suggest that when population sizes are small, many mutations become nearly neutral. Hence, in small populations repetitive content and other 'junk' DNA can accumulate without placing the organism at a competitive disadvantage. There is little evidence to suggest that genome size is under strong widespread selection in multicellular eukaryotes. Genome size, independent of gene content, correlates poorly with most physiological traits and many eukaryotes, including mammals, harbor very large amounts of repetitive DNA.[citation needed]
However, birds likely have experienced strong selection for reduced genome size, in response to changing energetic needs for flight. Birds, unlike humans, produce nucleated red blood cells, and larger nuclei lead to lower levels of oxygen transport. Bird metabolism is far higher than that of mammals, due largely to flight, and oxygen needs are high. Hence, most birds have small, compact genomes with few repetitive elements. Indirect evidence suggests that non-avian theropod dinosaur ancestors of modern birds[28] also had reduced genome sizes, consistent with endothermy and high energetic needs for running speed. Many bacteria have also experienced selection for small genome size, as time of replication and energy consumption are so tightly correlated with fitness.[citation needed]
The ant Myrmecia pilosula has only a single pair of chromosomes[29] whereas the adder's-tongue fern Ophioglossum reticulatum has up to 1260 chromosomes.[30] The number of chromosomes in an organism's genome does not necessarily correlate with the amount of DNA in its genome. The genome-wide amount of recombination is directly controlled by the number of chromosomes, with one crossover per chromosome or per chromosome arm, depending on the species.[31]
Changes in chromosome number can play a key role in speciation, as differing chromosome numbers can serve as a barrier to reproduction in hybrids. Human chromosome 2 was created from a fusion of two ancestral chromosomes that remain separate in chimpanzees, and it still contains central telomeres as well as a vestigial second centromere. Polyploidy, especially allopolyploidy, which occurs often in plants, can also result in reproductive incompatibilities with parental species. Agrodiaetus blue butterflies have diverse chromosome numbers ranging from n=10 to n=134 and additionally have one of the highest rates of speciation identified to date.[32]
Ciliate genomes house each gene in individual chromosomes.[citation needed]

In addition to the nuclear genome, endosymbiont organelles contain their own genetic material. Mitochondrial and chloroplast DNA varies across taxa, but membrane-bound proteins, especially electron transport chain constituents, are most often encoded in the organelle. Chloroplasts and mitochondria are maternally inherited in most species, as the organelles must pass through the egg. In a rare departure, some species of mussels are known to inherit mitochondria from father to son.[citation needed]
New genes arise from several different genetic mechanisms including gene duplication, de novo gene birth, retrotransposition, chimeric gene formation, recruitment of non-coding sequence into an existing gene, and gene truncation.[citation needed]
Gene duplication initially leads to redundancy. However, duplicated gene sequences can mutate to develop new functions or specialize so that the new gene performs a subset of the original ancestral functions. Retrotransposition duplicates genes by copying mRNA to DNA and inserting it into the genome. Retrogenes generally insert into new genomic locations, lack introns, and sometimes develop new expression patterns and functions.[citation needed]
Chimeric genes form when duplication, deletion, or incomplete retrotransposition combines portions of two different coding sequences to produce a novel gene sequence. Chimeras often cause regulatory changes and can shuffle protein domains to produce novel adaptive functions.[citation needed]
De novo gene birth can give rise to protein-coding genes and non-coding genes from previously non-functional DNA.[33] For instance, Levine and colleagues reported the origin of five new genes in the D. melanogaster genome.[34][35] Similar de novo origin of genes has also been shown in other organisms such as yeast,[36] rice[37] and humans.[38] De novo genes may evolve from spurious transcripts that are already expressed at low levels.[39]
Constructive neutral evolution (CNE) explains that complex systems can emerge and spread into a population through neutral transitions with the principles of excess capacity, presuppression, and ratcheting,[40][41][42] and it has been applied in areas ranging from the origins of the spliceosome to the complex interdependence of microbial communities.[43][44][45]
The Society for Molecular Biology and Evolution publishes the journals "Molecular Biology and Evolution" and "Genome Biology and Evolution" and holds an annual international meeting. Other journals dedicated to molecular evolution include Journal of Molecular Evolution and Molecular Phylogenetics and Evolution. Research in molecular evolution is also published in journals of genetics, molecular biology, genomics, systematics, and evolutionary biology.[citation needed]
It is unimportant in this connection whether selection has been negligible or self-cancelling.