Welcome to the ScienceDuo blog by Chris Wallis and Rhiannon Morris. Screeds on science and sanity from two people who understand neither.
There are 10^93 ways to build a cytochrome protein, so how do we always get the same one in the end?
For the sake of this discussion let's use a simple definition of biological evolution that most anti-evolutionists will agree on: all known living things are related by common ancestry through a process of descent with modification. We can call this Universal Common Descent (UCD). This idea is a central tenet of evolutionary theory, although there is obviously a lot more to it than this. There is also some feeling that horizontal gene transfer may complicate things at the root of the tree; however, I think this definition encompasses both the fact of evolution and the mechanism. UCD tends to be what people mean when they discuss large-scale macro-evolution, in which humans, chimps, fish, yeast and bacteria all share a common ancestor. This is the kind of evolution that no creationist believes in.
Evidence for evolution
Biological organisms can be grouped objectively into phylogenetic groupings that have a nested hierarchical structure, that is, groups within groups. The word objective is key here because, as anti-evolutionists are fond of pointing out, anything can be grouped into a nested hierarchy. Cars, chairs, buildings and just about any other object can be grouped in this way, but not objectively. In these systems characters are chosen arbitrarily and no one character could be said to be more important than another. With cars for example, a system could be made using colour, then number of seats, then engine size, but change the order or weighting of each character and you change the grouping structure altogether.
There is only one way a nested hierarchy can be arranged objectively, and that is by taking objects that reproduce and change and organising them by their similarities: more closely related objects will share more characters. This has been shown using Markovian mathematics, in which the state of a character is determined solely by the state of its parent. This is why Markov chains are used to model evolutionary processes and to draw phylogenetic trees. Given that only things that reproduce and change (organisms, languages) can be organised objectively into nested hierarchies, what does this tell us about evolution? The answer is really very simple. Objective and statistically significant phylogenetic trees (significance here reflects how unlikely it is to arrive at a particular tree by chance) can be drawn for all currently known species, using either molecular sequence data (amino acid, nucleotide) or morphological characteristics. If these species were not related by ancestry, there could be no such pattern, and certainly not a significant one. Instead, millions of trees could be drawn that were all equally supported by the data.
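To make the "state of a character depends only on its parent" idea concrete, here is a minimal toy simulation. It is only a sketch under our own assumptions, not a real phylogenetic method: the tree shape ((A,B),(C,D)), the sequence length, the per-branch mutation rate and the function names are all made up for illustration.

```python
import random

BASES = "ACGT"

def mutate(seq, rate, rng):
    """Copy seq; each site changes to a different base with probability `rate`.

    The child depends only on its parent, which is the Markov property."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)

def hamming(a, b):
    """Number of sites at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))

def simulate(length=2000, rate=0.05, seed=1):
    """Evolve one random ancestor down the toy tree ((A,B),(C,D))."""
    rng = random.Random(seed)
    root = "".join(rng.choice(BASES) for _ in range(length))
    left = mutate(root, rate, rng)    # ancestor of A and B
    right = mutate(root, rate, rng)   # ancestor of C and D
    return {
        "A": mutate(left, rate, rng),
        "B": mutate(left, rate, rng),
        "C": mutate(right, rate, rng),
        "D": mutate(right, rate, rng),
    }

if __name__ == "__main__":
    tips = simulate()
    for x, y in [("A", "B"), ("C", "D"), ("A", "C"), ("B", "D")]:
        print(x, y, hamming(tips[x], tips[y]))
```

With these settings the sister pairs (A and B, C and D) should differ at far fewer sites than the cousin pairs (A and C, B and D), so the groups-within-groups pattern can be read straight off the sequences without anyone choosing which character matters most.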
We can explain this simply using plants as an example. Plants can be grouped as vascular and non-vascular, and nested within the vascular plants are the seeded and non-seeded plants. Within the seed plants are the flowering plants (angiosperms) and non-flowering plants (gymnosperms), and within the flowering plants are the monocotyledons and the dicotyledons. It would be highly problematic for evolution if this pattern were not observed in plants, for example, if some of the non-vascular plants had seeds or flowers.
A further note on phylogenetic trees to bring the point home: not only do we have a lot of highly significant phylogenetic trees calculated for most species, we also have a highly significant consilience of independent phylogenies. This means that trees drawn for the same group of organisms using two or more independent methods (amino acid sequences and morphological characters, for example) agree with each other to a high degree of certainty. The degree of agreement for many independent phylogenies is staggeringly high. When researching for this piece we were shocked to learn that for a universal tree containing 30 taxa (for which there are about 4.95 x 10^38 possible trees), two independent measurements were made to an accuracy of better than 38 decimal places! This is more precise than many of the most accurately measured physical constants, such as the masses of the neutron, proton and electron, which are known to around 9 decimal places. It’s hard to see what possible explanation there could be for this pattern except for the obvious one: they really are all related by common ancestry.
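For anyone wondering where a number that size comes from, here is a small sketch. We are assuming the figure counts rooted, bifurcating trees with labelled tips, for which the standard count is the double factorial (2n - 3)!! = 3 x 5 x ... x (2n - 3). The function name is our own.

```python
def rooted_tree_count(n_taxa):
    """Number of rooted, bifurcating, labelled trees for n_taxa tips: (2n - 3)!!"""
    if n_taxa < 2:
        raise ValueError("need at least two taxa to make a tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3 (empty product for n = 2).
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count

if __name__ == "__main__":
    for n in (3, 4, 10, 30):
        print(n, f"{float(rooted_tree_count(n)):.3e}")
```

Three taxa give 3 possible trees and four give 15, but by 30 taxa the count is roughly 4.95 x 10^38, which is why two independent methods landing on the same tree (or very nearly the same tree) is so hard to put down to luck.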
Figure 1. A typical phylogenetic tree of over 30 taxa.
The next argument is related but more general. Across the tree of life there are a number of genes that do exactly the same thing in all organisms; that is, they are fundamental cellular genes that encode fundamental processes no matter if you are a slime mould, a cactus or a man. These can be referred to as ubiquitous genes. Cytochrome c is a well-known example, and it does exactly the same thing in all known organisms. One may then assume that, given the importance of this protein, it will be similar in most organisms for purely functional reasons, but this obvious inference is actually wrong.
Structural and genetic studies on cytochrome c have shown that the majority of its amino acids can be changed for almost any other of the 20 possible amino acids. But what does this mean? It means that cytochrome c is highly redundant and that there are lots and lots of ways you can build a fully functional cytochrome c protein. It has been estimated that there are at least 2.3 x 10^93 possible functional cytochrome c protein sequences at the amino acid level. Given this number of possible fully functional sequences, there really is no reason that any two organisms would share a significant similarity in their cytochrome c sequences unless they were related by common ancestry.
There is only one known, tried and true mechanism that allows two organisms to share a redundant sequence, and that is of course reproduction and inheritance (vertical gene transfer). So what happens when we look at the sequences for two organisms predicted by evolutionary theory to be closely related? Well, humans and chimps have exactly the same cytochrome c sequence at the protein level, a strong confirmation of recent ancestry between the two species.
Building on protein redundancy is genetic redundancy. That is, the canonical genetic code itself is informationally redundant. There are multiple codons for most of the 20 amino acids, with an average of 3 codons for each, so for any given amino acid sequence there are on average 3^n possible combinations of codons that can specify it exactly, where n is the number of amino acids in the sequence. So for a modest protein of 100 amino acids there are 3^100 possible combinations that will all make the exact same protein. Furthermore, this genetic code is preserved across all three domains of life. With only minor variations, the code of all organisms specifies the same amino acids, start codons and stop codons. So again, there is no reason for two or more organisms to share significant similarity at the genetic level unless they are genetically related. We could extend this further to morphology, where we often see rudimentary or redundant structures in organisms, which can only be (logically) explained by the process of evolution.
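The codon arithmetic is easy to check for yourself. The sketch below uses the codon counts of the standard genetic code (61 sense codons spread over 20 amino acids, so just over 3 each on average) and multiplies them up for a given protein. The function name and the tiny example peptides are our own and purely illustrative.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code (one-letter codes).
CODONS_PER_AMINO_ACID = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "F": 2, "Y": 2, "H": 2, "Q": 2, "N": 2, "K": 2, "D": 2, "E": 2, "C": 2,
    "M": 1, "W": 1,
}

def synonymous_encodings(protein):
    """Count the distinct codon sequences that encode exactly this protein."""
    return prod(CODONS_PER_AMINO_ACID[aa] for aa in protein)

if __name__ == "__main__":
    mean = sum(CODONS_PER_AMINO_ACID.values()) / len(CODONS_PER_AMINO_ACID)
    print("mean codons per amino acid:", mean)
    print("encodings of the toy peptide MLSR:", synonymous_encodings("MLSR"))
    print("3 to the power 100 is about", f"{float(3 ** 100):.2e}")
```

The exact count depends on which amino acids a protein contains (a stretch of leucines has far more possible spellings than a stretch of methionines), so three to the power n is a rule of thumb rather than an exact figure. Either way, for a 100 amino acid protein the number of equivalent DNA spellings runs to nearly fifty digits.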
In conclusion, phylogenetics shows us that only branching evolutionary relationships can be arranged objectively into nested hierarchies, and that all known organisms fit this pattern at all scales with statistical significance. Anti-evolutionists have zero explanation for this pattern, even though it has been used to predict viral epidemics and to correctly recover known genealogies. Secondly, shared ubiquitous genes with high sequence similarity, which could have been built from billions of other functional sequences, are strong evidence for common ancestry: there is no other explanation for two organisms sharing these sequences except that they are related by ancestry. There are obviously many more types of evidence for evolution; however, as molecular biologists we will leave you here for now. Let us know your thoughts on this post and we will happily expand to other evidence if there is any interest.