Welcome to the ScienceDuo blog by Chris Wallis and Rhiannon Morris. Screeds on science and sanity from two people who understand neither.

Evolution is 10^93 more likely than any other hypothesis

There are 10^93 ways to build a cytochrome protein, so how do we always get the same one in the end?

For the sake of this discussion let's use a simple definition of biological evolution that most anti-evolutionists will agree on: all known living things are related by common ancestry through a process of descent with modification. We can call this Universal Common Descent (UCD). This idea is a central tenet of evolutionary theory, although there is obviously a lot more to it than this. There is also some feeling that there may be complications at the root of the tree due to horizontal gene transfer; however, I think this definition encompasses both the fact of evolution and the mechanism. UCD tends to be what people mean when discussing large-scale macro-evolution, in which humans, chimps, fish, yeast and bacteria all share a common ancestor: the kind of evolution that no creationist believes in.

Evidence for evolution

Biological organisms can be grouped objectively into phylogenetic groupings that have a nested hierarchical structure, that is, groups within groups. The word objective is key here because, as anti-evolutionists are fond of pointing out, anything can be grouped into a nested hierarchy. Cars, chairs, buildings and just about any other object can be grouped in this way, but not objectively. In these systems characters are chosen arbitrarily and no one character could be said to be more important than another. With cars for example, a system could be made using colour, then number of seats, then engine size, but change the order or weighting of each character and you change the grouping structure altogether.

There is only one way a nested hierarchy can be arranged objectively, and that is by using objects that reproduce and change and organising them by order of similarity: more closely related objects will share more characters. This has been proven using Markovian mathematics, where the state of a character is determined solely by the state of its parent. This is why Markov chains are used to model evolutionary processes and to draw phylogenetic trees. Given that only things that reproduce and change (organisms, languages) can be organised objectively into nested hierarchies, what does this tell us about evolution? The answer is really very simple. Objective and statistically significant phylogenetic trees (significance being judged against the chance of randomly finding a particular tree) can be drawn for all currently known species from either molecular sequence data (amino acid, nucleotide) or morphological characteristics. If these species were not related by ancestry, there could be no such pattern, and certainly not a significant one; instead, millions of trees could be drawn that were all equally supported by the data.

We can explain this simply using plants as an example. Plants can be grouped as vascular and non-vascular; nested within the vascular plants are the seeded and non-seeded plants. Within the seed plants are the flowering plants (angiosperms) and non-flowering plants (gymnosperms), and within the flowering plants are the monocotyledons and the dicotyledons. It would be highly problematic for evolution if this pattern were not observed in plants, for example, if some of the non-vascular plants had seeds or flowers.

A further note on phylogenetic trees to bring the point home: not only do we have a lot of highly significant phylogenetic trees calculated for most species, we also have a highly significant consilience of independent phylogenies. This means that trees drawn for the same group of organisms using two or more independent methods (amino acid sequences and morphological characters, for example) agree with each other to a high degree of certainty. The degree of agreement for many independent phylogenies is staggeringly high. When researching for this piece we were shocked to learn that for a universal tree containing 30 taxa (for which about 4.9518 x 10^38 different trees are possible), two independent measurements were made to an accuracy of better than 38 decimal places! This is more precise than many of the most accurately measured physical constants, such as the masses of the neutron, proton and electron, which are known to around 9 decimal places. It’s hard to see what possible explanation there could be for this pattern except the obvious one: they really are all related by common ancestry.
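
As a rough check on the tree count quoted above, here is a minimal Python sketch. It assumes the figure refers to rooted, strictly bifurcating trees with labelled tips, for which the standard count is the double factorial (2n - 3)!!. The function names are our own and are only for illustration.

```python
def double_factorial(k: int) -> int:
    """Return k!! = k * (k - 2) * (k - 4) * ... down to 1 or 2."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result


def rooted_tree_count(n_taxa: int) -> int:
    """Number of rooted, bifurcating, tip-labelled trees for n_taxa taxa.

    Assumption: the post's figure counts this kind of tree, i.e. (2n - 3)!!.
    """
    if n_taxa < 2:
        raise ValueError("need at least two taxa to form a tree")
    return double_factorial(2 * n_taxa - 3)


count_30 = rooted_tree_count(30)
print(f"Possible rooted trees for 30 taxa: {count_30:.4e}")
```

Under that assumption the count for 30 taxa comes out at roughly 4.95 x 10^38, which matches the number quoted in the paragraph above.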


Figure 1. A typical phylogenetic tree of over 30 taxa.

The next argument is related but more general. Across the tree of life there are a number of genes that do exactly the same thing in all organisms; that is, they are fundamental cellular genes that encode fundamental processes whether you are a slime mould, a cactus or a man. These can be referred to as ubiquitous genes. Cytochrome c is a well-known example, and it does exactly the same thing in all known organisms. One might then assume that, given the importance of this protein, it will be similar in most organisms for purely functional reasons, but this obvious inference is actually wrong.

Structural and genetic studies on cytochrome c have shown that the majority of its amino acids can be changed for almost any other of the 20 possible amino acids. But what does this mean? It means that cytochrome c is highly redundant and that there are lots and lots of ways you can build a fully functional cytochrome c protein. It has been estimated that there are at least 2.3 x 10^93 possible functional cytochrome c protein sequences at the amino acid level. Given this number of possible fully functional sequences, there really is no reason that any two organisms would share a significant similarity in their cytochrome c sequences unless they were related by common ancestry.

There is only one known, tried and true mechanism that allows two organisms to share a redundant sequence: reproduction and inheritance (vertical gene transfer). So what happens when we look at the sequences for two organisms predicted by evolutionary theory to be closely related? Well, humans and chimps have exactly the same sequence at the protein level, a strong confirmation of recent ancestry between the two species.
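
To put a number on how surprising an identical sequence would be without inheritance, here is a small Python sketch. It assumes one specific alternative model, which is our simplification: each species independently ends up with one of the estimated 2.3 x 10^93 functional sequences, all equally likely. Under that model the chance of two species matching is 1/N. Note that this is the probability of the observation given that model, not the probability that the model itself is true. The names below are hypothetical.

```python
import math

# Estimated number of functional cytochrome c sequences quoted in the post.
N_FUNCTIONAL = 2.3e93


def log10_match_probability(n_sequences: float) -> float:
    """log10 of the chance that two independent, uniform draws coincide.

    Assumption: every functional sequence is equally likely and the two
    species acquire their sequences independently, so P(match) = 1 / N.
    """
    return -math.log10(n_sequences)


log_p = log10_match_probability(N_FUNCTIONAL)
print(f"log10 P(identical sequence | independent, uniform model) = {log_p:.2f}")
```

Working in log10 keeps the result readable; the printed value is simply the exponent of the probability under the stated model.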

Building on protein redundancy is genetic redundancy. That is, the canonical genetic code itself is informationally redundant. There are multiple codons for most of the 20 amino acids, with an average of about 3 codons for each, so for any given amino acid sequence there are on average 3^n possible combinations of codons that can specify it exactly, where n is the number of amino acids in the sequence. So for a modest protein of 100 amino acids there are about 3^100 possible combinations that will all make the exact same protein. Furthermore, this genetic code is preserved across all three domains of life. The code of all organisms specifies the same amino acids, start codons and stop codons. So again, there is no reason for two or more organisms to share significant similarity at the genetic level unless they are genetically related. We could extend this further to morphology, where we often see rudimentary or redundant structures in organisms, which can only be (logically) explained by the process of evolution.
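
The 3^n figure is an average. As a minimal sketch, the Python below counts the exact number of synonymous DNA encodings for a given peptide using the codon degeneracy of the standard genetic code (61 sense codons shared among 20 amino acids). The function name and the example peptides are our own and purely illustrative.

```python
import math

# Number of codons per amino acid in the standard genetic code
# (one-letter amino acid codes; the three stop codons are excluded).
CODON_DEGENERACY = {
    "L": 6, "S": 6, "R": 6,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "I": 3,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2,
    "K": 2, "N": 2, "Q": 2, "Y": 2,
    "M": 1, "W": 1,
}


def count_synonymous_encodings(peptide: str) -> int:
    """Number of distinct codon sequences that encode this exact peptide."""
    return math.prod(CODON_DEGENERACY[residue] for residue in peptide.upper())


# Average degeneracy is 61 / 20, which is where "about 3" comes from.
average_codons = sum(CODON_DEGENERACY.values()) / len(CODON_DEGENERACY)
print(f"Average codons per amino acid: {average_codons:.2f}")

# The post's rule of thumb for a 100-residue protein.
print(f"3^100 is about {3 ** 100:.3e}")

# Exact count for a short, made-up peptide.
print(count_synonymous_encodings("MLSR"))
```

For a real protein the exact count depends on its composition: a sequence rich in leucine, serine and arginine has far more synonymous encodings than one rich in methionine and tryptophan.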

In conclusion, phylogenetics shows us that only branching evolutionary relationships can be arranged objectively into nested hierarchies and that all known organisms fit this pattern at all scales with statistical significance. Anti-evolutionists have zero explanation for this pattern, even though it has been used to predict viral epidemics and to correctly recover known genealogies. Secondly, shared ubiquitous genes with high sequence similarity, which could have been made from billions of other functional sequences, are strong evidence for common ancestry; there is no other explanation for two organisms sharing these genes except that they are related by ancestry. There are obviously many more types of evidence for evolution; however, as molecular biologists we will leave you here for now. Let us know your thoughts on this post and we will happily expand to other evidence if there is any interest.







8 comments on “Evolution is 10^93 more likely than any other hypothesis”

  1. richarddmorey
    December 28, 2015

    You’ve committed the (quite common, unfortunately) inverse probability fallacy. The 1 in 10^93 probability appears to be from the number of ways of building cytochrome c. So, I guess you’re assuming independence and equal probability across organisms for the non-common descent model, then asking what the probability is that they share cytochrome c? That is, you’re asking “given a particular model of non-common descent, what would be the probability of observing what we see?”

    This is a very different probability from “the probability that common descent is false”. Your probability assumes a particular non-common descent model; obviously, you can’t get the probability that common descent is false from a probability that, in fact, assumes that it is false! So the probability of common descent cannot, in general, be the same as a probability which assumes that common descent is false…

    Also, even if the particular non-common descent model is false (the one you’ve assumed for the 1 in 10^93 probability), this does not mean that common descent is the only other option.

    The data do seem to *very* strongly indicate common descent, and lots of evidence points to it, but “1 in 10^93” is not the probability that evolution is false.


    • syntheticduo
      December 28, 2015

      Hi Richard, I think I know what you are getting at but given that all 10^93 cytochromes are equally functional, what probability would you give that humans and chimps have the same sequence if they were not related by common descent?


      • richarddmorey
        December 28, 2015

        In order to answer that question, one would need a probabilistic model of how those sequences arose. I’m fine with your stipulated one for the sake of example, and I’m not questioning your 1/2.3e93 probability. What I’m pointing out is that this is not the same as the probability that common descent is false. They’re not even the same *kind* of probability (the first is aleatory, the second epistemic). Crucially, the substitution of one with the other assumed here is a common fallacy. You’ve confused P(O|H) with P(H|O), where O is the observation (shared sequence) and H is the hypothesis (not common descent, in this case independent equal probability of having a sequence). With the additional confusion of “evolution is false” with your specific H.


      • syntheticduo
        December 28, 2015

        Hi Richard, I looked up the inverse probability fallacy (confusion of the inverse) and I have to agree, so thanks for pointing that out. Are you aware of a similar calculation done by Doug Theobald? I’m not sure exactly how model selection theory works but he came up with a probability of 10^2860 that all organisms are related compared to the nearest competing theory. I’d love to hear your thoughts.

        Adding to this (From second blogger on this site) – we chose the title to get attention. You are correct though, it is not entirely correct to say that is the probability of evolution being incorrect, however it serves its purpose for this blog (It really does grab attention).


      • Richard D. Morey
        December 28, 2015

        They used Bayesian model selection, which requires some prior assumptions about the models (in addition to the models themselves). I like the approach (with some caveats) and actually it’s closely related to my main line of work.


      • syntheticduo
        December 29, 2015

        Thanks Richard, so I guess in regards to this post it would be more accurate to say something like, UCD is at least 10^93 times more likely than its competing rival hypothesis?


  2. nick012000
    May 16, 2016

    >It has been estimated that there are at least 2.3 x 10^93 possible functional cytochrome c protein sequences at the amino acid level. Given this number of possible fully functional sequences there really is no reason that any two organisms would share a significant similarity in their Cytochrome C sequences unless they were related by common ancestry.

    Or unless God decided to create them that way. Your argument would work if you were assuming random chance, but maybe he just really, really liked that particular gene sequence for some reason. Who can say that they truly know the mind of God?


    • ScienceDuo
      July 23, 2016

      I have actually heard people use that as a reply, and it could be used to explain anything, so it really explains nothing. The real question then becomes, what is the better explanation?




This entry was posted on December 22, 2015 in Evolution, Science, Uncategorized.