We know that, in the process of protein synthesis, 64 unique inputs get translated to 21 unique outputs (20 amino acids plus the stop output).
Assume that a system with Y inputs can successfully carry D messages.
The formula M = D * ln(D) gives the minimum number of inputs M required for the number of messages D the system needs to carry.
I is the “Level of Inefficiency”. For those familiar with thermodynamics, think of this value as the heat generated when work is produced.
S is the minimum whole number of inputs required: mathematically, S is the ceiling of M. If we assume that DNA is binary, Y must be a power of 2 that is greater than or equal to S.
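To make the definitions of M, S and Y concrete, here is a minimal Python sketch that computes them for a few D values. The function names are my own for illustration and are not taken from the linked script, and I am assuming that Y is the smallest power of 2 that is at least S. The I value is left out because its formula is not spelled out here.

```python
import math


def min_inputs(d: int) -> float:
    """M = D * ln(D), the formula given in the text."""
    return d * math.log(d)


def whole_inputs(d: int) -> int:
    """S = ceiling of M."""
    return math.ceil(min_inputs(d))


def binary_inputs(d: int) -> int:
    """Y = a power of 2 that is >= S.

    Assumption: we take the smallest such power of 2.
    """
    s = whole_inputs(d)
    y = 1
    while y < s:
        y *= 2
    return y


# Print D, M, S and Y for the values around D = 21.
for d in (20, 21, 22):
    print(d, round(min_inputs(d), 3), whole_inputs(d), binary_inputs(d))
```

For D=21 this gives M of roughly 63.935, so S=64 and Y=64, which matches the 64 inputs / 21 outputs case discussed here.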
To reach optimal conditions, we need to minimize the I value (Level of Inefficiency). This seems to happen at the point where D=21 and S=64! Here 21 * ln(21) is roughly 63.93, so S=64, which is already a power of 2.
The table below shows the first 29 D values.
I have mathematically checked the first 10 million D values and D=21 still gives the lowest possible I value.
Here is a graph of log(I) against log(D) for the first 2 million D values:
To generate the graph above, just run this Python script: https://github.com/binaryDNA/symi_hypothesis
Is it just a random coincidence that 64 inputs and 21 outputs have the lowest possible I value (Level of Inefficiency) out of 10 million possibilities? Unlikely.
The SYMI Hypothesis raises a very interesting question, which I think mathematicians will be able to explain better in the future. If it’s proven to be correct, it will show that nature has selected the 64 inputs / 21 outputs combination because it’s the optimal way of safely transporting data with the help of an error correction mechanism.