Now we're coming back to the 1589 deletion. They were very well aware that it was relatively easy to work in the rIIB1 region, because this region is not essential for the B function. So for them this was a very fortunate situation. Okay, so they are going to use r1589. And this section of the paper, "Joining two genes together," tests their hypothesis in a very brilliant experiment: they are going to combine mutations in rIIA with r1589 and test their effect. So what they do here is take a series of mutations in rIIA and combine each of them with r1589 to see the result. If the mutation in rIIA is a base substitution (a typo, a change of one letter: you type C for T, or G for C), it will not affect the B function. If the change in the rIIA region is a frameshift, a deletion or an addition of a base, it will abolish the B function. That's the prediction for the B side. What they found is that they had a frameshift mutation, induced by proflavine, in rIIA that abolished the rIIB function. And they managed to isolate a suppressor of this mutation, just like FC1 is a suppressor of FC0. And when they put the two together, the mutant and the suppressor, they recovered the rIIB function. And finally they tested a bunch of deletions. Those that kept the B function were the ones that deleted a whole number of codons, 3N nucleotides; those that lost the function were the ones that removed 3N plus or minus one nucleotide. So with that they had essentially nailed their model by using the fusion of the two genes, the chimeric protein, okay. Now, the last figure of the paper discusses the fact that they can isolate the same kind of suppressors at the other end of the rIIB gene. This is a remnant of Crick's original idea. He started with P13, which was going to become FC0, and with P83, another proflavine-induced mutant, and he isolated suppressors at different nearby sites.
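The logic of substitutions versus frameshifts, and of 3N versus 3N ± 1 deletions, can be sketched on a toy sequence. This is just an illustration: the sequence and the helper function are invented for the example, they are not from the paper.

```python
# Toy illustration: how substitutions, single-base deletions, and 3N
# deletions affect the reading frame of a made-up sequence.

def codons(seq):
    """Split a sequence into consecutive triplets (incomplete tail dropped)."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]

wild_type = "ATGGCTACCGATTGA"
print(codons(wild_type))      # ['ATG', 'GCT', 'ACC', 'GAT', 'TGA']

# Base substitution: one codon changes, the frame downstream is intact.
substituted = "ATGGCAACCGATTGA"
print(codons(substituted))    # ['ATG', 'GCA', 'ACC', 'GAT', 'TGA']

# Single-base deletion: every codon downstream of the lesion is scrambled.
minus_one = "ATGGCTACGATTGA"  # one C removed from 'ACC'
print(codons(minus_one))      # ['ATG', 'GCT', 'ACG', 'ATT']

# Deleting a whole codon (3N bases): the frame is restored right after the gap.
minus_three = "ATGGCTGATTGA"  # 'ACC' removed
print(codons(minus_three))    # ['ATG', 'GCT', 'GAT', 'TGA']
```

A substitution is local; a frameshift corrupts everything downstream, which is why it abolishes the B function in the fusion, while a 3N deletion only removes a few residues.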
It's the same thing as with FC0; he just didn't pursue it because the genetic distances there are much smaller and the crosses are much harder to do. So this is a not fully ironed-out part of the story, although he doesn't say why he never pursued P83. Now, one of the most difficult discussions in their paper is the coding ratio: 3 or 6? This is a difficult issue. A 3-base code or a 6-base code? If the code is a triplet code, then you only have two kinds of frameshift mutations, +1 and -1: +2 is the same as -1, +3 is not a mutation at all, +4 is like +1, -2 is like +1, and so on. There are only three possible reading frames. But the code could be a code of 6 and not of 3, and then the FC mutants would necessarily be plus or minus two bases, because of the triple-mutant result. So if the code is a code of 6, and if FC0 is, say, +2 and the suppressors of FC0 are -2, there should be proflavine-induced frameshift mutants that happen to be +1, +3 or +5. None of these mutants would be suppressed by the FC mutants, and they would not suppress any of the FC mutants: they would be a completely separate class of frameshift mutants. So their reasoning is based on the fact that they isolated six mutants in rIIB1 independent of FC0, brothers and sisters of FC0 if you want, and all six could be classified as + or -. Now this is a result, but then you have to weigh this result. And to weigh it, you have to calculate the probability that, out of six mutants, you get only the even class and never the odd class. If you calculate this, you find that the probability is small, something like 0.02, but it's not negligible. The code could still have been a code of six, so they are careful in this argument. And then the last experimental part, if you want, is the notion of degeneracy. Is the code degenerate? All of that depends on how large the rIIB1 region is, and of course they don't really know exactly, but they make an estimate. Suppose the code is completely non-degenerate. Okay, now what about the degeneracy?
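The frameshift arithmetic above, and the weight of the six-mutant result, are easy to check numerically. A minimal sketch, assuming (as the weighing argument does) that odd and even frameshifts would be equally likely under a sixfold code:

```python
# Under a triplet code, only the net shift modulo 3 matters, so every
# frameshift behaves like -1, 0 (no mutation at all), or +1.
def frame_class(net_shift):
    """Reduce a net insertion/deletion count to its reading-frame class."""
    r = net_shift % 3
    return r - 3 if r == 2 else r   # map residue 2 to -1

print(frame_class(+2), frame_class(+3), frame_class(+4), frame_class(-2))
# -1 0 1 1  (+2 acts like -1, +3 like no mutation, +4 and -2 like +1)

# If the code were a code of 6, odd shifts (+1, +3, +5) would form a class
# of their own.  The chance that six independent mutants all land in the
# even class, assuming both classes are equally likely:
p_all_even = 0.5 ** 6
print(p_all_even)   # 0.015625, about 2%: small but not negligible
```

The equal-likelihood assumption is only a way to weigh the result, which is why Crick and colleagues stayed careful on this point.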
You know that there are 64 codons, right, and you know there are 20 amino acids. So a non-degenerate code must have 20 sense codons and 44 nonsense codons, okay. The probability of a codon being sense is 20/64, about 0.31, and the probability of it being nonsense is 0.69. So now you want to evaluate the chance of having a stretch of amino acids, or a stretch of codons, with or without a nonsense codon in it. For a stretch of codons without any nonsense: if it's one codon, the probability is 0.31; if it's two codons, 0.31 x 0.31; and if it's n codons, it's (0.31)^n. So if you calculate this for n = 4, which is a very small region, the probability of having only sense codons in a stretch of four is roughly 0.009. And if you take six amino acids, the probability is about 0.001. Now, they knew, or there was some guessing, that the rIIB1 region was about 30 amino acids long. And 0.31 to the power 30 is so low that the probability that the code is not degenerate is vanishingly small, and you can basically say that the code is degenerate. This is the reasoning, which they give in one sentence. Now, in this paragraph comes the only completely stupid proposal that they put forward. Because they go from the degeneracy to say that, if the code is degenerate, it could account for the major dilemma of the coding problem, and the dilemma is the following: the base composition of DNA varies a lot from organism to organism (that's one of Chargaff's observations), but the amino acid composition of the cellular proteins is very similar. Why is this completely stupid? Because the DNA composition reflects the entire genome: in a bacterium like E. coli, that's about 4,000 genes. But when you measure the amino acid composition of the cellular proteins, what you actually look at is the most abundant proteins, like the ribosome. The ribosome is about 50 proteins, and it's 1.3% of the coding capacity of the genome.
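The degeneracy estimate is just repeated multiplication. A short sketch, using the exact value 20/64 where the lecture rounds to 0.31:

```python
# A completely non-degenerate code would have 20 sense codons out of 64,
# the other 44 being nonsense.
p_sense = 20 / 64                       # 0.3125

def p_run_all_sense(n):
    """Probability that a run of n random codons contains no nonsense codon."""
    return p_sense ** n

print(p_run_all_sense(4))    # ~0.0095
print(p_run_all_sense(6))    # ~0.00093
print(p_run_all_sense(30))   # ~7e-16: for a ~30-codon rIIB1 segment,
                             # a non-degenerate code is essentially ruled out
```

The argument assumes random base composition in the dispensable segment, which is what makes the one-sentence conclusion in the paper so economical.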
So the total base composition of the genome can vary a lot, while the ribosomal genes vary very little. That's why this is a completely stupid proposal: you are comparing the whole genome with the small fraction of the genome whose proteins you actually recover. If you take a pancreatic beta cell that makes insulin, insulin is a few percent of the total protein made by that cell. If you take the adjacent cell in the exocrine pancreas, which makes the digestive enzymes, there is no insulin, but there are all these digestive proteins. Roughly, there are about 20 proteins made in such large amounts that they are the major proteins of a given cell, but the genes encoding these proteins are a tiny fraction of the total genome. But you see, this is probably one of the reasons why this paper is such a piece of art. A piece of art should have some imperfection, and this is the only imperfection in this paper. Now, at the end, Crick describes a meeting in Moscow, the Biochemical Congress in Moscow, where Nirenberg explained how he could make proteins with synthetic RNA. And they were very excited about that. And basically they say: if the coding ratio is indeed three, as our results suggest, and if the code is the same throughout nature, then the genetic code may well be solved within a year. That was quite optimistic, because it took five years; but it didn't take 30 years.