Hi, and welcome to lecture number one on genetics and society. Today we're going to cover the history of genetics. We're also going to talk about the recent developments that have happened in genomics and genetics. We're going to address the issue of how genomics is applied to humans, and the 1000 Genomes Project. And then we're going to talk about the challenges of processing all of the information that comes from all of these genomes.
What I like to do with the history of genetics is divide it into three major areas: pre-Mendel, Mendel, and then post-genomics. Pre-Mendel includes breeding and other non-genetic kinds of things that people did. This would include the breeding of plants such as wheat and the breeding of animals like dogs. And even Aristotle has in his writings some interesting observations about breeding. Before Mendel, scientists like Charles Darwin knew about heredity. They knew that there was something going on with heredity, but they didn't have the mechanism down.
Then Gregor Mendel worked on the genetics of peas, and he discovered the two major laws of genetics in the 1860s. These are the laws of segregation and independent assortment. And then in the 1950s, Francis Crick, Maurice Wilkins and Rosalind Franklin, working on the structure of DNA, discovered that it's a double helix. And then a lot of really interesting and important genetic work was done between the 1950s and the sequencing of the first full genome of an organism, Haemophilus influenzae.
So let's take a closer look at the age of genetics, that is, the age between Mendel and the sequencing of genomes. Hugo de Vries was one of the scientists who rediscovered Mendel's laws in the early 1900s. And the rediscovery of these laws formally started the field of genetics. Karl Landsteiner was a human geneticist who studied human blood groups and was one of the first to apply genetics to human populations.
And then from the 1930s to the 1970s, a formulation of genetics in the context of populations was developed, and J.B.S. Haldane was one of the formulators of that theory. In the 1940s, there was a series of critical experiments on the nature of DNA and the structure of DNA that culminated in Watson and Crick's discovery of the double helix. There are three experiments that we'll talk about: Oswald Avery's experiment, pinning down the fact that DNA is the genetic material; Matthew Meselson's work, showing how DNA replicates; and Marshall Nirenberg's work, showing how the code works.
The classical experiment that was done to show that DNA is the hereditary material is this experiment done in the 1940s by Avery and his colleagues. Avery used a clever system whereby he had two kinds of bacteria: a resistant bacterium and a bacterium that caused infection. What he would do is inject these bacteria into mice. And if the mouse died, he knew he had the bad strain of bacteria. So what he did then was start to break the bacteria up into their component parts: into their proteins and their carbohydrates and their DNA. And he would then mix these component parts with bacteria that weren't infectious, and then inject them into the mouse.
What he did in the first experiment is that he took carbohydrates from the smooth cells and mixed them with the rough cells. And then he took that and injected it into mice. If the transforming material, or the material that makes a bacterium infectious, is in the carbohydrates, then you would expect the mouse to die. The mouse lived. The next experiment he did was he took proteins from the S strain and mixed them with the R strain. He then injected that into the mouse. And again, if the mouse dies, the material that he mixed would be the transforming material, but the mouse lived.
Over here on the right, what he did was he took the DNA from the smooth strain, or the infectious strain, and treated it with deoxyribonuclease, a material that will degrade DNA. And then he injected that into the mice, and those mice lived. This experiment is the critical experiment: in this experiment, what he did was he took DNA from the smooth strain, mixed it with the rough strain, injected that into the mouse, and the mouse died. And what this indicates is that DNA then is the hereditary material and the transforming material that carries information from cell to cell.
The next experiment that helps us understand how DNA works was done by Matt Meselson in the 1950s. At that point in time, scientists knew that DNA was double helical, but they didn't know how it replicated. What Meselson did in this experiment was he labeled different samples of DNA with different isotopes, making them heavier or lighter. And once you spin this in a centrifuge, the heavier and lighter DNA migrate to different places. What he would then do is mix the two together and allow them to replicate. If the DNA replicated on itself, you would expect both bands to appear in the resulting reaction. However, if DNA replicates semi-conservatively, then you would expect a mixture of the two bands, which is exactly what we see when the experiment is done. And so he was able to show that the DNA first unwound into single strands, and then the new strands were replicated anew onto those old strands that had separated from each other.
Another very important set of experiments were the experiments that were used to crack the genetic code. That is, the code of triplet bases, Gs, As, Ts and Cs, that then code for amino acids, the 20 amino acids that are available to our cells. In this experiment, done in the 1950s and early 1960s by Marshall Nirenberg, what he did was he made synthetic RNA with all U's in it. U, U, U, U, U, U.
And then he put that into an extract and allowed that to get translated into a protein. And lo and behold, what happened when he did that was he got all phenylalanine. What this means is that UUU codes for phenylalanine. He did the same experiment for UUC and also got phenylalanine. This means that UUU and UUC both code for phenylalanine. By doing all possible combinations of Gs, As, Ts, and Cs, Nirenberg and other researchers were able to crack the three-base code for how DNA translates into protein.
Going on with the timeline: Fred Sanger, a very famous biochemist, developed DNA sequencing, using what is called Sanger sequencing. Kary Mullis, right here, developed PCR, the polymerase chain reaction, which is used in almost every aspect of modern biology. And finally, we have to keep in mind that databases were created in the 1980s to hold all of the sequences that scientists obtain in their research. Kary Mullis's development of PCR, again, revolutionized modern biology, and it's still used in almost every aspect of DNA sequencing.
Then we come into the genomics age. In 1995, the first bacterial genome was sequenced, Haemophilus influenzae, but at around the same time, other genomes were being sequenced, such as the yeast genome, the worm genome and the fly genome, all done in rapid succession before the year 2000. Following that came the Arabidopsis genome and the announcement of the first draft of the human genome in 2000. And then a whole onslaught of research on other human genomes.
Subsequent to that, there was a development of lots of brand new databases and ways to analyze data. First of all, ENCODE, which is a database that stores DNA sequences and describes them in the context of the proteins they make. Then the HapMap, which is used to discover disease loci in humans. And now we're on the verge of personal genomics.
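Going back to the code-cracking experiments for a moment, the logic of reading a message three bases at a time can be sketched in a few lines of Python. This is only an illustration, not the method the researchers used. The function name `translate` and the table `CODON_TABLE` are made up for this sketch, and the table holds only the two codons named in the lecture, so any other triplet comes back as "?".

```python
from itertools import product

# Partial codon table: only the two codons named in the lecture are filled in.
# Everything else is deliberately left out rather than guessed.
CODON_TABLE = {
    "UUU": "Phe",  # phenylalanine
    "UUC": "Phe",  # phenylalanine
}

# All possible three-base combinations of the four RNA bases.
ALL_CODONS = ["".join(triplet) for triplet in product("GAUC", repeat=3)]


def translate(rna):
    """Read an RNA string three bases at a time and look up each triplet.

    Triplets missing from CODON_TABLE are reported as "?".
    Leftover bases at the end (fewer than three) are ignored.
    """
    rna = rna.upper()
    usable_length = len(rna) - len(rna) % 3
    amino_acids = []
    for start in range(0, usable_length, 3):
        codon = rna[start:start + 3]
        amino_acids.append(CODON_TABLE.get(codon, "?"))
    return amino_acids


if __name__ == "__main__":
    print(translate("U" * 9))    # a short poly-U message
    print(translate("UUUUUC"))   # UUU followed by UUC
    print(len(ALL_CODONS))       # how many triplets had to be tested
```

The count of triplets, four bases in each of three positions, is why "all possible combinations" meant 64 separate codons to work through.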
Personal genomics is a stage in genomics where individual genomes can be sequenced, and these individual genome sequences can be used to learn things about disease states in humans.
Okay, why are we able to do all this sequencing? Why has the amount of sequence exploded? In the early 2000s, this machine was used to do a lot of sequencing. And indeed, versions of this machine were used to sequence the human genome. Just recently, in the last five years, new platforms such as 454 and Illumina have been developed. And these platforms actually sequence tens to thousands of times more DNA than the old sequencing machines. And it's this increase in throughput, this increase in the amount a single machine can do, which is causing the data deluge that we're seeing in modern genomics and genetics.
Shown here on this slide are the kinds of data that come out of these different platforms. The Sanger sequencing platform just shows several lanes from top to bottom, and the different colors on the figure are G's, A's, T's and C's. The 454 and Illumina platforms are so powerful because they can use single spots on their platforms, and each of these single spots is actually a single DNA sequence that can be generated. In this way you can get 350 million base pairs per week from a 454 and 350 billion base pairs per week from an Illumina platform.
So, a lot of human genomes have been sequenced. We start out with the NHGRI, the National Human Genome Research Institute, where they sequenced several individuals in one shot. Then we have several scientists who have had their genomes sequenced: Jim Watson, who was one of the first to have his genome sequenced, on 454; S.J. Kim, who is a Korean scientist; and Craig Venter, who is an American scientist. And then different populations of humans were sequenced. These African gentlemen had their genomes sequenced about five years ago. And this opens up the possibility for understanding genetics.
Then what we start to see are individual genomes being produced for people like George Church, who's a scientist, Dr. Lupski, who's also a scientist, and some celebrities. And then finally, we get a Neanderthal genome and a [INAUDIBLE] genome, genomes from fossils.
Now, all of this information is an overflow. What we see in the blue line here is Moore's Law. This is a law that says computing power will double every 18 to 24 months. This is the number of websites, which has grown from the year 1990 to the present. And this is what genomes are doing, and if we could draw this, it would go straight up in the air for many, many yards. All of this information is making it very, very difficult for researchers to handle the information. It's somewhat of a data overflow, but because the data are so important for our health and our understanding of our being human, the scientists who do this are continuing with this research full speed ahead.
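To get a feel for the size of that data deluge, here is a rough back-of-the-envelope calculation in Python using the two throughput figures quoted earlier for the 454 and Illumina platforms. It is a sketch under stated assumptions. The human genome size of about 3 billion base pairs is an assumption that is not given in the lecture, and the calculation counts a single pass over the genome, ignoring the repeated coverage that real sequencing projects need. The names `weeks_for_one_pass` and `fold_difference` are made up for this example.

```python
# Throughput figures quoted in the lecture, in base pairs per week.
THROUGHPUT_BP_PER_WEEK = {
    "454": 350_000_000,
    "Illumina": 350_000_000_000,
}

# Assumption (not from the lecture): a human genome is roughly 3 billion base pairs.
HUMAN_GENOME_BP = 3_000_000_000


def weeks_for_one_pass(bp_per_week, genome_bp=HUMAN_GENOME_BP):
    """Weeks needed to read genome_bp bases once at the given weekly throughput."""
    return genome_bp / bp_per_week


def fold_difference(platform_a, platform_b):
    """How many times more bases per week platform_a produces than platform_b."""
    return THROUGHPUT_BP_PER_WEEK[platform_a] / THROUGHPUT_BP_PER_WEEK[platform_b]


if __name__ == "__main__":
    for name, bp_per_week in THROUGHPUT_BP_PER_WEEK.items():
        weeks = weeks_for_one_pass(bp_per_week)
        print(f"{name}: about {weeks:.4f} weeks per single pass of a human genome")
    print("Illumina vs 454:", fold_difference("Illumina", "454"), "fold")
```

With these numbers, one platform produces a thousand times more sequence per week than the other, which is the jump in throughput behind the data overflow.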