Welcome, everybody, to the Coursera course Experimental Methods in Systems Biology. My name is Marc Birtwistle, and I'll be the lecturer for this course. Hopefully, some of you have just finished taking Professor Iyengar's course, an overview of Systems Biology, as part of a sequence of courses that we're giving here at Coursera for a certificate in Systems Biology. It's followed by Network Analysis in Systems Biology, and Dynamical Modeling in Systems Biology, and then a Capstone in Systems Biology. So, I hope some of you, after this course, will consider staying on for these other courses that we're doing in the context of the Systems Biology Center of New York here at Mount Sinai, which is funded by NIGMS at the NIH. So, today, we're just going to have a brief introduction to what this course is going to be about. First, a little bit about myself as the lecturer for this course. Next, a little bit about the course scope: what we're going to be covering in depth, what we're going to be covering very briefly just in this first week, to make you aware of some things that are available but that we don't have time to go into depth on, and then what we will not be covering. Then a little bit about Systems Biology and Systems Biology experiments in general: what we mean by Systems Biology, how experiments fit into that loop, and then some general features of what's important to consider in really any experiment that one does. So, first, introductions. My name, as I mentioned a few slides ago, is Marc Birtwistle. I am actually trained as a chemical engineer, but I ended up in Systems Biology. I did postdoctoral work in Cell and Molecular Biology. So, I have kind of a quantitative modeling background from my engineering degrees, but I've since learned to apply that in the context of Cell and Molecular Biology. I started as an assistant professor here at Mount Sinai in 2012.
And my lab focuses on integrating both experiments and computational modeling in the context of cancer. We're particularly interested in a type of brain tumor called glioblastoma multiforme, or GBM, which is a deadly brain tumor. And we're really interested in applying Systems Biology approaches to the development of new pharmacological approaches to treating this disease. That's an area typically referred to as Systems Pharmacology. One thing we're really interested in is cellular heterogeneity: why cells within a GBM tumor are different from one another and what that means in terms of response to treatments. Some things I assume about you. First of all, I assume that you have a basic knowledge of Cell and Molecular Biology. I try to review some of the relevant concepts as they come up, but that's a very deep and broad field, so I have to assume you have some knowledge of that. Second of all, I am assuming that you have a knowledge of basic statistics, in terms of, like, what's a mean, what's a standard deviation, that level of depth. Nothing too advanced there. And I know some of you know that Systems Biology can be very heavy in Mathematics and also in Computer Science. But just to alleviate those concerns for some of you, I am not going to go into a lot of that in this course. It's going to be very focused on just the experiments, the theory behind the experiments, the equipment, and how you actually go about doing some of these experiments. We're not going to get a lot into the analysis or modeling of the data that comes out of them. As you can guess from my training and lab focus, I'm familiar with lots of types of experiments that are used in Systems Biology and how you might use the experimental data from these techniques in the context of a computational model to learn something about the biological system.
But that being said, it's really impossible for anyone to be an expert in all the techniques that are relevant for Systems Biology, and therefore I had a lot of help in developing this course. As you'll see on the syllabus, and as you'll see as we go on in the course, I've consulted with experts who are practicing the state of the art in a lot of these techniques. Also, there are lots of different techniques used in Systems Biology, and it's not possible to cover every technique in reasonable depth. So, I opted instead to do a very wide survey of a lot of different methods at first, but then go very deep into four different techniques. So, what am I going to cover in depth? The first one is a relatively recent technique called mRNA sequencing. This technique uses next generation DNA sequencing to quantify gene expression in an omic fashion, and the term omic just means on a genome scale, trying to cover everything that might be in the cell. The second thing I'm going to cover in depth is mass spectrometry-based proteomics. This uses state of the art mass spectrometry methods to quantify the level of proteins, also in an omic fashion. It can also be used to quantify protein states, pretty much anything that has mass and is used to modify a protein. All right. So, the previous two techniques I mentioned are so-called omic scale, meaning they cover things genome-wide and in kind of a hypothesis-free fashion. And they're usually applied to cell populations rather than single cells, just because you need a lot of starting material in order to carry out the technique. That being said, there are some very recent studies coming out that are starting to do mRNA sequencing in single cells. But, you know, there's often an assumption that you're not doing Systems Biology unless you're doing things on an omic scale. And that's not really the case.
There's lots of good Systems Biology that is very far from the omic scale and is based on very targeted hypotheses about a biological system. You know, some experiments might be concerned with single cell behavior or dynamic behavior, something that needs to be looked at over a time course, and these are very difficult to probe on an omic scale. Or somebody might be interested in just a small subsystem within a cell, like a kinase signaling cascade, where you don't really need to look on the omic scale in order to learn something about the system. So, therefore, I've chosen to focus on two other methods which aren't omic scale but get more into this realm of single cells and dynamics. The first will be flow and mass cytometry, where essentially you can use labeled antibodies or other types of labeled reagents to stain single cells in suspension. And then you can measure how much of this label gets into each single cell, usually relatively rapidly, at about hundreds of cells per second. That allows you to observe a handful of things, maybe four or so with flow cytometry and up to 40 or so with mass cytometry, on the single cell level across a large population of cells at fixed time points. Okay. And the last thing I'm going to go into in depth is live cell imaging. That predominantly uses fluorescence microscopy to observe processes in living cells in real time, using fluorescent probes that can indicate many different things that are going on inside of a cell. With this, it's very hard to measure many, many things within a cell. Usually, you're observing only one or a few things within a cell. But you can do so at very high temporal frequency. Okay. So, those are the things that I'm going to go over in depth. But there are some things that I'm going to cover very briefly within this first week. And I've classified those into two groups, one that looks at nucleic acids and one that looks at proteins.
So, for those that look at nucleic acids, these are the things I'm going to be covering briefly. Quantitative polymerase chain reaction, or qPCR, is a way to quantify gene expression when you know what you're looking for. The microarray was one of the original omic techniques to look at gene expression transcriptome-wide. It is starting to be replaced by mRNA sequencing, but it's still used quite a bit. Whole genome sequencing, exome sequencing, bisulfite sequencing, and chromatin immunoprecipitation, or ChIP, sequencing are all relatively recent techniques based on next generation sequencing to look at a variety of things in the genome. And fluorescence in situ hybridization, or FISH, has been around for a long time, but it's also a way to use fluorescence to look at nucleic acids in cells. Okay. In terms of proteins and protein states, we focus on just a few things. The western blot, or the microwestern array, is a way to use antibodies to look at proteins or protein states from cell lysates. The same goes for the reverse-phase protein array; it's just a more high-throughput way of looking at proteins or protein states in cell lysates. And immunofluorescence is a way to use antibodies to look at proteins or protein states, but instead of looking in cell lysates or cell populations, you look at things on the single cell level. And now on to the things that I'm not going to be covering at all, except for mentioning them right here. This is not to say that they're not important, but just that I don't have time to give them sufficient depth of coverage so that you understand them fully. Also, a lot of the other topics that I'll be covering touch upon a lot of these technologies. Microfluidics is more of an enabling technology, where one uses microfabrication to build all sorts of small devices which allow a lot of clever and innovative new ways of using the techniques that I will describe to you. But I don't really have time to cover it in detail here.
Metabolomics is a way to use mass spectrometry to measure not proteins or nucleic acids, but metabolite levels within cells, a population of cells, or a tissue. It's important, but I just don't have time to go into it in detail. Plus, a lot of the mass spectrometry methods that I'm going to be describing for proteins apply equally to metabolomics. So, you will get a bit of an understanding of mass spectrometry, which will help you if you'd like to learn more about metabolomics somewhere else. Next, high-throughput techniques, and by this I mean things like high content imaging or screening assays where you're looking at the responses to many, many different compounds, for example, in the pharmaceutical industry. Many of the experimental techniques that I'm going to be describing to you have been optimized or scaled up in order to be done in a high-throughput way, but that really doesn't change the fundamental basis by which they work. So, I instead opted to teach you about the techniques so that you understand how they work, and then you can know that most of them have been scaled up for high-throughput assays. But I won't be teaching you about that directly. Another area is formal design of experiments. When you talk about experimental methods, a lot of people think about statistical design of experiments and all of the formal mathematics that has been built up around that. I won't be going into any of that. We're focusing more on the experimental techniques themselves rather than on designing the experiment based on, for example, a model of the system you might already have in hand. Okay. So, with those introductions being done, I just want to get into a little bit of what Systems Biology means to me, especially within the context of this experimental methods course. You'll hear a lot of people talk about the Systems Biology loop, or this integrative cycle of modeling and experiments.
And before we get into that, we just need to know a little bit about what Systems Biology is. To me and to most researchers in the field, it's still a bit of an evolving definition. But something that I think people have settled on is that Systems Biology really seeks to understand how all the individual parts within a cell or a tissue work together to give rise to some kind of function. So, I like to show this picture here, which turned up from a random Google search. If you take apart a car and you just lay all the parts out, and you study each part by itself, and you really understand each one, you should be able to understand how the car works, right? And I think the answer, by looking at a picture like this, is clearly no. You really need to understand how they all fit together in order to understand how the car works, how it drives, how it shifts. So, typically, and this is in general, of course, it's not true all the time, biologists like to study one part. They'll find a gene or a protein that's associated with a disease they're interested in, and they study that protein. They figure out what it interacts with and what its enzymatic function might be, for example. And then that understanding is there about that one part. But Systems Biologists try to build on all that understanding that's been built up about the individual parts. They try to understand how the interactions between all these individual parts fit together to give rise to some kind of function. Some try to argue that it's physiology reinvented with a new name. And that's correct to some extent, although there may be some differences. First of all, Systems Biology is really focused on the molecular level, on molecular and cell biology, and on how that spans scales, for example. But with that philosophical stuff aside, let's get into what this Systems Biology loop actually is.
So, if you think about a common scientific workflow, one that you might learn in grade school or elementary school even, you have a hypothesis. In order to test that hypothesis about some system you care about, you design some experiments, or you do some experiments, or maybe somebody's already done the experiments and you have access to the data. When you gather that data and you compare it to what your hypothesis would predict, that feeds into your current understanding. It may validate your current understanding or it may cause you to change it. And based on that current understanding, you can then come up with new hypotheses for what should be the case if your understanding is correct. And so this loop continues on. You could really enter this loop at any point, you can exit at any point, and you can draw all sorts of conclusions by following this basic scientific method. So, as I said, you can enter that loop anywhere, really, and, although this is not always how it's done in current science, the loop really doesn't need to be completed by a single research group. Many of the great discoveries of our past have been made when those arrows in the loop were completed by different groups, sometimes widely separated in time. For example, take Watson and Crick with the DNA double helix structure. They were analyzing data that was collected by somebody else. Or, more recently, take the discovery of the Higgs boson in particle physics. This was something that was predicted by models developed long before the very recent experimental evidence for the existence of the Higgs boson was found. So how is Systems Biology different when you take into account this normal scientific loop? Well, normally the current understanding of the system is represented by a quantitative computational model. And that's really, at least in my experience, one of the main differences between this and more traditional biology.
And you might ask, well, that doesn't seem like a large difference. What difference does that really make? So, in biology, often that current understanding is semantic or cartoon-based. Somebody might have an intuitive understanding of the way the system is working, but if it's semantic or cartoon-based, it can be very imprecise. It could lead to a misunderstanding or, really, an inability to make a solid prediction about what might happen. And this is a problem because biology can often be remarkably complex. It's very difficult to represent what's going to happen in a very complex system in the context of a cartoon or something that's not computable. And one of the main differences between biological and engineered systems is that, even though an engineered system might rival the complexity of a biological system, we didn't build the biological system. In an engineered system, you know what's in there because you've built the thing, whereas in biology, we only have partial and very incomplete knowledge of what is actually in the cell or the biological system. That sometimes makes the use of a quantitative computational model a very important thing. So, I'd like to illustrate that point about the difference between a cartoon-based model and a quantitative model with a simple picture over here. This is a network motif called an incoherent feed-forward loop, which is overrepresented in biology. It's found much more often than you would expect at random, meaning it's conserved and highly enriched in biological systems. In this case you have some input signal, S, which activates some protein called A. A activates two quantities, C and D, but C comes down and represses D. And let's consider, in this simple case, that D leads to some biological outcome.
So, a question that's commonly asked is, you know, if you inhibit this input signal with, say, a drug, for example, what's going to happen to D and, subsequently, to the cellular outcome? The first thing you need to know, which this cartoon won't tell you, is the magnitudes. Because if A is strong and C is weak, then D would end up going up. However, if A is weak and C is strong, then D would end up going down. So, without knowing the magnitudes, you really can't even predict whether the outcome will go up or down, on or off, for example. Second are the dynamics. Things in a cell are always changing with time, and they have different rates of change. If A is slow and C is fast, then D would go down, then up. But if A is fast and C is slow, then the opposite would happen. So without knowing the dynamics, you also have a very limited ability to predict what the outcome is going to be in the cell. Lastly is localization. Cells for a while were thought of as kind of like a bag of enzymes. But it's become very, very clear that everything in the cell is highly localized and regulated by localization. So if A happens to be localized with D and C is not, then D would end up going up. But if A is distant from D, and C is localized with D, then D would go down. So again, another factor that we can't see from this cartoon would really influence our ability to predict what the outcome of a perturbation on this system is going to be. To me, that sort of thinking illustrates why quantitative computational models are really important for understanding Systems Biology. They allow you to keep track of these kinds of properties of a complex system, without which you have a really limited ability to predict what's going to happen within a cell when you make some perturbation to it. So, when you start thinking about experiments in Systems Biology, what kinds of unique features might those Systems Biology experiments have?
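To make the magnitudes and dynamics points about the incoherent feed-forward loop a bit more concrete, here is a minimal Python sketch. It is not from the lecture slides, and everything in it is an assumption made for illustration: the function names `simulate_iffl` and `summarize`, the linear readout of D, the parameter names, and all the numbers. It also looks at what happens when the input is switched on, which is the mirror image of inhibiting it. The only thing taken from the lecture is the wiring: an activating arm from A to D and a repressing arm from A through C to D.

```python
# Minimal sketch of the incoherent feed-forward loop (IFFL) described above.
# The equations, parameter names, and values are illustrative assumptions,
# not a model taken from the lecture.

def simulate_iffl(w_direct, w_repress, k_direct, k_repress,
                  basal=1.0, t_end=40.0, dt=0.01):
    """Return a list of (time, D) after the input signal switches on at t = 0.

    w_direct, w_repress: strength (magnitude) of the activating A -> D arm
                         and of the repressing A -> C -| D arm.
    k_direct, k_repress: how quickly each arm responds (dynamics).
    """
    activation = 0.0  # activating influence of A that has reached D
    repression = 0.0  # level of the repressor C
    trajectory = []
    n_steps = int(round(t_end / dt))
    for step in range(n_steps + 1):
        # D is read out as basal level plus activation minus repression,
        # clipped at zero because a concentration cannot be negative.
        d = max(0.0, basal + w_direct * activation - w_repress * repression)
        trajectory.append((step * dt, d))
        # Forward Euler update: each arm relaxes toward its fully-on value of 1.
        activation += dt * k_direct * (1.0 - activation)
        repression += dt * k_repress * (1.0 - repression)
    return trajectory


def summarize(trajectory):
    """Return the (minimum, maximum, final) value of D over the time course."""
    values = [d for _, d in trajectory]
    return min(values), max(values), values[-1]


# Magnitudes: both arms equally fast, but with different strengths.
strong_activation = summarize(simulate_iffl(0.8, 0.2, 1.0, 1.0))
strong_repression = summarize(simulate_iffl(0.2, 0.8, 1.0, 1.0))

# Dynamics: both arms equally strong, but with different speeds.
slow_activation = summarize(simulate_iffl(0.5, 0.5, 0.2, 2.0))
fast_activation = summarize(simulate_iffl(0.5, 0.5, 2.0, 0.2))

for label, (low, high, final) in [
    ("strong activation, weak repression", strong_activation),
    ("weak activation, strong repression", strong_repression),
    ("slow activation, fast repression", slow_activation),
    ("fast activation, slow repression", fast_activation),
]:
    print(f"{label}: min={low:.2f} max={high:.2f} final={final:.2f}")
```

With the same cartoon wiring in all four runs, D ends above its starting level in the first run and below it in the second, dips and then recovers in the third, and overshoots and then recovers in the fourth. Localization is left out of this sketch entirely; capturing it would need a spatial model.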
Well, because of this emphasis on quantitative computational modeling, as I mentioned in the previous slide, Systems Biology experiments really tend to have two main properties. One is that they tend to be quantitative, of course, because if you have a quantitative model, you really need quantitative data in order to properly parametrize and validate those sorts of models. Many experimental techniques in biology can be quantitative, but many times they're not required to be. So, one usually needs to have some good planning, and usually a good bit more experimental effort, in order to make these experiments properly quantitative. In this course, we give a strong focus to how to make these experimental techniques quantitative and what their limitations are in terms of their ability to be quantitative. Second is dynamics. Because a lot of times the quantitative models also describe what's going to happen over time in cells, the experiments also need to be able to probe what happens over time in response to a perturbation of cells. Many important phenomena in biology are time dependent: think of circadian clocks, the cell cycle, development, responses to drugs, drug pharmacokinetics, action potentials in neuroscience. The list goes on and on. Another important thing about dynamic data is that when you measure things over a time course, this inherently gives you an ability to infer causality. When you're not measuring things in time, you have a very limited ability to infer causality, such as A causes B, rather than A is correlated with B. These two features transcend most Systems Biology experiments. That's not to say that if you have an experiment that's not quantitative or not dynamic it's not Systems Biology, but very often with Systems Biology you will see these two properties in an experiment. Okay. I made a note about this before, when I was talking about what Systems Biology is, but Systems Biology doesn't need to be large or omic scale.
There's also so-called small scale Systems Biology, and you can learn a lot from such small scale models and small scale experiments. I just list a few references here if you're interested in seeing more about this kind of simple approach to Systems Biology. It's actually a significant fraction of the literature, and many people doing Systems Biology research do this so-called very targeted, small scale Systems Biology. And that's why I spend half the course on these more low-throughput methods. Another feature is that Systems Biology experiments tend to span across systems and scales. You know, they tend to ask questions that might go beyond the traditional boundaries of biological disciplines. So, instead of looking at single proteins within a signaling pathway and asking what control each protein has over signal flux, you might ask how they regulate each other, say, for example, in a feedback loop, and what that feedback loop structure means for the robustness of the system or for how that system might respond to treatment with a drug. Or, for example, there are lots of groups trying to understand how tissue level function, such as the shape of a multicellular structure, can arise from interactions between cells on the molecular level, in terms of maybe adhesion between cells or signaling between cells. So, there's a lot of focus on molecular mechanisms in Systems Biology. And that's maybe one way in which Systems Biology is different from traditional Physiology. But I'm sure that as the field evolves, we'll really settle into a consensus definition of what it is and what it isn't. So, it's also important to know, besides those things that are specific to Systems Biology, what some important features of any experiment are. As we all might know already, experiments are very expensive and time consuming. So we need to be very careful about how we design our experiments.
We need to make sure that we're doing something that will address a very clear and precise question. Usually, it's just a hypothesis that we have that's based on our current understanding. In Systems Biology, that current understanding often comes from a quantitative computational model. But our experiment should be designed in such a way that, if you answer that question, the answer will be significant for the system that you care about. And the answer should lead to new knowledge that drives the field forward, or accomplishes an important business goal if you're, perhaps, in the context of the pharmaceutical industry. So, of course, there are many, many, almost an infinite number of questions you can ask about your system. But very, very few of them will be worth the time and effort of investigating them experimentally. So, once you've identified an important question and an experiment that might answer that question, there are a few key properties that you need to decide on in order to know how to do the experiment. Those key properties are what I'm going to focus the rest of this week's lectures on in subsequent sets of slides. The first is, what biological system? So, you have a specific question. Do you need to look at human cell lines, or could you look at E. coli, for example? Do you need to look in a mouse? Could you look at yeast? This is a very important question to answer. Second, how do you need to perturb your system, or what treatment conditions do you need in order to answer your question? What compounds do you need to apply to elicit a relevant response? And next, what are you going to measure in response to that perturbation? Do you need to look at things on the transcriptional level? Do you need to look at only a subset of transcripts? Do you need to use mRNA sequencing to look at transcription genome-wide? Do you need to look at protein levels, specific post-translational modifications, et cetera?
So you need to know what to measure. And although this is a bit of intuition, when you talk to people who have been doing experiments for a very long time, or when you've been doing experiments yourself for a long time, you tend to see that if the experiment becomes too complicated, so that you can't answer your question with just a handful of conditions and measurements, the results are usually going to be very difficult to interpret, and you may not be asking the right question or designing your experiment in the right way. Some exceptions may be a screening-based study, where you are just using tons and tons of compounds to try to look for a compound that works. But usually, those types of screening assays are still based on a very relevant, specific, and simple question of interest. Okay. And finally, of course, always include positive and negative controls. You never know when your experiment is going to fail for one reason or another, but at some point it will fail, and you'll want to know why it failed when that does happen. When you always include positive and negative controls, you'll know what happened, okay? And lastly, replicates. I'm often surprised at the lack of replicates in biological experiments. Especially for quantitative data, one needs to perform at least three replicates to get a reasonable estimate of the mean and of the standard deviation, and in order to do proper statistical inference or hypothesis testing. We won't go into a lot of statistics or into figuring out how many replicates you actually need to do, but replicates need to be done. If the experiment is too expensive to do replicates, as is often the case with, for example, an RNA sequencing experiment, but you still make the decision that it's a worthwhile experiment to do, then you really need to think about how you're going to determine the significance of the results. There are ways to do that, but they are outside the scope of this course.
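As a small illustration of why replicates matter for quantitative data, here is a hedged Python sketch that uses only the standard library. The function name `summarize_replicates` and the measurement values are made up for illustration; they are not data from the course. The sketch shows that a mean, a sample standard deviation, and a standard error of the mean can be estimated from three replicates, and that a single measurement gives no estimate of spread at all.

```python
import math
import statistics


def summarize_replicates(values):
    """Return n, mean, sample standard deviation, and standard error of the mean.

    A standard deviation cannot be estimated from a single measurement,
    so fewer than two replicates is treated as an error.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two replicates to estimate a standard deviation")
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    sem = sd / math.sqrt(n)        # standard error of the mean
    return {"n": n, "mean": mean, "sd": sd, "sem": sem}


# Hypothetical triplicate measurements (arbitrary units), invented for this sketch.
control = [1.00, 1.10, 0.90]
treated = [1.80, 2.10, 2.10]

control_stats = summarize_replicates(control)
treated_stats = summarize_replicates(treated)

for label, stats in [("control", control_stats), ("treated", treated_stats)]:
    print(f"{label}: n={stats['n']} mean={stats['mean']:.2f} "
          f"sd={stats['sd']:.2f} sem={stats['sem']:.2f}")

# A single measurement gives a value but no estimate of variability.
try:
    summarize_replicates([1.95])
except ValueError as error:
    print(f"single measurement: {error}")
```

How many replicates are actually needed for a given comparison is a power-analysis question, which this sketch does not attempt to answer.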
But I just want to reiterate that replicates really, really need to be done in an experiment. Okay. So, in the next lecture, we're going to be talking about Biological Systems.