In this lab, you will use a very useful pattern. You will use BigQuery to calculate aggregates, percentile values, and the like over 70 million rows. The result will go into a Pandas DataFrame of a dozen rows, and you can then use that in-memory DataFrame for visualization. This is the kind of thing that would take you hours if you did it any other way, but in the lab you will create the graphs in seconds. It's important to get this kind of interactive development workflow down. Otherwise, you will not be able to work with large datasets easily.

Now, you might think that you don't have to work with all of the data, that you can simply sample the dataset and work with the smaller sample. However, that is a bad practice in machine learning. One thing I like to say is that the key difference between statistics and machine learning is how we deal with outliers. In statistics, outliers tend to be removed. In machine learning, outliers tend to be learned. And if you want to learn outliers, you need enough examples of them, which essentially means that you have to work with all of your data. You need the distribution of outliers and of rare values throughout your dataset, and to get that, you have to work with your complete dataset.

One way to do this is what you're going to do in this lab: use managed services like BigQuery to process data at scale, bring the results back into more familiar in-memory structures like Pandas, and then use tools like the plotting libraries in Python. This is a common working paradigm that we have to get familiar with, and you will learn how to do it in the lab.
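To make the pattern concrete, here is a minimal sketch of what it might look like in Python. It is not the lab's actual notebook. The table name and the column names `airline` and `departure_delay` used with it are hypothetical placeholders, and the helper function names are made up for illustration. The sketch assumes the `google-cloud-bigquery` client library, Pandas, and Matplotlib are installed and that default Google Cloud credentials are configured.

```python
def build_summary_query(table: str, group_col: str, value_col: str) -> str:
    """Build SQL that reduces a large table to one summary row per group.

    All three arguments are placeholders chosen by the caller; nothing here
    is specific to the dataset used in the lab.
    """
    return f"""
    SELECT
      {group_col},
      COUNT(*) AS num_rows,
      AVG({value_col}) AS mean_value,
      APPROX_QUANTILES({value_col}, 100)[OFFSET(50)] AS p50,
      APPROX_QUANTILES({value_col}, 100)[OFFSET(99)] AS p99
    FROM `{table}`
    GROUP BY {group_col}
    ORDER BY {group_col}
    """


def summarize_in_bigquery(sql: str):
    """Run the aggregation in BigQuery and return the small result as a DataFrame."""
    # Imported here so the query builder above can be used without the client library.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumes default project and credentials
    return client.query(sql).to_dataframe()


def plot_summary(df, group_col: str):
    """Plot the median and 99th percentile per group from the in-memory DataFrame."""
    ax = df.plot(x=group_col, y=["p50", "p99"], kind="bar")
    ax.set_ylabel("value")
    return ax
```

With real names substituted, you would call `build_summary_query("my-project.my_dataset.flights", "airline", "departure_delay")`, pass the resulting SQL to `summarize_in_bigquery`, and hand the DataFrame to `plot_summary`. The point of the design is that the scan over all the rows, including the rare values that show up in the 99th percentile, happens inside BigQuery, and only the handful of summary rows ever reaches Pandas.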