Now, it's time for the exciting topic: how do you do machine learning on BigQuery? Before we go into the syntax of model building with SQL, let's discuss very quickly how BigQuery ML came about. As you saw from the earlier ML timeline, machine learning has been around for a while, but there have been typical barriers. Number one: doing ML on small datasets in Excel or Sheets, and iterating back and forth with new BigQuery exports, or relying on a data science team if you're fortunate enough to have one at your organization. Number two: building time-intensive TensorFlow or scikit-learn models using an expert's time, and even then using just a sample of the data so the data scientists can train and evaluate the model locally on their machines, if they're not using the Cloud. Google saw these two critical barriers, getting data scientists and moving data in and out of BigQuery, as an opportunity to bring machine learning right into the hands of analysts like you, who are already really familiar with manipulating and pre-processing data, or soon will be by the end of this specialization. Okay, so here we go. Let's talk about how you can now do machine learning inside of BigQuery using just SQL. With BigQuery ML, you can use SQL for machine learning. Let's repeat that point: SQL, no Java or Python code needed. Just basic SQL to invoke powerful ML models right where your data already lives, inside of BigQuery. Lastly, the BigQuery team has hidden a lot of the model knobs from you, like hyperparameter tuning, and common ML practitioner tasks, like manual one-hot encoding of categorical features. Those options are there if you want to look under the hood, but for simplicity, the models will run just fine with minimal SQL code. Here's an example that you'll become very familiar with in your next lab. Do you notice anything strange about the number of GCP products used to do ML here?
You got it, it's all done right within BigQuery: data ingestion, pre-processing with SQL, model training, model evaluation, the predictions from your model, and the output into reporting tables for visualization. So, as mentioned before, BigQuery ML was designed with simplicity in mind. But do you already know a bit about ML? You can tune and adjust your model's hyperparameters, like regularization, the dataset splitting method, and even the learning rate, through the model options. We'll take a look at how to do that in just a minute. So, what do you get out of the box? First, BigQuery ML runs on standard SQL, and you can use normal SQL syntax, like UDFs, subqueries, and joins, to create your training datasets. For model types, currently you can choose either a linear regression model for forecasting or binary logistic regression for classification. As part of your model evaluation, you'll get access to fields like the ROC curve, as well as accuracy, precision, and recall, which you can simply select from after your SQL model is trained. If you'd like, you can actually inspect the weights of the model and perform feature distribution analysis. Much like you'd normally build visualizations using BigQuery tables and views, you can also connect your favorite BI platform, like Data Studio or Looker, and visualize your model's performance and its predictions. Now, the entire process is going to look like this. First and foremost, we need to bring our data into BigQuery if it isn't there already; that's the ETL. Here again, you can enrich your existing data warehouse with other data sources that you ingest and join together using simple SQL joins. Next is the feature selection and pre-processing step, which is very similar to what you've been exploring so far as part of this specialization. Here's where you get to put all of your good SQL skills to the test in creating a great training dataset for your model to learn from.
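As a sketch of what that feature selection and pre-processing step might look like, here is a hypothetical query; the project, dataset, table, and column names are all made up for illustration and are not from the lab:

```sql
-- Hypothetical example: build a training dataset with plain SQL.
-- All table and column names below are illustrative assumptions.
CREATE OR REPLACE TABLE `my_project.ecommerce.training_data` AS
SELECT
  s.visitor_id,
  -- Simple feature engineering with SQL expressions:
  v.total_pageviews,
  IFNULL(s.time_on_site, 0) AS time_on_site,
  s.device_category,
  -- The label column the model will learn to predict:
  s.made_purchase AS label
FROM
  `my_project.ecommerce.web_sessions` AS s
JOIN
  `my_project.ecommerce.visitor_stats` AS v
USING (visitor_id)
WHERE
  s.session_date >= '2023-01-01';
```

The idea is just that ordinary joins, functions like IFNULL, and WHERE filters are all it takes to assemble a clean training table inside BigQuery.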
After that, here it is: this is the actual SQL syntax for creating a model inside of BigQuery. It's short enough to fit entirely within this one box of code. You simply say CREATE MODEL, give it a name, specify mandatory options for the model, like the model type, pass in your SQL query with the training dataset, hit Run Query, and watch your model run. After your model is trained, you'll see it as a new object inside of your BigQuery dataset. It'll look like a table, but it performs a little bit differently, because you can do cool things like execute an ML.EVALUATE query. That reserved syntax will allow you to evaluate the performance of your trained model against your evaluation dataset. Remember, you want to train on a different dataset than the one you evaluate on. Here you can analyze the loss metrics that will be given to you, like the root mean squared error for forecasting models, and the area under the curve, accuracy, precision, and recall for classification models, like the one that you see here. Once you're happy with your model's performance, and again you can iterate and train multiple models to see which one performs the best, you can then predict with it using the even shorter query that you see here. Just invoke ML.PREDICT, and that command on your newly trained model will give you back predictions as well as the model's confidence in those predictions, which is super useful for classification. There's a new field in the results when you run this query, where you'll see your label field with the word "predicted" added to the field name. That's simply your model's prediction for that label; it's that easy. But before we dive into your first lab, keep in mind: just because the few lines of code that you see here make it easy to create a model, that doesn't mean it's going to create a great model.
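Putting those three steps together, a minimal end-to-end sketch could look like the following. CREATE MODEL, the model_type option, ML.EVALUATE, and ML.PREDICT are real BigQuery ML syntax; the dataset, model name, and feature columns are hypothetical placeholders:

```sql
-- 1. Train: CREATE MODEL with mandatory options, then the training query.
CREATE OR REPLACE MODEL `ecommerce.purchase_classifier`
OPTIONS (
  model_type = 'logistic_reg',    -- binary classification
  input_label_cols = ['label']    -- which column is the label
) AS
SELECT total_pageviews, time_on_site, device_category, label
FROM `ecommerce.training_data`;

-- 2. Evaluate against a held-out dataset you did not train on.
SELECT *
FROM ML.EVALUATE(
  MODEL `ecommerce.purchase_classifier`,
  (SELECT total_pageviews, time_on_site, device_category, label
   FROM `ecommerce.evaluation_data`));

-- 3. Predict: the results include a predicted_label column,
--    i.e. your label field with "predicted_" added to its name.
SELECT *
FROM ML.PREDICT(
  MODEL `ecommerce.purchase_classifier`,
  (SELECT total_pageviews, time_on_site, device_category
   FROM `ecommerce.new_visitors`));
```

Each statement is just a query you run in the BigQuery console; the trained model shows up in your dataset alongside your tables and views.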
A model is only as good as the data that you feed into it, and as the strength of the relationship between your features and the label. That's why you're going to spend most of your time exploring, selecting, and engineering good features, so that we can give our model the best possible dataset to work with and learn from.