Okay. Let's take that random data we've just created and train a GBM model on it. If you're starting from a fresh session, you may need to run those commands; I've already run them, as I'm following straight on. Going forward, there are a couple of conventions we're going to follow (more than a couple, really). Our training data will be called train, our validation data will be called valid, and our test data will be called test. In addition, the column we want to learn I'm going to call y, and the columns I'm going to train with will be in x. So I'm going to be trying to learn income, and that's a numeric column, meaning this is a regression. The setdiff call there takes all the names in our dataset but excludes id (we already discussed why training on that is a bad idea) and, of course, excludes the thing we want to learn.

So let's run those, and then our model. You've seen the first three lines before: the fields we use, what we want to learn, and our training dataset. I'm going to explicitly specify a model ID from now on, just so I can keep track of things later. And the whole point of this example: we're specifying a validation frame there, our valid data. So let's run that. It didn't take long, good.

Then I'm just going to run these three commands to see how it did on each of our three datasets: the training data, the validation data, and the test data. On the training data we got a mean absolute error of 895. Remember, we're not expecting an error of zero, because we deliberately put some noise in the training data. The validation error is significantly higher than that, roughly 900 versus 1400, and on the test data it's also just under 1400. So it looks like we already have a bit of overfitting, just on the default settings.

Let's switch over and do the same thing in Python now. First initialize (you can see it's using an already-running cluster), then grab the data we want with h2o.get_frame() and the names of the three datasets.
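As a quick aside, the mean absolute error reported here is just the average of the absolute differences between the model's predictions and the true values. A minimal pure-Python sketch of the metric itself (this is not the H2O API, and the incomes below are made-up numbers for illustration only):

```python
def mae(actual, predicted):
    """Mean absolute error: the average absolute difference
    between true values and predictions."""
    assert len(actual) == len(predicted)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical incomes vs. model predictions, purely for illustration.
actual    = [32000, 45000, 28000, 51000]
predicted = [31200, 46500, 27500, 52100]
print(mae(actual, predicted))  # average of 800, 1500, 500, 1100 -> 975.0
```

An MAE of 895 on this data therefore means the model's income predictions are off by about 895 on average, in the same units as the income column.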
We'll just double-check that we have the right data. Yes. So now we bring in the GBM estimator and create the same x and y fields. The way I like to do it in Python is to create an ignoreFields list first, and then use a list comprehension (I think that's what they're called): i for i in train.names if i not in ignoreFields. Now we create our GBM model and train it. In Python, the model ID is specified in the constructor, when we create the object, but the validation frame goes to train(), along with x, y, and the training frame. Okay, let's build it. I could call model_performance() as before, but I'm going to use a new function, mae(), which just tells me the number I'm interested in. So: a mean absolute error of 895, 1444, and 1365, just about the same numbers as we saw in R.
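The list-comprehension pattern for building x is worth spelling out. In H2O's Python API, train.names returns the column names of the frame; the comprehension then keeps everything except the columns we want to exclude. Here is a small sketch using a plain list of names in place of a real H2O frame (the column names are assumed for illustration, not taken from the actual dataset):

```python
# Stand-in for train.names; in the real session this list
# comes from the H2O frame itself.
names = ["id", "age", "income", "zipcode", "education"]

# Columns to exclude: the row identifier and the answer we want to learn.
ignoreFields = ["id", "income"]
y = "income"

# Keep every column name that is not in the ignore list.
x = [i for i in names if i not in ignoreFields]

print(x)  # ['age', 'zipcode', 'education']
```

With x and y in hand, the H2O calls then look roughly like model = H2OGradientBoostingEstimator(model_id=...) followed by model.train(x, y, train, validation_frame=valid), matching the split described above: the model ID in the constructor, the frames in train().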