Okay. The second aspect of a good feature: you need to know its value at the time that you're actually predicting. Remember that the whole reason to build the machine learning model is so that you can predict with it. If you can't predict with it, there's no point in building the model in the first place. So, here's one of my favorite things. A common mistake that a lot of people make is to look at their data warehouse, take all the data they find in there, all the related fields, and throw them all into the model. The machine is going to figure it out, right? So, if you take all these fields and use them in a machine learning model, what happens when you go to predict with it? Well, maybe you discovered that your warehouse had, say, sales data. How many things were sold on the previous day? That's going to be an input for our model.
But here's the rub. It turns out that the daily sales data actually comes in a month later. It takes some time for the information to come out from your store, and there's a delay in collecting and processing this data. Your data warehouse has the information because somebody has already gone through the trouble of taking all the data, joining all the tables together, and doing all the pre-processing. But at prediction time, in real time, you don't have it, so you can't use it. So, some of the information in this data warehouse is known immediately, and some of it is not known in real time. If you use data that's not known at prediction time as an input to your model, your whole model is unfortunately useless, because you don't have a value for an input that your model needs.
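To make the warehouse example concrete, here is a minimal sketch of how you might screen fields by how late they arrive. The field names, the latency catalog, and the `usable_features` helper are all hypothetical, made up for illustration; the only figure taken from the example is that previous-day sales show up about a month late.

```python
from datetime import timedelta

# Hypothetical catalog: how long after the event each warehouse field
# actually becomes available. Only the one-month delay on sales data
# comes from the example above; the other fields are assumptions.
FIELD_LATENCY = {
    "store_id": timedelta(0),
    "day_of_week": timedelta(0),
    "promo_active": timedelta(0),
    "prev_day_units_sold": timedelta(days=30),
}


def usable_features(latency_by_field, max_delay=timedelta(0)):
    """Return the fields whose values are known within max_delay of the event."""
    return sorted(
        name for name, delay in latency_by_field.items() if delay <= max_delay
    )


# Real-time prediction: only fields with no delay qualify.
features = usable_features(FIELD_LATENCY)
print(features)
```

With a real-time requirement, `prev_day_units_sold` drops out of the list. If you were instead predicting far enough ahead that a 30-day delay is acceptable, you could pass `max_delay=timedelta(days=30)` and the field would become usable again.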
Remember again that the sales data comes in a month later, and if your machine learning model is using a field that comes in a month later, it's not going to know that value at prediction time. So, the key point here is: for every input that you're using for your model, for every feature, make sure that you have it at the actual prediction time. You want to make sure that those input variables are available and that you're collecting them in a timely manner. In many cases, you'll also have to worry about whether or not it's legal or ethical to collect this data at the time that you're doing the prediction. Sometimes the information is available to you in your data warehouse, but you can't collect it from the user at the time you're trying to do the prediction. Again, if you can't collect it at the time you're doing the prediction, you can't use it in your ML model.
So let's take another example here. An easy example to remember is, say, let's go back to that housing price prediction model. If we simply had today's sale price of the house as a field in the data set, the model could just output that price and be perfectly accurate on the training data set, because there it has this magic data field of house sale price. But come prediction time, your new houses for sale won't have already been sold, so your model is useless, because you can't feed it what you do not know at prediction time.
So, I want us to do a bit of a discussion question. Why is the second field here a bad feature? What could go wrong? As a hint, what happens if the cluster ID was, say, taken from another model? What if that model updates without telling you? Will you still be able to train or learn anything from your training data set? Well, the ultimate answer is that feature definitions themselves should not change over time, or else you have to update your model.
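One way to guard against the cluster ID problem is to carry a definition version along with each feature value and refuse to use a value whose definition differs from the one the model was trained on. This is only a sketch under assumed names: `FeatureValue`, `TRAINED_WITH`, and `check_definition` are hypothetical and not part of any particular library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureValue:
    name: str
    value: int
    definition_version: str  # version of the upstream model that produced it


# Hypothetical record of the feature definitions used when the model was trained.
TRAINED_WITH = {"cluster_id": "v1"}


def check_definition(feature, trained_with):
    """Return the feature's value only if its definition matches training."""
    expected = trained_with.get(feature.name)
    if expected is None:
        raise KeyError(f"{feature.name} was not a training feature")
    if feature.definition_version != expected:
        raise ValueError(
            f"{feature.name} is now {feature.definition_version}, "
            f"but the model was trained on {expected}; retrain before using it"
        )
    return feature.value


# Same definition as training: the value passes through.
ok_value = check_definition(FeatureValue("cluster_id", 7, "v1"), TRAINED_WITH)

# The upstream clustering model changed without telling us: reject the value.
try:
    check_definition(FeatureValue("cluster_id", 7, "v2"), TRAINED_WITH)
    drift_detected = False
except ValueError:
    drift_detected = True
```

The point of the check is that cluster 7 under `v2` may mean something entirely different from cluster 7 under `v1`, so the same number is no longer the same feature.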