Let's test your knowledge with a quick quiz. We'll show you some features and give you an objective, and you say whether or not those features are related to that objective, and whether or not we should include them. Assume that you want to predict the total number of customers who are going to use a discount coupon for your store. Which of these features are related to that objective?

Alright, let's go through it. First, the font of the text in which the discount is advertised: yes or no? Yeah, absolutely. The bigger the font is, the more likely it is to be seen, right? And there's also probably a difference between Comic Sans and Times New Roman. Some fonts are inherently more trustworthy than others. Comic Sans, I'm looking at you. So, yeah, the font of the text in which the discount is advertised is probably a good feature for us.

What about the price of the item the coupon applies to? Well, you could imagine that people would use a coupon more if the item costs less. So, yeah, that could be a feature. But notice what I'm doing here. I'm verbalizing the reason why it could be a feature. I'm not just saying yes or no by looking at it. I'm saying yes because people may use a coupon more if the item is less expensive. I'm saying yes because people might use a coupon more if they get to see it, if the font is bigger. You need to have a reasonable hypothesis for each and every one of the features, and that's what ultimately makes it a good feature or not.

Okay, next one. The number of items that you have in stock? No. How would a customer even know that to begin with? I mean, yes, if you had a feature that said "in stock versus out of stock," that could be a feature. But 800 items versus a thousand items in stock? No way. That's not going to have an effect. So we are going to throw that one out.

Okay. So, if you liked that quiz, here's another. The objective this time: predict whether or not a credit card transaction is fraudulent.
First, whether the cardholder has purchased these items at the store before. Is that a good feature or not? Well, yes, it could be a feature. Is this a common purchase for this user, or a completely unfamiliar, unlikely occurrence? So, yes, whether a cardholder has purchased these items at the store before is probably a good feature for predicting whether the transaction is fraudulent or not.

And what about the credit card chip reader speed? Well, what's the hypothetical relationship here? There isn't an obvious one, so you don't want to use this as an input feature. Throw it out.

What about the category of the item being purchased? Think about fraud. Yeah, well, there's probably some fraud committed on things like televisions, but not so much on things like, say, a T-shirt. So you can imagine that there's a big difference between the categories of items. So the category of the item could absolutely be a signal, and we'd want that to be one of the features that we can use in our model.

What about the expiration date of the credit card that's used? Just because we have a data point, should it be used as a feature? Probably not. Maybe the issue date, because new credit cards experience more fraud, but not the expiration date of the card. Again, we are just talking through and reasoning through these things.
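
If you want to make that habit concrete, one option is to write the hypothesis down next to each candidate feature and keep only the candidates that have one. The following is a minimal sketch of that idea, not a prescribed method: `CandidateFeature`, `select_features`, and all of the feature names are hypothetical, made up here to mirror the two quizzes.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CandidateFeature:
    """A candidate input feature (hypothetical helper for illustration)."""
    name: str
    # Plain-language reason the feature should affect the objective,
    # or None when no reasonable hypothesis could be stated.
    hypothesis: Optional[str]


def select_features(candidates: List[CandidateFeature]) -> List[str]:
    """Return the names of candidates that come with a non-empty hypothesis."""
    return [c.name for c in candidates if c.hypothesis and c.hypothesis.strip()]


# Objective 1: number of customers who will use a discount coupon.
coupon_candidates = [
    CandidateFeature("ad_font", "A bigger or more trustworthy font is more likely to be seen."),
    CandidateFeature("item_price", "People may use a coupon more if the item is less expensive."),
    CandidateFeature("items_in_stock_count", None),  # customers cannot see the exact count
]

# Objective 2: whether a credit card transaction is fraudulent.
fraud_candidates = [
    CandidateFeature("purchased_here_before", "An unfamiliar purchase is more suspicious than a common one."),
    CandidateFeature("chip_reader_speed", None),  # no hypothetical relationship
    CandidateFeature("item_category", "Some categories, like televisions, likely see more fraud than T-shirts."),
    CandidateFeature("card_expiration_date", None),  # having the data point is not a reason
]

coupon_features = select_features(coupon_candidates)
fraud_features = select_features(fraud_candidates)
print(coupon_features)
print(fraud_features)
```

The filter itself is trivial on purpose. The useful part is that a feature cannot get into the list until someone has said, in words, why it should matter for the objective.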