So now, let's talk about the Netflix recommendation system. The idea is that we have some predictor, which for now is just a big black box. The predictor takes some input, which is what we know, and generates an output, which is the predictions: what we think you'd like. So the predictor is going to say, for example, "I think you're going to rate this movie four stars."

For Netflix, the input to the recommendation system is the set of ratings. Each rating is composed of a few different data points that are useful to us. The first one is the user ID, so who the person is: this is Bob, this is Alice, Charlie, whoever. Then the movie title: is this Good Will Hunting, is this A Beautiful Mind, is this The Lion King, whatever movie it is. Then the number of stars, so what star rating you gave, from 1 to 5: you rate it a 1, 2, 3, 4, or 5. The date of the rating is important too. Netflix does use that, but we're not going to look at it here.

The output is, first, a set of predicted ratings. You haven't watched this movie, but we're predicting you would rate it four stars, or 3.8 stars, or something like that. It doesn't have to be an integer: even though the input is always integers, the output won't always be. That's fine because, as the user, you won't actually see those predicted ratings. What you see is the recommendation. That's what's actually visible to users: we think you should watch movie X, movie Y. You watched Good Will Hunting, so we think you should watch A Beautiful Mind, or something along those lines.
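As a rough illustration of the input just described, one rating could be represented as a small record. This is only a sketch: the class name `Rating`, its field names, and the sample values (including the dates) are made up for illustration and are not Netflix's actual data format.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Rating:
    """One input data point: who rated which movie, how many stars, and when."""
    user_id: str       # e.g. "alice" (hypothetical identifier format)
    movie_title: str
    stars: int         # always an integer from 1 to 5 on the input side
    rated_on: date     # date of the rating; not used in this lecture

    def __post_init__(self):
        # Input ratings are whole stars between 1 and 5.
        if self.stars not in (1, 2, 3, 4, 5):
            raise ValueError(f"stars must be an integer 1-5, got {self.stars!r}")


# Made-up sample ratings, purely for illustration.
ratings = [
    Rating("alice", "Good Will Hunting", 4, date(2005, 3, 1)),
    Rating("bob", "A Beautiful Mind", 5, date(2005, 3, 2)),
]
```

Note that only the inputs are restricted to whole stars; a predicted rating such as 3.8 would be stored as a plain floating-point number instead.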
So they don't show you the predicted ratings; they show you the recommendations. The predicted ratings are not shown to you, and the recommendations are. That's just showing it graphically, though, and we actually need to deal more with the predicted ratings here. We can't evaluate the recommendations directly, so we have to evaluate the predictor for accuracy. We'll talk about that in a second.

Let's first look at this table to get an idea. On this side we have the input to the system, shown in tabular form. Down the rows are users, and across the columns are movies. So for this user and this movie, you can see a question mark here, which is something that we don't know. You can just read it off. For this user and this movie, you can come across and see that he rated this movie four stars, and so on. It's just a concise way of depicting the input. The question marks, again, denote the ones that we don't know: we don't know what this user rated this movie, because it hasn't been entered yet. It's not that a rating exists and we just can't see it; we don't know what this user thinks of this movie because he hasn't watched it yet, for instance.

Then we make a prediction, and the output is again a user-movie table, or at least we can look at it that way. But it holds just the predictions, the predicted ratings, for the movies each user hasn't watched: we predict 3.1 for this movie, 3.2 for this one, then 2.8, 4.5, 3.8, 3.2. And then we can use this data to say, for each user, what we think the person should watch. For this user up here, for instance, we see that there are two movies, with a 3.2 rating on this one and a 3.8 rating on this one, so we would say, okay, you should watch this movie, the one with the higher prediction. For this user down here, with 2.8 and 3.2, we'd say you should watch this movie.
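The step from predicted ratings to a recommendation can be sketched as picking, for each user, the unwatched movies with the highest predicted rating. The six predicted values below are the ones read out from the table, but the user and movie names and the way the values are assigned to them are assumptions, since the table itself is not reproduced here. The function name `top_k` is also hypothetical.

```python
def top_k(predicted, k=1):
    """Return the k movie titles with the highest predicted rating.

    `predicted` maps movie title -> predicted rating, and should contain
    only movies the user has not rated yet.
    """
    return sorted(predicted, key=predicted.get, reverse=True)[:k]


# Predicted ratings for unwatched movies only (layout is assumed).
predictions = {
    "user1": {"Movie A": 3.1},
    "user2": {"Movie B": 3.2, "Movie C": 3.8},
    "user3": {"Movie A": 2.8, "Movie D": 3.2},
    "user4": {"Movie B": 4.5},
}

for user, predicted in predictions.items():
    print(user, "should watch", top_k(predicted, k=1))
```

With `k=10` the same function would give a top-10 list instead of a single recommendation.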
For the first and last users it's trivial, because there are no other movies. But in general there's a ton of movies and a ton of users; this is just a really small depiction of the idea. And we don't have to recommend just one movie; we can recommend multiple. We could recommend, say, the top 10 movies a person hasn't watched yet.

Then we need to actually evaluate this system. Again, we can't evaluate it in terms of the actual recommendations, because then we'd have to ask people what they think of the recommendations. So we try to have a quantifiable metric, and the metric that we use is called the root mean squared error, RMSE. It's a mouthful. We'll go through how you actually compute it, and we'll walk through it step by step, but it's basically an error metric: the lower, the better. It measures the error in the predictions versus the actual values.

So really what we'll end up doing is withholding some of the values that we actually know. For instance, this five right here, we might hide from our system. We might hide this three and this three as well, then run the system with the values that we are assuming we know, and test how close its predictions for the hidden values are to what they really are. Then we can make an inference: if the RMSE is really low, so the system is really good at predicting those few values, we'll assume it's probably a pretty good system for the ones that we actually don't know, too. That works because the held-out values are treated the same way as unknown ones: the recommendation system doesn't get to see this value, this value, or this value. So we take these three out, and looking at this table right here, we predict values here, here, and here, and then compute the RMSE and see how good it is.
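As a preview of that computation, here is a minimal sketch of RMSE on held-out ratings: square each prediction error, average the squares, and take the square root. The hidden actual values (a five and two threes) come from the example above; the predicted values are made up for illustration, and the function name `rmse` is just a convenient label.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Ratings we hid from the predictor: the five and the two threes.
held_out_actual = [5, 3, 3]
# What the predictor guessed for those same cells (made-up numbers).
held_out_predicted = [4.5, 3.2, 2.6]

# Errors are -0.5, 0.2, -0.4; their squares average to 0.15,
# so the RMSE is sqrt(0.15), about 0.39.
print(round(rmse(held_out_predicted, held_out_actual), 2))
```

A perfect predictor would score 0 here, and larger values mean the predictions are further from the true ratings.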