Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.

From the course by University of Houston System

Math behind Moneyball

From the lesson

Module 8

You will learn how to use game results to rate sports teams and set point spreads. Simulation of the NCAA basketball tournament will aid you in filling out your 2016 bracket. Final 4 is in Houston!

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

Okay, let's continue our discussion of ratings and talk about how to predict the total score of a game and also how many points each team scores, instead of just predicting, like in the last video, how much one team beats another by. Because in football you can bet on the total points, the over and under. For the Patriots-Seahawks Super Bowl, say, the over and under number might have been 45 or 46, so you could bet on whether the total would be over or under that. Okay, in order to set up a model for this, we need a different set of changing cells than the last model. We need to have a changing cell for the average points scored in an NFL game. We need to have a home-edge changing cell like we did before, and then we need an offense and defense rating for each team. So, like, an offense rating of +5 would mean that team scores five points more than average.

Okay, so how would you predict how many points each team would score in a game? We also need a home edge, and we'll assume half of it goes to the offense and half to the defense.

So if the home edge was four points, let's say, it means you score two points more and give up two points less. Okay, so let's take another example. Let's suppose the Colts offense, pretty good with Andrew Luck, is seven points better than average, and their defense is one point worse than average. And let's suppose the Texans offense is two points better than average, and their defense, pretty good, is three points better than average. And let's suppose the home edge is four points.

And again, we don't know these things, but once we figure out how to make a prediction for each team's points based on these parameters, the Solver can choose these parameters to minimize the sum of squared errors. And let's suppose then the mean points in a game is 25. All right, so let's predict.

The game is where the rodeo is every March, so it's at the Texans, okay. So how many points would the Texans score? You start with the mean of 25, and they get half the home edge. Their offense is worth two points, and the Colts defense will make them score one more point. So you start with 25, add two points for being at home, two points for their offense, and one point for the Colts' defense: that would be 30 points for the Texans. Now, what about the Colts?

You start with that mean of 25, and they lose two points for being away. In other words, we're giving half that home edge to the Texans' defense. Now, the Colts offense would add seven, but the Texans' defense would knock that down by three. So that's 23, then 30, and that would be 27 points for the Colts there, okay?
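The arithmetic above can be written as a small function. This is a minimal Python sketch, not the lecture's spreadsheet; the name `predict_points` and its argument names are made up for illustration. It assumes the sign convention the example implies: a defense rating is points given up relative to average, so a good defense is negative.

```python
def predict_points(mean, home_edge, home_off, home_def, away_off, away_def):
    """Return (home forecast, away forecast) for one game.

    Half of the home edge is added to the home team's points and
    half is taken away from the away team's points.
    """
    home_pts = mean + home_edge / 2 + home_off + away_def
    away_pts = mean - home_edge / 2 + away_off + home_def
    return home_pts, away_pts


# Numbers from the lecture's Colts at Texans example.
texans, colts = predict_points(
    mean=25, home_edge=4,
    home_off=2, home_def=-3,  # Texans: offense +2, defense 3 better than average
    away_off=7, away_def=1,   # Colts: offense +7, defense 1 worse than average
)
print(texans, colts)  # 30.0 27.0
```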

So we predict the Texans by three there. And the reason for that is, if you look at this, the Colts are seven points better on offense and one point worse on defense, so the Colts overall are six points better than average.

And the Texans are what, they are five points better than average, so they'd be one point worse. But they're getting that four-point home edge, so they should win by three. And so this checks out. Okay, so now our changing cells are the average points in an NFL game, the home edge, and an offense and defense rating for each team. These are the right numbers, it turns out. And then the rating of the team will be offense minus defense. And that'll duplicate, actually, the ratings that we got in the last spreadsheet. Those of you who are real math gurus can actually prove that by writing down the partial derivatives and setting them to zero, which basically defines how to solve all these problems. Okay, but what we want to do is predict the home points, predict the away points, and minimize the squared errors, and that'll give us what we want. So we have the same data, the NFL 2013 season.
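One way to see why offense minus defense plays the role of a team's rating is to write both forecasts down and subtract them. The symbols here are not from the lecture: m is the mean, h the home edge, and o and d a team's offense and defense ratings, with d measured as points given up relative to average.

```latex
\begin{align*}
\hat{P}_{\text{home}} &= m + \tfrac{h}{2} + o_{\text{home}} + d_{\text{away}} \\
\hat{P}_{\text{away}} &= m - \tfrac{h}{2} + o_{\text{away}} + d_{\text{home}} \\
\hat{P}_{\text{home}} - \hat{P}_{\text{away}}
  &= h + (o_{\text{home}} - d_{\text{home}}) - (o_{\text{away}} - d_{\text{away}})
\end{align*}
```

The mean cancels, so the predicted margin is the home edge plus the difference between the two teams' offense-minus-defense ratings. In the Colts at Texans example that is 4 + 5 - 6 = 3.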

And what did we predict for the home team? Well, just look at what we did here in the Colts-Texans game. We'll predict the home team. I did an IFERROR because there are certain rows that have no numbers. You start with the mean, you add on half the home edge, and then you look up in the second column the home team's offense, and you look up in the third column the away team's defense. That's just what we did in the Colts-Texans game.

What's your forecast for the away team's points? You start with the mean. You take away half the home edge because, being an away team, they're going to score less. Then you look up in the third column the home team's defense, and look up in the second column the away team's offense, because you're trying to predict what the away team scores. And then you square the errors. You do an IFERROR, okay, just because of those stupid rows that have no data, where you put in a zero. You take, basically, home points minus home forecast, squared, plus away points minus away forecast, squared.

And when you put this in the Solver, you basically have the changing cells be the offense and defense rating for each team, the mean, and the home edge, and the average offense and defense rating should be zero. So let's change these numbers. I'm not a Patriots fan, so their offense is a -10 and their defense is a +5. That would make them pretty bad. Okay, since I'm teaching, I like the Colts and I like the Texans, so I'll make the Texans good, and I'll make the Colts good.
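The per-row spreadsheet formulas described above (look up the ratings, forecast both scores, square the errors, and fall back to zero on rows with no data) might look like this in Python. It is only a sketch: the table layout, the name `row_squared_error`, and the final score in the usage line are hypothetical, while the ratings are the ones from the Colts-Texans example.

```python
# Ratings table: team -> (offense, defense). Values from the lecture's example;
# defense is points given up relative to average, so negative is good.
RATINGS = {"Colts": (7.0, 1.0), "Texans": (2.0, -3.0)}
MEAN, HOME_EDGE = 25.0, 4.0


def row_squared_error(row):
    """Squared forecast error for one schedule row, or 0 for an empty row.

    The try/except plays the role of IFERROR in the spreadsheet.
    """
    try:
        home, away, home_pts, away_pts = row
        home_forecast = MEAN + HOME_EDGE / 2 + RATINGS[home][0] + RATINGS[away][1]
        away_forecast = MEAN - HOME_EDGE / 2 + RATINGS[away][0] + RATINGS[home][1]
        return (home_pts - home_forecast) ** 2 + (away_pts - away_forecast) ** 2
    except (KeyError, TypeError, ValueError):
        return 0.0


# Hypothetical final score, Texans 31, Colts 24, plus one blank row.
rows = [("Texans", "Colts", 31, 24), ("", "", None, None)]
print(sum(row_squared_error(r) for r in rows))  # 10.0
```

The forecasts are 30 and 27, so the errors are 1 and -3 and the row contributes 1 + 9 = 10; the blank row contributes nothing.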

Okay, now I'll make the home edge -10, which is ridiculous, and the average points will be 15. So the point is, I'll make it start with something that's going to make really bad forecasts. My sum of squared errors is 101,000. So now I go to Solver and minimize L6, that's the sum of the squared errors, changing the home edge, the mean, and the offense and defense ratings of each team. Set the average offense and defense ratings to zero. Don't check this box, or else everything has to be positive and the average can't be zero. You use this GRG Nonlinear and it should work out. We've got it. Let's see. Denver.
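For readers without Excel, the same minimization can be sketched in plain Python. This is not what Solver does internally: it swaps in ordinary gradient descent for GRG Nonlinear, and it meets the zero-average constraint by shifting the ratings at the end and folding the shift into the mean, which leaves every forecast unchanged. The three-team schedule, the scores, and the function names are all hypothetical.

```python
# Hypothetical toy schedule: (home, away, home points, away points).
GAMES = [
    ("A", "B", 27, 20), ("B", "A", 24, 23),
    ("A", "C", 31, 17), ("C", "A", 20, 28),
    ("B", "C", 24, 21), ("C", "B", 17, 23),
]
TEAMS = ["A", "B", "C"]


def sse(mean, edge, off, dfn, games):
    """Sum of squared forecast errors over all games."""
    total = 0.0
    for home, away, home_pts, away_pts in games:
        home_fc = mean + edge / 2 + off[home] + dfn[away]
        away_fc = mean - edge / 2 + off[away] + dfn[home]
        total += (home_pts - home_fc) ** 2 + (away_pts - away_fc) ** 2
    return total


def fit_ratings(games, teams, steps=5000, lr=0.01):
    """Least-squares fit of mean, home edge, offense and defense ratings.

    The step size lr is small enough for this toy schedule; a larger
    data set would need a smaller step or a proper optimizer.
    """
    mean, edge = 0.0, 0.0
    off = {t: 0.0 for t in teams}
    dfn = {t: 0.0 for t in teams}
    for _ in range(steps):
        g_mean = g_edge = 0.0
        g_off = {t: 0.0 for t in teams}
        g_dfn = {t: 0.0 for t in teams}
        for home, away, home_pts, away_pts in games:
            e_home = mean + edge / 2 + off[home] + dfn[away] - home_pts
            e_away = mean - edge / 2 + off[away] + dfn[home] - away_pts
            # Partial derivatives of the two squared errors for this game.
            g_mean += 2 * (e_home + e_away)
            g_edge += e_home - e_away
            g_off[home] += 2 * e_home
            g_dfn[away] += 2 * e_home
            g_off[away] += 2 * e_away
            g_dfn[home] += 2 * e_away
        mean -= lr * g_mean
        edge -= lr * g_edge
        for t in teams:
            off[t] -= lr * g_off[t]
            dfn[t] -= lr * g_dfn[t]
    # Recenter so offense and defense ratings each average zero.
    off_avg = sum(off.values()) / len(teams)
    dfn_avg = sum(dfn.values()) / len(teams)
    mean += off_avg + dfn_avg
    off = {t: v - off_avg for t, v in off.items()}
    dfn = {t: v - dfn_avg for t, v in dfn.items()}
    return mean, edge, off, dfn


mean, edge, off, dfn = fit_ratings(GAMES, TEAMS)
print("mean", round(mean, 2), "home edge", round(edge, 2))
for team in TEAMS:
    # Overall rating is offense minus defense, as in the lecture.
    print(team, round(off[team] - dfn[team], 2))
```

Starting from deliberately bad values, as the lecture does, should not change where a least-squares fit like this ends up, only how long it takes to get there.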

Run it one more time, it seems. It gets a little bit better there. If I run it a second time, it really shouldn't matter, because I know Arizona is supposed to be about a 6.45. Yeah, there it goes. So I guess it didn't quite get the right answer the first time. We've got Seattle as a 13.04, there they are. So, if I was going to predict the Super Bowl point total, it's a neutral field. And Denver got slaughtered, I understand that.

This is before the playoffs started, and after getting some playoff data we would have predicted differently, of course, but I'm not going to worry about that. So we would start with Denver. You'd start with that 23.4, and their great offense is 14.1 points.

Okay, so that's 23.4 + 14.1 - 8.93, so I've got Denver to score 29 points. And Seattle, we'd start with that 23.4, add Seattle's offense, 4.11.
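The Denver side of this neutral-field forecast can be checked in a few lines of Python. The variable names are made up, and reading the 8.93 as Seattle's defense rating is an inference from the arithmetic; it is consistent with Seattle's overall rating of 13.04 being offense minus defense. Denver's defense rating is not quoted in the lecture, so Seattle's points are not reproduced here.

```python
MEAN = 23.4              # fitted average points, as quoted in the lecture
DENVER_OFFENSE = 14.1
SEATTLE_OFFENSE = 4.11
SEATTLE_DEFENSE = -8.93  # inferred; negative means fewer points allowed than average

# Neutral field, so no home edge is added or subtracted.
denver_points = MEAN + DENVER_OFFENSE + SEATTLE_DEFENSE
seattle_rating = SEATTLE_OFFENSE - SEATTLE_DEFENSE

print(round(denver_points, 2))   # 28.57, which rounds to 29 points
print(round(seattle_rating, 2))  # 13.04
```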

Rounding off, we had 30 to 29. In the first spreadsheet, where we basically didn't round off the points, we had Seattle by two, but this would be Seattle 30 to 29, and predicted total points 59. Now, this would have changed after looking at the playoff games.

So this is a much more powerful method because it tells us, for instance, who had the best offense. Well, Denver, because Peyton Manning set that record for touchdown passes: 14 points better than average. But Seattle was the best defense. And they say defense wins championships, and in this case it was certainly right. Carolina had a really underrated defense that year. And Arizona got no credit for being a good team that year. I don't even think they made the playoffs. And the 49ers were a solid team that year. Okay, really [INAUDIBLE] they were almost as good as Denver.

We won't talk much about Jacksonville, they look pretty bad. And Cleveland was pretty bad. And Oakland was pretty bad. As were the Redskins, pretty bad. At least they were balanced: they were bad on both ends of the field. Four points worse than average on offense, five points worse than average on giving up points.

Okay, so that explains a way that we can predict the total score of a game, or how many points each team scores. We'll give you a different, more complex way to do that in the next video, because the method I've just shown you would not work for soccer. You might want to think about why.