Learn how probability, math, and statistics can be used to help baseball, football, and basketball teams improve player and lineup selection as well as in-game strategy.

From the course by University of Houston System

Math behind Moneyball

In the lesson

Module 5

You will learn basic concepts involving random variables (specifically the normal random variable, expected value, variance, and standard deviation). You will learn how regression can be used to analyze what makes NFL teams win and to decode the NFL QB rating system. You will also learn that momentum and the “hot hand” are mostly a myth. Finally, you will use Excel text functions and the concept of Expected Points per play to analyze the effectiveness of a football team’s play calling.

- Professor Wayne Winston, Visiting Professor

Bauer College of Business

In this video, we're going to review some basic concepts about

random variables that we'll need in the rest of the course.

So a random variable is just really any uncertain quantity.

And there are two types of random variables that will concern us.

Discrete random variables are ones where you can list the values.

For instance, how many three pointers does Curry hit in a game?

How many goals does the US World Cup team,

I'll say the women's World Cup team because that's coming up,

score in a game?

How many runs do the Yankees give up in the next game?

So those are all discrete random variables.

You can list the values.

Okay, now, a continuous random variable, which we'll talk about in a minute,

is basically any situation where there are a lot of possible values.

And even though something might be discrete,

if it has a lot of possible values we'll model it as continuous.

Okay, so for a continuous random variable, in reality,

any value on an interval can occur.

So there's an infinite number of values.

So a person's height or weight would be continuous,

because a person's height can be anything between 0 inches and 10 feet tall.

Same with a person's weight.

Now, some things that are really discrete, but that we'll treat as continuous, would be

this: how many points does Duke win an NCAA tournament game by?

Now see, that's an integer, you can't win by half a point, no fractions possible.

But you know it could be, let's say,

between -30 and +30 realistically, and when you have that many values,

we approximate a discrete random variable by a continuous random variable.

And how many points does a team beat the point spread by?

We'll assume that's a continuous random variable, and

the continuous random variable that will be most important to us will be the normal

random variable, which will pop up in the next video.

So for a discrete random variable, we want to know how to find the mean and

variance.

So any random variable has a mean and variance.

So, the mean is just the average value.

And the variance is the average squared deviation from the mean.

And there's the standard deviation, which is the square root of the variance.

So, a quick example of how to do this for discrete random variables.

And then we'll go on to talk about the important continuous random variable,

the normal random variable, which we will basically

need when we talk about streakiness and the hot hand.

It will pop up a lot throughout the class, and

the concept of the z-score will pop up a lot in statistics and analytics.

So we need to briefly talk about that and then we could get back to sports.

Let's suppose you're given how many yards you could gain on a football play.

You might gain on a pass 0 yards.

If you're throwing a pass up the middle, you might gain 9 yards

if the guy gets tackled; if he breaks the tackle, maybe 16 yards.

You might get sacked for, let's say, -7 yards.

So these are the values.

And the probabilities, let's just make some up.

So maybe there's a 30% chance you don't complete the pass.

There's a 40% chance that you throw it and the

guy gets tackled immediately.

Maybe there's a 15% chance he runs seven more yards.

And there's a 15% chance you get sacked, could be an interception, but

don't [INAUDIBLE].

So what's the mean?

You take the probability of each value times the value here at the bottom.

So for the expected value you can use SUMPRODUCT.

So you take 0.3 times 0 plus 0.4 times 9, and so on, and that would work out fine here.

SUMPRODUCT of the probabilities with the values.
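The expected-value calculation just described can be sketched in Python. This is a minimal illustration, not the lecture's spreadsheet: the lists `yards` and `probs` hold the made-up values from the example, and the helper `sumproduct` is a hypothetical name chosen to mirror what Excel's SUMPRODUCT function does.

```python
# Outcomes (yards gained) and the made-up probabilities from the lecture example.
yards = [0, 9, 16, -7]
probs = [0.30, 0.40, 0.15, 0.15]


def sumproduct(xs, ys):
    """Sum of pairwise products (hypothetical helper mirroring Excel's SUMPRODUCT)."""
    return sum(x * y for x, y in zip(xs, ys))


# Expected value = sum over outcomes of probability * value.
expected_yards = sumproduct(probs, yards)
print(round(expected_yards, 2))  # 4.95
```

Written out, the sum is 0.3(0) + 0.4(9) + 0.15(16) + 0.15(-7) = 0 + 3.6 + 2.4 - 1.05 = 4.95.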

Now, the variance is the average squared deviation from the mean, so basically,

I need to take the squared deviation from the expected value.

And so in each case, I take the value minus the mean,

dollar sign the mean (an absolute cell reference in Excel), and you square it because otherwise

the positive and negative deviations cancel out.

For any random variable, if you take the expected deviation from the mean without

squaring, you'll get zero.

You could check that.
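One way to check that claim on the passing example is sketched below. The variable names are assumptions made for illustration. The result follows because the probabilities sum to one, so the probability-weighted deviations add up to the mean minus the mean.

```python
# Outcomes (yards gained) and the made-up probabilities from the lecture example.
yards = [0, 9, 16, -7]
probs = [0.30, 0.40, 0.15, 0.15]

# Mean (expected value) of the discrete random variable.
mean = sum(p * x for p, x in zip(probs, yards))

# Expected deviation from the mean WITHOUT squaring.
expected_deviation = sum(p * (x - mean) for p, x in zip(probs, yards))

# Compare against a small tolerance to allow for floating-point rounding.
print(abs(expected_deviation) < 1e-9)  # True
```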

So when I take 0 minus 4.95 and square it, that's about 25.

Now to get the variance, you take the average of that, so

you take the probabilities times the squared deviation.

And then the standard deviation would be the square root of that.

About seven.
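The variance and standard deviation steps can be sketched the same way. This is an illustrative Python version under the example's made-up numbers, not the lecture's Excel workbook, and the variable names are assumptions.

```python
import math

# Outcomes (yards gained) and the made-up probabilities from the lecture example.
yards = [0, 9, 16, -7]
probs = [0.30, 0.40, 0.15, 0.15]

# Mean (expected value).
mean = sum(p * x for p, x in zip(probs, yards))

# Squared deviation of each outcome from the mean.
squared_devs = [(x - mean) ** 2 for x in yards]

# Variance = probability-weighted average of the squared deviations.
variance = sum(p * d for p, d in zip(probs, squared_devs))

# Standard deviation = square root of the variance.
std_dev = math.sqrt(variance)

print(round(squared_devs[0], 1))  # 24.5, the "about 25" for the 0-yard outcome
print(round(variance, 1))         # 53.6
print(round(std_dev, 2))          # 7.32
```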

Okay, so those were some basic concepts with discrete random variables.

On average, we gain about five yards.

Even though we would never gain five yards, what that average means is if you

would play this experiment out a thousand times like we did with Monte Carlo and

take the average of the yards gained, you'd get about 4.95.
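That long-run interpretation can be sketched as a small simulation. This is an assumption-laden illustration using Python's standard `random` module rather than the course's Excel Monte Carlo setup; the function name `simulate_average` and the seed values are hypothetical.

```python
import random

# Outcomes (yards gained) and the made-up probabilities from the lecture example.
yards = [0, 9, 16, -7]
probs = [0.30, 0.40, 0.15, 0.15]


def simulate_average(num_plays, seed):
    """Simulate num_plays pass plays and return the average yards gained."""
    rng = random.Random(seed)  # seeded generator so a run is reproducible
    outcomes = rng.choices(yards, weights=probs, k=num_plays)
    return sum(outcomes) / num_plays


# A thousand plays, as in the lecture; the average should land near 4.95,
# though any single run will differ somewhat.
print(simulate_average(1_000, seed=1))

# More plays generally bring the average closer to the expected value.
print(simulate_average(100_000, seed=1))
```

No single play ever gains exactly 4.95 yards; the number describes only the average over many repetitions.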

Okay, in the next video I want to introduce you to a really important

continuous random variable, a normal, or bell-shaped, random variable.

And we'll show how to calculate probabilities for a normal random variable

in Excel, because we will really need that later in the course.
