0:05

So we've discussed already techniques that can be used when we're working with

categorical data.

So in the context of the movies, we looked at movie genres,

romantic comedies, dramas, comedies, as different types of genres.

In the way it's presented in our data, though,

we might have coded that as genre number one, genre number two, genre number three.

That's one type of data that we're working with.

Quantitative variables make up another type of data that we're often

working with.

So think about the number of units sold, the prices charged for a product,

the volume of advertising that we're doing,

the number of products that we have on order, our inventory levels.

All of these variables that might appear in our data sets,

we can perform arithmetic on them, we can perform multiplication.

All of the addition, subtraction,

those types of operations that we are used to can be performed on that type of data.

What I want to discuss right now are the numeric techniques.

So how do we summarize that data numerically, and how do we describe the

relationships that may exist among quantitative variables?

We'll also discuss some visual techniques that can be used to understand

these relationships.

1:23

An example,

in finance, that often comes to mind involves the concepts of risk and return.

So if we think about a financial investment,

there's an expected return that we might earn on that investment.

But we're also taking on a degree of risk.

And typically, when we're looking for

a higher return we're willing to accept more risk.

Well, the return that we're getting kind of gives us the expectation.

The level of risk tells us how much volatility there is.

So in addition to being relevant to the context of financial investments,

we might also apply those same concepts of return and risk to customer valuation.

If we think about how casinos and hotels look at their guests,

some guests are the high rollers, the whales who are expected to spend a lot of

money during their visit, so a high expected return.

But from visit to visit we might see a lot of variation,

there might also be a high level of risk.

Now those customers are very different from customers who are predictable,

reliable in terms of their behavior where they might not spend as much but

what we do observe is that there's less volatility in their behavior.

We might observe that with product demand as well.

Some products are very popular, but those products may not always sell well,

so over the course of the year there might be a lot of volatility in those sales.

As opposed to more niche offerings that perhaps have a smaller interest base but

tend to be more reliable.

So, if we look at the financial context just as an example, these are stock

returns that have been pulled for Verizon in 2015 and are displayed as histograms.

So the height of the bars in these histograms

reflects how frequently a given return is observed.

So if we look, for example, at the 1-day returns in the upper left of this graph,

the height of the bars, particularly the ones near 0%, indicates a high

likelihood that on a given day we're going to see very small returns, but

occasionally we observe a 4% return.

We observe a -4% return.

So that spread that we're observing, that's reflecting the degree of risk, or

the degree of volatility that we're looking at in this data.
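
To make the idea of bar heights concrete, here is a minimal sketch of how a histogram counts returns into bins. The return values are made up for illustration and are not the actual Verizon data, and the helper name `histogram_counts` and the 1% bin width are assumptions chosen for this example.

```python
from collections import Counter

# Hypothetical daily returns in percent (illustrative values, NOT real Verizon data).
daily_returns = [0.1, -0.2, 0.3, 0.0, -0.4, 4.0, -4.0,
                 0.2, -0.1, 0.5, -0.6, 1.2, -1.1, 0.4]


def histogram_counts(values, bin_width=1.0):
    """Count how many values fall in each bin [edge, edge + bin_width)."""
    counts = Counter()
    for v in values:
        # Floor division finds the left edge of the bin containing v.
        left_edge = (v // bin_width) * bin_width
        counts[left_edge] += 1
    return dict(sorted(counts.items()))


counts = histogram_counts(daily_returns)

# Each row is one bar; the number of '#' marks is the bar height.
for left_edge, n in counts.items():
    print(f"[{left_edge:+.0f}%, {left_edge + 1.0:+.0f}%): {'#' * n}")

# The distance between the most extreme returns is one rough view of spread.
spread = max(daily_returns) - min(daily_returns)
print(f"spread: {spread:.1f} percentage points")
```

In this made-up sample most of the marks pile up in the bins next to 0%, while the bins holding the 4% and -4% returns have a single mark each, which is the same pattern described for the 1-day returns.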

So we're going to look at numerical methods to summarize

what the expected return is.

We'll also look at numerical methods to summarize

the degree of variability that we're observing.
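
As a preview, one common pair of numerical summaries is the mean for the expected return and the standard deviation for the variability. The lecture has not yet named its methods at this point, so treating these two as the summaries is an assumption, and the two return series below are hypothetical values chosen only to illustrate the contrast.

```python
import statistics

# Hypothetical daily returns in percent for two investments (made-up values).
volatile = [4.0, -4.0, 2.5, -1.5, 3.0, -2.0]
steady = [0.3, 0.2, 0.4, 0.3, 0.2, 0.4]


def summarize(returns):
    """Return (mean, sample standard deviation) for a list of returns."""
    return statistics.mean(returns), statistics.stdev(returns)


for name, series in [("volatile", volatile), ("steady", steady)]:
    avg, sd = summarize(series)
    print(f"{name}: mean = {avg:.2f}%, std dev = {sd:.2f}%")
```

The two series have similar average returns, but the first has a much larger standard deviation, so under these assumptions it would be the riskier of the two.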