This course will expose learners to additional tools that can be used to perform Data Visualization. In particular, the course focuses on Tableau, a state-of-the-art visualization package. In this course, the visualization concepts from previous courses are reinforced and the Tableau software is introduced through replication of the visualizations built in previous courses.

Associate Professor at Arizona State University in the School of Computing, Informatics & Decision Systems Engineering and Director of the Center for Accelerating Operational Efficiency

Hello, and welcome to another Tableau demonstration.

In this demonstration, we'll be covering

the time-series data visualizations that we had created in Jupyter Notebook.

Specifically, we'll be creating a control chart and a moving average chart.

In this case, we'll be using the same data set that we used for Jupyter Notebook,

specifically the daily temperatures in Melbourne data.

Since that data is in a CSV file,

we'll open a text file in Tableau,

the daily minimum temperatures,

and see what that's about.

Just like we had for the cereals data,

this window shows us the data source individually in case we need

to create any derived columns or monkey with the data in any way.

However, we're going to use the data as is,

so let's go straight to the first workbook.

To create the control chart, first,

we're going to drag the date dimension to the columns,

and then we're going to drag the daily minimum temperatures measure to the rows.

Here, we can see that Tableau has done something a little bit weird.

It has aggregated all of the dates by their year and

then summed the daily minimum temperatures over that entire year,

which definitely isn't what we want.

So, let's go to the columns field and access this drop down that allows us

to modify the way that Tableau does the aggregation.

Here, we have a couple options.

We can aggregate by year,

aggregate by quarter, month, or day,

or we can aggregate by quarter and year,

month and year, et cetera.

Since we want to see each day individually,

we're going to select the individual day aggregation.

Here, we can see that this graph is a lot closer to what we expected having seen

the data previously in our Jupyter Notebook example.
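
For comparison, the difference between Tableau's default view and the per-day view can be sketched in pandas. This is only an illustration: the series below is synthetic stand-in data, not the Melbourne file, and the variable names are made up for the example.

```python
import pandas as pd

# Synthetic stand-in for the temperature data: two years of daily values.
dates = pd.date_range("1981-01-01", "1982-12-31", freq="D")
temps = pd.Series(10.0, index=dates, name="temp")

# What Tableau did by default: one mark per year, summing the measure.
yearly_sum = temps.groupby(temps.index.year).sum()

# What we want instead: one mark per individual day.
daily = temps.groupby(temps.index.normalize()).sum()

print(len(yearly_sum), "yearly marks versus", len(daily), "daily marks")
```

With only one reading per day, the per-day aggregation returns the original values, which is why the daily view matches what we plotted in Jupyter Notebook.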

Now we need to add the mean and standard deviation lines to our graph.

In this case, this is done by adding what are called reference lines in Tableau,

and the way that we add a reference line is by right-clicking

on the y-axis and selecting Add Reference Line.

To start out with, we're going to add the mean reference line to our graph,

and this is extremely easy to do in Tableau because that

is the default reference line that they'd like you to add.

So, it's a line, it's computed over the temperatures,

and we want the average.

We don't need to change the way that the line looks because we just want the line.

So, let's click OK and see what that gives us.

Here, we can see that Tableau has added an average line to our graph,

and the average falls at 11.19 which is exactly what we'd expect.

Now let's add the standard deviation lines to the graph.

We're going to add another reference line.

But in this case, the standard deviation doesn't come up

in the possible computations for our lines.

What we have to do is add a distribution,

and compute that distribution from the standard deviation.

The factors of negative one and one are correct because we want the lines to sit

one standard deviation below and one standard deviation above the mean.

However, Tableau by default has a sort of distribution band

here between the positive one and negative one standard deviation,

but we don't really want that.

For our control chart, what we're looking for is individual lines.

So, we can change the formatting to a solid line and make

it orange just like we made the line in Jupyter Notebook,

and remove the fill.

This gives us our one standard deviation line.

To add our two standard deviation lines,

we add another reference line as another distribution.

But in this case, we're going to go to

standard deviation and our factors will be negative two and two.

Again, we're going to reformat the line into

a solid red line, remove the fill, and click OK.

This duplicates our control chart graph that we created in

Jupyter Notebook with the data appearing on the graph,

the standard deviation lines,

the two standard deviation lines, and the average.
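
As a rough cross-check of what the reference lines represent, here is a minimal sketch of the same numbers computed with NumPy. The function name `control_limits` and the sample values are made up for illustration, and the sketch assumes the sample standard deviation (`ddof=1`); set `ddof=0` if the population form is wanted.

```python
import numpy as np

def control_limits(values, factors=(1, 2), ddof=1):
    """Return the mean and a (lower, upper) pair for each sigma factor."""
    arr = np.asarray(values, dtype=float)
    mean = arr.mean()
    sd = arr.std(ddof=ddof)  # ddof=1 is an assumption, see the note above
    bands = {k: (mean - k * sd, mean + k * sd) for k in factors}
    return mean, bands

# Hypothetical temperatures, not the Melbourne data.
mean, bands = control_limits([8.0, 10.0, 12.0, 14.0])
print("mean:", mean)
for k, (lower, upper) in bands.items():
    print(f"{k} standard deviation(s): {lower:.2f} to {upper:.2f}")
```

Each factor pair in the Tableau dialog, negative one and one or negative two and two, corresponds to one `(lower, upper)` pair here.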

Next, let's take a look at what our moving average graph looks like in Tableau.

So, we'll create a new worksheet to start from scratch,

and again drag date to columns and daily minimum temperature to rows.

Again, Tableau has chosen by default to aggregate over the year,

so we have to change it back to aggregating over individual days.

Tableau refers to things like the moving average calculation as a table calculation.

So, in our daily minimum temperature,

we're going to add a table calculation.

However, moving average is

a well-defined and common calculation so it is one of Tableau's quick table calculations.

So, let's just click moving average and call it a day.

However, this moving average doesn't really

look like the moving average graph that we had in Jupyter Notebook.

It's a lot spikier than the one that we created using a 50-day moving window.

So, let's take a look at the table calculation that was

performed to learn more about what Tableau does by default.

In this case, we can see that

the calculation type is a moving calculation, which is correct.

It's an average, which is correct.

But here, it only uses the previous two elements, which is

very different from the 50 that we used in Jupyter Notebook.

Accessing this drop down,

we can adjust the number of previous values, or even next values, that Tableau uses for

this calculation and change it to be more in line with what we had in Jupyter Notebook.

After changing the moving average to 50,

we can see that the resulting

graph resembles much more closely what we found from Jupyter Notebook,

and is a lot more familiar.
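
To see why the number of previous values matters so much, here is a small pandas sketch of a trailing average. It assumes the current value is included in the window, so "previous 2" means a window of three values. The helper name `trailing_average` and the data are invented for the example.

```python
import pandas as pd

def trailing_average(series, previous):
    """Average the current value with up to `previous` earlier values."""
    # min_periods=1 averages whatever is available at the start of the series.
    return series.rolling(window=previous + 1, min_periods=1).mean()

s = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
short = trailing_average(s, previous=2)   # small window, follows the data closely
longer = trailing_average(s, previous=5)  # larger window, much smoother
print(short.tolist())
print(longer.tolist())
```

The smaller the window, the more closely the average tracks each individual value, which is the spikiness we saw before changing the setting to 50.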

If you remember the Jupyter Notebook example,

we had both a NumPy calculation of moving

average and a pandas calculation of exponentially weighted moving averages.

One of the differences between the two was a ramp up and a

cool down in the NumPy calculation that didn't exist in the pandas calculation.

Here, we can see that the ramp up and the ramp down are missing from this calculation,

which indicates that Tableau performs

its calculations similar to the way that pandas does,

neglecting missing elements of

the calculation instead of just assuming that those values are zero.
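
The edge behaviour can be illustrated with a short sketch. It assumes the NumPy version was a zero-padded convolution such as `np.convolve` with `mode="same"`, and it uses a pandas rolling mean with `min_periods=1` as a stand-in for a calculation that skips missing elements. Neither is confirmed as the exact code from the notebook, and the constant series is synthetic.

```python
import numpy as np
import pandas as pd

values = np.full(10, 10.0)  # a constant series makes edge effects obvious
window = 4

# Zero-padded convolution: positions past the ends count as zeros,
# which drags the average down at the edges (the ramp up and cool down).
padded = np.convolve(values, np.ones(window) / window, mode="same")

# Shrinking window: only values that actually exist are averaged,
# so the result stays at 10 all the way to the edges.
shrinking = pd.Series(values).rolling(window, min_periods=1).mean().to_numpy()

print(padded)
print(shrinking)
```

On a constant series the true average is 10 everywhere, so any dip at the ends comes purely from treating the missing values as zero.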

This concludes the example of duplicating

the time series graphs that we created in Jupyter Notebook in Tableau.

I hope that this was useful for you.
