Data repositories in which cases are related to subcases are identified as hierarchical. This course covers the representation schemes of hierarchies and the algorithms that enable analysis of hierarchical data, and it provides opportunities to apply several methods of analysis.

Associate Professor at Arizona State University in the School of Computing, Informatics & Decision Systems Engineering and Director of the Center for Accelerating Operational Efficiency

K. Selcuk Candan

Professor of Computer Science and Engineering and Director of ASU’s Center for Assured and Scalable Data Engineering (CASCADE)

In this module, we want to talk about different ways

of visually representing temporal data series.

In previous parts of these modules,

we've talked about methods for doing

temporal analysis to detect things that are important and to discover time series motifs.

We also want to think about how we can visually represent this data.

And perhaps the most common way to visualize time series data,

is just through an x, y plot.

And this is looking at linear time.

So we have a 2D line graph with time on the x axis,

and value on the y axis.

We could also switch those if we want,

but this is perhaps the most common time series plot you're going to see.

And here we're plotting the amount of interstate commerce,

railroad commerce or deed titles by year.

So we can start seeing things here.

We had a whole lot of railroad commerce between 1869 and 1927.

We can start seeing interstate commerce picking up in the 1950s,

and we can start seeing some things trail off over time as well.

And so this linear time,

allows us to explore these changes in a fashion.

It starts letting us think about stories behind that.

We can see the railroad boom,

interstate boom, and those sorts of things.

But we're also seeing dips and peaks, and so we might

start wondering if there's some sort of cyclical pattern within this as well.

And this sort of graph is just very well known to people.

We can also take these graphs and just compress them into an intense,

simple, word-like graphic, and we call these sparklines.

And these sparklines give us an idea about trends in the data.

So for example, if this might be data for a patient,

we can show a patient how their glucose level has been

trending over time, respiration patterns,

temperature, WBC and they

can look and see and think why might there have been spikes here,

or what do these spikes in temperature mean as people level off.

Now, we can't read the exact values on these sparklines,

but we can get trends and ideas and start looking

for patterns and different elements within those as well.
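
As a rough sketch of the sparkline idea, here is a text-only version in Python. It assumes we simply map each value to one of eight Unicode block characters according to where it sits between the minimum and maximum of the series. The function name `sparkline` and the glucose readings are made up for illustration; they are not from the lecture.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight heights, lowest to highest


def sparkline(values):
    """Return a word-sized string showing the trend of `values`.

    Each value is scaled to the series' own min/max, so exact values
    cannot be read back, only the shape of the trend.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series: draw every point at the lowest height.
        return BLOCKS[0] * len(values)
    chars = []
    for v in values:
        idx = int((v - lo) / span * (len(BLOCKS) - 1))
        chars.append(BLOCKS[idx])
    return "".join(chars)


glucose = [92, 95, 110, 140, 128, 101, 96]  # hypothetical readings
print(sparkline(glucose))
```

Because every series is scaled to its own range, two sparklines side by side show shape, not magnitude, which matches the point that we read trends rather than exact values.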

So, we can think about how we can represent our different visuals.

Then we can start even thinking about how we could

expand on these common temporal themes.

One of those expansions is what we would call Theme River.

Theme River was developed back in 2002, and the

idea was to visualize thematic changes in large document collections.

So essentially, looking at how different themes might trend over documents.

So I have a bunch of documents from let's say July and August,

and every day I get more documents in.

So in that collection, I may count

how many documents have to do with

NATO on a given day or how many have to do with Germany.

Essentially, I can plot that over time.

With Theme River, what we're doing is we're plotting these on top

of each other to try to show this sort of area contribution.
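
A minimal sketch of the stacking step, assuming we already have one count per theme per day. The layers are stacked on top of each other and the whole stack is centered on a zero axis, which gives the river-like shape. This is a simplification: the published Theme River layout also smooths the boundaries, which is not attempted here. The function name `stack_symmetric` and the counts are hypothetical.

```python
def stack_symmetric(layers):
    """Stack per-theme series and center the stack around zero.

    layers: list of series (one per theme), all the same length.
    Returns one (lower, upper) pair of boundary series per theme;
    the thickness upper - lower equals that theme's count.
    """
    n_steps = len(layers[0])
    totals = [sum(layer[t] for layer in layers) for t in range(n_steps)]
    baseline = [-total / 2 for total in totals]  # bottom edge of the river
    bands = []
    for layer in layers:
        upper = [b + v for b, v in zip(baseline, layer)]
        bands.append((baseline, upper))
        baseline = upper  # the next theme sits on top of this one
    return bands


nato = [2, 5, 3]      # hypothetical daily document counts
germany = [4, 1, 3]
bands = stack_symmetric([nato, germany])
for name, (lower, upper) in zip(["NATO", "Germany"], bands):
    print(name, lower, upper)
```

Each pair of boundary series can be handed to any area-fill routine to draw one colored band.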

This was used for a very popular example in

the New York Times to show sort of movie trends in box office revenue over time.

So essentially, we can think about a new movie.

So let's say this is our time.

So it's July and a new movie all of a sudden pops up,

and then typically movies trail off over time.

The next weekend, another new movie pops up,

trails off over time and trying to think about how we can organize these and

stack these elements to be able to show these changes over time,

the emerging of trends,

the loss of other elements in the data set.

Likewise, we can think about how to save screen real estate for our time series graphs.

So, if I've got my line chart, where again,

I might have time on the x axis and value on the y axis.

I can think about how to save some different elements.

What I can do is,

if this is my value and this is my time,

I can try to split these by drawing different lines.

So as my graph goes through these lines, what I can do,

is I can cut off these top chunks and replace them here.

And so essentially, I can interpret this set of

layered bands as exactly this where I'm adding up these different elements.

We can even do things like mirroring the negative values to try to

save space or offsetting the negative values in this way.
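
The band-cutting step can be sketched as follows, assuming a fixed band height and a fixed number of bands. Each value is cut into layers that would be drawn on top of each other in darker shades, and negative values are mirrored by banding their magnitude and remembering the sign. The name `horizon_bands` and the sample values are assumptions for illustration.

```python
def horizon_bands(values, band_height, n_bands):
    """Cut each value into at most n_bands layers of height band_height.

    Returns one (sign, layers) tuple per value. `sign` is -1 for
    mirrored negative values and 1 otherwise. Any magnitude above
    band_height * n_bands is clipped.
    """
    result = []
    for v in values:
        sign = -1 if v < 0 else 1
        remaining = abs(v)
        layers = []
        for _ in range(n_bands):
            h = min(remaining, band_height)  # how much of this band is filled
            layers.append(h)
            remaining -= h
        result.append((sign, layers))
    return result


series = [5, 12, 27, -8]  # hypothetical values
banded = horizon_bands(series, band_height=10, n_bands=3)
for v, (sign, layers) in zip(series, banded):
    print(v, sign, layers)
```

Adding the layers back up and applying the sign recovers the original value, which is exactly how a reader is expected to interpret the layered bands.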

So people were trying all sorts of different visual metaphors

to see how well people could interpret this sort of data element.

Now that was all sort of linear time visualizations,

but we can also think about cyclical visualizations.

Trends are easily seen in a linear plot, but repeating patterns aren't.

So, think back to our first example here.

It looks like there may be cyclical patterns in the data, but those are very,

very hard to tell if they actually exist

there, and we may just be imposing our beliefs on top of this.

We can see this sort of rise and fall of railroad commerce very clearly.

Picking out cyclical patterns becomes very difficult in the data set.

So, what we wanted to do, is think about,

how we might use spirals to represent this idea of

repetition and finding periodicity may be

easier than looking at say a line graph or a bar chart.

So what people did is they used ideas like an Archimedean spiral,

or a logarithmic spiral to encode data over time

where each chunk of my spiral is going to be a day.

So, for example, say we use a nine-day spiral,

so I say the pattern is going to repeat every nine days.

Now, if the data is by day,

a nine-day period doesn't really make sense,

so we don't really see any patterns emerging.

But once we do this by seven,

we see that for these two days which might be the weekend,

we get very low data values almost consistently repeating in this sort of pattern.

So we can start looking for different spirals for

this periodic data and how we can begin exploring and interpreting this.
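
Here is a small sketch of the spiral layout, assuming an Archimedean spiral where one full turn holds exactly one period of daily points. Points that are one period apart then line up on the same spoke, so averaging along spokes is a quick check of whether the chosen period fits the data. The function names and the four weeks of data are hypothetical.

```python
import math


def spiral_positions(n_points, period, spacing=1.0):
    """Place point i on an Archimedean spiral (r = a + b * theta).

    One full turn holds `period` points; the radius grows by
    `spacing` per turn, so point i and point i + period share an angle.
    """
    positions = []
    for i in range(n_points):
        theta = 2 * math.pi * i / period
        r = spacing * (1 + i / period)
        positions.append((r * math.cos(theta), r * math.sin(theta)))
    return positions


def mean_by_phase(values, period):
    """Average the values that fall on the same spoke of the spiral."""
    sums = [0.0] * period
    counts = [0] * period
    for i, v in enumerate(values):
        sums[i % period] += v
        counts[i % period] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# Hypothetical daily values: five busy days, then two quiet days, for four weeks.
data = [10, 11, 9, 10, 12, 2, 1] * 4
by7 = mean_by_phase(data, 7)
by9 = mean_by_phase(data, 9)
print("period 7 spread:", max(by7) - min(by7))
print("period 9 spread:", max(by9) - min(by9))
```

With the seven-day period the two quiet days stay on their own spokes and the spread between spokes is large; with nine days the quiet days are smeared across spokes and the spread shrinks.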

People have also thought about using what we would call a calendar based visualization.

So, taking a layout where each block

is a day and trying to think about how we might organize this.

We even do this in sort of two dimensions as well.

So I can organize my days of the week Sunday through Saturday,

Sunday through Saturday so showing a two week cycle.

So then I can look for periodicity on Sundays and Mondays and so forth.

I can even add histograms that are counting up the sum of the rows.

So I can look for patterns that way.

I can sum up the sums of the columns looking for patterns that way.

I can encode data values as colors.

So the example here is a calendar based visualization of crime.

So on Sunday, January 1st,

we had three crimes,

Monday we had two and we can quickly see which days had the highest crime based on color.

We can look and see if there might be any repeating patterns as well in the data.
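
A sketch of the calendar layout, assuming daily counts and a Sunday through Saturday week. It arranges the counts into rows of weeks and computes the row and column sums that the side histograms would show. The year 2017 is an assumption, picked only because January 1st falls on a Sunday in that year, and the counts after the first two days are invented.

```python
from datetime import date


def calendar_grid(start, counts):
    """Lay out daily counts as rows of weeks, columns Sunday..Saturday.

    Cells before the start date or after the last count are None.
    """
    # date.weekday() gives Monday=0 ... Sunday=6; shift so Sunday=0.
    offset = (start.weekday() + 1) % 7
    cells = [None] * offset + list(counts)
    cells += [None] * (-len(cells) % 7)  # pad the last week
    return [cells[i:i + 7] for i in range(0, len(cells), 7)]


def row_sums(grid):
    """Total per week (one histogram bar per row)."""
    return [sum(c for c in row if c is not None) for row in grid]


def column_sums(grid):
    """Total per day of week (one histogram bar per column)."""
    return [sum(row[d] for row in grid if row[d] is not None) for d in range(7)]


crimes = [3, 2, 0, 1, 4, 5, 2, 1, 0, 2, 1, 3, 6, 4]  # hypothetical counts
grid = calendar_grid(date(2017, 1, 1), crimes)
print(grid)
print(row_sums(grid))
print(column_sums(grid))
```

The column sums are where a day-of-week pattern would show up; the cell values themselves are what would be encoded as color.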

So, all sorts of different methods,

but we can even think about just the Google calendar itself to some extent as

a calendar visualization trying to show us an organization of our events and elements too.

So we don't just have to do data counts,

we may have events where we want to show sequences of events.

So, we have to think about how we can extract what's important,

show different elements, try to encode these,

whether we need text and items there.

So we have to think about what's

the appropriate visualization for our challenge or our questions.

We can even think about things like a spiral calendar where people have

tried using projections in

this 3D space, using shadows on 3D walls and spots in the back end.

So here, they encoded time to this axis,

so different people and different hours of the day.

So, each time you might have met with a person on your Google calendar,

it would be encoded on that axis,

the hour you met with them and then the date.

So you might try to see if things are stacking up here in the middle,

seeing these different projections to try to look for

patterns and interpret things in this manner.

Then people have also looked at things like time wheels for things.

In the center, we have a time axis,

then we have variables on the other axes,

and we use lines to connect the time and variable values.

So similarly, we might think of this as like a parallel coordinate plot sort of idea,

but trying to look for connections in this way. And then people have even thought about

what different artistic interpretations of time series data might look like.

So, in the UIST symposium in 1999,

they thought about People Gardens,

so creating data portraits for users.

So encoding elements where

each plant stem was the time of day and then what was going on at that time of day was

encoded into different petals and so trying to show what

your daily people garden looked like on who you were meeting and those sorts of elements.

So we don't just have to stick to our ideas of this sort of

linear time visualization with time on the x axis value on the y axis.

We can try all sorts of different visualizations.

Some may be difficult to interpret like

this People Garden may be hard to interpret quickly,

but could be fun and beautiful, and a good way to sort of give a quick overview of things.

We may want to use spiral graphs to show repeating patterns,

or we may want to try 3D views to try

to incorporate multiple variables for our time series.

So there's lots of different methods and things we want to think about,

and we also need to encode other elements that we

talked about in previous modules like aspect ratio,

graph labels, legends, and all these sorts of things in our data as we go along.

Other things have included Arc diagrams.

So Martin Wattenberg introduced this concept of

Arc diagrams to help show sort of repeating patterns.
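
To make the idea of connecting repeated patterns concrete, here is a simplified sketch. It assumes a fixed pattern length and links each occurrence of a subsequence to its next occurrence; each link is one arc to draw. This is not the matching procedure from the original arc diagram work, which handles overlapping and nested repeats more carefully. The function name and the sequence are hypothetical.

```python
def repeated_arcs(sequence, length):
    """Find arcs between consecutive occurrences of each subsequence.

    Returns (start_a, start_b, pattern) tuples, meaning the pattern
    starting at index start_a repeats at index start_b.
    Overlapping repeats are not filtered out in this sketch.
    """
    last_seen = {}
    arcs = []
    for i in range(len(sequence) - length + 1):
        pattern = tuple(sequence[i:i + length])
        if pattern in last_seen:
            arcs.append((last_seen[pattern], i, pattern))
        last_seen[pattern] = i
    return arcs


notes = list("CDECDEFG")  # hypothetical sequence of symbols
arcs = repeated_arcs(notes, 3)
print(arcs)
```

In a drawing, each tuple becomes a semicircle from the first position to the second, with a width equal to the pattern length.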

He used this for songs.

Why is this for songs?

Well again, music plays over time.

So he was looking at what notes were played,

looking at repeating structures of chords being played and

so forth, and encoded this into what we call an Arc diagram.

And then we also need to think about ordered time versus branching time.

So again, ordered time,

things happen one after another.

But we may also want to be working with some sort of

simulation that allows people to consider what if.

So if I have an intervention at this point in time, what might happen?

So can I allow people to compare alternate scenarios and alternate timelines as well.

So there's a nice paper on visual methods for analyzing time oriented data,

that gives us an overview of all these different types of

time oriented analysis and the results.

There's also a nice paper called Worldlines for looking at these what if scenarios.

So this is a flooding simulation,

so you have this make-believe city with floods coming

in and I can put sandbags in different locations.

So at this point in time,

if a levee burst and I decide to put sandbags in a particular location,

what happens to the city?

If I don't do anything. What happens?

If I do something else what happens?

And I'm exploring these and how they track over time,

with doing different interventions to try to

understand this sort of branching time scenario.
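
A toy sketch of branching time, assuming a deliberately simple model in which the water level rises by a fixed inflow each step and an intervention reduces that inflow from some step onward. It is not the simulation from the paper; every name and number here is invented. The point is only the data structure: several timelines that start from the same state and diverge where the decision is made.

```python
def simulate(level, steps, inflow, intervention=None):
    """Run one timeline of a toy flood model.

    intervention: None, or a (step, reduction) pair meaning the inflow
    is reduced by `reduction` from that step onward.
    Returns the water level at every step, including the start.
    """
    history = [level]
    for t in range(steps):
        gain = inflow
        if intervention is not None and t >= intervention[0]:
            gain -= intervention[1]
        level = max(0, level + gain)
        history.append(level)
    return history


def branch(level, steps, inflow, options):
    """Run one timeline per named option from the same starting state."""
    return {name: simulate(level, steps, inflow, opt)
            for name, opt in options.items()}


timelines = branch(level=2, steps=4, inflow=3, options={
    "do nothing": None,
    "sandbags at step 1": (1, 2),
    "sandbags at step 3": (3, 2),
})
for name, history in timelines.items():
    print(name, history)
```

All three timelines agree until their branching point, which is what lets a viewer line them up and compare the what-if outcomes.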

We saw this with work from Shehzad Afzal as well on decision support environments.

So here we had a disease simulation,

where we're looking at cattle getting sick and you could use

quarantines or try to spray for mosquitoes to prevent diseases.

So doing different quarantine methods at different points in time to see sort

of the number of lives that would be saved over

the entire simulation to compare different decision paths.

Here again, you might do some sort of intervention,

and you might seem to save lives early on but then have

a drastic decrease and then a resurgence in healthy cattle.

So again, thinking about how we can explore these,

where here, when you did an intervention,

you had a very good early result, but then by the time you were done,

you had more lives lost than if you had done nothing at all.

So being able to explore these different branching paths,

can be critical as well for time series analysis and visualization.
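
One way to compare two decision paths, sketched under the assumption that each path is just a series of healthy-animal counts over the same time steps. We take the step-by-step difference against the do-nothing path and find the first step where the intervention falls behind. The function name and both series are hypothetical.

```python
def compare_paths(baseline, alternative):
    """Compare an alternative decision path against a baseline.

    Returns (diff, first_worse): diff[t] is alternative - baseline at
    step t, and first_worse is the first step where the alternative is
    below the baseline (None if that never happens).
    """
    diff = [a - b for a, b in zip(alternative, baseline)]
    first_worse = next((t for t, d in enumerate(diff) if d < 0), None)
    return diff, first_worse


no_action = [100, 95, 90, 85, 80]    # hypothetical healthy cattle per step
quarantine = [100, 98, 96, 80, 70]   # looks better early, ends worse
diff, first_worse = compare_paths(no_action, quarantine)
print(diff, first_worse)
```

Looking only at the early steps would favor the intervention here, which is why the whole path, not just one point in time, needs to be visible.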

So with all of these sorts of views,

we really want to think about how we can design good visualizations.

So we want to show familiar visual representations whenever possible.

If we're doing time series data,

the most familiar one is probably our linear time series view,

and what's nice about that is we can provide side by

side comparisons of small multiple views.

So if I'm showing multiple stocks,

I don't have to put all of the stocks on one graph.

Because you can see the problem of this line I drew down here,

if this is a stock that doesn't make much money and I'm trying to compare it

to Amazon that has a huge valuation,

it becomes difficult to tell if they might be trending together.

I could provide side by side comparisons with rescaled axes.

The spatial position of where I put different time series in a set of small multiples

could also be drastically important.
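
A minimal sketch of rescaling for small multiples, assuming min-max scaling so that each panel uses its own axis range. Two stocks with very different price levels then become directly comparable in shape. The function name and the prices are made up for illustration.

```python
def rescale(series):
    """Min-max rescale a series to the range [0, 1].

    A constant series maps to all zeros.
    """
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0.0 for _ in series]
    return [(v - lo) / (hi - lo) for v in series]


small_stock = [2, 3, 2.5, 5]              # hypothetical prices
large_stock = [2000, 3000, 2500, 5000]    # same shape, 1000 times larger
panels = {
    "small": rescale(small_stock),
    "large": rescale(large_stock),
}
print(panels["small"])
print(panels["large"])
```

On a shared axis the small stock would look like a flat line at the bottom; after rescaling, the two panels show that the series move together. The trade-off is that magnitudes can no longer be compared across panels, so the axis labels matter.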

Multiple views are more effective when coordinated through explicit linking.

So for example, if I brush here,

maybe I also want to highlight the exact same time on our other graphs.
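
The linking itself can be sketched very simply, assuming each view holds (time, value) points and a brush is just a time interval. One brushed interval is applied to every view, even when the views are sampled at different times. The function name and the two views are hypothetical.

```python
def linked_highlight(views, start, end):
    """Apply one brushed time interval to every view.

    views: dict mapping a view name to a list of (time, value) points.
    Returns, per view, the points whose time lies in [start, end].
    """
    return {
        name: [(t, v) for t, v in points if start <= t <= end]
        for name, points in views.items()
    }


views = {
    "stock_a": [(1, 10.0), (2, 11.0), (3, 9.5), (4, 12.0)],
    "stock_b": [(1, 200.0), (3, 210.0), (5, 190.0)],
}
selected = linked_highlight(views, start=2, end=4)
print(selected)
```

In an interactive tool the returned points would be the ones drawn in the highlight color in each panel.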

We also want to think about avoiding abrupt visual changes as well.

So with all of these,

hopefully this gives you a flavor of

the different possibilities of ways to visualize time series data.

Many of these algorithms are available through JavaScript implementations,

Python, and those sorts of things.

We'll also be talking about Tableau implementations as well,

but thinking about how we combine different elements,

link these together to provide users with

a broad story and a broad means of visualizing temporal data is critical. Thank you.
