Most business people are quite familiar with standard Excel spreadsheets. They present lots of information and are great for spot checking and for looking up specific measures or metrics. But if you're looking to visualize data from large data warehouses, this type of data presentation is not very effective. That is why companies and individuals rely on visualization tools to help make sense of voluminous amounts of data, and it's what we will cover in this lesson.
Information visualization is the use of visual representation to explore, make sense of, and communicate data. This sector is growing rapidly for three main reasons. First, companies are becoming more and more interested in data-driven approaches. Second, inexpensive hardware, sensors, and do-it-yourself frameworks are driving down the cost of collecting and processing data. Third, a diverse community of software programmers and infographics designers has assembled online. They are disseminating all sorts of new applications, software tools, and low-level code libraries for the data visualization and infographics design industry.
We call this self-service approach data discovery. Individuals and companies no longer have to rely on IT departments, so collecting, organizing, manipulating, and visualizing data is becoming common practice. Here is the recipe. First, you identify a set of data of any kind: it can be managed data from databases or data warehouses, or data from your spreadsheets. Then you familiarize yourself with the data and begin to understand any relationships that you see. Next, you apply visualizations to that data to help you spot patterns, problems, and trends. And finally, you apply your own analysis by filtering the data, creating new calculations, and adding new views of the data. While the data discovery process isn't always linear, it allows you to add new visualizations, try new combinations of data, and refine your analysis until you arrive at a business insight.
Let's talk about the types of data that can be visualized. Essentially there are three types. First are data that contain nominal attributes, things like the colors in a crayon box or the items on the shelf at a large store. These can be counted, but not ordered or aggregated. Next are data that contain ordinal attributes, which can be counted and ordered, but not aggregated. Examples are the dates on a timeline or the grades in a freshman English class. There is a relationship in the sequence. And the third type is quantitative metric data, which can be counted, ordered, and aggregated. These data points can include revenue, costs and profits, number of customers, temperature, or time.
Some data can be both nominal and metric, depending on how they are used. If I want to create a histogram that counts all the times the New York Yankees beat another team by two runs or more, those scores would be nominal attributes. But if I wanted to compare the scores from every season the New York Yankees have played baseball, those scores would then become metric data points that can be counted, ordered, and aggregated.
So you know the type of data you want to show. Next you need to choose how to show it. A great reference is Zelazny's 2001 classic work, Say It With Charts. Zelazny distinguishes between basic and composite charts. Let's start with the basics. Pie charts are for illustrating the relative proportions of components in a total. Bar charts are for showing a ranked list of items. Column charts, which are similar to bar charts but have a vertical orientation, are for depicting the progression of time. Line charts focus on the frequency of an occurrence within a period of time. And dot charts are for depicting a relationship or correlation between two variables of data. Generally speaking, column and line charts should be your go-to in about 50% of use cases, bar charts in about 25%, and dot charts, pie charts, and combinations in the remaining 25%.
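To make the three data types a little more concrete, here is a minimal Python sketch of which operations make sense for each one. The sample values and variable names are invented for illustration and are not from the lesson; only the count, order, and aggregate rules come from the discussion above.

```python
from collections import Counter
from statistics import mean

# Hypothetical sample data, invented for illustration only.
crayon_colors = ["red", "blue", "red", "green", "blue", "red"]  # nominal
letter_grades = ["B", "A", "C", "A", "B"]                        # ordinal
monthly_revenue = [1200.0, 950.0, 1430.0]                        # quantitative

# Nominal: counting is meaningful, ordering and aggregating are not.
color_counts = Counter(crayon_colors)

# Ordinal: counting and ordering are meaningful, aggregating is not.
# The ranking is supplied explicitly because the order is part of the data's meaning.
GRADE_RANK = {"A": 0, "B": 1, "C": 2, "D": 3, "F": 4}
grade_counts = Counter(letter_grades)
grades_best_first = sorted(letter_grades, key=GRADE_RANK.__getitem__)

# Quantitative: counting, ordering, and aggregating are all meaningful.
total_revenue = sum(monthly_revenue)
average_revenue = mean(monthly_revenue)
highest_revenue = max(monthly_revenue)

if __name__ == "__main__":
    print(color_counts)
    print(grades_best_first)
    print(total_revenue, average_revenue, highest_revenue)
```

Notice that nothing stops you from writing code that sums grade ranks or sorts crayon colors alphabetically. The data type tells you whether the result would mean anything, which is the same judgment you make when you pick a chart.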
If your analysis requires more than two dimensions, there are additional composite visualizations that embrace all of these concepts, but use changes in color and size to visualize additional dimensions of data. Pie charts illustrate proportions of a whole. The same logic applies to a heat map, only here users can evaluate multiple variables simultaneously based on the size and color of each square. Bar charts compare nominal and metric data by ranking them, and adding a third dimension creates a bullet chart. Column charts visualize trends over time. Micro charts build on that premise to show how a metric is performing compared to forecasted figures. Line charts depict frequency within a time period, but if you want to show many options, you can create a graph matrix. A graph matrix depicts frequency over time, showing every combination of the data in one visual. And bubble charts build on dot charts, with the size of the bubble adding a third dimension. Bubbles might be used for SKU analysis, or for trying to determine the impact that price increases on your products have on consumer purchases.
Though Zelazny first published his research many years ago, it remains relevant today. As more and more people have access to business data and are using technology to build visualizations, using the incorrect graph type can lead users to the wrong conclusion. Generally speaking, stick with the basic charts for comparing data with just a few elements, and turn to composite charts when there are many. I will show you another approach in the next lesson.
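As a quick recap, the pairing of basic and composite charts described in this lesson can be written down as a small lookup table. This is only a sketch: the names CHART_GUIDE and suggest_chart are hypothetical, and the two-dimension cutoff is a simplification of the rule of thumb above, not a formal rule from Zelazny.

```python
# Basic chart -> (what it shows, composite counterpart), as paired in this lesson.
CHART_GUIDE = {
    "pie": ("proportions of a whole", "heat map"),
    "bar": ("ranked list of items", "bullet chart"),
    "column": ("trend over time", "micro chart"),
    "line": ("frequency within a time period", "graph matrix"),
    "dot": ("relationship between two variables", "bubble chart"),
}


def suggest_chart(basic_chart: str, dimensions: int) -> str:
    """Suggest the basic chart for up to two dimensions, else its composite.

    Hypothetical helper: the cutoff at two dimensions is an assumption
    that simplifies the lesson's rule of thumb.
    """
    _purpose, composite = CHART_GUIDE[basic_chart]
    if dimensions <= 2:
        return f"{basic_chart} chart"
    return composite


if __name__ == "__main__":
    for name, (purpose, composite) in CHART_GUIDE.items():
        print(f"{name} chart: {purpose}; with more dimensions, consider a {composite}")
    print(suggest_chart("dot", 3))
```

A table like this is only a starting point. You still have to check that the chart you pick leads your audience to the right conclusion.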