A picture is worth a thousand words.
We’ve all heard that phrase before, and it’s equally true of data visualization. Data visualization takes loads of numerical data and translates that information into easy-to-understand pictures. Well… that’s the idea.
Whether at work or reading an article on the internet, there are some graphs out there that just don’t get the job done. Below are the 5 mistakes that make my head hurt the most.
1. Incorrect Pie Chart Usage
No chart is more widely used, or more widely misused. Take a look at this chart and tell me: which division is the largest?
OK, yeah, this is difficult on purpose, but there is a difference between the larger slices. Let’s try it with data labels.
Now you can actually tell that Division B is larger than Division C. That’s better, right? Still, it’s not the most efficient use of a graph, since you are reading the labels instead of using the visual. Human eyes are notoriously bad at gauging the area of circle slices. Let’s try it one more time, but with a bar chart.
Notice how obvious it is that Division B is the largest and Division D is the smallest. The chart is doing the work for you, which is what we want.
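Here is a minimal matplotlib sketch of that bar chart. The division values are made up for illustration (the numbers behind the original figures aren't given here), and `sort_for_bars` and `plot_sorted_bars` are hypothetical helper names. Sorting the bars is my own addition: it makes the ranking readable at a glance.

```python
# Illustrative values only; chosen so that B is the largest and D the smallest.
division_sales = {"Division A": 24, "Division B": 27, "Division C": 26, "Division D": 23}


def sort_for_bars(values):
    """Return (label, value) pairs sorted from largest to smallest."""
    return sorted(values.items(), key=lambda item: item[1], reverse=True)


def plot_sorted_bars(values, path="divisions_bar.png"):
    """Draw a horizontal bar chart with the largest bar on top."""
    # Imported here so the sorting helper works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file; no display needed
    import matplotlib.pyplot as plt

    pairs = sort_for_bars(values)
    labels = [label for label, _ in pairs]
    amounts = [amount for _, amount in pairs]

    fig, ax = plt.subplots(figsize=(6, 3))
    ax.barh(labels, amounts, color="steelblue")
    ax.invert_yaxis()  # barh puts the first item at the bottom by default
    ax.set_xlabel("Sales")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Calling `plot_sorted_bars(division_sales)` writes the chart to `divisions_bar.png`. Bar length is compared along a common baseline, which is why small differences stay visible.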
If you like the pie chart’s ability to visualize a portion of a whole, there is a movement to “square the pie”, which uses a brick or waffle chart. The idea is that it’s easier for the human eye to judge area differences with squares than circles. Count me as a fan.
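To make the waffle idea concrete, here is a rough sketch that splits a 10 x 10 grid of squares among the divisions and lays it out as text. The shares are made up, the function names are hypothetical, and the largest-remainder rounding is just one reasonable way to make the squares add up to exactly 100.

```python
def waffle_counts(values, cells=100):
    """Split `cells` squares among categories using largest-remainder rounding."""
    total = sum(values.values())
    exact = {name: value * cells / total for name, value in values.items()}
    counts = {name: int(share) for name, share in exact.items()}
    leftover = cells - sum(counts.values())
    # Give the remaining squares to the categories with the largest fractional parts.
    by_remainder = sorted(exact, key=lambda name: exact[name] - counts[name], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


def waffle_rows(counts, width=10):
    """Lay the squares out row by row, one character per square."""
    squares = "".join(name * n for name, n in counts.items())
    return [squares[i:i + width] for i in range(0, len(squares), width)]


# Illustrative shares only; single-letter names double as the square symbol.
shares = {"A": 24, "B": 27, "C": 26, "D": 23}
for row in waffle_rows(waffle_counts(shares)):
    print(row)
```

Each character stands for one percent of the whole, so a reader can count squares instead of guessing at angles. The same counts could feed a plotting library if you want colored squares instead of letters.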
2. Spaghetti Chart
We’ve all seen them: charts showing so many lines that they look like a downtown traffic pattern.
Ouch. It looks like something a three-year-old has drawn with their crayons. The biggest problem here is that there is too much data for the viewer to digest in one picture. Where should the eye focus? I’ll offer up a couple of different routes you can go.
One option is to reduce the number of data points shown. You have to ask yourself: is there value in displaying 36 months’ worth of data on a chart, or can you get away with less? It will always depend on who the audience is, but if you are talking with the CEO, this may be sufficient.
Another option is to focus your audience’s attention away from the less relevant information (in this case prior year data) and towards the current year. I still advise against showing every month, but the blue says: “I’m the important line, look at me”.
Ideally, you want to mix both ideas together:
- Remove the color.
- Remove the legend and label the lines directly.
- Delete the background lines; they are a distraction.
- Reduce the amount of displayed data (quarters instead of months).
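The checklist above can be sketched in matplotlib roughly as follows. This is one possible reading, not the exact chart from this post: I'm assuming one list of 12 monthly values per year, and `to_quarters` and `plot_focus_lines` are hypothetical helper names.

```python
def to_quarters(monthly):
    """Sum 12 monthly values into 4 quarterly totals."""
    if len(monthly) != 12:
        raise ValueError("expected 12 monthly values")
    return [sum(monthly[i:i + 3]) for i in range(0, 12, 3)]


def plot_focus_lines(series_by_year, focus_year, path="sales_by_quarter.png"):
    """One line per year: grey for context, a single color for the focus year."""
    # Imported here so to_quarters works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file; no display needed
    import matplotlib.pyplot as plt

    quarters = ["Q1", "Q2", "Q3", "Q4"]
    positions = [0, 1, 2, 3]
    fig, ax = plt.subplots(figsize=(6, 3.5))
    for year, monthly in series_by_year.items():
        values = to_quarters(monthly)
        is_focus = year == focus_year
        color = "tab:blue" if is_focus else "0.7"  # "0.7" is a light grey
        ax.plot(positions, values, color=color, linewidth=2.5 if is_focus else 1.5)
        # Label the line directly at its right end instead of using a legend.
        ax.annotate(str(year), xy=(positions[-1], values[-1]), xytext=(5, 0),
                    textcoords="offset points", va="center", color=color)
    ax.set_xticks(positions)
    ax.set_xticklabels(quarters)
    ax.grid(False)  # no background lines
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Pass a dict keyed by year, each holding that year's 12 monthly figures, along with the year you want to stand out. Everything else fades into grey context.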
3. Too Much Color
Have you seen this before?
Or this one?
Is that a bar chart or a xylophone?
I was recently looking at a report at work. The layout was fine and it showed all the relevant data, but something was just off. Finally, it dawned on me: there was just way too much color.
Take the two charts above as a case in point. What purpose is the color serving in these graphs? The slices and bars are already labeled, so at best the color is redundant. At worst, the color is distracting and the report looks unprofessional (is that a graph or a xylophone?).
Unfortunately, Excel defaults to using lots of color, leaving the user to correct course. The best way to use color in data visualization is to grab somebody’s attention. Grey out any information that isn’t the primary focus and bring the salient point to the front.
Finally, avoid using bright colors as they look less professional than their bold counterparts.
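If you build charts in code instead of Excel, the grey-out approach might look something like this sketch. The two hex colors are arbitrary picks of mine, and `highlight_colors` and `plot_highlighted_bars` are hypothetical helper names.

```python
GREY = "#bfbfbf"    # muted tone for everything that is just context
ACCENT = "#1f77b4"  # one accent color for the point you want to make


def highlight_colors(labels, focus, base=GREY, accent=ACCENT):
    """Return one color per label: `accent` for the focus label, `base` otherwise."""
    if focus not in labels:
        raise ValueError(f"{focus!r} is not one of the labels")
    return [accent if label == focus else base for label in labels]


def plot_highlighted_bars(values, focus, path="highlight_bar.png"):
    """Bar chart where only the `focus` bar is colored."""
    # Imported here so highlight_colors works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file; no display needed
    import matplotlib.pyplot as plt

    labels = list(values)
    fig, ax = plt.subplots(figsize=(6, 3))
    ax.bar(labels, [values[label] for label in labels],
           color=highlight_colors(labels, focus))
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

Because the color list is built from the data, changing the story is a one-argument change: pass a different `focus` and a different bar steps forward.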
4. Stacked Bar Comparisons
It’s pretty common to compare proportions across time periods (e.g., what % of sales comes from Division A vs. last year?). This is one of the more widely used visuals:
At a quick glance, D has increased its share and A has shrunk. Those two are easier to pick out because each has an endpoint in common on both bars (bottom of A and top of D). It gets a little trickier for B and C because they start in different positions in 2016 and 2017.
This is where the slope graph (a line chart variant) shines.
The slope graph has eliminated any guessing as to which direction the divisions are headed. A’s share has fallen 10 percentage points and the other three rose from 2016.
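A slope graph is simple to draw yourself: one two-point line per division. In the sketch below, the shares are hypothetical; only A's 10-point drop and the rise of the other three come from the chart described above. The function names are my own.

```python
# Hypothetical shares (in %); each year sums to 100.
shares_2016 = {"A": 40, "B": 25, "C": 20, "D": 15}
shares_2017 = {"A": 30, "B": 27, "C": 22, "D": 21}


def share_changes(before, after):
    """Percentage-point change in share for each division."""
    return {name: after[name] - before[name] for name in before}


def plot_slope(before, after, years=("2016", "2017"), path="share_slope.png"):
    """Draw one line per division from its `before` share to its `after` share."""
    # Imported here so share_changes works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # render straight to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(4, 5))
    for name in before:
        start, end = before[name], after[name]
        color = "tab:red" if end < start else "0.4"  # flag the decliners
        ax.plot([0, 1], [start, end], marker="o", color=color)
        # Direct labels on both ends replace the legend and the y-axis.
        ax.annotate(f"{name} {start}%", xy=(0, start), xytext=(-8, 0),
                    textcoords="offset points", ha="right", va="center")
        ax.annotate(f"{end}% {name}", xy=(1, end), xytext=(8, 0),
                    textcoords="offset points", ha="left", va="center")
    ax.set_xticks([0, 1])
    ax.set_xticklabels(list(years))
    ax.set_xlim(-0.5, 1.5)  # leave room for the labels
    ax.yaxis.set_visible(False)
    for side in ("top", "left", "right"):
        ax.spines[side].set_visible(False)
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)
```

With these numbers, `share_changes(shares_2016, shares_2017)` gives -10 for A and positive values for B, C, and D, and the direction of each line shows the same thing without any arithmetic.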
5. Too Much Clutter
I saved this one for last, because all of the above mistakes contain clutter, which is why they fall short.
Going back to the beginning of this post: data visualization takes loads of numerical data and translates that information into easy-to-understand pictures. The key part here is “easy to understand”. Every visual you create should be able to stand on its own and not need a lengthy description.
The best way to accomplish this goal is to keep graphs simple and eliminate anything that is redundant. By doing so, the viewer will understand the message being conveyed quickly. I can’t tell you how many meetings or presentations I’ve been to where the presenter has created a graph that needs way too much explanation to be useful. This wastes everyone’s time and distracts from the message.
One of the keys to being a successful analyst is communicating effectively and succinctly. All the difficult data cleaning and analyzing won’t mean anything if your intended audience doesn’t understand the situation and what to do next. This is where effective data visualization completes the analyst’s toolbox.
- Storytelling With Data by Cole Nussbaumer Knaflic
- The Visual Display of Quantitative Information by Edward Tufte
Note: this article was originally posted on The Financial Data Scientist.