When creating a chart, there tend to be two main reasons for doing so: to better understand the data at hand OR, having already drawn conclusions from the data, to visually share the story that the data tells with others. While graphs for exploratory analysis can be merely drafts, those telling the story should be carefully crafted.
The Web is full of pretty charts that seem to be running in the ‘Miss of the Charts Contest’:
Aren’t they adorable? The only missing element is the cute little kitten in the corner.
Are they easy to read and compare? Do they use ink and space to the maximum? Are you provided with any benchmarks? No, no and no.
In the past, it took a lot of time, effort and ink to draw a chart. Nowadays, creating one can be easier than writing a sentence. Everyone can do it or, at least, is physically capable of doing so. However, how do you create a chart that is actually GOOD? A chart that doesn’t necessarily look that beautiful but tells a beautiful story? A good place to start is to equip oneself with some knowledge of visual information processing.
A little background info
There are three stages of processing visual information in our brain, three memory stages:
- Iconic, which is rapid and unconscious,
- Working, which is slow, temporary and conscious, and
- Long-term, which is permanent.
What’s important is that the first one is connected with so-called ‘preattentive processing’ – it can very quickly detect several attributes. The second one isn’t that quick and, what’s worse, it has limited capacity – only a few chunks of info can be stored.
Here is a list of attributes of preattentive processing:
| Type | Attribute | Quantitative? | Notes |
|---|---|---|---|
| Colour | Hue | No | 8-9 maximum; culture dependent |
| Spatial position | 2D position | Yes | |
When displaying quantitative info, it’s strongly recommended to use length and 2D position, because they are the easiest for us to compare. If forced to show more quantitative variables, your next best options are width, size and intensity. They are naturally sorted, so we can feel which one is bigger and which one is smaller, but precise comparison is difficult. For example, with circles of different sizes, we are never sure whether the author has used radius, diameter or area for the visualisation. Other attributes, when forced to show quantitative values, pose a great challenge for the user. Let’s ask ourselves, which one is the biggest: a circle, a square or a triangle? As you can see, they aren’t naturally sorted. Please, save shape, orientation, enclosure and hue for categorical variables.
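To see why circle sizes are ambiguous, here is a minimal Python sketch. The function name and the sample values are illustrative assumptions, not part of the survey. It compares how much bigger a circle is drawn when a value twice as large is mapped to the radius versus to the area.

```python
import math

def circle_area(value, encoding):
    """Return the drawn area of a circle that encodes `value`.

    encoding="radius": the value is used as the circle's radius.
    encoding="area":   the value is used as the circle's area.
    """
    if encoding == "radius":
        return math.pi * value ** 2
    if encoding == "area":
        return float(value)
    raise ValueError(f"unknown encoding: {encoding}")

# Illustrative values only: `large` is exactly twice `small`.
small, large = 10, 20

ratio_by_radius = circle_area(large, "radius") / circle_area(small, "radius")
ratio_by_area = circle_area(large, "area") / circle_area(small, "area")

print(f"radius encoding: drawn area grows {ratio_by_radius:.1f}x")
print(f"area encoding:   drawn area grows {ratio_by_area:.1f}x")
```

The same pair of values looks either two or four times bigger depending on a choice the reader cannot see, which is why length is the safer attribute.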
Knowing the attributes of preattentive processing and being aware of the limited capacity of working memory, we are able to create charts that are user-friendly, so that drawing conclusions from them doesn’t require a lot of time and effort.
Let’s now implement our freshly acquired knowledge in practice 🙂
Before Women’s Day 2015, we performed a survey in our company, asking both girls and guys several questions about women to check how well men know women. On the whole ‘test’ the guys barely got a C, with a mean compatibility score with women of only 66.29%. The survey was answered by 66 women and 107 men, which was a nice response rate of almost 50% of all company employees. For this case study, we will use only one question from the above-described survey: “How much of their salary (in percentage) do women spend on clothes?”.
Here are the results:
| Answer | Women | Men |
|---|---|---|
| Less than 5% | 15 | 1 |
| More than 100% | 0 | 5 |
The task is to visualize the data as well as possible. Before reading further, take a moment to think about your own design. It’s good not to have your brain spoiled with somebody else’s design 😉
Since we want to show two variables with quantitative values (Men and Women), it’s best to choose length as the main attribute. As for category (Gender) it’s good to make use of cultural colours correlated with gender: pink and blue. Here is the output:
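The same idea can be sketched in a few lines of Python. This is a minimal text-only sketch, not the chart from this post: the function names are hypothetical, the rows are only the first and last answer options from the results table, the order of the two numbers (women, then men) is an assumption, and two different fill characters stand in for pink and blue.

```python
def bar(value, max_value, width=30, fill="#"):
    """Return a text bar whose length is proportional to `value`."""
    if max_value <= 0:
        return ""
    return fill * round(width * value / max_value)

def render_grouped_bars(rows, width=30):
    """Render one bar per gender for each answer option.

    rows: list of (label, women_value, men_value) tuples.
    Both genders share one scale so bar lengths are directly comparable.
    """
    max_value = max(max(women, men) for _, women, men in rows)
    lines = []
    for label, women, men in rows:
        lines.append(label)
        # '#' stands in for one colour (women), '=' for the other (men).
        lines.append(f"  W {bar(women, max_value, width, '#')} {women}")
        lines.append(f"  M {bar(men, max_value, width, '=')} {men}")
    return "\n".join(lines)

# Only the two answer options listed in the results table;
# assumed column order: women first, men second.
rows = [
    ("Less than 5%", 15, 1),
    ("More than 100%", 0, 5),
]

chart = render_grouped_bars(rows)
print(chart)
```

Keeping a single shared scale for both genders is the important design choice here: if each bar were scaled separately, length would stop being comparable across groups.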
Probably, some people would want to know the exact values without looking for the raw data, so we can show them as well. Let’s also draw viewers’ attention to the least compatible section:
You may wonder why I decided to stress this particular value. The discrepancy between answers is the biggest for this one; this is where men ‘lost’ the most points. Knowing that discrepancy is what interests us, let’s mix things up a little and show it more explicitly:
Now it’s not clear which chart is better; it depends both on your purpose and your audience. Let’s go with the first one. Now it seems like a good idea to add some sort of question summary:
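One way to make the discrepancy explicit is to compute it directly before plotting it. Here is a minimal sketch, assuming hypothetical names and using only the first and last answer options from the results table (so the middle option discussed above does not appear in its output). It also assumes the two numbers per row are directly comparable; raw counts from groups of different sizes would first need converting to shares of each group.

```python
def discrepancies(rows):
    """Return (label, men - women) pairs, largest absolute gap first.

    rows: list of (label, women_value, men_value) tuples.
    """
    gaps = [(label, men - women) for label, women, men in rows]
    return sorted(gaps, key=lambda item: abs(item[1]), reverse=True)

# Only the two answer options listed in the results table;
# assumed column order: women first, men second.
rows = [
    ("Less than 5%", 15, 1),
    ("More than 100%", 0, 5),
]

ranked = discrepancies(rows)
for label, gap in ranked:
    # The sign shows the direction: negative means fewer men than women.
    print(f"{label}: {gap:+d}")
```

Plotting these signed gaps as bars around a zero line shows the direction and size of the disagreement at a glance, at the cost of hiding the original values.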
We used a chart called a ‘Bullet graph’, invented by Stephen Few, which is effectively an enhanced version of a bar chart: it contains a reference point (the overall compatibility score, in this example) and reference areas (here: areas for different marks).
Now, the best part – drawing too-far-fetched conclusions:
- There are some poor creatures whose wives/girlfriends spend more than 100% of their salary on clothes or, slightly more probably, some guys were just trolling us.
- Men think women spend more on clothes than they actually do. Possible reasons: they underestimate the income of women, or they aren’t aware that shoes aren’t really clothes.
- Men lost the most points by choosing the middle option. If they had chosen the second one instead, they would have gained over 10% of compatibility. It’s possible their knowledge isn’t that bad and their estimation was just slightly too high.
Any other ideas? I would be more than happy to discuss them with you!