I’m making a data visualization. Each datum is represented by a square. To make the underlying data intuitively legible, should the length of each square’s side or the area of each square be proportional to the datum it represents?
If you, the creator, are unsure, how will the reader know which it is?
Short answer: the value should be linked 1:1 to the amount of colour on the page. So in your example, it should be area. But there’s more to it than that: you also need to avoid misleading cues that might make a reader read it incorrectly, and you need to know why you’re using area instead of length (as in bar charts), because the choice has real pros and cons.
First, never have both length and width (i.e. area) of a shape change when the variable is actually linked only to the length of one side. If X is double Y but X has four times as much colour on the page as Y, you’re misleading your readers. This sort of distortion is sometimes referred to as a “lie factor”, and is often assumed to be a deliberate attempt to mislead and exaggerate differences.
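A minimal sketch of that arithmetic, assuming the usual definition of the lie factor (relative change shown in the graphic divided by relative change in the data). The function names `side_for_area` and `lie_factor` are hypothetical, chosen only for illustration:

```python
import math

def side_for_area(value, area_per_unit=1.0):
    """Side length of a square whose AREA is proportional to value."""
    return math.sqrt(value * area_per_unit)

def lie_factor(data_small, data_large, shown_small, shown_large):
    """Relative change shown in the graphic / relative change in the data."""
    effect_in_data = (data_large - data_small) / data_small
    effect_shown = (shown_large - shown_small) / shown_small
    return effect_shown / effect_in_data

y, x = 10, 20  # X is double Y

# Misleading: the side tracks the value, so the area grows with its square.
wrong_areas = (y ** 2, x ** 2)  # 100 and 400 units of colour
print(lie_factor(y, x, *wrong_areas))  # 3.0

# Honest: the area tracks the value, so the lie factor is approximately 1.
right_areas = (side_for_area(y) ** 2, side_for_area(x) ** 2)
print(lie_factor(y, x, *right_areas))
```

A lie factor close to 1 means the amount of colour changes in step with the data; anything well above 1 exaggerates the difference.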
If you use area as a measure I’d strongly recommend:
Knowing why you’re using area. By using area instead of a linear dimension like length, you:
- Sacrifice the ability to clearly see differences numerically (readers can’t easily say “look, that’s double the other one”).
- Invite your readers to view it in an intuitive, everyday, non-numerical way, much as people compare the sizes of pies in a shop. Less sophisticated, but more immediate. More gut, less head.
- Make small differences between very similar numbers almost invisible.
- Keep very small values visible: when one value is many times smaller than another, the small one doesn’t disappear as badly as it would in a bar chart, which can allow more flexibility in layouts.
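The last two trade-offs follow from the square root involved in area scaling. Here is a rough sketch, assuming a largest mark of 200 pixels; `bar_length`, `square_side` and `max_px` are hypothetical names, not taken from any charting library:

```python
import math

def bar_length(value, max_value, max_px=200.0):
    """Bar chart: length is proportional to the value."""
    return max_px * value / max_value

def square_side(value, max_value, max_px=200.0):
    """Area chart: side is proportional to the square root of the value."""
    return max_px * math.sqrt(value / max_value)

# A value 100 times smaller than the largest one.
print(bar_length(1, 100))   # about 2 px: nearly invisible
print(square_side(1, 100))  # about 20 px: still easy to see

# Two similar values, 100 and 110.
print(bar_length(110, 110) - bar_length(100, 110))    # roughly 18 px apart
print(square_side(110, 110) - square_side(100, 110))  # roughly 9 px apart
```

Taking the square root compresses the range: it rescues tiny values, but it also roughly halves the visible gap between values that are already close.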
Consider using circles for area, not squares, centre aligned:
- Circles because they don’t invite confusion with bar charts and similar. Height and width are less to the fore, so it looks less like you’re inviting a height- or width-based comparison.
- Centre-aligned because that doesn’t invite people to compare heights.
For example, above, it’s hard not to see the square labelled “5” as being three quarters the height of the square labelled “10”, so it’s potentially misleading.
The circles don’t invite this sort of comparison: it’s more of a gut-level, instant “That blob is rather a lot bigger than the next blob”.
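As a sketch of the geometry, assuming the shapes are drawn with area proportional to value: the side ratio of the two squares is the square root of the value ratio, and each circle's radius comes from area = πr². The helpers `circle_radius` and `centred_circles` are hypothetical names for illustration only:

```python
import math

def circle_radius(value, area_per_unit=1.0):
    """Radius of a circle whose area is value * area_per_unit."""
    return math.sqrt(value * area_per_unit / math.pi)

def centred_circles(values, gap=10.0):
    """Return (cx, cy, r) tuples laid out left to right on one centre line."""
    circles, x = [], 0.0
    for v in values:
        r = circle_radius(v)
        x += r
        circles.append((x, 0.0, r))  # cy is 0 for every circle: centre-aligned
        x += r + gap
    return circles

# Side ratio of area-proportional squares for the values 5 and 10.
print(math.sqrt(5 / 10))  # about 0.71, although the data ratio is 0.5

for cx, cy, r in centred_circles([5, 10]):
    print(round(cx, 2), cy, round(r, 2))
```

Because every circle shares the same centre line, no edge lines up with its neighbour's, which removes the baseline that would otherwise encourage a height comparison.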
There’s a variety of evidence, from user testing to small-scale studies (will try to hunt some examples down later), that this sort of intuitive area-based comparison can be more engaging, can lower the barrier to entry for less engaged audiences, and can help keep the reader’s focus on the subject matter rather than the cold minutiae of the numbers. But this comes at the cost of getting in the way of more numerically-minded analysis.
Don’t choose between one-dimensional (length or distance) and two-dimensional (area) encodings for aesthetic reasons: choose between them based on your audience and message.
Which is more appropriate for the communication: instant gut-level comparisons at the level of “that’s much bigger”, or more considered numerical comparisons at the level of “that’s about 80% of the other one”?
Or are there practical reasons why you need to use area?
Then, when you’ve chosen for practical reasons, apply aesthetics.