Color Use Guidelines for Data Representation
[ASA presentation]
Cynthia A. Brewer
Department of Geography, Penn State, cbrewer@psu.edu
Paper prepared for invited presentation in Theme Session titled “Using
results from perception and cognition to design statistical graphics” organized
by Dan Carr for the Section on Statistical Graphics at the 1999 ASA Joint
Statistical Meetings in Baltimore. This paper synthesizes my previous research
papers which each include literature reviews (see bibliography), but references
are not included in the body of this text. A PDF of the figure from the original
paper is online: color_schemes_figure_B&W.pdf
For examples in color, see the previous summary at www.personal.psu.edu/cab38/ColorSch/SchHome.html
and ColorBrewer.org, an online tool for selecting specific map color schemes.
The reference for the published version of this paper is:
Brewer, C. A. 1999. Color Use Guidelines for Data Representation, Proceedings
of the Section on Statistical Graphics, American Statistical Association,
Alexandria VA. pp. 55-60.
Abstract
Matching the organization of the perceptual dimensions of color (hue,
lightness, saturation) to the organization of data being represented is
one key to gaining insight from data visualizations. Geographic mappings
of data are akin to three-variable graphs and thus benefit from the multi-dimensional
nature of color symbolization. Systematic paths within perceptually ordered
color systems, such as Munsell and CIELAB, produce logical progressions
of color. Colors that progress from light to dark through a series of adjacent
hues are an effective way to symbolize quantitative data that are monotonically
increasing. Alternatively, emphasis on a critical value midway through
a data range is accomplished by using the lightest color in a scheme to
represent the critical value and then diverging toward different hues for
high and low data extremes. Common spectral schemes (red-orange-yellow-green-blue)
are well suited as diverging schemes by centering light yellow on a middle
value, such as a mean or median. Visual comparisons between map distributions
are aided by systematic two-variable color schemes. A typology of ten perceptually
ordered schemes will be presented. In addition, vision research results
are used to guide selection of combinations of hues that are easily distinguished
by people who are colorblind (4 percent of the population).
Introduction
Cartographers have been pressed into making use of color and other
graphic variables to present our statistical data because we essentially
give up position on the page, which is so useful in graphing, to geographic location.
A map is a multivariate graph with latitude and longitude already claiming
the x and y axes. So, we rely on symbol shape, size, orientation, arrangement,
texture, focus, and color to symbolize data values at locations on maps.
As color monitors and color printing become permanent fixtures in our offices,
researchers from all domains are finding uses for color symbolization that
go beyond decoration, and objective guidelines for color use are helpful
in many analytical endeavors.
The bare bones of cartographic guidance on color are to use hue to show categorical differences and use lightness to show ordered differences. This is a straightforward approach, but it does require that you be able to systematically separate these two aspects of color. When you decide to use a pure yellow symbol, for example, you have selected a yellow hue that is light and fully saturated. All three of these aspects of the color affect how it will function in your graphic. Color is a three-dimensional phenomenon and you will get more benefit from it in your graphic work if you use all three dimensions with care. Additional considerations are to accommodate colorblind readers and anticipate contrast effects. My goals with this paper are to familiarize you with a series of inter-linked research results on color use and also to encourage you to leave behind some assumptions about color that you may have acquired from as long ago as your childhood.
Hue and Color Mixing
Hue is the perceptual dimension of color that we associate with color
names, like red, green, blue, and yellow. Dominant wavelength is a similar
concept from physics. The rainbow or spectrum arrays saturated hues in
wavelength order from long-wavelength reds to short-wavelength blues: red,
orange, yellow, green, blue. I’ve never been quite sure what the hues “indigo
and violet” in Newton’s spectrum are (many of us learned the spectral mnemonic
ROY G BIV as children), and there is some evidence that he selected these
additional color names to parallel the seven-step musical scale in addition
to categorizing his color perceptions. Purples and magentas are not colors
of the rainbow. They result from mixing red and blue from opposite ends
of the spectrum. White sunlight has a full range of visible wavelengths
in it, but our TVs and computer screens get by with a reduced set of light
primaries for mixing all the other hues: Red, Green, and Blue (RGB).
Magenta, mentioned above, is not one of the basic color names you learned as a child, but you have probably encountered this pinkish-red (a red with no yellow) as you work with desktop printer inks or as you communicate with publishers. Cyan (a blue-green) is another hue making its way into our lexicon from the realm of printing. The primary colors for printing are Cyan, Magenta, and Yellow (CMY). All other hues can be mixed with these ink (or paint) primaries. Printers often use a black ink with CMY to simplify printing black text and lines and to print dark colors more accurately (rather than mixing black using only CMY).
RGB and CMY? But didn’t you learn that red, yellow, and blue were the primary colors when you were mixing paints as a child? Let’s agree to forget that little piece of information. It will just confuse you in this modern era of computer monitors and color laser printing. This set of colors is, however, akin to the perceptual concept of unique hues that do not look like mixtures of other hues: Red, Green, Yellow, and Blue. You may find yourself gravitating toward these unique hues as you develop color codings because they are clearly different from each other. RG and YB are also the opponent hues on which our eye-brain color vision system is based. In addition to red, green, yellow, and blue, the other basic colors named in all fully-developed languages are pink, purple, orange, brown, gray, white, and black.
Lightness and Saturation
As I mentioned above, Hue is only one of three perceptual dimensions
of color. If you are using color in analytical work, you also need a good
understanding of lightness and saturation, the other two perceptual dimensions.
Lightness is the most important of these dimensions for data representation.
Lightness is a relative measure, describing how much light appears to reflect
from an object compared to what looks white in the scene. Its relative
character makes it a different measure from related terms like Brightness,
luminance, and Intensity. Another word for lightness is Value, but that
term becomes confusing in quantitative work if you are also describing
data values. Saturation is a measure of the vividness of a color, and there
are a host of related terms with slightly varied definitions: Chroma, colorfulness,
purity, and intensity (again). Some of these terms are from perception
and the scientific realm of psychophysics, some are from the realm of the
physics of light, and still others (like shades, tints, and tones) are
from art.
Why do I bring up all of these terms? Because, regardless of the context in which you work with it, color is basically a three-dimensional phenomenon. RGB and CMY both have three dimensions but mix color in different media (light vs. pigments). Your software may present you with a color palette based on any number of additional color systems: HLS, HSV, HSB, HVC, LJG, LAB, IHS… Notice that they always have three dimensions. Which systems will help you in making statistical graphics? The ones that organize the perceptual dimensions of color most accurately. I’ll return to discussion of color systems after I describe color schemes for graphics.
Types of Color Schemes (Figure)
When people read your color graphs and other data graphics, they are
working with perceptual dimensions of color, even though you may have specified
your colors using a mixture system like RGB. Your readers are seeing and
thinking about color as ‘light desaturated blues,’ ‘dark saturated oranges,’
‘dark grays,’ etc. Thus, you can make the most of your graphics by using
these perceptual dimensions in ways that parallel the logical structures
in your data to allow its organization to be readily perceived.
Sequential Schemes
The most basic guidance is to use lightness to represent ordered data.
Ordered data can be ordinal (ranked) data or the more sophisticated interval
or ratio scales of numerical data. Generally, darker colors are used to
represent higher data values; light-to-dark for low-to-high. For example,
a graphic showing pollution levels would usually show high levels in a
dark color and progress through to lighter colors as pollution levels decrease.
I have termed this type of progression a sequential color scheme. You need
not strictly follow the rule of higher-darker, especially when making graphics
to compare variables. For example, it is easier to see the relationships
between education and poverty if sequential schemes are organized so that
both high education and low poverty are represented with dark colors. The
crucial part of the guidance is to pair a monotonic sequence in the data
with a lightness sequence.
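The pairing of a monotonic data sequence with a lightness sequence can be checked numerically. The following is a minimal sketch, assuming colors are specified as 8-bit sRGB values and that CIELAB L* is an acceptable stand-in for perceived lightness. The function names and the example color values are illustrative assumptions, not part of the original paper.

```python
def srgb_to_lightness(rgb):
    """Approximate CIELAB L* (0-100) for an sRGB color given as 0-255 integers."""
    def linearize(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (linearize(v) for v in rgb)
    # Relative luminance Y for sRGB primaries with a D65 white point.
    y = 0.2126 * r + 0.7152 * g + 0.0722 * b
    # CIE lightness function (cube root with a linear segment near black).
    if y > (6 / 29) ** 3:
        f = y ** (1 / 3)
    else:
        f = y / (3 * (6 / 29) ** 2) + 4 / 29
    return 116 * f - 16


def is_lightness_ordered(scheme):
    """Return True if the colors run strictly from light to dark."""
    lightness = [srgb_to_lightness(color) for color in scheme]
    return all(a > b for a, b in zip(lightness, lightness[1:]))


# Hypothetical yellow-to-dark-red sequence (illustrative values only).
sequential = [(255, 255, 178), (254, 204, 92), (253, 141, 60), (227, 26, 28)]
# The same colors shuffled, so lightness no longer parallels the data order.
shuffled = [(254, 204, 92), (255, 255, 178), (227, 26, 28), (253, 141, 60)]

print(is_lightness_ordered(sequential))
print(is_lightness_ordered(shuffled))
```

A check like this only confirms the ordering of lightness. It says nothing about whether the steps look evenly spaced, which still needs to be judged in the final graphic.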
Qualitative and Binary Schemes
Categorical differences in data are usually represented with differences
in hue. For example, types of government spending, such as military, education,
and healthcare, are categories that could be shown with hues like red,
green, and blue. Cartographers have termed this a qualitative color scheme.
Pay attention to the lightness and saturation of the hues you choose for
a qualitative scheme. If each category is equally important, you will want
each to be fairly similar in its contrast with the background and with
each other. For example, a pure yellow will not be as visible on a white
background as red, green, or blue. Binary schemes are a special case of
qualitative schemes for which either a lightness or hue difference (or
both) is appropriate for the two categories.
Diverging Schemes
Variable comparisons can be accomplished with multiple graphics or
can be facilitated by mapping differences within a single graphic. Again,
careful use of hue and lightness can make this type of analytical graphic
quite successful. For example, industrial pollution could be examined by
symbolizing increases and decreases in acid rain over a given time period.
These data are double-ended and they also have a crucial midpoint in the
data range where, in this case, there has been no change in acid-rain levels.
A diverging color scheme that emphasizes the meaningful midpoint in the
data with a light color and then diverges to two different hues works well
for this type of graphic. An example scheme would present high increases
in dark red, moderate increases in medium red, low increases in light red,
negligible change in white, low decreases in light blue, moderate decreases
in medium blue, high decreases in dark blue. Emphasizing the two extremes
with darker colors and emphasizing the midpoint with the lightest color
clearly parallels the structure of the data. In addition to representing
‘no change or difference,’ a suitable midpoint that might be emphasized
in the data range is the mean, median, zero, or other threshold value.
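The seven-class red/white/blue example above can be sketched as a simple lookup. This is only an illustration: the class breaks are hypothetical values chosen to be symmetric about a midpoint of zero change, and the names DIVERGING_CLASSES, BREAKS, and diverging_color are assumptions introduced here.

```python
from bisect import bisect_right

# Class colors follow the red/white/blue example in the text,
# ordered from the largest decrease to the largest increase.
DIVERGING_CLASSES = [
    "dark blue", "medium blue", "light blue",
    "white",
    "light red", "medium red", "dark red",
]

# Hypothetical class breaks, symmetric about the midpoint (zero change).
BREAKS = [-30.0, -15.0, -2.0, 2.0, 15.0, 30.0]


def diverging_color(change):
    """Return the class color for a change value (for example, percent change)."""
    return DIVERGING_CLASSES[bisect_right(BREAKS, change)]


print(diverging_color(0.0))    # white
print(diverging_color(-40.0))  # dark blue
print(diverging_color(20.0))   # medium red
```

The lightest class sits on the midpoint and lightness decreases in both directions, so the structure of the scheme parallels the structure of the data.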
Spectral schemes, or rainbow schemes, are popular in scientific visualizations and news media graphics such as a daily weather page. Unfortunately, this scheme is often misused as a sequential scheme with poorly used lightness differences. The most informative use of a spectral scheme is as a diverging scheme. Let dark red and dark blue mark the extremes in the data and position light yellow to emphasize a meaningful midrange (such as no change in acid rain levels). The full sequence is familiar: dark red, red-orange, orange-yellow, yellow, yellow-green, green-blue, and dark blue, for example. People like the multi-hue character of the scheme, and the variety of hues also helps distinguish symbol categories in the graphic. Structuring the lightness sequences in the scheme to parallel the characteristics of the data produces an enlightening visualization of the data.
Two-Variable Schemes
These basic types of schemes—sequential, qualitative, binary, diverging—may
be combined to show two related variables on maps or to add two additional
variables to a graph. There are six useful combinations of the one-variable
schemes: sequential-sequential, sequential-qualitative, binary-qualitative,
binary-diverging, sequential-diverging, and diverging-diverging. For example,
stage of cancer at diagnosis (sequential) could be represented with healthcare
provider (qualitative) on a graph with axes for age and income of a sample
of patients. Darker colors of one hue in the lower-right portion of the
graph would show that older and poorer patients have later-stage diagnoses
with a particular provider. Examples of each scheme are shown with European
economic data at www.personal.psu.edu/cab38/ColorSch/SchHome.html
Additional Issues
Saturation can be systematically varied to represent ordered data,
but people are not accustomed to accurately comparing saturation levels,
especially between hues. There are also usually few perceivable steps available
in saturation differences. Therefore, I don’t recommend using saturation
as the primary symbolization for a data set. However, ignoring saturation
can produce some mighty confusing graphics, where individual colors stand
out strongly from other symbols for no apparent reason. Attempt to use
saturation to either bolster a lightness sequence in an ordered manner
(such as light-desaturated to dark-saturated) or emphasize categories that
tend to have smaller symbols in the graphic.
Hue for qualitative data and lightness for quantitative data should also not be taken as strict and exclusive guidelines. Quantitative color schemes may include plenty of hue variation, but they should first and foremost be obviously ordered by lightness. For example, a partial spectral scheme of yellow, light orange, medium red, and dark red is a common sequence for ordered data. It uses different hues, but they are also obviously ordered by lightness, which makes the sequential nature of the data obvious to the person reading the graphic. Qualitative schemes benefit from smaller variations in lightness and saturation, so that colors are readily discerned, but the use of lightness should not erroneously suggest that the data are ordered or that one category is more important than another.
Problems in Color Appearance
People who are colorblind can still see lightness differences and can
see a fairly wide range of hue differences. Approximately four percent
of the population has some degree of color vision impairment (approximately
eight percent of men and less than one percent of women). ‘Red-green colorblindness’
is the most common type of impairment, but this type also involves confusion
of other hue combinations (other pairs of hues, such as pink and blue, may
not look different). Expected color confusions can be rigorously modeled
in color order systems, but these results are difficult to apply without
color measurement equipment. Guidelines specified by color names (rather
than color measurements) eliminate a wider range of color combinations
than necessary, but they are usable in the context of designing statistical
graphics. The following pairs of hues are not confused by people with the
most common types of color vision impairments: red-blue, red-purple, orange-blue,
orange-purple, brown-blue, brown-purple, yellow-blue, yellow-purple, yellow-gray,
and blue-gray. These ten pairs are drawn from a total of 36 pairs of basic
color names, which leaves many pairs that are potentially confusing. For example,
any combination among red, orange, yellow, and green is potentially confusing.
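The ten hue pairs listed above lend themselves to a simple lookup when screening a proposed set of symbol hues. The sketch below encodes only the pairs named in the text. The function names are assumptions, and a pair that is absent from the list is flagged for review rather than declared confusing, since name-based guidelines are deliberately conservative.

```python
from itertools import combinations

# The ten hue pairs named in the text as not confused by people with
# the most common color vision impairments.
SAFE_PAIRS = {
    frozenset(pair) for pair in [
        ("red", "blue"), ("red", "purple"),
        ("orange", "blue"), ("orange", "purple"),
        ("brown", "blue"), ("brown", "purple"),
        ("yellow", "blue"), ("yellow", "purple"),
        ("yellow", "gray"), ("blue", "gray"),
    ]
}


def is_safe_pair(hue_a, hue_b):
    """True if the pair of basic color names is on the safe list."""
    return frozenset((hue_a, hue_b)) in SAFE_PAIRS


def pairs_to_review(hues):
    """Return every pair in a proposed hue set that is not on the safe list."""
    return [pair for pair in combinations(hues, 2) if not is_safe_pair(*pair)]


print(pairs_to_review(["red", "orange", "yellow", "green"]))
print(pairs_to_review(["orange", "blue"]))
```

Hues that fail this screen may still work if the colors also differ clearly in lightness, which colorblind readers can see.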
I discussed the popularity and usefulness of spectral schemes in the previous section, but these schemes also contain numerous colors that may be confusing to colorblind readers. Keep in mind that colorblind readers can see lightness differences, so a well-designed yellow-orange-red scheme will still be perceived as a lightness sequence (color vision impairments offer another reason to work carefully with lightness steps between hues in sequential schemes). Skipping yellowish-greens and using a lightness sequence of bluish greens, blues, and purples completes a spectral diverging scheme with hues that are not confused with the oranges and reds of similar lightness.
Color appearance is affected by context. Small colored objects will be more difficult to identify than larger colored areas. Thus, you will be able to distinguish fewer colors with small point symbols or thin lines. Different surroundings also change the appearance of a color. Small color areas tend to appear more similar to their surroundings because of a perceptual process called assimilation. Conversely, contrast between a larger patch of color and its surrounding color will enhance the difference between them in a process called simultaneous contrast. For example, a color of medium lightness will look darker on a light background and lighter on a dark background. A gray will look greenish on a red background (its opponent complement) and a green will look yellower on a blue background. These perceptual interactions between colors can change perceptions of individual symbols substantially, so they may not match corresponding colors in the key (saturation is particularly susceptible to change). There is little that can be done to prevent contrast effects from occurring because the data distribution being represented controls the positions of the symbols. The best approach is to look carefully at the colors in the data graphic. Do not select colors by comparing them only in the key, where they are seen in one order on a uniform background. It is also important to look at your selection of colors in the final graphic format (as a projected slide or overhead transparency, color inkjet print, lithographic proof, color photocopy, etc.). Be sure you can identify examples of each color symbol and can tell colors apart throughout the graphic.
Color Preference
I recommend that you do not become overly concerned about which colors
your audience will like. Everyone seems to have an opinion about color
aesthetics, and members of your audience will undoubtedly have differing
opinions based on their own color preferences. There has been a substantial
amount of loosely structured research on color preferences. Regardless
of context, it seems that most people like blue and don’t like yellow,
but that is overly simplistic guidance for use in multi-color contexts.
People also like graphics with many colors, so focus your attention on
organizing the perceptual dimensions of color so that your data is presented
clearly, whether or not you’ve picked everyone’s favorite colors.
Mixing Primaries
We want to control the perceptual dimensions of color, but we are often
working with color mixture systems of RGB or CMYK (K for blacK) to specify
colors. Basically, these systems work in opposite ways. Increasing amounts
of CMY ink mix to darker and darker colors (these are the subtractive primaries)
and increasing amounts of RGB light mix to produce lighter colors, with
all three mixing to white (these are the additive primaries). I’ll provide
some basic rules of thumb for mixing CMYK that may assist your color design
work.
1. Set hue using proportions of the two higher percentages of CMY (Example:
20C-40Y is approximately the same hue as 50C-100Y)
2. Set lightness using overall magnitude of CMK percentages, since
Y will stay light (Example: 15M-5K is lighter than 30M-10K)
3. Set saturation using the lowest percentage of CMY or use K (Example:
20M-20Y is more saturated than 20M-20Y-5K or 20M-20Y-5C)
4. Create systematic perceptual changes with systematic percentage
changes (Example: 5-15-35-65 will look more evenly spaced than 5-10-55-65)
5. Equal percentage steps don’t look like equal visual steps; use bigger
steps in higher percentages (Example: 5 to 15 looks more different than
80 to 90)
6. Do not use all four inks: desaturate/darken with either K or the
lowest percentage of CMY (Example: 20C-40Y-8K will be easier to work with
than 20C-5M-40Y-5K)
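Rule 1 can be illustrated with a small calculation. The sketch below assumes an idealized subtractive model in which each ink percentage removes exactly that share of its complementary light primary. Real inks and printers do not behave this cleanly, so treat the result as a rough guide. The hue angle is the mixture-based hue from Python's colorsys module, not a perceptual hue, and the function names are assumptions.

```python
import colorsys


def naive_cmy_to_rgb(c, m, y):
    """Idealized conversion from ink percentages (0-100) to RGB fractions (0-1)."""
    return (1 - c / 100, 1 - m / 100, 1 - y / 100)


def hue_angle(c, m, y):
    """Mixture-based hue angle in degrees for a CMY specification."""
    r, g, b = naive_cmy_to_rgb(c, m, y)
    h, _s, _v = colorsys.rgb_to_hsv(r, g, b)
    return h * 360


# Same 1:2 proportion of C to Y at two different overall ink levels.
print(round(hue_angle(20, 0, 40), 3), round(hue_angle(50, 0, 100), 3))
# Reversing the proportion (2:1) shifts the hue toward blue-green.
print(round(hue_angle(40, 0, 20), 3))
```

Under this idealized model, the two specifications with the same C-to-Y proportion land on the same hue angle, while reversing the proportion moves to a different hue.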
Perceptual Color Systems
CMYK color mixing, as described above, requires a fair bit of practice,
and alternative solutions are to select colors using a perceptual color
space or to limit yourself to a palette of colors provided on a color chart
offered by the software. Some of the most rigorously designed color systems
from color science are Munsell, CIELAB, OSA-UCS, and NCS. All of these
systems use spectral order, with end points connected through non-spectral
purples, to define the order of hues around a central vertical axis defined
by one of the dimensions akin to lightness. Saturation or a similar measure
increases as you move away from the central lightness axis. Each system
also claims to be at least partly perceptually scaled, so that equal distances
in color space produce equal color difference perceptions. Munsell uses
hue, value, chroma (HVC) structured in cylindrical coordinates. The CIELAB
system, standardized by the Commission Internationale de l’Eclairage, uses
rectangular coordinates with RG and YB opponent hues defining perpendicular
axes intersecting at the vertical lightness axis. The Optical Society of
America Uniform Color Scales (OSA-UCS) has a sophisticated lattice geometry
that provides 12 perceptually equidistant neighbors for each color and
can be sliced on multiple planes to yield numerous well-structured color
sequences. The Swedish Natural Color System (NCS) also has a cylindrical structure
that is based on perceptual mixtures of the four unique hues and white/black.
Systematic paths through perceptual color systems produce systematic color schemes with pre-determined perceptual characteristics. A simple example of using Munsell for color scheme design is demonstrated by selecting a set of symbol colors for a diverging scheme. Without going into the details of Munsell color notation, you could choose common hues for all colors in each of two sequences above and below the data midpoint. For example, we used a brown (5YR) and blue-green (5BG) hue pair for the Atlas of U.S. Mortality. Each hue was used with the same lightness and saturation steps (value steps of 4/, 6/, 8/ and chroma steps of /8, /6, /4) that converged on a light gray median class (N 9/). This systematic perceptual structure takes a lot of fiddling in CMYK but is effortless in Munsell. The CMYK percentages to print these colors on a particular printer are much less systematic and were arrived at through rules of thumb and a dose of trial and error: 40M-100Y-47K, 20M-55Y-18K, 6M-17Y-2K, 2K, 22C-6Y, 61C-20Y-15K, 100C-35Y-40K. Tektronix was promoting a Munsell-like HVC interface for a while and LAB is showing up in illustration software now. If you want to make ready use of perceptually ordered color in data graphics, push the software vendors to integrate the perceptual color order systems into your visualization tools.
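The systematic structure of that diverging scheme can be written out in a few lines. This sketch assumes that the value and chroma steps listed in the text pair up in the order given (4/8, 6/6, 8/4) and that the CMYK percentages are listed in the same order as the classes. Both are my reading of the paragraph above, not something the text states outright. The function name is hypothetical.

```python
def diverging_munsell(first_hue, second_hue, values, chromas, midpoint):
    """Build Munsell notations running dark to light to dark across two hues."""
    steps = list(zip(values, chromas))  # ordered from dark to light
    first_arm = [f"{first_hue} {v}/{c}" for v, c in steps]
    second_arm = [f"{second_hue} {v}/{c}" for v, c in reversed(steps)]
    return first_arm + [midpoint] + second_arm


scheme = diverging_munsell(
    "5YR", "5BG", values=(4, 6, 8), chromas=(8, 6, 4), midpoint="N 9/"
)

# CMYK percentages quoted in the text, assumed to be in class order.
cmyk = [
    "40M-100Y-47K", "20M-55Y-18K", "6M-17Y-2K", "2K",
    "22C-6Y", "61C-20Y-15K", "100C-35Y-40K",
]

for munsell, ink in zip(scheme, cmyk):
    print(f"{munsell:10s} {ink}")
```

Laid side by side, the Munsell column changes in regular steps while the ink column does not, which is the contrast the paragraph describes.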
Computer science offers a few poorer cousins to these perceptual spaces that may also turn up in your software interface, such as HSV and HLS. They are easy mathematical transformations of RGB, and they seem to be perceptual systems because they make use of the hue-lightness/value-saturation terminology. But take a close look; don’t be fooled. Perceptual color dimensions are poorly scaled by the color specifications that are provided in these and some other systems. For example, saturation and lightness are confounded, so a saturation scale may also contain a wide range of lightnesses (it may progress from white to green, which is a change in both lightness and saturation). Likewise, hue and lightness are confounded, so a saturated yellow and a saturated blue may be designated as the same ‘lightness’ but differ widely in perceived lightness. These flaws make the systems difficult to use to control the look of a color scheme in a systematic manner. If much tweaking is required to achieve the desired effect, the system offers little benefit over grappling with raw specifications in RGB or CMY.
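The yellow and blue case is easy to reproduce. The sketch below compares the HLS "lightness" reported by Python's colorsys module with an approximate CIELAB L*, assuming the RGB values are sRGB fractions viewed under a D65 white point. The function name cielab_lightness is an assumption introduced for this illustration.

```python
import colorsys


def cielab_lightness(r, g, b):
    """Approximate CIELAB L* (0-100) for sRGB fractions in the range 0-1."""
    def linearize(c):
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    # Relative luminance Y for sRGB primaries with a D65 white point.
    y = 0.2126 * linearize(r) + 0.7152 * linearize(g) + 0.0722 * linearize(b)
    if y > (6 / 29) ** 3:
        f = y ** (1 / 3)
    else:
        f = y / (3 * (6 / 29) ** 2) + 4 / 29
    return 116 * f - 16


saturated = {"yellow": (1.0, 1.0, 0.0), "blue": (0.0, 0.0, 1.0)}

for name, rgb in saturated.items():
    _h, hls_lightness, _s = colorsys.rgb_to_hls(*rgb)
    print(name, "HLS lightness:", hls_lightness,
          "approx. L*:", round(cielab_lightness(*rgb), 1))
```

Both colors receive the same HLS lightness, yet their approximate L* values are far apart, so an HLS lightness slider cannot be trusted to produce an ordered lightness sequence across hues.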
I encourage you to be a critical consumer of color information, color systems, and color statistical graphics. Organize the perceptual dimensions of color in your graphic to parallel the logical orderings in your data, and choose color tools that make this task easy.
Bibliography of related research papers by Cynthia A. Brewer
These papers contain complete references to related work by others.
MacEachren, A.M., C.A. Brewer, and L.W. Pickle, “Visualizing Georeferenced Data: Representing Reliability of Health Statistics,” Environment & Planning A, Vol. 30, No. 9 (September 1998), pp. 1547-1561.
Brewer, C.A., “Spectral Schemes: Controversial Color Use on Maps,” Cartography and Geographic Information Systems, Vol. 24, No. 4 (October 1997), pp. 203-220.
Brewer, C.A., A.M. MacEachren, L.W. Pickle, and D.J. Herrmann, “Mapping Mortality: Evaluating Color Schemes for Choropleth Maps,” Annals of the Association of American Geographers, Vol. 87, No. 3 (September 1997), pp. 411-438.
Brewer, C.A., “Evaluation of a Model for Predicting Simultaneous Contrast on Color Maps,” The Professional Geographer, Vol. 49, No. 3 (August 1997), pp. 280-294.
Olson, J.M., and C.A. Brewer, “An Evaluation of Color Selections to Accommodate Map Users with Color Vision Impairments,” Annals of the Association of American Geographers, Vol. 87, No. 1 (March 1997), pp. 103-134.
Brewer, C.A., “Guidelines for Selecting Colors for Diverging Schemes on Maps,” The Cartographic Journal, Vol. 33, No. 2 (December 1996), pp. 79-86.
Brewer, C.A., “Prediction of Simultaneous Contrast between Map Colors with Hunt’s Model of Color Appearance,” Color Research and Application, Vol. 21, No. 3 (June 1996), pp. 221-235.
Brewer, C.A., and K.A. Marlow, “Color Representation of Aspect and Slope Simultaneously,” Proceedings, Eleventh International Symposium on Computer-Assisted Cartography (Auto-Carto-11), Minneapolis, October/November 1993, pp. 328-337.
Brewer, C.A., “Color Use Guidelines for Mapping and Visualization,” Chapter 7 (pp. 123-147) in Visualization in Modern Cartography, edited by A.M. MacEachren and D.R.F. Taylor, 1994, Elsevier Science, Tarrytown, NY.
Brewer, C.A., “Guidelines for Use of the Perceptual Dimensions of Color for Mapping and Visualization,” Color Hard Copy and Graphic Arts III, edited by J. Bares, Proceedings of the International Society for Optical Engineering (SPIE), San José, February 1994, Vol. 2171, pp. 54-63.
Brewer, C.A., “Review of Colour Terms and Simultaneous Contrast Research for Cartography,” Cartographica, Vol. 29, No. 3&4 (Autumn/Winter 1992), pp. 20-30.
Brewer, C.A., “The Effect of Color on the Perception of Map Scale,” Student Honors Competition Winning Papers, proceedings edited by E. Wingert, Cartography Specialty Group Publication #3, AAG, Toronto, April 1990, pp. 1-12.
Brewer, C.A., “Color Chart Use in Map Design,” Cartographic Perspectives, No. 4 (Winter 1989-90), pp. 3-10.
Brewer, C.A., “The Development of Process-Printed Munsell Charts for Selecting Map Colors,” The American Cartographer, Vol. 16, No. 4 (October 1989), pp. 269-278.