I have recently read some thought-provoking articles that discussed data visualization by analogy to photography. I really like this analogy, both from a process perspective – photography and data visualization – and a people perspective – photographers and data visualizers. Anyone who takes a picture with a camera is a photographer in that moment, and anyone who makes a chart, diagram or map based on data is a data visualizer while they’re doing that. Both photographers and data visualizers produce images of information emanating from their subjects, to make a point, to record, to inform, to delight. Photographers choose the lighting of their subject and framing of their shots, then use cameras to capture their image. Data visualizers choose the data they use about their subject and the mapping of data attributes to visual attributes, then use algorithms to produce graphics. Both can post-process their images to exert even finer control over their products.

Light is the raw material of photography – the word ‘photography’ literally means ‘drawing with light’ – as data is the raw material of data visualization. It is hard to take a good photo of even the most interesting subject at dusk when it’s overcast, just as it is challenging to make a good data graphic with small volumes of low-quality data. Photographers seek to control the temperature, intensity and angle of the light shining on their subjects, as data visualizers seek data sets about their subjects which are complete, accurate and at the right level of detail to support their graphic. Some photographers snap photos opportunistically with their phone when the lighting is nice, and some exert more direct control with flashes, lamps and reflectors. Similarly, some data visualizers stumble across interesting data sets, while some set out to collect specific observations about their subject to support a graphic they are building.

Framing is critical to photography: controlling how the light bouncing off the subject will enter the camera. Where is the camera in relation to the subject and the main sources of light? Which parts of the subject are in the shot, which are in focus, and how does the subject relate to the background? Photographers choose the answers to these questions when setting up a shot, just as data visualizers make choices about how data about their subjects will be mapped into an image. What kinds of marks will comprise the graphic, and how will the attributes of these marks map to attributes in the data? What transformations will the data undergo before being mapped to visual attributes? Will extra data be blended with the original dataset to provide context? Both photography and data visualization have well-established idioms by now: the school portrait, the bar chart, the selfie, the pie chart, the tourist in front of the landmark, the scatterplot; but there are also innovative photographers taking interestingly-framed or -timed photos, and innovative data visualizers playing with new visual forms.
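The mark-and-attribute mapping described above can be sketched in a few lines of Python. This is only an illustrative toy – the records, field names and channel names are invented for this example, not drawn from any real dataset or library – but it shows the core choice a data visualizer makes: which data attribute feeds which visual channel of each mark.

```python
# Hypothetical records: three observations with three attributes each.
records = [
    {"year": 2019, "sales": 120, "staff": 8},
    {"year": 2020, "sales": 95,  "staff": 7},
    {"year": 2021, "sales": 140, "staff": 9},
]

# The framing decision: which data attribute maps to which visual channel.
# Swapping these assignments produces a different graphic from the same data.
mapping = {"x": "year", "y": "sales", "size": "staff"}

def to_marks(data, mapping):
    """Turn each record into an abstract mark by applying the channel mapping."""
    return [
        {channel: record[field] for channel, field in mapping.items()}
        for record in data
    ]

marks = to_marks(records, mapping)
print(marks[0])  # {'x': 2019, 'y': 120, 'size': 8}
```

A plotting library would then render each mark at its `x`/`y` position with the given `size`; the point here is simply that the mapping itself is a separate, deliberate choice, made before any pixels are drawn.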

The camera is the technology that defines photography, as the algorithm is the technology that defines data visualization. Cameras can be film or digital, and algorithms can be executed by human or digital computers. People certainly drew, painted, carved or embroidered images before either of these technologies was invented, but such drawings are neither photographs nor data graphics because they were not constrained by the way light moves through a camera or the way data moves through an algorithm. Innovations in the enabling technologies of both activities help drive innovation in their output: underwater cameras enable underwater photos, and efficient algorithms enable big-data visualization. Chasing these capabilities, some photographers collect camera bodies and accessories, or build new ones, much as some data visualizers try out every new piece of datavis software that is released, or build and share their own. At the other extreme, some photographers embrace the constraints of using a single simple camera, and some data visualizers stick with a single software package or technique, perfecting their mastery of its capabilities and parameters.

Once a photo comes out of the camera, whether on film or a memory card, photographers have a wide range of options for post-processing, in the darkroom or in Photoshop. Colours can be tweaked, filters can be applied, elements can be added or removed, and different photos can be blended together. Similarly, data visualizers can take the output of their software package and use Illustrator to change colours, move labels around or combine graphics together. Some photographs, such as fashion magazine covers, are so heavily post-processed as to stretch the definition of photography, and some datavis post-processing can alter basic parameters of the data attribute mapping so as to corrupt the algorithm’s output. In both communities there are a range of outlooks on post-processing, from those who revel in the precise control they can exert on their products, to purists and minimalists who prefer to hone their skills in controlling lighting/data, framing/mapping and the knobs on cameras/the parameters of algorithms.

Of course, in their control over the processes of image production, photographers and data visualizers are influencing the way the resulting images will be perceived. The wrong lighting can make a subject look terrible, the right framing can capture a subject’s “good side”, and Photoshop can make a subject look like just about anything. Cherry-picking or inventing the data to visualize can make a data graphic lie or mislead just as readily as trimming the zero line from a bar chart or erasing outliers from a scatterplot. Just because technology is involved in both activities doesn’t make them any more truthful or objective than the aforementioned drawing, painting, carving or embroidering as means of producing images.

Post-processing can also move the relationship between data visualization and photography from analogy to something much blurrier. In the practice of photoviz, photographs are combined in datavis-like ways – for example, my own Direction: Angrignon project. In scientific visualization domains such as medical imaging, control over “lighting” and algorithmic post-processing are both quite extreme, for example with CAT scans, PET scans, MRIs, etc.

Beyond the correspondences in the processes, there are some interesting symmetries in the way people play the roles of photographer or data visualizer. Scientists take photos with telescopes and microscopes, and collect and visualize data in charts and graphs. There are photojournalists and data journalists. There are professional photography studios for hire and professional data visualization studios for hire. There are hobbyist photographers and hobbyist data visualizers. I think this last one is why I find this analogy so interesting: it clarifies how I personally relate to data visualization much of the time, as a passion and a hobby – one that sometimes earns me strange looks at parties when I tell people it’s what I’m into. This analogy can help people understand the appeal and context of data visualization, not only as a hobby, but in many other domains where photography is today seen as a normal, useful and powerful everyday activity. “My passion? Data visualization! Ah, well you see, it’s a lot like photography…”

(with thanks to David de Koning and Michael Trauttmansdorff who helped me better understand the photography half of this analogy)

⁂