By 2017, the Europeana Sounds project will have made over half a million digital sound tracks and many thousands of related items accessible through thematic channels on Europeana. Providing an easy way to search and navigate through such a vast online collection will be one of the aims of the project.

There are many ways to visualise an overview of large numbers of audio files, but how should single sound recordings be presented on the web? Human sensory experience is multimodal, yet our primary sense is vision, and the web, while a multimedia platform, is likewise primarily a visual one. Whether we interact with it via touchscreens, through emerging wearable devices such as Google Glass, or with traditional desktop computer screens combined with a keyboard and mouse, most of the information delivered is visual, in the form of texts, images or videos perceived through our eyes. Only audio requires a different interface: sounds emitted from earphones or loudspeakers and received by our ears.

So what is the best way to deliver, study and appreciate individual sounds? This is a problem that has been addressed for centuries on the written page, long before sound recording technology was invented. Visualisations have often been used to study sounds. The Jesuit and polymath Athanasius Kircher explored in print the world of music and sounds in his two-volume Musurgia Universalis, published in Rome in 1650. He used musical notation to illustrate sounds – even when attempting to describe the sounds of birds:

Kircher (1650) used musical notation to depict sounds made by farm and wild birds

Edison’s invention of sound recording technology in 1877 made it more practical to study and analyse sounds by replaying them at leisure. However, short of playing back a recording, it remained impossible to portray a sound accurately in print, especially non-musical sounds such as human speech or the noises of animals.

Another American invention, the sound spectrograph, was developed in the 1930s at Bell Telephone Laboratories to display changes in frequency spectra over time. By the 1940s, the device was being used in ingenious experiments as an elocution aid for people with hearing impairments: words enunciated by a tutor were displayed as spectrograms on a phosphor screen, and the pupils would then try to match the visual patterns with their own voices fed into the machine. In the 1960s, biologists realised the spectrograph was ideally suited to revealing and comparing the intricacies of bird songs. To this day, spectrograms are widely used to depict and compare bioacoustic signals in scientific publications.
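For readers curious how a spectrogram is computed digitally, the following is a minimal sketch rather than a description of the original spectrograph or of any particular software. It cuts a signal into short overlapping frames and measures the strength of each frequency in every frame. The function names, the frame size and the synthetic rising test tone are all assumptions chosen for illustration, and the plain discrete Fourier transform used here is slow but needs only Python's standard library.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of frequency-bin magnitudes per overlapping frame."""
    # A Hann window fades each frame in and out to reduce spectral smearing.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        # Only bins 0..frame_size/2 are needed for a real-valued signal.
        for k in range(frame_size // 2 + 1):
            total = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                        for n in range(frame_size))
            magnitudes.append(abs(total))
        frames.append(magnitudes)
    return frames


def rising_tone(sample_rate=8000, duration=0.5, f_start=500.0, f_end=2000.0):
    """Synthesise a test tone whose pitch glides upwards (an illustrative input)."""
    n_total = int(sample_rate * duration)
    samples, phase = [], 0.0
    for i in range(n_total):
        freq = f_start + (f_end - f_start) * i / n_total
        phase += 2 * math.pi * freq / sample_rate
        samples.append(math.sin(phase))
    return samples


def dominant_frequencies(frames, sample_rate, frame_size):
    """For each frame, return the frequency (Hz) of the strongest bin."""
    bin_hz = sample_rate / frame_size
    return [max(range(len(f)), key=f.__getitem__) * bin_hz for f in frames]


frames = spectrogram(rising_tone(8000), frame_size=64, hop=32)
peaks = dominant_frequencies(frames, 8000, 64)
print("first frame peak (Hz):", peaks[0])
print("last frame peak (Hz):", peaks[-1])
```

Each inner list is one vertical slice of the picture: time runs across the frames and frequency runs along the bins. For the rising test tone, the strongest bin climbs from roughly 500 Hz to roughly 2000 Hz, which would appear as an upward-sloping line if the magnitudes were drawn as an image.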

Sound spectrogram of the song of a Screaming Piha, an Amazon jungle bird, which you can hear in the sound clip below. The vertical axis is frequency (related to pitch), the horizontal axis is time. Try ‘reading’ the graph from left to right while listening to the sound clip below.

Using digital processing, the process can work in reverse: converting images to sounds. Even images that have no audio origins can be used to create ‘impossible’ sonic textures that no musical instrument could ever produce. The example below is a spectrogram of an audio track created using a camera and software by the Canadian musician “Venetian Snares” from his album Songs about my Cats. When the final track, Look, is processed with a spectrograph, a hidden image emerges from the audio signal.

Venetian Snares’ cat photos are encoded as sounds on the last track of his audio CD

Described as a colour-note organ, CoagulaLight is a program that can turn any bitmap image into sounds, which is how Look was generated. Recoding digital information from static images into sound frequencies that vary over time yields otherworldly sounds, such as this track generated from the Europeana Sounds logo:
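The general idea of recoding an image as sound can be sketched in a few lines. This is an illustrative assumption about one simple way to do it, not CoagulaLight's actual algorithm: each image row is treated as a sine tone of fixed pitch (top rows high, bottom rows low), each column as a short slice of time, and pixel brightness as loudness. The function names, the frequency range and the tiny diagonal test image are hypothetical choices made for this example.

```python
import math
import struct
import wave


def image_to_samples(image, sample_rate=8000, column_seconds=0.05,
                     f_low=200.0, f_high=3000.0):
    """Turn a greyscale image (rows of 0.0-1.0 values) into audio samples.

    Rows map to pitch (top row = f_high, bottom row = f_low), columns map
    to time, and pixel brightness sets the loudness of each row's tone.
    The image needs at least two rows.
    """
    rows, cols = len(image), len(image[0])
    freqs = [f_high - (f_high - f_low) * r / (rows - 1) for r in range(rows)]
    per_column = int(sample_rate * column_seconds)
    samples = []
    for i in range(cols * per_column):
        t = i / sample_rate
        col = i // per_column
        value = sum(image[r][col] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(rows))
        samples.append(value / rows)  # keep the mix within -1.0..1.0
    return samples


def save_wav(samples, path, sample_rate=8000):
    """Write samples in the range -1.0..1.0 to a 16-bit mono WAV file."""
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples))


# An 8x8 test image: a bright diagonal rising from bottom-left to top-right.
size = 8
diagonal = [[1.0 if r == size - 1 - c else 0.0 for c in range(size)]
            for r in range(size)]
samples = image_to_samples(diagonal)
print("samples generated:", len(samples))
```

Calling save_wav(samples, "diagonal.wav") would write the result to a file for listening; the diagonal line should be heard as a tone stepping upwards in pitch, and a spectrogram of that file should show the diagonal again.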

In fact, the interrelation between sound and vision runs deeper, at least for people with synaesthesia. In the special form known as chromesthesia, different sounds evoke different colours, as experienced by the French composer Olivier Messiaen (1908-1992) and the contemporary English artist David Hockney, both of whom have described ‘hearing colours’ when listening to particular notes or chords. More remarkable still is the completely colour-blind UK artist Neil Harbisson, dubbed the world’s first officially recognised cyborg, who ‘hears’ hundreds of different colours using a sensor surgically implanted in his skull that transforms the colour spectrum into tones (http://www.ted.com/talks/neil_harbisson_i_listen_to_color).

What is the connection with how sounds could be represented on Europeana? While some listeners prefer to listen intently without visuals, perhaps closing their eyes to focus their hearing, web users usually expect to view something while listening. One way is to show associated visual objects, such as photos, videos, transcripts or music scores of the sound being played. But these are not always available for each sound. In the digital age, the experience of seeing sounds is familiar to most people, because waveforms (displays of amplitude versus time) are so commonly used in audio editing and playback software that they have become almost a ‘lingua franca’ for portraying sound.

Examples of waveforms are the stylized graphics in the SoundCloud recording clips shown above, or the simplified waveforms used by the British Library on its Sounds website (example: http://sounds.bl.uk/Environment/Listen-to-Nature/022M-LISTNAT00288-0001V0), while our colleagues at CNRS use the Telemeta system (http://telemeta.org/), which allows users to select from many kinds of spectral and waveform visualisations for each sound clip. These help listeners to gauge the duration of a recording and to navigate to parts of interest, and they can reveal details about the sounds.
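A simplified waveform of this kind can be sketched as follows. This is a hypothetical illustration, not how SoundCloud, the British Library or Telemeta actually draw theirs: the recording is divided into a fixed number of equal time slices and only the loudest value in each slice is kept, giving one bar per slice. The function names and the fading-in test tone are assumptions made for the example.

```python
import math


def waveform_peaks(samples, buckets):
    """Reduce a recording to one peak amplitude per equal-length time slice."""
    if buckets <= 0:
        raise ValueError("buckets must be positive")
    n = len(samples)
    peaks = []
    for b in range(buckets):
        chunk = samples[b * n // buckets:(b + 1) * n // buckets]
        peaks.append(max((abs(s) for s in chunk), default=0.0))
    return peaks


def render_bars(peaks):
    """Draw the peaks as a single line of text, one block character per slice."""
    blocks = " ▁▂▃▄▅▆▇█"
    top = max(peaks, default=0.0) or 1.0  # avoid dividing by zero for silence
    return "".join(blocks[round(p / top * (len(blocks) - 1))] for p in peaks)


# Illustrative input: one second of a 440 Hz tone that fades in from silence.
sample_rate = 8000
tone = [(i / sample_rate) * math.sin(2 * math.pi * 440 * i / sample_rate)
        for i in range(sample_rate)]

peaks = waveform_peaks(tone, buckets=20)
print(render_bars(peaks))
print("duration in seconds:", len(tone) / sample_rate)
```

Because every bar covers the same span of time, the width of the drawing stands for the duration of the recording, and loud or quiet passages can be spotted at a glance, which is what makes such displays useful for navigation.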

Other kinds of visualisations are commonly used in audio players (an elaborate version is shown below). Though visually striking, they are rarely informative about the audio. Indeed, many listeners find that they distract from the experience of listening.

What’s your preference? Let us know!

This article is abridged and adapted from a presentation given in March 2014 as part of the British Library’s “Beautiful Science” season.

by Richard Ranft, The British Library