Right now in my literature review I’m interested in why researchers want to create virtual acoustic environments. It’s not difficult to find incremental research improving a particular model or testing the psychoacoustical limits of a technique, but it takes more effort to determine why the researcher cares. I’ve found several common motivations and have highlighted some key examples, though I add the disclaimer that far more work has contributed to the field than is mentioned here.
Architects and Those Who Hire Them
Big money is invested in new performance spaces, and investors like a guarantee that their money will be well spent. Initial auralization work was done without the assistance of computers: acousticians built tiny scale models of spaces and studied how sound waves traveled around the model in hopes of extrapolating what would happen in full-sized rooms. This work is now done with computer models that simulate the physics of how sound travels and interacts with objects. Two of the major software packages used are CATT-Acoustic and Odeon. Though computers can create very complex models that precisely predict how sound will move through a space, the limitation is that the sound cannot be rendered in real-time; moving through a space while listening cannot be modeled interactively. CATT-Acoustic is addressing this issue by rendering the data required to compute the audio offline so that some amount of movement can be approximated. However, this approach, pre-computing a large number of impulse responses, requires a large amount of data storage.
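To make that concrete, here is a minimal sketch of convolution-based auralization, the basic operation these offline pipelines build on. This is my own illustrative Python, not the actual CATT-Acoustic implementation: a dry recording is convolved with a room impulse response to simulate how it would sound in that space.

```python
# Minimal auralization sketch (illustrative; not CATT-Acoustic's pipeline).
# A dry ("anechoic") recording convolved with a room impulse response (IR)
# approximates how that recording would sound in the measured/modeled room.
import numpy as np
from scipy.signal import fftconvolve

def auralize(dry: np.ndarray, ir: np.ndarray) -> np.ndarray:
    """Convolve a dry signal with a room IR, normalized to avoid clipping."""
    wet = fftconvolve(dry, ir)
    return wet / np.max(np.abs(wet))
```

The catch is that each listener position (per source, and per ear for binaural playback) needs its own impulse response, so approximating free movement means pre-computing and storing a great many IRs.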
Education and Posterity
The field of archaeological acoustics has grown in recent years as researchers have begun to discover similar aural phenomena across multiple archaeological sites. The questions then emerge: did previous civilizations have an understanding of acoustics; were these acoustic phenomena intentional design features; did these phenomena play a direct role in society, for example in religious ceremonies? (The third chapter in Spaces Speak is an excellent reference on the subject.) Engineers approach these questions by meticulously measuring the spaces so that they can be auralized and further studied.
More recently, audio engineers have acknowledged a need to preserve and record spaces of significance, such as famous concert halls. Angelo Farina (see this paper in particular) and Damian Murphy are two of the researchers actively trying to accurately capture and then model acoustic spaces of historical note.
I attended the Audio, Acoustics, Heritage Workshop in 2008, which addressed many of these issues. I was particularly interested in the work presented by Gui Campos from the University of Aveiro in Portugal. The Painted Dolmen (Anta Pintada) is a Neolithic site in Portugal with fragile paintings that were already significantly damaged in previous archaeological excavations, so it is not open to the public. The Portuguese government wanted to create educational tools so that the public could still experience the heritage site without causing further damage. This seems to be an increasingly popular enterprise for governments; both the British and Italian governments have funded similar projects.
Researchers from the University of Aveiro used a laser scanner to precisely measure the space and then modeled it for virtual-reality simulation. Though the data existed to create a complex, detailed model of the space, it could not be auralized in real-time, so a simplified model was implemented instead. A similar application was developed for an archaeological park in Italy using GPS and custom software for mobile phones (see the paper for details). The researchers found that the sounds included to recreate the soundscape were well received by the students who tested the system. However, even though they had 3D models of the ruins, they did not use any auralization, neither real-time nor pre-rendered.
Entertainment
Interactive 3D environments are becoming increasingly common and complex for the average consumer as video game hardware advances. A PS3 and a 5.1 surround sound system trump most research setups of the past decade. One enclave of the industrial and academic research lab is the CAVE. CAVEs are immersive visual environments that usually have projected images encompassing an entire room and can use loudspeaker or binaural (headphone) technology for audio playback. A number of applications have been developed for CAVE-like environments; you can find a description of several of them here.
The Acoustics research group at the Helsinki University of Technology developed a system at the turn of the century called DIVA (Digital Interactive Virtual Acoustics). It models performing musicians and allows a listener to move virtually around them while listening to their performance. The major compromise in such systems is trading accuracy for interactivity: it is deemed more desirable to have an immersive, engaging virtual system that only approximates a space that might exist in reality than to get hung up on details and incur longer processing times. This is the approach taken in all video games: perceptual approximation overrides absolute reality.
What Conclusions Can We Draw?
Applications using virtual acoustic environments are being developed for different end-users, with priorities ranging from high-precision acoustic recreation with little focus on interactivity to a heavy focus on interactivity at the expense of accurate acoustic models. In between is the emerging field of edutainment, which hopes to use the interactivity of virtual environments to attract and teach students about specific acoustic environments. The signal processing is falling short, though. While great advances are being made in auralizing 3D models of spaces, complementary technology has not been sufficiently developed to aid real-time interaction with this data.
A visual parallel is computer animation. Feature-length films are rendered offline, whereas video games require the hardware to produce images in real-time as the player moves through the game. The visuals in video games do not look as good as those in films, but they are quickly approaching that quality as the hardware improves. The same is true of virtual acoustics: high-quality audio can be rendered offline, and it is only a matter of hardware before real-time, interactive audio of the same quality can be generated.
For the time being, clever algorithms need to reduce the heavy processor loads and large amounts of data required, so that high-quality, interactive audio can be generated on mobile devices. A significant portion of my PhD work looks at efficiently summarizing and interacting with a large database of impulse responses, the data that generates the audio of a 3D model, so that lightweight applications can be created without compromising the audio quality of the original 3D model. I am also looking at clever ways of compressing the data so that less storage is required; a rough sketch of the scale of the problem follows.
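As a back-of-the-envelope illustration of why storage matters (the numbers below are made up for this sketch, not figures from any system mentioned above), consider pre-rendering binaural impulse responses on a modest grid of listener positions:

```python
# Rough storage estimate for a pre-rendered IR database (illustrative
# numbers chosen for this sketch, not from any particular system).
fs = 48_000            # sample rate in Hz
ir_seconds = 2.0       # captured reverberation length per IR
bytes_per_sample = 4   # 32-bit float samples
positions = 50 * 50    # listener positions on a 50 x 50 grid
channels = 2           # one IR per ear for binaural playback

total_bytes = positions * channels * int(ir_seconds * fs) * bytes_per_sample
print(f"{total_bytes / 1e9:.2f} GB")  # ~1.92 GB for one source in one room
```

Even this modest example, a single source in a single room, lands in the gigabyte range, which is why both summarizing the database and compressing the individual responses are worthwhile.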