Audiovisual Integration in Speech Perception
The way we perceive speech can be dramatically affected by visual information from a speaker's face. For instance, if you hear the
consonant /b/ while watching the face of a speaker saying /g/, you are likely to hear the sound as a /d/ (McGurk & MacDonald, 1976).
Reference: McGurk, H., & MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746–748.
Try this example: Watch these movies, then play them again with your eyes closed. What did you hear each time?
It is likely that you heard the first as /aba/ and the second as /ada/ (or possibly /atha/).
However, if you listened with your eyes closed, you may have discovered that the sounds were actually identical!
Information from the movements of the speaker's mouth
changed your perception of the sounds.
Various theories have been proposed to explain how auditory and visual information are combined. However, little research has examined how learning influences audiovisual speech perception.
We have investigated the effects of experience on audiovisual integration in speech perception by training individuals to use novel visual cues for speech.
In our experiments, participants watched and listened to an animated cartoon robot that moved as it produced speech sounds. Over several training sessions,
participants learned about a systematic relationship between the robot's movements and the consonants /b/, /d/, and /g/.
Here are some examples of the animated robot:
After training, most participants learned the relationship between the robot's movements and the speech sounds well enough
that they could identify which consonant the robot produced just by watching it move, without sound.
Participants were able to use information from the robot to improve their accuracy in identifying consonants in noise (top two panels of figure).
Depending on how the robot videos were presented during training, listeners could use the robot to improve identification accuracy to the
same extent as by watching a speaker's face (bottom panel of figure). Further research in this area may have implications for improving speech perception in noisy environments
or for listeners with hearing impairment, and may help to refine theories of information integration in speech perception.
Reference: Stephens, J. D. W., & Holt, L. L. (submitted). Training of an artificial visual cue for use in speech identification.
Lori Holt