For decades, speech pathologists and linguists have been entertaining people at parties with an interesting phenomenon known as the McGurk effect. The McGurk effect occurs when people hear audio of one sound while watching video of a different sound being produced. Listeners often report hearing something different from the sound that was actually played. I first learned of the effect via the following video, in which Patricia Kuhl of the University of Washington elicits the effect with the sounds /ba-ba/ and /da-da/ or /tha-tha/:
Searching for that video, I found a fantastic example using the “Bill! Bill! Bill!” chant from the '90s kids' science show, Bill Nye the Science Guy. Take a moment (24 seconds, to be exact) to watch and listen:
The audio is paired with images that affect how the word “Bill” is heard. First, images of different bills are shown. Then, as images of pails are shown, the sound heard changes to “pail”. Next, images of mayonnaise are shown, and the sound shifts again to “mayo”. Did you hear the three different words?
A McGurk effect shows up in babies exposed to English by the time they are five months old. This effect seems to strengthen with age. However, the likelihood of a listener falling for the McGurk effect depends on several factors. These factors demonstrate the fascinating interplay between hearing and vision in our ability to understand spoken language.
In a noisy environment, people are more likely to mishear what was said. That makes sense; if there are a lot of noises around, it is harder to pick out one sound from the rest of the noise and correctly identify it. Native language also plays a role. If English is your native language, you’re likely to fall for the McGurk effect. Researchers have found that native Japanese speakers are better able to correctly identify the sound presented, even when shown video of someone producing a different sound, and similar results have been reported for native speakers of Chinese.
This may be related to differences in cultural communication, specifically eye contact. In English-speaking cultures, for the most part, eye contact is pretty constant, with the listener only occasionally shifting their gaze away from the speaker. In Asian cultures, eye contact with a speaker is less common, and the listener is much more likely to direct their gaze to something other than the speaker. How we hear language is affected by how engaged the visual system is while we listen.
Further evidence that how we listen to language affects our tendency to fall for the McGurk effect was found in a 2008 study published in Brain Research. In this study, deaf people who used cochlear implants to hear were compared with normally hearing people in their susceptibility to the McGurk effect. The normally hearing people did not fall quite as hard for the McGurk effect as the individuals using cochlear implants to hear, suggesting that the cochlear implant group relied more on what they saw the speaker doing with their mouth than the audio. This is further evidence that our understanding of spoken language is dependent on the sensory information we take in. This, in turn, seems to be related to our varied cultural communication styles.
We all come from different backgrounds of language, hearing, and abilities. It can be fun to share videos of the McGurk effect with people from diverse backgrounds, to see what they hear. Share what you heard in a comment below!
If you are interested in learning more about the McGurk effect, or if you would like to work on your speech hearing abilities, let us know. Until next time, let them hear your ideas, not your accent.
Sekiyama, K. & Tohkura, Y. (1991). McGurk effect in non-English listeners: Few visual effects for Japanese subjects hearing Japanese syllables of high auditory intelligibility. Journal of the Acoustical Society of America, 90, 1797-1805.