Lingua East

People should hear your ideas, not your accent.

Category: research

Seeing is Hearing: The McGurk Effect

For decades, speech pathologists and linguists have been entertaining people at parties with an interesting phenomenon known as the McGurk effect. The McGurk effect occurs when people hear audio of one sound while watching a speaker produce a different sound. What they report hearing differs from the sound that was actually played. I first learned of the effect via the following video, in which Patricia Kuhl of the University of Washington elicits the effect with the sounds /ba-ba/ and /da-da/ or /tha-tha/:

Searching for that video, I found a fantastic example using the “Bill! Bill! Bill!” chant from the ’90s kids’ science show, Bill Nye the Science Guy. Take a moment (24 seconds, to be exact) to watch and listen:

The audio is paired with images that affect how the word “Bill” is heard. First, images depicting different bills are shown. Then, as images of pails are shown, the sound heard changes to “pail”. Next, images of mayonnaise are shown, and the sound shifts again to “mayo”. Did you hear the three different words?

The McGurk effect shows up in babies exposed to English by the time they are five months old[1]. The effect seems to strengthen with age. However, the likelihood of a listener falling for the McGurk effect depends on several factors. These factors demonstrate the fascinating interplay between hearing and vision in our ability to understand spoken language.

In a noisy environment, people are more likely to mishear what was said. That makes sense; if there are a lot of noises around, it is harder to pick out one sound from the rest of the noise and correctly identify it. If English is your native language, you’re likely to fall for the McGurk effect. Researchers have found that native Japanese speakers are better able to correctly identify the sound presented, even when shown video of someone producing a different sound[2], with similar results for Chinese as a native language.

This may be related to differences in cultural communication, specifically eye contact. In English-speaking cultures, for the most part, eye contact is pretty constant, with the listener occasionally shifting their gaze away from the speaker. In Asian cultures, eye contact with a speaker is less common, and the listener spends much more time directing their gaze to something other than the speaker. How we hear language is affected by how much the visual system is engaged while we listen.

Further evidence that how we listen to language affects our tendency to fall for the McGurk effect was found in a 2008 study published in Brain Research[3]. In this study, deaf people who used cochlear implants to hear were compared with normally hearing people in their susceptibility to the McGurk effect. The normally hearing people did not fall quite as hard for the McGurk effect as the individuals using cochlear implants to hear, suggesting that the cochlear implant group relied more on what they saw the speaker doing with their mouth than the audio. This is further evidence that our understanding of spoken language is dependent on the sensory information we take in. This, in turn, seems to be related to our varied cultural communication styles.

We all come from different backgrounds of language, hearing, and abilities. It can be fun to share videos of the McGurk effect with people from diverse backgrounds, to see what they hear. Share what you heard in a comment below!

If you are interested in learning more about the McGurk effect, or if you would like to work on your speech hearing abilities, let us know. Until next time, let them hear your ideas, not your accent.

[1] Rosenblum, L., Schmuckler, M., & Johnson, J. (1997). The McGurk effect in infants. Perception & Psychophysics, 59, 347-357.

[2] Sekiyama, K. & Tohkura, Y. (1991). McGurk effect in non-English listeners: Few visual effects for Japanese subjects hearing Japanese syllables of high auditory intelligibility. Journal of the Acoustical Society of America, 90, 1797-1805.

[3] Rouger, J., Fraysse, B., Deguine, O., & Barone, P. (2008). McGurk effects in cochlear-implanted deaf subjects. Brain Research, 1188, 87-99.

How to Learn New Vowel Sounds

How would you explain to someone the difference between the sound a makes in bath and the a sound in bait? It’s hard, right? Kathryn Brady and her team at Southern Illinois University ran into this problem, and decided to add a visual component to accent modification with a 24-year-old native speaker of Farsi.

Here at Lingua East, we’re huge fans of Boersma and Weenink’s Praat software, which can show you what your speech looks like in a special graph called a spectrogram. Our friend from CORSPAN, Iomi Patten, has used similar software to help native Japanese speakers working on their English r. It’s great: you can say a word into a microphone, and like magic, it appears on your computer screen as thick fuzzy lines representing the sound energy. These fuzzy lines have certain patterns and hallmarks that correspond with certain sounds.
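For the curious, here is a rough sketch of the idea behind a spectrogram. It is not how Praat works internally, and the function names (hann_window, frame_spectrum, spectrogram, peak_frequency), the frame sizes, and the synthetic 500 Hz tone standing in for a recording are all assumptions made for illustration. The sketch chops a sound into short overlapping slices and measures how much energy each slice has at each frequency; a spectrogram is a picture of those measurements over time.

```python
import cmath
import math


def hann_window(n):
    """Return a Hann window of length n, used to taper the edges of each slice."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_spectrum(frame):
    """Magnitude of the discrete Fourier transform of one slice (bins 0..n/2)."""
    n = len(frame)
    magnitudes = []
    for k in range(n // 2 + 1):
        total = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        magnitudes.append(abs(total))
    return magnitudes


def spectrogram(samples, frame_size=256, hop=128):
    """Cut the signal into overlapping slices and return one spectrum per slice."""
    window = hann_window(frame_size)
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        chunk = [samples[start + i] * window[i] for i in range(frame_size)]
        frames.append(frame_spectrum(chunk))
    return frames


def peak_frequency(spectrum, sample_rate, frame_size):
    """Frequency in Hz of the strongest bin in one spectrum."""
    k = max(range(len(spectrum)), key=lambda i: spectrum[i])
    return k * sample_rate / frame_size


# Hypothetical stand-in for a recording: a pure 500 Hz tone, 0.1 s at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 500 * t / SAMPLE_RATE) for t in range(800)]
spec = spectrogram(tone)
```

Each row of spec is one time slice. Plotting the rows side by side, with darker shading for larger values, gives the fuzzy bands described above. A pure tone produces a single band; real speech has several at once, and their positions differ from vowel to vowel.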

A native speaker can produce a sound and ask someone who speaks with an accent to produce the same sound. Then, the person with the accent can use the visual feedback of the spectrogram to try to make their sound match the native speaker’s. This can be a fun game of trial and error as the person makes small changes with their tongue or lips to try to get the spectrograms to match.

Even if the person does not hear the sound in the same way a native speaker might, using this method, they can learn to produce the sound. With listening practice, they can learn to hear the difference. People have known this for a while. Here’s some research showing it’s possible for native Japanese speakers to learn to distinguish between r and l.

But back to Brady and her team…

Their subject – let’s call him Pete – spoke English that was pretty understandable, but with a light accent. You see, this guy grew up speaking a language with only six vowels, compared to around fifteen in American English. It’s no walk in the park to learn to distinguish – let alone produce – vowels that you never used until your adult life. How can someone possibly learn how to say sounds that they can’t hear?

Figure 1 Vowel Quadrilateral

The researchers took an interesting approach. They decided to work with Pete on three vowels, and to look at his accuracy with another vowel that they didn’t train, just to see if what they were doing had any effect. They did four things: they gave Pete spoken models of specially chosen words that contained the vowels they were training, they showed him a picture of the position of the tongue for correct production of each vowel, they showed him a spectrogram of his production, and they showed him something called the vowel quadrilateral.

The vowel quadrilateral is a chart that shows all the different vowels of English in relation to one another according to what’s going on in the mouth. It should be noted that Pete’s version of this chart only had the vowels he was working on. See Figure 1.

The researchers hoped that by giving Pete visual explanations of how he needed to shape his mouth to lessen his accent, he would get it.

Training sessions were about half an hour a few days a week. There were eleven in total. During the training sessions, Pete produced the vowel on its own, in single syllable words, in multisyllabic words, then phrases and sentences. They also had Pete produce the vowel in short words paired with other, similar short words that did not contain the same vowel.

They recorded Pete’s productions of the training targets and showed him the spectrogram of his vowels next to a spectrogram of correctly produced vowels. They told him whether he had produced the vowel correctly, and also had him listen to recordings of himself and judge them as correct or incorrect. There was a certain minimum number of correct productions and correct spectrograms he had to get before they would let him go home.

The researchers tested Pete a couple of weeks after his final training session on his production of the vowels in some different words, both alone and with a carrier phrase. A carrier phrase is a phrase that goes with a word to make what the person says longer, like “This is a…”

Figure 2. Results for vowels in words. The bottom graph shows the untrained vowel.

The training worked! Pete learned during the training to produce more accurate vowels, and his accuracy remained pretty stable in the weeks after the final training session. This only really happened with the vowels that were specifically trained. The vowel the researchers measured but did not train improved a little bit, but not nearly as much as the other three. (See Figure 2, original Figure 1 from the study, used with permission.)

Furthermore, Pete noticed improvements in his speech. With his new understanding of how we produce vowels, he took a keen interest in the spectrograms and in the production of the vowel that he hadn’t worked on in the study.

We asked Kathryn Brady if Pete eventually improved on that other vowel, and she reported that with further training, he mastered it!

Vowels are hard to train, mostly because the articulators do not make contact; instead, one vowel becomes another through changes in the shape and size of the mouth. However, with a little bit of training, learning new vowels is indeed possible.

At Lingua East we love to play with spectrograms, especially when it results in clearer communication. After all, people should hear your ideas, not your vowels.



Brady, K., Duewer, N., & King, A. (2016). The effectiveness of a multimodal vowel-targeted intervention in accent modification. Contemporary Issues in Communication Science and Disorders, 43, 23-34.


Accent in English as a Second Language

It’s not you, it’s them (sometimes).

If you’re communicating in a language you picked up later in life – also known as an L2 – effective communication is more than just knowing how to put the words together. It’s about pronouncing the words clearly and fluidly, with just the right intonation to get your point across, and using the right words. That’s true even in a first language.

A topic of investigation for speech researchers is what, exactly, contributes to our hearing an accent in the speech of someone’s L2. Three factors have been identified as affecting how English spoken as a second language sounds: intelligibility, accentedness, and comprehensibility.

Intelligibility is a measure of how much of what a person says can be understood by a typical listener.
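One way to put a number on intelligibility is to have a listener write down what they heard and count how many words match. The sketch below assumes that approach. The function name intelligibility_score and the listener’s transcription are made up for illustration, and this is not necessarily how any particular study scores it.

```python
def intelligibility_score(intended, transcribed):
    """Fraction of intended words the listener wrote down in the same position."""
    target = intended.lower().split()
    heard = transcribed.lower().split()
    if not target:
        return 0.0
    # Compare word by word; extra or missing words at the end simply do not match.
    matches = sum(1 for t, h in zip(target, heard) if t == h)
    return matches / len(target)


# Hypothetical listener who misheard one word out of six.
score = intelligibility_score(
    "Crazy Mary digs a deep hole",
    "crazy mary digs a big hole",
)
print(f"{score:.0%} of the words were understood")
```

A position-by-position comparison is the simplest possible choice. It penalizes a listener heavily for dropping a single word early in the sentence, so a more forgiving scorer would align the two word lists before counting.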

Accentedness is similar to intelligibility, but involves influence from a native language. When we speak languages we learned later in life, it’s hard to know how much of an accent we have, because the perception of accentedness comes from a listener who learned that language from birth. What can we do about this? Knowing we have an accent, we can work to make our L2 sound more natural, or native. This is where the accent in “accent modification” comes from.

Comprehensibility is a little bit different. It has to do with how easy it is for a listener to process what someone else says. It involves not just the sounds of speech, but also the meanings of the words and how they’re put together. Comprehensibility gets at our deep understanding of a language: knowing which words to use and when, in what order, and getting those words out clearly enough that the person you’re talking to can follow what you’re saying. I think we’ve all had conversations with someone speaking an L2 where it took a lot of mental effort just to understand what they were trying to say.

A study published in the Journal of Speech, Language, and Hearing Research looked at the three factors mentioned above in Spanish speakers with English as an L2. The speakers were each assigned to one of three groups depending on how much of an accent they had and were recorded as they produced three types of sentences:

  1. True/False: A statement that is either true or false. (Example: June is the first month of the year.)
  2. Meaningful: A sentence that is grammatically correct and makes sense. (Example: Crazy Mary digs a deep hole.)
  3. Unexpected: A sentence that, although it is grammatically correct, does not make sense because of the vocabulary used. (Example: The refrigerator ran across the field.)

Monolingual English speakers listened to the recordings and ranked them by accent. These rankings coincided pretty accurately with the accent groups the speakers were assigned to. Of the three sentence types, the True/False sentences were the easiest to understand, and were judged as being spoken with less of an accent than the other two sentence types. What’s more interesting is that while the meaningful sentences were pretty easy to understand, the listeners judged the recordings of the unexpected sentences that didn’t make sense as being spoken with more of an accent. In other words, when the speakers said sentences that did not make sense because of the vocabulary, the listeners perceived a stronger accent!


This research indicates that part of what gives us an accent when we’re speaking a second language comes from the person we’re talking to. We have to take into account how their brain is processing what we’re saying. We can do this by really thinking about and working to improve the way we present our ideas and introduce new topics. Often, it’s our most exciting and innovative ideas that we most want to be heard by others. Working on communication skills in your second language can help others to start thinking about the meat of your ideas without having to waste brainpower processing the words you’re using to communicate.

It takes a long time to learn another language really well, and if you have, chances are good that you’ve worked your tail off to learn the vocabulary to communicate intelligently with native speakers. Lingua East can help you with your accent and comprehensibility so that people hear your ideas, not your accent. Contact us today!

© 2017 Lingua East