Most current theoretical approaches to speech perception fall into two categories. Some theorists believe that we humans must have a special mechanism in our nervous system that explains our impressive skill in speech perception. Others admire humans’ skill in speech perception, but they argue that the same general mechanism that handles other cognitive processes also handles speech perception.
Earlier in this chapter, we examined three theories of visual pattern perception. Unfortunately, researchers have not developed such detailed theories for speech perception. One reason for this problem is that humans are the only species who can understand spoken language. As a result, cognitive neuroscientists have a limited choice of research techniques.
The Special Mechanism Approach. According to the special mechanism approach (also called the speech-is-special approach), humans are born with a specialized device that allows us to decode speech stimuli. As a result, we process speech sounds more quickly and accurately than other auditory stimuli, such as instrumental music. Supporters of this approach argue that humans possess a phonetic module (or speech module), a special-purpose neural mechanism that specifically handles all aspects of speech perception; it cannot handle other kinds of auditory perception. This phonetic module would presumably enable listeners to perceive ambiguous phonemes accurately. It would also help listeners to segment the blurred stream of auditory information that reaches their ears, so that they can perceive distinct phonemes and words (Liberman, 1996; Liberman & Mattingly, 1989; Todd et al., 2006).
Notice that the special mechanism approach to speech perception suggests that the brain is organized in a special way. Specifically, the module that handles speech perception does not rely on the general cognitive functions discussed throughout this book, such as recognizing objects, remembering events, and solving problems (Trout, 2001). Incidentally, this modular approach is not consistent with Theme 4 of this textbook, which argues that the cognitive processes are interrelated and dependent upon one another.
One argument in favor of the phonetic module was thought to be categorical perception. Early researchers asked people to listen to a series of ambiguous sounds, such as a sound halfway between a b and a p. People who heard these sounds typically showed categorical perception; they heard either a clear-cut b or a clear-cut p, rather than a sound partway between a b and a p (Liberman & Mattingly, 1989).
When the special mechanism approach was originally proposed, supporters argued that people show categorical perception for speech sounds, but they hear nonspeech sounds as a smooth continuum. However, more recent research has shown that humans also exhibit categorical perception for some complex nonspeech sounds (Esgate & Groome, 2005; Pastore et al., 1990).
The General Mechanism Approaches. Although some still favor the special mechanism approach (Trout, 2001), most theorists now favor one of the general mechanism approaches (e.g., Cleary & Pisoni, 2001; Massaro & Cole, 2000). The general mechanism approaches argue that we can explain speech perception without proposing any special phonetic module. People who favor these approaches believe that humans use the same neural mechanisms to process both speech sounds and nonspeech sounds. Speech perception is therefore a learned ability (indeed, a very impressive learned ability), but it is not really “special.”
Current research seems to favor the general mechanism approach. As we already noted, humans exhibit categorical perception for complex nonspeech sounds (Pastore et al., 1990). Other research supporting the general mechanism viewpoint uses event-related potentials (ERPs), which we discussed in Chapter 1. This research demonstrates that adults show the same sequence of shifts in the brain’s electrical potential, whether they are listening to speech or to music (Patel et al., 1998).
Other evidence against the phonetic module is that people’s judgments about phonemes are definitely influenced by visual cues, as we saw in the discussion of the McGurk effect (Cleary & Pisoni, 2001; Massaro, 1998). If speech perception can be influenced by visual information, then we cannot argue that a special phonetic module handles all aspects of speech perception.
Several different general mechanism theories of speech perception have been developed (e.g., Cleary & Pisoni, 2001; Fowler & Galantucci, 2005; Jusczyk & Luce, 2002; McQueen, 2005; Todd et al., 2006). These theories tend to argue that speech perception proceeds in stages and that it depends upon familiar cognitive processes such as feature recognition, learning, and decision making.
In summary, our ability to perceive speech sounds is impressive. However, this ability can probably be explained by our general perceptual skill—combined with our other cognitive abilities—rather than any special, inborn speech mechanism. We learn to distinguish speech sounds in the same way we learn other cognitive skills.
Section Summary: Speech Perception
1. Speech perception is an extremely complex process; it demonstrates that humans can quickly perform impressively complex cognitive tasks.
2. Even when the acoustical stimulus contains no clear-cut pauses, people are able to determine the boundaries between words with impressive accuracy.
3. The pronunciation of a specific phoneme varies greatly, depending upon vocal characteristics of the speaker, imprecise pronunciation, and variability caused by coarticulation.
4. When a sound is missing from speech, listeners demonstrate phonemic restoration, using context to help them perceive the missing sound.
5. People also use visual cues to facilitate speech perception, as illustrated by the McGurk effect.
6. According to the special mechanism approach to speech perception, humans have a special brain device (or module) that allows us to perceive phonemes more quickly and accurately than nonspeech sounds.
7. The current evidence supports a general mechanism approach to speech perception; research suggests that humans perceive speech sounds in the same way we perceive nonspeech sounds.
CHAPTER REVIEW QUESTIONS
1. Think of a person whom you know well, who has never had a course in cognitive psychology. How would you describe perception to this person? Using details from this chapter, describe how this person accomplishes two visual tasks and two auditory tasks that he or she performs frequently.
2. Imagine that you are trying to read a sloppily written number that appears in a friend’s class notes. You conclude that it is an 8, rather than a 6 or a 3. Explain how you recognized that number, using the template-matching and feature-analysis theories.
3. Look up from your book and notice two nearby objects. Describe the characteristics of each “figure” in contrast to the “ground.” How would Biederman’s recognition-by-components theory describe how you recognize these objects?
4. Distinguish between bottom-up and top-down processing. Explain how top-down processing can help you recognize the letters of the alphabet in the word “alphabet.” How would the word superiority effect operate if you tried to identify one letter in the word “alphabet” if it were presented very quickly?
5. This chapter emphasized visual and auditory object recognition. How does top-down processing (e.g., prior knowledge) operate when you smell a certain fragrance and try to identify it? Then answer this question for both taste and touch.
6. According to the material in this chapter, face recognition seems to be “special,” and it probably differs from other recognition tasks. Discuss this statement, mentioning research on the comparison between faces and other visual stimuli. Be sure to describe material from neuroscience research on this topic, as well as difficulties encountered by people with schizophrenia.
7. Our visual world and our auditory world are both richly complicated. Describe several ways in which the complexity of the proximal stimuli presents challenges when we try to determine the “true” distal stimuli.
8. Both our visual system and our auditory system are designed to impose organization on our perceptual world. How does the gestalt approach help in visual perception? What factors help us overcome the difficulties in recognizing speech?
9. What kinds of evidence support the general mechanism approach to speech perception? Contrast this approach with the special mechanism approach. How is the special mechanism approach to speech similar to the findings about perceiving faces?
10. Throughout this book, we will emphasize that the research from cognitive psychology can be applied to numerous everyday situations. For example, you learned some practical applications of the research on face perception. Skim through this chapter and describe at least five other practical applications of the research on visual and auditory recognition.
KEYWORDS
perception
object recognition
pattern recognition
distal stimulus
proximal stimulus
retina
sensory memory
iconic memory
primary visual cortex
gestalt psychology
figure
ground
ambiguous figure-ground relationship
illusory contours
subjective contours
template-matching theory
templates
feature-analysis theories
distinctive feature
recognition-by-components theory
structural theory
geons
viewer-centered approach
bottom-up processing
top-down processing
word superiority effect
change blindness
inattentional blindness
ecological validity
holistic (recognition)
gestalt
brain lesions
prosopagnosia
fMRI
schizophrenia
speech perception
phoneme
coarticulation
phonemic restoration
McGurk effect
special mechanism approach
speech-is-special approach
phonetic module
speech module
categorical perception
general mechanism approaches
RECOMMENDED READINGS
Coren, S., Ward, L. M., & Enns, J. T. (2004). Sensation and perception (6th ed.). Hoboken, NJ: Wiley. Coren and his colleagues’ mid-level textbook emphasizes vision and hearing; however, other chapters provide information on taste, smell, the skin senses, and the perception of time.
Farah, M. J. (2004). Visual agnosia (2nd ed.). Cambridge, MA: MIT Press. I strongly recommend Martha Farah’s book for anyone interested in neuroscience; the book is both informative and well written.
Greenberg, S., & Ainsworth, W. A. (Eds.). (2006). Listening to speech: An auditory perspective. Mahwah, NJ: Erlbaum. Here is a current advanced-level exploration of speech perception, speech processing, and auditory scene analysis.
Henderson, J. M. (Ed.). (2005). Real-world scene perception. Hove, UK: Psychology Press. Most of the current chapter focused on how we perceive isolated objects. This interesting book addresses more complicated questions about how we perceive scenes in the everyday world.
Levin, D. T. (Ed.). (2004b). Thinking and seeing: Visual metacognition in adults and children. Cambridge, MA: MIT Press. Here’s a book that examines change blindness and inattentional blindness. Many of the chapters in this book also discuss whether we are aware of these kinds of deficits.
ANSWER TO DEMONSTRATION 2.4