The other day I wrote a bit about some of the intellectual and linguistic foundations of Knight-Thompson Speechwork. Today I’d like to write a bit about how Dudley found his way from there to here, as it were, and a bit about why KTS-based speech classes spend so much more time on vocal tract explorations, play, and teaching (the entirety of) the International Phonetic Alphabet (IPA) than traditional speech classes do.
When Dudley trained as an actor, at Yale Drama in the early 1960s, there wasn’t much in the way of speech training as we’d recognize it today. It was only later on, when he began teaching speech himself, that he encountered Edith Skinner’s ‘Good American Speech.’ Skinner and Good American Speech were really the only game in town for most of the 20th century, and even into the 21st. Skinner’s much-reprinted book, Speak With Distinction, is still used even today in many American actor training programs to teach her ‘Good American Speech’ accent pattern, or some version of it. In the 1980s and 90s, while teaching speech and voice at the University of California Irvine, Dudley began to ask some fundamental questions about the Skinner model. Where did it come from in the first place? Why was it such an entrenched, universal feature of American actor-training programs? What did its claims to clarity, correctness, superiority, and euphony rest on? And perhaps most importantly, was it effective? 
In order to answer this last question, it was first necessary to figure out what the goals of speech-training for actors should be. Dudley was extremely well prepared to investigate this question. He’d been acting for decades in every medium: on stage, on television, in films, and on the radio. He’d also been a dialect coach for quite a while; for longer, in fact, than he’d been teaching speech in an institution. So he was well aware, from every vantage point, of the many different potential demands placed on actors’ speech and embodiment of language. Shakespeare in an outdoor arena required different strategies and skills, speech-wise, than an intimate scene in a TV drama shot in close-up. Increasingly, actors needed to be able to adopt and own different accents, and to embody them authentically and persuasively. Furthermore, across all media and material (leaving aside Brad Pitt in Snatch), actors needed, above all, to be intelligible: their words had to be understood by the audience. To be sure, there are other values one might add to these two. But for Dudley, it seemed that these two, adaptability and intelligibility, were the absolute fundamentals. All the other usual suspects (beauty, “correctness”) were examined and found to be problematic, at best.
Having come to the conclusion that these two skills—intelligibility and adaptability—were of primary importance in actor speech-training, the vital question then became: What was the most effective way to impart them? Was it, in fact, Skinner’s ‘Good American Speech’? The answer, of course, was no.
Towards an effective pedagogy
Dudley’s own practical experience as a teacher and a coach led him to feel quite strongly that drilling a specific, rigidly prescribed speech pattern was not the way to go. As soon as he started poking around, he discovered a veritable mountain of empirical evidence (on speech perception, second-language acquisition, and the like) to support this view. Science, in particular the science of linguistics, had moved light-years ahead of theatre-based voice and speech practitioners. We had some catching up to do!
There are obstacles in the way of simply perceiving speech sounds accurately. Though these affect people unevenly—some adults retain more ability to discriminate between subtle variations than others—they affect all of us to some degree. This is the result of a phenomenon called categorical perception. When we hear a speech sound—the vowel [ɪ], for example—we need to sort it (quickly!) into a perceptual category in our brain. Imagine a bucket labelled /ɪ/. There’s another bucket, right next to that one, labelled /i/. As English speakers, it’s this sorting activity—[ɪ]-ish sounds go in /ɪ/, [i]-ish sounds go in /i/—that allows us to discriminate between pairs of words like green and grin, beat and bit. But imagine the situation for a non-native speaker of English. Let’s call him Armand. Armand’s first language is French. He’s been studying English for a while—he actually speaks it reasonably well—but he continues to be bedevilled by this /i/ vs. /ɪ/ issue. He’s often unsure about whether someone is saying seat or sit to him, especially when the context doesn’t help him. And remembering which vowel he’s supposed to use, and actually managing to—well, it’s hit or miss, at best. The reason for this is that he has only one bucket in his brain for sorting [ɪ]-like and [i]-like sounds into: an /i/ bucket. (The two sounds are acoustically very close to each other, being produced with very similar tongue positions.) So when he hears a sound, whether it’s an [ɪ] or an [i], his brain wants to sort it into that one /i/ bucket. That’s categorical perception, in a nutshell.
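For readers who like to see an idea run, here is a toy sketch of the bucket-sorting described above. It is only an illustration, not a model of how the brain actually does it: it assumes a heard vowel can be reduced to two formant frequencies (F1, F2) and that sorting means picking the nearest bucket. The formant numbers are rough, illustrative values, and the names `ENGLISH_BUCKETS`, `ARMAND_BUCKETS`, and `categorize` are invented for this example.

```python
import math

# Hypothetical (F1, F2) prototypes in Hz. These are rough illustrative
# values assumed for this sketch, not measurements of any real speaker.
ENGLISH_BUCKETS = {"/i/": (270, 2290), "/ɪ/": (390, 1990)}

# Armand has a single bucket for this region of the vowel space.
ARMAND_BUCKETS = {"/i/": (270, 2290)}


def categorize(formants, buckets):
    """Return the label of the bucket whose prototype is closest to the sound."""
    return min(buckets, key=lambda label: math.dist(formants, buckets[label]))


# Two heard sounds: one [i]-ish (as in "seat"), one [ɪ]-ish (as in "sit").
heard = {"seat": (280, 2250), "sit": (400, 2000)}

for word, formants in heard.items():
    english = categorize(formants, ENGLISH_BUCKETS)
    armand = categorize(formants, ARMAND_BUCKETS)
    print(f"{word}: English listener hears {english}, Armand hears {armand}")
```

With two buckets, the two sounds land in different categories. With only one bucket, both land in /i/, which is exactly Armand's seat/sit problem: the difference is in the signal, but there is nowhere to sort it.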
So what’s the issue for a native speaker of English, when speaking English? Well, nothing, as long as the goal is understanding what other people are saying (an important ability, to be sure!). Categorical perception is what allows us to do this pretty effortlessly, no matter what accent someone is speaking in. This sorting ability is essential, in fact, for human communication. Without it, we probably wouldn’t even have language! But it also means that we have spent decades training ourselves to ignore differences. We can hear [ə̯i], [ɪ̯i], [ɛ̯̽i] or any number of other variations and sort them all into our /i/ bucket. Which means that when it comes down to really hearing [ə̯i], [ɪ̯i], [ɛ̯̽i] (let’s say we want to figure out how to produce them precisely, so we can speak convincingly with a particular accent), we’re stuck. Our primary equipment for this task, our ears (by which I really mean our brain’s auditory processing capabilities), has been trained specifically and at length to do the exact opposite task. The implication for accent acquisition, or any alteration of speech patterns, is clear. If all we do is drill a sound over and over again, without ever addressing the fundamental perceptual issues, we’re unlikely to meet with much success. And even if we do succeed in the end, surely there’s an easier way?
There is, of course. Come at categorical perception head-on. The first part of a Knight-Thompson Speechwork class, in essence, is one way of doing just that. If speech work begins with extensive physical exploration of the vocal tract, some precise anatomy, and an emphasis on feeling and sensing the physical actions of speech, then perception opens up. Narrow descriptive phonetics can then undergird and deepen the newly developing perceptual and productive skills. This is one reason why it is so much better to teach the entire IPA—vowels, consonants, and even non-pulmonics—than a limited set of symbols corresponding to certain prescriptive targets. As I and others have pointed out, even if your goal is to teach actors a specific ‘standard’ speech pattern, this approach will be more effective than endless drilling, which may work for some but will inevitably fall short for others.
I think this post is long enough, at this point, that I should leave the next chapter in the story for another post. So, if you’re still with me, and if your curiosity is piqued, stay tuned for the next installment:
Why the Detail Model Had to Go
 For a lengthier (and far more entertaining) treatment of Dudley’s research into these questions, you can do no better than to read his own accounts: “Standard Speech: The Ongoing Debate,” originally published in The Vocal Vision in 1997, and “Standards,” published in the Voice and Speech Review in 2000. Both articles may be found on our articles page.
 There’s a third I would add myself, one I consider to be of equal importance (more on that another time).
 (Seriously, just go read the articles already.)
 Slant brackets //, of course, indicate phonemes (those mental buckets), and square brackets [] indicate phones (the speech sounds resulting from specific physical actions).
 For a much more thorough and deeply researched exploration of categorical perception and why deep and thorough phonetics training is such a good idea in actor speech-training, see Phil Thompson’s article “Phonetics and Perception: the Deep Case for Phonetics Training.”
 In a three-year training program, the entire first year might be spent on exploration, anatomy, freeing up and isolating articulator action, Omnish, and descriptive phonetics. In a one-year speech class (not including accent work, in other words), this work might still take up half to two-thirds of the year.
photo credit: h.koppdelaney via photopin cc