Absolute Pitch research, ear training and more
Hello! I'm Chris Aruffo, your host here at Acoustic Learning. It's my goal to dispel the myth and the mystery that surrounds the phenomenon of perfect pitch (also known as absolute pitch), and to make it a skill accessible to anyone who wants to learn it.
Perfect pitch requires categorical perception. I've been convinced of this for a while now. For a quick-n-dirty example of what this means, open your mouth and start saying the vowel ahhhhh. Now keep saying the vowel, but gradually close your mouth. You will hear two simultaneous but separate phenomena: you will, of course, hear the sound constantly changing as your mouth closes-- but you will also perceive that the vowel continues to be ah until suddenly it snaps into another vowel. You can open and close your mouth repeatedly (ah-oo-ah-oo) and, even though you can clearly discern that the sound is gradually sliding along a continuum, you will only ever perceive a few distinct vowels. Now compare this to your perception of musical sound. If you hear a sound sliding from one frequency to another, you can hear the slide-- but that perception of distinct categories, with that sudden snap from one to the other, is absent. Perfect pitch is not the ability to name notes. Not really. Perfect pitch is when your mind stops perceiving the pitch scale as an infinite continuum and starts perceiving it in seven discrete chunks.
It seems fair to say that there are seven pitches, rather than twelve. An octave is divided into twelve pitches, but in Miyazaki's experiments on speed of pitch perception, the "white keys" (major scale tones) were identified more quickly than any of the "black keys" (flats and sharps). This suggests to me that the magical number seven, plus or minus two, applies to pitches (CDEFGAB) just as it does to colors (ROYGBIV). Listeners are able to hold the seven primary pitch categories in memory for immediate identification and recall; and, when presented with a pitch that is similar to one of those seven, an extra moment's thought will bring the adjacent category to mind. This appears to be similar to how we identify unusual colors-- when presented with an odd color, we immediately recognize it according to its primary category and then start making further judgments. Sky blue is just "blue" until you pay that extra moment of attention to give it the specific name. So while it will be ideal to learn all twelve pitch categories, the major seven are the critical ones.
It seems fair to say that there are seven pitches, rather than the 100-plus that are within the range of human hearing. If you recognize that pitch class and octave are two separate qualities of a musical tone-- a distinction made as early as 1913-- and combine that with the documented observation that absolute listeners are prone to making octave errors, then it seems obvious that listeners who are able to identify "100-plus" pitches are really making two separate categorical judgments: one judgment based on the seven primary pitch classes, and a second judgment based on the seven primary octaves used in musical performance. Even then, the predominance of octave errors suggests that absolute listeners don't have categorical perception of musical octaves, but make their best guess from the apparent "thinness" of the tone. So tone height is a quality that absolute listeners are aware of, but they aren't aware of it as height.
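To make the "two separate judgments" idea concrete, here is a rough sketch of how a single frequency can be split into a pitch class and an octave. It assumes equal temperament, A4 = 440 Hz, and the MIDI note-number convention, none of which comes from the discussion above, and the function name is made up for illustration. It names all twelve chromatic classes, of which the seven letter names are the primary ones.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def pitch_class_and_octave(freq_hz, a4_hz=440.0):
    """Split a frequency into (pitch class name, octave number, cents off)."""
    # Fractional distance from A4 in equal-tempered semitones.
    semitones_from_a4 = 12 * math.log2(freq_hz / a4_hz)
    midi = 69 + semitones_from_a4            # MIDI convention: A4 = 69
    nearest = round(midi)
    cents_off = (midi - nearest) * 100       # how far out of tune the tone is
    pitch_class = NOTE_NAMES[nearest % 12]   # first judgment: which class
    octave = nearest // 12 - 1               # second judgment: which octave
    return pitch_class, octave, cents_off
```

Under these assumptions an octave error is a mistake in the second value only: the pitch class comes out right while the octave number is off.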
Learning perfect pitch, therefore, should be both learning to make categorical judgments among seven pitch classes, and learning to recognize tone height as a separate quality that isn't height. The second part's the stumper. As far as most of us are concerned, pitch is the height of a tone, and somehow we need to pull this apart. For the first part, though, we may actually have a road map of how to do it.
The problem is to retrain ourselves to hear a single dimension, vibratory sound frequency, as discrete categories instead of as an infinite continuum of values. When I was looking at experiments that were testing how categories were formed in adult brains, most of the experiments I found trained observers to form categories that were based on multiple dimensions, or at least categories which changed more than a single value between them. Experiments such as these demonstrated that adults are capable of forming new perceptual categories, and could do so based on near-unconscious observation of minute differences, but didn't persuade me that adult brains are capable of learning categories when there is only one dimension to be learned-- especially when we are already well familiar with that dimension as a non-categorical perception. But then I read this 1994 paper from Robert Goldstone in which observers were trained to categorize squares according to their brightness, and in doing so gained heightened perceptual sensitivity to each brightness category. The observers were only trained for about an hour, so they didn't gain actual categorical perception for these levels of brightness, but I see the takeaway message as this: after only an hour of training, the categories had started to form. The training was nothing more than observers seeing different members of each brightness category and actively labelling each one with its category name, after which they were told whether or not they were right. If only an hour of this training were enough to begin the formation of categorical perception, then you might think that a continuation of this training would be enough to complete the formation of categorical perception.
Yet merely naming pitches and getting feedback isn't enough to learn perfect pitch, no matter how long you train. We've known that since 1899. Anyone who practices naming notes will get better at it, but there's no evidence that this "skill" is anything like perfect pitch, in that it would actually begin to develop categorical perception for each of the tones. I suspect that there's a subtle and critical difference between Goldstone's procedure and what's been tried for more than a century: Goldstone's procedures did not present just one example for each category. His procedure presented multiple different examples within every category. If the same approach could work for turning pitch into a categorical perception, then the procedure will require training not with one precisely-well-tuned sample of each pitch, but with many different poorly-tuned samples of each pitch. Categorical perception might therefore be induced, not by hearing the perfectly-tuned ideal exemplar of each pitch and trying to remember it, but by actively sorting various bad examples into their appropriate categories. This, theoretically, could encourage our brains to start defining certain ranges of sound as belonging to this pitch category or that pitch category, thereby naturally developing a tendency to categorize every pitch we hear.
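A minimal sketch of what such a sorting exercise might look like: several mistuned exemplars per category, each presented for labelling and followed by right-or-wrong feedback. The offsets in cents, the single-octave range, the A4 = 440 Hz tuning, and every function name here are my own illustrative assumptions, not Goldstone's procedure or anything already built.

```python
import random

# Assumed category centres: equal-tempered white keys C4..B4, A4 = 440 Hz.
SEMITONES_FROM_C4 = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}
C4_HZ = 440.0 * 2 ** (-9 / 12)

def detuned_frequency(label, cents):
    """Frequency of `label` shifted by `cents` (100 cents = 1 semitone)."""
    return C4_HZ * 2 ** ((SEMITONES_FROM_C4[label] * 100 + cents) / 1200)

def build_trials(offsets_cents=(-30, -15, 0, 15, 30), seed=None):
    """One trial per (category, offset) pair, in shuffled order."""
    trials = [(label, detuned_frequency(label, cents))
              for label in SEMITONES_FROM_C4
              for cents in offsets_cents]
    random.Random(seed).shuffle(trials)
    return trials

def give_feedback(trial, response):
    """Return (was_correct, correct_label) for a listener's response."""
    correct_label, _freq = trial
    return response == correct_label, correct_label
```

With the default offsets this yields 35 trials, five different "bad examples" of each of the seven categories, instead of one ideal exemplar per pitch.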
If sorting out-of-tune tones into their proper pitch categories could start to develop perfect pitch, the obvious question is which tones should be sorted. For this we turn to recent developments in phonetic science, and the idea of a bimodal distribution model. Traditional phonetics asserted that we learned different letter sounds by minimal pairs-- by hearing them change from one word to another. That is, if we learned what a bat was, and then discovered that a pat was not the same thing, then this knowledge would allow us to compare pat and bat and learn the difference between p and b. This is essentially the current mechanism of Absolute Pitch Avenue-- and although it does sharpen one's listening skills, it does not induce categorical perception. The bimodal distribution model presents an alternative that is closer to Rob Goldstone's procedure. The idea here is that, when we hear letter sounds in the real world, these sounds are produced by hundreds or thousands of different people, and so none of them are going to be exactly the same-- but they are going to cluster around the categorical ideal in a normal distribution. So because we most regularly hear an oo sound near its categorical ideal, our minds infer an average of all the oo sounds we hear and make that the categorical center. As the oo sound moves away from the center, we hear it less and less regularly... and then it starts to sound like other vowels that we know are supposed to be oh. So the decreasing regularity clues us in to the change toward the oh vowel in the one direction, while the exact same process is working toward the oo in the other direction, causing us to form a categorical boundary between oo and oh. The bimodal distribution model suggests to me that presenting out-of-tune tones in a normal distribution around adjacent pitch categories could be a way to induce categorical perception of those pitch classes.
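Here is a small sketch of what "a normal distribution around adjacent pitch categories" could mean in practice. Tone positions are drawn in cents from a Gaussian centred on one of two neighbouring categories, so the pooled samples form two humps with a sparse valley between them. The centres (0 and 100 cents, one semitone apart), the 20-cent spread, and the function names are all assumptions picked for illustration.

```python
import random
from collections import Counter

def sample_offsets(n, centers_cents=(0, 100), sd_cents=20, seed=None):
    """Draw n tone positions (in cents), each clustered around a category centre."""
    rng = random.Random(seed)
    return [rng.gauss(rng.choice(centers_cents), sd_cents) for _ in range(n)]

def histogram(offsets, bin_width=10):
    """Count samples per bin, keyed by each bin's lower edge in cents."""
    return Counter(int(x // bin_width) * bin_width for x in offsets)

if __name__ == "__main__":
    counts = histogram(sample_offsets(2000, seed=0))
    for edge in sorted(counts):
        print(f"{edge:5d} cents  {'#' * (counts[edge] // 10)}")
```

Printed as a histogram, tones near each centre come up often and tones near the midpoint come up rarely, which is the "decreasing regularity" that is supposed to mark the boundary.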
It doesn't seem likely that a child who learns perfect pitch will have done so by hearing thousands of out-of-tune pitches that they sort into categories. It seems more likely that they would be exposed to thousands of well-tuned pitches in different contexts. However, there is this still-magical phenomenon known as the "critical window" in which, from the ages of 3 to 6, children learn about language and sound differently than adults. I'm willing to chalk up the difference in learning to the critical window-- children may not need multimodal distributions to learn perfect pitch because they have no preconceptions about what pitch frequencies are all about, but adults need to train with a multimodal distribution to overcome their lifelong biases to hear pitch non-categorically.
This training process is what I'm going to realize in the next version of the Ear Training Companion: sorting distributions of tones into categories. I may have to start rebuilding ETC from scratch, if only because Realbasic is incapable of generating microtones on non-Mac computers. But even as I do, I'm going to have to seriously consider how to overcome the second part of the training: getting listeners to perceive pitch height as separate from pitch class, and to recognize it as something other than height.
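For what it's worth, generating a microtone is not much code in a language with basic audio file support. The following is only a sketch in Python (not Realbasic, and not how ETC currently works): it shifts a base frequency by a number of cents and renders a plain sine tone as WAV data using the standard library. The function names, the sine timbre, and the default sample rate are my own assumptions.

```python
import io
import math
import struct
import wave

def microtone_hz(base_hz, cents):
    """Frequency `cents` above (or below, if negative) `base_hz`."""
    return base_hz * 2 ** (cents / 1200)

def sine_wav_bytes(freq_hz, seconds=0.5, sample_rate=44100, amplitude=0.5):
    """Render a mono 16-bit sine tone and return it as WAV file bytes."""
    n_frames = int(seconds * sample_rate)
    frames = b"".join(
        struct.pack("<h", int(amplitude * 32767 *
                              math.sin(2 * math.pi * freq_hz * i / sample_rate)))
        for i in range(n_frames)
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)          # mono
        wav.setsampwidth(2)          # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(frames)
    return buf.getvalue()
```

For example, `sine_wav_bytes(microtone_hz(440.0, 25))` gives an A that is a quarter of a semitone sharp; writing the returned bytes to a `.wav` file makes it playable.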
I think I understand what separable pitch height is, by now, and I know I've written about it before: it's the overtone series of a given pitch. In fact, I know I've written extensively about this before, expounding about partial octaves and incremental pitch height achievable by manipulating overtones. I am sure that non-absolute listeners perceive this kind of pitch height. I recall an experiment in which listeners judged the pitch of two tuning forks; each played the same tone, in the same octave, but one was made of a thinner metal. The thinner-sounding tone was judged to be in a higher octave than the other, regardless of listeners' musical experience. I also have successfully gotten non-musicians to sing perfect octaves by asking them to sing "the same tone, but lighter (or thinner)." Learning to perceive this kind of pitch height isn't a training goal, especially if absolute listeners don't perceive it categorically anyway. No, the problem is to persuade our minds that pitch frequencies, by themselves, are not fundamentally "higher" or "lower" than each other.
The consequence of interpreting frequency as pitch height is easy to imagine. Let's say that a person trained with multimodal distribution and (let's be optimistic here) successfully formed firm categories for each of the 12 pitches between middle C and high B. Great. Then the same person is presented with a high C. Why would this person, who has unconsciously believed all their life that pitches differ in height, place this in the same category as middle C, which is 12 whole categories distant from it? Why would they not become utterly bewildered by this out-of-bounds sound or, perhaps, place it in the same category as its neighboring B? I present these as rhetorical questions, but they must be answered. In the We Hear and Play process, new octaves are introduced by telling a child that the tones are the same pitch class, just "brighter" or "darker" in color, and a child's mind is okay with that. Adult minds won't buy that so easily. Absolute Pitch Avenue successfully teaches an adult mind to hear chroma, and the game works toward perceiving octave equivalence, but I don't know if that will be enough. I reject the idea of using unnatural tones, like Shepard tones that have no discernable height (or, looked at another way, simultaneously have every height), because I have no evidence to suggest that learning with unnatural tones would transfer to natural tones. There must be some kind of process by which a normal mind can accept octave equivalence as the greater truth. I just need to find it.
But, while I'm figuring that out, I do have a solution in hand for training categorical perception. Which means, in turn, I've got a new game to design.
[And yes, updates to ETC will continue to be free!]