Acoustic Learning, Inc.
Absolute Pitch research, ear training and more


Phase 2:  Music and Language


August 29 - Right back where we started from

Can perfect pitch be learned?

Colors are learned.  It's easy to see that when we are very young, a tremendous effort is devoted to teaching children their colors.  Perhaps if a similar effort were applied, children would "learn their pitches".  But if that were true, then I had to consider the reverse case:  what if children were not taught their colors?  Would we not be able to see colors?  Or would we just not be able to name them?  I immediately thought of an example that my mother, whose PhD is in linguistics, once described to me.

It's a well-known fact that Eskimos have dozens of words for "snow".  Most languages have one concept of "snow", and we relate all our experiences of "snow" to that single concept.  Well, some languages have only two words for color: red and black.  I'd known that for years, and always assumed it was just because those were the most obvious extremes of color.  But now it struck me-- what if they only name two colors because they don't actually see colors?  What if their words are not "red" and "black" as we understand them, but "brightest" and "darkest"?  Everything in between can be described as relative to these two extremes.

At least one researcher thinks that all of us start out with perfect pitch, but we either "use it or lose it".  I wondered: could seeing color be unimportant in some other cultures?  Perhaps they don't use it, so they lose it?  I imagined how, in an arid African plain or a vast expanse of icy tundra, there wouldn't be much practical need for color discrimination, since their environment would have very little color variation.  Could it be that the reason we know colors, and not pitches, is simply because we need to see and use colors every day?

And then I read about experiments which seem to show that the speakers of tonal languages do all have perfect pitch! That's the answer, I thought. These people have a need to learn perfect pitch, and so they do. Perfect pitch can be learned. And I began studying in earnest.

Gradually, though, my confidence started to fall apart.

First I learned that the tonal meanings of words in those languages are determined by relative change, not absolute pitch. I heard from a woman with perfect pitch who said that some of her co-workers were native speakers of tonal languages, and they do not describe their experience of pitch perception the same way that she does.

Then my own efforts at learning suddenly stalled. I had been delighted to discover that I could hear the "character" of pitches, but once I started listening to intervals my comprehension came totally unglued.  When notes I could name individually were paired into major and minor thirds I could barely name even one.  I could hear the individual notes within the pairs, and I could even repeat them by singing, but somehow they just all sounded the same to me.  They all sounded like "top" or "bottom" notes.

I finally started studying the biology of the ear.  I was relieved to discover that the ear is actually quite precise, just like the eye; the receptor cells in the inner ear break apart a complex sound into its component frequencies.  Those receptor cells are even hardwired to stimulate corresponding areas of the brain. It's a built-in spectrum analyzer.  So I was even more puzzled-- if the ear could detect frequencies so specifically, why was it so difficult to hear pitches within intervals?
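
To see just how cleanly frequency information separates, here is a minimal sketch (in Python, assuming numpy is installed; the note frequencies are approximate) that runs a two-note chord through a Fourier transform-- a crude stand-in for what the inner ear does mechanically:

    import numpy as np

    RATE = 8000                                    # samples per second
    t = np.arange(RATE) / RATE                     # one second of time points
    chord = np.sin(2 * np.pi * 262 * t) + np.sin(2 * np.pi * 330 * t)   # roughly C4 + E4, a major third

    spectrum = np.abs(np.fft.rfft(chord))
    freqs = np.fft.rfftfreq(len(chord), 1 / RATE)
    print(freqs[spectrum > 0.5 * spectrum.max()])  # prints [262. 330.]: both components, cleanly separated

The components fall right out of the analysis, so whatever difficulty I was having, it clearly wasn't in the raw information.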

Then I had a surprising conversation with the same woman (who has perfect pitch). She said that notes did not sound different played ascending or descending.  She never imagined they could.  I, of course, never dreamed they wouldn't.  I went to the piano and played combinations of three notes over and over again.  I tried very hard to hear the middle pitch as the same regardless of what came before-- but it persisted in sounding higher or lower.  I eventually concluded that I must be hearing the first note, and then listening for what changes in the next note, instead of listening for the entire pitch.  Since both notes overlap in their stimulation of my receptor cells, I figured I must be associating certain receptor cells with different notes depending on which came first.

If my response to receptor cells was unclear, then I figured I must not be listening to the characteristics of the specific pitches-- but then what am I listening to?  I built a spreadsheet to show me what the composite waveforms looked like, to see if it gave me a clue.  I was amazed to discover that all intervals of a given type do look alike. The waveform of any minor third looks just like every other minor third; the waveform of any perfect fifth looks just like every other perfect fifth.  I realized that I must be listening to the pattern of the wave, instead of the individual pitch!  I began to think more about the fact that someone with perfect pitch doesn't hear notes differently when they ascend or descend.
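
You don't need a spreadsheet to reproduce the comparison; a rough sketch in Python (assuming numpy, and using a just-intonation 6:5 minor third so that the pattern repeats exactly) shows that the composite waveform of a minor third has the same shape at any base frequency-- only the time scale changes:

    import numpy as np

    def minor_third_shape(base_freq, points=1000):
        # For a 6:5 minor third, the combined waveform repeats every 5 cycles
        # of the lower note; sample exactly one repetition of that pattern.
        period = 5 / base_freq
        t = np.linspace(0, period, points, endpoint=False)
        return np.sin(2 * np.pi * base_freq * t) + np.sin(2 * np.pi * base_freq * (6 / 5) * t)

    low = minor_third_shape(220.0)    # minor third starting on A3
    high = minor_third_shape(330.0)   # minor third starting on E4
    print(np.allclose(low, high))     # True: identical shape, different speed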

I had to conclude that I must be listening to patterns. Even when I hear a single frequency, I must still be collecting the information about its "critical bandwidth"-- the response of multiple receptor cells to a single frequency-- and processing that entire experience into a pattern. It's just another form of relative pitch, except that the "interval" I'm hearing isn't the effect of two notes merging together; it's the effect of multiple receptor cells merging into a single frequency.

It appears that the person with perfect pitch doesn't wait for a pattern to form, but reflexively registers their ears' receptors being stimulated. They know which receptor cells have been stimulated.  Perhaps ordinary people process sound intellectually, and don't hear the pitch in anything once it forms a meaningful pattern. But someone with perfect pitch could process sound instinctively, hearing a sound sensually before it can become anything meaningful.

Instinctively. Instinct. As in "born with". As in "not learned".

So I'm left asking the question, again-- can perfect pitch be learned?

August 31 - A black cat in a dark room

I now have a number of books heading in my direction.  I ordered a book to teach myself about basic harmonic theory; I ordered Diana Deutsch's book on music cognition; I ordered Music and the Brain to complement Music, the Brain, and Ecstasy.  In addition to this, I remembered today that there are two research papers which supposedly validate that absolute pitch can be learned, and I discovered that I could request them through interlibrary loan.  So I've done that.  I expect those research papers to show me two things.  First, I expect that their descriptions of the goal and the result will be naming notes; second, I hope to see exactly why and how they validate the method they're testing.

I don't know why researchers seem to be dedicated to the notion that perfect pitch is the ability to name notes.  Daniel Levitin seems to be the person with whom my investigations most strongly agree; yet he concludes that perfect pitch is linguistic coding, not primal reflex.  According to his on-line publications, it seems that "perfect pitch information" is available at multiple points along a sound's path in the brain; he asks why we don't all pay attention to that, and his answer is that perfect pitch people must code it linguistically.

I began searching the web last week for the new idea that perfect pitch was a reflex instead of an intellectual process.  I came across a "classic" psychological article from 1896 which fascinated me because it said, essentially, that even an instinctive reflex is not genetically inherited, but is part of a subconscious process of expectations.  The article explicitly described expectations of hearing:

If one is reading a book, if one is hunting, if one is watching in a dark place on a lonely night, if one is performing a chemical experiment, in each case, the noise has a very different psychical value; it is a different experience.

So you hear what you expect to hear, even at the most basic level.  Change the expectation, and you will change the reflexive response.  And when I followed this up by searching for "reflex modification" on Google I discovered that most research on this subject has been devoted to eyeblinks and the "startle reflex".  You can imagine how you'd modify your own startle reflex, can't you?  Imagine a sudden BANG going off near you.  You'd jump out of your skin.  But if someone counted down and told you that the BANG would happen at zero, you would have a far less extreme reflexive reaction.

So yes, you can use expectations to alter your reflexive responses.  And you can definitely train yourself to have new physical reflexes, so why not train yourself to have new mental reflexes?  Perfect pitch can be learned; we're back on track.  We just have to train ourselves to have new expectations.  Except...

What are we expecting to hear?

September 1 - Expect the unexpected

If perfect pitch is a reflex, and reflexes are controlled by expectations, then it's possible to conclude that a person with perfect pitch expects to hear something different from someone who does not.  This line of thinking is supported by my sine-wave constructions, and I was pleased to see on Daniel Levitin's page typical quotes from people who have perfect pitch:  "I don't hear the melody.  I hear pitch names passing by."  They're listening to something different.  They hear something different.  But what?  What is it that they expect to hear which the rest of us don't?  If it's reflexive, they must not know themselves on an intellectual level what it is they're listening for-- which is why they don't simply tell us.

I initially thought that perhaps they're taught from an early age to expect all sound to be music.  Perhaps when they're extremely young, a child in a musical family will be exposed to music so frequently that they would reflexively come to expect all sounds to be pitches.  But I quickly realized that this would not explain why children in totally non-musical families, with no musical instruction whatsoever, also gained perfect pitch.  I shelved that idea and began looking for specific information about how reflexes could be modified; I thought perhaps the methods used for modifying reflexes could be applied to perfect pitch.  But the most specific thing I found was a very expensive book called Startle Modification (which I didn't order).

While I was stalled, I decided to stir the hornet's nest of a Yahoo discussion group, just to provoke response and present some ideas to be challenged.  It is strange how emotionally invested people are in their own opinions of what perfect pitch actually is; I'll have to be careful to keep from becoming biased to the point of not hearing new ideas.  In any case, I was spectacularly failing to explain myself-- I suspected it was because I was writing as though a reader were familiar with the lines of thought I've been following here on my website-- when an unexpected voice chimed in.  This person claimed to have learned perfect pitch after doing years of exercises; he said it was listening in a totally different way; he said that when he shifted to "perfect pitch mode" the music disappeared and he just heard pitches.

I was floored.  This was the first person I'd "met" who claimed to have learned perfect pitch, and he instantly validated three conclusions I had drawn:  that perfect pitch can be learned; that it means listening to something totally different; and that people with perfect pitch do not necessarily hear combined musical sounds.  I asked him to tell us what exercises he had done, and what he was listening for that helped him hear this different thing, and I crossed my fingers to hope he'd answer.

His answer could not have pleased me more.  He said that in order to hear differently, he ascribes a vowel sound to each of the different pitches.  "When I hear the notes, I hear vowels," he said, and proceeded to list which pitches corresponded to which vowels.  Just listen to the vowel sounds, he said, and you will recognize the pitches.

As I sat staring at his discussion post, my eyes got wider and my jaw dropped further as the full significance of what he had written began to sink in.  He was listening to vowel sounds.  This meant he was not listening to music.  Of course people who have perfect pitch are not conditioned to think that all sounds are music; "music" is, inherently, relative listening.  They're conditioned (or preconditioned) instead to think that all sounds are language.  That's why the region of the brain that analyzes speech is enlarged in people with perfect pitch-- as far as they're concerned, everything that they hear, from the moment they're born, is speech.

September 2 - Throwing in the vowel

While I was driving to the theater on Friday night, I was thinking about the idea of pitches as vowel sounds.

I had been intrigued to learn (weeks ago) that we hear different vowel sounds because of "formant patterns".  The mouth and tongue form certain overtones on top of the basic pitch of our speaking voice, and the vowel sound is determined entirely by which overtones are stronger.  That is, the only difference between "O" and "U" is the difference in their overtone pitches.  If the only difference between vowels is the pitch we hear, I considered, then this agrees nicely with the idea that pitches each have their own distinct vowel sound.
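
A rough way to hear this for yourself: the sketch below (Python, assuming numpy; the formant figures are standard textbook approximations, not measurements of anyone in particular) builds two "vowels" on the same 110 Hz fundamental simply by weighting the overtones differently, and writes them to WAV files:

    import wave
    import numpy as np

    RATE = 44100

    def vowel_wave(f0, formants, seconds=1.0):
        # Sum the harmonics of f0, weighting each by how close it lies to a
        # formant peak (a very crude resonance model of the vocal tract).
        t = np.arange(int(RATE * seconds)) / RATE
        signal = np.zeros_like(t)
        for n in range(1, int(4000 / f0) + 1):
            freq = n * f0
            amp = sum(np.exp(-((freq - fc) ** 2) / (2 * bw ** 2)) for fc, bw in formants)
            signal += amp * np.sin(2 * np.pi * freq * t)
        return signal / np.max(np.abs(signal))

    def write_wav(name, signal):
        with wave.open(name, "w") as w:
            w.setnchannels(1); w.setsampwidth(2); w.setframerate(RATE)
            w.writeframes((signal * 32767).astype(np.int16).tobytes())

    AH = [(730, 90), (1090, 110)]   # rough formant centers and bandwidths for "ah"
    EE = [(270, 60), (2290, 200)]   # rough formant centers and bandwidths for "ee"

    write_wav("ah_110Hz.wav", vowel_wave(110, AH))   # same fundamental pitch...
    write_wav("ee_110Hz.wav", vowel_wave(110, EE))   # ...different overtone weighting, different vowel

Nothing changes between the two files except which overtones are emphasized.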

But that didn't seem to agree with the idea of reflexively expecting all sound to be language.  The English language derives all its meaning from consonants, not vowels.  I've seen plenty of examples which make this point-- they'll take some random sentence like "The girl chased the ball around the hilltop" and remove the vowels to show you that it still makes sense:

Th grl chsd th bll rnd th hlltp.

But then, to underscore the point, the example usually continues by presenting the same sentence with only the vowels, to show you that it makes no sense:

e i ae e a aou e io.

As I drove along, I idly spoke this meaningless sentence out loud, and suddenly realized-- I was singing!

Suddenly it made sense that someone with AP expects to hear "language", as my thoughts began to tumble over each other.  I'd been told by more than one voice teacher that, in singing, consonants carry the meaning, and vowels carry the emotion.  Emotional reactions are, theoretically, based on primal fight-or-flight principles.  Humans are the only creatures who can form consonants; animals can only make vowel sounds.  I'd read in one of Daniel Levitin's publications that wolves can identify each other by the absolute pitch of their howls.  So maybe someone with perfect pitch expects language-- a primal level of language!

I called my mother (the PhD linguist) to talk about this.  She understood what I was saying, and pointed out that our vocal cords can only make vowel sounds; it's when the front of the mouth interferes with the air flow that the vowels become consonants.  Vocal cords are designed, on the most basic level, to produce vowels.  And if vowels are distributed according to pitch frequency, then our frequency-specific ears are designed, on the most basic level, to hear vowels.

But then why is it that we don't pick up perfect pitch just by listening to speech?  There are vowels in everything we say!

The Japanese have a vastly higher occurrence of AP than Americans do.  Each time I've seen this fact mentioned, its writer has speculated that this is because the Japanese are typically taught by the Suzuki method, which emphasizes listening.  But there is a second critical factor.  I had read a letter from a European choir director:  "As a European," he said, "it is easy for me to pronounce vowels purely while some Americans naturally tend to turn the vowels around and make them joined diphthongs."  Of course!  An American is trained to comprehend and to speak a broad variety of sounds as vowels, with most of them slurred into diphthongs.  By contrast, the Japanese language is very precise about vowel pronunciation:

The vowel sounds in the Japanese kana syllabary are full and uninflected. Constants of spoken language, these sounds are universally recognized and the most natural to pronounce...  Listen for the core vowels in the words given and drop any changes or inflections; you should be able to extend each of these sounds indefinitely with no change in quality.

My mother confirmed my suspicion that the English language is rife with diphthongs and the Japanese language has almost none.  The Japanese language is more than five thousand years old; as this link shows, the "phonetic principles behind human language [are] implied in the kana syllabary."  It's not merely the Suzuki method which encourages the Japanese child to hear absolute pitches-- it's the language itself.

The American language, by contrast, discourages us from hearing absolutely.

In the first place, our categories are ridiculously unspecific.  Not only do we categorize seventeen different types of sound as the vowel "O" within ordinary speech, but even those sounds vary regionally.  We can hear the simple word "Hi" pronounced by a New Yorker, a Texan, a Southern Californian, or a speaker of any of a dozen other local dialects-- and although they sound totally different, we still identify the sound as an "i".

In the second place, diphthongs force us to listen in "relative pitch" even when we are listening to language.  That's partly why our categories are generic-- in order to recognize the word "Hi" in these three dialects, we have to be able to understand that the combined vowel sounds i-e, ah-i, and oh-i-e are all "i".  We listen to the relationship between the vowels, and we interpret the combined sound to be a single vowel.  In relative pitch we listen to the relationship between pitches, and we interpret the combined sound to be a single chord.  And perfect pitch is responding to pure sounds, not interpreting combined sounds.

Our language has reinterpreted "vowels"; they're not what they were originally meant to be.  Hearing pitches as vowels is, in fact, an ancient practice.  This is, then, what I mean when I say that someone with perfect pitch reflexively expects to hear language:  perfect pitch is a primal sense of hearing, responding to a primal comprehension of language.  Human "language" is different from what apes and wolves have to say to each other.

September 4 - Pitch a la mode

I've got two separate tracks to follow now, both of which seem highly promising.  I want to find out if there is a tangible correlation between pitch and vowel sounds.  I also discovered that there is a substantial body of research available about the Stroop Effect; although this effect is typically linked to color perception, the research is essentially about automatic cognitive processing, and about linking language to sensation.  It seems probable that studies of the Stroop Effect will illuminate this topic quite well.  Right now I've got to take a moment aside, though, to relay parts of a conversation I had last night with a friend of mine who is a singer.

He reacted incredulously when I told him that perfect pitch was a reflex that could be learned.  "Learning a reflex?"  he laughed.  "That sounds like learning to sweat on command."  As we talked about it further I discovered that I'm being imprecise in my explanation of the learning goal.  The task of perfect pitch (that of registering the sensation of a frequency that stimulates the ear) could be an involuntary reflex, yes, and that's not something we can control-- but that is something every one of us already does.  We don't need to "turn on" our reflex; we need to pay attention to it.  The Stroop effect studies seem to make clear that automatic mental processes can be modified with continued practice, and I look forward to learning more about that.

But, more importantly: when I suggested that true perfect pitch is a mode of hearing that can't hear musical relationships directly, he asked, what good can perfect pitch possibly do if you can't hear the music?

Of course, people with perfect pitch can be very good musicians.  They may not hear the relationships as one sound, though.  I've quoted one fellow with AP who said that calling C-E-G "a chord" is the same to him as calling four quarters "one dollar"; this suggests that he knows it's equivalent to a dollar bill, because he's been told so, but he's only ever seen the coins.  Yet the man who said this is a respected professional musician.  He may not hear music in the same way, but he definitely hears music and hears it well.  Even though people with perfect pitch don't directly hear the relationships between notes, they do hear the relationships in some meaningful way-- something more than just knowing how far apart the notes are.

This seems to lead to an obvious conclusion:  if it is possible to learn pitch recognition in "relative mode", then it makes sense to consider that it's possible to learn interval recognition in "absolute mode".  And it is indeed possible to learn pitch recognition in relative mode.  There's a guy in the Developing AP Yahoo discussion group who says that he has reached the point where he can recognize or recall any note just as quickly as he can recognize or produce any interval.

The fellow I've corresponded with who taught himself to hear in perfect pitch mode says that when he listens in that mode he can't hear music at all, only pitches.  This is just like the three-to-six-year-old students who learn Suzuki violin.  The instructor I met with said that before each lesson, her students listen to a CD of the music they are supposed to play.  When they are performing their lesson, they know when they've played a wrong note-- but they don't know what to do to correct it.  They'll just stand there, confused and frustrated.  They know that they've played the wrong note; they remember what the right note is; but they don't know whether to move their finger "up" or "down" to reach the note they want.  Gradually, of course, they do learn relationships.  But initially they hear in a way which does not understand relationships between notes.

I see two possible parallel pathways-- absolute and relative listening.  With relative listening, you begin with combined sounds and gradually learn to break them apart.  With absolute listening, you begin with individual sounds and gradually learn to put them together.  Both paths seem to end up in the same place, making complex music with a sophisticated ear-- but they took different routes to get there, and although the paths intersect they don't actually merge.

This makes me think that total ear training might have to be done twice, once in each mode.  The best thing would probably be to learn relative pitch first.  Since that is what we are familiar with hearing, it should be comparatively easy to do.  Then we can learn individual pitches, listening in relative mode, recognizing the qualities of each type of note.  Then, accustomed to listening to individual pitches, we can learn to listen to pitches in absolute mode-- and, once we know how to listen in absolute mode, re-train ourselves in musical comprehension while listening that way.

This could take quite a long time.

September 6 - Pudding in the mix

Because I'm not a lab scientist, I can build on ideas that seem plausible instead of having to prove each idea every step of the way.  Of course, my conclusions are therefore ideas rather than facts and, without corroboration, remain my own fanciful inventions.  So it's very reassuring when things pop up which verify what I'm saying-- sometimes anecdotally, but other times in official scientific research.

I thought that perhaps everyone's brain received the same absolute pitch information, but processed it differently; Levitin confirms that "there are frequency-selective neurons at every stage of the auditory system," and wonders why we don't all have perfect pitch.

I suspected that people with perfect pitch don't hear the sound of an interval directly; I came across a clever experiment which seems to say the same thing.

I wondered if there was a total distinction between absolute listening and relative listening.  First I found a person who said that he can switch between two wholly disparate modes of listening.  Then I discovered an article which asserts that "normal brains are musical brains", and a research study which shows that ordinary people are trained to listen relatively whether they know it or not.  And then I found another site which related the anecdote of a famous pianist who chose to listen only in relative mode and, consequently, "lost the ability" to hear absolutely.

I hoped that the Stroop effect was somehow related to perfect pitch, and found one scientist who thought so too.

So for you skeptics out there-- including the guy in the Yahoo discussion group who exclaimed in bewilderment "How much of this are you just making up?"-- I am making it all up, but follow the links I'm offering you and you'll see that what I'm writing, despite its non-scientific approach, is solidly grounded.

September 7 - With targets not much bigger

I'm currently waiting for books and publications to arrive, so it's probably a good idea to leave theory alone for a while and concentrate again on practical listening and learning.  I've already described my process of listening to individual pitches so that I always get them right-- if I'm patient enough to follow the process, and wait until I know instead of rushing ahead to guess.  But I didn't explain one part of it:  the "trigger words" that I use to identify notes.

Trigger words are linguistic associations that help you to recognize pitches.  You may have heard someone else describe how pitches sound to them-- certain pitches, they say, may be "round" or "whiny" or "bold" or whatever.  You can use words in the same way to help yourself recognize which pitches you are listening to-- but you can't use theirs.  You have to make your own.

It's not very helpful for someone to tell you what the notes sound like to them.  When I say the word "twangy" it means many things to me-- the bounce of a trampoline, words with a Texas accent, the string of a slap bass.  If someone tried to tell me that a note was "twangy", I would try to hear in that note any and all of the sound qualities which I associate with the word "twangy"-- and it's entirely possible that none of those sounds that I know is what they're actually talking about!

When you use a trigger word, it needs to be very specific in your own mind.  Some people try to make arbitrary associations and make those associations stick: they might assign a color to each note, or let someone tell them adjectives for each note.  But the note names, A through G, already are arbitrary associations.  Unless your associations mean something very specific to you, you won't do much better by using other random associations; you might as well just use the notes' letter names.  When you hear the characteristic described by your trigger word, you must know that no other note causes you to feel the same way.  It's so easy to assign your own words, too, that you really should!  You can create your own trigger words while you work on identification drills, first by assigning words and then by refining them.

Here's how you can create your own trigger words.  Write down on a piece of paper the note names that you are working with.  Play each note, let yourself feel the note, and think of a word that you associate with that feeling.  Write that word down, next to the name of the note.  This word does not have to be an adjective!  Just come up with a word that makes you feel the same way.  It could be "pizza" or "cartoons" or "skeptical"-- any word at all, as long as the word evokes a specific feeling that you can recognize.  Do this for each of the notes you are working with.  Then, when you begin listening to the notes in your drill, listen for how the note makes you feel.  Does that feeling match one of the trigger words you've written down?  If so, you can name the note.  Bingo.

But what if you hear your trigger word and you still get the note wrong?  Then you have to change your trigger word.  Let's say that you said that D was "stretched" and C was "plump", but when you heard C you felt "stretched", so you said "D".  That means "stretched" is not good enough, because both notes make you feel that way.  You've got to change your word for D.  Play C and D again.  Listen to D and how it makes you feel.  Choose another word that makes you feel the same as D does.  Play C again to make sure that that word does not work for D.  Eventually, by eliminating the words which work for multiple notes, you will have words that each work for only one note.

I've been working with C through F.  For me, C has become "round", D is "whiny", E is "raised eyebrow", and F is "bright".  These words probably will not work for you.  These words will probably change when I add more notes, too, since there are probably other notes which will also make me feel these things, but as long as I'm recognizing only C through F these are good enough.  When I feel "round" I know it's not D, E, or F; when I hear "whiny" I know it's not C, E, or F-- et cetera.  Eventually I will have trigger words for all the notes.
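
If you'd rather drill at the computer than at a piano, here's a rough sketch of the same exercise (in Python; it assumes numpy and the sounddevice library are installed-- any playback method would do-- and the trigger words in it are just my current ones, which you should replace with your own):

    import random
    import numpy as np
    import sounddevice as sd   # assumed available; any audio playback library would work

    RATE = 44100
    NOTE_FREQS = {"C": 261.63, "D": 293.66, "E": 329.63, "F": 349.23}
    TRIGGER_WORDS = {"C": "round", "D": "whiny", "E": "raised eyebrow", "F": "bright"}  # replace with yours

    def play_tone(freq, seconds=1.5):
        t = np.arange(int(RATE * seconds)) / RATE
        sd.play(0.3 * np.sin(2 * np.pi * freq * t), RATE)
        sd.wait()

    def drill(rounds=10):
        for _ in range(rounds):
            note = random.choice(list(NOTE_FREQS))
            play_tone(NOTE_FREQS[note])
            guess = input("Which note (C/D/E/F)? ").strip().upper()
            if guess == note:
                print("Correct.")
            else:
                # A miss is the cue to refine a trigger word, as described above.
                print(f"It was {note} ('{TRIGGER_WORDS[note]}').  If you felt your word for {guess}, change one of them.")

    if __name__ == "__main__":
        drill()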

As I have mentioned, trigger words are only a stepping stone to recognizing the notes.  Ideally you want to recognize the notes just for what they are, just feel them purely as pitches, and then use words to talk about them only if you have to describe your experience to someone else.  But until we become familiar with the pure feeling of a pitch, we can use our trigger words to help us recognize what the feeling of a pitch actually is.

September 9 - Spicks and specks

Ernst Terhardt has posted thorough descriptions of what he terms spectral pitch and virtual pitch, which lends further credence to the notion of listening in two different modes.  He specifically says that "the formants of speech vowels elicit corresponding spectral pitches," which makes me think that I'll find something interesting in spectral analysis of spoken vowels.

I think it's strange that so many people tend to "believe" whether or not absolute pitch can be learned.  They seem to think that there's so little information available about perfect pitch that belief is the only option.  I've only spent five months researching, so far, and even in such a short time I've assembled enough research and information to make it seem impossible to deny that it can be learned.  I have to wonder why so many people still declare it a mystery, and why the conclusions I've reached aren't simply common knowledge.  Maybe it's because the Internet is still fairly new, and it wasn't possible before to do such wide-ranging research.  Certainly, if not for my forays into international discussion forums on-line, I would most likely never have met anyone else who had taught themselves perfect pitch.  But I have.

It's also interesting that so many people think that absolute pitch perception is dependent on early musical training.  The important factor seems not to be that children are trained in music, but that they were trained early, before normal listening habits could take root.  Early musical training encourages a child to recognize that there is such a thing as "pitch", while their aural intelligence is not yet fully formed, so there's a chance of their learning to recognize the pitches uniquely before they're trained to ignore them in favor of their relationships.  It seems probable that children could receive "pitch training", without musical training, and perceive absolute pitch with no musical ability at all.

September 12 - You put your right foot in

I've been making an obvious assumption-- that I know what a "tone" is. I've studied the biology of the ear and how it responds to waveforms; I've read and researched and interviewed to find out how people perceive pitch; I've begun to explore the idea of meaning in sound and vowel formants. But I hadn't thought to ponder the question "what is a tone"-- or, perhaps, why is a tone? My copy of Music and the Mind arrived today, and I didn't realize I hadn't considered the idea of a tone until the author, Anthony Storr, almost casually offered his definition of tones: "separable units with constant auditory waveforms which can be repeated and reproduced." You can take that definition or leave it as you prefer; I appreciate the definition mainly because it prompted me to think about it.

Why do we hear tones at all? Light is a waveform that can be described by its frequency; so is sound.  Why can't we see sound? Why a second sense? The most obvious answer is that sound waves and light waves are fundamentally dissimilar. Light waves are a form of radiant energy, created by electron oscillations.  When we see light, what we are responding to is literally a burst of energy. Sound, on the other hand, is air compression.  When something creates a sound, it does so by disturbing the air; the molecules thus disturbed subsequently jostle others, which then affect others, and through this transmission of energy the resultant compression reaches your ears through the air. But how does something create sound? How does it disturb the air?

By moving.

It's an interesting idea, isn't it-- that hearing exists to tell us how the world is moving around us.

Vibrating things produce sound.  That's common knowledge (common sense, even).  Subtly restated, we "see" that vibration as the sound that we hear.  The distinction is fine, but significant.  If you plucked a guitar string for me, and asked me to identify the sound, I would say "that is a guitar string"; but if you asked me to tell you what happened, I would say "The guitar string created the sound" instead of "the guitar string was the sound."  When I look at a blue flower, I know from my science classes that the natural qualities of the flower cause it to reflect blue light into my eyes, but I would never dream of saying that the flower "created" blue. I would tell you that the flower is blue. Blue is a characteristic of the flower. In a Yahoo discussion group, IronMan Mike stated quite plainly: "I see in color; I hear in pitch," and I'm reasonably sure that this distinction is what he was talking about.  Vibrating objects don't create sound; moving objects are sound.

It doesn't seem that way, at first, when you consider what a pitch actually is. A pitch is, of course, the frequency at which a sound wave is vibrating. What made the "sound wave" vibrate at that particular frequency? What is the "sound wave"? It's a specific movement that caused an amount of air to be compressed in a certain way, and the energy of that compression is transferred through the air. I suppose a "sound wave" is actually a kind of illusion-- sort of like The Wave at a sports stadium. There really isn't a "wave" that exists as a literal object moving from place to place-- just people standing up and sitting down, or molecules bumping into each other.  Nonetheless, the chain reaction which results from the motive force is, for all reasonable purposes, a recognizable, independent entity that can be described as itself.  It could be said, then, that by altering the energy of its environment, the vibrating object literally "created" the sound.

But in using that perspective, I would have to acknowledge that my blue flower also "created" the blue light that I see. The flower absorbs certain types of light energies and reflects others, thus manipulating the energy of its environment.  But the flower did not create the light energy; a vibration does not create the air that it compresses.  Light waves transmit to our eyes the inherent properties of an object, and air compression transmits to our ears the inherent properties of an event.  Of course, that makes it all the more likely that we would think of a sound as "created"-- the sound happens.  It's an event, not a thing.  The guitar string is plucked.  The baby cries.  The tree falls in the forest.  There are many reasons to believe that the event occurs and the sound is merely a by-product of that event-- but if you consider your sense of hearing alone, you'll realize that the sound is the event.

That event is not going to be a "pitch".  In nature, there is no such thing as a pitch; sounds never occur as a pure sine wave.  Why then would we ever perceive pitch?  Because, it seems, sound frequencies are the building blocks from which all sounds are made.  Biologically, the simplest thing to do is to use a mechanism which collects the basic component information, so that the information from any sound event can be collected in the most efficient way.  But we do not use that information directly.  Our senses perceive change, and because pitches are constant waveforms, we learn to perceive "change" in sound as the shifting from one frequency to another.  We gradually become habituated to the pure sensation of pitch just because it's always there-- and it's never meaningful to our minds except when combined with other pitches.

So why is a tone?  Isn't it a pitch?  Why would we ever hear one if we train ourselves not to hear pitches?

We can go back to Anthony Storr's definition and emphasize that a tone is a separable waveform.  Pitch sensations do have duration and they have loudness, and those features can be varied to allow us to assign meaning to an otherwise invariable pitch sensation; but even when those are held constant we can still comprehend tones.  Most simply, if we change from B to C, even though our minds may most clearly recognize "one half step" as the meaning of the event, we still recognize that we have started at B and moved to C.  The concepts of "B" and "C" can thereby be discretely identified.  That is why tones exist in music-- they are our reference points, the dots that we connect to make music.

By the way, if you're following me so far, then you realize that I'm overlooking a tremendously huge question here-- if sound is our interpretation of temporal events and physical motion, then what in the world do our minds think we're hearing when we hear music?  Yes, I'm overlooking that question, and with good reason-- that's one for the philosophers and one for the books.  It's far too big a topic for me to address when talking about tone, and it's exactly the sort of inquiry which causes a book like Music and the Mind to be written.  I encourage you to let your imagination explore the topic, and I may later take a stab at a general summary of the concept, but for now I'm tabling that particular discussion.

In fact, I had better stop there right now, because my boot drive just crashed.  After six attempts, I've been able to get it to run in order to finish this update (and back up some critical files), but I'll have to come back later for Tones Part Two.

September 13 - Half a loaf

I'm still recovering from my hard drive crashing, so I won't have time to write about tones again until next week-- but I've been thinking about vowel formants today.  After finally finding some information about the ranges of vocal formants in the English language, I carefully read a critical study of the relationship of formants to vowel perception, and I ended up with something like this.  I know it's not very easy to follow; I'll have to come back to expand it later.

Most people believe that vowel formants are solely what define a vowel sound, and there are demonstrations which show how vowel sounds can be created from plastic-tube models.  The study, however, demonstrates that when the fundamental frequency changes while the perceived vowel is held constant, the formant patterns change as well.  Further study showed that the same vowel can be created with the "articulators" (which create the sound) in different configurations; there is no consistent "throat shape" for each type of vowel.  And, regardless of the age or gender of the speaker, when a speaker uses the same fundamental (F0) pitch, their vowel formants are the same.  It seems somewhat reasonable to say that the speakers' formant production corrects the vowel sound rather than defines it.

The main idea here is that vowel production and perception may be bound principally to the fundamental frequency, and not to the shape of the formant series.

September 16 - You load sixteen tones

Okay-- back to the question, why is a tone?  How can tones exist-- and why are they part of music, which is a relative comprehension?

It seems reasonably clear how a tone can exist. The ear is designed to break down sound information into its fundamental frequencies. Since frequencies are the basic components of all sound information, it seems inevitable that a "tone" would emerge as a recognizable feature of sound.

It seems logical to say that tones have musical identity because, even though a tone is musically meaningless by itself, tones are the relative points from which the structure of music is formed. The points can be differentiated by "moving" up or down.

"Duration" and "loudness" are part of a tone's individual identity, not part of its musical identity. It's easy to imagine how this makes sense-- just think of songs you've heard performed by different artists. They can vary the character of any of the individual tones by its duration or its intensity, but as long as they follow the same relative pattern it's recognizably the same song. Also, although it may seem obvious, it's nonetheless important to recognize that with no duration and no loudness, there's no tone. You'll notice that the more a song slides around, without stopping on any specific tones, the less it actually sounds like music-- you can probably think of a Beatles song or two which dissolves into this kind of nonsensical roaming.

But is a slide atonal-- or even amusical? Does a slide have tones?  On a piano, yes, it does-- a musical "slide" from a piano is merely a rapid series of tones. On a stringed instrument, or trombone, or other non-keyed instrument, the slide has no obvious steps, but can be considered an infinite series of tones in the same way that a geometric line is an infinite series of points. The tones within the slide have no theoretical duration, but they have loudness.  Hearing a musical slide is the aural equivalent of looking at a rainbow; the tones in a musical slide can most easily be recognized at the endpoints of the slide-- as the endpoints of the slide-- but even so, tones can be identified and categorically recognized even within the slide. A slide is inherently musical, because there is a relationship between the start and end points and every pitch in between.
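
The two kinds of slide are easy to construct and compare; here is a minimal sketch (Python, assuming numpy) that writes both an unbroken slide and a twelve-step "piano" slide over the same octave to WAV files:

    import wave
    import numpy as np

    RATE = 44100

    def write_wav(name, signal):
        signal = signal / np.max(np.abs(signal))
        with wave.open(name, "w") as w:
            w.setnchannels(1); w.setsampwidth(2); w.setframerate(RATE)
            w.writeframes((signal * 32767).astype(np.int16).tobytes())

    def continuous_slide(f_start, f_end, seconds=2.0):
        # String- or trombone-style slide: the frequency passes through every value in between.
        t = np.arange(int(RATE * seconds)) / RATE
        freq = f_start + (f_end - f_start) * t / seconds
        phase = 2 * np.pi * np.cumsum(freq) / RATE   # integrate frequency so the phase stays continuous
        return np.sin(phase)

    def stepped_slide(f_start, f_end, steps=12, seconds=2.0):
        # Piano-style slide: the same span covered as a rapid series of discrete tones.
        per_step = int(RATE * seconds / steps)
        t = np.arange(per_step) / RATE
        return np.concatenate([np.sin(2 * np.pi * f * t) for f in np.linspace(f_start, f_end, steps)])

    write_wav("slide_continuous.wav", continuous_slide(262, 524))   # roughly C4 up one octave
    write_wav("slide_stepped.wav", stepped_slide(262, 524))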

Tones have musical identity, but music also has tonal identity. Even if the identity of the music is in the fact that we've just moved from note to note in a specific interval, even if the greatest part of a tone's impact may be the note from which we traveled to reach it, there's still plenty of oomph to be had from any one tone. The length of time we dwell on it, the volume with which it's presented to us, and the pitch at which it is played or sung-- any and all of its characteristics affect us differently.

This leads me to a description of "tone" that isn't as literal as Anthony Storr's inclusive definition, but which does provide some understanding of its nature-- a musical tone is a discrete and invariable sound event. We can recognize a tone because we can say that something just happened.  A tone is therefore not the same thing as a pitch. Even though a tone is typically described by its pitch, the "tone" exists as an event, not a characteristic. The temporal nature of tone as an event is no less significant than the inherent characteristic of pitch frequency.  A tone is musical; a pitch is not musical.

So... how is this distinction useful?  Why am I bothering?  What it suggests is that hearing musical tones is not the same as hearing pitch. Even when we hear how a single note makes us feel, the reason we feel anything is because it is a tone, within which pitch is merely one characteristic.  Learning to recognize, identify, and recall tones because of how they feel to you naturally incorporates more than just the simple sensation of pitch.  It's feeling the entire musical tone.  You can learn "perfect pitch" without ever actually sensing pitches directly. Pitch is not how you feel in response to a tone; it is neutral; it simply is.

This distinction between pitch and tone is important to our learning goals.  When we learn "perfect pitch", we are not actually attempting to hear pitches. We are learning to identify and recall tones.  And that is just as well, because tones are what we want to learn; pitches are not features of music.  Pitch is a characteristic of tone.  Tones are musically important; therefore, "perfect pitch" training is useful to our comprehension as musicians.

September 19 - Correct me if I'm song

Ack!  Information overload, redux.  I've now been pelted with One Note Complete Method ear training, The Psychology of Music, and webpage links and new observations and all kinds of other everything all at once!

Possibly one of the more important "everythings" is a webpage which demonstrates vowel production differences between male, female, and juvenile voices.  Although it is in Dutch, you can download the AIFF files and hear for yourself.  I used Gram to look at the files, and the most telling ones appear to be a single pitch manipulated through an aspirator of some kind.

I'm delighted to have heard these files; they seem to validate the direction I've been going.  If you look at them, using Gram, you see that the only thing that changes is the single pitch frequency.  If you listen to them, knowing what vowel sounds to listen for, you definitely will hear the vowels-- but (and this is the best part) if you listen to them without paying attention to the vowels it will sound like random musical notes played very breathily.  And I was further delighted to discover that when I didn't know what vowels I was supposed to hear, I only heard pitches; but as soon as I knew, there they were.  This fits in directly with the idea that we can dehabituate ourselves to hear pitch.
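
If you don't have Gram handy, the same kind of picture can be produced with a short script; here's a sketch in Python (assuming numpy, scipy, and matplotlib are installed, and that the AIFF file has been converted to WAV-- "vowel.wav" below is just a placeholder name):

    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.io import wavfile
    from scipy.signal import spectrogram

    rate, data = wavfile.read("vowel.wav")      # placeholder: one of the downloaded files, converted to WAV
    if data.ndim > 1:
        data = data.mean(axis=1)                # mix stereo down to mono

    freqs, times, power = spectrogram(data, fs=rate, nperseg=2048)
    plt.pcolormesh(times, freqs, 10 * np.log10(power + 1e-12), shading="auto")
    plt.ylim(0, 4000)                           # the region where fundamentals and formants live
    plt.xlabel("Time (s)"); plt.ylabel("Frequency (Hz)")
    plt.show()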

I've got a lot of reading to do.  Lots and lots of reading, and not all of it will fit into the category of "music and language".  But I have here reached a very tidy, if entirely speculative, conclusion for the concept: vowel sounds are pitches.  Linguistic vowels, "virtual" vowels created by vocal formants, were developed based on the known sound of a vowel.

Follow my logic with me:

Accepted scientific knowledge does say that a spoken vowel's formants are what defines the vowel.

However, the formants change, within a range, when different people are speaking.

If I'm understanding it correctly, this Dutch-language site proves that a vowel can be clearly heard with only a single pitch.

And a study shows that the fundamental frequency is actually more important than the formants.

Now, if you consider that the brain will listen to an overtone series and hear a fundamental pitch that isn't actually there, a "virtual pitch",

and you acknowledge that the overtones can be perceived separately from the fundamental...

...then it seems probable that the formants don't actually define the sound of the vowel. Instead, the formants force the listener to hear the "virtual pitch" that corresponds to the true vowel regardless of the apparent fundamental frequency that is being spoken.  That is, the formants of a vowel don't define the vowel, as previously thought, but correct it so that we hear, either truly or virtually, the appropriate vowel pitch.

That could help explain why there are specific ranges of formants, since the vowel only has to be corrected up or down within a single octave.

It also suggests a reason why people with different dialects hear different notes as the end point of the scale; perhaps they're accustomed to hearing certain vowels adjusted "up" and other vowels adjusted "down".

It could also help explain why people who speak tonal languages appear to have absolute pitch perception, even though they don't actually seem to perceive pitch the same way as a person with natural absolute pitch.
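
The virtual-pitch premise in the chain above is easy to demonstrate for yourself; a minimal sketch (Python, assuming numpy) writes two WAV files, one containing only harmonics 3 through 7 of 220 Hz and one with the fundamental restored-- most listeners hear both as a pitch of 220 Hz, even though the first file contains no energy below 660 Hz:

    import wave
    import numpy as np

    RATE = 44100

    def harmonic_tone(f0, harmonics, seconds=1.5):
        # Sum of the listed harmonics of f0.
        t = np.arange(int(RATE * seconds)) / RATE
        signal = sum(np.sin(2 * np.pi * n * f0 * t) for n in harmonics)
        return signal / np.max(np.abs(signal))

    def write_wav(name, signal):
        with wave.open(name, "w") as w:
            w.setnchannels(1); w.setsampwidth(2); w.setframerate(RATE)
            w.writeframes((signal * 32767).astype(np.int16).tobytes())

    write_wav("virtual_220Hz.wav", harmonic_tone(220, range(3, 8)))   # fundamental absent: lowest component is 660 Hz
    write_wav("real_220Hz.wav", harmonic_tone(220, range(1, 8)))      # same tone with the fundamental present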

It may not be a coincidence that there are 12 vowels in the English language and 12 tones in the scale.  It may not be a coincidence that the early Greeks labeled their musical pitches with vowels.  It may not be a coincidence that the Hindu scale has 22 notes and their language has more vowel sounds as well.  It may be that music developed from language developed from vowels.  It may be that music was language and language was music.

On the other hand, it's possible that although vowels are pitches, pitches are not vowels.  It could be that a pitch is the natural characteristic of a sound, and a vowel is merely the linguistic expression which most purely expresses that characteristic.  It's something to consider.

Maybe some scientists will pick up on this idea and work to prove it.  I've been quite impressed by the scientific reading I've done so far, because of the ingenuity of the experiments.  I think I may have to conduct some experiments of my own in order to test this theory; as far as I know, nobody else is pursuing it right now.  It may be that the "experiment" is nothing more than learning perfect pitch, myself, by hearing vowel sounds (this has already been done, but not by me) and describing the experience-- and that, my friends, is something I'll have to do.  It's already clear from the One Note Method that doing so will not interfere in the slightest with learning ordinary "perfect pitch" as well.

Continue reading with Phase 3

