In this second lesson, let's turn to the perception of the vocal sound signals that we have just talked about. And let's begin with a few general points about language, and about what are called phones and phonemes. There are about 6,000 languages extant in the world today, and, as you might imagine, they are disappearing at a pretty rapid rate; wake up tomorrow morning and there may be fewer. The figure of 6,000 is obviously a very approximate number.

"Phones" is the word that describes the sound signals produced by the vocal tract, which we talked about in lesson 1. Phones are the physical entities recorded coming out of the speaker's mouth. "Phonemes" are the sounds perceived by listeners. These are very different things, and the difference goes back to the theme that we have been dwelling on and will come back to again and again: what we hear is not just the physics of the sound. The physics of a phone is not what we hear as a phoneme; the two are distinct in an important way.

There are about 200 phonemes in all human languages taken together; that is, the vocal tract is capable of making, very approximately, 200 different sound signals that can be heard as phonemes. An infant is at first capable of hearing all of those phonemes. We can say this with assurance because if you take a child and place them in a different environment, an adoptive situation, they of course become fluent in the language of their adopted culture. But no single language uses all 200 phonemes; each uses, again very roughly, 30 to 100 of them, depending on the language. Over the course of development, in the first few years of life, the child comes to focus very quickly on perceiving those phonemes and enunciating the relevant phones, and loses the ability to generate the phones, or hear the phonemes, of other languages. That's why, when you study another language in high school, you have trouble with the pronunciation and trouble understanding it, and why you're always going to have an accent: the plasticity of the nervous system decreases during childhood, and after you're eight or ten years old you no longer have the learning ability that makes you fluent in another language. A second language learned as an adult can be practiced for many years and gotten pretty good at, but we're never completely fluent. We retain an accent that betrays the fact that we neither hear the phonemes of that language quite correctly nor pronounce them, generating the phones, quite correctly.

Of course, phones and phonemes are divided into vowel and consonant categories. The vowels, as I said in lesson 1 when we were talking about the vibration of the vocal cords, are voiced, meaning that the power of the sound signal, the phone, is generated by vocal cord vibration. Consonants, most though not all of them, don't depend on vocal cord vibration; they depend instead on the shape of the vocal tract, and particularly on the position of the tongue and the position of the lips.
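To make that voiced/unvoiced distinction concrete, here is a minimal sketch, not anything from the lecture itself: it synthesizes a crude "voiced" signal (a periodic pulse train standing in for vocal cord vibration) and an "unvoiced" one (white noise standing in for turbulence), then measures how periodic each is with a normalized autocorrelation. The sample rate, pitch, and pitch-search range are illustrative assumptions.

```python
import numpy as np

fs = 16000                      # sample rate in Hz (an assumption)
t = np.arange(int(0.05 * fs))   # 50 ms of samples

# "Voiced": a 120 Hz pulse train, a crude stand-in for glottal pulses driving a vowel.
f0 = 120.0
voiced = (np.sin(2 * np.pi * f0 * t / fs) > 0.95).astype(float)

# "Unvoiced": white noise, a crude stand-in for the turbulence of a consonant like /s/.
rng = np.random.default_rng(0)
unvoiced = rng.standard_normal(len(t))

def periodicity(x, fs, fmin=60.0, fmax=400.0):
    """Peak of the normalized autocorrelation within a plausible pitch range:
    high (near 1) for periodic, voiced-like signals; low (near 0) for noise."""
    x = x - x.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac = ac / ac[0]                          # normalize so lag 0 equals 1
    lo, hi = int(fs / fmax), int(fs / fmin)  # lags for 400 Hz down to 60 Hz
    return ac[lo:hi].max()

print(f"voiced   periodicity: {periodicity(voiced, fs):.2f}")   # high
print(f"unvoiced periodicity: {periodicity(unvoiced, fs):.2f}") # low
```

The point of the toy is only that vocal cord vibration leaves a periodic signature in the signal and turbulence does not; that signature is what the terms "voiced" and "unvoiced" refer to.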
So when you say pa or ta or ka, you are enunciating a consonant that is not, in general, based on the vibration of the vocal cords that is critical in saying the vowels A, E, I, O, and so on. Those are distinctions you want to keep in mind as we go forward.

Coming back to the theme that the physics of sound signals and what we hear are different things, I want to point out that the physical parameters of the vocal sound signal, the phone, are not what we really hear. A very nice way of demonstrating this to yourself is to take a sound signal like this one: a sound signal in time, a speaker speaking the sentence "This is a glad time indeed." The middle line here is the baseline, and the ups and downs are the time-varying amplitude of the sound signal as the vowels and consonants of the sentence are being spoken. You'd think, because subjectively it certainly seems that way, that what we are hearing is this sound signal broken up into words, syllables, and individual phonemes, but that's not really the case. You can see this if you look at the signal and relate it to the phonemes, the syllables, and the words being spoken. There is a relation between the physical signal and our impression of hearing individual words with breaks between them, and individual syllables with breaks between them, but it is not a simple relation. That breakup is what we learned in school, and we're subjectively aware of it, but it's not really how we parse the physical sound signal. There are no breaks between the syllables that you can identify, and no breaks between the words; there are, moreover, periods of silence within the signal. You couldn't tell what the speaker is saying, or what the syllables or words are, from the time-varying signal alone.

The other point that can be made from this is that human vocal sounds are both tonal and noisy. There are periods in this time signal that are tonal, in the sense of showing a systematic repetition, as you see here with a vowel sound, in this case an A. And there are periods that are much noisier, with no apparent periodic repetition; these in general are consonants, like the t here in "time." (A sketch at the end of this lesson shows one way to label such stretches in a recording.) So in addition to not hearing things the way you subjectively think you should, you're hearing noisy and voiced sound signals combined in a way that is just different from what the physics alone would lead you to expect.

Another good way of convincing yourself that we don't simply parse sound signals the way they occur physically, and that what we hear is quite different, is reading. If you use eye-tracking equipment to ask what a subject is looking at while reading a sentence in a piece of text, like the sentence here, you find that the eyes do not move from word to word or from syllable to syllable. They move in much larger jumps that don't correspond to the breakup of the sentence into particular words and syllables.

The bottom line of this second lesson is simply to convince you, with yet another body of evidence, that the physical parameters of vocal sound signals are not what we hear and are aware of subjectively. And that, again, is going to be an ongoing theme.
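Here is the sketch promised above: a hedged illustration, not anything from the lecture, of how you might frame a recording and label each short stretch as tonal, noisy, or silent from its energy and zero-crossing rate. The filename "speech.wav" is a placeholder, and the frame sizes and thresholds are rough assumptions rather than calibrated values.

```python
import numpy as np
from scipy.io import wavfile

# Placeholder filename; imagine a recording of "This is a glad time indeed."
fs, x = wavfile.read("speech.wav")
x = x.astype(float)
if x.ndim > 1:
    x = x.mean(axis=1)          # mix stereo down to mono
x /= np.abs(x).max() + 1e-12    # normalize to roughly [-1, 1]

frame = int(0.025 * fs)         # 25 ms analysis frames
hop = int(0.010 * fs)           # 10 ms hop between frames
for start in range(0, len(x) - frame, hop):
    w = x[start:start + frame]
    energy = np.mean(w ** 2)
    # Fraction of adjacent sample pairs where the signal crosses zero:
    # low for tonal (voiced) stretches, high for noisy (unvoiced) ones.
    zcr = np.mean(np.abs(np.diff(np.sign(w)))) / 2
    if energy < 1e-4:
        label = "silent"
    elif zcr > 0.15:
        label = "noisy (consonant-like)"
    else:
        label = "tonal (vowel-like)"
    print(f"{start / fs:6.3f}s  {label}")
```

Run on a real recording, the silent frames fall inside words, during stop closures like the t in "time," as readily as between words, which is exactly the lecture's point: the physical breaks in the signal don't line up with the words and syllables we hear.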