Hello, welcome back to the course on Audio Signal Processing for Music Applications. This week we're talking about sound transformations, and in the demonstration lectures we have been exemplifying the different models we have covered during the course, and some transformations that can be done with them. For example, in the first class we talked about how the short-time Fourier transform can be used for morphing. In the second one, we talked about time scaling using the sinusoidal model. And in the last one, we talked about how to do pitch changes using the harmonic plus stochastic model. Now I want to go back to the idea of morphing, but using a different model. In this case, we're going to use the harmonic plus stochastic model, and we will see that the type of morphing we can do with this model is very different from the type of morphing we were able to do with the STFT. In order to do the morphing, we first have to have a good analysis of each of the sounds we want to morph. So let's start with the GUI, the sms-tools models GUI, and let's go to the harmonic plus stochastic model. Let's start with one of the sounds we're going to morph; we're going to morph a violin sound with a soprano sound. So let's start with the violin sound, okay? This is the sound. [SOUND] We have to choose quite a few parameters. The Blackman window is a good choice for this stable note, and its side lobes are quite low, so that's good. We have to choose the window size, and that always requires some computation. The note is a B3, which is around 246 hertz. To decide the window size, we take the number of bins of the window's main lobe, which is 6 for the Blackman window, times the sampling rate, 44100, divided by the frequency of the note. So we need around 1,075 samples; let's put 1075 here. The FFT size has to be larger than the window, so let's make it quite large so we get some zero padding, 4096 for example. Okay, a magnitude threshold of -100 is fine.
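The window-size rule of thumb just described can be sketched in Python. This is a hypothetical helper, not part of sms-tools; the names `window_size` and `fft_size` are my own:

```python
import math

# Rule of thumb from the lecture: the Blackman window's main lobe is
# 6 bins wide, so to resolve harmonics spaced f0 apart we need about
# 6 * fs / f0 samples; an odd size keeps the window centered on a sample.
def window_size(f0, fs=44100, main_lobe_bins=6):
    m = int(main_lobe_bins * fs / f0)
    return m if m % 2 == 1 else m + 1

# FFT size: next power of two at or above the window size, times an
# oversampling factor so we get some zero padding.
def fft_size(m, oversampling=2):
    return oversampling * 2 ** math.ceil(math.log2(m))

print(window_size(246))            # B3 on the violin -> 1075
print(fft_size(window_size(246)))  # -> 4096
```

With the soprano's E4 at around 330 Hz, the same formula gives the 801-sample window used later in the lecture.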
Minimum duration of the sinusoids: it's a single note and we want to interpolate it with another sound, so the longer the better; 0.5 would be good. In terms of the maximum number of harmonics, again we want quite a few, as many as we can, so 100 is okay. Now, for the range of the fundamental frequency: we said the fundamental was 246 hertz, so the minimum definitely has to be below that. If we go from 200 to 300, that should be okay. For the f0 detection error threshold, 7 should be fine. The harmonic deviation, 0.1, is quite a bit, and that's fine. And the stochastic approximation factor: let's not approximate too much, so that we get good quality in the residual. Let's use 0.8; the maximum would be 1, which would keep the whole magnitude spectrum, so 0.8 is okay. Let's compute it. Now let's listen to the sinusoids. [SOUND] Okay, this is fine. The stochastic. [NOISE] Okay, that's quite noisy, but it's soft; I think we can manage that, okay? And here we can see the representation. We could try other parameters, but let's leave it like that. Now let's analyze the other sound, the soprano sound, okay? This is an E4. Let's listen to it. [SOUND] Let's keep the Blackman window. The window doesn't have to be as large: it's again 6 times 44100 divided by the frequency, and an E4 is around 330 hertz, so we get around 802 samples. Let's leave the window size at 801. For the FFT size, let's leave it as it is, so that's good. Magnitude threshold -100, and a minimum duration of 0.5 is fine. Let's keep the number of harmonics: it has to be the same number of harmonics because we're going to interpolate between the two sounds, so 100 is fine. And since the frequency is 330, and the voice has a vibrato so it will change quite a bit, let's be safe and set the range from 250, for example, to 400.
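To check note frequencies like the B3 and E4 used here, a small equal-temperament helper is handy. This is a sketch of standard tuning math, not sms-tools code, assuming the usual A4 = MIDI note 69 = 440 Hz reference:

```python
# Equal-tempered frequency of a MIDI note number (A4 = 69 = 440 Hz).
def note_freq(midi_note):
    return 440.0 * 2.0 ** ((midi_note - 69) / 12.0)

print(round(note_freq(59), 2))  # B3 -> 246.94 Hz, the violin note
print(round(note_freq(64), 2))  # E4 -> 329.63 Hz, the soprano note
```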
Okay, and now we can just leave the same parameters: the same error threshold for the f0 detection, the same deviation, and a stochastic factor of 0.8. So let's compute that. Okay, this sound is a little more difficult to analyze because of the formants; there are some areas of the voice where there is not much energy. But let's listen to the result. [SOUND] The sinusoids look good. [NOISE] The stochastic sounds okay, good. And of course the whole sound is fine. Okay, now we are ready to do the actual morph between these two representations. So let's close this, go to the transformations directory, and type python transformations_GUI.py. This is the interface for the transformations, so now we can go directly to the HPS morph option, and in fact the sounds we are going to morph are already the default ones, so we will use those. Now let's change the parameters to the ones we decided on. For the violin we decided to use a window size of 1075, and a big FFT size, 4096. The threshold is -100. The minimum duration of a trajectory, we decided 0.5. And given that the fundamental frequency is around 246, we decided to use the range from 200 to 300. The error threshold we set to 7, and here maybe we can be a little more permissive with the deviation and just say 0.05, okay? For sound two, the soprano, we are going to use the same Blackman window, but a smaller window size because it's a higher pitch, so 801 is fine. We decided to use the same FFT size, and similar values for the rest. For the minimum frequency: this is a higher pitch, 300 and something, so 250 to 400 should be enough, and the error threshold is 7, okay? So we can now analyze. Oh, I wanted to change the number of harmonics: it has to be the same number of harmonics, and we set it to 100. And the stochastic factor: we wanted a good stochastic component, so we set 0.8. Okay, let's try it again. Okay, so these are the two sounds.
Clearly, the analysis found more harmonics on the violin than on the voice. That means we will only be able to interpolate the harmonics that are present in the voice. Now, the transformation will interpolate between these two sets of values, and there are three things we can interpolate: the frequencies of the harmonics, the magnitudes of the harmonics, and the stochastic component. So for example, let's take the frequencies of the first sound, which we refer to as 0. We set that at time 0 we have the first sound, 0, and at time 1 we also have the first sound. So the frequencies are those of the violin. And the magnitudes, let's say, are those of the voice: at time 0 we put 1, which means the magnitudes of the voice, and at time 1 we also have that. And for the stochastic component, we can put it in between: at time 0 we put 0.5, and at time 1 we also put 0.5. Okay, let's see what happens. Okay, so this is the result. The frequencies clearly have the spacing of the violin. We don't see the magnitudes here because we are not displaying the magnitudes of the harmonic lines. But let's listen to it. [NOISE] Yeah, it sounds like what it is: a little bit like the magnitudes of the voice, but at the pitch of the violin. Now, let's go from one sound to the other. If we want to go from all the values of the violin to all the values of the voice, we can do it by putting 0, 0, 1, 1 for the frequencies, again 0, 0, 1, 1 for the magnitudes, and 0, 0, 1, 1 for the stochastic component, okay? And let's apply it. Okay, and here we clearly see that it goes from one sound to the other; in the frequencies we see this kind of upward shift, which is because the pitch of the voice is higher than the pitch of the violin. So let's listen to that. [NOISE] Okay, so we clearly hear this evolution. And of course, with these envelopes we can specify any interpolation in any time-varying fashion.
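The interpolation just described can be sketched in a few lines. This is a minimal per-frame illustration with made-up arrays, not the actual sms-tools morphing code; the function and variable names are hypothetical. Each interpolation factor goes from 0 (first sound) to 1 (second sound), and an envelope like the `0, 0, 1, 1` entered in the GUI is a list of (time, value) pairs resampled to one factor per analysis frame:

```python
import numpy as np

# One morphed frame: mix harmonic frequencies, harmonic magnitudes and
# the stochastic envelope with independent factors in [0, 1].
def morph_frame(hfreq1, hmag1, stoc1, hfreq2, hmag2, stoc2,
                freq_f, mag_f, stoc_f):
    n = min(len(hfreq1), len(hfreq2))  # only harmonics present in both sounds
    yfreq = (1 - freq_f) * hfreq1[:n] + freq_f * hfreq2[:n]
    ymag = (1 - mag_f) * hmag1[:n] + mag_f * hmag2[:n]
    ystoc = (1 - stoc_f) * stoc1 + stoc_f * stoc2
    return yfreq, ymag, ystoc

# An envelope like 0, 0, 1, 1 means: at time 0 the factor is 0, at
# time 1 it is 1; resample it to one factor per analysis frame.
env = np.array([0.0, 0.0, 1.0, 1.0])
frame_factors = np.interp(np.linspace(0, 1, 5), env[::2], env[1::2])
print(frame_factors)  # -> [0.   0.25 0.5  0.75 1.  ]

# First example from the lecture: violin frequencies (factor 0),
# voice magnitudes (factor 1), stochastic halfway (0.5).
violin_freq = np.array([246.0, 492.0, 738.0])
violin_mag = np.array([-20.0, -30.0, -40.0])
voice_freq = np.array([330.0, 660.0])
voice_mag = np.array([-25.0, -35.0])
yfreq, ymag, _ = morph_frame(violin_freq, violin_mag, 0.5,
                             voice_freq, voice_mag, 0.5,
                             freq_f=0.0, mag_f=1.0, stoc_f=0.5)
print(yfreq)  # -> [246. 492.]  the violin's pitch
print(ymag)   # -> [-25. -35.]  the voice's magnitudes
```

Note how `morph_frame` truncates to the harmonics present in both sounds, which is why only the harmonics found in the voice can be interpolated.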
So we could have quite sophisticated interpolation envelopes. Clearly, this is very different from the short-time Fourier transform morphing that we did before. So okay, let's finish this. Basically, we have talked about a transformation, morphing, using the harmonic plus stochastic model within the sms-tools. And clearly it's a different type of morphing, with different possibilities than the STFT: we can now interpolate basically every set of parameters and obtain any sound in between. So even though we are using the same term, morphing, the model has a big impact on the possibilities the technique offers and on what we can do with this idea of interpolating between two sounds. So that was all. Hopefully these demonstrations have given you an idea of the different possibilities that these models offer for transformations. And of course, there are many more possibilities, and the models we have not used here also offer different transformation facilities, so feel free to explore them by yourself. Thank you very much for your attention, and I will see you in the next lecture. Bye-bye.