Hello, welcome back to the course Audio Signal Processing for Music Applications. This week we are talking about sound transformations. In the previous demonstration class we talked about morphing, and in particular about how to use the short-time Fourier transform to morph the qualities of two different sounds. Now, in this second class, I want to talk about time scaling, that is, about how to change the duration of a sound and so modify the length of a particular sound fragment. So, let's start by using the possibilities of Audacity for this type of transformation. Let's start from this sound; this is the orchestral sound that we have in the sms-tools directory. [MUSIC] Now, in the effects menu of Audacity there are quite a few transformation possibilities, and it is even possible to extend these transformations with external plugins, many of which you can find online. But let's focus on the few transformations that relate to changing the duration of a sound. The simplest one is this one, called Change Speed. This is a very simple algorithm; the only thing it does is change the playback speed. So it changes how the samples are read and therefore reproduced. And of course the effect of this is that it modifies not just the duration but also the frequencies of the sound. So, let's listen to that same orchestral sound but changed by, for example... well, if we put it at zero, there would not be any change. Let's change it by 30%, so this will speed up the playback of the sound by 30%. Okay, clearly we see that the sound is now shorter, and now we can play it. >> [MUSIC] Okay, so this is the standard effect of changing the playback speed of an analog disc or of a tape, and it can also be done in digital media.
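To make the Change Speed idea concrete, here is a minimal Python sketch of reading the samples faster. It is not Audacity's code; the function names `change_speed` and `peak_frequency` and the 440 Hz test tone are assumptions made only for illustration. It shows that a 30% speed-up shortens the sound and also raises its frequency by the same factor.

```python
import numpy as np

def change_speed(x, percent):
    """Read the samples of x `percent` % faster (negative values read slower).

    Hypothetical helper: plain linear-interpolation resampling, which changes
    duration and frequency together.
    """
    rate = 1.0 + percent / 100.0              # 30 % -> read 1.3 samples per output sample
    n_out = int(round(len(x) / rate))         # the output gets shorter when rate > 1
    read_pos = np.arange(n_out) * rate        # fractional read positions in the input
    return np.interp(read_pos, np.arange(len(x)), x)

def peak_frequency(x, fs):
    """Frequency (Hz) of the largest magnitude-spectrum bin of x."""
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    return np.argmax(spectrum) * fs / len(x)

fs = 44100                                    # sampling rate in Hz (assumed)
t = np.arange(fs) / fs                        # one second of time stamps
x = np.sin(2 * np.pi * 440.0 * t)             # 440 Hz test tone (assumed input)
y = change_speed(x, 30)                       # 30 % faster, as in the demo

print(len(x) / fs, len(y) / fs)               # durations in seconds
print(peak_frequency(x, fs), peak_frequency(y, fs))
```

The second printed pair should sit near 440 Hz and near 440 x 1.3 = 572 Hz, which is exactly the side effect heard in the Audacity example.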
So this is okay, but it's not that interesting, in the sense that we are manipulating not just the duration but the whole quality of the sound. More interestingly, when we talk about time scaling, what we normally refer to is being able to change the duration of the sound without changing its quality, without changing the frequencies that are present in the sound. Within Audacity there are a few algorithms that do that, and the most sophisticated one that I know of is this one, called Sliding Time Scale/Pitch Shift, okay? This algorithm performs both tempo change, or time change, and pitch change. And it uses a sinusoidal model. In fact, it uses a sinusoidal model very similar to the one that we have been presenting in class. You can of course look at the code of this algorithm on the Audacity website, where you have access to the source code. The major difference, which is an important one, is that this sinusoidal model is based on a subband type of processing. So it does sinusoidal modeling while splitting the whole sound into octaves and modeling every octave with a different analysis/synthesis approach. So the quality is a little bit better than what we can get with the algorithm that we have been developing in class. So let's listen to that. It can easily be controlled in time; here, for example, the control of this algorithm is done by giving an initial tempo change, as a percentage, and a final tempo change. So, for example, what we can do is start slower; let's say we start 20% or 22% slower, and let's finish by speeding up. So let's, for example, end up with a 30% speed increase. Okay, but remember that here we are not changing the reading of the time-domain samples; this is done using the sinusoidal model. So we are resynthesizing the sinusoids at a different speed while maintaining the pitch of the sinusoids. So, let's analyze this; it is quite fast.
In fact, it's a very efficient algorithm, implemented in C. So now we are done and we have changed the duration; well, it's a time-varying change of duration, so let's listen to that. >> [MUSIC] Okay. So we have changed the duration and, clearly, the pitch, or the frequencies, have not changed. The sound has changed in duration, it has sped up, the tempo has increased, but not the frequencies, and that's a pretty good algorithm. There are some more sophisticated ones, but this is a pretty decent one that performs good time scaling. So, now let's go to our own implementation. Let's go to the GUI of transformations that we have, and let's go to the sine model. Okay. And let's choose a sound; let's use the piano sound. Okay. Of course, with this algorithm we have to decide the parameters. In the Audacity implementation everything is kind of a black box, and it does pretty well; it chooses the best parameters for this type of transformation. Here we have to choose the parameters. So we first have to choose the analysis parameters of the sinusoidal model. It's a piano sound; well, it's a harmonic sound, but in the sinusoidal model this is not an issue. We have to make sure that we distinguish the peaks of the spectrum well enough, so that we can build sine waves out of them. So, let's start maybe with a Hamming window of 800, or maybe let's increase this a little bit; let's put 1,000 samples for the window size. The threshold, minus 90 maybe; let's go even lower, minus 100. The minimum sinusoidal duration, well, let's leave it like that. The maximum number of sines, we can put as many as, for example, 200, so we really get a lot of sinusoids. And let's maybe set this offset a little bit higher, so we allow frequencies to change a little more, and let's first do the analysis and resynthesis of the piano. Okay. Here is the sinusoidal analysis that I have performed, and the resynthesis. Let's listen to the resynthesis.
Well, let's first listen to the original sound. [SOUND] And then the resynthesized one. [SOUND] Okay, it does a pretty good job. So, that's a good starting point. Now we can apply the transformations, and in this interface we can do scaling both of the frequencies and of the time. Let's not do any frequency scaling. For that, let's say that at time 0 we put the value 1 and at time 1 we put the value 1. So there is no frequency change. And in the time field we can put the time scaling operation. So, for example, we can say that at time zero we just start at zero, and for the end we can just normalize and say 1. Let's slow down by a factor of 2, for example. So this will slow down quite a bit more than what we were doing before; it will stretch the sound to twice as long. So let's see what it does. Okay. So this is the stretched sound, and it is much longer. The original sound was around four seconds, now it's eight seconds, and here we see the sinusoids. Of course, since they have been stretched, we see a little bit more of a mess here, because we see twice the duration in the same plot and therefore a more compact type of representation. Let's listen to the stretched sound. [SOUND] [SOUND] Okay, that's pretty good. And of course, with this interface we can make changes that vary in time, not just one change from beginning to end. For example, we can say that at the beginning we start at zero. In the middle, let's speed up by 50%, so the midpoint is going to be at 0.25. And at the end, well, we can set it so that the sound is overall longer, so it's 1.3. Okay, so now we can apply the transformation. Okay, so this has done a time-varying time stretching: the beginning has sped up and the ending has slowed down. So, let's listen to that. [MUSIC] Okay. So, that's pretty good, and that's quite an interesting transformation, and there are quite a lot of opportunities to do interesting things here.
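The time-scaling values typed into the interface can be read as pairs of input time and output time. Here is a minimal Python sketch of that mapping, applied to the analysis frames of a sinusoidal model. It is only an illustration, not the sms-tools implementation: the function name `time_scale_frames`, the flat-list layout of the envelope, and the dummy 100-frame track are all assumptions.

```python
import numpy as np

def time_scale_frames(frames, time_scaling):
    """Re-read analysis frames along a piecewise-linear time map.

    frames       : array with one row per analysis frame (e.g. sine frequencies)
    time_scaling : flat list [in0, out0, in1, out1, ...] of normalized times
                   (assumed layout; output times must be increasing)
    Hypothetical helper: each output frame copies the nearest input frame, so
    the frequencies stored in the frames are left untouched.
    """
    frames = np.asarray(frames)
    in_times = np.asarray(time_scaling[0::2], dtype=float)
    out_times = np.asarray(time_scaling[1::2], dtype=float)
    n_in = frames.shape[0]
    # The output length follows the ratio between the last output and input times.
    n_out = int(round(n_in * out_times[-1] / in_times[-1]))
    # Time stamp of every output frame on the output axis.
    out_axis = np.linspace(0.0, out_times[-1], n_out)
    # Map each output time back to an input time through the envelope.
    in_axis = np.interp(out_axis, out_times, in_times)
    idx = np.round(in_axis / in_times[-1] * (n_in - 1)).astype(int)
    return frames[idx]

# Dummy "track": 100 frames whose single value is just the frame index.
frames = np.arange(100.0).reshape(-1, 1)

# Uniform stretch to twice the length: time 0 stays at 0, time 1 goes to 2.
stretched = time_scale_frames(frames, [0, 0, 1, 2])

# Time-varying case from the demo: midpoint moved to 0.25, end moved to 1.3.
varied = time_scale_frames(frames, [0, 0, 0.5, 0.25, 1, 1.3])

print(stretched.shape[0], varied.shape[0])
```

In the time-varying case the first half of the input is read in a quarter of the normalized time and the second half is spread over the rest, which matches what we heard: a faster beginning and a slower ending.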
Again, this is different from the algorithm that Audacity includes, but the basic concept, the sinusoidal model, is the same. The only difference is that there it's a little bit more complicated, and of course the implementation is in C, so it's maybe more efficient. So, that's all I wanted to say. We have talked about time scaling and used two algorithms, the one in Audacity and the one available in sms-tools. Hopefully that has given you a view of the potential of the sinusoidal model for time scaling operations. It is a very interesting transformation that is used in many types of applications. Next class we're going to be talking about changing the pitch of the sound, which is very complementary to changing the time. So, I hope to see you in our next class. Bye-bye.