0:00

Welcome back to the course on audio signal processing for music applications.

In the previous theory lecture, we started talking about sound transformations.

In particular, about how to use the short-time Fourier transform and

the sinusoidal model to manipulate sounds.

In this second theory lecture, we want to continue that,

and talk about the harmonic plus residual model and

how it can be used for transforming the frequencies of the harmonics.

And then, the harmonic plus stochastic model, and

how it can be used for time stretching and morphing operations.

0:41

This morphing, by the way, is quite different from the one that

was possible with the short-time Fourier transform.

And of course, these are just some example operations, example transformations.

But the number of possible uses of these models is much larger.

1:01

Let's start with the harmonic plus residual model.

This is the block diagram that we have seen before, in which, from the input sound,

we analyze the spectrum, find the peaks.

And then, find the fundamental frequency, find the harmonics and

we subtract those harmonics from the original signal.

We need to compute the spectrum of the original signal again in order

to obtain a spectrum from which we can subtract these harmonics in a proper way.

1:34

Okay, and then where we do the transformations is in the frequencies and

amplitudes of the harmonics.

We don't apply any transformation to the residual. In order to

apply transformations to the residual, it makes sense to have a model for it.

And that is what we are going to be introducing

in the next model, the harmonic plus stochastic model.

1:58

And here, the transformations on the harmonics, again, like in the sinusoidal

model, are applied only to the frequencies and amplitudes.

The phases are discarded, we don't use them,

we will be regenerating them after the transformation.

2:15

And the main reason for that is that the phases are very sensitive to

transformations and it's very difficult to handle them.

So it's better to just regenerate them after transformations and

it doesn't sound that bad.

Anyway, so after the transformations we can

sum these new harmonics to the residual signal, and

this residual signal, since it does not include any deterministic,

any harmonic information, will merge quite nicely.

And then, it will create a sound that has the same residual as the original and

the harmonics will be modified.

2:53

In terms of the types of transformations that can be done

to the frequencies of these harmonics,

there are many, and here I explain three common ones.

The most traditional one is to just transpose

all the frequencies by a given factor.

So, we multiply all the harmonics by a frequency scaling factor,

which corresponds to a transposition. If we multiply by two,

it would correspond to converting the sound to an octave higher; if

we multiply by 0.5, it would be an octave lower.

Then, instead of multiplying, we can just shift all the frequencies.

For example, we can add a constant to all the frequencies, so

we'll be shifting all the harmonics by an additive component.

3:47

And finally,

the last one I want to mention is what is called frequency stretching.

In which we are applying a frequency change that is dependent on the harmonic number.

So here, what we are doing is basically dividing every harmonic by its harmonic

number, so we basically get a frequency around the fundamental.

And then, we multiply by the harmonic number to the power of a scaling factor.

Therefore, the higher harmonics will be modified very

differently from the lower harmonics, it's very much harmonic dependent and

that can create some nice effects that we will see.
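
The three frequency transformations just described can be sketched in a few lines of Python. This is only an illustration of the verbal description above, assuming NumPy and a hypothetical array `freqs` that holds the harmonic frequencies of one frame in Hz, ordered from the first harmonic upwards. The function names are made up for this sketch and are not taken from any particular library.

```python
import numpy as np

def transpose(freqs, factor):
    """Multiply every harmonic frequency by the same factor (2.0 = octave up)."""
    return np.asarray(freqs, dtype=float) * factor

def shift(freqs, offset_hz):
    """Add the same constant (in Hz) to every harmonic frequency."""
    return np.asarray(freqs, dtype=float) + offset_hz

def stretch(freqs, stretch_factor):
    """Divide each harmonic by its harmonic number, then multiply by the
    harmonic number raised to the power of the stretching factor."""
    freqs = np.asarray(freqs, dtype=float)
    h = np.arange(1, len(freqs) + 1)  # harmonic numbers 1, 2, 3, ...
    return (freqs / h) * h ** stretch_factor

# Hypothetical frame: five harmonics of a 440 Hz fundamental.
freqs = 440.0 * np.arange(1, 6)
octave_up = transpose(freqs, 2.0)
shifted = shift(freqs, 220.0)
stretched = stretch(freqs, 1.3)
```

With a stretching factor of 1 the frequencies are left unchanged, and for any factor the first harmonic stays where it was, since one raised to any power is still one.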

4:30

So this is a plot of these three transformations that

exemplify these operations quite nicely.

So on the top left, we have the original harmonics all located

at the harmonic number location, one two, three, four, and five.

Then if we transpose by two, on the right,

every harmonic is simply multiplied by two, harmonic one in position two,

harmonic two in position four, etc.

We are maintaining a harmonic spectrum, just transposing by a factor of two.

If we shift instead by a given value, in this case, since

we have the frequencies normalized to integer values,

shifting by 0.5 means shifting by half of the frequency

of the fundamental, of the first harmonic.

And in here, we see that we're shifting everything by a constant value.

After this transformation, we no longer have a harmonic series

because the first harmonic is no longer the fundamental frequency.

Okay, so we have now a series of values that are equally spaced but

they are multiples of another frequency.

So the sound may be quite a bit different and quite interesting in some cases.

And finally, the frequency stretching operation,

is quite a different operation.

Now it changes the distance between the harmonics and

it is like an accordion.

So if the factor here is 1.3, it means that the first

value, the fundamental, will not be touched.

Then the next one will be stretched by a factor of 1.3 and

then the following ones are going to be powers of this.

So as we go higher up, the distance will increase

kind of exponentially, and this creates, again, quite interesting effects.

6:44

So let's listen to a sound in which we have applied some stretch and

transpose kind of transformation.

So let's listen, it's a flute sound that we can listen to.

[SOUND] Okay, A4. Below that we see the harmonics

that have been identified on top of the residual spectrum of the original signal.

And then, what we're doing is leaving the residual as it is, so

that the background spectrogram is exactly the same as the top one.

And we're just modifying the harmonics, so

if you look here, the harmonics have been heavily modified.

And if you see they have been stretched and

at the same time transposed a little bit, so let's listen to that modified sound.

[SOUND] Okay, of course quite different.

And if you pay attention, the residual is there, and since the residual is basically

this breath noise, it merges also quite well with these transposed harmonics.

8:02

The harmonic plus stochastic model is very similar to the previous one, and now the transformation is applied both

in the modeling of the residual signal and of course in the harmonic component.

So we have a stochastic approximation and therefore,

we can transform that quite easily because it's a model that has simplified

the residual quite a bit and is quite flexible in terms of manipulations.

8:47

And so, in terms of what is actually done, it is very similar to what

we presented for the sinusoidal model.

Now, what we are introducing is the transformations

on the stochastic component in a similar way.

So the frequencies of the harmonics can be scaled by a factor and

the time of the reading of these frequency values

can also be changed in order to obtain this time scaling operation.

So we have this scaling function, the amplitudes are handled the same way and

the stochastic component is handled in the same way, okay?

And of course, the phases are generated and

they are generated by starting from an initial phase.

And then adding the frequency of every harmonic that we want

to generate to that phase, so we can generate the instantaneous phase that way.
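
The phase regeneration just described can be sketched as a running sum. This is a minimal illustration, assuming NumPy, one harmonic track given as a hypothetical array of per-frame frequencies in Hz, and a fixed hop size in samples between frames. The function name and parameters are made up for this sketch, and an actual synthesizer may refine the increment, for example by averaging neighboring frame frequencies.

```python
import numpy as np

def accumulate_phase(freqs_hz, fs, hop_size, initial_phase=0.0):
    """Return the phase (in radians, not wrapped) at the start of each frame.

    Each frame advances the phase by 2*pi*f*hop_size/fs, where f is the
    frequency of the harmonic in that frame.
    """
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    increments = 2.0 * np.pi * freqs_hz * hop_size / fs
    # The phase of frame k is the initial phase plus all earlier increments.
    advanced = np.concatenate(([0.0], np.cumsum(increments[:-1])))
    return initial_phase + advanced

# Hypothetical track: a harmonic at 25 Hz, fs = 1000 Hz, hop of 10 samples,
# so the phase advances by a quarter of a cycle per frame.
phases = accumulate_phase([25.0, 25.0, 25.0, 25.0], fs=1000, hop_size=10)
```

Because only the frequencies are used, the regenerated phases stay consistent with whatever transformation was applied to the harmonics.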

9:46

Okay, let's show an example of that.

This is the sax phrase that we have heard before.

Let's listen to that again.

[SOUND] Okay, and then we have the analysis of it,

where we have analyzed the harmonics and

the stochastic component, and then we can change that.

In this particular case, we have done time scaling, so we have changed the timing, and

now, with the residual approximated as stochastic,

it's very easy to do time scaling of the stochastic and the harmonics.

And that's it,

we have just focused on the timescaling transformation in this particular example.

Let's listen to that, a very simple timescaling operation.

[SOUND] So, basically the only thing we have done is we

have compressed the first part of the signal by quite a bit.

In fact, it's like half of the duration and

the second half has been extended by twice as much.

So in fact, the overall duration of the sound remains the same and

of course as you can imagine, this is very flexible and

we can do a lot of envelopes that we can play around with.
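
A time scaling envelope of this kind can be sketched as a mapping from output time to input time that decides which analysis frame is read at each output frame. This is a minimal illustration, assuming NumPy and a hypothetical array `frames` with one row of model values (harmonic frequencies, magnitudes, or stochastic envelope) per analysis frame. The function name, the normalized breakpoints, and the nearest-frame reading are assumptions of this sketch.

```python
import numpy as np

def time_scale_frames(frames, in_times, out_times, n_out):
    """Read analysis frames following a time-mapping envelope.

    in_times and out_times are matching breakpoints in normalized time
    (0 = start, 1 = end); out_times must be increasing.
    """
    frames = np.asarray(frames, dtype=float)
    n_in = frames.shape[0]
    out_pos = np.linspace(0.0, 1.0, n_out)            # output time axis
    in_pos = np.interp(out_pos, out_times, in_times)  # where to read in the input
    idx = np.rint(in_pos * (n_in - 1)).astype(int)    # nearest analysis frame
    return frames[idx]

# Hypothetical example: the first half of the input is squeezed into the first
# quarter of the output and the rest is spread over the remaining time, so the
# overall duration stays the same.
frames = np.arange(9, dtype=float).reshape(9, 1)
scaled = time_scale_frames(frames, in_times=[0.0, 0.5, 1.0],
                           out_times=[0.0, 0.25, 1.0], n_out=9)
```

The same index sequence would be applied to the harmonic frequencies, the harmonic magnitudes, and the stochastic envelopes so that all components stay aligned in time.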

Okay, now, let's talk about the morphing using this harmonic plus stochastic model.

Okay, here we have simplified the block diagram.

We have two sounds, x1 and x2.

Now they're basically at the same level and basically,

what we're going to do is interpolate the two representations.

So from x1, we obtain the frequencies and amplitudes of

the harmonics and the stochastic approximation of the residual.

And for sound two we do the same thing, and then

what we are doing is interpolating these two sets of functions.

We are interpolating the frequencies and the magnitudes of the harmonics, and

we are interpolating the stochastic envelopes of the residual.

And then of course, we can synthesize back the output sound by

generating the stochastic component and the sinusoidal component.
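
The interpolation of the two representations can be sketched for a single frame as follows. This is a minimal illustration, assuming NumPy, that both sounds were analyzed with the same number of harmonics and the same stochastic envelope size, and that a single factor `alpha` controls everything. The function name and arguments are made up for this sketch, and a real morph might use separate, time-varying factors for frequencies, magnitudes, and the stochastic envelope.

```python
import numpy as np

def morph_frame(hfreq1, hmag1, stoc1, hfreq2, hmag2, stoc2, alpha):
    """Linearly interpolate one frame of two harmonic plus stochastic analyses.

    alpha = 0 returns the values of sound 1, alpha = 1 those of sound 2.
    """
    def mix(a, b):
        a = np.asarray(a, dtype=float)
        b = np.asarray(b, dtype=float)
        return (1.0 - alpha) * a + alpha * b

    return mix(hfreq1, hfreq2), mix(hmag1, hmag2), mix(stoc1, stoc2)

# Hypothetical frame: sound 2 is an octave above sound 1.
hfreq, hmag, stoc = morph_frame(
    [220.0, 440.0], [-20.0, -30.0], [-60.0, -70.0, -80.0],
    [440.0, 880.0], [-10.0, -40.0], [-50.0, -50.0, -50.0],
    alpha=0.5,
)
```

At 50% the fundamental of the result lies halfway between the two input fundamentals in Hz, and the harmonic magnitudes and the stochastic envelope take a shape between the two inputs.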

12:11

Let's look at a particular frame of a sound in which we have applied like

a 50% interpolation, so on top left is X1.

One particular frame, the harmonics

visualized so that the location is the frequency and

the height is the amplitude, and the blue one is sound one.

X2, the red one, is sound two, and then we can interpolate these two.

Basically, harmonic by harmonic we can interpolate and, as you see,

X2 has a different frequency.

It's a sound that has a higher pitch, so

the interpolated result has a pitch which is in between the two,

and has a shape, again, that is in between the two, okay.

So it's basically a 50% interpolation between the two sets of values, and

the stochastic component below follows the same idea.

So on X1, this shape is the approximation of the residual,

so the [INAUDIBLE] is where we actually have values has an envelope.

And then X2 is another envelope for X2 and the output is the interpolated envelope,

so it's a shape that is in between these two shapes.

13:55

So here, we see on the top the violin sound, and we can listen to it.

[SOUND] Okay, and it's analyzed, so we see the harmonics.

Well, what we see here, I think, is the first 40 harmonics or so, and then

the stochastic component of that.

And then, we see below the analysis of the soprano sound,

let's listen to this. [SOUND] Okay, where we see again the harmonics,

it's a higher pitch and the stochastic component in the background.

And now,

we can choose the interpolation values to go from one set of values to the other.

So it will progressively go from one sound to the other.

Let's listen to the result.

[SOUND] Of course, since the soprano sound is

higher than the violin, we hear this glissando.

But at the same time, we hear the evolution of the timbre, and

every single parameter is slowly changing from one to another.

Again, the number of possible functions to control these interpolations

is enormous, so we can play around quite a bit with these ideas.

15:13

That's all, so again, on Wikipedia there is not much, and there is not much that we can

refer to in terms of trying to understand some of these transformation aspects.

And in fact, a lot of these things are learned just by trying, so

I would encourage you to try these things and

develop your own intuition of what works and what doesn't.

And that's all, so

this has been the second theory lecture on sound transformations.

Of course, in this week, what is important is the programming and

the demonstration classes, even more than the theory lectures.

Because after all, it's a very applied topic, and it has to be grasped

from an application, and from an intuition and musical perspective.

So thank you very much for your attention and I will see you in the next lecture.

Bye-bye.