0:00

So we're back and now let's take a very close look at Preferential Attachment.

And begin to understand how it generates links and degree distributions.

So, Preferential Attachment so the idea now is newborns again are going to form m

links to existing nodes. born one each period, so time just keeps

track of birth dates. And you could adjust it however you like.

so the important thing now is the formation process is going to be

different. So at sometime t, the number of links

that are going to be in the system are going to tm links, right there.

T nodes born it reached form m links so tm in total.

And the total degree is then 2 times tm, right?

So each link has two different nodes that are going to get to count it as a link.

So we have 2tm as a total number of, of degrees in the system at any point in

time. And the probability of attaching to a

given node is proportional to it's degree compared to the overall degree in the

society. So nodes that have very high degree, are

going to have higher chances. And so in particular, the probability of

attaching to node i, is going to be it's degree divided by 2tm.

Which is very different than the m over t that we had before.

So before, every node that was out there had an equal chance of getting a new

link. Now the chance of getting a new link is

proportional to your degree, compared to the overall degree in the society at that

time period, okay. So this is the proportional attachment

rule. The Preferential Attachment for which

this model is known. Okay, so let's do a mean field

approximation. So again we're going to do a continuous

time of approximation. We're going to be really finding the

degree of the distribution of expected degrees.

And then to check whether this actually matches the real degree distribution.

We could simulate the model, and see whether this is giving us a good,

approximation. There have been theorems now which show

that the mean field approximation of this particular model works well.

when we get to richer models that are different from Preferential attachment.

And the exponential model, then, we're not going to know whether or not the, the

actual degree distribution is well matched by this approximation.

But we can always check and simulate the model and then make sure that the, that

what we're getting out is the, the closed form solution from this kind of system is

not too different than what happens from the simulations.

Okay Distribution of Expected Degrees. So again, when you're born, you form m

links. So that's the initial starting condition.

How does your degree change? Well, you're getting new links at a rate

which is proportional to how your degree compares to 2t over m.

There's m new links being formed per unit of time.

Therefore this differential has this equation, and we can cross out the m's in

the numerator and the denominator. And so what we'll end up with is a simple

differential equation. Right, so we end up with the derivative

with the degree, with respect to time, is just looking like the, the actual degree

over 2t. And we have a starting condition,

differential equation, this one's even easier than the last differential

equation to solve, in fact. So the degree looks like m times the time

over the birth date raised to the one half power.

So square root of t over i. So a simple square root equation that

gives what the degree should look like over time.

Okay, so we've got a very simple equation for the degrees.

And we can compare that. So let's go back to the, situation we

looked at before where people were forming links uniformly at random.

Let's again have everybody form m equals 20 links when they're born.

And let's compare what happens when they're forming the links uniformly at

random compared to preferential attachment.

And what we see is that the Preferential Attachment is in some senses a steeper

curve here. And it ends up with more degree, more

high degree notes. The older ones are, are higher degree

than the, the uniform at random. And then the younger ones are actually

lower degree. It's harder for them to get new links,

because they have lower degree. And the older ones now have an added

advantage, not only of being older and they've gained things.

But they're, also you gain things proportional to how large you already

are. So the older ones are even easier to find

and easier to attach to, under Preferential Attachment.

So they get this extra bump where we see the extra bump in the curve.

So this is going to give us the fat tails.

So the fact that we've got this extra bit here, and this bit lower means we're

going to have more with high degree, more with low degree.

And that's going to generate the, the power law that we're looking for.

5:08

Okay so now, degrees in Preferential Attachment.

Let's figure out the degree distribution. We can do the same kind of calculation we

did before. Which are the nodes with degree less than

35 at some time. Well, we just solve for this curve.

Which are the ones that has the degree exactly 35.

It's going to be all the ones that are younger than that.

So born after that time period, whatever that time was.

And then we can figure what's the fraction that have that?

So we'll be able to figure out our distribution function Fof d, the same

technique we used before. But now with just a slightly different

equation that we're using to solve that same system.

So remember you're degree at time t is m times t over i to the half.

So the nodes with degree less than 35, are going to be the ones for which this

function is less than 35. So 35, has to be bigger than our m here

is 20, t equals 100. So, if you solve that out, and solve that

for i, you get that i has to be bigger than 32.7.

So now, this is 32.7 as opposed to the 42 something that we had before.

So, what's the fraction that have degree less than 35 at this time.

Well, it's going to be 100 minus 32.7 over 100, and then we can solve that out.

[BLANK_AUDIO] Right, so so roughly something like 68% of the nodes are

going to have that property. so when we look at this, just solving it,

generically. Again we've got this equation.

the ones that have degree less than d are going to be the ones where this, equation

is less than d. And therefore these are the i's that are

greater than t times m squared over d squared.

And now then to solve for f this, right, so we've got the, the fraction these are

the i's beyond some level. So going back we're want to find the,

what's the fraction of i's above some degree d.

Well we're just going to take the t minus these i's over t, that'll give us the F,

right. So F is just t minus this compared to t,

which works out to be 1 minus m squared over d squared.

So we have a very simple equation for the degree distribution.

So our degree distribution looks like 1 minus m over d squared.

And if you take the derivative of that and look at what's the density function

for the distribution generated by preferential attachment.

It's 2m squared over d cubed. So, this is proportional to d to the

minus 3, right? So, we've got 2m squared, so this is our

constant, times d to the minus 3. So, now we have a power law, we have

exactly something which looks like a power law.

And indeed when we take logs of each side we get the log of f of d looks like log

of 2m squared minus 3 log of d. So we've got a nice linear equation in,

in log, log plots. Which is exactly the, power law that we

were looking for. And in particular here, one thing that's

sort of interesting, when you look at the coefficient that came out, it's exactly

3. so why 3?

well, that came from the fact that the change in degree, over time had a 2 in

the denominator. And then when we do integration and so

forth we came out with a 3. but basically that's, is, is coming from

the fact that there's a certain number of links present in this society.

And if you want to vary this, you can actually produce different variations on

this coefficient. And the way that you can produce

different variations on this coefficient is just having the number of links being

formed at any point in time. either grow or shrink over time.

So you can have the population, the number of newborn nodes not be a constant

one per period but changing over time. And that'll allow you to control this,

this variable here. So, so a slightly richer variation of

Preferential Attachment can adjust that. There's actually an exercise in the book

I wrote that shows you how to do that. But basically the idea here is that you

can get different variations here by just changing the rate at which the system

grows. But the important thing is that the fact

that the older nodes are gaining, things that are richer, faster and faster time,

meant that, that we end up wth these fat tails and power law.

Okay, so we've, we've gone through a couple of different growing systems.

What we'll do next is try and enrich this a little bit more to span between these

sort of uniform at random and Preferential Attachment, and then that

will allow us to actually take these degree distributions.

And take them to data, and see which ones actually match the data.

So which is, is the actual Preferential Attachment a good match for the, for the

world wide web data? does, which ones match the romance

networks etcetera. So can we find variations on these models

that actually match the observed data.