0:00

Okay, hi folks, we're back again, and we'll be talking a little bit more about

Random Networks. And in particular, we're going to start

looking at Growing Random Networks. So, situations where there are new nodes

entering over time. And, so this fits into our study of

network formation. So in terms of the course, we are now in

the second part of the course. And we're looking at Network formation

models. And again, we've looked at Static Random

Network models. And now we're going to be looking at

Dynamic, Random Network models. Where there's growing numbers of nodes

over time. And there's lots of examples where this

happens in the world. So, the prime example is citation

networks. New articles are born over time.

New articles can form links, by citing old articles.

Old articles can't cite new articles. so articles are going to accumulate links

over time. Newer articles are going to have fewer

links then older ones, so there's a natural form of heteragenania that forms

this way. Same thing with the web, in terms of web

pages. Collaboration networks and

co-authorships, you have older researchers, younger researchers older

researchers will have had more collaborations than younger ones.

Societies you known generally human situations you're going to have some

older. And some younger and they're going to

have different characteristics based, just simply, solely on age.

So there's a natural form of heterogeneity.

That comes in in form of age. So let's think a little bit about what

they add. why do we care about having growing

random networks? Well you know, one answer people say okay

what's more realistic. And one thing I want to emphasize just

from the start is when we build models, we know that we're not going to try and

capture everything in the world. And so, realism is not usually a good

reason for building a model to, to try and match reality.

the, the reason that we build models, is to try and use as few moving parts as

possible in order to pro, reproduce the world.

So the only reason we want to add this feature of growing random networks is not

because that's the way the world is. But because this richer model might

capture something in the real world that was not captured in the static models.

So again, you, you shouldn't judge models based on realism.

They're always wrong. the question is are they capturing

reasonable elements and so here. The natural form of heterogeneity versus

age is going to be useful in getting out more connected nodes and less connected

nodes. And, and starting to see, power laws and,

and fat tails. So we're going to get a natural form of

dynamics. And in particular, we're going to end up

with a natural way of getting different degree distribution without just building

them into the statistical distribution. So, we could just assume that there's a

distribution that has the, the, the features of reality.

But instead we want to do is see if build a model that will end up generating.

Features that look like the real world, and this sort of dynamics is going to

help quite a bit. Okay,[COUGH] , so in, in order to start

this, what we're going to do is start by taking Erdos-Renyi random network

situation. but instead, just enriching it to have

nodes being born over time. And so that'll be a simple benchmark,

that'll give us an idea of some of the techniques.

And then we can enrich it to have different kinds of formation processes

besides the uniform at random. Okay, so, this idea of growing and

uniformly at random, what we're going to have is each day is going to be a new

node's birthday. So nodes come in at time one, two, three,

four, etcetera. And when they are born, they will form a

number of links to existing nodes. And, to start, in terms of this

Erdos-Renyi kind of setting. What we're doing, we're going to assume

that each mode is going to form it's links, uniformly at random to the ones

that are already existing out there in society.

Okay so you're born, you decide who to link to, you form some numbers of links

and you form them to over the existing node.

To keep things simple, we're going to have the, the link formation process only

occur when the node is born. So you don't keep forming them over your

lifetime. You'll get new ones as new nodes are born

and they form their links. But a, a given note only forms it'[s

links it, it, it's outward links in some sense when it's born.

But it can accumulate more links later on as new nodes are, are forming their

links. Okay.

So in order to have some nodes that you can form links to when you're born what

we're going to do is each one is going to form m links at random.

So, we'll have m nodes that are already existing which are fully connected so

that when the first node is born it has somebody to connect to.

So the, the process is going to start out, we'll start it with a very simple

seed of already having some m nodes fully connected.

You could start with a whole series of different situations.

That's not going to effect the asymptotic limit of it, it might effect what it

looks like for the short periods of, of time, until you get to a very large time.

Okay, new node is born, it forms m links to existing nodes.

Say two, three, four, how, what, whatever, how, however many we want.

And what that means, is if I'm already born and, there's some time t then I'm

going to have a probability That's, that's roughly m of the, the

number of links being formed, compared to t of the existing number of, of nodes

that are, are already there, of, of getting one of the new links, right?

So, m over t will give, give us the probability.

So, the more nodes that are out there over time the less chance that any

particular node is going to get one the new links, right?

Okay, so very simple process, but, it, it's going to have some dynamics to it.

Okay. So, now let's think about sum node i,

that was born after the original m nodes that we started with that we fully

connect and, and before sum time t. Okay.

And now let's ask how many links has it, does it expect to have collected by time

t? Well, the first thing is, that it forms

some m links when it was born. So it's going to have m links for sure.

Then in terms of expectations, the next day after it was born, if it was born at

time i, then the date now is, i plus 1. And there's some new node which is born

and it's forming it's m links. There's already existing i plus 1 links

nodes out there. And so it has a chance m over i plus 1 of

getting these new links, okay? And then at time, the next state there's

i plus 2. It's going to, a number of new links are

formed, and so forth. Right?

So, so this is the overall sum of what it's going to get over time.

Right? So, we had the first m that it got when,

at birth, you expect it from the next node born, and so on.

So it has this sum.

[BLANK_AUDIO].

Now, if you want to look at a sum like this.

So, you look at this kind of sum, what is an approximation for this sum, well these

are harmonic numbers. So if you can if you remember your, your

number theory, these are what are known as harmonic numbers.

So, they're growing proportionately to i plus some number so it's growing

proportionately to the inverse of t. And if you sum a series like this an

approximation for this is going to be m times 1 plus log of t over i.

So, depending on, on how far you go out and when you started this series.

You're going to end up with sum that looks like this.

So, we have an approximation for what the expected degree is of any node, after

some time period t. Okay?

So, very simple calculation. And now what can do is we can ask what

does this generate in terms of the distribution of degrees in society.

Or the distribution of expected degrees in society at any point in time?

Okay, so let's do a simple calculation. How many nodes have an expected degree of

less than d at sometime t? Well, it's those for whom their expected

degrees at time t are less than some d. Right?

So if you say okay how many are going to have degree less than 100?

Then we can ask that. How many are going to have degrees less

than 35? We'll get some numbers.

So, at time t, it's going to be the i's for which this inequality holds.

Okay, so let's have a look at that inequality in more detail.

So, let's suppose that we did this calculation at time 100.

So, we've got here our t is 100. And let's suppose that each node at each

time is, is forming 20 new links. So, what does the degrees of different

nodes look like at time100? So here we have the birth date of the

node. So, here is birth date of the node and

this is then the equation that we had. So, so, so this is our equation that we

had in terms of m times 1 plus log t over i, right.

So, this equation right here is m times 1 plus log t over i.

10:12

So, that's plotted out here. And what that means is, as you can see

the older nodes, have gotten more links than the younger nodes.

So, the youngest node is only formed at 20 and not gotten many more.

Some of the slightly older nodes have gotten a little bit more than 20 because

they've happened to get some of the new ones.

These ones have been around for a longer time, they have gained more links.

And the, the, the reason you have curvature here, is always easier to get

links early on. And as more and more nodes are born, it's

harder and harder to get the new links. So, ones that are born later and later,

are going to have a harder time getting new links than the ones that we're born

earlier and could get one. So, you get a natural curvature here,

just due to that fact. Okay.

So, what do we get? well we can do the same thing for degrees

at time 200. And at degree time 200 we'll have a

slightly different curve. And you know, the ones that were born at

100, time 100, now I've had a chance to gain more links.

Everybody's gained more links. how much they've gained more, depends on

what their birth date was. And now we've got the, the 200th, node

born has just formed it's initial 20 links, and so forth.

So, we're going to get different degree distributions at different times.

We'll have a different, a, this is expected degrees over time.

we'll have a different distribution as time evolves.

Okay, so what does this distribution look like?

Well, we can ask then what are the nodes for instance, that have degree less than

35? What's the fraction of nodes?

So, let's go back to our time 100, and figure out what is a distribution

function. So, what is, you know, how many nodes?

What does F of d look like. So, how many nodes have degree less than

some d. So, what does F of 35 look like, alright?

We can do that kind of calculation. Okay, these are the nodes with degree

less than 35, are the ones are expected degree less than 35.

there's going to be some node, which has expected degree right at 35.

And basically the nodes with degree, expected degree less than 35, are

going to be the ones that were born afterwards, right?

So, these ones. So, if we want to figure out the fraction

of nodes that were born after, that have degree less than 35.

It's going to be the ones that were born after this, compared to the overall 100

that exist in the society so far. So, let's go through that calculation.

So, what we want is the actual amount that they have, the expected degree they

have. Remember that this is m1 plus log t over

i. So, here we have t is 100, m is 20.

So, these are the nodes for which this equation is less than 35.

Okay we've got that, and so if we solve that equation.

Then what do we end up with? We end up with, these are the i's for

which they're greater than, you know, you just take exponentials of both sides of

this, solve out for i. And what you'll get is these are the i's

for which, they were born after time 47.2.

So, basically the ones that were born 48 or later are going to have expected

degrees less than 35. The ones that were born 47 and before are

going to have expected degrees, bigger than 35.

So, this is 47.2, this point right here, and now if you want to ask, what's the

fraction of nodes that have degree less than 35?

[SOUND] Well, that's going to be the 47.2.

So, we're going to have 100 minus 47.2 over 100.

Right? So, this is going to work out to be,

0.628. Right.

So, the, the fraction of nodes that are you know, rough, roughly 63%, are

going to be the ones that have degree less than 35.

So, some where between 62 and 63% in this case have expected degrees less than 25.

Okay? So, given this process we can figure out

how degrees grow. We can figure out a degree distribution.

And, more generally for any degree, the ones that have expected degree less than

d at some time. Are going to be the ones for which i is

bigger than same solution as we just saw for the 35 t times e at the exponent of

minus d minus m over m. Okay.

So, if you go ahead and solve that out, what we'll end up with is a degree

distribution at time t. That says the fraction of nodes that have

an expected degree of less than d, is given by this formula 1 minus e minus to

the minus d minus m over m. Okay, so very simple set of calculations,

little bit of algebra invovled, some exponentiation, some summation of series.

But basically what we're able to do is now calculate what the degree

distribution looks like for this growing for this growing random network.

this is a exponential, negative exponential, distribution.

So what we have is the distribution of expected degrees is such that d minus m

is exponentially distributed with a mean of m.

what about the actual degrees? Well, a good approximation for large t,

we're going to have to you know, so, so here we did the calculations of expected

degrees. Right?

So we, we, we can say how many each node expected, and we got a nice curve.

Well some, some nodes are going to happen to get more, some are going to happen to

get few. So, the actual degrees of the true nodes

are not going to follow that really nice, smooth curve.

They're going to bounce around a bit. And what that means is that the

distribution can be slightly different than this distribution of expected

degrees. it turns out that this distribution of

expected degrees, is a good approximation of what the actual distribution of

degrees will look like. Some will end up pushed above, some will

end up pushed below. But on average, you'll still expect

around 60%, less than 35 and time 100 and so forth.

So, if you go through and actually calculate the true distribution function.

It's going to look as t becomes large, it's going to be well approximated by

this negative exponential function. Now actually doing that, you need some

careful law of large numbers argument I'm not going to do that here.

I, in the book there's some discussion of how you go about doing that.

It's fairly straightforward. Again, it's just making sure that the

variation in the actual distribution isn't too large.

As you get to a large t, the noise in the system is going to be well approximated

by the mean. So, so here we've got a distribution of a

growing random network things have worked out well.

Okay, so in terms of where we're going next we'll look at slightly richer

growing models. We'll also talk about mean field

approximations, other kinds of ways of calculating these things.

So, we'll at at another set of growing random network models that will allow us

to get richer and richer, the degree distributions out.