
Hi, folks. So, let's take a look at the diameter of random graphs. In particular, remember we just had a theorem that we stated about the structure of a random graph, these Erdős–Rényi G(n,p) graphs. These are also known as Bernoulli random graphs or Poisson random graphs; we'll talk about some of the definitions in a bit.

But if you recall the basic statement of the theorem, it was that if the expected degree was growing at least with the log of n, so at least (1 + ε) log n, and if the degree wasn't becoming too large, so that the degree relative to n is vanishing, then with probability going to one, the ratio of the average distance to log n over log d goes to one, and the same actually holds for the diameter.

And if you recall what we just went through in terms of the basic ideas of how this worked: first of all, we looked at a tree calculation. If everybody had exactly this degree d and you're moving out from a root node in a tree-like form, then at distance l you would reach roughly (d − 1) to the l power other nodes. To reach all n − 1 others, l would have to be roughly on the order of log n over log d. And the difficulty with this was, well, not everybody would have the same degree.
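As a quick numerical sanity check of that tree calculation (my own sketch, with hypothetical values n = 1,000,000 and d = 100):

```python
import math

# In a tree where every node has degree roughly d, the number of nodes
# reached at distance l grows like d**l.  Solving d**l = n for l gives
# l = log(n) / log(d).
n = 1_000_000
d = 100

l = math.log(n) / math.log(d)
print(l)              # depth needed to cover the whole graph
print(d ** round(l))  # nodes reached at that depth, roughly n
```

With these numbers the required depth comes out to exactly 3: three steps of 100 friends each already cover a million nodes.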

So let's suppose now that instead of a tree where everybody had exactly the same degree, we had a tree where the degree was randomly varying. What we'll show first of all is that the fraction of nodes that have nearly the average degree is going to one. And this is going to be true as long as the expected degree is at least on the order of log n. In particular, what we're going to use to prove this are what are known as Chernoff bounds.

And let me just state what Chernoff bounds are. I'm not going to try to prove them here; you can find proofs in many different places. In fact, Wikipedia nowadays has a lot of nice statements of different math theorems, so you can find background on this. But the important thing here is this: suppose we're looking at a binomial random variable, so you're flipping coins with some probability p and counting how many come up heads. Imagine I'm looking at a particular node as it forms all of its links, and we ask how many links it ends up with. Well, the probability that the number of links it ends up with is between a third of the expected number and three times the expected number is at least one minus e raised to the minus expected number.

So if I'm expected to have a hundred friends, then the chance that I'll have somewhere between 100/3 and 300 is at least one minus e to the minus 100. Well, that's getting very close to one, right? So if you expect me to have 100 friends, then with probability very close to one I'm going to have at least 34 and at most 300.
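We can check this kind of concentration numerically. The sketch below (my own illustration, with hypothetical parameters: a binomial degree with mean 10, sampled 2,000 times) measures how often the degree lands within a factor of three of its expectation:

```python
import random

random.seed(0)

n_trials = 2000
n, p = 1000, 0.01          # binomial(n, p) degree with mean mu = n*p = 10
mu = n * p

within = 0
for _ in range(n_trials):
    # degree of one node: n coin flips, each success with probability p
    degree = sum(random.random() < p for _ in range(n))
    if mu / 3 <= degree <= 3 * mu:
        within += 1

# fraction of draws within [mu/3, 3*mu]; very close to 1 even for mean 10
print(within / n_trials)
```

The empirical fraction is already very close to one at mean 10, and the failure probability shrinks exponentially as the mean grows, which is the qualitative content of the bound.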

So Chernoff bounds say, "Okay, there's a very high probability that you're going to be within a factor of three of the expectation." Okay, so that's very useful. How is it useful?

Well, the probability that node i's degree is between the expected degree over three and three times the expected degree is at least one minus e to the minus d. If we then want every node's degree to be within a third to a factor of three of the expectation, we can just raise this to the nth power. We've got n different nodes; this expression is the probability that any one of them is within a factor of three, and raising it to the nth power gives an approximation for the probability that all the nodes are within a factor of three.

Why is that only an approximation? Well, the degrees of different nodes are not quite independent; they're approximately independent. So if we actually wanted to prove this theorem rigorously, there's a lot of detail that I'm going to omit.

Here, what we're going to do is go through the basic ideas of why this is true. You can check Bollobás's book if you want the gory details of the proof, or you can go through Chung and Lu's proof. So there are different proofs that will give you more detail; I'm going to go through just the basic point.

Okay, so the Chernoff bounds give us a high probability that every node's degree is within a factor of three of the expectation. What we've shown is that the probability that all degrees are within a factor of three of the expectation is at least this expression: one minus e to the minus d, raised to the nth power.

And then, given that d is at least (1 + ε) log n, if we plug that in for the expression up here, we end up with the probability of all degrees being within a factor of three being at least one minus one over n to the one plus epsilon, raised to the nth power. And using our approximation, we know that this is converging to one as n goes off to infinity.
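To see that convergence concretely, here is a small numerical sketch of my own, with ε = 0.5 as a hypothetical choice; (1 − n^(−(1+ε)))^n behaves like e^(−n^(−ε)), which tends to one:

```python
eps = 0.5

probs = []
for n in [10**2, 10**3, 10**4, 10**5, 10**6]:
    # lower bound on P(all n degrees within a factor of 3 of the mean)
    prob = (1 - n ** -(1 + eps)) ** n
    probs.append(prob)
    print(n, prob)
```

The bound climbs toward one as n grows, which is exactly what the theorem needs.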

So we know that the probability that all degrees are within a factor of three is getting closer and closer to one as n gets large. What we've shown, then, is that if the expected degree is at least on the order of log n, then the probability that all the degrees are within a factor of three of the expectation is going to one.

And so, consider the distance it takes to get from one node to all the others if we have a tree expanding with these degrees randomly distributed, but within a factor of three of the expectation with probability going to one. Then we can bound the distances: the shortest the distance could be is if every node had three times the average degree, and the longest it could be is if every node had only a third of the average degree. And so that gives us bounds on what this distance could be.

And in particular, for d that's growing, so that d isn't too small, the log of 3d and the log of d/3 are both basically proportional to the log of d for large d. Right? One is log d plus log 3, the other is log d minus log 3, and for large d the log 3 part is washing out. So this tells us that as the graph grows, the distance is going to be proportional to log n over log d. So whether we turn out to be off by a factor of three in the degree really doesn't matter much in the average distance calculation, because the distance is proportional to logarithms, and logarithms wash out these constant factors.
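Here's a quick numerical look (my own illustration) at how the factor of three washes out inside the logarithm:

```python
import math

# log(3d)/log(d) and log(d/3)/log(d) both approach 1 as d grows, so a
# factor-of-3 error in the degree barely moves log(n)/log(d).
for d in [10, 100, 10_000, 10**8]:
    upper = math.log(3 * d) / math.log(d)
    lower = math.log(d / 3) / math.log(d)
    print(d, upper, lower)
```

Already at d = 10^8 both ratios are within a few percent of one.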

Okay, so that's a very nice point. And we have this other part: some links may be doubling back. The thing that's going to help with the expansion is that these links are being put down at random, and most of the nodes aren't reached until the last step of the expansion process. After k steps, we've reached about d to the k nodes, and there are still n minus d to the k unreached. If we haven't yet expanded all the way outward, then there are still many more nodes out there than nodes that have already been reached. So if we look at a given node and ask where its links are going, most of its links, given that they're placed independently at random, are going to reach new nodes rather than nodes that have already been reached. It's not until you get very close to having reached all the nodes that you have to start worrying about links doubling back at any significant rate.
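A tiny sketch (hypothetical numbers, my own) of how little doubling back there is until the last step: with n = 10^8 nodes and degree d = 100, the chance a random link lands on an unreached node stays near one until the final step:

```python
n, d = 10**8, 100

for k in range(1, 5):
    reached = d ** k            # roughly d**k nodes reached after k steps
    frac_new = 1 - reached / n  # chance a random link hits an unreached node
    print(k, reached, frac_new)
```

At steps 1 and 2 essentially every link finds a new node; only at the very last step does doubling back matter at all.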

And just to get a feeling for this, let's suppose each person had 100 friends. At the first stage you would reach 100 people, at the second stage 10,000, then a million, then 100 million, and so forth. So things are actually expanding outwards, and until the very end there are always more nodes that haven't been reached than ones that have. And so the average distance is actually going to turn out to be very close to the diameter, because most of the nodes are reached at the last step. That is also going to be part of the proof, making sure that these things aren't doubling back at too high a rate.

So this is sort of a heuristic treatment of why these things are true. But what it does is give you an idea of how Chernoff bounds can be used in these kinds of proofs. Basically, proofs of properties of random graphs work by understanding different structures that underlie these graphs, and then proving that things happen with high probability using variations on laws of large numbers: showing that as things become very, very large, properties are going to hold with probability going to one. That's a basic structure that underlies a lot of proofs in random graph theory, and it gives us an idea of why we're seeing the kind of theorem that we saw, with diameter and average path length being proportional to log n over log d. So now let's take a look at some actual average path lengths and distances and see whether this seems to be accurate.
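Before looking at real data, we can sanity-check the theorem on a simulated G(n,p) graph. The sketch below is my own, with hypothetical parameters n = 500 and expected degree d = 20; it builds the random graph in pure Python and compares the average distance to log n / log d:

```python
import math
import random
from collections import deque

random.seed(1)

n, d = 500, 20
p = d / n  # G(n, p) with expected degree roughly d

# Build the random graph: each pair linked independently with probability p.
adj = [[] for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < p:
            adj[i].append(j)
            adj[j].append(i)

def bfs_distances(start):
    """Breadth-first-search distances from start; -1 marks unreachable nodes."""
    dist = [-1] * n
    dist[start] = 0
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == -1:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

# Average distance over all reachable ordered pairs.
total, pairs = 0, 0
for s in range(n):
    for dd in bfs_distances(s):
        if dd > 0:
            total += dd
            pairs += 1

avg_dist = total / pairs
predicted = math.log(n) / math.log(d)
print(avg_dist, predicted, avg_dist / predicted)
```

With d well above log n, the simulated average distance should come out close to the log n / log d prediction, up to the constant-factor slack the heuristic argument allows.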
