And this hybrid model between a random graph and a regular graph is called the Watts–Strogatz graph. It is parametrized by two numbers, c and p. Remember, we were using p to parametrize random graphs, as the probability of a link appearing, and c to parametrize the regular ring graph, as the deterministic degree of every node. Now we're going to use both c and p. We start constructing a Watts–Strogatz graph by constructing a regular ring graph. So each node has a lot of short-range neighbors, just like before; these are the nodes, and you can add a few more short-range links. And then here is the randomization step. Randomization has been used in the past in our lectures and will be used a couple more times in later lectures. In today's lecture, the randomization happens like this: take any short-range link existing in this regular graph. What I am going to do is look at this short-range link, delete it, and instead of this short-range link, draw with probability p a link between any pair of nodes, including pairs that are far apart. Okay. So instead of having this link, I delete it, and then I say: look at a random pair of nodes and, with probability p, add a link. And in the analysis that we will be doing, we'll make a slight modification and say that the existing links will still be there. Okay. That means we do not delete links; we just add links. This will make the analysis in the advanced-material part of the video a little easier. And therefore, if we have 100 short-range links, that gives us 100 chances to establish possible long-range links. So the question is: how much randomization do we need? Now we know that if there is too much randomization, it's going to look like a random graph, and therefore it will not have a large clustering coefficient.
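The "add-only" construction just described can be sketched in plain Python as follows. This is an illustrative sketch, not the lecture's code: the function name, the adjacency-dict representation, and the choice of one shortcut draw per existing short-range link are my own, following the description above (start from a ring of degree c, keep all links, and with probability p per short-range link add a link between a uniformly random pair of nodes).

```python
import random

def watts_strogatz_add(n, c, p, seed=0):
    """Sketch of the add-only Watts-Strogatz variant: a regular ring
    graph of degree c (c/2 neighbors on each side), where every
    short-range link stays, and each of the n*c/2 short-range links
    gives one chance, with probability p, to add a long-range link
    between a uniformly random pair of nodes."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular ring graph: link each node to its c/2 nearest
    # neighbors on each side.
    for v in range(n):
        for k in range(1, c // 2 + 1):
            u = (v + k) % n
            adj[v].add(u)
            adj[u].add(v)
    # Randomization step: one shortcut draw per short-range link.
    for _ in range(n * (c // 2)):
        if rng.random() < p:
            u, w = rng.sample(range(n), 2)
            adj[u].add(w)
            adj[w].add(u)
    return adj
```

With p = 0 this reduces to the regular ring graph; as p grows, long-range shortcuts accumulate on top of it.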
But if there is too little randomization, the number of long-range links will be so small that we stay very close to a regular ring graph. Then the clustering coefficient will be large enough, but L, the median distance between node pairs, will not be small enough. As it turns out, a little bit of randomization, a small p, suffices. Okay? So here I'm plotting p on a log scale: from 100% to ten percent to one percent and so on, down to 0.1%. As you can see, when p is sufficiently large, then if you look at these two curves, the clustering coefficient, capturing triad closure, will be too small. Whereas, on the other hand, if the randomization is too small, L, the median shortest path length, will still be too high. What we want is for the C curve to stay at a high spot without bending over, and for the L curve to bend over. So we want to be on one side for C and on the other side for L, and the intersection somewhere in between is a good range. Okay? Now, this is a numerical example for n = 600, that is 600 nodes, and c = 6, meaning that each node has degree six in the regular ring graph that starts the Watts–Strogatz construction: three neighbors on the left and three on the right. Okay? For this particular example, it turns out that p in the range of 0.1 to 0.01 is the right range. That gives you a still sufficiently high clustering coefficient and a sufficiently low shortest path length. And just to clarify, our y-axis here is the normalized version: we look at the clustering coefficient divided by the largest clustering coefficient, which is achieved when p is smallest, and we look at L normalized by the largest L. Now, is it a coincidence that, for this numerical example, p in the range from 0.1 to 0.01 suffices to give us both of the desirable features? It turns out that's not the case.
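The two quantities plotted against p can be computed directly on a small graph. The sketch below is my own illustration: it takes a graph as a Python dict of neighbor sets and returns the clustering coefficient in the lecture's sense (closed triads over connected triples) and the median shortest path length over all node pairs, via breadth-first search from every node.

```python
from collections import deque

def clustering_coefficient(adj):
    """Clustering coefficient as defined in the lecture: the fraction
    of connected triples (paths of length two) that are closed into
    triangles. Each triangle closes three triples, one per center."""
    triples = closed = 0
    for v, nbrs in adj.items():
        nbrs = list(nbrs)
        d = len(nbrs)
        triples += d * (d - 1) // 2  # triples centered at v
        for i in range(d):
            for j in range(i + 1, d):
                if nbrs[j] in adj[nbrs[i]]:
                    closed += 1      # this triple is closed
    return closed / triples

def median_shortest_path(adj):
    """Median, over all node pairs, of the shortest-path distance,
    computed by BFS from every node (each pair counted once)."""
    dists = []
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            v = q.popleft()
            for u in adj[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    q.append(u)
        dists.extend(d for v, d in dist.items() if v > s)
    dists.sort()
    return dists[len(dists) // 2]
```

Sweeping p, building a graph for each value, and plotting these two numbers (each normalized by its value at the smallest p) reproduces the shape of the two curves described above.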
We can actually write down, as we do in the advanced-material part of the section, the following expressions for C and L. It turns out C behaves a certain way for the Watts–Strogatz graph we just talked about. Okay. It is a function of both c and p, and the impact of p on the clustering coefficient C is roughly a factor like one over one plus p when p is small. On the other hand, L can be approximated in many ways. One of them says it is approximately like log(ncp) over c squared times p. So, as a function of n, it grows like log n, just as we desired. We'll wait until the advanced material to talk about these analytic formulas. For now, we want to intuitively understand why. Why is it that if we have a regular ring and then add some long-range links with a certain probability p, we can achieve both a large C and a small L? Why is that true intuitively? Well, it turns out that's because of our definition of L. As you recall, L is an extremal quantity: it is the median of the shortest distances. I am not talking about all the paths between two nodes. I'm talking about the shortest path between two nodes, and looking at the median of that quantity. In contrast, C, the clustering coefficient, is an average quantity. It is the percentage of triad closures out of all the connected triples in the entire graph, divided by three for normalization. It is not an extremal quantity; it is an average, taken across the entire graph. And therein lies the fundamental reason the Watts–Strogatz idea, the idea of a little bit of randomization, a few random long-range links, suffices. The reason is that a little randomization, leading to some long-range links, is not enough to destroy an average quantity, and therefore you still maintain the large clustering coefficient of a regular graph.
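The two approximations just quoted can be typeset roughly as follows. This is only a transcription of the spoken formulas, with constants suppressed; the advanced material gives the precise expressions.

```latex
\[
C(n, c, p) \;\approx\; \frac{C(n, c, 0)}{1 + p}
\quad (\text{small } p),
\qquad
L(n, c, p) \;\approx\; \frac{\log(ncp)}{c^{2}\, p}.
\]
```

Note that the first says the clustering coefficient decays only gently in p, while the second grows like \(\log n\) in the network size, which is exactly the small-world scaling we wanted.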
And yet it suffices to shorten some path between any two nodes, and that's enough to reduce the shortest distance; and L is only about the shortest distance. Most of the paths are still long, but that does not matter. We now have some shortcut long-range links, going from here to there, for example. So, instead of hopping all the way locally, I can take one shortcut that puts me very close to the destination. I don't mind if the other, long paths are still there, because L only talks about the median of the shortest paths. And therefore, you may wonder: since the Watts–Strogatz model works to explain the structural small world, why not define L as the median of the average distances between node pairs, rather than the median of the shortest distances? Then it would be a fair definition, fair in the sense that L and C would be standing on the same footing. Well, it turns out that we don't need to. Why don't we need to? Because we don't need all the paths, or the average path length, between two nodes to be short. When we route, as long as some short path exists, a smart enough routing method is going to pick out that path. The functionality implicitly modeled here is routing, and all we need is some short path. But that means that beneath this definition of L is a very basic assumption: you can actually have a routing algorithm that finds such a short path. Maybe not the shortest, but a very short path. Now, we just talked about a greedy routing method on a social network. We called that greedy social search, based on some composite distance metric scores. Let's say we run this kind of greedy routing method on a social graph; what is the expected path length? Let's denote this by little l. Big L is talking about the existence of a short path. Little l is talking about the ability to route, according to a greedy method, and end up with a short path. One is an existence result. The other is a computability, or searchability, or navigability result.
It says that local information, used in a greedy fashion, can lead you to some short path. If this is true, if this little l is also small, maybe not exactly as small as big L, then we're okay, because it says: you know what, I don't need to define L as the median of average distances, because short paths are going to be discovered, those paths will be found, by a practical, greedy, distributed routing method. The question is: is that true? Well, it turns out that it is, as observed in many cases. Now the question is: how can we reverse-engineer that searchability result? In other words, how can we have an explanatory model that generates an algorithmic small world?
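A minimal sketch of the greedy routing just described, on a ring-based graph: at each hop, the current node forwards to whichever of its neighbors is closest to the target, using only the ring (lattice) distance as its local metric. The ring distance here stands in for the lecture's composite social-distance scores; the function names and the stuck-detection rule are my own illustrative choices.

```python
def ring_dist(u, v, n):
    """Lattice distance around an n-node ring: the local metric a node
    can evaluate without any global knowledge of the graph."""
    d = abs(u - v)
    return min(d, n - d)

def greedy_route(adj, n, s, t):
    """Greedy routing from s to t: at each hop, move to the neighbor
    closest to the target in ring distance. Returns the hop count
    (little l for this pair), or None if greedy gets stuck, i.e. no
    neighbor is strictly closer to the target than the current node."""
    v, hops = s, 0
    while v != t and hops < n:
        nxt = min(adj[v], key=lambda u: ring_dist(u, t, n))
        if ring_dist(nxt, t, n) >= ring_dist(v, t, n):
            return None  # stuck: greedy cannot make progress
        v, hops = nxt, hops + 1
    return hops if v == t else None
```

Averaging the returned hop counts over many source-target pairs gives an estimate of little l, to compare against big L, the median shortest-path length.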