In fact, in 2005, Salganik, Dodds and Watts carried out an interesting online experiment that mimicked some of the features we were discussing about YouTube video popularity. So let's very briefly take a look at this experiment. This was a controlled experiment involving a large number of participants by the standard of sociology studies back then: 14,341 participants. And there were 48 songs from unknown bands. What they did was show these 48 songs in different orders and presentations to these 14,000-plus participants, and then tally the download numbers. So they looked at popularity, measured by download numbers, in four different scenarios, depending on two different criteria. One is that they show the 48 songs either in random order or in descending order of the current download number. Then, for each of these two orderings, they either show the download numbers or hide them. So altogether there are two by two, four possible scenarios. As you go from random order to descending order of download number, the power of peer influence increases. And as you go from hiding the download numbers to directly showing the specific numbers, you also increase the peer influence. So you've got four scenarios with increasing peer influence.
And the question they asked is: what about the download numbers at the end of the experiment? Take the vector of those download numbers, m1 up to m48; we've got one such vector for each of the four scenarios. It turns out that more social influence always leads to a bigger spread in the download numbers. We can measure that by variance, or by some kind of l1 norm of this vector. It also increases the unpredictability of the success of these songs from unknown bands. Now, we can take the scenario with the least influence as the benchmark, because in that case the ordering of the songs is random and the download numbers are hidden.
And then you see that as we increase the social influence through the other three scenarios, it becomes less easy to predict which songs will be more popular. The really good ones and the really bad ones, the top and bottom of the vector in this benchmark scenario, pretty much stay the same. But what's in between often changes place a lot.
So now the question is, is there some kind of general theory about these phenomena, including the impact of knowing what others in front of you have done? We're going to try to build a very simple model of sequential decision making in a crowd. You can effectively think of different people forming a linear topology, a line. You are here, and you get to observe what happened before you. And then you make a decision for the future generations to observe. In these sequential decision making models, each person gets a private signal and then releases a public action. The private signals remain unknown to future users, but the public actions stay there for them to see. This creates an information dependence, because what you do will, in part, depend on the information you can collect by observing what others in front of you have done. And we will see that such an information dependence, in contrast to the independence assumption in rating averaging on Amazon, will lead to a cascade. It breaks down the wisdom of crowds and leads to a cascade where, no matter what the private signals are in future generations, they will always follow the same action that the crowd before them has taken. And we'll see that such cascades are actually easy to form. They can become very large, can be wrong, and yet can also be fragile and easily broken. The way we're going to talk about this model is through a thought experiment. I want to highlight that this thought experiment assumes that people react in a rational way. But in reality, people don't.
And that's one of the difficulties of understanding social networks: most of the tractable theories that we build assume that human beings behave the way the model asks them to behave. Having said that, and knowing that in reality this thought experiment sometimes doesn't work out the way the theory predicts, let's still go through it, because it offers interesting insights into chain reactions in a crowd.
So we've got people lined up here. One by one they go up to a table, and beneath the table there is a piece of paper with the number 0 or 1 written on it. That's the correct number. But you don't get to observe the correct number. The private signal each person gets is a probabilistic version of it. Whether the true number is 0 or 1, you flip a loaded coin: with probability p, the signal remains the true number, but with probability 1 minus p, it flips. So 0 becomes 1, 1 becomes 0, and your private signal is therefore incorrect. Now we're going to assume that the true number is 0 or 1 with equal chance, half and half. But actually that doesn't quite matter, as we'll see later in the formula. However, we do insist that the probability p of each private signal being correct is better than half, meaning it is more likely to get the correct number than the incorrect number. That we will assume. Let's also assume that these p's, which in general can be different for each individual, indexed by i, are all equal. In the next segment of the video, we'll talk briefly about what would happen if they are indeed different. Now, observing this private signal, each person thinks about what the correct number would be. Let's assume each person is a rational Bayesian agent who says: well, since there are two numbers, I guess I have to guess one.
Which one? Whichever number has a higher chance, more than a 50% chance, of being the true number: I will guess that and write it down on the blackboard. The first user may say, I guess the true number is 0; the second may say, I guess the true number is 1. The later people can observe the entire previous history of public actions, but not the private signals. So that's the setup of this very simple thought experiment, and we wonder what will happen.
Now, suppose you are the first person going up to the table and the blackboard. Let's assume that you see 1. That's your private signal. What would you guess? Well, knowing that p is strictly greater than 50%, you know it's more likely that the true number is indeed 1 as opposed to 0. So your public action is just 1. Similarly, by symmetry, if you receive 0, you write down 0. So this is the private signal and this is the public action, and the public action is the same as the private signal. Let's give the symbol X to the private signal and Y to the public action, and the index 1 to the first person. So Y1 = X1. That's very simple.
What about the second person? The second person, coming up to the table and the blackboard, sees on the blackboard a number, 0 or 1, from the first person. That's Y1. Now, even though the second person doesn't observe X1, she can reason through what the first person reasoned through. She knows that whatever Y1 is, it must be equal to X1, because that's what a rational person would do. Therefore the second person actually knows X1. So she has both X1 and X2 now. If these two are the same, clearly Y2, the public action of the second person, will be the same as them: if they are both 1, then Y2 is 1; if both are 0, Y2 is 0. What if they're different, X1 is not X2, 0 and 1 or 1 and 0? Then two equally reliable signals disagree, so each number is equally likely, and Y2 can basically be whatever. The second person says, well, then I just have to flip a coin and decide.
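The setup up to this point can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`private_signal`, `first_action`, `second_action`) and the value p = 0.7 are made up for the sketch and are not part of the lecture.

```python
import random

def private_signal(true_number, p, rng):
    """Loaded coin: return the true number with probability p, the flipped one otherwise."""
    return true_number if rng.random() < p else 1 - true_number

def first_action(x1):
    """The first person only has her own signal, so Y1 = X1."""
    return x1

def second_action(y1, x2, rng):
    """The second person infers X1 = Y1. If the two signals agree she follows them;
    if they disagree she breaks the tie with a fair coin."""
    if y1 == x2:
        return x2
    return rng.choice([0, 1])

if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed so the run is repeatable
    true_number = 1          # hidden from the participants
    p = 0.7                  # hypothetical signal accuracy, must exceed 0.5
    x1 = private_signal(true_number, p, rng)
    y1 = first_action(x1)
    x2 = private_signal(true_number, p, rng)
    y2 = second_action(y1, x2, rng)
    print("X1, Y1:", x1, y1)
    print("X2, Y2:", x2, y2)
```

The tie-breaking coin in `second_action` follows the variation of the experiment used here; other variations could resolve the tie differently.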
There are different variations of this experiment, but ours says the following. Again, X1 gives rise to Y1, and the second person knows X1 = Y1. If X1 and X2 are different, then she just flips a coin. In other words, Y2 may be 0 or may be 1, each with probability half. We can write this down, for example, as follows. The probability that Y2 = 1 given X2 = 1 and Y1 = 1 is 100%. Similarly, the probability that Y2 = 0 given both are 0 is 100%. But the probability that Y2 = 1 given X2 = 1 but Y1 = 0, or the other way around, is 50%; the other 50% is that Y2 is 0. All right, that's the second person. So far, so good.
What about the third person? Now, here comes the tricky part, and this is where, as we'll soon see, cascades start. The third person says: I'm standing in front of the blackboard, and I can see two public actions, Y1 and Y2. I also get my own private signal X3, so I've got three signals here. And I have to decide, through Bayesian analysis, what my Y3 should be. Now, if Y1 is not the same as Y2, that means, again, that X1 and X2 cannot possibly be the same, because if they were the same, Y1 and Y2 would be the same. Since X1 and X2 are not the same, I can pretty much ignore Y1 and Y2, because the last two users just got two different private signals, and that doesn't tell me anything. I am back in the same shoes as the first user. Here comes the beauty of the simplicity of this thought experiment: any time the public actions of an odd-numbered user and the following even-numbered user are different, the next person is back in the same shoes as that odd-numbered user, that is, back to the first user. But what if Y1 = Y2? Then I look at what X3 is. If that also equals X3, clearly my Y3 will just follow: all three numbers are the same. But what if Y1 = Y2 but differs from X3? Now, this is the scenario that is most interesting.
I've got two public actions from the crowd in front of me that conflict with my own private signal. Which should I trust? Should I trust the two public actions? Should I trust my own private signal? Or should I trust them equally and continue flipping coins? Well, let's see what would happen. The Bayesian thinking is basically the following, just as what we did in lecture five with Amazon and average ratings. We'll try to find the probability that the true number is 1 given that I have observed the following: Y1 is 1, Y2 is 1, and yet my X3 is 0. In shorthand notation, that's P(1 | 1, 1, 0): the probability that the true number is 1, conditioned on Y1 and Y2 both being 1 and yet my X3 being 0. Now, if this is bigger than 0.5, then, knowing that we are just guessing between two numbers, I will say the true number is 1. Then Y3 is 1, and that's amazing, because it says I will ignore my own private signal and go with the previous two decisions. But if this is less than 0.5, then I would still be following my own private signal and not depend on prior actions. If this is exactly half, I'll be flipping a coin. So let's find out: is it bigger than half or not?
By Bayes' rule, we know that this expression is the same as the following: the probability that the underlying number is 1, times the probability that, if the underlying number is 1, you get Y1 being 1, Y2 being 1 and X3 being 0, divided by P(1, 1, 0). One way to see this is to move the denominator to the other side of the equation. Then on one side we have the probability that you observe 1, 1, 0, meaning Y1 is 1, Y2 is 1, X3 is 0, times the probability, conditioned on that, that the underlying number is 1. That is nothing but the joint probability that the true number is 1 and you get 1, 1, 0 as Y1, Y2, X3. And that's also what the other side is saying, right?
It's the probability that the true number is 1 times the probability that, given the true number is 1, you observe 1, 1, 0. So then you can just move P(1, 1, 0) back over into the denominator. But we also know we can rewrite that denominator, by the law of total probability, as the probability that the true number is 1 times the probability that you observe 1, 1, 0 given the true number is 1, plus the other branch, which is the probability that the true number is 0 times the probability of observing 1, 1, 0 given the underlying number is 0. So now we are done. In order to find out whether this expression is bigger than half or not, we just have to evaluate this division.
So let's take a look at what the numbers look like. In the numerator, the probability that the underlying number is 1 is, by our assumption, half. Fine. That is multiplied by this probability: if the underlying number is 1, what's the chance you see 1, 1, 0? Well, there are two possibilities. One is that X1 and X2 are both 1 and then X3 is 0. The probability of that is p times p times (1 minus p). But don't forget there is another possibility: X1 is 1 and X2 is 0, but it so happens that the second user flips the coin to break the tie, and the coin says to pick 1. That happens with probability p times (1 minus p) for the first two users' signals, times another (1 minus p) for the third user seeing 0, times half, because the second user flips a fair coin and writes down 1 with probability half. So there are two possible scenarios leading to this observation, conditioned on the underlying true number being 1. That's the numerator. Similarly, you can write down what the denominator looks like, and then you can start cancelling terms like p times (1 minus p), and multiply both numerator and denominator by 4 to clear the fractions. Simplify the expression and you see you are left with (1 + p) over 3. Is this bigger than half or not?
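Written out, the calculation just described looks roughly as follows. The symbol T for the true number is introduced here only for the write-up and is not used in the lecture; the equal prior P(T=1) = P(T=0) = 1/2 is the assumption stated earlier.

```latex
\begin{align*}
P(T=1 \mid Y_1=1,\, Y_2=1,\, X_3=0)
  &= \frac{P(T=1)\,P(1,1,0 \mid T=1)}
          {P(T=1)\,P(1,1,0 \mid T=1) + P(T=0)\,P(1,1,0 \mid T=0)} \\[4pt]
P(1,1,0 \mid T=1) &= p \cdot p \cdot (1-p) \;+\; p\,(1-p)\cdot\tfrac{1}{2}\cdot(1-p) \\
P(1,1,0 \mid T=0) &= (1-p)(1-p)\,p \;+\; (1-p)\,p\cdot\tfrac{1}{2}\cdot p \\[4pt]
P(T=1 \mid 1,1,0)
  &= \frac{p + \tfrac{1}{2}(1-p)}
          {p + \tfrac{1}{2}(1-p) + (1-p) + \tfrac{1}{2}\,p}
   = \frac{\tfrac{1}{2}(1+p)}{\tfrac{3}{2}}
   = \frac{1+p}{3}
\end{align*}
```

In the last line the common factor of one half times p(1 - p) has been cancelled from the numerator and the denominator. In each conditional probability, the first term is the case where the first two private signals agree, and the second term is the case where they disagree and the second user's fair coin happens to come up 1.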
Whether (1 + p) over 3 is bigger than half depends on what p is. But we know p is bigger than half: the chance of seeing the correct number is the same for everyone, and bigger than half. That means this expression is bigger than (1 plus half) over 3, which equals half. So this expression is strictly bigger than 50%, and therefore Y3 would be 1. Even if you observe X3 being 0, different from Y1 and Y2, you still go with whatever Y1 and Y2 say. That means you ignore your own private signal. You have just started throwing out your own independent private signal and become dependent on the crowd before you. That means you have initiated a cascade. Because, as you can easily reason, the fourth user after you will see Y1, Y2, Y3 all as 1s. And even if she gets a 0 as X4, her private signal, she will still follow you and write down Y4 = 1 on the blackboard. From this point on, it will be 1, 1, 1, 1 all the way down. The information cascade starts. We say it's a cascade because, past this point, once three 1s show up, it will keep on going as 1. We say it's an information cascade because it starts when the third user ignores her own private signal and follows the information obtained by observing the previous two public actions. Of course, we were following this track of a cascade of 1s; it could also be a cascade of 0s. If the first two users both write down 0, the third user would write down 0 no matter what her own private signal X3 is.
So this starts the cascade, and we wonder how likely this is. What's the chance of starting a cascade? We know that as long as Y1 = Y2, a cascade starts with user three. So the probability of no cascade is the probability that the first two public actions, Y1 and Y2, are different. What's the chance that they are different? For example, if X1 is 1, that means Y1 is also 1. But if X2 is 0, then the two private signals are different, and Y2 may come out different from Y1.
So what's the probability of that? That is p times (1 minus p) times half. And what's the probability of the other scenario, where Y1 is 0 and X2 is 1? That probability is also p times (1 minus p) times half. Let's look at it again. The chance that X1 is 1 is p, given that the underlying true number is 1. We want Y2 to be different, which means we want X2 to be different: X2 is 0 with probability 1 minus p. And since they are different, the second user flips a coin, and with probability half actually writes down a Y2 that is different from Y1. That's why we write the half in this expression. By symmetry, instead of 1, 0 it could be 0, 1, with the same expression. So the total chance, assuming the underlying true number is 1, is the sum of these two: p times (1 minus p). But of course, the underlying number could be 0 too. If it is 0, carry through the exact same calculation; by symmetry, it's also p times (1 minus p). So even after unconditioning on the underlying true number, the probability of no cascade, meaning that Y1 and Y2 are different, is p times (1 minus p).
Now, what's the probability of a cascade of 1s, the up cascade, versus the probability of a cascade of 0s, the down cascade? Well, since the true number is equally likely to be 0 or 1, by symmetry we know they are both half of 1 minus the probability of no cascade. So the probability of having some cascade is 1 minus p times (1 minus p), and the probability of an up cascade, or of a down cascade, is that expression halved. So we have the probability of no cascade, the probability of having some cascade, and the probability of an up or a down cascade.
Now, this is the argument for the first two users, with the implication for the third user: whether a cascade starts there or not. What about running this for a longer period of time? For example, what's the chance that there is no cascade even after 2n users? Why do we use 2n? Just to simplify the expression a little bit.
We know that it's always a pair of users, an odd-numbered one followed by an even-numbered one, that may initiate a cascade, due to the peculiarity of this thought experiment's setup. If we used n users, we would have to write n over 2 many times down the road, so let's just write 2n. After 2 users, after 4 users, after 6 users, what's the chance that the next user will start a cascade? Well, the probability that there's no cascade even after 2n users is simply the probability that every such pair writes down different public actions. Y1 is different from Y2, which means the third user is back in the same shoes as the first user. Then in the next pair, Y3 is also different from Y4, and this keeps on going for n rounds. For each pair, that happens with probability p times (1 minus p). So we multiply this by itself n times, which means we raise it to the power n: the probability of no cascade after 2n people is (p(1 - p))^n. The probability of having some cascade equals 1 minus this expression. And the probability of an up cascade, or of a down cascade, meaning eventually all 1s or all 0s, after these n rounds is just half of the probability of having some cascade.
All right, now let's see what happens as n becomes large. As n becomes large, (p(1 - p))^n goes to 0, because p times (1 minus p) is some number less than 1. And therefore the chance of having some cascade goes to 1. In other words, just wait long enough: if people think like this, then in this simple thought experiment a cascade is guaranteed to happen, no matter what.
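One way to sanity-check these formulas is a small Monte Carlo simulation. The sketch below is only an illustration, not part of the lecture: the function names, the value p = 0.7, the trial count and the seed are all arbitrary choices. It assumes, as above, that the true number is 0 or 1 with equal chance and that ties are broken by a fair coin.

```python
import random

def no_cascade_probability(p, n):
    """Closed form from the argument above: (p * (1 - p)) ** n after 2n users."""
    return (p * (1 - p)) ** n

def simulate_pairs(p, n, rng):
    """Run up to n pairs of users for one hidden true number.
    Returns 'up', 'down', or 'none' (no cascade after 2n users)."""
    true_number = rng.choice([0, 1])  # equal prior on the true number
    for _ in range(n):
        x_odd = true_number if rng.random() < p else 1 - true_number
        x_even = true_number if rng.random() < p else 1 - true_number
        y_odd = x_odd  # an odd-numbered user outside a cascade follows her own signal
        # the even-numbered user follows agreement, flips a fair coin on a tie
        y_even = x_even if x_even == y_odd else rng.choice([0, 1])
        if y_odd == y_even:
            return "up" if y_odd == 1 else "down"  # the next user starts a cascade
    return "none"

def estimate(p, n, trials=50_000, seed=0):
    """Estimate the three outcome probabilities by repeated simulation."""
    rng = random.Random(seed)
    counts = {"none": 0, "up": 0, "down": 0}
    for _ in range(trials):
        counts[simulate_pairs(p, n, rng)] += 1
    return {k: v / trials for k, v in counts.items()}

if __name__ == "__main__":
    p = 0.7  # hypothetical signal accuracy
    for n in (1, 2, 5):
        print(n, "formula:", round(no_cascade_probability(p, n), 5),
              "simulated:", estimate(p, n))
```

The simulated fraction of "none" outcomes should land close to (p(1 - p))^n, and the "up" and "down" fractions should be roughly equal, since the true number is drawn with equal chance. If the true number were fixed instead, the split between up and down cascades would no longer be even, although their sum would be unchanged.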