Okay, so we've seen a bunch of different centrality measures, so let's take a look at an application which can begin to distinguish between them. And let me emphasize right from the start that this application is not designed to show that one centrality measure is always better than another, but just to show that in a particular context we can actually say something systematic about which ones seem to work better than others in making some predictions.

And the question that we're going to be looking at is diffusion, and in particular the first contact points in a process. So in this case there was a diffusion process that was started, and we have some idea of which points in the network were first contacted. Then we see what the diffusion looks like, and we have a bunch of different networks. So we can compare across the different networks and ask: how does the centrality of the first-contacted nodes predict how successful the eventual diffusion would be?

So let me put this in context. This is part of a joint project that I've been involved with for a number of years with Abhijit Banerjee, Arun G. Chandrasekhar, and Esther Duflo. What we were looking at in particular was the diffusion of microfinance in 75 rural villages in Karnataka, which is in southern India. These were villages that were fairly remote and initially isolated from outside loan availability, and a particular bank, BSS, entered 43 of these villages and offered microfinance to them.

And we went in and surveyed the villages and mapped out social networks before the lending agency went into these villages, and then we tracked microfinance participation over time. So we've got diffusion over time.

And we know the initial points that they touched, who they first told about microfinance. What the bank did in each village was identify a particular set of people that they should talk to first: shopkeepers, teachers, self-help group leaders, people that they thought might be well connected in the village. And then they told those people: look, we're going to come in and we're going to offer loans. We'll be back in a couple of weeks. Tell your friends about it and have them spread the news, and in a couple of weeks we'll come back and tell you more about it. And then over time they kept coming back every two weeks, and people could join the loan program and so forth.

And across these different villages, in some villages they would get an eventual participation rate in the loan program in, say, the mid-40s. About 44% was the highest of any village, and the lowest of any village was about 7%. And one thing we can ask is: did it matter which people they talked to in a village first?

So it might be that in village number 1 the teacher is a very central individual, but it happens that in village number 12 the teacher is not a very central individual. So if you talk to the teacher in both villages, then in one village you're talking to a very central individual, and in the other village you're talking to a not very central individual. Then does that make a difference in terms of what the eventual microfinance participation rate was? Does it make a difference in how much news got out? So we have 43 different villages, and we can look at how central those first-contacted nodes are, and we can use the different notions of centrality that we've looked at and see which ones work well and which ones don't.

So, just to picture Karnataka here. The slide got a little distorted, but this is the area of Karnataka here. It's all within a couple of hundred kilometers of Bangalore, in southwestern India.

And when we look at the different villages, in each village we mapped out a full series of networks. So this one is: if you had to borrow 50 rupees for a day, who would you borrow them from? So we've got a borrowing question. If we blow this up a little bit so you get a better picture, then each little collection of dots here is a household, and the arrows indicate who they said they would borrow from. So somebody in this household said they would borrow from somebody in this household, and so forth, so we end up with a borrowing network.

We asked a series of different questions; we actually have 13 different networks in total. Who do you go to temple with? Who would you go to for advice? Who comes to you to borrow kerosene? Who would you go to in an emergency for medical help? So we have a whole series of different questions, and we can then aggregate these up and say that two households are connected, meaning they could talk to each other, if they answered yes to any of these questions. We can work with the networks in different ways, but let's take an undirected version of this, where we aggregate things at the household level and say that two households are connected if they borrow kerosene from each other, or would go to each other for medical help, or would borrow rupees from each other, et cetera. Okay, so we've got networks.
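As a rough illustration of the aggregation step just described, here is a minimal Python sketch that takes the union of several directed "who would you go to" layers and turns it into one undirected household network. The household IDs, layer names, and the function name `aggregate_undirected` are made up for the example; this is not the project's actual data pipeline.

```python
from itertools import chain


def aggregate_undirected(layers):
    """Union several directed edge lists into one undirected adjacency map."""
    adjacency = {}
    for source, target in chain.from_iterable(layers.values()):
        if source == target:
            continue  # a household is not treated as its own neighbor
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


# Hypothetical toy data: three survey questions, four households.
layers = {
    "borrow_money": [("h1", "h2"), ("h3", "h2")],
    "borrow_kerosene": [("h2", "h1")],
    "medical_help": [("h4", "h3")],
}

network = aggregate_undirected(layers)
# Degree of a household = number of distinct households it is linked to.
degree = {household: len(neighbors) for household, neighbors in network.items()}
print(degree)
```

Two households end up linked if either one named the other in any layer, and naming each other in several layers still counts as a single link.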

We've got a lot of other information too: demographics, the microfinance participation over time, the number of households and their composition, age, gender, subcaste, religion, profession, education levels, a bunch of other things we can control for. We have caste information, wealth variables, participation rates in self-help groups, ration cards, voting behavior, and a whole series of other things. Okay?

So now we want to see whether centrality makes a difference in the diffusion of this loan program. And what we can begin with is, say, degree centrality. So if this were what we saw in a village, then this individual and this individual would be the most central individuals in the village, and if you hit those individuals, you would expect to reach more people just because they have higher degree. So one hypothesis is that in villages where the first-contacted individuals have more connections, so higher degree centrality, there should be a better spread of information about microfinance, and more people knowing should lead to higher participation. So basically, high degree centrality of the first nodes should mean high microfinance participation.
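To make the village-level quantity in this hypothesis concrete, here is a small Python sketch of the average degree of the first-contacted households in one village. The adjacency map, the leader list, and the function name `average_leader_degree` are hypothetical, chosen only for illustration.

```python
def average_leader_degree(adjacency, leaders):
    """Mean number of neighbors across the first-contacted households."""
    degrees = [len(adjacency.get(leader, ())) for leader in leaders]
    return sum(degrees) / len(degrees)


# Hypothetical toy village: an undirected network stored as neighbor sets.
adjacency = {
    "teacher": {"a", "b", "c"},
    "shopkeeper": {"a"},
    "a": {"teacher", "shopkeeper"},
    "b": {"teacher"},
    "c": {"teacher"},
}
leaders = ["teacher", "shopkeeper"]

print(average_leader_degree(adjacency, leaders))  # (3 + 1) / 2 = 2.0
```

Computing this number once per village gives one point per village to set against that village's eventual participation rate.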

Okay, so what do we see in the data? Here is the average degree of the first-contacted individuals, which we call leaders here. So this is the degree of the first-contacted teachers, self-help group leaders, and shopkeepers in the village. And here, on this axis, is the eventual participation rate of the village. Each one of these dots is a village. So for instance, this village had a 7% participation rate, so fairly low participation, and the average degree of the leaders was about 17. This village over here had an average degree of leaders of about 21 and a participation rate of 44%. If you fit a best-fit line through this, it actually doesn't look like there's any relationship, and if anything, the slope is negative.

So it doesn't appear as if degree centrality really captures what's going on. Okay, so maybe we need another centrality measure. Again, when we talked about eigenvector centrality, we realized that looking at degree doesn't tell a lot of the story, because it doesn't capture how well you are positioned in a network. If we look at eigenvector centrality, where your centrality is proportional to the sum of the centralities of your neighbors, then we are getting something which reflects this better connectedness, as we talked about in the last lecture.
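The definition above, centrality proportional to the sum of the neighbors' centralities, can be sketched with a simple power iteration. This is a minimal Python illustration on a made-up network, not the computation used in the study; the function name, the iteration settings, and the toy edges are all assumptions.

```python
def eigenvector_centrality(adjacency, iterations=500, tol=1e-12):
    """Approximate eigenvector centrality by power iteration."""
    score = {node: 1.0 for node in adjacency}
    for _ in range(iterations):
        # New score = own score + sum of neighbors' scores. Keeping the
        # own-score term (iterating on A + I) leaves the leading eigenvector
        # unchanged but avoids oscillation on bipartite graphs such as trees.
        updated = {
            node: score[node] + sum(score[nbr] for nbr in neighbors)
            for node, neighbors in adjacency.items()
        }
        norm = sum(value * value for value in updated.values()) ** 0.5
        updated = {node: value / norm for node, value in updated.items()}
        change = max(abs(updated[node] - score[node]) for node in adjacency)
        score = updated
        if change < tol:
            break
    return score


# Hypothetical toy village: "x" and "y" both have degree 2,
# but "x" is tied directly to the hub and "y" is not.
edges = [("hub", "x"), ("hub", "p"), ("hub", "q"), ("hub", "r"),
         ("x", "s"), ("r", "y"), ("y", "t")]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

centrality = eigenvector_centrality(adjacency)
for node in ("hub", "x", "y"):
    print(node, len(adjacency[node]), round(centrality[node], 3))
```

In this toy network "x" and "y" have the same degree, yet "x" gets the higher eigenvector centrality because its neighbor is the hub. That is the sense in which the measure captures position rather than just the number of connections.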

Okay, so let's have a look and see if eigenvector centrality does a better job. So, revisit our hypothesis: in villages where the first-contacted people have higher eigenvector centrality, there should be a better spread of information about microfinance, and more people knowing should lead to higher participation. So let's have a look. And indeed, when we now take the average eigenvector centrality of the leaders and plot that against the participation rate on the other axis, we get a significantly positive and strong relationship. So having better-placed leaders in terms of eigenvector centrality does a reasonably good job of predicting the eventual microfinance participation, whereas degree centrality didn't seem to pick things up.

And why is eigenvector centrality working better? The idea is that this communication is a repeated process. You tell your friends, they have to tell their friends, and so forth. So if you have well-positioned friends, and they have well-positioned friends, that is good for diffusion. And eigenvector centrality is measuring that, whereas degree centrality is not.

You can also do the regression: regress microfinance participation on a series of variables. If we regress microfinance participation on the eigenvector centrality of the leaders and the degree of the leaders, we get a positive and significant relationship between the eigenvector centrality of the leaders and microfinance participation, and a slightly negative and insignificant relationship for degree centrality. So indeed, eigenvector centrality seems to be doing a better job.

We can look at a bunch of different notions, so here we regress microfinance participation on different notions of centrality: eigenvector centrality, degree, closeness, Bonacich, betweenness. Here we're also keeping track of other things besides centrality. Some villages are going to be larger, so they have larger numbers of people. Some might have more people who participate in self-help groups, which means they're already more prone to be borrowing and lending from each other. We have variables on savings, we have caste variables, and we can control for a whole series of different variables to take some of those things out. And now degree turns out to be positive once we control for these variables, but still insignificant compared to its standard error. Eigenvector centrality is the one which turns out to be positive and significant; the other ones turn out not to be significant.

So this is just one application, but it's one where, if we have a very particular question in mind and we look at which of the centrality measures correlates with the eventual outcome, eigenvector centrality is the one that's correlating in a positive way, and the other ones are not correlating significantly once we've controlled for a bunch of other variables.
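The regressions described above can be sketched, very roughly, as ordinary least squares of village participation on the leaders' average eigenvector centrality and average degree. The Python below is only an illustration of the mechanics: the numbers are synthetic, constructed so that participation depends on eigenvector centrality alone, and the function name `ols` is an assumption. It does not compute standard errors, so it says nothing about significance, and it omits the control variables.

```python
def ols(rows, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    X = [[1.0, *row] for row in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Synthetic villages: (average leader eigenvector centrality, average leader degree).
leaders = [(0.02, 15.0), (0.04, 21.0), (0.06, 12.0), (0.08, 18.0), (0.10, 14.0)]
# Made-up outcome that, by construction, depends only on eigenvector centrality.
participation = [0.05 + 3.0 * eig for eig, _ in leaders]

intercept, b_eigenvector, b_degree = ols(leaders, participation)
print(round(intercept, 6), round(b_eigenvector, 6), round(b_degree, 6))
```

Because the outcome was built from eigenvector centrality only, the fit recovers a positive coefficient on it and a coefficient of essentially zero on degree. With real data you would add the village-level controls as extra columns and look at each coefficient relative to its standard error.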

So this just gives us an idea that these things are measuring different aspects of the network, and sometimes one can be a better predictor than another. Now, exactly what the causation is here, we can tell stories. I can explain that it probably has to do with communication, that better-connected friends lead to better communication and so forth, and that eigenvector centrality is picking that up. But this is observational data, so we're not sure what the causation is. We do see that different measures are picking up different things in the data, and that's going to be important.

Now again, I want to emphasize that this does not mean eigenvector centrality should be your only centrality measure. It just means that in this particular application, where we're looking at a very specific type of diffusion, it seemed to be a better correlate than these other standard measures of centrality. Betweenness, for example, seemed to do a little better at explaining what was possibly going on in the Florentine marriage data. So depending on which application you're looking at, it might demand a different centrality measure.
