Module 3: Threads in Go. Topic 2.1: Basic Synchronization.

Synchronization is when multiple threads agree on the timing of an event. You have these global events, and synchronizing means that every thread views the execution of such an event at the same time, so they can all agree on when it happened. This is interesting because goroutines are basically threads, and one goroutine typically doesn't know anything about the timing of its fellow goroutines. If a goroutine is executing some code, it knows what line of code it is at, but it has no idea what line the other goroutines are at, and it doesn't care, because for the most part they are independent. Not completely, and we'll get to that, but largely independent.

Synchronization breaks that independence. It says: we're going to make some kind of global event that every goroutine sees at the same time. These events are important for restricting ordering, for ruling out some of the interleavings that are otherwise possible. We have a small example here where the relative ordering between two different threads is actually very important, so it's important for them to synchronize on certain events.

Just to explain this picture: I've got two little tables, each with a left column and a right column. The left column is one simple thread that has two instructions, x = 1 and x = x + 1. The right column is a second thread that does just one thing, print x.
What I'm showing in these two tables, one at the top and one at the bottom, is two different possible interleavings of the instructions. There are other interleavings, but I'm just showing two. In the top one, the print x happens after the first instruction of the left thread but before the second instruction, so at that point x equals one and it prints 1. In the bottom one, the print x happens after the second instruction of the left thread, so at that point x equals two and it prints 2. This is a race condition: the output depends on the interleaving of the instructions.

Remember that the interleaving is non-deterministic. Strictly speaking it is not: there is a Go runtime scheduler and an operating system scheduler, and they are deterministic programs. But from our point of view it is non-deterministic, because we don't know their algorithms and we don't know when things like interrupts will happen and alter the scheduling. So as programmers we can't tell which of these two interleavings will execute, and it might be a different one on every run: one time the top, one time the bottom, and so on.

To decide what the correct operation of this program is, we have to know what the programmer intended. Let's assume you want the print to occur after the update of x, after x = x + 1. That means the bottom interleaving is okay and the top one is bad. But since the relative ordering is all up to the scheduler, we can't control which of the two happens.
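To make the two tables concrete, here is a small sketch that replays the interleavings one at a time instead of relying on the scheduler. It is not a concurrent program: the function name `printedValue` and the idea of passing in the position of the print are assumptions made for illustration, and it assumes x starts at zero, which the lecture does not state.

```go
package main

import "fmt"

// printedValue replays one interleaving of the two tasks.
// printAfter is how many of task 1's instructions run before
// task 2 executes "print x" (0, 1, or 2).
// Hypothetical helper, not part of the lecture.
func printedValue(printAfter int) int {
	// Task 1's two instructions, in program order.
	task1 := []func(x int) int{
		func(x int) int { return 1 },     // x = 1
		func(x int) int { return x + 1 }, // x = x + 1
	}

	x := 0 // assumption: x starts at Go's zero value
	for i := 0; i < printAfter; i++ {
		x = task1[i](x)
	}
	return x // the value that "print x" would show at this point
}

func main() {
	// printAfter = 1 is the top table, printAfter = 2 is the bottom table.
	for n := 0; n <= 2; n++ {
		fmt.Printf("print after %d instruction(s) of task 1: x = %d\n", n, printedValue(n))
	}
}
```

The case with the print after one instruction gives 1 and the case with the print after two gives 2, matching the two tables. The case where the print runs first is one of the "other interleavings" the tables leave out.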
Synchronization allows us to control that. We're going to use synchronization to make the top interleaving impossible. To do it, we need some kind of global event (there are lots of different forms of this, which we'll get into in later modules) that both of these threads see happening at the same moment. Then we can have conditional execution: one of them can say, if this event has happened then I execute, or else I wait.

Here's my synchronization example. I've got the same two tasks. Task 1 on the left does x = 1 and x = x + 1. Task 2 on the right does a print x. But notice I've got this global event now. I haven't specified what the global event is; we'll get to that later. In task 1, after doing x = 1 and x = x + 1, it executes the global event, whatever it is. That event is seen by all tasks, including task 2, which says: if the global event has happened, then print x.

So the print x in task 2 can't happen until the global event has happened in task 1, and that has ruled out some of the possible interleavings. x = x + 1 has to happen before the global event, and print x has to happen after the global event, so print x has to wait until after x = x + 1. The ordering is restricted. We need synchronization in an example like this because of our intent: we needed the print x in task 2 to happen after certain instructions in task 1.
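The lecture deliberately leaves the global event unspecified. As one possible stand-in, the sketch below uses a channel that task 1 closes after updating x; task 2 blocks on the channel before reading x. The channel, the name `event`, and the wrapper function `run` are all assumptions chosen for illustration, not the mechanism the lecture has in mind.

```go
package main

import "fmt"

// run starts task 1 in a new goroutine and plays the role of task 2 itself.
// It returns the value task 2 sees after the event has happened.
// Hypothetical helper, not part of the lecture.
func run() int {
	x := 0
	event := make(chan struct{}) // assumed stand-in for the "global event"

	go func() { // task 1
		x = 1
		x = x + 1
		close(event) // the global event happens after both updates
	}()

	// Task 2: wait until the global event has happened, then read x.
	<-event
	return x
}

func main() {
	fmt.Println(run()) // prints 2
}
```

Because the receive cannot complete until the channel is closed, the read of x always comes after x = x + 1, so the interleaving that prints 1 is no longer possible.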
There are many examples like that, where certain events have to happen in a certain order, and we'll come up with more of them in a later module. Notice that this is the opposite of what concurrency gives us. The beauty of concurrency and parallelism is that the interleaving is arbitrary. If it's parallel, things can run at exactly the same time, but even if it's only concurrent, we don't care about the ordering, and that gives us a lot of advantages. It allows us to speed up the execution of the code: if a thread is ever blocked waiting, then since the order doesn't matter, the scheduler can move in another thread while the first one waits. So we get a lot of optimization, a lot of speed improvement, when we don't restrict the scheduling.

By using synchronization, we are restricting the scheduling. Every time we use synchronization we reduce the amount of possible concurrency and reduce the options for the scheduler, so the scheduler can't use the hardware as effectively. There might be times when nothing can execute because everything is waiting on a synchronization event. So in general synchronization is bad for performance and efficiency, but it is necessary in cases like this where you have to restrict the ordering, and there are a lot of tasks where that's the case. I also mentioned that synchronization can be complicated to use, although Go makes it pretty easy. It has pretty straightforward constructs, so it's easier to use in Go than it might be in, say, Pthreads.
Anyway, that is an example of synchronization: bad for performance, but necessary. Thank you.

Module 3: Threads in Go. Topic 2.2: Wait Groups.

WaitGroups are a particular type of synchronization that is common. Right now we'll talk a little bit about the sync package, which you have to import and which includes WaitGroups. The sync package has a lot of different synchronization types and methods built into it, and we're going to cover other parts of it later. For now, let's use WaitGroups to deal with the problem we had earlier, where main ended before the goroutine that it started got to finish.

What a sync.WaitGroup does is force a goroutine to wait for other goroutines. You can think of a WaitGroup as a group of goroutines that your goroutine is waiting on: your goroutine will not continue until all the goroutines in the WaitGroup have finished. This is what we need for the example from before. Remember, main printed a message and the new goroutine was supposed to print a message, but the new goroutine's message never got printed, because the main goroutine ran to completion before the new goroutine had a chance to execute. When main completed, the new goroutine was forced to exit and never got to finish.

What we want is to make the main goroutine wait until the new goroutine that it created exits; only then can the main goroutine continue. That way we make sure the new goroutine has the chance to actually execute before the main goroutine completes. So we want one goroutine, in this case the main goroutine, to wait on another goroutine.
A WaitGroup is a general object with a set of methods associated with it, and you can wait on a lot of different goroutines if you want to. You can have one goroutine waiting on 10 different goroutines until they all complete. In our example we're only going to wait on one, but understand that this generalizes to as many as you want.

The sync.WaitGroup object contains an internal counter. You may hear it compared to a counting semaphore, but just think of it as a counter that starts at zero. You increment the counter for each goroutine you want to wait for, so if there are three goroutines you want to wait for, you increment the counter to three. Then you decrement the counter when each goroutine completes: it goes from three to two, two to one, one to zero. The waiting goroutine can't continue until the counter is down to zero, which means that all of those goroutines have completed. That's the idea of how the sync.WaitGroup object works.

Let's take a look at a picture of it. I'm trying to depict two things. The left column is labeled Main thread; that blue vertical set of lines is the main thread. On the right is the Foo thread, which is the new goroutine that the main goroutine has created. I'm just showing a little bit of the code execution.

First, let me go down to the bottom and name the methods associated with WaitGroup, so we understand what they do. There's an Add method, a Done method, and a Wait method. The Add method increments the counter. Remember, there's this counter and you increment it once for every goroutine that you want to wait for. Add does that.
So you can call wg.Add(3), passing in the argument three, and it'll add three to the counter. Done decrements the counter. Done is what's executed at the end of each of the goroutines that the main goroutine is waiting on; each one needs to execute a Done at the end to decrement the counter. Once all of them have decremented the counter down to zero, the Wait will continue.

Wait is called inside main, and it blocks until the counter is equal to zero. If the counter is one, or anything greater than zero, that means some of the goroutines you're waiting on are still executing, so you have to wait. It just blocks until the counter eventually gets down to zero because all the goroutines have executed their Done. Then the Wait completes and main goes past it and continues.

If we look at the picture now, I'm just showing a sketch that highlights the important parts of the Main thread and the Foo thread. In Main, on the first line up there, I create a variable called wg, and it's a WaitGroup. The first thing I do with it, highlighted in red, is wg.Add(1), because I know I want my Main thread to wait for one goroutine. Then I create the goroutine with go foo, and in this case I pass it the WaitGroup. It needs the WaitGroup because the Foo thread is going to call Done on that WaitGroup to signify that it is complete.

But let's stick with the Main thread for a second. It creates the new thread with go foo and then it calls wg.Wait. At that point it blocks on that line until the Foo thread that we're waiting on actually completes. Now go over to the right, to the Foo thread.
It is created when go foo is called; you can see the red vertical line, that's the new Foo thread. The Foo thread does whatever it's supposed to do. I don't know what code it has and it doesn't really matter. At the end, whenever it's done, it calls wg.Done, and that decrements the counter inside the WaitGroup. In this example I added one to the counter in Main, so the counter is equal to one. Once the Foo thread is done, it decrements the counter by calling Done. At that point the counter in the WaitGroup is zero, so the wg.Wait call inside the main thread completes, and the main thread can continue with whatever code it has to execute after that Wait.

So this is an example of how WaitGroups are used. In this case we're only waiting on one goroutine, this Foo thread, but we could wait on any number just by adding. If we wanted to wait on 10, we'd add 10, and then each one would call Done.

Remember, as a programmer you've got to make sure to use these methods correctly. You've got to put the wg.Add in your Main, or in whatever goroutine is going to wait, before you wait, and the counter has to be incremented to the number of goroutines you want to wait for. You've also got to make sure that each of the new threads that you're waiting on calls the Done method on this WaitGroup at the end, when it is finished; it can't call it early. A lot of times people use a defer to make sure that it happens at the end. And you've got to make sure that the waiting goroutine, in this case Main, calls wg.Wait. If you don't call Wait, there won't be any waiting going on.
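The case of waiting on 10 goroutines, with defer used to guarantee the Done call, might look something like the sketch below. The function name `runWorkers`, the `results` slice, and the squaring work are placeholders invented for illustration; only Add, Done, and Wait come from the lecture.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts n goroutines and returns only after every one of
// them has called Done. Hypothetical helper, not part of the lecture.
func runWorkers(n int) []int {
	results := make([]int, n)

	var wg sync.WaitGroup
	wg.Add(n) // counter = number of goroutines to wait for

	for i := 0; i < n; i++ {
		go func(id int) {
			// defer makes sure Done runs when this goroutine returns,
			// not earlier.
			defer wg.Done()
			results[id] = id * id // placeholder work; each goroutine writes its own slot
		}(i)
	}

	wg.Wait() // blocks until the counter is back to zero
	return results
}

func main() {
	fmt.Println(runWorkers(10)) // prints [0 1 4 9 16 25 36 49 64 81]
}
```

Calling Add once with the full count before the loop is one reasonable choice; calling wg.Add(1) just before each go statement works as well, as long as every Add happens before the corresponding goroutine can call Done.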
It's up to the programmer to put the Add, the Done, and the Wait in the right places to make this whole mechanism work.

Here's a slightly modified version of that print example I had. In the simple version, Main creates Foo; you can see it here, in the middle of the Main function it says go foo, so it creates a new goroutine that executes Foo. Main is supposed to print "Main routine" and Foo is supposed to print "New routine".

I've added a few things. Foo still prints "New routine", but then it has this line wg.Done; it has to call Done, so that's what I added there. In Main, I declare the WaitGroup with var wg sync.WaitGroup. Then I call wg.Add(1) before I create the goroutine, because I know there's going to be one new goroutine that I'm waiting for. Then I create the goroutine with go foo, and after that I call wg.Wait so that Main actually waits for the goroutine to complete. Only after the Foo goroutine has completed will Main continue and print "Main routine".

In this way, Main won't end early and cause the Foo goroutine to never complete. Main won't print "Main routine" and won't exit until after the Foo goroutine has done its job and printed "New routine". Thank you.