0:00

Now we're going to look at binary heaps, an ingenious and very simple data structure that's going to help us implement all the priority queue operations quickly. The idea of a binary heap is based on the idea of a complete binary tree. First of all, a binary tree is either empty or it's a node with links to left and right binary trees, and that's an example of a binary tree. A complete binary tree is one that's perfectly balanced, except possibly for the bottom level: the bottom level may be only partially filled, from the left, but otherwise all the levels are full. We'll see how that looks in just a second. The key property of a complete tree is that the height of a complete tree with n nodes is the biggest integer not exceeding log base two of n, and it's really easy to convince yourself that's true: if you add nodes one at a time, going from left to right on the bottom level, the height only increases when n is a power of two. Complete binary trees actually happen in nature. Here's an example of one that goes one, two, three, four levels at least, so sixteen branches at the end there. All right. Now, the way we're going to use complete binary trees to implement priority queues is, first of all, to associate information with each node.

We'll put our keys in the nodes, and we're also going to represent the tree with an array. When we start putting the keys in the nodes, we're going to impose one more condition, called heap ordering. The idea is that the parent's key is no smaller than its children's keys, and that's true for every node in the tree. For the array representation, all we do is start the indices at 1 (it's a little less calculation that way; we leave a[0] empty) and take the nodes in level order. So first we put the root, then the two nodes on the next level going from left to right, then all the nodes on the level after that, from left to right, and so forth. This is interesting because we can draw the tree to get more intuition about what's happening, but in the actual data structure representation we don't need any links at all. It's just an array. The way that we move around the tree is by doing arithmetic on the indices. So let's look at a few properties of binary heaps.
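That index arithmetic amounts to a couple of one-line helpers. Here's a minimal sketch (the class and method names are mine, not the lecture's; the formulas are the standard ones for 1-based level order):

```java
// Sketch of the 1-based index arithmetic for a binary heap stored in an array.
public class HeapIndex {
    // Parent of the node at index k (integer divide).
    static int parent(int k) { return k / 2; }

    // Children of the node at index k.
    static int leftChild(int k)  { return 2 * k; }
    static int rightChild(int k) { return 2 * k + 1; }

    public static void main(String[] args) {
        // Nodes at indices 10 and 11 share the parent at index 5.
        System.out.println(parent(10));                         // 5
        System.out.println(parent(11));                         // 5
        System.out.println(leftChild(5) + " " + rightChild(5)); // 10 11
    }
}
```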

So that's a binary heap: a complete binary tree, represented in an array, with keys in the nodes satisfying the heap order property. The first thing to notice is that a[1] is the largest key. It's larger than the keys in its two children, and they're larger than theirs, and so forth, so it's the largest key in the data structure. The other thing is, as I just mentioned, you can use the array indices to move through the tree. For example, if a node is at position k, at index k in the array, then its parent is at a[k/2], where that's integer divide. So the parents of, say, H and G are both N: H is at ten, G is at eleven, N is at five, and both ten over two and eleven over two, integer divide, are five. Going the other way, it's easy to see that the children of the node at k are at 2k and 2k+1. So we don't need explicit links at all to represent these data structures; we can just use array indices. That's the basic setup, the invariant that we're going to maintain in this data structure. Now what we're going to do is take a look at a couple of different scenarios where we violate that invariant temporarily and then fix it, and that's going to give us the flexibility we need to implement the priority queue operations.

One scenario, shown here, is when for whatever reason a child's key becomes larger than its parent's key. In this case we have an example where the node T changes value and becomes larger than its parent key P, so the heap order condition is satisfied everywhere except at this node. Well, it's easy to fix this one: all we do is exchange the key in the child with the key in the parent. After that exchange, with T up here and P down here, the heap order condition is satisfied at that node, because the parent was smaller, so both children are now smaller, and T, after it's exchanged up, is bigger than both its children. But the heap condition will still be violated above, because T is larger than its new parent S. So we do it again and exchange it with S. We move up the tree, exchanging the larger key with its smaller parent, until we get to a point where it's no larger than its parent. That restores the heap order along the path from the place where it was violated up to the root. You can think of that as kind of like the well-known Peter Principle: a node gets promoted to a level where it finally can't be better than its boss, its level of maximum incompetence. Implementing that in code is really easy. We call it the swim operation, because the key swims up toward the top. If we have a node at index k where the heap condition may be violated, then as long as we're not at the root and k's parent, a[k/2], is less than a[k], we just exchange the node with its parent and move up. That's the swim operation to eliminate a violation when a key value increases.
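The swim operation just described can be written out as follows. This is a runnable sketch: the helper names (less, exch) follow the lecture's conventions, but the wrapper class and the example keys are mine.

```java
// Runnable sketch of swim on a 1-based array of Comparable keys (a[0] unused).
public class SwimDemo {
    static boolean less(Comparable[] pq, int i, int j) {
        return pq[i].compareTo(pq[j]) < 0;
    }

    static void exch(Comparable[] pq, int i, int j) {
        Comparable t = pq[i]; pq[i] = pq[j]; pq[j] = t;
    }

    // Move the key at index k up until it is no larger than its parent.
    static void swim(Comparable[] pq, int k) {
        while (k > 1 && less(pq, k / 2, k)) {
            exch(pq, k / 2, k);
            k = k / 2;
        }
    }

    public static void main(String[] args) {
        // "T" has just been placed at index 6 and violates heap order
        // with its parent "P" at index 3.
        Comparable[] pq = { null, "S", "R", "P", "N", "O", "T" };
        swim(pq, 6);
        System.out.println(pq[1]); // prints T: the key swims to the root
    }
}
```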

So, for example, this gives us a way to insert a new element into a heap. What we do is add a new node at the end of the heap, one position over. Remember, the heap is represented in an array, positions one, two, three, and so forth, so the next empty position in the array is the place to put a new node.

And then we just declare that that's part of the heap. If that node is less than its parent, we're fine, but in general we have to check whether the heap condition is violated and exchange the node with its parent as long as the parent is smaller, and that's just performing the swim operation. So if n is the number of items in the heap, we increment n, store the new key in that position, and then perform the swim operation. That's a quick implementation of the insert operation. And notice that since it's just going from bottom to top in the heap, it takes at most 1 plus log base two of n compares.

Now there's another scenario, where a key becomes smaller. If for whatever reason a parent's key decreases, it might become smaller than one or both of its children. In this case, the value at position two has changed to H, and that's smaller, in this case, than both its children. So how do we fix that violation? Well, that one's also easy. We figure out which of the children is larger, in this case S, and we exchange that one with the one that's violating the condition. That moves the smaller key down. After the exchange, S is in position two, and it's bigger than both P and H, and the heap condition is violated only where H is now sitting. Again, we keep going until we get to the bottom, or get to a place where both children are smaller. That's maybe a little bit like what happens when a new boss is hired from the outside and the two subordinates struggle to take over that position; the boss gets demoted to its level of competence.

Here's the implementation, and again it's quite straightforward, using the index arithmetic to move around in the heap. This is called the sink operation, because the key goes down the heap. If we're at position k, then what we need to worry about is the nodes at 2k and 2k+1. The first thing is to find out which one is bigger, either 2k or 2k+1, and set j accordingly, so that after that statement j is the larger of the two children; and don't forget to check that we're not going off the end of the heap. Then, if the key at k is not less than either one of those, we're done. If it is, we exchange it with the larger of the two children and move down the heap. Again, just a few lines of code eliminates the violation when a key value in a heap decreases. And that one we're going to use to implement delete-the-maximum in a heap.
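The sink operation just described can be written out like this. It's a runnable sketch: the loop follows the lecture's logic (set j to the larger child, stop when the parent is no smaller), while the wrapper class and example keys are mine.

```java
// Runnable sketch of sink; n is the number of keys, stored in pq[1..n].
public class SinkDemo {
    static boolean less(Comparable[] pq, int i, int j) {
        return pq[i].compareTo(pq[j]) < 0;
    }

    static void exch(Comparable[] pq, int i, int j) {
        Comparable t = pq[i]; pq[i] = pq[j]; pq[j] = t;
    }

    // Move the key at index k down, exchanging with the larger child,
    // until it is no smaller than both children.
    static void sink(Comparable[] pq, int k, int n) {
        while (2 * k <= n) {
            int j = 2 * k;
            // Pick the larger of the two children (watch the end of the heap).
            if (j < n && less(pq, j, j + 1)) j++;
            if (!less(pq, k, j)) break;   // parent already no smaller: done
            exch(pq, k, j);
            k = j;
        }
    }

    public static void main(String[] args) {
        // "H" at index 2 has become smaller than both its children S and P.
        Comparable[] pq = { null, "T", "H", "R", "S", "P", "O", "N" };
        sink(pq, 2, 7);
        System.out.println(pq[2]); // prints S: the larger child moves up
    }
}
```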

So for delete-the-maximum, we have to do two things: the size of the heap has to go down by one, and we have to return the maximum. Well, we know the one we want to return is the one at the root, so we'll save that value away to return to the client. And then, since the size has to go down by one, the place to remove an element from the heap is at the end of the array, because that position is no longer going to be occupied. So we take that last element and replace the root with it; actually, we just exchange them, putting the old root value at the end, but it's no longer in the heap. Now, that element, which went from the bottom to the top, is most likely going to violate the heap order: it's probably smaller than one or both of its children. So we do a sink. In this case, to implement delMax, we save away the value at the root in max, sink the new root, eliminate loitering by nulling out the vacated position, and then return the max value. So that's an implementation of the delete-the-max operation for a heap, using sink, where a key value that decreases goes down in the heap.
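Putting insert and delMax together, a minimal version of the heap code just described might read as follows. This is a sketch under the lecture's conventions (1-based array, fixed capacity); resizing and error checking are omitted.

```java
// Minimal max-oriented priority queue sketch: insert uses swim, delMax uses sink.
public class MaxPQ {
    private Comparable[] pq;   // keys in pq[1..n]; pq[0] unused
    private int n = 0;

    public MaxPQ(int capacity) { pq = new Comparable[capacity + 1]; }

    public void insert(Comparable key) {
        pq[++n] = key;         // put the new key at the end of the heap
        swim(n);               // restore heap order on the path to the root
    }

    public Comparable delMax() {
        Comparable max = pq[1];
        exch(1, n--);          // exchange the root with the last key
        sink(1);               // restore heap order from the root down
        pq[n + 1] = null;      // null out the vacated position (no loitering)
        return max;
    }

    private void swim(int k) {
        while (k > 1 && less(k / 2, k)) { exch(k / 2, k); k = k / 2; }
    }

    private void sink(int k) {
        while (2 * k <= n) {
            int j = 2 * k;
            if (j < n && less(j, j + 1)) j++;
            if (!less(k, j)) break;
            exch(k, j);
            k = j;
        }
    }

    private boolean less(int i, int j) { return pq[i].compareTo(pq[j]) < 0; }
    private void exch(int i, int j) { Comparable t = pq[i]; pq[i] = pq[j]; pq[j] = t; }

    public static void main(String[] args) {
        MaxPQ pq = new MaxPQ(10);
        for (String s : new String[] { "P", "Q", "E", "X", "A" }) pq.insert(s);
        System.out.println(pq.delMax()); // prints X
        System.out.println(pq.delMax()); // prints Q
    }
}
```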

So let's just take a look at what happens with a real heap, with the demo, when we do these things, and you'll get a good feeling for how this data structure works. We're starting at a point where we have ten keys in the heap, in heap order. We've drawn the data structure with the links, so we have an intuition for what's going on, but all the program sees is the array, in gray at the bottom, where T is in position one, P and R are in positions two and three, and so forth. Now suppose that we're supposed to add S. To add it to the heap, it goes in the position at the end. We've added it to the heap just by incrementing n and putting it in there, but now we have to bring the heap order back. That key is larger than its parent, so we're going to swim it up, exchanging it with its parent as long as the parent is smaller. So first it goes up one exchange; it's still bigger than P, so we exchange again, and now we're done, because S is not bigger than T, and the heap order condition is satisfied everywhere in the heap. With just two exchanges, we insert that new element into the heap in this case.

Now suppose the next operation is remove-the-maximum. We're going to take T, exchange it with the last element, and then declare that last position to be no longer part of the heap. Now we have to bring the heap order back, because it might be violated at the root, so we invoke sink, where we exchange the node with the larger of its two children until we find a place where it's larger than both its children. S is the larger of the two children R and S, so we exchange with S; H is still smaller than both its children, so we promote the larger, which is P. Now H has no right child, just a left child, and it's larger than that one, so we're finished with the operation. We've removed the maximum, we still have our data structure heap-ordered, and our n keys are stored in the first n positions of the array.
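One way to convince yourself that heap order survives operations like these is a small invariant check. This helper is my own, not from the lecture: it just verifies, for every node, that the parent is no smaller than the child.

```java
// Invariant check: is pq[1..n] heap-ordered (no child larger than its parent)?
public class HeapCheck {
    static boolean isMaxHeap(Comparable[] pq, int n) {
        for (int k = 2; k <= n; k++) {
            if (pq[k / 2].compareTo(pq[k]) < 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        // A heap-ordered array, with keys in pq[1..6] and pq[0] unused.
        Comparable[] ordered = { null, "S", "R", "P", "N", "O", "H" };
        // A broken one: "H" at the root is smaller than its child "R".
        Comparable[] broken  = { null, "H", "R", "P" };
        System.out.println(isMaxHeap(ordered, 6)); // prints true
        System.out.println(isMaxHeap(broken, 3));  // prints false
    }
}
```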

Let's remove the maximum again. Again we take it out by exchanging, this time, G with the root, and then decrease the size of the heap by one, just taking that element out. Now the node at the root violates the heap order, so we have to exchange it with the larger of its two children, in this case R. Again, G is not larger than its two children, so we have to exchange it with the larger of the two, that's O, and now we're done; we've removed the largest again. Now suppose we insert S back into the heap. That's adding it at the end, which violates the heap order; we exchange it with its parent while the parent is smaller, and keep doing that until we get to a place where it's larger than both its children. In this case, S goes all the way up to the root, and it's all heap-ordered again. So that's a little survey of some operations on a heap, and you can see how every operation is done with just a few exchanges along the path from the bottom to the top or from the top to the bottom.

Okay, here's the complete Java implementation of a priority queue using the binary heap data structure. It's actually not so different from the elementary implementations that we looked at in the last section. Our representation is an array of keys and a size, the number of items in the heap. For simplicity, we'll show the code where the client gives the capacity of the heap; in an industrial-strength implementation we can use a resizing array, the same as we did for stacks and the other data structures where we use arrays. So we build a new array of keys, and we have to use an ugly cast because we can't have generic arrays in Java, so it's an array of Comparable; and we need one more than the capacity, to handle the convention that we don't use position zero. The priority queue operations are the insert and delMax that we showed in the previous slides; isEmpty just checks whether n is equal to zero. We have the swim and sink functions that we showed earlier, and then we have helper functions less and exch that access the array directly, so that the rest of the code doesn't have to.

That's a complete implementation of priority queues in Java, and this implementation by itself is extremely significant, because it's a really simple, optimal representation of the data, with only a little arithmetic on array indices. As you can see by looking at this table, it gives us a way to implement priority queues where both operations are guaranteed to run in log n time. Now, experts have worked to come up with improvements on this, and there are slight improvements possible. We can make the heap d-way rather than just two-way, and depending on the relative frequency of execution of the insert and delMax operations, that might work out better. There's an advanced data structure called a Fibonacci heap, where inserts are done in constant time and delete-the-max in log n time, amortized over all the operations. That one's generally too complicated to use in practice, but still, using theory as a guide, maybe there's a way to decrease costs a little bit from binary heaps. Of course, we cannot get down to constant time for all operations. Why?

Well, we can sort with a heap by inserting all the elements and then repeatedly deleting the maximum, and that sort would run in linear time if we had constant-time operations for both insert and delMax; but no compare-based sort can run in linear time, so constant time for both operations is impossible. For certain applications, though, we can get close to constant time for one or the other of the operations, and that'll be useful in different implementations.

Now, there's an important consideration that we have to bring up, related to the programming language. This is a more general consideration than we usually bring into focus in algorithms, but it's worth mentioning. We're assuming that the client doesn't get to change the keys while they're on the priority queue. And it's better not to just assume that; it's better to arrange for it in our implementations by using keys that are immutable, whose values don't change.

There are many reasons that programming languages provide the capability to build immutable keys, and this is a fine example of one; we'll talk more about that in a minute. The other things that we didn't talk about in the implementation: it should throw an exception if the client tries to delete from an empty priority queue, and an industrial-strength implementation should have a no-argument constructor and use a resizing array to account for gradual growth and shrinkage. Usually we provide two implementations, one that's max-oriented and one that's min-oriented, so that nobody gets confused; they're the same, except that less and greater are switched.
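For instance, a min-oriented version can be sketched by replacing less with greater, so that delMin promotes the smaller keys. This is illustrative code in the same style, not the lecture's exact class:

```java
// Min-oriented sketch: identical to the max-oriented heap except that
// the comparison is flipped (greater instead of less).
public class MinPQ {
    private Comparable[] pq;   // keys in pq[1..n]; pq[0] unused
    private int n = 0;

    public MinPQ(int capacity) { pq = new Comparable[capacity + 1]; }

    public void insert(Comparable key) {
        pq[++n] = key;
        int k = n;
        // swim: move up while the parent is larger
        while (k > 1 && greater(k / 2, k)) { exch(k / 2, k); k /= 2; }
    }

    public Comparable delMin() {
        Comparable min = pq[1];
        exch(1, n--);
        pq[n + 1] = null;      // avoid loitering
        int k = 1;
        // sink: exchange with the smaller child while the parent is larger
        while (2 * k <= n) {
            int j = 2 * k;
            if (j < n && greater(j, j + 1)) j++;
            if (!greater(k, j)) break;
            exch(k, j);
            k = j;
        }
        return min;
    }

    private boolean greater(int i, int j) { return pq[i].compareTo(pq[j]) > 0; }
    private void exch(int i, int j) { Comparable t = pq[i]; pq[i] = pq[j]; pq[j] = t; }

    public static void main(String[] args) {
        MinPQ pq = new MinPQ(10);
        for (String s : new String[] { "P", "Q", "E", "X", "A" }) pq.insert(s);
        System.out.println(pq.delMin()); // prints A
        System.out.println(pq.delMin()); // prints E
    }
}
```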

And we'll see later on that there are times when we want to expand the API and provide other operations, like removing an arbitrary item from the priority queue, or giving the client the capability of changing the priority of an item. Our sink and swim methods are good for making this happen, but we'll delay those implementations until we need them in a more complicated algorithm.

So what about immutability? Everything in Java is implemented as a data type, a set of values and operations on those values, and the idea of an immutable data type is that you can't change the value once it's created. That's kind of like when you create a literal value to be assigned to an integer: it has that value. Here's an example using a data type for vectors, one way to implement vectors. We put the word final on the class, meaning it can't be subclassed, so its instance methods can't be overridden. Not only that, the instance variables are private, so they can't be seen from the outside, and they don't change. A constructor for an immutable vector data type might take an array as its argument; that array has values stored in it, say doubles, and those can change, but what an immutable implementation does is copy those values into the local data-array instance variable, and then those values are not going to change. The instance methods won't change them, and the client can't change them, so the value stays the same.

Lots of data-type implementations in Java are immutable. String is immutable: when you create a string, its value doesn't change; if you want a new string, you have to create a new one, using concatenation or some other operation. The same goes for the wrapper types like Integer and Double, or Color, and lots of other things. On the other hand, sometimes the whole purpose of a data type is to maintain a changing value; good examples are a counter or a stack. You wouldn't put those things on a priority queue, because their values change, but the other ones you would.

The advantage of immutability (and maybe this isn't the place to really sell the advantages; that's more for a programming-languages course) is that it really simplifies debugging. We can have more confidence that our priority queue operations are going to work correctly if we know that the type of data on the priority queue is immutable. If the client could change the values, how would we know that the heap ordering is preserved? If we want the client to be able to change values, we'll provide methods for that purpose, as I just mentioned. And there are many other reasons that people use immutable data types. There is a disadvantage: you have to create a new object for every data-type value. But for a lot of applications, that disadvantage is not viewed as significant compared to the advantages.
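The immutable vector data type described above might look like the following. This is an illustrative reconstruction, not the lecture's exact code; the key ingredients are the final class, the private final instance variable, and the defensive copy in the constructor.

```java
// An immutable vector data type: the value can't change once created.
public final class Vector {                 // final: can't be subclassed
    private final double[] data;            // private, and never reassigned

    public Vector(double[] data) {
        // Defensive copy: later changes to the caller's array
        // can't change this Vector's value.
        this.data = new double[data.length];
        for (int i = 0; i < data.length; i++) this.data[i] = data[i];
    }

    public double get(int i) { return data[i]; }

    public static void main(String[] args) {
        double[] a = { 1.0, 2.0 };
        Vector v = new Vector(a);
        a[0] = 99.0;                        // mutate the caller's array...
        System.out.println(v.get(0));       // ...prints 1.0: the value is unchanged
    }
}
```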

Here's a quote from one of Java's architects, Josh Bloch: "Classes should be immutable unless there's a very good reason to make them mutable. If a class cannot be made immutable, you should still limit its mutability as much as possible." And many programmers live by that kind of precept. So that's a basic implementation of priority queues using the heap data structure, represented as an array.

Â