And the natural greedy algorithm that is optimal, is called the furthest in the

future algorithm. So what is the furthest in the future

algorithm? Well, it's exactly what you think it

would be. It's basically what seems like a good

idea at the moment you have to perform a eviction from the cache.

Basically you want to put off judgment day.

You want to put off the regret of evicting this particular piece of data as

long as possible. When are you going to regret evicting a

piece of data? Well, it's when it gets requested next.

So if we have four things in the cache, you know, one is one is requested next,

one is requested in seven time steps and one is requested in you know, 70 time

steps, that's the one you want to evict now because it will take the longest

until you actually regret that eviction. So for example, in the example on the

previous slide, you can check that the furthest in the future algorithm would in

fact evict the ones you want to evict, a and b,

not the ones that we evicted in the example, c and d.

Now at this point, many of you are probably justifiably scratching your

heads. You're wondering, you know, why is this

useful. It doesn't seem like this is what we

wanted. The objection, to this result being that the furthest in the future

algorithm is clairvoyant. Its very definition assumes that you know the

future, it assumes that at the moment that you have to make an eviction you're

aware of when each of the pieces data in the cache will be requested next.

But if you think for a minute about the motivating applications for sudding the

ultimate caching problem, this assumption simply doesn't hold, you simply do not

know the future, you simply do not know when each of the pieces of data in your

cache will be requested next. So this algorithm is not defined, it is

unimplementable. Despite that, this is still an extremely

useful result to know. Why.

Well, two reasons. First of all, this unimplementable

algorithm can never the less, serve as a guide line for practical.

Implementable algorithms. For example it naturally motivates the

LRU or least recently used caching algorithm.

So what you do in the LRU algorithm is that instead of looking forward in the

future, which you can't do generally, you look in the past.

And you say, well, let me guess that whatever's been requested recently, will

be requested again soon. Whatever hasn't been request been

requested for a long time, will continue to not be requested for a long time.

So, that sets as a proxy for the piece of data that's going to be referenced the

furthest down the future, you look for the one that was most recently referenced

the furthest back in the past. So that's the LRU algorithm.

And as long as data exhibits what's called locality of reference,

meaning whatever's being requested a lot in the recent past is also going to be

what's requested in the near future. Then LRU is going to approximate furthest

in the future. And indeed, LRU is in many applications,

the gold standard amongst practical implementable caching algorithms.

The second reason this theorem is useful in practice is because it served as an

idealized benchmark. A hypothetical perfect scenario against

which you can compare your latest and greatest cashing hereistic.

So for example, maybe you have a caching application and you start by implementing

the LRU least recently used caching algorithm and then as a sanity check you

probably want to go back later once you have hindsight you look at the last few

days of traces of logs of page requests and you say, how well did we do.

Let's look at how well our caching algorithm LRU did.

And let's look at how well we would have done had we known the future.

And hopefully, you're just a few percent away.

And then you can conclude that, yes indeed, the data seem to have locality

reference. Yes indeed, LRU is doing almost as well

as if we know the future, and we can proceed.

On the other hand, if you go back through the last few days of logs, and you find

that your caching algorithm is doing much worse than furthest in the future, then

it's back to the drawing board with respect to your caching algorithm.

You should work harder, understand the data better, and come up with a smarter

heuristic. So for almost all of the greedy

algorithms that we cover in this course, I'm going to explain to you why they are

correct. I'm going to prove it rigorously.

this algorithm is an exception. I'm actually not going to prove this

theorem for you. the way it works is by what's called an

exchange argument. So again, you may not have seen any

examples, but you will soon. but the exchange argument to prove the

latter result, as far as I know it's pretty tricky, believe it or not.

Even though the algorithm is natural, you might think this result feels a little

self-evident. Try to prove it rigorously.

Not easy. Not easy at all.

Indeed if you look at a textbook and say operating systems or a field like that,

generally you'll see a description of this algorithm.

You'll see the claim that it's optimal but you won't find the proof.

some algorithms textbooks, for example Algorithm Design by Kleinberg and Tardos,

do include the proof of this theorem. Those of you that are interested, I

challenge you to prove it yourself without looking it up on the web or in a

textbook. I think if you try to prove it you'll

appreciate the subtleties that come up in understanding whether greedy algorithms

are correct or not. In the best case scenario, I would love.

Love to see, a simple proof of this theorem, something that I could explain

in a class like this, in say five minutes or less,

that would be amazing.