So what does the cross product look like? Well imagine you had table employee with these two columns and a table dependent with these two columns, and we do a, oops there's a, there we do a cross product of employee with dependent. That's all possible combinations of employees with all possible combinations of dependents. So we know there's going to be four in the output. And you can check to see that John is here twice, once for every instance in dependent and so on. All right. So now let's talk about join, and I probably maybe should have put a slide in here just about join in general, first. So really, we're talking about join. The most common instance of join that you're going to run into. And in fact, if we don't qualify it, this is what we mean, is what I'll call equi-join. And so equi-join is a join with an equality condition right here. Ok, so what does join do? Join says, for every record in R1 find me a corresponding record in R2 that satisfies some condition. And, in general, especially for those of you familiar with databases, you're going to be thinking of that in terms of equi-join, this would be for every course in finding student IDs that appear in particular courses. Okay. So joins on primary key and foreign key if you don't mind the jargon, are instances of equi-joins. They need not be. This is one point I want to make. So we're not talking too much about schemas. In this course, I'm more interested in teaching about the relational algebra and showing how it comes up rather than teaching about how to design a database in the first place. And the reason for that is that you often don't have the luxury, I made this point before, we don't have the luxury of a engineered schema. You don't have time to build one. You don't have- there isn't one handed to you in the first place. There perhaps isn't much need for one in that you're going to get your answer to a couple of questions and then you might sorta move on to something else. So there's not really a way to amortized the cost of developing the scheme in the first place and so on. So If you work with databases, most of the joins you're doing are going to be along these predefined relationships that are called foreign keys, but those need not exist in order to apply a join. Okay. And then I'll point out just this sort of syntactic note. You could write this two different ways in SQL. You'll sometimes see select star from R1 join R2 on some join condition. And other times you'll just say select star from R1, R2 where this condition is met. Now if I literally translated this into a brain-dead relational algebra plan. And we will talk about how to do this a little more mechanically in a bit. This is actually saying sort of saying well hey look, first build the cross product of R1 and R2, right? And then filter that cross product so that this condition is met. Okay. And this one is saying no, no, no, don't do that. Actually use the join operator. Rather than generating the cause product, but the optimizer, the databases are not that stupid. Right? They're smart enough to figure out that even in this case the right way to do this is to express a join. And so in practice there's no difference between these two different ways of spelling the same query. Okay. And in fact, to do a first approximation, two equivalent queries, different syntax but same semantics in SQL, there's not gonna be any difference between them. The optimizer doesn't care how you write your SQL. It's going to optimize the thing anyway, it's gonna turn into relational plan and manipulate that plan to guess the best way to evaluate that query, okay. And so I say to it first approximation because there are such things as query hints and other ways you can sort of tell the optimizer how you'd like the query to be evaluated. We're not going to talk about that because they're rarely important. And the other thing, it's not impossible to have two different queries that do the same thing that the optimizer can't figure out are actually equivalent. It's not impossible to get two different plans. But typically that won't be the case. And in this example, that doesn't matter at all. Okay. Join versus the where clause. I'll tend to write queries this way even when they come up, okay. Fine. So that's the most common and the simplest instance of this join operator. Okay, in fact I think I did us a disservice here by doing this on left and right. This actually kind of a nice example. This is the SQL equivalent of writing join, and this is the relational, sort of SQL equivalent of writing cross product followed by a selection. Right? They're the same. In fact, the only reason why in the algebra, you remember, the algebra is a formalism. Maybe you don't care so much about deriving new operators as long as you can express a thing it doesn't matter. There's so much work and there's so many good algorithms for implementing join that it deserves its place as a specific operation. We don't wanna have to write cross product followed by select. When we're actually talking about the efficient join algorithms okay. [MUSIC]