In this unit, we are going to learn about a functional way to do decomposition, which is called pattern matching. The task we are trying to solve is to find a general and convenient way to access heterogeneous data in a class hierarchy. The example we had was a class of expressions that had subclasses Number and Sum, and then possibly others: products, variables, and so on. Our first attempt, using classification and accessor functions, suffered from quadratic explosion. Our second attempt, using type tests and type casts, was neither safe nor pretty. The third attempt, using object-oriented decomposition, worked in some cases but had limitations. Another solution is to use functional decomposition with pattern matching. This relies on the observation that the only purpose of test and accessor functions is to reverse the construction process: we want to know which subclass was used to construct a value and what the arguments to the constructor were. That situation is so common that many functional languages, Scala included, automate it. Scala supports functional decomposition through case classes. A case class definition is similar to a normal class definition, except that it's preceded by the modifier case. Here we have our expression class hierarchy, and Number and Sum are now case classes. Those two classes are now empty, with no methods given in either one, so how can we access their members? That's where pattern matching comes in. Pattern matching is a generalization of switch from C/Java to class hierarchies. It's expressed in Scala using the keyword match. For instance, to define an eval function on expressions, we can do a pattern match on the expression in question. There are two cases: if the expression is a Number, then we return its numeric value, and if it's a Sum of expressions e1 and e2, then we evaluate e1 and evaluate e2 and add the two results.
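The case classes and the eval function just described might look like the following minimal sketch. The enclosing object ExprEval is an assumption added only so the snippet is self-contained, and the constructor parameter names are chosen to match the names used in the patterns.

```scala
object ExprEval {
  // Expression hierarchy: the subclasses are case classes with no methods of their own.
  trait Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  // eval reverses the construction process: each pattern identifies
  // which constructor built the value and names its arguments.
  def eval(e: Expr): Int = e match {
    case Number(n)   => n                   // a number evaluates to its numeric value
    case Sum(e1, e2) => eval(e1) + eval(e2) // a sum evaluates both operands and adds them
  }
}
```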
What you see here is that a pattern at the same time identifies a case, Number or Sum, and names the elements in that case: here n for the numeric value, or e1 and e2 for the left and right operands of the sum. In general, match is preceded by a selector expression and is followed by a sequence of cases that all take the form pattern => expression. Each case associates an expression with a pattern. If no pattern matches the value of the selector, an exception of class MatchError is thrown. Here's a more complicated pattern. That pattern is formed from a constructor, Sum, that gets applied to arguments. The first argument is a constructor, Number, which gets applied to a variable pattern, so that's the second form here. The second argument to Sum is a wildcard. It's essentially a don't-care pattern: it says it can match anything and we don't bother to give it a name. Besides these forms of patterns, you can also have constants in patterns, such as 1 or true, and you can also have type tests in patterns, such as n: Number. We could, for instance, write x: Number as the second argument here, and that would match any Number (so not another Sum) and name it x. Variables such as x or n always begin with a lowercase letter in patterns. If they started with an uppercase letter, they would be interpreted as constants. The same variable name can only appear once in a pattern, so Sum(x, x) is not a legal pattern. Names of constants should all begin with a capital letter, with the exception of the reserved words null, true, and false. Let's look at evaluation. If we have a match expression of this form, that expression matches the value of the selector e with the patterns p_1 to p_n in the order in which they are written. The whole match expression is then rewritten to the right-hand side of the first case where the pattern matches the selector.
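The pattern forms and the first-match rule described above can be tried out in a small sketch. The functions describe and numberOnly are hypothetical helpers that do not come from the lecture, and the enclosing object is only there to make the snippet self-contained.

```scala
object PatternForms {
  trait Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  // Cases are tried in the order written; the first matching pattern wins.
  def describe(e: Expr): String = e match {
    case Sum(Number(0), _) => "sum starting with the constant 0"   // constant 0 and wildcard _
    case Sum(Number(n), _) => s"sum starting with the number $n"   // nested constructor, variable n
    case Sum(_, x: Number) => s"sum ending with the number ${x.n}" // type pattern binding x
    case n: Number         => s"the number ${n.n}"                 // type pattern at the top level
    case _                 => "some other expression"              // wildcard matches anything
  }

  // Deliberately incomplete: no case for Sum, so a Sum selector throws a MatchError.
  def numberOnly(e: Expr): Int = e match {
    case Number(n) => n
  }
}
```

Note that swapping the first two cases of describe would make the constant case unreachable, because Sum(Number(n), _) also matches a sum whose left operand is Number(0).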
References to the variables bound in the pattern are replaced by the corresponding parts of the selector. We'll see how that works in an example shortly. What do patterns match? If you have a constructor pattern like C(p_1, ..., p_n), then that matches all values of type C, or a subtype, that have been constructed with arguments that in turn match the patterns p_1 to p_n. A variable pattern matches any value and binds the name of the variable to this value. In the same way, a wildcard pattern matches any value but does not bind any name to that value. A constant pattern c matches values that are equal to c in the sense of ==. If you have, let's say, 22, then that would match the value 22 and no other value. Finally, a type pattern like n: Number matches any value that is a Number and binds it to the name n. Here's an example. Let's evaluate the Sum of Number(1) and Number(2). As usual, we replace the call to eval by its right-hand side, where the parameter e of eval is replaced by the argument. That puts the argument in the selector position of the match that you see here. Now we match this selector with the patterns one by one. The first pattern does not match, because the selector is a Sum and the pattern is a Number, but the second pattern matches. We replace the match expression by the right-hand side of that case. Furthermore, e1 gets replaced by the corresponding part of the selector, that's Number(1), and e2 gets replaced by Number(2), because that matches this part of the selector. Now, to evaluate that expression further, we replace the first of the two calls to eval by its right-hand side. That gives Number(1) match, again with the same two cases, and once we've done that, we still have to add eval(Number(2)). Now we see that it's the first pattern that matches, so Number lines up here and the variable n gets replaced by the corresponding part of the selector, 1. That leads to 1 + eval(Number(2)).
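The rewriting steps traced so far can be written out as ordinary Scala expressions, one value per step. This is only a sketch: the names step0 to step4 and the enclosing object are assumptions, and the type ascriptions to Expr are added because the compiler would otherwise reject a Number pattern against a selector statically known to be a Sum.

```scala
object EvalTrace {
  trait Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  def eval(e: Expr): Int = e match {
    case Number(n)   => n
    case Sum(e1, e2) => eval(e1) + eval(e2)
  }

  // Step 0: the original call.
  val step0: Int = eval(Sum(Number(1), Number(2)))

  // Step 1: the call is replaced by eval's right-hand side, with e replaced by the argument.
  val step1: Int = (Sum(Number(1), Number(2)): Expr) match {
    case Number(n)   => n
    case Sum(e1, e2) => eval(e1) + eval(e2)
  }

  // Step 2: the second case matched; e1 and e2 are replaced by Number(1) and Number(2).
  val step2: Int = eval(Number(1)) + eval(Number(2))

  // Step 3: the first call to eval is replaced by its right-hand side.
  val step3: Int = ((Number(1): Expr) match {
    case Number(n)   => n
    case Sum(e1, e2) => eval(e1) + eval(e2)
  }) + eval(Number(2))

  // Step 4: the first case matched; n is replaced by 1.
  val step4: Int = 1 + eval(Number(2))
}
```

Every step denotes the same value, which is the point of the substitution model: rewriting changes the form of the expression, not its result.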
Proceeding in the same way, eval(Number(2)) gives 2, and the whole result becomes 3. Pattern matching functions can be defined anywhere. So far we've defined eval outside the expression type hierarchy, but we could also put it as a method inside the trait Expr. In that case, the match would be on the current value, on this, instead of on a given parameter. Otherwise, it would be exactly the same match. Here's an exercise for you. Write a function show that uses pattern matching to return the representation of a given expression as a string. We want to write this method, and you should fill in the triple question marks. I have added what we had so far into a worksheet. We have the expression class hierarchy and the eval function, and I have a sample expression, which is just 1 + 1, and we evaluate it. Now we have to define a show function. The template is quite obvious. We match on the expression. If it's a Number, we have to do one thing, and if it's a Sum, we have to do another thing. If it's a Number, what do we need to do? We just return the number as a string, because String is the result type of show, so it's n.toString. Remember, all values in Scala have a toString method that converts them to a string representation. Otherwise we have a Sum expression, Sum(e1, e2). What do we do in this case? We return a string that contains the results of the recursive calls of show on the two operands, with a plus in the middle. Let's put that to the test. Let's say show(expr). What do we get? We get 1 + 1, as expected. Here's a second exercise, which is a bit harder. Add case classes Var for variables x and Prod for products x * y, as discussed previously. Change your show function so that it also deals with products. Pay attention that you get the operator precedence right and, at the same time, use as few parentheses as possible. For instance, if you have a sum of products, then you can leave out the parentheses. That's a valid output.
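Before going on with the second exercise, here is one way the worksheet for the first exercise might look, as a sketch only. It also shows the variant mentioned earlier where eval is a method inside the trait and matches on this. The enclosing object ShowDemo and the exact spacing around the plus sign are assumptions.

```scala
object ShowDemo {
  trait Expr {
    // eval as a method: the selector is the current value, this.
    def eval: Int = this match {
      case Number(n)   => n
      case Sum(e1, e2) => e1.eval + e2.eval
    }
  }
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr

  // show is defined outside the hierarchy and matches on its parameter.
  def show(e: Expr): String = e match {
    case Number(n)   => n.toString                    // the number as a string
    case Sum(e1, e2) => s"${show(e1)} + ${show(e2)}"  // operands with a plus in the middle
  }

  // Sample expression: 1 + 1
  val expr: Expr = Sum(Number(1), Number(1))
}
```

With these definitions, ShowDemo.show(ShowDemo.expr) gives the string "1 + 1" and ShowDemo.expr.eval gives 2.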
But if you have a product of sums, then you need the parentheses around the sums, because otherwise it would change the meaning of the expression. Let's see what we have to add to the worksheet. First, the two case classes: here's the Var class and here's the Prod class. How do we change show? For the Var class, we just return the name of the variable. For the Prod class, we might try this: essentially treat it like the Sum class. Let's put it to the test. Define another test value, expr1 = Prod(expr, Var("x")), and show expr1. I forgot to make Var extend Expr. Good. Now we see something. We have the value expr1. But if we show it, then it's actually wrong, because it looks like we multiply x by 1 and add 1, where in fact we wanted to multiply x by 2. How do we fix that? Well, the problem is that we can't really use show here, because it doesn't put in the right parentheses. What I propose is that we call another function, call it showP, that puts the proper parentheses around the operands of a product. How do we define showP? showP starts like show, but says: if the expression is a Sum, then print it in parentheses, and otherwise just show the expression as usual. Now we see that indeed, here at the bottom, the parentheses are put around the right expression, 1 + 1.
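Putting the pieces of the second exercise together, the finished worksheet might look like the sketch below. The enclosing object, the parameter name of Var, and the spacing in the output strings are assumptions; show, showP, expr, and expr1 follow the names used in the walkthrough.

```scala
object ShowWithProducts {
  trait Expr
  case class Number(n: Int) extends Expr
  case class Sum(e1: Expr, e2: Expr) extends Expr
  case class Var(name: String) extends Expr
  case class Prod(e1: Expr, e2: Expr) extends Expr

  def show(e: Expr): String = e match {
    case Number(n)    => n.toString
    case Var(x)       => x                                // just the variable's name
    case Sum(e1, e2)  => s"${show(e1)} + ${show(e2)}"     // sums never need parentheses here
    case Prod(e1, e2) => s"${showP(e1)} * ${showP(e2)}"   // operands may need parentheses
  }

  // Show an operand of a product: parenthesize it only if it is a Sum.
  def showP(e: Expr): String = e match {
    case s: Sum => s"(${show(s)})"
    case _      => show(e)
  }

  val expr: Expr  = Sum(Number(1), Number(1))
  val expr1: Expr = Prod(expr, Var("x"))
}
```

Here show(expr1) yields "(1 + 1) * x", while a sum of products such as Sum(Prod(Number(2), Var("x")), Var("y")) is shown without parentheses as "2 * x + y".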