Module One: Functions and Organization. Topic 2.2: Guidelines for Functions. So, I'm going to give a few tips on making good functions, okay? Functions that are understandable, to facilitate debugging, other people understanding your code, working together with people, and so forth. And modification later: maybe you want to update your code, and you need to understand what you wrote. So to facilitate that, there are a few tips. Function naming is really important. Give functions a good name, for goodness' sake, some kind of a name that describes the behavior of the function. Your goal in the naming, if at all possible, is that the behavior can be understood at a glance. You just look at the name and you know what this thing does. Now, parameter naming counts too. You also want parameters that are well named, so you understand what they mean. So as an example, I'm showing two functions, just the first line of the declaration, right? The first function is called ProcessArray. It takes a, which is an integer slice, and it returns a float, and that's all I know about it right now, right? Now, let's look at the bottom one instead. These two functions can do exactly the same thing, okay? But they're declared a little bit differently. The second one is called ComputeRMS. It takes in a slice of floats called samples and it returns a float. So, notice that these two are compatible; say they do exactly the same thing. The first line is declared in much the same way, but their names are different: ProcessArray versus ComputeRMS. Now, remember these names are always domain dependent, okay? RMS stands for root mean square; for a time-varying signal it is something like an average, okay? Now, I don't want to go into what RMS is, but if you're working in this domain, you would know what RMS is.
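For reference, the two declarations being compared might look like the following in Go. This is only a sketch: the lecture shows just the first line of each declaration, so the bodies here are hypothetical, and float64 is an assumption because the transcript only says "float".

```go
package main

import (
	"fmt"
	"math"
)

// ProcessArray: the name and the parameter name say nothing about
// what is computed. (Body is an assumption; it computes an RMS.)
func ProcessArray(a []int) float64 {
	if len(a) == 0 {
		return 0
	}
	sum := 0.0
	for _, v := range a {
		sum += float64(v) * float64(v)
	}
	return math.Sqrt(sum / float64(len(a)))
}

// ComputeRMS returns the root mean square of samples.
// The name and the parameter name describe the behavior at a glance.
func ComputeRMS(samples []float64) float64 {
	if len(samples) == 0 {
		return 0
	}
	sumSquares := 0.0
	for _, s := range samples {
		sumSquares += s * s
	}
	return math.Sqrt(sumSquares / float64(len(samples)))
}

func main() {
	fmt.Println(ProcessArray([]int{3, 4}))
	fmt.Println(ComputeRMS([]float64{3, 4}))
}
```

Both calls compute the same value. Only the second declaration tells a reader in the signal-processing domain what that value is without opening the function body.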
So ComputeRMS has a distinct meaning to anybody working in this domain. You look at that and you know instantly what it is. ProcessArray can mean anything, right? Process how? Who knows? Now, also look at the name of the argument. For ProcessArray the argument is called a; that is completely generic. Who knows what that is? Whereas for ComputeRMS, I call it samples. Because guess what? It's a bunch of samples of a time-varying signal, right? So, the naming gives you an idea of what type of data is being passed, and I can look at it and understand what it's doing without knowing anything about the actual code inside the function, about how it's implemented. I can just look at the name and say, "Ah, that's what it is." So that's what you want. Now, notice that these names are going to be domain dependent, right? With ComputeRMS, you have to know what RMS is, but that's a shorthand that anybody who works on time-varying signals is going to know. So that's a good name.

Now, another thing about names that I skipped here is that you don't want them to be too long, okay? People can go overboard; they can make names so descriptive that they're just burdensome. How long is too long? I don't know. ProcessArray is getting about as long as I'd want a name to be, maybe a little longer than that. There's no hard limit, but you don't want to put too many words together, right? It gets ridiculous. So, naming is really important. In my classes I teach Python here at UCI, and I tell students this and they don't listen. They still name their variables X, and I'm like, what the heck? And they're like, "Professor Harris, what's wrong with the code?" I have no idea. I don't know what X and Y and Z are. How am I supposed to know?
And nobody can know that. Sure, maybe you don't care what the professor thinks, but you will one day work with a group of people and your boss will be like, "Okay, what is this?" And he or she will get upset. You know what I'm saying? If you want to work with people, naming is really important. And you yourself, when you look at the code later, like a month later, will find it much easier to understand your own code if you have good naming.

All right, another thing that you want in function definitions is functional cohesion. What that means is that the function should perform only one "operation", and note I put operation in quotes, because what is an operation? I don't mean one instruction, like a plus or a minus. The size of an operation, the complexity of it, really depends on the context, on what application you're making. So, to give an example, say you've got some geometry application. I don't know, it's doing things with points in three dimensions. Maybe you've got some functions like PointDist, for point distance, the distance between two points, a common thing you might do. DrawCircle, TriangleArea: these names are all things in the domain of geometry, and they're all good names, meaning you can look at the name and figure out what the function does, and they're not too long, okay? So you don't have to look inside the code; you can just look at the name. Now, what I mean by functional cohesion is that you would like each function to do basically one thing. PointDist computes one thing, the distance between two points. DrawCircle draws a circle. It does one thing that makes sense in the domain, geometry applications in this case. Now, let's say that you're making this geometry application and there's some instance where, under some conditions, you need to draw a circle and then you need to compute the area of a triangle.
You might have to do that, do the one thing and then the next. It would be a bad idea to put both those operations into the same function. You might say, "Well, I'm going to need to do both, so I'll just put them into the same function, and it can draw a circle and it can compute a triangle's area." One function that can do either, let's say. That would be a bad mistake, because now you've got a function that does two things, and the reason that's a mistake is that it doesn't make sense to the human. Meaning, how would you name such a function? DrawCircleComputeTriangleArea? It's much cleaner in your mind if the operations that functions perform are separate. Drawing a circle and computing a triangle's area are two separate things to most people who think about geometry, so you'd want to keep them as separate functions. If you start putting them together, it just doesn't make sense to the human, and you want it to make sense. Basically, when you write this code, you want it to be idiot-proof, okay? You've got to expect that a bunch of idiots are working with you, and they're going to be looking at your code and they don't understand a thing. So you've got to make this code so easy and obvious for them that they can't help but understand what you're doing, okay? That's what I'm going for here. You want it to be obvious, and putting different operations together into the same function is confusing. So you want to separate these operations into different functions if you can.

So another thing to do with functions to make them simpler is to reduce the number of parameters, okay? Limit the number of parameters that you take. More parameters just means more complication. If you're trying to understand what a function does, say it goes wrong, and say it takes 20 parameters, you've got to look at all 20 of those parameters, right? And which one could it be?
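To make the cohesion point concrete, here is a minimal Go sketch. Every name and signature in it is hypothetical: drawing is stubbed out as a print statement, and the triangle is reduced to a base and a height to keep the example short. It contrasts one function that does two unrelated things with two functions that each do one.

```go
package main

import "fmt"

// Low cohesion (hypothetical): one function that either "draws" a circle
// or computes a triangle's area, so it carries the arguments for both
// operations plus a flag to choose between them.
func DrawCircleOrTriangleArea(drawCircle bool, cx, cy, radius, base, height float64) float64 {
	if drawCircle {
		fmt.Printf("circle at (%g, %g) with radius %g\n", cx, cy, radius)
		return 0 // no meaningful result when drawing
	}
	return 0.5 * base * height
}

// Better cohesion: each function does one thing in the geometry domain
// and takes only the parameters that operation needs.

// DrawCircle stands in for real drawing code by printing a description.
func DrawCircle(cx, cy, radius float64) {
	fmt.Printf("circle at (%g, %g) with radius %g\n", cx, cy, radius)
}

// TriangleArea returns the area of a triangle from its base and height.
func TriangleArea(base, height float64) float64 {
	return 0.5 * base * height
}

func main() {
	DrawCircle(0, 0, 2)
	fmt.Println(TriangleArea(4, 3))
}
```

The combined function is hard to name, needs six parameters, and returns a value that is meaningless on one of its paths. The split versions need three and two parameters and each has an obvious name.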
It's much easier if you have fewer parameters to keep track of, because debugging generally requires tracing the data back and finding which of the parameters is causing the problem. If you have to trace that back, you don't want to trace back 20 different pieces of data; you'd like to look through five or so rather than 20. So, the fewer the better. Debugging is just generally harder when you have more parameters. Now, you've got to think about why it happens. Say you do make a function that has a lot of parameters. Why did that happen? It may be that the function you wrote had bad functional cohesion. Say, for instance, you made the mistake I talked about before: you want a function that can draw a circle and can also compute a triangle's area. These two operations require entirely different arguments. Drawing a circle requires information about the circle, basically its center and its radius. Computing a triangle's area requires information about the triangle, maybe its points, its coordinates, or something like that. So, if you make a function that does both of these operations, it's got to take all the arguments for both. You would tend to get more parameters. So if you want to reduce the number of parameters, you may want to look at the code and say, "Oh, wait a minute, I'm putting these two operations together. I can separate them and reduce the number of parameters that have to be passed to each individual function."

Okay, so, another way to reduce the number of parameters. Say you can't split it the way I just said; say this function does have good cohesion, okay? So splitting is not a thing you can do. One thing you might look into is grouping related arguments into structures. As an example, say you've got a TriangleArea function. Here is a bad solution, and when I say bad solution, I mean a bad solution for passing its parameters.
You could say its parameters are three points, okay? Because you need three points to define a triangle, right? And each point, let's say, is in the three-dimensional space we're working in. So each point is going to have three floats associated with it: X, Y, Z. In total, this TriangleArea could take nine different arguments, right? X, Y, Z for the first point, X, Y, Z for the next, X, Y, Z for the next. That is a lot of arguments, right? A better solution, not the best but better, is that I define a new structure called point, and this structure has three floats: X, Y, and Z. Once I define that, instead of passing nine different values to my TriangleArea, I can pass it three things, three points. Each point has three floats inside it, but when I'm looking at my declaration for TriangleArea, I only see three things, point one, point two, point three, and it makes more sense; it's easier to understand in my mind. Now, an even better solution that I didn't put up here is that TriangleArea takes one argument, which is a triangle. I can make another structure, a triangle type rather, and this triangle could have three points associated with it, and each point has three floats. So I could make a TriangleArea that just takes one argument, which is a triangle structure, right? That's even better. So anyway, by grouping related pieces of data into structures, you can reduce the number of arguments you have to pass to your function. Now, remember, don't force this. Only group pieces of data if they are actually related, right? You don't want to group completely random pieces of data into one structure, because then you get a structure that makes no logical sense. You don't want that, but often you can find the ones that are related and put them together. Thank you.