This lecture is huge. However, it does not introduce any new Python tools, and so, you can consider this, to be optional, if you like. In it, I am going to be developing a program from the beginning. I'm going to figure out, what information I need to read, and I need to figure out how I'm going to start and what functions I'm going to write. This is, meant to give you a taste, of the thought process that I go through when I write a program. This file contains a bunch of assignment one grades. There's a header followed by a blank line. Followed by a list of student number and grade on assignment one. We would like to know the distribution of these grades. So we're going to keep track of categories. How many ended up in the range zero to nine, ten to nineteen, twenty to 29, and so on. And we will have also a category for the people who got 100. We ultimately, in this program, want to create a new file containing this information. And also, one star for each grade that was in that range. So we see a 77.5, that means that in our output file, we'd like a star there. There's a 37.5, that's in that range. We see a 62.5. 72.5, so we have two numbers in the 70 range so far. There's 100, we've got a 55, a 95, something in the eighties, something in the nineties, and so on. Eventually we will have one star, for each number in the file, so that we can see the distribution of grades. In order to get a program that does this, we need to do the following steps. And step one is going to be, read the grades, and this is a natural list into a, a list. It's a list situation. So we're going to read the grades into a list. That's step one. Step two is going to be, Count the number of grades per range. And step three is going to be. Write the histogram to the file. Whichever file the user picks. This process of writing down the steps at a very high level is called top down design. When we first get a problem, it's often a bit overwhelming wondering how we're going to accomplish this tas, the whole task. So we break it down into steps. Once we decide on how we're going to store our data. We can do these steps in any order that we like. Because they are, roughly independent, And we can test each of them separately.. Let's start a new program. And let's take some notes. I'll get those scribbles down into text. Writing stuff out by hand like we did here, just the example of the output that we want and then the list of steps that we think we need to take in order to get this program to work, actually has a huge benefit. It allows you to step away from the programming environment and just think about how you might solve the problem. This is essential. Programs are just expressions of algorithms that you have in your head. And if you figured out what you need to do, you'll have a much easier time writing the code. Let's make a note about the, the format that we want. We want zero to nine and then a star or two, ten to nineteen. We want those to line up twenty to 29. Maybe there are that many, and so on all the way up to 90 and 99. Whoops we want that ended and then the hundreds. Now that I have that, I can get rid of this scribbles. We're going to be reading from files, and they're going to be files that the user selects. We want this to work with any assignment file of this format, not just this particular file. And so. Learn to use Tkinter's file dialogue. We'll start by asking the user for a file, and then just reading the file. We now have variable A1 file name being what PK enter dot file dialogue dot ask open file return. So, let's open the file. We're going to for line an A1 file, just, print the line. Just make sure that we are. In good shape. This process of checking frequently about how our program is doing is a very good habit to get in to because it allows you to catch syntax errors and logic errors early before you've made a mess of things. So we run the module. And we see invalid syntax. Oh, yes. Wrong kind of print statement. That was in my old Python 2.7 days. Save that and try one more time. I've made a shorter grades file just so that we don't have to work with so many numbers at once. And I have a type error. The problem here is that I'm using ASCOPE in file instead of ASCOPE, ASCOPE in file name. Ascope in file actually opens the file for you. But, I like to do it in two steps because I think it's more explicit for people who are just learning how to program. I'll select run, run, module again. It says to pick our input file and here, we have it. This file has about maybe nine grades in it. This process of running frequently has found two errors early on in only my six or so lines of code. And yet it's going to have saved me a ton of time later on when I have 150 lines of code. If I had tried to wait until I had most of the program written, I would have been much less clear about where I needed to go to look to see where my problems were. I can get rid of these lines of code now, because, I know now that I have my file open for reading properly. I need to do the same thing with my histogram file. And better names for the histogram final, that seems reasonable. I like to keep my main programs separate from the functions that, that I rely on in my main program. So just like with assignment one and assignment two, I will start a file just to contain the functions. I'll call it grade.pi. And I know I'm going to have to import it. . My steps are reading the grades into a list, counting the grades per range, and writing the histogram to the file. We will start. With reading the grades into a list. . Because I'm reading from a file, it's hard to get an example. So, we'll start with the type contract. I know that I have a file open for reading. . And I know that it's going to return a list. And that these, all these numbers in this file are floats. So, it's a list of float. Need an extra space there, get my indentation proper. And what this is going to do is read and return the list of grades in the file. . Well, I need a header. Read grades is what I'm doing. It's a nice verb phrase, and I'll call this grading file. So now I can. Finish up my doc string, grade file. There we go. I need to know that this style is in the right format. So the pre-condition to calling this function is that. Art file. Grade file. Starts with a header. Then contains to blank lines And has a blank line. Because I need to be able to skip that header. And then I need to know that the, each line is in this format. Where there's a student number and a grade. In order to accomplish this I first need to skip over the header. And then read the grades. Accumulating them into a list. I'm following the same top-down approach. I'm writing out the steps that I need to do, in order to. Accomplish this, smaller task. . But it's the same approach. The same top-down, design. In order to skip over the header, we will read the first line. And as long as that line is not the new line character, I need to read the next line. Now, I have just read the blank line, so now I need to read the grades accumulating them into the list. Well, I know I'm going to be reading that first grade and then. As long as what I read, as long as I haven't reached the end of the file, and I write it like that, if you recall, my line is not the empty string, then I wanna do the same thing. I want to do gradefile.readline. So this is going to read every line in the rest of the file. What I need to do now is into this pattern I need to accumulate the grades. In order to make that happen, I need my accumulator variable. I'll start off with an empty list, this is going to contain all of the grades in the file. As long as I haven't reached the end I know I have a grate, so now we have a string. Containing the information for a single student. We're going to find the last space in the line, that's the space right before the number, and extract only the rest of the characters after that. So find the last space, and take everything after that space. Well, I know that. Line.rfind, finds from the right. So I'm looking for the right most spaced, that gives me back the index of the right most spaced in the line. I would like to slice the line starting at one passed that position, the first digit, and go up until the end of the string, so this will extract that for me. See that in a variable. I have the number but, it's currently a string because line is a string and when I slice, I get a string. So, when I attend to my grades list, I would like to turn that into a float. I think we're almost done. I have my list of grades there. And so that is what I'm going to return. Let's try it out. I'm going to run this, so that I have access to read grades in the showe, I'm going to, whoops, yep I'm going to ask for, [laugh], TK enter dot file, file log, need to import this. I'm going to ask for a file. We will take our test file. Now I. Need to come here, and open, that file, and finally, I'm going to call v grades on that open file. This is going to, return to me, hopefully a list of the floats that we see on the screen here. And it looks like we have all of them. With that quick test, I think I'm ready to call read grades from my main program. Now I know how to read the grades into a list. . The next task is to count the grades for each of the, the range categories that we see here. Well, let's go start designing it. . With this one, we actually can have an example. We have to decide what we're going to call this. Maybe (great) ranges. We can always change our mind later. And. We now know that if we give it a list like this. We actually have an example of this. What we would like it to do is maybe return a new list with the counts. So we know that we have, how many 0's. We've got two in the zero to nine point something range. We have nothing in the 20's. Whoops. We have one number in the 30's. We have no numbers in the 40's. One number in the 50's. Two numbers. One two three four numbers in the 70 range. No 80's. No nineties and one, one hundred. So this hopefully is what we are going to be able to return. We are going to zero, there's one two three four five six seven eight nine. Zero ten, twenty. Whoops, I am off by one. Zeroes, no tens, no twenties, one 30 no forties, one 50, no, sixties, four seventies, no eighties, no nineties, and one 100. So two, three, four, eight, nine, numbers, one, two, three, four, five, six, seven, eight, nine, numbers so I think I've costructed my example, correctly. Gonna move this over here for just a minute so I can examine. That histogram here. So this is, the histogram that I ultimately want. And what I've seem to have decided here is that, at index zero, I'm going to count all the numbers that are arranged zero to nine. At index one, in my result of this grade ranges function, I'm going to count the number of tens. The number of twenties, number of thirties, and so on. And. Index zero is the 0s. Index one is the 10s. Index two is the 20s. Index three is the 30s. So I seem to have the index number. Be the first, be the 10s digit. 30s, 40s at index four, 50s at index five, 90s at index nine, 100s at index ten. So yes, that does seem to be what I'm getting. Understanding this data representation, how I'm storing information, and what each index means, and how it corresponds to these ranges here is going to, again, save me a ton of time later on when I try to implement this function. I'm gonna grab this here. And I'm gonna take a few notes about what I just figured out. 0-9 is index zero. This is index one. This is index two and so on. This is index nine and this is index ten and my result. Well I seem to be giving it [laugh]. If I fix that there. I seem to be giving this a list of Float and it's giving me back a list of It. And this seems to be returning a list of in int each index in, indicates how many grades were in these rangers. Grade ranges doesn't seem to quite capture the idea. So I'm gonna put count grade ranges. Take symbolistic rates. That means I need to change my example here. And that does seem a better verb phrase. My result is eleven numbers long. And I want to keep track of this count. So, range counts is going to be, well, they all start off at zero. So, this'll give me a list of length eleven of ints that are all zero and I'm going to update this as I look at each number. And I need to figure out how I'm gonna take that 77.5 and just get that first the tens digit there so I can use that as an index into this range count list. Any number up from 70, up through 79 point nine should end up with just the seven. So I'm going to come over here, and experiment a little bit. 77 point five, and just divide by ten, I'll end up with seven point seven, five, I want to get rid of that point seven, five. So I'll use the. Other form of division where I throw away the fractional portion. Does that work if I just have a 70? Yup. How about a 79.9? Good. Coming back to my program that means that once I have a grade from my list I want to use this division. And then I'm gonna be using that as a index and arrange council. I need to task that result to in it. For each grade in the grades file. I need to know which range I want to, add one to in range counts. Well, I had just figured out that I can take that grade and divide it by ten and turn that result into an int. So this will tell me the index into the range counts list that I want the increment. So range counts is. Range counts at index which range. Is what was there plus one. So I've looked at all of the grades. I'm done. So, I'll just return that list. It's time to test it. So let's grab that list there. Run this module. Call this and compare. 20010104001. Awesome. Two steps down. One to go. I know in my main program here that I want to count the grades per range. I no have a lovely little function to do that for me. So range counts is going to be what I get back from count grade ranges and passing in grades. Let's try it out. All we're going to do is print range counts and see how we do. See whether we seem to be making progress. Let's go to run, run module. It says okay, which one do we wanna open? We'll open our test file here. And where do we wanna save it? Well I'm gonna eventually save it in iii.hist, hist for histogram. But, we're not actually writing anything to that file. But I still need to select it, because I was asked to do so. And it says name [inaudible] is not defined. How can that be? I see, I need to use these through the name of the module. I also noticed that, I want to ask save as filing because this is the output file. Let's restart it. That kills our python program, so that's not hanging around. And we'll try one more time. And asked for our input file, there it is, or asked for our output file we'll choose that one and it says are you sure you want to do that? Yeah check my name that is the one I want to replace. We get the same list printed here. So our main program is now working, so far. Now that I have my list of counts for each range. It's time to write the histogram to the file. I've done with this print statement so I will comment it out. Just in case I need to un-comment it later on. Come back to my function file where I'm storing all my functions, and I need to write the histogram so, following our design process. I know that I'm going to call it and pass in that, the grade range list that I had generated, so I get the, this, range counts information and also the file that's been opened for writing. So. I know that I've got a list of int and a file open for writing. Doesn't return anything. All it does is it writes the file. . A histogram of the, of stars. Based on the number. I'm having trouble typing today. Of grades in each range. I wonder if it's frustrating watching me makes these mistakes. Well, I have an example of the output that I want. . And, I'll just paste this, so that whoever calls help on this can see what is expected. What is a Stewart writes a histogram, so "writes histrogram" seems to be a reasonable name. We know this is the range counts and that we have a histogram file that we are going to be writing to. The first line of output and the last line of output are slightly different there are more spaces after the number I want these to line up so, I, I want these stars to line up so when I print the first line and when I print I guess write the last line I need to write a few extra spaces so I'm going to deal with them separately. Here we go. We first want to write zero to nine, and then we want to write how many spaces? One, two three. Light does not put a new line character at the end, which is good, because now I wanna write some stars. So, I know I want to write some stars. And, how many are there? Well, it's however many grades were in that range. So, we're going to multiply the asterisk by whatever is in range counts at index zero. Which is the number of grades that were in the range zero through nine. . And then I want to start a new line, because, as you can see, the ten through nineteen grades appear on the next line. I'll take care of the other special case then. And, here we want to write 100. And, rather than range counts at index zero, it's actually range counts at the last index. That'll give me the number of 100s that were found in the file. In between I want to write one line for each of the indices one through nine so for I in range one up to but not including ten. Well, what does this do? This writes the 2-digit ranges. 10s, 20s, 30s, and so on. I need to figure out the low number and there is a high number for the current range. So when I is one, I need to figure out the oops, the ten and the nineteenth. When I is two I need twenty and 29 and so on. So the low end X is just I times ten. The high index is. I times ten plus nine. Just to check nine times ten is 90 plus nine is 99, so that'll work for 90 as well. So, yep that seems to give me the low and the high. I am going to write low but I need to give it a string so low, stir of low plus the hyphen in between the low and the high and then high. Followed by a colon and a space. Once I've written that, , then it's time to write the asterisks. So I want to multiply that by range counts at which index? Well index I. And finally, I want to write the new line. In one of two ways. I can do what I was doing before and run this and then come up with an example of range counts and send a output file and make sure that file is written, but This is the last step in my program so I actually can do that here. Now I've got this information elsewhere so I'm going to delete it. What I want to do now is write the histogram with the range counts and my output file which is a1histfile. I verified that my program seems to be working so far. I can just run this in order to check to make sure that this new function does what I want it to. We'll try our A1test.txt. And we wanna write it to a1test.hist. Yep, we wanna replace it. Uh-oh! I have a not writable problem here. It seems to be saying that when I called this file.right it says that, that is an unsupported operation. That this file's not writable. And that makes sense. Because I opened it for reading. Cuz I copied and pasted and forgot to change that. Let's run this one more time. There is my test input. No, I don't wanna write my program there. Let's save that, replace. Okay, no errors. We're going to switch to the finder now. There I am. Don't look. Grades examples, and I just wrote a1test.hist. We'll open this in text edit. And, it's empty. Now I need to go find out why. Let's go back and take a look at the program. Oh, I think I forgot to close my files, dont close your files and then they will not be written right away. Oh try one last time. Head back, see how we did. And there's our information. Let's try running it on our larger data file. Okay. 1.Pxt is our large one. And we will save it to A1 hist. Yeah. I'ld like to replace that. Switching back to the finder, we want hist. Aha, beautiful. Look at that. And that is really kind of how I write programs. I start out at the very beginning with only a vague idea of what I want and I have a bunch of decisions to make. What are my steps, what data am I going to be storing and how am I going to be storing it. And which order are I'm going to write my functions in and how am I going to test them. I hope this was fun.