So this program will use the CSV module's DictReader in order to parse a CSV file,
so let's take a look at how that works.
First we have to import csv, so
that we can actually use Python's CSV module.
And now we have a function that I'm calling dictparse,
this is similar to the parsing functions we've seen before, but
we're going to use the DictReader as I said, so it is slightly different.
It takes two arguments now instead of one, the first is the filename
of our CSV file and then the second is the name of a keyfield.
Well, why do we need that?
Well I want to parse this CSV file now
as a dictionary of dictionaries, instead of a list of lists.
So the outer dictionary is actually mapping
some key to the rows of the CSV file.
So what is that key?
Well each row needs to have a header column, and
the value of the header column will be the key identifying that particular row.
So think about the CSV files we've been looking at,
where we have the average monthly high temperatures for a bunch of cities.
The city column is the obvious one to supply the keys identifying the rows, right?
I want to know what are the average high temperatures in Houston, or Baghdad, or
Moscow, right?
I don't want to know, hey,
which rows had a 37 in February, right, that doesn't make any sense.
So the city is the obvious keyfield for that CSV file.
Now we actually need to know what the name of that column is, so
our CSV file also has to have a header row,
where the first row of the CSV file gives the names of each of the columns.
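To make that target structure concrete, here is a rough sketch of what the dictionary of dictionaries might look like. The column names and temperatures are invented for illustration and are not taken from the actual course file.

```python
# Hypothetical contents of a CSV file with a header row:
#   city,Jan,Feb
#   Houston,62,65
#   Moscow,25,28
# The values are invented for illustration only.

# Outer dictionary: city name -> row.
# Inner dictionary: column name -> value (the csv module reads values as strings).
table = {
    "Houston": {"city": "Houston", "Jan": "62", "Feb": "65"},
    "Moscow": {"city": "Moscow", "Jan": "25", "Feb": "28"},
}

# Look up a row by its key, then a column by its name.
print(table["Houston"]["Feb"])
```

Notice that the city name shows up twice, once as the key in the outer dictionary and once as a value inside the row itself.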
So let's take a look at how the function works,
the first thing we do is we initialize table to be an empty dictionary.
Because, as I said, we're going to have a dictionary mapping this key
value into the rows inside of the CSV file.
Right and then we open the CSV file, now notice some things here.
I explicitly used the mode rt here, which I haven't been doing,
that is the default in Python, so it doesn't really matter.
But we do know that CSV files are plaintext, and I'm trying to read them, so
this is the mode in which I would like to open the file.
But I've also added this extra argument here, newline equals the empty string.
Why did I do that?
Well I didn't point this out before, but
the CSV module actually handles new lines on its own.
So I do not need Python's file handling utilities to process the new lines, and
there's a good reason why I don't want them to.
And that's because I can have a new line inside the value
of a particular column in my CSV file, if that column is quoted.
So if there are quotes around it, I can have new line characters in the column.
And Python's file handling utilities can't differentiate between new lines that
are actually ending the line, and new lines that are inside some column value.
So by telling Python that the newline argument is this empty string,
the file handling utilities pass the new lines through untranslated, and
leave all the new line handling up to the CSV module.
Now this doesn't matter if I don't have new lines inside of columns in my
CSV file, which is why it didn't matter whether we did this or not before.
But it also doesn't matter if you tell Python's file utilities to stop dealing
with new lines, because the CSV module will deal with them correctly.
So, it's safer to do this all the time,
even if you know that your file doesn't have new lines within quoted columns.
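Here is a small sketch of that situation, assuming a made-up file named notes.csv in which one quoted value contains a new line. The file name and its contents are hypothetical, written to a temporary directory just for this example.

```python
import csv
import os
import tempfile

# Write a small hypothetical CSV file in which one value contains a newline.
# The csv writer puts quotes around that value automatically.
path = os.path.join(tempfile.mkdtemp(), "notes.csv")
with open(path, "wt", newline="") as csvfile:
    writer = csv.writer(csvfile)
    writer.writerow(["city", "note"])
    writer.writerow(["Houston", "hot\nand humid"])

# Read it back with newline="" so the csv module handles the line endings itself.
with open(path, "rt", newline="") as csvfile:
    rows = list(csv.reader(csvfile))

# The file has three physical lines, but the csv module sees two rows:
# the header row and one data row whose note spans two lines.
print(len(rows))
print(repr(rows[1][1]))
```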
Now, the next thing that we need to do is actually create the DictReader, and
I do this by calling csv.DictReader.
Remember that DictReader is inside the CSV module, so I need to have that csv.,
to tell Python to look inside the CSV module for DictReader.
And then I pass it the open file, csvfile, and
again I'm just going to tell it to skip initial spaces, set that to be True.
So this looks all familiar, hopefully.
Again, I can iterate through the rows of the csv reader, so
for row in csvreader, I can then add that row into the table.
Now because the table is a dictionary, I can't just append the row to the table.
Rather I need to get the key value out of the row,
and then assign the row to the table at that key.
Well notice that I'm indexing into the row with the keyfield.
The row itself, because we used a DictReader, is also a dictionary, and
that dictionary maps the names of the columns
to the value of that particular column in this row.
So the DictReader looks at the first row and
uses that as the names of the columns, and then creates dictionaries for
all other rows, mapping those names to the values at that column.
So I can look up the key value simply by indexing
the row dictionary with the keyfield.
I then use that as an index into my table to assign the row into the proper index,
so now I have the appropriate key value pair in my table dictionary.
Once I'm done, I can just return the table.
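Putting those steps together, the function might look roughly like the sketch below. This is reconstructed from the description, so the parameter name csvfilename, the variable name csvreader, and the sample file are assumptions rather than the exact code on screen.

```python
import csv
import os
import tempfile

def dictparse(csvfilename, keyfield):
    """Parse a CSV file with a header row into a dictionary of dictionaries."""
    table = {}
    with open(csvfilename, "rt", newline="") as csvfile:
        csvreader = csv.DictReader(csvfile, skipinitialspace=True)
        for row in csvreader:
            # Each row is a dictionary mapping column names to values,
            # so row[keyfield] is the key that identifies this row.
            table[row[keyfield]] = row
    return table

# Hypothetical sample file with invented temperatures, for illustration only.
path = os.path.join(tempfile.mkdtemp(), "temps.csv")
with open(path, "wt", newline="") as csvfile:
    csvfile.write("city, Jan, Feb\nHouston, 62, 65\nMoscow, 25, 28\n")

table = dictparse(path, "city")
print(table["Moscow"]["Jan"])
```

One thing to keep in mind with this approach: if two rows share the same value in the keyfield column, the later row replaces the earlier one in the table.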
The function I used to print the table also has to change a little bit,
because the table is now a dictionary of dictionaries.
So, I want to first print out the header row, I know that the city is the name
of the first column, so I'm going to print that out.
But then I have a problem, I don't have the header row in order anywhere,
I need to actually know what the names of the columns are.
So you might have noticed up at the top of this file,
I have a tuple that I've assigned to the variable MONTHS.
And this is just a tuple of the names of all the months that are basically
the names of the columns in my CSV file.
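A minimal sketch of how such a tuple could drive the printing is shown here. The abbreviated month names, the helper name format_table, and the sample data are all assumptions, and the tuple is shortened to three months to keep the example small.

```python
# Assumed column names; shortened to three months to keep the sketch small.
MONTHS = ("Jan", "Feb", "Mar")

def format_table(table):
    """Return the table as a list of text lines, header row first."""
    # The header row is the keyfield name followed by the month names, in order.
    lines = ["city, " + ", ".join(MONTHS)]
    for city, row in table.items():
        # Pull the values out of the row dictionary in the order given by MONTHS.
        values = [row[month] for month in MONTHS]
        lines.append(city + ", " + ", ".join(values))
    return lines

# Invented data in the dictionary-of-dictionaries shape.
table = {
    "Houston": {"city": "Houston", "Jan": "62", "Feb": "65", "Mar": "72"},
    "Moscow": {"city": "Moscow", "Jan": "25", "Feb": "28", "Mar": "39"},
}

for line in format_table(table):
    print(line)
```

The tuple fixes the order of the columns, so the printed header and every printed row line up the same way.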