Getting ready
To conduct our analyses, we will create a few helper functions that we will use throughout this chapter. Application-oriented analyses typically produce reusable code that performs single, well-defined tasks, so that it can adapt quickly to changing data or analysis requirements. In particular, let's create two helper functions: one that extracts the data for a particular country and one that creates a time series from a particular set of rows:
In [22]: def dataset(path, country="United States"):
    ...:     """
    ...:     Extract the data for the country provided. Default is United States.
    ...:     """
    ...:     with open(path, 'r') as csvfile:
    ...:         reader = csv.DictReader(csvfile)
    ...:         for row in filter(lambda row: row["Country"]==country,reader):
    ...:             yield row
    ...:
In [23]: def timeseries(data, column):
    ...:     """
    ...:     Creates a year based time series for the given column.
    ...:     """
    ...:     for row in filter(lambda row: row[column], data):
    ...:         yield (int(row["Year"]), row[column])
    ...:
The first function iterates through the dataset using csv.DictReader and filters on a particular country using Python's built-in filter function. The second function leverages the fact that there is a Year column to create a time series for the data: a generator that yields (year, value) tuples for a particular column in the dataset, skipping rows where that column is empty. Note that this function should be passed a generator created by the dataset function. We can now use these two functions for a series of analyses across any column for a single country.
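As a quick check of how the two helpers chain together, the following is a minimal, self-contained sketch. It restates both functions in script form and runs them against a tiny made-up CSV written to a temporary file. The sample rows, the Income column name, and the demo function are assumptions introduced only for illustration; they are not part of the chapter's dataset.

```python
import csv
import os
import tempfile


def dataset(path, country="United States"):
    """Yield the rows of the CSV at `path` whose Country column matches."""
    with open(path, "r", newline="") as csvfile:
        reader = csv.DictReader(csvfile)
        for row in filter(lambda row: row["Country"] == country, reader):
            yield row


def timeseries(data, column):
    """Yield (year, value) tuples for rows where `column` is non-empty."""
    for row in filter(lambda row: row[column], data):
        yield (int(row["Year"]), row[column])


# Hypothetical sample data: the "Income" column is made up for this sketch.
SAMPLE = (
    "Country,Year,Income\n"
    "United States,2000,10.5\n"
    "United States,2001,\n"
    "France,2000,8.1\n"
    "United States,2002,11.2\n"
)


def demo():
    """Write SAMPLE to a temporary file and build a time series from it."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    try:
        with os.fdopen(fd, "w", newline="") as handle:
            handle.write(SAMPLE)
        # list() drains the generators, so the file is closed before removal.
        return list(timeseries(dataset(path), "Income"))
    finally:
        os.remove(path)


series = demo()
print(series)  # [(2000, '10.5'), (2002, '11.2')]
```

Two details are worth noticing in the output. The 2001 row is dropped because its Income value is empty, and the France row is dropped by the country filter. The values also come back as strings, since csv.DictReader does not convert types; only the year is cast to int, so numeric work on the values requires an explicit conversion such as float.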