In Python, we can loop over sequences such as lists and tuples using a for loop, or using a while loop with index positions. For instance, the code listing below shows how we can loop over a list using a while loop.

>>> guests = ['Luffy', 'Zorro', 'Sanji' ]
>>> i = 0
>>> while i < len(guests):
...     print(guests[i])        # Iterating using index position
...     i += 1
Luffy
Zorro
Sanji

We can also use the for loop to iterate over collections that are not sequences, such as sets and dictionaries.

>>> guests = {'Luffy', 'Zorro', 'Sanji'} # Set Object
>>> for guest in guests:
...     print(guest)
...
Zorro
Luffy
Sanji

Unlike sequences, we cannot use the index position in a while loop to iterate over set and dict objects.

>>> guests = {'Luffy', 'Zorro', 'Sanji'}        # Set of Guests
>>> guests[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'set' object does not support indexing

As you can see from the above code listing, the set objects don't support indexing.

However, there exists a way to iterate over collections such as set and dict using the while loop. To understand that, we need to know how Python implements iteration underneath.

The way Python implements iteration is called the iterator protocol.

Let's understand the iterator protocol in the next section.

The Iterator Protocol

Before we start looking at the iterator protocol, can you recall what iterables are?

Earlier, we defined that any object you can loop over with a for loop is called an iterable.

This definition doesn't explain much about what an iterable is. The proper definition of an iterable is as follows:

An object capable of returning its members one at a time is called an iterable.

Figure 1: The Iterator Protocol

To get the members of an iterable one at a time, Python provides the built-in function iter(). When we pass an iterable as an argument to the iter() function (which in turn calls the __iter__() dunder method on the object), it returns something called an iterator.

>>> guests = {'Luffy', 'Zorro', 'Sanji'}
>>> iter(guests)
<set_iterator object at 0x7f7983d1dab0>

In the above code listing, passing the guests set to iter() returns a set_iterator object. But what's an iterator?

  • An iterator is an object representing a stream of data.
    When you repeatedly call the built-in next() on the iterator object, it returns the stream's next items.
  • When no more items are available in the stream of data, the StopIteration exception is raised instead.

To understand this, let's continue our code from above.

>>> guests = {'Luffy', 'Zorro', 'Sanji'}
>>> guest_iterator = iter(guests)
>>> next(guest_iterator)
'Zorro'                        # Sets are unordered
>>> next(guest_iterator)
'Luffy'
>>> next(guest_iterator)
'Sanji'
>>> next(guest_iterator)    # No more items left in the set
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

In the above code listing, we can see how the iterator returned from guests can provide one object at a time using built-in iter() and next() functions.

Do you recall what dunder methods are?

Earlier, we mentioned that objects could have special methods or dunder methods to support special operations or protocol. For example, for items to respond to the + operator, they need to have their __add__() dunder method defined.

>>> [1, 2] + [3]            # Same as [1, 2].__add__([3])
[1, 2, 3]

Similarly, for an object to be iterable, it needs to have the dunder method __iter__() defined. It is the __iter__() method that returns an iterator from an iterable.

Built-in objects such as sequences, sets, and dictionaries define the dunder method __iter__(), which lets us use them in a for loop.

Now that we have gained a bit of insight into how Python implements iteration, let's try to write a while loop to iterate over a set object.

>>> guests = {'Luffy', 'Zorro', 'Sanji'}
>>> guest_iterator = iter(guests)    # Same as guests.__iter__()
>>> while True:
...     try:
...         # Same as guest_iterator.__next__()
...         guest = next(guest_iterator)
...         print(guest)
...     except StopIteration as e:
...         break
Zorro
Luffy
Sanji

The while loop we wrote is pretty close to what happens under the hood when we iterate over an iterable using a for loop. However, when using an iterable with the for loop, we don't need to call the iter() function or handle the StopIteration exception, as the for statement does that automatically for us.

We can describe the iterator protocol as the following:

  • First, obtain the iterator of the object using the iter() function or the dunder method <iterable>.__iter__().
  • Call the built-in function next() or the dunder method <iterator>.__next__() on the iterator object.
  • Run the code block inside the for block
  • Repeat the next invocation until the iterator raises StopIteration

Let's redefine the terms iterable and iterator.

Iterable: An object that allows iteration is called an iterable. An iterable is required to have an __iter__() method defined that returns an iterator.
Iterator: An iterator is required to have both an __iter__() method and a __next__() method.
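
To make these two definitions concrete, here is a minimal sketch of a hand-written iterator class. The class name Countdown and its attribute current are made up for this illustration; they are not part of Python or its standard library.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__()
        return self

    def __next__(self):
        if self.current <= 0:
            # Signal that no more items are available
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


countdown = Countdown(3)
first = next(countdown)        # 3
remaining = list(countdown)    # [2, 1]
print(first, remaining)
```

Because Countdown defines both __iter__() and __next__(), its objects can also be used directly in a for loop.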

Let's do a small exercise.

What's the output of the following code?

>>> person = {"name" : "John Doe", "age": 15, "country" : "Japan"}
>>> list(person)
['name', 'age', 'country']
>>> person_iterator = iter(person)
>>> list(person_iterator)
  1. ['name', 'age', 'country']
  2. ['John Doe', 15, 'Japan']
  3. Raises StopIterationError
  4. Raises ValueError

We can use container constructors such as list(), tuple(), and set() to create a container filled with the items from an iterator object.

This works because an iterator returns itself when it is asked for an iterator.

When we call the iter() function on an iterator, we get the iterator back. This means iterators are also iterables.

>>> iterator1 = iter([1, 2, 3])
>>> iterator2 = iter(iterator1)
>>> iterator1 is iterator2
True

We can summarise everything we have learned so far about iterables and iterators in Table 1.

Table 1: Summary of Iterables and Iterators

| Iterables | Iterators |
| --- | --- |
| Must have the __iter__() method defined | Must have the __iter__() and __next__() methods defined |
| Anything that can be passed to iter() without error is an iterable | Can be passed to the next() function to get the next item until there are none left |
| Iterators are also iterables | Return themselves when passed to the iter() function |
| Anything that can be used in a for loop is an iterable | Raise StopIteration when passed to the next() function if no more elements are left, that is, the iterator is exhausted |
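
One practical consequence of this summary, sketched below with illustrative variable names: an iterable such as a list can be looped over many times, whereas an iterator is consumed as it is read and produces nothing on a second pass.

```python
numbers = [1, 2, 3]                     # A list is an iterable, not an iterator
numbers_iterator = iter(numbers)

first_pass = list(numbers_iterator)     # Consumes every item
second_pass = list(numbers_iterator)    # The iterator is now exhausted

# The list itself hands out a fresh iterator each time it is iterated
fresh_pass = list(numbers)

print(first_pass)    # [1, 2, 3]
print(second_pass)   # []
print(fresh_pass)    # [1, 2, 3]
```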

In this section, we covered the difference between iterators and iterables. In the next topic, we will look into how we can create custom iterators.

Generator

The simplest way to create an iterator in Python is to create a generator function. Let's take a look in detail at a generator function.

Generator Function

A generator function is a function that uses the yield keyword.

Calling a generator function creates a generator object, which produces a sequence of values that we can use in an iteration.

For instance, let's say we have the following requirements:

  • we would like to create a function cube() which takes a list of numbers as an argument
  • it returns the cube of elements from the list, one at a time upon each invocation.

We can create a generator function that accepts a sequence of numbers and returns a generator object, which will return one number at a time.

>>> def cube(numbers):
...     for num in numbers:
...         yield num**3

Let's invoke our newly created generator function cube() to get a generator object.

>>> a = [2, 3, 4]
>>> cube(a)
<generator object cube at 0x7f9a414a0360>

The yield keyword's presence makes our function cube() into a generator function. Before explaining how the keyword yield works, let's see how we can use our generator object.

Can you guess as to how we get items from the generator object?

The generator object can be passed to the built-in function next() to get the next item of the sequence. When the generator object has no more items, it raises the StopIteration exception. Let's take a look at the generator cube() in action below.

We will assign the generator object the name cube_generator.

>>> cube_generator = cube(a)		# a = [2, 3, 4]
>>> next(cube_generator)
8
>>> next(cube_generator)
27
>>> next(cube_generator)
64
>>> next(cube_generator)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration

As you can see, the object returned by a generator function is an iterator. We can confirm this by calling the iter() function on the generator object.

>>> cube_gen1 = cube(a)                # Create a generator object
>>> cube_gen1 is iter(cube_gen1)       # An iterator returns itself when passed to iter()
True

Because the generator object is an iterator, we can use the for loop to iterate over the generator object.

>>> b = [12, 14, 16]
>>> for num in cube(b):
...        print(num)
1728
2744
4096

We learned how to use the generator function to create a generator object, an iterator. Let's do a small exercise.

Which of the following will not be present in the code's output below?

>>> def square(numbers):
...     for num in numbers:
...         yield num*2
>>> for num in square([1, 2, 3, 4]):
...     print(num)
  1. 1
  2. 4
  3. 9
  4. 6

Naming a function square() doesn't automatically make it compute squares; this one yields num*2.

The yield statement

If a function has a yield statement in it, it stops being a usual function. With the yield statement, the function becomes a generator function, and it will return a generator object when called. As we saw in the previous section, iterating over the generator object executes the function body until Python reaches a yield statement.

The keyword return is used to return a value from a function, at which point the function loses its local state. Thus, the next time we call that function, it starts over from its first statement.

In contrast, yield preserves the function's local state. When we pass the generator object to the next() function again, execution resumes from where it left off.

In other words, once Python has executed a yield statement in a generator function, the next call to next() on the same generator object picks up right after that yield statement.

>>> def reveal_award_winner():                # Generator function
...     print("The award for the best ed-tech company goes to ...")
...     print("...drumroll sound...")
...     yield "Primerlabs"
>>> reveal = reveal_award_winner()            # Generator object
>>> winner = next(reveal)                    # Executes statements and yields
The award for the best ed-tech company goes to ...
...drumroll sound...
>>> winner                                    # The yielded value
'Primerlabs'
Figure 2: Normal & Generator Function

In the above code listing, the generator function reveal_award_winner() works in the following way:

  • The mere presence of the keyword yield makes the function reveal_award_winner() a generator function.
  • The generator function reveal_award_winner(), when called, returns a generator object, which is assigned the name reveal.
  • When passed to the next() function, the reveal generator object executes the code block inside the generator function and returns or yields the value Primerlabs, which Python assigns to the name winner.
  • We can check the value yielded using the name winner.
  • If you try to further pass the same reveal generator object to the next() function, Python raises the StopIteration exception indicating that the generator object is exhausted.
>>> next(reveal)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
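
To see the pause-and-resume behaviour with more than one yield, here is a small sketch. The generator count_to_three() and the list steps are hypothetical names used only to record which parts of the function body have run at each point.

```python
steps = []                      # Records which parts of the body have run


def count_to_three():
    steps.append("start")
    yield 1
    steps.append("after first yield")
    yield 2
    steps.append("after second yield")
    yield 3


counter = count_to_three()
before_first_next = list(steps)   # Creating the generator runs no body code

first = next(counter)             # Runs the body up to the first yield
after_first_next = list(steps)

second = next(counter)            # Resumes right after the first yield
after_second_next = list(steps)

print(before_first_next)   # []
print(after_first_next)    # ['start']
print(after_second_next)   # ['start', 'after first yield']
```

Each call to next() runs only the statements between two consecutive yield statements, which is why the recorded steps grow one entry at a time.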

What's the value of __A__ of the following code listing?

>>> def spoiler():
...     print("In the end, it is revealed that...")
...     yield "Mrs. Spinster, the sweet old lady."
...     yield "..in-fact had a huge role,"
...     yield "in stopping the murder."
>>> spoil_me = spoiler()
>>> next(spoil_me)
In the end, it is revealed that...
'Mrs. Spinster, the sweet old lady.'
>>> next(spoil_me)
__A__						# A
  1. 'Mrs. Spinster, the sweet old lady.'
  2. 'in stopping the murder.'
  3. '..in-fact had a huge role,'
  4. Raises StopIterationError

Finite Sequence Generator

We can write a generator function that generates a limited sequence of values before it is exhausted. Such generator functions are called finite sequence generators.

Let's create a finite sequence generator to understand more.

A number x is a divisor of a number y if x perfectly divides y. Perfect division means y upon division by x leaves no remainder.

Let's create a generator function, generate_divisors(num), which returns the divisors of a given number num one at a time.

>>> def generate_divisors(num):                # Generator Function
...     for x in range(2, num):
...         if num % x == 0:
...             yield x     # Yield x if remainder is zero
>>> for x in generate_divisors(88):
...     print(x, end= " ")
2 4 8 11 22 44

The code listing above works in the following way:

  1. The generate_divisors(num) generator function loops over each number in range(2, num) and yields every number that perfectly divides the given num.
  2. When we call the generator function generate_divisors(88), it returns a generator object, which we can iterate over using a for loop.
  3. The for loop iterates over the generate_divisors(88) generator object and prints each yielded divisor.

We can write a function that calls the generate_divisors(num) generator function, so that we won't need to write a for loop whenever we want to check a number's list of divisors.

>>> def divisors(num):
...     divisors_list = []
...     for x in generate_divisors(num): # Loop over the generator
...         divisors_list.append(x)  # add to the list
...     if len(divisors_list) == 0:
...         print("The number {} is a prime".format(num))
...     else:
...         print("The divisors of {} are ".format(num))
...         print(divisors_list)
>>> divisors(102)
The divisors of 102 are
[2, 3, 6, 17, 34, 51]
>>> divisors(103)
The number 103 is a prime
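
Since a generator object is an iterator, the explicit loop and append inside divisors() could also be replaced by the list() constructor. The sketch below repeats generate_divisors() so that it runs on its own; the helper name divisors_as_list() is made up for this illustration.

```python
def generate_divisors(num):
    for x in range(2, num):
        if num % x == 0:
            yield x


def divisors_as_list(num):
    # list() keeps requesting items until the generator is exhausted
    return list(generate_divisors(num))


print(divisors_as_list(102))   # [2, 3, 6, 17, 34, 51]
print(divisors_as_list(103))   # []
```

An empty list here plays the same role as the "is a prime" message printed by divisors().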

We created a function divisors(num) that prints a non-prime number's divisors or lets us know if the given number is prime. The working of a generator function is described in Figure 3.

Figure 3: Generator Function

Let's continue with our generate_divisors(num) generator function. What is the output of the following code listing?

>>> next(generate_divisors(88))
2
>>> next(generate_divisors(88))
  1. 2
  2. 4
  3. 8
  4. Raises StopIterationError

Each time you invoke a generator function, you create a new generator object that executes the generator function code from the beginning. We can also write a generator function that can indefinitely generate a sequence. This generator acts as an infinite iterator or infinite sequence generator.
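
To illustrate that each call creates an independent generator object, the sketch below keeps two generator objects side by side. The names first_gen and second_gen are illustrative, and generate_divisors() is repeated so that the block runs on its own.

```python
def generate_divisors(num):
    for x in range(2, num):
        if num % x == 0:
            yield x


first_gen = generate_divisors(88)
second_gen = generate_divisors(88)

a = next(first_gen)     # 2
b = next(first_gen)     # 4: first_gen remembers its position
c = next(second_gen)    # 2: second_gen starts from the beginning

print(a, b, c)          # 2 4 2
```

Keeping a reference to the generator object, rather than calling the generator function again, is what lets us continue through the sequence.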

Infinite Sequence Generator

To illustrate, we can write a function that will generate prime numbers indefinitely.

Earlier, we wrote a generator function to get the list of divisors for a given number. This time, we will write a helper function has_divisors() to check if a number has divisors other than 1 and itself. The function has_divisors() is shown in the code listing below.

>>> def has_divisors(num):
...     for x in range(2, num):
...         if num % x == 0:
...             return True		# if any divisor found
...
...     return False			# if no divisors found

To write a prime number generator, we will use a default starting value of $3$. We will raise ValueError if someone passes a value less than three or a float as an argument. Our generator function generate_prime() is shown below.

>>> def generate_prime(start = 3):
...     if start < 3 :
...         raise ValueError("Number cannot be less than 3")
...     elif type(start) == float:
...         raise ValueError("Number cannot be float")
...     num  = start
...     while True:
...         if not has_divisors(num):
...             yield num
...         num += 1

Would you like to summarise the generate_prime() function?

Our little function generate_prime() depends on our previous function has_divisors(). The generate_prime() function can keep producing new prime numbers while remembering the last number it yielded.

The generator function generate_prime() works in the following way:

  • When we first pass the generator object to the next() function, the generator function starts running and assigns the name num to the value of the start argument, which defaults to $3$.
  • Next, it starts a while loop, which checks whether num has divisors using the function has_divisors() written above.
  • If num has a divisor, nothing is yielded; Python moves on to the rest of the while loop body, and num is increased by $1$.
  • If num doesn't have any divisor, Python pauses the function at the yield statement and yields num.
  • If we pass the same generator object to the next() function again, the generator resumes inside the while loop right after the yield statement, and the value of num increases by $1$.
  • The above steps get repeated indefinitely.

Let's see our generator function in action.

>>> a = generate_prime()        # Generator Object
>>> next(a)
3
>>> next(a)
5
>>> next(a)
7
>>> next(a)
11
>>> next(a)
13

In the above code listing, we can see that the next() function, using the generator object returned by calling the generator function generate_prime(), returns a new prime number on every invocation.

If you wish to use the generate_prime() function with a for loop, you should ideally provide a break statement as well. Why do you think that is the case?

We need to write the for loop with a break statement; otherwise, the for loop will run indefinitely. The code below has a break statement inside the if block, which causes the loop to exit.

>>> for x in generate_prime():
...     if x == 307:            # We know that 307 is prime
...         break
...     print(x, end =" ")
...
3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167 173 179 181 191 193 197 199 211 223 227 229 233 239 241 251 257 263 269 271 277 281 283 293
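
As an alternative to break, the standard library function itertools.islice() can take a fixed number of items from an infinite generator. This is a sketch rather than part of the original walkthrough: it repeats has_divisors() and a shortened generate_prime() (the argument checks are left out here) so that it runs on its own, and the name first_five is illustrative.

```python
from itertools import islice


def has_divisors(num):
    for x in range(2, num):
        if num % x == 0:
            return True
    return False


def generate_prime(start=3):
    # Shortened version: the ValueError checks are omitted in this sketch
    num = start
    while True:
        if not has_divisors(num):
            yield num
        num += 1


# Take only the first five primes from the infinite generator
first_five = list(islice(generate_prime(), 5))
print(first_five)   # [3, 5, 7, 11, 13]
```

islice() stops asking the generator for values after five items, so the infinite while loop never gets the chance to run away.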

What's the output of the following code listing, which uses the generate_prime() function we created earlier?

>>> next(generate_prime(33))
  1. 37
  2. 33
  3. 41
  4. 43

So far, we have seen how to create an iterator using a generator function. We can also create an iterator using something called a generator expression. Let's look into that in the next section.

Generator Expressions

In Python, generator functions of the following form can be written concisely using generator expression syntax.

def generator_function(iterable):
    for item in iterable:
        if condition_expression:
            yield item

The corresponding generator expression looks like below:

generator_object = (item for item in iterable if condition_expression)

The general syntax of generator expression is as follows:

( f(x) for x in iterable if cond(x) )

To understand, let's create a generator function square(), which returns each number's square in a given range one at a time.

>>> def square(start = 10):		# Generator Function
...     for num in range(start):
...         yield num**2
>>> for x in square(14):
...     print(x, end=" ")
0 1 4 9 16 25 36 49 64 81 100 121 144 169

We can write the same generator function square() in the generator expression format in the following way:

>>> a = (num**2 for num in range(14))	# Generator Expression
>>> for x in a:
...        print(x, end = " ")
0 1 4 9 16 25 36 49 64 81 100 121 144 169

We can see that the generator expression works in the same way as the generator function square(). We can also include an if condition inside the generator expression.

The following generator function yields the multiples of 7 within a particular range of numbers one at a time.

>>> def multiple_of_7(limit):
...     for num in range(limit):
...         if num % 7 == 0:
...             yield num
>>> for x in multiple_of_7(100):
...     print(x, end = " ")
0 7 14 21 28 35 42 49 56 63 70 77 84 91 98

We can rewrite the generator function multiple_of_7 as a generator expression in the following way.

>>> for x in __A__:            # A
...     print(x, end = " ")
0 7 14 21 28 35 42 49 56 63 70 77 84 91 98

What is __A__ in the above code?

  1. (num for num in range(100) )
  2. (if num % 7 == 0 num for num in range(100) )
  3. (for num in range(100) yield num if num % 7 == 0 )
  4. (num for num in range(100) if num % 7 == 0 )

As you can see, the corresponding generator expression is much more concise and can be consumed directly by the for loop.

We can also pass a generator expression as an argument to built-in functions such as all() or any(), as generator expressions are iterables.
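
As a small sketch of passing generator expressions to any() and all(), consider the following. The list numbers and the result names are made up for this illustration.

```python
numbers = [3, 7, 14, 25]

# any() is True if at least one produced value is truthy
has_multiple_of_7 = any((num % 7 == 0 for num in numbers))

# all() is True only if every produced value is truthy
all_positive = all((num > 0 for num in numbers))
all_even = all((num % 2 == 0 for num in numbers))

print(has_multiple_of_7)   # True
print(all_positive)        # True
print(all_even)            # False
```

Both functions consume the generator expression one item at a time and can stop early, so no intermediate list is built.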

Generator expressions represent a generator object or iterator, which can be readily consumed by the for loop. We can check this by creating a small generator expression.

>>> a = (x for x in range(4))
>>> a
<generator object <genexpr> at 0x7f9a413d74c0>
>>> for x in a:
...     print(x, end=" ")
0 1 2 3

Generator expressions are iterators, which in turn are iterable. We can pass them to other objects which accept an iterable as an argument.

For instance, the built-in function sum() takes an iterable as an argument and returns the numbers' sum.

Which of the following is the generator expression to get the sum of squares for even numbers in the range (0, 25)?

  1. sum((x**2 for x in range(25) if x % 2 == 0 ))
  2. sum((x**2 for x in range(25)))
  3. sum((x*2 for x in range(25) if x % 2 == 0 ))
  4. sum((for x in range(25) x**2 if x % 2 == 0 ))

In the previous exercise, we wrote a generator expression that produces the squares of the even numbers in a given range and passed it to the sum() function to get the sum of those squares.

As you can see from the above exercise, generator expressions provide a concise way to generate numbers.

However, Python provides a way to make it even more straightforward and compact.

We can drop the parentheses surrounding a generator expression if we use the generator expression as the only argument to a function.

So, we can rewrite the above code by dropping the parentheses surrounding the generator expression.

>>> sum(x**2 for x in range(25) if x % 2 == 0)
2600

Generator expressions allow for more complexity. For instance, the generator expression syntax below.

(f(x) for x in sequence if condition)

is identical to the generator function:

def generator_function():
    for x in sequence:
        if condition:
            yield f(x)

We can nest multiple for clauses and conditions in a single generator expression to accommodate complicated logic requirements, at the cost of readability.

# Multiple conditions in a generator expression
(f(x)
     for x in s1 if cond1
     for x in s2 if cond2
     for x in s3 if cond3
)

can be converted to generator function as

def generator_function():
    for x in s1:
        if cond1:
            for x in s2:
                if cond2:
                    for x in s3:
                        if cond3:
                            yield f(x)

However, we should avoid multiple-nested generator expressions or functions like the ones shown above at all costs. This code will quickly become too difficult to understand for others and, more importantly, yourself.
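
For a concrete, runnable instance of this equivalence, here is a sketch with two for clauses and one condition. The data (sizes, colors) and the names pairs and generate_pairs() are made up for this illustration, and each loop uses its own variable name to keep the logic readable.

```python
sizes = ["S", "M"]
colors = ["red", "blue", "green"]

# Two for clauses and one condition in a single generator expression
pairs = ((size, color)
         for size in sizes
         for color in colors if color != "blue")


def generate_pairs():
    # The same logic written as a generator function
    for size in sizes:
        for color in colors:
            if color != "blue":
                yield (size, color)


from_expression = list(pairs)
from_function = list(generate_pairs())

print(from_expression)
# [('S', 'red'), ('S', 'green'), ('M', 'red'), ('M', 'green')]
print(from_expression == from_function)   # True
```

Even with only two levels, the generator function makes the nesting order easier to follow than the one-line expression.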

Generator expressions are pretty useful for writing short generators. In contrast, for anything that requires sophisticated logic, a generator function is usually the better choice.

At this point, can you rephrase the differences between generator expressions and generator functions?

We have covered iterators and how we can create one using generators. We also looked into the syntactic sugar for defining compact generators using generator expressions. In the next section, we will cover some useful built-in iterators in Python.

Useful Iterators

We earlier encountered some built-in iterators such as enumerate() and zip() to iterate over multiple lists. Python provides other iterators in the module itertools that are particularly useful in many scenarios. In this section, we will take a look at the built-in iterators in a bit of detail.

Do you recall what enumerate iterator is used for?

Enumerate

The enumerate function is useful to loop over an iterable as it returns the index and its value.

For instance,

>>> name_list = ["Luffy", "Zorro", "Sanji"]
>>> for index, name in enumerate(name_list):
...     print("{}. {}".format(index + 1, name))
1. Luffy
2. Zorro
3. Sanji

How can we get the position as well as the item without the enumerate() iterator?

We can use the indexing to get the position and the item in a container. Let's take a look.

Getting the position as well as the item is a common operation, which people coming from other programming languages often perform in the following way:

>>> name_list = ["Luffy", "Zorro", "Sanji"]
>>> for i in range(len(name_list)):
...     # Using index to iterate
...     print("{}. {}".format(i + 1, name_list[i]))
1. Luffy
2. Zorro
3. Sanji

We have seen that container objects such as dictionaries and sets don't support indexing but are iterable. We can loop over any iterable using a for loop. To get both the position and the value, the enumerate function is pretty useful.

>>> for index, name in enumerate({"Luffy", "Zorro", "Sanji"}):    # Set Enumeration
...     print("{}. {}".format(index + 1, name))
1. Sanji
2. Zorro
3. Luffy

We covered earlier how we can create our own iterators using generators. To understand how the enumerate iterator works underneath, let's write our own generator, enum, which mimics the enumerate function.

Can you try to guess how the enumerate iterator might work under the hood?

As enumerate is an iterator, we can imitate it with a generator function that uses the yield statement. We will also need to keep track of the item's position in a name, which we will increment after each yield. Let's take a look at how we can create our own enumerate iterator.

>>> def enum(iterable):
...     position = 0
...     for i in iterable:
...         yield (position, i)
...         position += 1

We can try to test our enum generator by iterating over a set.

>>> for index, name in enum({"Luffy", "Zorro", "Sanji"}):    # Using `enum` generator
...     print("{}. {}".format(index + 1, name))
1. Sanji
2. Zorro
3. Luffy

We can see that our generator function enum returns a result similar to that of the enumerate function.

The built-in enumerate function is implemented in the C programming language in CPython and is more optimized than our generator. But writing our own iterator that imitates the enumerate function gives us an idea of how such functionality can be built.
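
The built-in enumerate() also accepts an optional start argument, which avoids writing index + 1 by hand. The sketch below shows it alongside a version of our enum generator extended with a start parameter; that extension is an assumption made for this illustration, not something defined earlier.

```python
def enum(iterable, start=0):
    # Same idea as the enum generator above, with a hypothetical `start` parameter
    position = start
    for item in iterable:
        yield (position, item)
        position += 1


name_list = ["Luffy", "Zorro", "Sanji"]

# start=1 makes the count begin at 1 instead of 0
for position, name in enumerate(name_list, start=1):
    print("{}. {}".format(position, name))

from_builtin = list(enumerate(name_list, start=1))
from_generator = list(enum(name_list, start=1))
print(from_builtin == from_generator)   # True
```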

Next, we will take a look at the zip() iterator. Do you recall what zip iterator does?

Zip

Earlier, we encountered the built-in function zip(), which we can use to iterate through two or more iterables simultaneously.

>>> a, b, c = [1, 2, 3, 100, 200], [4, 5, 6], [7, 8, 9, 1000]
>>> for x, y, z in zip(a, b, c):
...     print(x, y, z)
1 4 7
2 5 8
3 6 9

The zip() function creates a zip object, which produces elements from two or more iterables for parallel iteration.

The zip function returns an iterator object whose __next__() method returns a tuple in which the i-th element comes from the i-th iterable argument. For instance, the first element of each tuple comes from the first iterable, the second comes from the second iterable, and so on.

The __next__() method continues until the shortest iterable in the argument sequence is exhausted. Then it raises the StopIteration exception.
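
Just as we imitated enumerate, the behaviour described above can be sketched as a generator function. The name my_zip() is hypothetical, and this is only an approximation of what the built-in zip() does, written to show the stop-at-the-shortest rule.

```python
def my_zip(*iterables):
    # Hypothetical stand-in for the built-in zip()
    iterators = []
    for iterable in iterables:
        iterators.append(iter(iterable))
    if not iterators:
        return                      # No arguments: nothing to yield
    while True:
        items = []
        for iterator in iterators:
            try:
                items.append(next(iterator))
            except StopIteration:
                return              # Shortest iterable exhausted: stop
        yield tuple(items)


a, b, c = [1, 2, 3, 100, 200], [4, 5, 6], [7, 8, 9, 1000]
zipped = list(my_zip(a, b, c))
print(zipped)   # [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
```

Inside a generator function, a plain return ends the iteration, and Python raises StopIteration for the caller on our behalf.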

You can use the zip_longest function in the itertools module if you wish to iterate through the longest iterable sequence. Let's see how it works.

>>> a, b, c = [1, 2, 3, 100, 200], [4, 5, 6], [7, 8, 9, 1000]
>>> from itertools import zip_longest
>>> for x, y, z in zip_longest(a, b, c):
...     print(x, y, z)
1 4 7
2 5 8
3 6 9
100 None 1000
200 None None

The zip_longest() function works similarly to the zip function. However, unlike zip, it is exhausted only when the longest iterable is exhausted. The missing values are, by default, filled in as None. You can specify your own value for missing items using the fillvalue argument.

>>> for x, y, z in zip_longest(a, b, c, fillvalue=0):
...     print(x, y, z)
1 4 7
2 5 8
3 6 9
100 0 1000
200 0 0

In the above code sample, we can see how the missing values corresponding to the shorter iterables are replaced by the fillvalue 0.

Which of the following numbers will not be present in the code's output below?

>>> from itertools import zip_longest
>>> for x, y in zip_longest(range(5), range(15), fillvalue=4):
...     print(x+y)
  1. 17
  2. 18
  3. 24
  4. 16

Map, Filter, and Reduce

In Python, you can code using object-oriented style programming or functional programming as well. We will cover both of these styles in-depth in later courses. For now, you should be aware these are different styles of programming.

Python provides three special functions that are widely used in functional programming style.

  • map
  • filter
  • reduce

Let's start with map() function.

Map

We can use the map() function to apply a function on all the elements of a specified iterable. The map() function returns a map object, which is an iterator.

The general syntax of the map function is the following.

map(function, iterable, ...)

To understand the map function, let's look into an example.

Figure 4: Map Iterator

Suppose we have a list of string objects, and we wish to get a corresponding list with the string's length. In Python, we can find the string's length using the built-in len() function. We can use the map function by applying the len() function to all the list elements.

>>> guest_names = ["James", "Joffery", "Jack", "Jimmy", "Jenny"]
>>> map(len, guest_names)
<map object at 0x7f05dda32ef0>
>>> list(map(len, guest_names))
[5, 7, 4, 5, 5]                # List of length of each string

Similarly, let's change the case of each of the strings to the uppercase. String objects have the upper() method, which changes all characters' cases in a string to uppercase.

To access the method, we can create another function to apply to the elements of the list.

>>> def to_upper_case(x):
...     return x.upper()
>>> list(map(to_upper_case, guest_names))
['JAMES', 'JOFFERY', 'JACK', 'JIMMY', 'JENNY']

Sometimes, in an actual program, we have to write one-time use functions such as the to_upper_case() function.

You might recall that Python provides a particular tool for creating a one-time use function. Can you guess which one?

We can create nameless or anonymous functions, called lambda functions, which are handy for one-time use. Let's write a lambda function to convert a string to upper case.

>>> list(map(lambda x: x.upper(), guest_names))
['JAMES', 'JOFFERY', 'JACK', 'JIMMY', 'JENNY']

A lambda function starts with the lambda keyword, followed by its arguments, a colon, and a single expression. In this example, we essentially created a function that applies the string method upper() to its string argument. As you can see, lambda functions can be pretty useful in such scenarios.

The function passed to the map() function can also take more than one argument. In that case, we pass one iterable per argument to the map() function; the iterables may differ in size, and map() stops as soon as the shortest one is exhausted.

We can create a short function add_numbers() to return the sum of the numbers.

>>> def add_numbers(x,y,z):
...     return x + y + z
>>> list(map(add_numbers, range(10, 20), range(20, 30), range(30, 40)))
[60, 63, 66, 69, 72, 75, 78, 81, 84, 87]

In the above map() call, there is a single function, add_numbers(), which accepts three arguments. These three arguments are supplied by the three iterables (range(10, 20), range(20, 30), range(30, 40)) passed as arguments to the map() function.

We can also create a concise version of the code above using the lambda function.

>>> list(map(lambda x, y, z: x+y+z, range(10, 20), range(20, 30), range(30, 40)))
[60, 63, 66, 69, 72, 75, 78, 81, 84, 87]
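
The iterables passed to map() do not need to have the same length. As a small sketch (the names prices and quantities are made up for illustration), map() stops as soon as the shortest iterable runs out:

```python
# Hypothetical data: three prices but only two quantities.
prices = [10, 20, 30]
quantities = [1, 2]

# map() pairs the items positionally and stops at the shortest iterable,
# so the third price is never used.
totals = list(map(lambda p, q: p * q, prices, quantities))
print(totals)   # [10, 40]
```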

Let's say we have a list of names.

>>> names = ["Luffy", "Sanji", "Zorro", "Chopper"]

We want to add "Pirate" to each name in the list above. Which of the following code will let us do that?

  1. list(map(lambda x: "Pirate " + x, names))
  2. list(map(lambda x, y: x, y, "Pirate", names))
  3. list(map(lambda x: names + x, "Pirate"))
  4. list(map(lambda x: names + x, names))

We can filter elements in an iterable in Python using the built-in filter() function. Let's take a look at the filter() function next.

Filter

The filter(f, iterable) function takes a function and an iterable as arguments. It returns an iterator that yields only those items of the iterable for which f(item) is true. If f is None, filter() yields the items that are themselves truthy.

Figure 5: Filter Iterator


For instance, we can create a list of numbers between 0 and 100, which are divisible by 7.

>>> list(filter(lambda x: x % 7 == 0, range(100)))
[0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98]

In the above code listing, we provide the lambda function lambda x: x % 7 == 0, which returns True or False based on whether a given number is divisible by 7. The filter() function keeps only the numbers for which the lambda function returns True, and list() collects them into a list.

Let's see what happens when we provide None instead of a function.

>>> a = [False, 1, 0, (), [], "", "Primerlabs"]
>>> list(filter(None, a))        # Convert the filter object to a list
[1, 'Primerlabs']

In the above code statement, we pass the None argument as function and the list object a. The filter function filters out all the items in the list a whose truth value is False.
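
Passing None behaves like passing the built-in bool as the function, since bool(item) gives the truth value of an item. A short sketch comparing the two calls:

```python
a = [False, 1, 0, (), [], "", "Primerlabs"]

with_none = list(filter(None, a))   # keep the items that are truthy
with_bool = list(filter(bool, a))   # the same test, spelled out

print(with_none)                # [1, 'Primerlabs']
print(with_none == with_bool)   # True
```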

What's the output of the following?

>>> words = ['glyph', 'flyby', 'hello', 'crypt', 'apple']
>>> vowels = set('aeiou')
>>> list(filter(lambda word: not bool(vowels.intersection(word)), words))
  1. ['glyph', 'flyby', 'hello', 'crypt', 'apple']
  2. ['glyph', 'flyby', 'crypt']
  3. ['hello', 'apple']
  4. []

Often you need to apply a function to an iterable and reduce it to a single cumulative value. Python offers the function reduce() to help you do that. Let's take a look next.

Reduce

We use the reduce(f, iterable) function to apply a function $f$ cumulatively to the elements of an iterable, reducing them to a single value. To use the reduce() function, we have to import it from the functools module. Let's understand the reduce() function using an example.

>>> from functools import reduce
>>> reduce(lambda x, y: x + y, range(5))
10

In the above code sample, the function lambda x, y: x + y is applied cumulatively to the elements of the iterable range(5), which are 0, 1, 2, 3, and 4. The reduce function works in the following way:

  • The first two elements, 0 and 1, are passed as arguments to the function f(0, 1), which results in 1.
  • The next element, 2 and the previous result, 1, are passed to the function f(1, 2), which results in 3.
  • The next element 3 and the previous result, 3, are passed to the function f(3, 3), which results in 6.
  • Finally, the last element 4 and the previous result 6 are passed to the function f(6, 4), which results in the 10.

We can understand the reduce iterator better using Figure 6.

Figure 6: Reduce Iterator



The reduce call above gives the sum of the elements of the iterable. To find the sum of the first $100$ natural numbers, we can use the following code listing.

>>> reduce(lambda x, y: x + y, range(101))        # Sum of first 100 numbers
5050
>>> sum(range(101))                                # Using `sum` function
5050

Generally, the reduce() function accepts a function and an iterable and returns a single value calculated as follows:

  1. Initially, Python calls the function with the first two items from the iterable and gets a partial result.
  2. Python calls the function again with the partial result obtained in step 1 and the next value in the iterable.
  3. Python repeats the process until there are no more items in the iterable.
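
The steps above can be sketched as a small function. The name my_reduce is made up for illustration, and this is a simplified stand-in rather than the actual functools implementation: it takes no initial value, and it raises StopIteration on an empty iterable where the real reduce() raises TypeError.

```python
from functools import reduce

def my_reduce(function, iterable):
    """Simplified sketch of reduce() without an initial value."""
    iterator = iter(iterable)
    result = next(iterator)              # start from the first item
    for item in iterator:                # then fold in the remaining items
        result = function(result, item)  # partial result so far
    return result

print(my_reduce(lambda x, y: x + y, range(5)))   # 10
print(reduce(lambda x, y: x + y, range(5)))      # 10
```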

The syntax of the reduce function is as follows:

reduce( function, iterable, initial ) -> value

When we provide the initial value, Python calls the function with the initial value and the first item in the iterable.
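
As a brief illustration of the initial argument, the sketch below starts the running sum at 100 and also shows why an initial value is useful with an empty iterable. The variable names are chosen only for this example.

```python
from functools import reduce

# With an initial value: ((100 + 1) + 2) + 3
total = reduce(lambda x, y: x + y, [1, 2, 3], 100)
print(total)         # 106

# With an initial value, an empty iterable simply returns that value.
empty_total = reduce(lambda x, y: x + y, [], 0)
print(empty_total)   # 0

# Without one, reduce() has nothing to start from and raises TypeError.
try:
    reduce(lambda x, y: x + y, [])
except TypeError as error:
    print("TypeError:", error)
```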

In Python 2, the reduce() function was available as a built-in function. In Python 3, it was moved to the functools module, because better optimised built-in functions, such as sum(), any(), all(), max(), min(), and len(), already cover most of the use cases that reduce() was earlier used for.

We will explore reduce() in more depth in an upcoming course. For now, let's take a popular use case of reduce(): finding the maximum or minimum value in an iterable.

Can you think of a way to write a program to find the maximum value of an integer in a list of integers?

One way is to start with the first two elements and store the greater value as a partial result. Then compare the partial result with the third element and store the greater value as the partial result. Then repeat the process until there are no more elements to compare. Let's take a look.

If you have been paying attention, you can see the similarities between the reduce function and the way of finding the max() value I described above. To use reduce(), we will first define our custom comparison function find_max(). We will add some print() function calls in the custom function to look at what the reduce() function is doing underneath.

>>> def find_max(a, b):
...     if a == b:
...         print(f"{a} is the same as {b}. Greatest value: {a}")
...         return a
...     greater, lesser = (a, b) if a > b else (b, a)
...     print(f"{lesser} is less than {greater}. "
...           f"Greatest value: {greater}")
...     return greater

Now, let's import reduce(), define a list of numbers, and use the reduce() function on the list of numbers with our custom defined find_max() function.

>>> from functools import reduce
>>> b = [10, 12, 45, 23, 23, -56, 43.50, 45, 5]
>>> reduce(find_max, b)
10 is less than 12. Greatest value: 12
12 is less than 45. Greatest value: 45
23 is less than 45. Greatest value: 45
23 is less than 45. Greatest value: 45
-56 is less than 45. Greatest value: 45
43.5 is less than 45. Greatest value: 45
45 is the same as 45. Greatest value: 45
5 is less than 45. Greatest value: 45
45		# Final result: Maximum value

The above code listing illustrates the idea behind the built-in max() function: keep the greatest value seen so far and compare it with each new element.

Hopefully, the reduce() function has started to make a bit of sense to you. We will cover functional programming tools such as map(), filter(), and reduce() in an upcoming course. So, for now, a basic idea of how these three work is quite adequate.

Before we move on next, can you rephrase the find_max() function in your own words?

We will continue looking at other iterators available in the next section.

Infinite Iterators

Infinite iterators are iterators that are never exhausted: they can generate new values indefinitely.

The itertools module provides the following infinite iterators:

  • count
  • cycle
  • repeat

Let's check these iterators in a bit of detail. We will begin with the count iterator.

Count

The count() function returns a count object, an iterator that returns consecutive, evenly spaced values each time it is passed to the next() function. The count() function has the following syntax:

count(start=0, step=1) -> count object iterator

For instance,

>>> from itertools import count
>>> for i in count(100, -5):
...     if i == 0:
...         break
...     print(i, end=" ")
100 95 90 85 80 75 70 65 60 55 50 45 40 35 30 25 20 15 10 5

Without the break statement in the above code, the for loop would run indefinitely.
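
An alternative to a break statement is to take a fixed number of values from the infinite iterator. The sketch below uses islice from the itertools module, which has not been covered in this chapter; islice(iterable, stop) yields at most stop items and then ends.

```python
from itertools import count, islice

# Take only the first five values of an otherwise endless countdown.
first_five = list(islice(count(100, -5), 5))
print(first_five)   # [100, 95, 90, 85, 80]
```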

The following generator function counter() is equivalent to the count infinite iterator. What are A, B, C?

>>> def counter(start = 0, step = 1):    # 0 1 2 3 4 5 ...
...     num = __A__
...     while True:
...         yield __B__
...         num += __C__
  1. A: start, B: step, C: num
  2. A: start, B: num, C: step
  3. A: num, B: step, C: step
  4. A: num, B: step, C: start

The count function is equivalent to the following generator function.

>>> def counter(start = 0, step = 1):    # 0 1 2 3 4 5 ...
...     num = start
...     while True:
...         yield num
...         num += step

We can create an infinite iterator that cycles through a finite iterable. The cycle iterator from the itertools module helps you do that. Let's take a look.

Cycle

The cycle(iterable) function accepts an iterable as an argument. It returns elements from the iterable while saving a copy of each element. When the iterable is exhausted, the function returns elements from the saved copy indefinitely.

For instance,

>>> from itertools import cycle
>>> s = ""
>>> for x in cycle('ABC'):        # A B C A B C A B C ...
...     s += x + " "
...     print(s)
...     if len(s) > 10:            # Breaking condition
...         break
A
A B
A B C
A B C A
A B C A B
A B C A B C

Without the break statement, the above code listing would run indefinitely. We can implement the cycle iterator ourselves as a generator function, my_cycle(), using a for loop and a while loop in the following way:

def my_cycle(iterable):
    saved = []
    for element in iterable:
        yield element
        saved.append(element)
    while saved:
        for element in saved:
            yield element

Our implementation of the cycle function, my_cycle(), works as follows:

  1. Initially, it returns elements from the iterable, one at a time, using the yield statement in a for loop.
  2. The for loop also saves each element into the list saved.
  3. Eventually, the for loop runs out of elements to iterate over.
  4. Then, my_cycle() returns elements from the saved list indefinitely, using a while loop.
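
The saved copy is what lets my_cycle() work even with a one-shot iterator such as a generator, which can only be consumed once. A minimal sketch of that behaviour follows; islice is used only to cut the endless stream short.

```python
from itertools import islice

def my_cycle(iterable):
    saved = []
    for element in iterable:      # first pass: yield and remember
        yield element
        saved.append(element)
    while saved:                  # later passes: replay the saved copy
        for element in saved:
            yield element

letters = (ch for ch in "ABC")    # a generator can be consumed only once
result = list(islice(my_cycle(letters), 7))
print(result)   # ['A', 'B', 'C', 'A', 'B', 'C', 'A']
```

Because of the while saved check, cycling over an empty iterable simply ends instead of looping forever.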

Here is a script that tracks a user's mood each day. What should __A__ be so that the loop breaks off after seven days of recording?

from itertools import cycle

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

for day in cycle(days):
    diary = []

    if len(diary) > 7:
        print("I don't want to know anymore")
        print(diary)
        __A__

    mood = input("What's your mood today? : ")
    diary.append({day, mood})
  1. continue
  2. return
  3. break
  4. The for loop won't break off

In the previous exercise, we re-initialize the name diary to be set to an empty list at the beginning of each iteration. Therefore, the for loop will never break.
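
One possible fix, sketched under a few assumptions, is to create the diary once, before the loop starts. To keep the sketch runnable without typing anything, a made-up stream of moods replaces the input() call. It also stores each entry as a tuple and uses >= 7 so that the loop stops after exactly seven entries; these are choices made for this sketch, not part of the original script.

```python
from itertools import cycle

days = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
# Stand-in for input(): a hypothetical, pre-recorded stream of moods.
moods = cycle(["happy", "tired", "calm"])

diary = []                    # created once, before the loop
for day in cycle(days):
    if len(diary) >= 7:       # seven entries recorded: stop
        break
    diary.append((day, next(moods)))

print(len(diary))   # 7
print(diary[-1])    # ('Sun', 'happy')
```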

The next infinite iterator from the itertools module is the repeat iterator. Let's take a look.

Repeat

The repeat(object, times) function accepts an object and an optional times argument. It returns the object over and over, times number of times. If we omit the times argument, it returns the object indefinitely.

>>> from itertools import repeat
>>> for x in repeat(10, 3):            # Return the object 3 times
...     print(x)
10
10
10

We can use a repeat iterator to provide a constant stream of values for map and zip iterators.

>>> list(map(lambda x, y: x*y, range(11), repeat(3)))
[0, 3, 6, 9, 12, 15, 18, 21, 24, 27, 30]

In the above code,

  • the function supplied to the map function accepts two arguments.
  • The first iterable range(11) provides the first argument while the repeat(3) provides the second argument for the map function.
  • As the repeat(3) object is an infinite iterator, the map function stops when the shorter iterable, range(11), is exhausted.
  • The resulting list has all the elements of range(11) multiplied by 3 as we specified by the lambda we provided.
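
The same pattern works with any two-argument function. As a small sketch, the built-in pow() can be paired with repeat(2) to square each number:

```python
from itertools import repeat

# pow(0, 2), pow(1, 2), pow(2, 2), ... until range(5) is exhausted
squares = list(map(pow, range(5), repeat(2)))
print(squares)   # [0, 1, 4, 9, 16]
```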

What's the value of __A__, for which we get the following code output?

>>> list(zip(__A__, repeat('Mississippi')))		# A
[(-3, 'Mississippi'), (-2, 'Mississippi'), (-1, 'Mississippi'), (0, 'Mississippi'), (1, 'Mississippi'), (2, 'Mississippi'), (3, 'Mississippi')]

  1. range(-3, 4)
  2. [-3, -2, -1, 0, 1, 2, 3, 4]
  3. range(0, 7) - 3
  4. range(-4, 3)

That brings us to the end of the section on infinite iterators. Next, we will look at a select group of iterators called Combinatoric Iterators.

Combinatoric Iterators

Combinatoric iterators generate sequences of arrangements or selections of the items in one or more iterables, subject to some constraints.

To understand these types of iterators, let's look into the combinatoric iterators provided by Python: Product, Permutations, and Combinations.

Product

In mathematics, the cartesian product of two sets A and B, denoted by A x B, is the set of all ordered pairs (a, b) where a is an element of A and b is an element of B.

Mathematically, we can represent the cartesian product of two sets, A and B, as follows.

$$
A \times B = \{(a, b) \mid a \in A \text{ and } b \in B \}
$$

One can similarly define the cartesian product of n sets, also known as n-fold cartesian product.

To illustrate the usefulness of Cartesian product, let's create a list from a deck of cards. To do that, we need to define the suits or the type of cards and the ranks for the cards.

>>> suits = {'♠', '♥', '♦', '♣'}                 # Set of Suits
>>> ranks = {'A', 'K', 'Q', 'J', '10', '9',     # Set of Ranks
             '8', '7', '6', '5', '4', '3', '2'}

To create a list of cards in the form of a tuple of the format (rank, suit), we can use a nested for loop.

>>> deck = []
>>> for suit in suits:
...     for rank in ranks:
...         deck.append((rank, suit))
>>> len(deck)
52

The itertools module provides a function product(*iterables, repeat=1) equivalent to the nested for-loop.

We can use the product function to rewrite our above code for generating the deck.

>>> from itertools import product
>>> card_deck = []
>>> for suit, rank in product(suits, ranks):
...     card_deck.append((rank, suit))
>>> len(card_deck)
52

The product function provides the cartesian product suits x ranks.

What's the output of the following code?

>>> from itertools import product
>>> males = ["Luffy", "Zorro", "Sanji"]
>>> females = ["Robin", "Nami", "Boa"]
>>> len(list(product(males, females)))
  1. 9
  2. 12
  3. 18
  4. 6

To calculate the product of an iterable with itself, we can provide an optional argument, repeat, to specify the number of repetitions.

For instance, product([1, 2, 3], repeat=2) means same as product([1, 2, 3], [1, 2, 3]).
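
We can check this equivalence directly. The variable names below are chosen only for this sketch:

```python
from itertools import product

with_repeat = list(product([1, 2, 3], repeat=2))
spelled_out = list(product([1, 2, 3], [1, 2, 3]))

print(with_repeat == spelled_out)   # True
print(len(with_repeat))             # 9
print(with_repeat[:4])              # [(1, 1), (1, 2), (1, 3), (2, 1)]
```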

Let's say we want to generate a list of coordinates, ranging in x, y, and z. We can use the product function to do that in the following way.

>>> coordinates = []
>>> for x, y, z in product(range(5), repeat=3):
...     coordinates.append((x, y, z))
>>> print(coordinates)
[(0, 0, 0), (0, 0, 1), ..., (4, 4, 2), (4, 4, 3), (4, 4, 4)]    # Shortened for brevity

As you can see, the product function is quite useful in generating the Cartesian product of sets. Can you think of a use case for the product function?

We usually use nested for loops to generate pairs of elements from two or more iterables. We can use the product function to generate them instead.

Next, let's look into another combinatoric iterator: Permutations.

Permutations

In mathematics, permutation refers to arranging the members of a set into a sequence or order.

To understand permutations better, let's take the example of three people running a race and list the possible orders in which they can finish, assuming no two people finish at the same time.

>>> persons = ["Luffy", "Zorro", "Sanji"]
>>> from itertools import permutations
>>> list(permutations(persons))
[('Luffy', 'Zorro', 'Sanji'), ('Luffy', 'Sanji', 'Zorro'), ('Zorro', 'Luffy', 'Sanji'), ('Zorro', 'Sanji', 'Luffy'), ('Sanji', 'Luffy', 'Zorro'), ('Sanji', 'Zorro', 'Luffy')]

In the above code listing, we generated a list of all the possible race results.

In each tuple, the first item is the runner who finished first, the second item is the runner who finished second, and the third item is the runner who finished last.

An arrangement of items in which the order matters is called a permutation.

The permutations function of the itertools module provides an iterator that returns all the possible arrangements of the items.

The permutations function has the signature permutations(iterable, r=None). We can provide the r argument to specify the length of each arrangement.

For example, suppose five people are running a race. In that case, we can list the possible outcomes for the first two positions by passing 2 as the r argument to the permutations function.

>>> persons = ["Luffy", "Zorro", "Sanji", "Usopp", "Chopper"]
>>> list(permutations(persons, 2))
[('Luffy', 'Zorro'), ('Luffy', 'Sanji'), ('Luffy', 'Usopp'), ('Luffy', 'Chopper'), ('Zorro', 'Luffy'), ('Zorro', 'Sanji'), ('Zorro', 'Usopp'), ('Zorro', 'Chopper'), ('Sanji', 'Luffy'), ('Sanji', 'Zorro'), ('Sanji', 'Usopp'), ('Sanji', 'Chopper'), ('Usopp', 'Luffy'), ('Usopp', 'Zorro'), ('Usopp', 'Sanji'), ('Usopp', 'Chopper'), ('Chopper', 'Luffy'), ('Chopper', 'Zorro'), ('Chopper', 'Sanji'), ('Chopper', 'Usopp')]
>>> len(list(permutations(persons, 2)))
20

Mathematically, the number of permutations of length r, without repetition, from a set of n elements is denoted by P(n, r).

$$
P(n, r) = \frac{n!}{(n-r)!}
$$

The ! denotes the factorial notation, which we covered in the previous chapter. We can check that the code is correct by substituting n = 5, the length of the list persons, and r = 2.

$$
P(5, 2) = \frac{5!}{(5-2)!} = \frac{5\times4\times3!}{3!} = 5\times4 = 20
$$
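
The same check can be done in code. The helper count_permutations below is a made-up name for this sketch; it applies the formula using math.factorial and compares the result with the number of tuples that permutations() actually produces.

```python
from itertools import permutations
from math import factorial

def count_permutations(n, r):
    """P(n, r) = n! / (n - r)!"""
    return factorial(n) // factorial(n - r)

persons = ["Luffy", "Zorro", "Sanji", "Usopp", "Chopper"]

print(count_permutations(5, 2))               # 20
print(len(list(permutations(persons, 2))))    # 20
```

In Python 3.8 and later, math.perm(n, r) computes the same value.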

What's the value of the following code output?

>>> len(list(permutations('Fire', 2)))
  1. 12
  2. 24
  3. 18
  4. 6

Combinations

In mathematics, a selection of elements in which the order doesn't matter is called a combination.

Suppose we need to choose some gifts from a list of gifts without repetition. Each possible selection of gifts is a combination.

To list the combinations, we can use a nested for loop or even the product function we learned earlier.

>>> from itertools import product
>>> gift_options = ['Primer Subscription', 'RC Car', 'Lego', 'Crayons']
>>> possible_gifts = []
>>> for gift1, gift2 in product(gift_options, repeat=2):
...     if gift1 != gift2 and (gift2, gift1) not in possible_gifts:
...         possible_gifts.append((gift1, gift2))
>>> possible_gifts
[('Primer Subscription', 'RC Car'), ('Primer Subscription', 'Lego'), ('Primer Subscription', 'Crayons'), ('RC Car', 'Lego'), ('RC Car', 'Crayons'), ('Lego', 'Crayons')]
>>> len(possible_gifts)
6

In the above code listing,

  • we defined our possible gift options in the list gift_options and created an empty list possible_gifts to store possible combinations.
  • Then, we used the product function to generate possible combinations of the iterable gift_options with itself by using the repeat argument.
  • We unpack the values returned by the product iterator in the names gift1 and gift2.
  • At each iteration, we check if gift1 and gift2 are not equal and have not been already added to the list possible_gifts.
  • If a set of gifts satisfy the conditions, we add them to the list.

Python provides the combinations(iterable, r) function in the itertools module to make the same task more manageable.

>>> from itertools import combinations
>>> possible_gifts = []                # Start again with an empty list
>>> for x, y in combinations(gift_options, 2):
...     possible_gifts.append((x, y))
>>> possible_gifts
[('Primer Subscription', 'RC Car'), ('Primer Subscription', 'Lego'), ('Primer Subscription', 'Crayons'), ('RC Car', 'Lego'), ('RC Car', 'Crayons'), ('Lego', 'Crayons')]
>>> len(possible_gifts)
6

Mathematically, we can represent the number of combinations as C(n, r), where n is the number of elements to choose from and r is the number of items to choose.

$$
C(n, r) = \frac{P(n ,r)}{r!} = \frac{n!}{r!(n-r)!}
$$

We can check if we got the correct result by substituting n = 4 and r = 2 in the above equation.

$$
C(4, 2) = \frac{4!}{2!(4-2)!} = \frac{4\times3\times2!}{2!2!} = 6
$$

We can see that we got the same result mathematically.
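
As a quick cross-check in code, the sketch below applies the formula with math.factorial and compares it with the number of pairs that combinations() generates. The helper name count_combinations is made up for this example.

```python
from itertools import combinations
from math import factorial

def count_combinations(n, r):
    """C(n, r) = n! / (r! * (n - r)!)"""
    return factorial(n) // (factorial(r) * factorial(n - r))

gift_options = ['Primer Subscription', 'RC Car', 'Lego', 'Crayons']

print(count_combinations(4, 2))                    # 6
print(len(list(combinations(gift_options, 2))))    # 6
```

In Python 3.8 and later, math.comb(n, r) computes the same value.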

The itertools module provides several other functions that return iterators useful for looping. You can read about these functions, and how they can be implemented, in the official Python documentation.

What's the output of the following?

>>> from itertools import combinations
>>> len(list(combinations('Luffy', 2)))
  1. 10
  2. 12
  3. 15
  4. 25

Why Iterators

So far, we have looked into the following:

  • What are iterators?
  • How can we create our iterators?
  • Which functions available in Python return iterators?

Now, let's try to understand why iterators are particularly useful.

Why don't you try to guess what makes iterators particularly useful?

Clean Code

One of the primary uses of iterators is that we can use them in a for loop. The Python syntax of a for statement is easier to read and understand than index-based looping techniques.

For instance, let's compare the two ways of looping in the following code.

persons = ["Luffy", "Zorro", "Sanji"]    # List of persons

# Index-based looping
for i in range(len(persons)):
    print(persons[i])        # Do something using `persons[i]`

# The recommended way of looping in Python
for person in persons:
    print(person)            # Do something with `person`

The second for loop in the above code is more readable, thus contributing to cleaner code. As we mentioned earlier, many container objects, such as sets and dictionaries, are iterables but don't support indexing. Iterators allow iteration over those objects in a painless way.

Iterators can generate infinite sequences

The other advantage of working with iterators is that they can generate sequences indefinitely. We saw earlier that functions such as count(), cycle(), and repeat() provide infinite iterators. Being able to work with such infinite sequences can be an advantage in many programming scenarios.

Saving Resources by Lazy Evaluation

Another advantage of using iterators is that they save memory. When we create a list of some objects, Python creates a list that contains all the resulting data.

With an iterator or generator, on the other hand, Python creates an object that knows how to produce the data on demand. For small containers, there is not much visible difference in execution speed or memory usage. However, for massive data, iterators save both time and memory.
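
A rough way to see the difference is to compare the size of a list with the size of a lazy map object that produces the same values. This is only a sketch: sys.getsizeof() reports the size of the container object itself, not of the items it refers to, and the exact numbers vary between Python versions, so no sizes are shown here.

```python
import sys

# A list stores a reference to every result up front.
doubled_list = list(map(lambda n: n * 2, range(100_000)))

# A map object only remembers how to produce the next result.
doubled_lazy = map(lambda n: n * 2, range(100_000))

list_size = sys.getsizeof(doubled_list)
lazy_size = sys.getsizeof(doubled_lazy)

print(list_size)               # hundreds of kilobytes
print(lazy_size)               # a few dozen bytes
print(lazy_size < list_size)   # True
```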

We can better understand the concept by checking out the range() function in more detail. Let's do a small exercise first.

What's the output of the following code?

>>> range(10)
range(0, 10)
>>> type(range(10))
  1. <class 'dict'>
  2. <class 'range_iterator'>
  3. <class 'tuple'>
  4. <class 'range'>

When we invoke the range function using valid arguments, it returns a range object. It has some interesting properties, so let's learn more about the range object.

The Curious Case of Range Iterable

Let's create a range object using the range() function.

>>> range(10)
range(0, 10)
>>> type(range(10))
<class 'range'>            # Range Object

When we invoke the range function, it returns a range object. You can convert the range object into a list by passing it to the list constructor.

>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

However, the range(0, 10) object doesn't create the full list; instead, it knows how to produce each number of the sequence on demand.

It is especially useful when you work with large numbers. For example, we can define a range object that can generate many numbers, as shown below.

>>> range(10**30)                # 10^18 trillion
range(0, 1000000000000000000000000000000)

It allows us to use the range object in the iteration.

>>> for x in range(10**30):
...     if x == 1000:
...         print("Reached 1000. Stopping now.")
...         break
Reached 1000. Stopping now.

In the above code listing, we iterated over only the first 1001 values of the range (0 to 1000). Had Python created the entire list up front, the memory for the rest of the values would have gone to waste.

Interestingly, the range objects are lazy iterables but not iterators. We can check this by calling the next() function on a range object.

>>> a = range(10)
>>> next(a)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'range' object is not an iterator

When a range object is passed to the iter() function, it returns a range_iterator object.

>>> a = range(10)
>>> b = iter(a)
>>> next(b)
0

We can also get the length of the range using the len() function.

>>> a = range(10)
>>> len(a)
10
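
Besides len(), a range object supports other sequence operations such as indexing, slicing, and membership tests, all without building a list. A short sketch:

```python
big = range(10**30)

print(big[5])              # 5
print(big[-1])             # 999999999999999999999999999999
print(10**29 in big)       # True, computed arithmetically
print(range(10)[2:5])      # range(2, 5): slicing gives another range
```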

Whenever you are in doubt about whether a function returns an iterator, call the next() function on the returned object: if it is not an iterator, Python raises a TypeError.
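
Calling next() consumes one item from a real iterator. If that matters, another option is to test the object against the Iterator abstract base class from collections.abc. The helper name is_iterator below is made up for this sketch.

```python
from collections.abc import Iterator

def is_iterator(obj):
    """Hypothetical helper: True if obj has both __iter__ and __next__."""
    return isinstance(obj, Iterator)

print(is_iterator(range(10)))           # False: a lazy iterable, not an iterator
print(is_iterator(iter(range(10))))     # True
print(is_iterator(map(str, [1, 2])))    # True
print(is_iterator([1, 2, 3]))           # False
```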

So far, we have covered the Iterators and Generators in detail. In the next section, we will look into constructing container objects such as lists, sets, and dictionaries.

Comprehensions

A standard operation in programming is to apply a function to every item of a list, creating a new list object. For instance, given a list of numbers, we may want to create a new list containing the cube of each number.

How will you proceed to solve this?

We can do this by using a for loop or using the map function. Let's take a look.

Let's say we have a list of integers with the name nums.

>>> nums = [1, 4, 5, 6, 7, 8, 9, 21]

We can create a for loop and append each integer cube to another list.

>>> cubes = []
>>> for num in nums:
...     cubes.append(num**3)
>>> cubes
[1, 64, 125, 216, 343, 512, 729, 9261]

We can also do the same using a map function in the following way.

>>> list(map(lambda x : x**3, nums))
[1, 64, 125, 216, 343, 512, 729, 9261]

As you can see, both the for loop and the map() function give the same result. Which of the two methods do you like better, and why?

Applying a function and creating new lists is so common in programming that Python provides a special syntax for constructing lists. It is called list comprehension. Let's look into list comprehensions in the next section.

List Comprehension

List comprehensions are a way of transforming an iterable into a list. During the transformation of the iterable, you can choose which element to keep and transform by subjecting them to certain conditions.

Before diving deep into how list comprehensions work, let's go right ahead and use a list comprehension to re-create the list of cubes that we built earlier using the for loop and the map function.

>>> nums = [1, 4, 5, 6, 7, 8, 9, 21]
>>> [num**3 for num in nums]            # List Comprehension
[1, 64, 125, 216, 343, 512, 729, 9261]

In the above code, we use list comprehension to create the required list. In Python, list comprehensions construct a list object using the following general syntax.

[ f(x) for x in iterable if condition ]

Figure 7: List Comprehensions

Every comprehension is characterised by three features:

  • Transformation using f(x)
  • Iteration using for x in iterable
  • An optional condition using an if clause

So, for the list comprehension in our code above,

  • transformation: f(num) = num**3
  • iteration: for num in nums
  • condition: N/A

The syntax of list comprehensions is pretty much similar to that of generator expressions. The difference between generator expressions and list comprehensions is that list comprehensions create a list object with the values. In contrast, generator expressions return a generator object that is capable of generating the value.
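
The difference is easy to see side by side. In the sketch below, the only change in syntax is the type of bracket, and the variable names are chosen just for this example:

```python
nums = [1, 4, 5]

cubes_list = [num**3 for num in nums]   # list: all values built immediately
cubes_gen = (num**3 for num in nums)    # generator: values produced on demand

print(type(cubes_list).__name__)   # list
print(type(cubes_gen).__name__)    # generator
print(list(cubes_gen))             # [1, 64, 125]
print(list(cubes_gen))             # [] because the generator is exhausted
```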

What is the output of the following code?

>>> [num + 2 for num in range(1, 9) if num % 2 == 0 ]
  1. [4, 6, 8, 10]
  2. [3, 5, 7, 9]
  3. [2, 3, 4, 5]
  4. [1, 2, 3, 4]

Let's explore list comprehensions using another example. Earlier, we used the filter function to find all the multiples of 7 between 0 and 100.

>>> list(filter(lambda x: x % 7 == 0, range(100)))
[0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98]

Let's perform the same operation using list comprehensions.

>>> [num for num in range(100) if num % 7 == 0]
[0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98]

In the above list comprehension,

  • transformation: f(num) = num
  • iteration: for num in range(100)
  • condition: if num % 7 == 0

Every list comprehension can be rewritten as a for loop, but not every for loop can be written as a list comprehension.

Take a look at the following code.

>>> names = ["Luffy","Zorro", "Siddharth", "Sanji", "Sid"]
>>> [f"{name} is invited" for name in names if "Sid" not in name ]

Which of the following is not invited?

  1. Siddharth
  2. Luffy
  3. Zorro
  4. Sanji

There are times when a list comprehension is the easier way to do things, while at other times it makes sense to use something else. In the next section, let's check when it is a good idea to create a list using a list comprehension.

When to use List Comprehensions

Whenever you need to construct a new list from the transformed items of an iterable, optionally subject to some condition, you can think about using a list comprehension to get the job done.

We can see the general syntax of this type of requirement in the following way.

new_list = []
for item in iterable:
    if condition :
        # trans_func is defined elsewhere
        new_list.append(trans_func(item))

We can rewrite the above for loop to a list comprehension in the following way.

new_list = [trans_func(item) for item in iterable if condition]

Below is a Python code written using the for loop.

>>> names = ["lUFFY", "sANJI", "zORRO", "sIDDHARTH", "sID"]
>>> invited_guests = []
>>> for name in names:
...     if "sID" not in name:
...             invited_guests.append(name.title())
>>> invited_guests
['Luffy', 'Sanji', 'Zorro']

What's the equivalent code using list comprehensions?

  1. [name.title() for name in names if "sId" not in name ]
  2. [name.title() for name in names if "sID" not in name ]
  3. [name.upper() for name in names if "sID" not in names ]
  4. [name.title() for names in name if "sID" not in names ]

As you can see, a list comprehension is often a good replacement for a single for loop that builds a list. List comprehensions also work for nested loops, which we will look at in the next section.

Nested Loops

Earlier, we created a deck of cards using a nested loop and the product function of the itertools module.

>>> suits = {'♠', '♥', '♦', '♣'}                 # Set of Suits
>>> ranks = {'A', 'K', 'Q', 'J', '10', '9',     # Set of Ranks
             '8', '7', '6', '5', '4', '3', '2'}
>>> deck = []

# Creating a deck of cards using `for`
>>> for suit in suits:
...     for rank in ranks:
...         deck.append((rank, suit))

We can recreate the same in list comprehensions in the following way using the product function:

# Using product function
>>> deck = [(rank, suit) for (suit, rank) in product(suits, ranks)]

As you can see, list comprehension is relatively compact using the product function.

We saw that we could use nested for loops to generate card decks.

>>> suits = {'♠', '♥', '♦', '♣'}                  # Set of Suits
>>> ranks = {'A', 'K', 'Q', 'J', '10', '9',       # Set of Ranks
             '8', '7', '6', '5', '4', '3', '2'}
>>> deck = []
>>> for suit in suits:                            # Creating a deck of cards using `for`
...     for rank in ranks:
...         deck.append((rank, suit))

Which of the following is the equivalent code using a list comprehension?

  1. [(rank, suit) for rank, suit in (suits, ranks)]
  2. [(rank, suit) for suit in suits for rank in ranks]
  3. [(ranks, suits) for suits in suits for ranks in ranks]
  4. [(rank, suit) for suit, rank in suits, ranks]

We can write equivalent nested for loops by writing the for clauses one after another in a list comprehension. In the above exercise, swapping the two for clauses changes only the order in which the pairs are produced, not which pairs appear, because suits and ranks are two independent iterables. When an inner for clause iterates over the variable of an outer one, as with nested sequences, the order of the for clauses matters.
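
A small sketch makes the effect of the clause order visible. It uses short lists instead of sets so that the output order is predictable; the names suit_outer and rank_outer are made up for this example.

```python
suits = ['♠', '♥']
ranks = ['A', 'K']

# The first for clause is the outer loop, the second is the inner loop.
suit_outer = [(rank, suit) for suit in suits for rank in ranks]
rank_outer = [(rank, suit) for rank in ranks for suit in suits]

print(suit_outer)   # [('A', '♠'), ('K', '♠'), ('A', '♥'), ('K', '♥')]
print(rank_outer)   # [('A', '♠'), ('A', '♥'), ('K', '♠'), ('K', '♥')]
print(sorted(suit_outer) == sorted(rank_outer))   # True: same pairs
```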

To understand, let's take a nested list of guests.

>>> hogwarts, westeros = ["Harry", "Ron", "Hermoine"], ["Tyrion", "Daenerys", "Jon"]
>>> enterprise = ["Picard", "Data", "Spock"]
>>> invited_guests = [hogwarts, westeros, enterprise]
>>> unwanted_guests = ['Ron', 'Jon', 'Data']
>>> invited_guests
[['Harry', 'Ron', 'Hermoine'], ['Tyrion', 'Daenerys', 'Jon'], ['Picard', 'Data', 'Spock']]

In the above code listing, invited_guests is a list of lists, while the unwanted_guests list contains the guests we don't want to invite. We now want to flatten invited_guests, that is, to build a single list from the items of the nested lists, while filtering out the unwanted_guests.

To do that, let's implement a nested for loop.

>>> filtered_guests = []
>>> for nested_guests in invited_guests:
        for guest in nested_guests:
            # To remove unwanted_guests
            if guest not in unwanted_guests:
                filtered_guests.append(guest)
>>> filtered_guests
['Harry', 'Hermoine', 'Tyrion', 'Daenerys', 'Picard', 'Spock']

We can rewrite the above nested for loop using list comprehensions.

>>> filtered_guests = [guest for nested_guests in invited_guests for guest in nested_guests if guest not in unwanted_guests]
>>> filtered_guests
['Harry', 'Hermoine', 'Tyrion', 'Daenerys', 'Picard', 'Spock']

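In the above code listing, the ordering of the for clauses matters, because the second for clause iterates over nested_guests, which is bound by the first. The clauses of a nested list comprehension are written in the following order: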

  • The outermost for clause comes first, i.e., for nested_guests in invited_guests.
  • The condition associated with the outermost for clause comes next; there is none in our case.
  • The next inner for clause follows, i.e., for guest in nested_guests.
  • The condition associated with that for clause follows, if any, i.e., if guest not in unwanted_guests.
  • The previous two steps are repeated until there aren't any more for loops.
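To see why this ordering matters, here is a minimal sketch with made-up lists (teams and groups are illustrative names). Writing the for clauses in the reverse order uses a name before any loop has bound it.

```python
teams = [["Harry", "Ron"], ["Tyrion", "Jon"]]

# Correct order: the outer loop binds `team` before the inner loop uses it.
flat = [member for team in teams for member in team]
print(flat)   # ['Harry', 'Ron', 'Tyrion', 'Jon']

# Reversed order: `group` is used before any loop has bound it.
groups = [["Picard", "Data"]]
try:
    [member for member in group for group in groups]
    reversed_order_failed = False
except NameError:
    reversed_order_failed = True

print(reversed_order_failed)   # True
```

If a variable with that name happens to be left over from earlier code, the reversed form may run without an error and produce a wrong result instead, which is harder to notice.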

We can also rewrite the nested list comprehension over multiple lines to make it more readable.

>>> filtered_guests = [
...     guest
...     for nested_guests in invited_guests
...     for guest in nested_guests
...     if guest not in unwanted_guests
... ]

The general syntax for writing a bunch of nested for loops in a list comprehension is as follows:

[ f(z)
    for x in iterable1
    if condition1
    for y in x
    if condition2
    for z in y
    if condition3
    ...
]
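As an illustration of this pattern, the sketch below uses two loops with one condition each. The nested list grid and both conditions are made up for the example.

```python
# A made-up nested list: rows of numbers, some of them short or empty.
grid = [[1, 2, 3], [], [4, 5, 6], [7, 8]]

even_squares = [
    value ** 2            # transformation
    for row in grid       # outer loop
    if len(row) > 2       # condition on the outer loop: rows with 3+ items
    for value in row      # inner loop
    if value % 2 == 0     # condition on the inner loop: even values only
]
print(even_squares)   # [4, 16, 36]

# The same logic written as nested for loops, clause by clause.
expected = []
for row in grid:
    if len(row) > 2:
        for value in row:
            if value % 2 == 0:
                expected.append(value ** 2)

print(even_squares == expected)   # True
```

Reading the comprehension from top to bottom gives the same nesting as reading the for loops from the outside in.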

We have now covered the basic ideas behind list comprehensions. List comprehensions can seem like a wonderful idea, and you might be tempted to use them everywhere, which can result in messy, unreadable code.

So exercise a bit of restraint in your new love of list comprehensions.

Let's do a small exercise.

We have the following for loop, which collects the names of the persons who are residents.

>>> persons = [{"name": "John", "resident": True},
...            {"name": "Devi", "resident": False},
...            {"name": "Kelly", "resident": True},
...            {"name": "Mary", "resident": False},
...            {"name": "Ravi", "resident": False}]
>>> resident_names = []
>>> for person in persons:
...     if person["resident"]:
...         resident_names.append(person["name"])
>>> resident_names
['John', 'Kelly']

Which of the following is the equivalent list comprehension?

  1. [person["name"] for person in persons if person["resident"]]
  2. [person["resident"] for person in persons]
  3. [person["resident"] for person in persons if person["name"]]
  4. [person["name"] for person in persons]

In Python, we can also use similar comprehensions to create set objects. We will look into them in the next section.

Set Comprehensions

Set comprehensions work similarly to list comprehensions. The only difference is that we wrap them in curly braces {} instead of square brackets [].

The general syntax for set comprehension is as follows.

{ f(x) for x in iterable if condition }

Let's create a set of the squares of a range of numbers.

>>> {num**2 for num in range(10)}
{0, 1, 64, 4, 36, 9, 16, 49, 81, 25}

The above code listing creates a set of the squares of the numbers in range(10). Note that the items are not displayed in any particular order, since sets are unordered. Let's do a small exercise.

What is the output of the following code?

>>> {abs(num) for num in range(-5, 5)}

  1. {0, 1, 2, 3, 4, 5}
  2. {0, 1, 2, 3, 4, -1, -5, -4, -3, -2}
  3. {0, 1, 2, 3, -1, -5, -4, -3, -2}
  4. {0, 1, 2, 3, 4, 5, -1, -4, -3, -2}

The above exercise will make sense if you recall that a set can only contain distinct objects.
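To see this in a different setting, the sketch below runs the same comprehension once with square brackets and once with curly braces. The list words is made up for illustration.

```python
words = ["set", "list", "dict", "tuple", "map"]

# A list comprehension keeps every result, duplicates included.
lengths_list = [len(word) for word in words]

# A set comprehension keeps each distinct result only once.
lengths_set = {len(word) for word in words}

print(lengths_list)          # [3, 4, 4, 5, 3]
print(sorted(lengths_set))   # [3, 4, 5]
```

We sort the set before printing only to get a predictable display order.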

Next, let's try to create a set of the primes among the first $100$ natural numbers using a set comprehension. We had earlier created such a set by using a for loop.

>>> prime_numbers = set()
>>> for num in range(2, 101):
        for x in range(2, num):
            # Any divisors between 2 and the number?
            if num % x == 0:
                break
        else:
            # No divisors found, must be a prime
            prime_numbers.add(num)
>>> prime_numbers
{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97}

We can write the same thing in a set comprehension.

>>> {num for num in range(2, 101) if all( num % y for y in range(2, num) )}
{2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97}

The code might seem a bit confusing, so let's break it down.

  • transformation f(num) = num
  • iteration for num in range(2, 101)
  • condition if all(num % y for y in range(2, num))

Let's further break down the condition part as well:

  • The condition part consists of a generator expression (num % y for y in range(2, num)).
  • It computes the remainder of dividing the given number by every number from 2 up to, but not including, the number itself.
  • We can use the built-in all() function on the generator expression, as generator expressions are iterables.
  • The all() function returns False if there is any falsy value, such as 0, present in the iterable.
  • If the number has a divisor in that range, the iterable will contain a 0, making the all() function return False.
  • Therefore, only those numbers that don't have divisors in the given range will be included in the set.

The above code for generating primes using a set comprehension is a bit complex. Why don't you try to write what you think the above code is doing in your own words?
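If it helps, the condition can be pulled out into a small helper and tried on a few numbers. This is only a sketch; the function name has_no_divisors is made up for illustration.

```python
def has_no_divisors(num):
    """Return True if no y in range(2, num) divides num evenly."""
    return all(num % y for y in range(2, num))

# 9 is divisible by 3, so one of the remainders is 0 and all() is False.
print([9 % y for y in range(2, 9)])   # [1, 0, 1, 4, 3, 2, 1]
print(has_no_divisors(9))             # False

# 7 has no divisors in range(2, 7), so every remainder is non-zero.
print([7 % y for y in range(2, 7)])   # [1, 1, 3, 2, 1]
print(has_no_divisors(7))             # True

# For 2, range(2, 2) is empty, and all() of an empty iterable is True.
print(list(range(2, 2)))              # []
print(has_no_divisors(2))             # True
```

The last case explains why 2 is part of the result even though there is nothing to check for it.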

We can also use comprehension to construct a dictionary. Let's look into it in the next section.

Dictionary Comprehensions

Dictionary comprehensions have the following syntax.

{f(key):f(value) for key, value in iterable(key,value) if condition }

The parts of a dictionary comprehension are:

  • transformation f(key): f(value)
  • iteration for key, value in iterable(key, value)
  • condition if condition

The iteration part usually uses an iterable whose items can be unpacked into two or more names, although this is not required.

Let's create a dictionary to illustrate.

>>> squares = [(x, x**2) for x in range(10)]                # Create a list of 2-tuples
>>> squares_dict = {key: value for key, value in squares}   # Squares dict
>>> squares_dict
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}

In the above code, we created an iterable squares consisting of tuples (x, x**2) using a list comprehension. We then provided it to our dictionary comprehension. Note that, since this comprehension neither transforms nor filters the pairs, it gives the same result as directly passing the iterable squares to the dict constructor.

>>> dict(squares)
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36, 7: 49, 8: 64, 9: 81}
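A dictionary comprehension becomes more useful than the dict constructor once we transform or filter the items. The sketch below shows one such case, and also a comprehension that iterates over single values rather than pairs. The names roots_of_even_squares and cubes are illustrative.

```python
squares = [(x, x**2) for x in range(10)]

# Swap keys and values, keeping only the pairs whose square is even.
roots_of_even_squares = {value: key for key, value in squares if value % 2 == 0}
print(roots_of_even_squares)   # {0: 0, 4: 2, 16: 4, 36: 6, 64: 8}

# The iterable doesn't have to contain pairs: here we iterate over single
# numbers and compute both the key and the value from each one.
cubes = {x: x**3 for x in range(4)}
print(cubes)   # {0: 0, 1: 1, 2: 8, 3: 27}
```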

Time for an exercise.

We have the following code.

>>> males = ["Luffy", "Zorro", "Sanji"]
>>> females = ["Boa", "Robin", "Nami"]
>>> pairing = dict(list(zip(males,females)))
>>> pairing["Robin"]

What's the output of the last line?

  1. Zorro
  2. Luffy
  3. Sanji
  4. Raises KeyError

Now, let's write a slightly more complicated dictionary comprehension. Let's calculate the number of occurrences of each vowel in a given text.

Let's say we have the following text.

>>> text = """Once upon a midnight dreary, while I pondered, weak and weary,
Over many a quaint and curious volume of forgotten lore,
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door.
"""

We can count each vowel in the following way:

>>> vowels = {'a', 'e', 'i', 'o', 'u'}
>>> vowels_count = {vowel: text.count(vowel) for vowel in vowels}
>>> vowels_count
{'o': 14, 'e': 22, 'a': 18, 'i': 10, 'u': 6}

In the above code listing, we used the string method count to count the occurrences of each vowel in the text. Note that count is case-sensitive, so uppercase vowels such as the 'I' in the text are not counted. Why don't you rephrase the code in your own words?
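One thing to keep in mind is that str.count is case-sensitive. The sketch below is a possible variation, not part of the listing above: it lowercases the text first so that uppercase vowels are counted too. It uses only the last line of the text to keep the numbers easy to verify by hand.

```python
line = "As of some one gently rapping, rapping at my chamber door."
vowels = "aeiou"   # a string, so that the keys appear in a predictable order

# Counting as-is: the uppercase 'A' in "As" is not counted.
case_sensitive = {vowel: line.count(vowel) for vowel in vowels}

# Lowercasing first: the 'A' in "As" is now counted as an 'a'.
case_insensitive = {vowel: line.lower().count(vowel) for vowel in vowels}

print(case_sensitive)     # {'a': 4, 'e': 4, 'i': 2, 'o': 5, 'u': 0}
print(case_insensitive)   # {'a': 5, 'e': 4, 'i': 2, 'o': 5, 'u': 0}
```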

Let's take another example where we use a condition in the dictionary comprehension. This time we will deal with stock data.

Suppose we have a bunch of stock data, and we want to extract only a subset of it.

Date        Open   High   Low    Close
2013-02-08  15.07  15.12  14.63  14.75
2013-02-21  13.62  13.95  12.9   13.37
2013-03-05  14.01  14.05  13.71  14.05
2013-03-15  16.45  16.54  15.88  15.98
2013-03-27  16.48  16.77  16.33  16.65
2013-04-09  16.07  16.1   15.67  15.7

We represent the data as a Python dictionary.

>>> stocks = {
'2013-02-08': {'open': 15.07, 'high': 15.12, 'low': 14.63, 'close': 14.75},
'2013-02-21': {'open': 13.62, 'high': 13.95, 'low': 12.9, 'close': 13.37},
'2013-03-05': {'open': 14.01, 'high': 14.05, 'low': 13.71, 'close': 14.05},
'2013-03-15': {'open': 16.45, 'high': 16.54, 'low': 15.88, 'close': 15.98},
'2013-03-27': {'open': 16.48, 'high': 16.77, 'low': 16.33, 'close': 16.65},
'2013-04-09': {'open': 16.07, 'high': 16.1, 'low': 15.67, 'close': 15.7}
}

What we want is the following:

  • only the dates on which the stock closed at a higher price than the price at which it opened
  • only the opening and closing prices for those dates

Let's do this using a dictionary comprehension.

>>> filtered_data = {
    # Transformation
    date: {"open" : stock["open"], "close" : stock["close"]}
    for date, stock in stocks.items()  # Iteration
    if stock["open"] < stock["close"]  # Condition
}
>>> filtered_data
{'2013-03-05': {'open': 14.01, 'close': 14.05},
 '2013-03-27': {'open': 16.48, 'close': 16.65}}

As you can see, dictionary comprehensions make it easier to transform and filter dictionaries. Write about the dictionary comprehension in your own words.
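For comparison, here is a sketch of the same filtering written as a plain for loop, next to the comprehension. It uses only three of the rows from the stock data to stay short, and the names filtered_loop and filtered_comprehension are illustrative.

```python
# A three-row subset of the stock data, to keep the example short.
stocks = {
    '2013-02-08': {'open': 15.07, 'high': 15.12, 'low': 14.63, 'close': 14.75},
    '2013-03-05': {'open': 14.01, 'high': 14.05, 'low': 13.71, 'close': 14.05},
    '2013-03-27': {'open': 16.48, 'high': 16.77, 'low': 16.33, 'close': 16.65},
}

# The for loop version: filter first, then build the smaller inner dictionary.
filtered_loop = {}
for date, stock in stocks.items():
    if stock["open"] < stock["close"]:
        filtered_loop[date] = {"open": stock["open"], "close": stock["close"]}

# The dictionary comprehension version of the same logic.
filtered_comprehension = {
    date: {"open": stock["open"], "close": stock["close"]}
    for date, stock in stocks.items()
    if stock["open"] < stock["close"]
}

print(filtered_loop == filtered_comprehension)   # True
print(list(filtered_loop))   # ['2013-03-05', '2013-03-27']
```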

Constructing Tuples

So far, we have checked out the list, set, and dictionary comprehensions. You might be wondering if there is a similar way of constructing tuples. The answer is that Python doesn't provide a tuple comprehension syntax. However, you can create a tuple by passing a generator expression to the tuple() constructor. Let's take a look.

Suppose we want to create a tuple of multiples of $7$ within the first $100$ natural numbers. We can do this by creating a generator expression and passing it to the tuple constructor.

>>> tuple(num for num in range(101) if num % 7 == 0)
(0, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70, 77, 84, 91, 98)

As you might remember, a generator expression doesn't require its own parentheses when we pass it as the only argument to a function. This way of constructing tuples is as close to a tuple comprehension as you can get in Python.
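A common surprise is that wrapping the expression in parentheses alone does not give a tuple. The short sketch below checks this, using a smaller range than the example above; the names lazy and multiples are illustrative.

```python
from types import GeneratorType

# Parentheses alone create a generator object, not a tuple.
lazy = (num for num in range(30) if num % 7 == 0)
print(isinstance(lazy, GeneratorType))   # True
print(isinstance(lazy, tuple))           # False

# The tuple() constructor consumes the generator and builds the tuple.
multiples = tuple(lazy)
print(multiples)     # (0, 7, 14, 21, 28)

# The generator is now exhausted, so a second call gives an empty tuple.
print(tuple(lazy))   # ()
```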

In this chapter, we covered iterators, generators, and comprehensions in detail. In the next chapter, we will look into getting data into and out of a Python program.