Being lazy is not always a bad thing. Every line of code you write has at least one expression that Python needs to evaluate. Python lazy evaluation is when Python takes the lazy option and delays working out the value returned by an expression until that value is needed.
An expression in Python is a unit of code that evaluates to a value. Examples of expressions include object names, function calls, expressions with arithmetic operators, literals that create built-in object types such as lists, and more. However, not all statements are expressions. For example, if statements and for loop statements don't return a value.
Python needs to evaluate every expression it encounters to use its value. In this tutorial, you'll learn about the different ways Python evaluates these expressions. You'll understand why some expressions are evaluated immediately, while others are evaluated later in the program's execution. So, what's lazy evaluation in Python?
Get Your Code: Click here to download the free sample code that shows you how to use lazy evaluation in Python.
Take the Quiz: Test your knowledge with our interactive "What's Lazy Evaluation in Python?" quiz. You'll receive a score upon completion to help you track your learning progress:
In Short: Python Lazy Evaluation Generates Objects Only When Needed
An expression evaluates to a value. However, you can divide how Python evaluates expressions into two types:
- Eager evaluation
- Lazy evaluation
Eager evaluation refers to those cases when Python evaluates an expression as soon as it encounters it. Here are some examples of expressions that are evaluated eagerly:
 1 >>> 5 + 10
 2 15
 3
 4 >>> import random
 5 >>> random.randint(1, 10)
 6 4
 7
 8 >>> [2, 4, 6, 8, 10]
 9 [2, 4, 6, 8, 10]
10 >>> numbers = [2, 4, 6, 8, 10]
11 >>> numbers
12 [2, 4, 6, 8, 10]
Interactive environments, such as the standard Python REPL used in this example, display the value of an expression when the line only contains the expression. This code section shows a few examples of statements and expressions:
- Lines 1 and 2: The first example includes the addition operator +, which Python evaluates as soon as it encounters it. The REPL shows the value 15.
- Lines 4 to 6: The second example includes two lines:
  - The import statement includes the keyword import followed by the name of a module. The module name random is evaluated eagerly.
  - The function call random.randint() is evaluated eagerly, and its value is returned immediately. All standard functions are evaluated eagerly. You'll learn about generator functions later, which behave differently.
- Lines 8 to 12: The final example has three lines of code:
  - The literal to create a list is an expression that's evaluated eagerly. This expression contains several integer literals, which are themselves expressions evaluated immediately.
  - The assignment statement assigns the object created by the list literal to the name numbers. This statement is not an expression and doesn't return a value. However, it includes the list literal on the right-hand side, which is an expression that's evaluated eagerly.
  - The final line contains the name numbers, which is eagerly evaluated to return the list object.
The list you create in the final example is created in full when you define it. Python needs to allocate memory for the list and all its elements. This memory won't be freed as long as the list exists in your program. The memory allocation in this example is small and won't impact the program. However, larger objects require more memory, which can cause performance issues.
Lazy evaluation refers to cases when Python doesn't work out the value of an expression immediately. Instead, the values are returned at the point when they're required in the program. Lazy evaluation is also referred to as call-by-need.
Delaying the evaluation of an expression also delays the use of the resources needed to create its value, which can improve a program's performance by spreading time-consuming work across a longer time period. It also prevents the program from generating values that are never used, which can happen when the program terminates or moves to another part of its execution before all the generated values are used.
When large datasets are created using lazily evaluated expressions, the program doesn't need to use memory to store the data structure's contents. The values are only generated when they're needed.
An example of lazy evaluation occurs within the for loop when you iterate using range():
for index in range(1, 1_000_001):
print(f"This is iteration {index}")
The built-in range() is the constructor for Python's range object. The range object does not store all of the one million integers it represents. Instead, the for loop creates a range_iterator from the range object, which generates the next number in the sequence when it's needed. Therefore, the program never needs to have all the values stored in memory at the same time.
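You can peek at this machinery yourself by calling iter() on a range object, which is what a for loop does behind the scenes. Here's a quick sketch, and the memory address in the output will differ on your machine:
>>> numbers = range(1, 1_000_001)
>>> numbers_iterator = iter(numbers)
>>> numbers_iterator
<range_iterator object at 0x104a2b100>
>>> next(numbers_iterator)
1
>>> next(numbers_iterator)
2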
Lazy evaluation also allows you to create infinite data structures, such as a live stream of audio or video data that continuously updates with new information, since the program doesn't need to store all the values in memory at the same time. Infinite data structures are not possible with eager evaluation since they can't be stored in memory.
There are disadvantages to deferred evaluation. Any errors raised by an expression are also deferred to a later point in the program. This delay can make debugging harder.
Using range() in a for loop, with its lazy evaluation of the integers the range object represents, is just one example. You'll learn about more examples in the following section of this tutorial.
What Are Examples of Lazy Evaluation in Python?
In the previous section, you learned about using range() in a for loop, which leads to lazy evaluation of the integers represented by the range object. There are other expressions in Python that lead to lazy evaluation. In this section, you'll explore the main ones.
Other Built-In Data Types
The Python built-ins zip() and enumerate() create two powerful built-in data types. You'll explore how these data types are linked to lazy evaluation with the following example. Say you need to create a weekly schedule, or rota, that shows which team members will bring coffee in the morning.
However, the coffee shop is always busy on Monday mornings, and no one wants to be responsible for Mondays. So, you decide to randomize the rota every week. You start with a list containing the team members' names:
>>> names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> import random
>>> random.shuffle(names)
>>> names
['Sarah', 'Jim', 'Denise', 'Matt', 'Kate']
You also shuffle the names using random.shuffle(), which changes the list in place. It's time to create a numbered list to pin to the notice board every week:
>>> for index, name in enumerate(names, start=1):
... print(f"{index}. {name}")
...
1. Sarah
2. Jim
3. Denise
4. Matt
5. Kate
You use enumerate() to iterate through the list of names and also access an index as you iterate. By default, enumerate() starts counting from zero. However, you use the start argument to ensure the first number is one.
But what does enumerate() do behind the scenes? To explore this, you can call enumerate() and assign the object it returns to a variable name:
>>> numbered_names = enumerate(names, start=1)
>>> numbered_names
<enumerate object at 0x11b26ae80>
The object created is an enumerate object, which is an iterator. Iterators are one of the key tools that allow Python to be lazy since their values are created on demand. The call to enumerate() pairs each item in names with an integer.
However, it doesn't create those pairs immediately. The pairs are not stored in memory. Instead, they're generated when you need them. One way to evaluate a value from an iterator is to call the built-in function next():
>>> next(numbered_names)
(1, 'Sarah')
>>> next(numbered_names)
(2, 'Jim')
The numbered_names object doesn't contain all the pairs within it. When it needs to create the next pair of values, it fetches the next name from the original list names and pairs it up with the next integer. You can confirm this by changing the third name in the list names before fetching the next value in numbered_names:
>>> names[2] = "The Coffee Robot"
>>> next(numbered_names)
(3, 'The Coffee Robot')
Even though you created the enumerate object numbered_names before you changed the contents of the list, you fetch the third item in names after you made the change. This behavior is possible because Python evaluates the enumerate object lazily.
Look back at the numbered list you created earlier with the for loop, which shows Sarah is due to buy coffee first. Sarah is a Python programmer, so she enquired whether the 1 next to her name means she should buy coffee on Tuesday since Monday ought to be 0.
You decide not to get angry. Instead, you update your code to use zip() to pair names with weekdays instead of numbers. Note that you recreate and shuffle the list again since you have made changes to it:
>>> names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
>>> random.shuffle(names)
>>> names
['Denise', 'Jim', 'Sarah', 'Matt', 'Kate']
>>> for day, name in zip(weekdays, names):
... print(f"{day}: {name}")
...
Monday: Denise
Tuesday: Jim
Wednesday: Sarah
Thursday: Matt
Friday: Kate
When you call zip(), you create a zip object, which is another iterator. The program doesn't create copies of the data in weekdays and names to create the pairs. Instead, it creates the pairs on demand. This is another example of lazy evaluation. You can explore the zip object directly as you did with the enumerate object:
>>> day_name_pairs = zip(weekdays, names)
>>> next(day_name_pairs)
('Monday', 'Denise')
>>> next(day_name_pairs)
('Tuesday', 'Jim')
>>> # Modify the third item in 'names'
>>> names[2] = "The Coffee Robot"
>>> next(day_name_pairs)
('Wednesday', 'The Coffee Robot')
The program didn't need to create and store copies of the data when you called enumerate() and zip() because of lazy evaluation. Another consequence of this type of evaluation is that the data is not fixed when you create the enumerate or zip objects. Instead, the program uses the data present in the original data structures when a value is needed from the enumerate or zip objects.
Iterators in itertools
Iterators are lazy data structures since their values are evaluated when they're needed and not immediately when you define the iterator. There are many more iterators in Python besides enumerate and zip. Every iterable is either an iterator itself or can be converted into an iterator using iter().
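For example, you can convert a plain list into an iterator with iter() and step through its values lazily with next(). This is a minimal sketch, and the memory address will differ on your machine:
>>> team = ["Sarah", "Matt", "Jim"]
>>> team_iterator = iter(team)
>>> team_iterator
<list_iterator object at 0x105b3c640>
>>> next(team_iterator)
'Sarah'
>>> next(team_iterator)
'Matt'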
However, in this section, you'll explore Python's itertools module, which has several of these data structures. You'll learn about two of these tools now, and then you can try some of the others after you finish this tutorial.
In the previous section, you worked with a list of team members. Now, you join forces with another team to participate in a quiz, and you want to print out the list of names of the entire quiz team:
>>> import itertools
>>> first_team = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> second_team = ["Mark", "Zara", "Mo", "Jennifer", "Owen"]
>>> for name in itertools.chain(first_team, second_team):
... print(name)
...
Sarah
Matt
Jim
Denise
Kate
Mark
Zara
Mo
Jennifer
Owen
The iterable you use in the for loop is the object created by itertools.chain(), which chains the two lists together into a single iterable. However, itertools.chain() doesn't create a new list but an iterator, which is evaluated lazily. Therefore, the program doesn't create copies of the strings with the names, but it fetches the strings when they're needed from the lists first_team and second_team.
Here's another way to observe the relationship between the iterator and the original data structures:
>>> first_team = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> second_team = ["Mark", "Zara", "Mo", "Jennifer", "Owen"]
>>> import sys
>>> sys.getrefcount(first_team)
2
>>> quiz_team = itertools.chain(first_team, second_team)
>>> sys.getrefcount(first_team)
3
The function sys.getrefcount() counts the number of times an object is referenced in the program. Note that sys.getrefcount() always shows one more reference to the object that comes from the call to sys.getrefcount() itself. Therefore, when there's only one reference to an object in the rest of the program, sys.getrefcount() shows two references.
When you create the chain object, you create another reference to the two lists since quiz_team needs a reference to where the original data is stored. Therefore, sys.getrefcount() shows an extra reference to first_team. But this reference disappears when you exhaust the iterator:
>>> for name in quiz_team:
... print(name)
...
Sarah
Matt
Jim
Denise
Kate
Mark
Zara
Mo
Jennifer
Owen
>>> sys.getrefcount(first_team)
2
Lazy evaluation of data structures such as itertools.chain relies on this reference between the iterator, such as the chain object, and the structure containing the data, such as first_team.
Another tool in itertools that highlights the difference between eager and lazy evaluation is itertools.islice(), which is the lazy evaluation version of Python's slice. Create a list of numbers and a standard slice of that list:
>>> numbers = [2, 4, 6, 8, 10]
>>> standard_slice = numbers[1:4]
>>> standard_slice
[4, 6, 8]
Now, you can create an iterator version of the slice using itertools.islice():
>>> iterator_slice = itertools.islice(numbers, 1, 4)
>>> iterator_slice
<itertools.islice object at 0x117c93650>
The arguments in itertools.islice() include the iterable you want to slice and the integers to determine the start and stop indices of the slice, just like in a standard slice. You can also include an extra argument representing the step size. The final output doesn't show the values in the slice since these haven't been generated yet. They'll be created when needed.
Finally, change one of the values in the list and loop through the standard slice and the iterator slice to compare the outputs:
>>> numbers[2] = 999
>>> numbers
[2, 4, 999, 8, 10]
>>> for number in standard_slice:
... print(number)
...
4
6
8
>>> for number in iterator_slice:
... print(number)
...
4
999
8
You modify the third element in the list numbers. This change doesn't affect the standard slice, which still contains the original numbers. When you create a standard slice, Python evaluates that slice eagerly and creates a new list containing the subset of data from the original sequence.
However, the iterator slice is evaluated lazily. Therefore, since you change the third value in the list before you loop through the iterator slice, the value in iterator_slice is also affected.
You'll visit the itertools module again later in this tutorial to explore a few more of its iterators.
Generator Expressions and Generator Functions
Expressions that create built-in data structures, such as lists, tuples, or dictionaries, are evaluated eagerly. They generate and store all of the items in these data structures immediately. An example of this kind of expression is a list comprehension:
>>> import random
>>> coin_toss = [
... "Heads" if random.random() > 0.5 else "Tails"
... for _ in range(10)
... ]
>>> coin_toss
['Heads', 'Heads', 'Tails', 'Tails', 'Heads', 'Tails', 'Tails', 'Heads', 'Heads', 'Heads']
The expression on the right-hand side of the assignment operator (=) is a list comprehension. This expression is evaluated eagerly, and the ten heads or tails values are created and stored in the new list.
The list comprehension includes a conditional expression that returns either the string "Heads" or "Tails" depending on the value of the condition between the if and else keywords. The random.random() function creates a random float between 0 and 1. Therefore, there's a 50 percent chance of getting either "Heads" or "Tails".
You can replace the square brackets with parentheses on the right-hand side of the assignment operator:
>>> coin_toss = (
... "Heads" if random.random() > 0.5 else "Tails"
... for _ in range(10)
... )
>>> coin_toss
<generator object <genexpr> at 0x117a43440>
The expression in parentheses is a generator expression. Even though it looks similar to the list comprehension, this expression is not evaluated eagerly. It creates a generator object. A generator object is a type of iterator that generates values when they're needed.
The generator object coin_toss doesn't store any of the string values. Instead, it will generate each value when it's needed. You can generate and fetch the next value using the built-in next():
>>> next(coin_toss)
'Tails'
>>> next(coin_toss)
'Heads'
The expression that generates "Heads" or "Tails" is only evaluated when you call next(). This generator will generate ten values since you use range(10) in the generator's for clause. As you called next() twice, there are eight values left to generate:
>>> for toss_result in coin_toss:
... print(toss_result)
...
Heads
Heads
Heads
Tails
Tails
Heads
Tails
Heads
The for loop iterates eight times, once for each of the remaining items in the generator. A generator expression is the lazy evaluation alternative to creating a list or a tuple. Unlike its eager counterparts, it's intended to be used once.
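You can check this single-use behavior directly. Assuming you've run the loop above, the generator is now exhausted, so another call to next() raises StopIteration, and wrapping the generator in list() returns an empty list:
>>> next(coin_toss)
Traceback (most recent call last):
...
  File "<input>", line 1, in <module>
StopIteration
>>> list(coin_toss)
[]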
You can also create a generator object using a generator function. A generator function is a function definition that has a yield statement instead of a return statement. You can define a generator function to create a generator object similar to the one you used in the coin toss example above:
>>> def generate_coin_toss(number):
... for _ in range(number):
... yield "Heads" if random.random() > 0.5 else "Tails"
...
>>> coin_toss = generate_coin_toss(10)
>>> next(coin_toss)
'Heads'
>>> next(coin_toss)
'Tails'
>>> for toss_result in coin_toss:
... print(toss_result)
...
Tails
Heads
Tails
Heads
Tails
Tails
Heads
Tails
You create a new generator object each time you call the generator function. Unlike standard functions with return, which are evaluated in full, a generator is evaluated lazily. Therefore, when the first value is needed, the code in the generator function runs up to the first yield statement. It yields this value and pauses, waiting for the next time a value is needed.
This process keeps going until there are no more values to yield and the generator function terminates, raising a StopIteration exception. The iteration protocol in the for loop catches this StopIteration exception, which is used to signal the end of the for loop.
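You can mimic what the for loop does behind the scenes by catching StopIteration yourself. The sketch below uses a small three-toss generator, so your Heads and Tails outcomes will differ since they're random:
>>> coin_toss = generate_coin_toss(3)
>>> while True:
...     try:
...         print(next(coin_toss))
...     except StopIteration:
...         print("No more values to generate")
...         break
...
Heads
Tails
Heads
No more values to generate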
Lazy iteration in Python also allows you to create multiple versions of the data structure that are independent of each other:
>>> first_coin_tosses = generate_coin_toss(10)
>>> second_coin_tosses = generate_coin_toss(10)
>>> next(first_coin_tosses)
'Tails'
>>> next(first_coin_tosses)
'Tails'
>>> next(first_coin_tosses)
'Heads'
>>> second_as_list = list(second_coin_tosses)
>>> second_as_list
['Heads', 'Heads', 'Heads', 'Heads', 'Heads', 'Tails', 'Tails', 'Tails', 'Tails', 'Heads']
>>> next(second_coin_tosses)
Traceback (most recent call last):
...
File "<input>", line 1, in <module>
StopIteration
>>> next(first_coin_tosses)
'Tails'
The two generators, first_coin_tosses and second_coin_tosses, are separate generators created from the same generator function. You evaluate the first three values of first_coin_tosses. This leaves seven values in the first generator.
Next, you convert the second generator into a list. This evaluates all its values so they can be stored in second_as_list. There are ten values since the values you got from the first generator have no effect on the second one.
You confirm there are no more values left in the second generator when you call next() and get a StopIteration error. However, the first generator, first_coin_tosses, still has values to evaluate since it's independent of the second generator.
Generators, and iterators in general, are central tools when dealing with lazy evaluation in Python. This is because they only yield values when they're needed and don't store all their values in memory.
Short-Circuit Evaluation
The examples of lazy evaluation you've seen so far focused on expressions that create data structures. However, these are not the only types of expressions that can be evaluated lazily. Consider the and and or operators. A common misconception is that these operators return True or False. In general, they don't.
You can start to explore and with a few examples:
>>> True and True
True
>>> True and False
False
>>> 1 and 0
0
>>> 0 and 1
0
>>> 1 and 2
2
>>> 42 and "hello"
'hello'
The first two examples have Boolean operands and return a Boolean. The result is True only when both operands are True. However, the third example doesn't return a Boolean. Instead, it returns 0, which is the second operand in 1 and 0. And 0 and 1 also returns 0, but this time, it's the first operand. The integer 0 is falsy, which means that bool(0) returns False.
Similarly, the integer 1 is truthy, which means that bool(1) returns True. All non-zero integers are truthy. When Python needs a Boolean value, such as in an if statement or with operators such as and and or, it converts the object to a Boolean to determine whether to treat it as true or false.
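You can check the truthiness of any object yourself by passing it to bool(). A few quick examples:
>>> bool(0), bool(1), bool(42)
(False, True, True)
>>> bool(""), bool("hello"), bool([]), bool([0])
(False, True, False, True)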
When you use the and operator, the program evaluates the first operand and checks whether it's truthy or falsy. If the first operand is falsy, there's no need to evaluate the second operand since both need to be truthy for the overall result to be truthy. This is what occurs in the expression 0 and 1, where the and operator returns the first value, which is falsy. Therefore, the whole expression is falsy.
Python doesn't evaluate the second operand when the first one is falsy. This is called short-circuit evaluation, and it's an example of lazy evaluation. Python only evaluates the second operand if it needs it.
If the first operand is truthy, Python evaluates and returns the second operand, whatever its value. In that case, the truthiness of the second operand determines the overall truthiness of the and expression.
The final two examples include operands that are truthy. The second operand is returned in both cases to make the whole expression truthy. You can confirm that Python doesn't evaluate the second operand if the first is falsy with the following examples:
>>> 0 and print("Do you see this text?")
0
>>> 1 and print("Do you see this text?")
Do you see this text?
In the first example, the first operand is 0, and the print() function is never called. In the second example, the first operand is truthy. Therefore, Python evaluates the second operand, calling the print() function. Note that the result of the and expression is the value returned by print(), which is None.
Another striking demonstration of short-circuiting is when you use an invalid expression as the second operand in an and expression:
>>> 0 and int("python")
0
>>> 1 and int("python")
Traceback (most recent call last):
...
File "<input>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'python'
The call int("python") raises a ValueError since the string "python" can't be converted into an integer. However, in this first example, the and expression returns 0 without raising the error. The second operand was never evaluated!
The or operator works similarly. However, only one operand needs to be truthy for the entire expression to evaluate as truthy. Therefore, if the first operand is truthy, it's returned, and the second operand isn't evaluated:
>>> 1 or 2
1
>>> 1 or 0
1
>>> 1 or int("python")
1
In all these examples, the first operand is returned since it's truthy. The second operand is ignored and is never evaluated. You confirm this with the final example, which doesn't raise a ValueError. This is short-circuit evaluation in the or expression. Python is lazy and doesn't evaluate expressions that have no effect on the final outcome.
However, if the first operand is falsy, the result of the or expression is determined by the second operand:
>>> 0 or 1
1
>>> 0 or int("python")
Traceback (most recent call last):
...
File "<input>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'python'
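This fall-through to the second operand is what makes a common default-value idiom work. The variable names below are just for illustration and don't come from the earlier examples:
>>> user_input = ""
>>> display_name = user_input or "Anonymous"
>>> display_name
'Anonymous'
>>> user_input = "Sarah"
>>> display_name = user_input or "Anonymous"
>>> display_name
'Sarah'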
The built-in any() and all() functions also use lazy, short-circuit evaluation, only consuming as many elements of an iterable as they need. The any() function returns True if any of the elements in an iterable is truthy:
>>> any([0, False, ""])
False
>>> any([0, False, "hello"])
True
The list you use in the first call to any() contains the integer 0, the Boolean False, and an empty string. All three objects are falsy and any() returns False. In the second example, the final element is a non-empty string, which is truthy. The function returns True.
The function stops evaluating elements of the iterable when it finds the first truthy value. You can confirm this using a trick similar to the one you used with the and and or operators with help from a generator function, which you learned about in the previous section:
>>> def lazy_values():
... yield 0
... yield "hello"
... yield int("python")
... yield 1
...
>>> any(lazy_values())
True
You define the generator function lazy_values() with four yield statements. The third statement is invalid since "python" can't be converted into an integer. You create a generator when you call this function in the call to any().
The program doesn't raise any errors, and any() returns True. The evaluation of the generator stopped when any() encountered the string "hello", which is the first truthy value in the generator. The function any() performs lazy evaluation.
However, if the invalid expression doesn't have any truthy values ahead of it, it's evaluated and raises an error:
>>> def lazy_values():
... yield 0
... yield ""
... yield int("python")
... yield 1
...
>>> any(lazy_values())
Traceback (most recent call last):
...
File "<input>", line 1, in <module>
File "<input>", line 4, in lazy_values
ValueError: invalid literal for int() with base 10: 'python'
The first two values are falsy. Therefore, any() evaluates the third value, which raises the ValueError.
The function all() behaves similarly. However, all() requires all the elements of the iterable to be truthy. Therefore, all() short-circuits when it encounters the first falsy value. You update the generator function lazy_values() to verify this behavior:
>>> def lazy_values():
... yield 1
... yield ""
... yield int("python")
... yield 1
...
>>> all(lazy_values())
False
This code doesn't raise an error since all() returns False when it evaluates the empty string, which is the second element in the generator.
Short-circuiting, like other forms of lazy evaluation, prevents unnecessary evaluation of expressions when these expressions are not required at run time.
Functional Programming Tools
Functional programming is a programming paradigm in which functions only have access to data input as arguments and do not alter the state of objects, returning new objects instead. A program written in this style consists of a series of these functions, often with the output from a function used as an input for another function.
Since data is often passed from one function to another, it's convenient to use lazy evaluation of data structures to avoid storing and moving large datasets repeatedly.
Three of the principal tools in functional programming are Python's built-in map() and filter() functions and reduce(), which is part of the functools module. Technically, the first two are not functions but constructors of the map and filter classes. However, you use them in the same way you use functions, especially in the functional programming paradigm.
You can explore map() and filter() with the following example. Create a list of strings containing names. First, you want to convert all names to uppercase:
>>> original_names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> names = map(str.upper, original_names)
>>> names
<map object at 0x117ad31f0>
The map() function applies the function str.upper() to each item in the iterable. Each name in the list is passed to str.upper(), and the value it returns is used in place of the original name.
However, map() doesn't create a new list. Instead, it creates a map object, which is an iterator. It's not surprising that iterators appear often in a tutorial about lazy evaluation since they're one of the main tools for the lazy evaluation of values!
You can evaluate each value, one at a time, using next():
>>> next(names)
'SARAH'
>>> next(names)
'MATT'
You can also convert the map object into a list. This evaluates the values so they can be stored in the list:
>>> list(names)
['JIM', 'DENISE', 'KATE']
There are only three names in this list. You already evaluated and used the first two names when you called next() twice. Since values are evaluated when they're needed and not stored in the data structure, you can only use them once.
Now, you only want to keep names that contain at least one letter a. You can use filter() for this task. First, you'll need to recreate the map object representing the uppercase names since you already exhausted this iterator in the REPL session:
>>> names = map(str.upper, original_names)
>>> names = filter(lambda x: "A" in x, names)
>>> names
<filter object at 0x117ad0610>
Each item in the second argument in filter(), which is the map object names, is passed to the lambda function you include as the first argument. Only the values for which the lambda function returns True are kept. The rest are discarded.
You reuse the variable called names at each stage. If you prefer, you can use different variable identifiers, but if you don't need to keep the intermediate results, it's best to use the same variable. The object that filter() returns is another iterator, a filter object. Therefore, its values haven't been evaluated yet.
You can cast the filter object to a list as you did in the previous example. But in this case, try looping using a for loop instead:
>>> for name in names:
... print(name)
...
SARAH
MATT
KATE
The first function call to map() converts the names to uppercase. The second call, this time to filter(), only keeps the names that include the letter a. You use uppercase A in the code since you've already converted all the names to uppercase.
Finally, you only keep names that are four letters long. The code below shows all the map() and filter() operations since you need to recreate these iterators each time:
>>> names = map(str.upper, original_names)
>>> names = filter(lambda x: "A" in x, names)
>>> names = filter(lambda x: len(x) == 4, names)
>>> list(names)
['MATT', 'KATE']
You can reorder the operations to make the overall evaluation lazier. The first operation converts all names to uppercase, but since you discard some of these names later, it would be best to avoid converting these names. You can filter the names first and convert them to uppercase in the final step. You add "Andy" to the list of names to ensure that your code works whether the required letter is uppercase or lowercase:
>>> original_names = ["Sarah", "Matt", "Jim", "Denise", "Kate", "Andy"]
>>> names = filter(lambda x: ("a" in x) or ("A" in x), original_names)
>>> names = filter(lambda x: len(x) == 4, names)
>>> names = map(str.upper, names)
>>> list(names)
['MATT', 'KATE', 'ANDY']
The first call to filter() now checks whether either an uppercase or a lowercase a is in the name. Since any letter a in a name is more likely to be lowercase than to be the capitalized first letter, you set ("a" in x) as the first operand in the or expression to take advantage of short-circuiting with the or operator.
The lazy evaluation obtained from using map and filter iterators means that temporary data structures containing all the data are not needed in each function call. This won't have a significant impact in this case since the list only contains six names, but it can affect performance with large sets of data.
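The third tool mentioned at the start of this section, functools.reduce(), isn't lazy itself since it consumes its iterable and returns a single value. However, it combines well with lazy map and filter iterators because no intermediate list is ever built. Here's a brief sketch that sums the lengths of the uppercase names; the lambda and the total_letters name are just for illustration:
>>> from functools import reduce
>>> original_names = ["Sarah", "Matt", "Jim", "Denise", "Kate", "Andy"]
>>> names = map(str.upper, original_names)
>>> total_letters = reduce(lambda total, name: total + len(name), names, 0)
>>> total_letters
26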
File Reading Operations
The final example of expressions that are evaluated lazily will focus on reading data from a comma-separated values file, usually referred to as a CSV file. CSV files are a basic spreadsheet file format. They are text files with the .csv file extension that have commas separating values to denote values that belong to different cells in the spreadsheet. Each line ends with the newline character "\n" to show where each row ends.
You can use any CSV file you wish for this section, or you can copy the data below and save it as a new text file with the .csv extension. Name the CSV file superhero_pets.csv and place it in your project folder:
superhero_pets.csv
Pet Name,Species,Superpower,Favorite Snack,Hero Owner
Whiskertron,Cat,Teleportation,Tuna,Catwoman
Flashpaw,Dog,Super Speed,Peanut Butter,The Flash
Mystique,Squirrel,Illusion,Nuts,Doctor Strange
Quackstorm,Duck,Weather Control,Bread crumbs,Storm
Bark Knight,Dog,Darkness Manipulation,Bacon,Batman
You'll explore two ways of reading data from this CSV file. In the first version, you'll open the file and use the .readlines() method for file objects:
>>> import pprint
>>> with open("superhero_pets.csv", encoding="utf-8") as file:
... data = file.readlines()
...
>>> pprint.pprint(data)
['Pet Name,Species,Superpower,Favorite Snack,Hero Owner\n',
'Whiskertron,Cat,Teleportation,Tuna,Catwoman\n',
'Flashpaw,Dog,Super Speed,Peanut Butter,The Flash\n',
'Mystique,Squirrel,Illusion,Nuts,Doctor Strange\n',
'Quackstorm,Duck,Weather Control,Bread crumbs,Storm\n',
'Bark Knight,Dog,Darkness Manipulation,Bacon,Batman\n']
>>> print(type(data))
<class 'list'>
You import pprint to enable pretty printing of large data structures. Once you open the CSV file using the with context manager, specifying the file's encoding, you call the .readlines() method on the open file. This method returns a list that contains all the data in the spreadsheet. Each item in the list is a string containing all the elements in a row.
This evaluation is eager since .readlines() extracts all the contents of the spreadsheet and stores them in a list. This spreadsheet doesn't contain a lot of data. However, this route could lead to significant pressure on memory resources if you're reading large amounts of data.
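It's worth noting that the file object itself is already a lazy iterator over the file's lines. Instead of calling .readlines(), you can loop over the open file and process one line at a time without building a list first. A minimal sketch with the same file:
>>> with open("superhero_pets.csv", encoding="utf-8") as file:
...     for line in file:
...         print(line.rstrip())
...
Pet Name,Species,Superpower,Favorite Snack,Hero Owner
Whiskertron,Cat,Teleportation,Tuna,Catwoman
Flashpaw,Dog,Super Speed,Peanut Butter,The Flash
Mystique,Squirrel,Illusion,Nuts,Doctor Strange
Quackstorm,Duck,Weather Control,Bread crumbs,Storm
Bark Knight,Dog,Darkness Manipulation,Bacon,Batman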
Instead, you can use Python's csv module, which is part of the standard library. To simplify this code in the REPL, you can open the file without using a with context manager. However, you should remember to close the file when you do so. In general, you should use with to open files whenever possible:
>>> import csv
>>> file = open("superhero_pets.csv", encoding="utf-8", newline="")
>>> data = csv.reader(file)
>>> data
<_csv.reader object at 0x117a830d0>
You add the named argument newline="" when opening the file to use with the csv module to ensure that any newlines within fields are dealt with correctly. The object returned by csv.reader() is not a list but an iterator. You've encountered iterators enough times already in this article to know what to expect.
The contents of the spreadsheet aren't stored in a data structure in the Python program. Instead, Python will lazily fetch each line when it's needed, getting the data directly from the file, which is still open:
>>> next(data)
['Pet Name', 'Species', 'Superpower', 'Favorite Snack', 'Hero Owner']
>>> next(data)
['Whiskertron', 'Cat', 'Teleportation', 'Tuna', 'Catwoman']
>>> next(data)
['Flashpaw', 'Dog', 'Super Speed', 'Peanut Butter', 'The Flash']
The first call to next() triggers the evaluation of the first item of the data iterator. This is the first row of the spreadsheet, which is the header row. You call next() another two times to fetch the first two rows of data.
You can use a for loop to iterate through the rest of the iterator, and evaluate the remaining items:
>>> for row in data:
... print(row)
...
['Mystique', 'Squirrel', 'Illusion', 'Nuts', 'Doctor Strange']
['Quackstorm', 'Duck', 'Weather Control', 'Bread crumbs', 'Storm']
['Bark Knight', 'Dog', 'Darkness Manipulation', 'Bacon', 'Batman']
>>> file.close()
You evaluated the header and the first two rows in earlier code. Therefore, the for loop only has the final three rows to iterate through. And it's good practice to close the file since you're not using a with statement.
The reader() function in the csv module enables you to evaluate the spreadsheet rows lazily by fetching each row only when it's needed. However, calling .readlines() on an open file evaluates the rows eagerly by fetching them all immediately.
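In a script, you'd normally wrap the lazy csv.reader() in a with block so that the file closes automatically once you're done iterating. A short sketch that prints only the first column of each row:
>>> with open("superhero_pets.csv", encoding="utf-8", newline="") as file:
...     for row in csv.reader(file):
...         print(row[0])
...
Pet Name
Whiskertron
Flashpaw
Mystique
Quackstorm
Bark Knight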
How Can a Data Structure Have Infinite Elements?
Lazy evaluation of expressions also enables data structures with infinite elements. Infinite data structures can't be achieved through eager evaluation since it's not possible to generate and store infinite elements in memory! However, when elements are generated on demand, as in lazy evaluation, it's possible to have an object that represents an infinite number of elements.
The itertools module has several tools that can be used to create infinite iterables. One of these is itertools.count(), which yields sequential numbers indefinitely. You can set the starting value and the step size when you create a count iterator:
>>> import itertools
>>> quarters = itertools.count(start=0, step=0.25)
>>> for _ in range(8):
... print(next(quarters))
...
0
0.25
0.5
0.75
1.0
1.25
1.5
1.75
The iterator quarters yields values that are each 0.25 larger than the previous one, and it will keep yielding values forever. However, none of these values is generated when you define quarters. Each value is generated when it's needed, such as by calling next() or as part of an iteration process like a for loop.
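Since calling list() on an infinite iterator would never finish, one way to grab a finite number of its values is with itertools.islice(), which you met earlier in this tutorial:
>>> quarters = itertools.count(start=0, step=0.25)
>>> list(itertools.islice(quarters, 5))
[0, 0.25, 0.5, 0.75, 1.0]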
Another tool you can use to create infinite iterators is itertools.cycle(). You can explore this tool with the list of team member names you used earlier in this tutorial to create a rota for who's in charge of getting coffee in the morning. You decide you don't want to regenerate the rota every week, so you create an infinite iterator that cycles through the names:
>>> names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> rota = itertools.cycle(names)
>>> rota
<itertools.cycle object at 0x1156be340>
The object returned by itertools.cycle() is an iterator. Therefore, it doesn't create all its elements when it's first created. Instead, it generates values when they're needed:
>>> next(rota)
'Sarah'
>>> next(rota)
'Matt'
>>> next(rota)
'Jim'
>>> next(rota)
'Denise'
>>> next(rota)
'Kate'
>>> next(rota)
'Sarah'
>>> next(rota)
'Matt'
The cycle iterator rota starts yielding each name from the original list names. When all names have been yielded once, the iterator starts yielding names from the beginning of the list again. This iterator will never run out of values to yield since it will restart from the beginning of the list each time it reaches the last name.
This is an object with an infinite number of elements. However, only five strings are stored in memory since there are only five names in the original list.
The iterator rota is iterable, like all iterators. Therefore, you can use it as part of a for loop statement. However, this now creates an infinite loop since the for loop never receives a StopIteration exception to trigger the end of the loop.
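One way to avoid that infinite loop is to pair the infinite iterator with a finite one. As you saw earlier, zip() stops as soon as its shortest iterable is exhausted, so zipping the rota with the weekdays list ends the loop after five names. A quick sketch with a fresh cycle iterator:
>>> rota = itertools.cycle(names)
>>> weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
>>> for day, name in zip(weekdays, rota):
...     print(f"{day}: {name}")
...
Monday: Sarah
Tuesday: Matt
Wednesday: Jim
Thursday: Denise
Friday: Kate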
You can also achieve infinite data structures using generator functions. You can recreate the rota iterator by first defining the generator function generate_rota():
>>> def generate_rota(iterable):
... index = 0
... length = len(iterable)
... while True:
... yield iterable[index]
... if index == length - 1:
... index = 0
... else:
... index += 1
...
>>> rota = generate_rota(names)
>>> for _ in range(12):
... print(next(rota))
...
Sarah
Matt
Jim
Denise
Kate
Sarah
Matt
Jim
Denise
Kate
Sarah
Matt
In the generator function generate_rota(), you manually manage the index to fetch items from the iterable, increasing the value after each item is yielded and resetting it to zero when you reach the end of the iterable. The generator function includes a while True statement, which makes this an infinite data structure.
In this example, the generator function replicates behavior you can achieve with itertools.cycle(). However, you can create any generator with custom requirements using this technique.
What Are the Advantages of Lazy Evaluation in Python?
You can revisit an earlier example to explore one of the main advantages of lazy evaluation. You created a list and a generator object with several outcomes from a coin toss earlier in this tutorial. In this version, you'll create one million coin tosses in each one:
>>> import random
>>> coin_toss_list = [
... "Heads" if random.random() > 0.5 else "Tails"
... for _ in range(1_000_000)
... ]
>>> coin_toss_gen = (
... "Heads" if random.random() > 0.5 else "Tails"
... for _ in range(1_000_000)
... )
>>> import sys
>>> sys.getsizeof(coin_toss_list)
8448728
>>> sys.getsizeof(coin_toss_gen)
200
You create a list and a generator object. Both objects represent one million strings with either "Heads" or "Tails". However, the list takes up over eight million bytes of memory, whereas the generator uses only 200 bytes. You may get a slightly different number of bytes depending on the Python version you're using.
The list contains all of the one million strings, whereas the generator doesn't since it will generate these values when they're needed. When you have large amounts of data, using eager evaluation to define data structures may put pressure on memory resources in your program and affect performance.
This example also shows another advantage of using lazy evaluation when you create a data structure. You could use the conditional expression that returns "Heads" or "Tails" at random directly in the code whenever you need it. However, creating a generator might be a better option.
Since you included the logic of how to create the values you need in the generator expression, you can use a more declarative style of coding in the rest of your code. You state what you want to achieve without focusing on how to achieve it. This can make your code more readable.
Another advantage of lazy evaluation is the performance gains you could achieve by avoiding the evaluation of expressions that you don't need. These benefits become noticeable in programs that evaluate large numbers of expressions.
You can demonstrate this performance benefit using the timeit module in Python's standard library. You can explore this with the short-circuit evaluation when you use the and operator. The following two expressions are similar and return the same truthiness:
>>> import random
>>> random.randint(0, 1) and random.randint(0, 10)
1
>>> random.randint(0, 10) and random.randint(0, 1)
8
These expressions return a truthy value if both calls to random.randint() return non-zero values. They will return 0 if at least one function returns 0. However, it's more likely for random.randint(0, 1) to return 0 compared with random.randint(0, 10).
Therefore, if you need to evaluate this expression repeatedly in your code, the first version is more efficient due to short-circuit evaluation. You can time how long it takes to evaluate these expressions many times:
>>> import timeit
>>> timeit.repeat(
... "random.randint(0, 1) and random.randint(0, 10)",
... number=1_000_000,
... globals=globals(),
... )
[0.39701350000177626, 0.37251866700171377, 0.3730850419997296,
0.3731833749989164, 0.3740811660027248]
>>> timeit.repeat(
... "random.randint(0, 10) and random.randint(0, 1)",
... number=1_000_000,
... globals=globals(),
... )
[0.504747375001898, 0.4694556670001475, 0.4706860409969522,
0.4841222920003929, 0.47349566599950776]
The output shows the time it takes for one million evaluations of each expression. There are five separate timings for each expression. The first version is the one that has random.randint(0, 1) as its first operand, and it runs quicker than the second one, which has the operands switched around.
The evaluation of the and expression short-circuits when the first random.randint() call returns 0. Since random.randint(0, 1) has a 50 percent chance of returning 0, roughly half the evaluations of the and expression will only call the first random.randint().
When random.randint(0, 10) is the first operand, the expression's evaluation will only short-circuit about once in every eleven runs since random.randint(0, 10) can return eleven possible values, and only one of them is 0.
The advantages of reducing memory consumption and improving performance can be significant in some projects where demands on resources matter. However, there are some disadvantages to lazy evaluation. You'll explore these in the next section.
What Are the Disadvantages of Lazy Evaluation in Python?
Lazy evaluation reduces memory requirements and unnecessary operations by delaying the evaluation. However, this delay can also make debugging harder. If there's an error in an expression that's evaluated lazily, the exception is not raised right away. Instead, you'll only encounter the error at a later stage of the code's execution when the expression is evaluated.
To demonstrate this, you can return to the list of team members you used earlier in this tutorial. On this occasion, you want to keep track of the points they gained during a team-building exercise:
>>> players = [
... {"Name": "Sarah", "Games": 4, "Points": 23},
... {"Name": "Matt", "Games": 7, "Points": 42},
... {"Name": "Jim", "Games": 1, "Points": 7},
... {"Name": "Denise", "Games": 0, "Points": 0},
... {"Name": "Kate", "Games": 5, "Points": 33},
... ]
You create a list of players, and each item in the list is a dictionary. Each dictionary contains three key-value pairs to store the player's name, the number of games they played, and the total number of points they scored.
You're interested in the average number of points per game for each player, so you create a generator with this value for each player:
>>> average_points_per_game = (
... item["Points"] / item["Games"]
... for item in players
... )
>>> average_points_per_game
<generator object <genexpr> at 0x11566a880>
The generator expression is evaluated lazily. Therefore, the required values are not evaluated right away. Now, you can start calling next() to fetch the average number of points per game for each player:
>>> next(average_points_per_game)
5.75
>>> next(average_points_per_game)
6.0
>>> next(average_points_per_game)
7.0
>>> next(average_points_per_game)
Traceback (most recent call last):
...
File "<input>", line 1, in <module>
File "<input>", line 1, in <genexpr>
ZeroDivisionError: division by zero
Your code evaluates and returns the values for the first three players. However, it raises a ZeroDivisionError when it tries to evaluate the fourth value. Denise didn't enjoy the team-building event and didn't participate in any of the games. Therefore, she played zero games and scored zero points. The division operation in your generator expression raises an exception in this case.
Eager evaluation would raise this error at the point you create the object. You can replace the parentheses with square brackets to create a list comprehension instead of a generator:
>>> average_points_per_game = [
... item["Points"] / item["Games"]
... for item in players
... ]
Traceback (most recent call last):
...
File "<input>", line 1, in <module>
File "<input>", line 1, in <listcomp>
ZeroDivisionError: division by zero
The error is raised immediately in this scenario. Delaying errors can make them harder to identify and fix, which increases the difficulty of debugging code. A popular third-party Python library, TensorFlow, shifted from lazy evaluation to eager evaluation as the default option to facilitate debugging. Users can then turn on lazy evaluation using a decorator once they complete the debugging process.
Conclusion
In this tutorial, you learned what lazy evaluation in Python is and how it's different from eager evaluation. Some expressions aren't evaluated when the program first encounters them. Instead, they're evaluated when the values are needed in the program.
This type of evaluation is referred to as lazy evaluation and can lead to more readable code that's also more memory-efficient and performant. In contrast, eager evaluation is when an expression is evaluated in full immediately.
The ideal evaluation mode depends on several factors. For small data sets, there are no noticeable benefits to using lazy evaluation for memory efficiency and performance. However, the advantages of lazy evaluation become more important for large amounts of data. Lazy evaluation can also make errors and bugs harder to spot and fix.
Lazy evaluation is also not ideal when you're generating data structures such as iterators and need to use the values repeatedly in your program. This is because you'll need to generate the values again each time you need them.
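Here's a quick, hypothetical illustration of that trade-off. A generator can only be consumed once, while an equivalent list can be reused as often as you need:
>>> squares_gen = (number ** 2 for number in range(5))
>>> sum(squares_gen)
30
>>> sum(squares_gen)  # The generator is now exhausted
0
>>> squares_list = [number ** 2 for number in range(5)]
>>> sum(squares_list)
30
>>> sum(squares_list)  # The list can be reused
30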
In Python, lazy evaluation often occurs behind the scenes. However, you'll also need to decide when to use expressions that are evaluated eagerly or lazily, like when you need to create a list or generator object. Now, you're equipped with the knowledge to understand how to deal with both types of evaluation.
Get Your Code: Click here to download the free sample code that shows you how to use lazy evaluation in Python.
Take the Quiz: Test your knowledge with our interactive "What's Lazy Evaluation in Python?" quiz. You'll receive a score upon completion to help you track your learning progress: