Being lazy is not always a bad thing. Every line of code you write has at least one expression that Python needs to evaluate. Python lazy evaluation is when Python takes the lazy option and delays working out the value returned by an expression until that value is needed.
An expression in Python is a unit of code that evaluates to a value. Examples of expressions include object names, function calls, expressions with arithmetic operators, literals that create built-in object types such as lists, and more. However, not all statements are expressions. For example, if statements and for loop statements don’t return a value.
Python needs to evaluate every expression it encounters to use its value. In this tutorial, you’ll learn about the different ways Python evaluates these expressions. You’ll understand why some expressions are evaluated immediately, while others are evaluated later in the program’s execution. So, what’s lazy evaluation in Python?
An expression evaluates to a value. However, you can separate the evaluation of expressions into two types: eager evaluation and lazy evaluation.
Eager evaluation refers to those cases when Python evaluates an expression as soon as it encounters it. Here are some examples of expressions that are evaluated eagerly:

>>> 5 + 10
15

>>> import random
>>> random.randint(1, 10)
4

>>> [2, 4, 6, 8, 10]
[2, 4, 6, 8, 10]
>>> numbers = [2, 4, 6, 8, 10]
>>> numbers
[2, 4, 6, 8, 10]

Interactive environments, such as the standard Python REPL used in this example, display the value of an expression when the line only contains the expression. This code section shows a few examples of statements and expressions:

- The expression 5 + 10 uses the operator +, which Python evaluates as soon as it encounters it. The REPL shows the value 15.
- The import statement consists of the keyword import followed by the name of a module. The module name random is evaluated eagerly.
- The function call random.randint() is evaluated eagerly, and its value is returned immediately. All standard functions are evaluated eagerly. You’ll learn about generator functions later, which behave differently.
- The assignment statement assigns a list to the name numbers. This statement is not an expression and doesn’t return a value. However, it includes the list literal on the right-hand side, which is an expression that’s evaluated eagerly.
- The final line contains only the name numbers, which is eagerly evaluated to return the list object.

The list you create in the final example is created in full when you define it. Python needs to allocate memory for the list and all its elements. This memory won’t be freed as long as this list exists in your program. The memory allocation in this example is small and won’t impact the program. However, larger objects require more memory, which can cause performance issues.
Lazy evaluation refers to cases when Python doesn’t work out the values of an expression immediately. Instead, the values are returned at the point when they’re required in the program. Lazy evaluation can also be referred to as call-by-need.
Delaying the evaluation of an expression also delays the use of the resources needed to create its value, which can improve the performance of a program by spreading a time-consuming process across a longer period. It also prevents the program from generating values that it never uses. This can occur when the program terminates or moves to another part of its execution before all the generated values are used.
When large datasets are created using lazily-evaluated expressions, the program doesn’t need to use memory to store the data structure’s contents. The values are only generated when they’re needed.
An example of lazy evaluation occurs within the for loop when you iterate using range():

for index in range(1, 1_000_001):
    print(f"This is iteration {index}")

The built-in range() is the constructor for Python’s range object. The range object does not store all of the one million integers it represents. Instead, the for loop creates a range_iterator from the range object, which generates the next number in the sequence when it’s needed. Therefore, the program never needs to have all the values stored in memory at the same time.
Lazy evaluation also allows you to create infinite data structures, such as a live stream of audio or video data that continuously updates with new information, since the program doesn’t need to store all the values in memory at the same time. Infinite data structures are not possible with eager evaluation since they can’t be stored in memory.
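As a small sketch of an endless data source, you can use itertools.count() from the standard library, which keeps producing integers for as long as you ask for them. The variable names are illustrative, and a counter is only a stand-in for something like a live data stream:

```python
import itertools

# itertools.count() yields 0, 1, 2, ... and never stops on its own.
endless_counter = itertools.count()

# islice() lazily takes just the first five values, so this finishes.
first_five = list(itertools.islice(endless_counter, 5))
print(first_five)

# The counter picks up where it left off when another value is needed.
next_value = next(endless_counter)
print(next_value)
```

Calling list() directly on endless_counter would never finish, which is why you need a lazy tool such as islice() to take a finite portion.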
There are disadvantages to deferred evaluation. Any errors raised by an expression are also deferred to a later point in the program. This delay can make debugging harder.
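The following sketch illustrates a deferred error with a generator expression, a lazy construct covered later in this tutorial. The data is made up for the example. Note that the line that defines the faulty expression runs without complaint:

```python
values = [4, 2, 0, 1]

# Creating the generator doesn't perform any division yet,
# so no error is raised on this line.
reciprocals = (1 / value for value in values)

results = []
error_message = None
try:
    for reciprocal in reciprocals:
        results.append(reciprocal)
except ZeroDivisionError as error:
    # The error only surfaces here, when the third value is requested.
    error_message = str(error)

print(results)
print(error_message)
```

The traceback points to the place where the value is consumed, which may be far from the line where you wrote the expression.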
The lazy evaluation of the integers represented by range() in a for loop is one example of lazy evaluation. You’ll learn about more examples in the following section of this tutorial.
In the previous section, you learned about using range() in a for loop, which leads to lazy evaluation of the integers represented by the range object. There are other expressions in Python that lead to lazy evaluation. In this section, you’ll explore the main ones.
The Python built-ins zip() and enumerate() create two powerful built-in data types. You’ll explore how these data types are linked to lazy evaluation with the following example. Say you need to create a weekly schedule, or rota, that shows which team members will bring coffee in the morning.
However, the coffee shop is always busy on Monday mornings, and no one wants to be responsible for Mondays. So, you decide to randomize the rota every week. You start with a list containing the team members’ names:

>>> names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> import random
>>> random.shuffle(names)
>>> names
['Sarah', 'Jim', 'Denise', 'Matt', 'Kate']

You also shuffle the names using random.shuffle(), which changes the list in place. It’s time to create a numbered list to pin to the notice board every week:

>>> for index, name in enumerate(names, start=1):
...     print(f"{index}. {name}")
...
1. Sarah
2. Jim
3. Denise
4. Matt
5. Kate

You use enumerate() to iterate through the list of names and also access an index as you iterate. By default, enumerate() starts counting from zero. However, you use the start argument to ensure the first number is one.
But what does enumerate() do behind the scenes? To explore this, you can call enumerate() and assign the object it returns to a variable name:

>>> numbered_names = enumerate(names, start=1)
>>> numbered_names
<enumerate object at 0x11b26ae80>

The object created is an enumerate object, which is an iterator. Iterators are one of the key tools that allow Python to be lazy since their values are created on demand. The call to enumerate() pairs each item in names with an integer.
However, it doesn’t create those pairs immediately. The pairs are not stored in memory. Instead, they’re generated when you need them. One way to evaluate a value from an iterator is to call the built-in function next():

>>> next(numbered_names)
(1, 'Sarah')
>>> next(numbered_names)
(2, 'Jim')

The numbered_names object doesn’t contain all the pairs within it. When it needs to create the next pair of values, it fetches the next name from the original list names and pairs it up with the next integer. You can confirm this by changing the third name in the list names before fetching the next value in numbered_names:

>>> names[2] = "The Coffee Robot"
>>> next(numbered_names)
(3, 'The Coffee Robot')

Even though you created the enumerate object numbered_names before you changed the contents of the list, you fetch the third item in names after you made the change. This behavior is possible because Python evaluates the enumerate object lazily.
Look back at the numbered list you created earlier with the for loop, which shows Sarah is due to buy coffee first. Sarah is a Python programmer, so she enquired whether the 1 next to her name means she should buy coffee on Tuesday since Monday ought to be 0.
You decide not to get angry. Instead, you update your code to use zip() to pair names with weekdays instead of numbers. Note that you recreate and shuffle the list again since you have made changes to it:

>>> names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> weekdays = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"]
>>> random.shuffle(names)
>>> names
['Denise', 'Jim', 'Sarah', 'Matt', 'Kate']
>>> for day, name in zip(weekdays, names):
...     print(f"{day}: {name}")
...
Monday: Denise
Tuesday: Jim
Wednesday: Sarah
Thursday: Matt
Friday: Kate

When you call zip(), you create a zip object, which is another iterator. The program doesn’t create copies of the data in weekdays and names to create the pairs. Instead, it creates the pairs on demand. This is another example of lazy evaluation. You can explore the zip object directly as you did with the enumerate object:

>>> day_name_pairs = zip(weekdays, names)
>>> next(day_name_pairs)
('Monday', 'Denise')
>>> next(day_name_pairs)
('Tuesday', 'Jim')
>>> # Modify the third item in 'names'
>>> names[2] = "The Coffee Robot"
>>> next(day_name_pairs)
('Wednesday', 'The Coffee Robot')

The program didn’t need to create and store copies of the data when you call enumerate() and zip() because of lazy evaluation. Another consequence of this type of evaluation is that the data is not fixed when you create the enumerate or zip objects. Instead, the program uses the data present in the original data structures when a value is needed from the enumerate or zip objects.
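If you do need the pairs to be fixed at a particular moment, one option is to consume the iterator straight away with list(). The sketch below contrasts the two approaches using shortened, illustrative versions of the lists from this section:

```python
names = ["Sarah", "Matt", "Jim"]
weekdays = ["Monday", "Tuesday", "Wednesday"]

# Lazy: the pairs are built only when they're requested.
lazy_pairs = zip(weekdays, names)

# Eager: list() consumes a zip object right away and stores every pair.
snapshot = list(zip(weekdays, names))

names[2] = "The Coffee Robot"

lazy_result = list(lazy_pairs)
print(lazy_result)  # Reflects the change made to names
print(snapshot)     # Keeps the data from before the change
```

The snapshot trades the memory savings of lazy evaluation for data that no longer depends on later changes to the original lists.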
itertools
Iterators are lazy data structures since their values are evaluated when they’re needed and not immediately when you define the iterator. There are many more iterators in Python besides enumerate and zip. Every iterable is either an iterator itself or can be converted into an iterator using iter().
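Here’s a brief sketch of that conversion with a list, which is an iterable but not an iterator. The names are illustrative:

```python
names = ["Sarah", "Matt", "Jim"]

# A list is iterable but isn't an iterator, so next(names) would fail.
# iter() creates an iterator that hands out the items one at a time.
names_iterator = iter(names)

first = next(names_iterator)
second = next(names_iterator)
print(first, second)

# Calling iter() on an iterator returns that same iterator object.
same_object = iter(names_iterator) is names_iterator
print(same_object)
```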
However, in this section, you’ll explore Python’s itertools module, which has several of these data structures. You’ll learn about two of these tools now, and then you can try some of the others after you finish this tutorial.
In the previous section, you worked with a list of team members. Now, you join forces with another team to participate in a quiz, and you want to print out the list of names of the entire quiz team:

>>> import itertools
>>> first_team = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> second_team = ["Mark", "Zara", "Mo", "Jennifer", "Owen"]
>>> for name in itertools.chain(first_team, second_team):
...     print(name)
...
Sarah
Matt
Jim
Denise
Kate
Mark
Zara
Mo
Jennifer
Owen

The iterable you use in the for loop is the object created by itertools.chain(), which chains the two lists together into a single iterable. However, itertools.chain() doesn’t create a new list but an iterator, which is evaluated lazily. Therefore, the program doesn’t create copies of the strings with the names, but it fetches the strings when they’re needed from the lists first_team and second_team.
Here’s another way to observe the relationship between the iterator and the original data structures:

>>> first_team = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> second_team = ["Mark", "Zara", "Mo", "Jennifer", "Owen"]
>>> import sys
>>> sys.getrefcount(first_team)
2
>>> quiz_team = itertools.chain(first_team, second_team)
>>> sys.getrefcount(first_team)
3

The function sys.getrefcount() counts the number of times an object is referenced in the program. Note that sys.getrefcount() always shows one more reference to the object that comes from the call to sys.getrefcount() itself. Therefore, when there’s only one reference to an object in the rest of the program, sys.getrefcount() shows two references.
When you create the chain object, you create another reference to the two lists since quiz_team needs a reference to where the original data is stored. Therefore, sys.getrefcount() shows an extra reference to first_team. But this reference disappears when you exhaust the iterator:

>>> for name in quiz_team:
...     print(name)
...
Sarah
Matt
Jim
Denise
Kate
Mark
Zara
Mo
Jennifer
Owen
>>> sys.getrefcount(first_team)
2

Lazy evaluation of data structures such as itertools.chain relies on this reference between the iterator, such as itertools.chain, and the structure containing the data, such as first_team.
Another tool in itertools that highlights the difference between eager and lazy evaluation is itertools.islice(), which is the lazy evaluation version of Python’s slice. Create a list of numbers and a standard slice of that list:

>>> numbers = [2, 4, 6, 8, 10]
>>> standard_slice = numbers[1:4]
>>> standard_slice
[4, 6, 8]

Now, you can create an iterator version of the slice using itertools.islice():

>>> iterator_slice = itertools.islice(numbers, 1, 4)
>>> iterator_slice
<itertools.islice object at 0x117c93650>

The arguments in itertools.islice() include the iterable you want to slice and the integers to determine the start and stop indices of the slice, just like in a standard slice. You can also include an extra argument representing the step size. The final output doesn’t show the values in the slice since these haven’t been generated yet. They’ll be created when needed.
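As a quick sketch of the step argument, the example below takes every second item and compares the result with the equivalent standard slice. The variable names are illustrative:

```python
import itertools

numbers = [2, 4, 6, 8, 10]

# Start at index 0, stop before index 5, and take every second item.
every_other = itertools.islice(numbers, 0, 5, 2)
lazy_result = list(every_other)

# The equivalent eager slice for comparison.
eager_result = numbers[0:5:2]

print(lazy_result)
print(eager_result)
```

Unlike a standard slice, itertools.islice() doesn’t accept negative values for start, stop, or step, since it can’t look ahead to find the end of an iterator.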
Finally, change one of the values in the list and loop through the standard slice and the iterator slice to compare the outputs:

>>> numbers[2] = 999
>>> numbers
[2, 4, 999, 8, 10]
>>> for number in standard_slice:
...     print(number)
...
4
6
8
>>> for number in iterator_slice:
...     print(number)
...
4
999
8

You modify the third element in the list numbers. This change doesn’t affect the standard slice, which still contains the original numbers. When you create a standard slice, Python evaluates that slice eagerly and creates a new list containing the subset of data from the original sequence.
However, the iterator slice is evaluated lazily. Therefore, as you change the third value in the list before you loop through the iterator slice, the value in iterator_slice is also affected.
You’ll visit the itertools module again later in this tutorial to explore a few more of its iterators.
Expressions that create built-in data structures, such as lists, tuples, or dictionaries, are evaluated eagerly. They generate and store all of the items in these data structures immediately. An example of this kind of expression is a list comprehension:

>>> import random
>>> coin_toss = [
...     "Heads" if random.random() > 0.5 else "Tails"
...     for _ in range(10)
... ]
>>> coin_toss
['Heads', 'Heads', 'Tails', 'Tails', 'Heads', 'Tails', 'Tails', 'Heads', 'Heads', 'Heads']

The expression on the right-hand side of the assignment operator (=) is a list comprehension. This expression is evaluated eagerly, and the ten heads or tails values are created and stored in the new list.
The list comprehension includes a conditional expression that returns either the string "Heads" or "Tails" depending on the value of the condition between the if and else keywords. The random.random() function creates a random float between 0 and 1. Therefore, there’s a 50 percent chance for the value created to be "Heads" or "Tails".
You can replace the square brackets with parentheses on the right-hand side of the assignment operator:

>>> coin_toss = (
...     "Heads" if random.random() > 0.5 else "Tails"
...     for _ in range(10)
... )
>>> coin_toss
<generator object <genexpr> at 0x117a43440>

The expression in parentheses is a generator expression. Even though it looks similar to the list comprehension, this expression is not evaluated eagerly. It creates a generator object. A generator object is a type of iterator that generates values when they’re needed.
The generator object coin_toss doesn’t store any of the string values. Instead, it will generate each value when it’s needed. You can generate and fetch the next value using the built-in next():

>>> next(coin_toss)
'Tails'
>>> next(coin_toss)
'Heads'

The expression that generates "Heads" or "Tails" is only evaluated when you call next(). This generator will generate ten values since you use range(10) in the generator’s for clause. As you called next() twice, there are eight values left to generate:

>>> for toss_result in coin_toss:
...     print(toss_result)
...
Heads
Heads
Heads
Tails
Tails
Heads
Tails
Heads

The for loop iterates eight times, once for each of the remaining items in the generator. A generator expression is the lazy evaluation alternative to creating a list or a tuple. It’s intended to be used once, unlike its eager counterparts like lists and tuples.
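The single-use nature of a generator is easy to overlook. This short sketch, which uses a deterministic generator expression rather than the random coin tosses above, shows what happens when you try to go through a generator a second time:

```python
squares = (number ** 2 for number in range(5))

first_pass = list(squares)

# The generator is now exhausted, so a second pass finds nothing.
second_pass = list(squares)

print(first_pass)
print(second_pass)
```

No error is raised on the second pass. You simply get an empty result, so if you need the values more than once, store them in a list or create a new generator.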
You can also create a generator object using a generator function. A generator function is a function definition that has a yield statement instead of a return statement. You can define a generator function to create a generator object similar to the one you used in the coin toss example above:

>>> def generate_coin_toss(number):
...     for _ in range(number):
...         yield "Heads" if random.random() > 0.5 else "Tails"
...
>>> coin_toss = generate_coin_toss(10)
>>> next(coin_toss)
'Heads'
>>> next(coin_toss)
'Tails'
>>> for toss_result in coin_toss:
...     print(toss_result)
...
Tails
Heads
Tails
Heads
Tails
Tails
Heads
Tails

You create a new generator object each time you call the generator function. Unlike standard functions with return, which are evaluated in full, a generator is evaluated lazily. Therefore, when the first value is needed, the generator function’s code runs up to the first yield statement. It yields this value and pauses, waiting for the next time a value is needed.
This process keeps running until there are no more yield statements and the generator function terminates, raising a StopIteration exception. The iteration protocol in the for loop catches this StopIteration error, which is used to signal the end of the for loop.
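To make the protocol more concrete, here’s a rough sketch of what a for loop does on your behalf, written out with a while loop. The generator function countdown() is a hypothetical example that isn’t used elsewhere in this tutorial, and the real for loop is implemented inside the interpreter rather than in Python code like this:

```python
def countdown(start):
    """Yield start, start - 1, ..., 1 (a hypothetical example generator)."""
    while start > 0:
        yield start
        start -= 1

collected = []
iterator = iter(countdown(3))
while True:
    try:
        value = next(iterator)
    except StopIteration:
        # No more values: this is the signal a for loop uses to stop.
        break
    collected.append(value)

print(collected)
```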
Lazy iteration in Python also allows you to create multiple versions of the data structure that are independent of each other:

>>> first_coin_tosses = generate_coin_toss(10)
>>> second_coin_tosses = generate_coin_toss(10)
>>> next(first_coin_tosses)
'Tails'
>>> next(first_coin_tosses)
'Tails'
>>> next(first_coin_tosses)
'Heads'
>>> second_as_list = list(second_coin_tosses)
>>> second_as_list
['Heads', 'Heads', 'Heads', 'Heads', 'Heads', 'Tails', 'Tails', 'Tails', 'Tails', 'Heads']
>>> next(second_coin_tosses)
Traceback (most recent call last):
  ...
  File "<input>", line 1, in <module>
StopIteration
>>> next(first_coin_tosses)
'Tails'

The two generators, first_coin_tosses and second_coin_tosses, are separate generators created from the same generator function. You evaluate the first three values of first_coin_tosses. This leaves seven values in the first generator.
Next, you convert the second generator into a list. This evaluates all its values to store them in the list second_as_list. There are ten values since the values you got from the first generator have no effect on the second one.
You confirm there are no more values left in the second generator when you call next() and get a StopIteration error. However, the first generator, first_coin_tosses, still has values to evaluate since it’s independent of the second generator.
Generators, and iterators in general, are central tools when dealing with lazy evaluation in Python. This is because they only yield values when they’re needed and don’t store all their values in memory.
The examples of lazy evaluation you’ve seen so far focused on expressions that create data structures. However, these are not the only types of expressions that can be evaluated lazily. Consider the and and or operators. A common misconception is that these operators return True or False. In general, they don’t.
You can start to explore and with a few examples:

>>> True and True
True
>>> True and False
False
>>> 1 and 0
0
>>> 0 and 1
0
>>> 1 and 2
2
>>> 42 and "hello"
'hello'

The first two examples have Boolean operands and return a Boolean. The result is True only when both operands are True. However, the third example doesn’t return a Boolean. Instead, it returns 0, which is the second operand in 1 and 0. And 0 and 1 also returns 0, but this time, it’s the first operand. The integer 0 is falsy, which means that bool(0) returns False.
Similarly, the integer 1 is truthy, which means that bool(1) returns True. All non-zero integers are truthy. When Python needs a Boolean value, such as in an if statement or with operators such as and and or, it converts the object to a Boolean to determine whether to treat it as true or false.
When you use the and operator, the program evaluates the first operand and checks whether it’s truthy or falsy. If the first operand is falsy, there’s no need to evaluate the second operand since both need to be truthy for the overall result to be truthy. This is what occurs in the expression 0 and 1 where the and operator returns the first value, which is falsy. Therefore, the whole expression is falsy.
Python doesn’t evaluate the second operand when the first one is falsy. This is called short-circuit evaluation and it’s an example of lazy evaluation. Python only evaluates the second operand if it needs it.
If the first operand is truthy, Python evaluates and returns the second operand, whatever its value. If the first operand is truthy, the truthiness of the second operand determines the overall truthiness of the and expression.
The final two examples include operands that are truthy. The second operand is returned in both cases to make the whole expression truthy. You can confirm that Python doesn’t evaluate the second operand if the first is falsy with the following examples:

>>> 0 and print("Do you see this text?")
0
>>> 1 and print("Do you see this text?")
Do you see this text?

In the first example, the first operand is 0, and the print() function is never called. In the second example, the first operand is truthy. Therefore, Python evaluates the second operand, calling the print() function. Note that the result of the and expression is the value returned by print(), which is None.
Another striking demonstration of short-circuiting is when you use an invalid expression as the second operand in an and expression:

>>> 0 and int("python")
0
>>> 1 and int("python")
Traceback (most recent call last):
  ...
  File "<input>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'python'

The call int("python") raises a ValueError since the string "python" can’t be converted into an integer. However, in this first example, the and expression returns 0 without raising the error. The second operand was never evaluated!
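Short-circuiting with and is often used as a guard in everyday code. The sketch below uses a hypothetical helper, first_is_positive(), which isn’t part of the examples above, to show how the first operand can protect the second one from raising an error:

```python
def first_is_positive(numbers):
    # An empty list is falsy, so numbers[0] is never evaluated
    # and no IndexError is raised.
    return bool(numbers and numbers[0] > 0)

print(first_is_positive([]))
print(first_is_positive([3, 1]))
print(first_is_positive([-1, 5]))
```

The call to bool() is there because and returns one of its operands. Without it, the function would return the empty list itself when numbers is empty.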
The or operator works similarly. However, only one operand needs to be truthy for the entire expression to evaluate as truthy. Therefore, if the first operand is truthy, it’s returned, and the second operand isn’t evaluated:

>>> 1 or 2
1
>>> 1 or 0
1
>>> 1 or int("python")
1

In all these examples, the first operand is returned since it’s truthy. The second operand is ignored and is never evaluated. You confirm this with the final example, which doesn’t raise a ValueError. This is short-circuit evaluation in the or expression. Python is lazy and doesn’t evaluate expressions that have no effect on the final outcome.
However, if the first operand is falsy, the result of the or expression is determined by the second operand:

>>> 0 or 1
1
>>> 0 or int("python")
Traceback (most recent call last):
  ...
  File "<input>", line 1, in <module>
ValueError: invalid literal for int() with base 10: 'python'

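Since or returns one of its operands, a common use of this behavior is to supply a fallback value. The function greet() below is a hypothetical illustration and isn’t part of the examples in this tutorial:

```python
def greet(name=""):
    # An empty string is falsy, so the fallback on the right is used.
    # A non-empty name is truthy, so the fallback is never evaluated.
    display_name = name or "stranger"
    return f"Hello, {display_name}!"

print(greet("Sarah"))
print(greet())
```

Keep in mind that or falls back for any falsy first operand, including 0 and empty containers, so this idiom only fits cases where every falsy value should be replaced.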
The built-in any() and all() functions are also evaluated lazily using short-circuit evaluation. The any() function returns True if any of the elements in an iterable is truthy:

>>> any([0, False, ""])
False
>>> any([0, False, "hello"])
True

The list you use in the first call to any() contains the integer 0, the Boolean False, and an empty string. All three objects are falsy and any() returns False. In the second example, the final element is a non-empty string, which is truthy. The function returns True.
The function stops evaluating elements of the iterable when it finds the first truthy value. You can confirm this using a trick similar to the one you used with the and and or operators with help from a generator function, which you learned about in the previous section:

>>> def lazy_values():
...     yield 0
...     yield "hello"
...     yield int("python")
...     yield 1
...
>>> any(lazy_values())
True

You define the generator function lazy_values() with four yield statements. The third statement is invalid since "python" can’t be converted into an integer. You create a generator when you call this function in the call to any().
The program doesn’t raise any errors, and any() returns True. The evaluation of the generator stopped when any() encountered the string "hello", which is the first truthy value in the generator. The function any() performs lazy evaluation.
However, if the invalid expression doesn’t have any truthy values ahead of it, it’s evaluated and raises an error:

>>> def lazy_values():
...     yield 0
...     yield ""
...     yield int("python")
...     yield 1
...
>>> any(lazy_values())
Traceback (most recent call last):
  ...
  File "<input>", line 1, in <module>
  File "<input>", line 4, in lazy_values
ValueError: invalid literal for int() with base 10: 'python'

The first two values are falsy. Therefore, any() evaluates the third value, which raises the ValueError.
The function all() behaves similarly. However, all() requires all the elements of the iterable to be truthy. Therefore, all() short-circuits when it encounters the first falsy value. You update the generator function lazy_values() to verify this behavior:

>>> def lazy_values():
...     yield 1
...     yield ""
...     yield int("python")
...     yield 1
...
>>> all(lazy_values())
False

This code doesn’t raise an error since all() returns False when it evaluates the empty string, which is the second element in the generator.
Short-circuiting, like other forms of lazy evaluation, prevents unnecessary evaluation of expressions when these expressions are not required at run time.
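You can also measure how much work short-circuiting saves. In this sketch, the hypothetical function is_even() records every number it checks, and a generator expression feeds the results to any() one at a time:

```python
checked = []

def is_even(number):
    # Record each number that actually gets checked.
    checked.append(number)
    return number % 2 == 0

numbers = [1, 3, 4, 5, 7]

# any() stops asking for values as soon as it sees a truthy one.
found_even = any(is_even(number) for number in numbers)

print(found_even)
print(checked)
```

Only the first three numbers are checked. If you passed a list comprehension to any() instead of a generator expression, the list would be built eagerly and is_even() would run for all five numbers before any() started.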
Functional programming is a programming paradigm in which functions only have access to data input as arguments and do not alter the state of objects, returning new objects instead. A program written in this style consists of a series of these functions, often with the output from a function used as an input for another function.
Since data is often passed from one function to another, it’s convenient to use lazy evaluation of data structures to avoid storing and moving large datasets repeatedly.
Three of the principal tools in functional programming are Python’s built-in map() and filter() functions and reduce(), which is part of the functools module. Technically, the first two are not functions but constructors of the map and filter classes. However, you use them in the same way you use functions, especially in the functional programming paradigm.
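The rest of this section focuses on map() and filter(), so here’s a minimal sketch of reduce() for completeness. The data is made up for the example. Unlike the other two, reduce() is a regular function that consumes its iterable right away and returns a single value:

```python
import functools
import operator

numbers = [1, 2, 3, 4, 5]

# reduce() combines the items pairwise: ((((1 + 2) + 3) + 4) + 5)
total = functools.reduce(operator.add, numbers)
print(total)
```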
You can explore map() and filter() with the following example. Create a list of strings containing names. First, you want to convert all names to uppercase:

>>> original_names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> names = map(str.upper, original_names)
>>> names
<map object at 0x117ad31f0>

The map() function applies the function str.upper() to each item in the iterable. Each name in the list is passed to str.upper(), and the value returned is used.
However, map() doesn’t create a new list. Instead, it creates a map object, which is an iterator. It’s not surprising that iterators appear often in a tutorial about lazy evaluation since they’re one of the main tools for the lazy evaluation of values!
You can evaluate each value, one at a time, using next():

>>> next(names)
'SARAH'
>>> next(names)
'MATT'

You can also convert the map object into a list. This evaluates the values so they can be stored in the list:

>>> list(names)
['JIM', 'DENISE', 'KATE']

There are only three names in this list. You already evaluated and used the first two names when you called next() twice. Since values are evaluated when they’re needed and not stored in the data structure, you can only use them once.
Now, you only want to keep names that contain at least one letter a. You can use filter() for this task. First, you’ll need to recreate the map object representing the uppercase names since you already exhausted this iterator in the REPL session:

>>> names = map(str.upper, original_names)
>>> names = filter(lambda x: "A" in x, names)
>>> names
<filter object at 0x117ad0610>

Each item in the second argument in filter(), which is the map object names, is passed to the lambda function you include as the first argument. Only the values for which the lambda function returns True are kept. The rest are discarded.
You reuse the variable called names at each stage. If you prefer, you can use different variable identifiers, but if you don’t need to keep the intermediate results, it’s best to use the same variable. The object that filter() returns is another iterator, a filter object. Therefore, its values haven’t been evaluated yet.
You can cast the filter object to a list as you did in the previous example. But in this case, try looping using a for loop instead:

>>> for name in names:
...     print(name)
...
SARAH
MATT
KATE

The first function call to map() converts the names to uppercase. The second call, this time to filter(), only keeps the names that include the letter a. You use uppercase A in the code since you’ve already converted all the names to uppercase.
Finally, you only keep names that are four letters long. The code below shows all the map() and filter() operations since you need to recreate these iterators each time:

>>> names = map(str.upper, original_names)
>>> names = filter(lambda x: "A" in x, names)
>>> names = filter(lambda x: len(x) == 4, names)
>>> list(names)
['MATT', 'KATE']

You can reorder the operations to make the overall evaluation lazier. The first operation converts all names to uppercase, but since you discard some of these names later, it would be best to avoid converting these names. You can filter the names first and convert them to uppercase in the final step. You add "Andy" to the list of names to ensure that your code works whether the required letter is uppercase or lowercase:

>>> original_names = ["Sarah", "Matt", "Jim", "Denise", "Kate", "Andy"]
>>> names = filter(lambda x: ("a" in x) or ("A" in x), original_names)
>>> names = filter(lambda x: len(x) == 4, names)
>>> names = map(str.upper, names)
>>> list(names)
['MATT', 'KATE', 'ANDY']

The first call to filter() now checks if either uppercase or lowercase a is in the name. Since it’s more likely that the letter a is not the first letter in the name, you set the first operand to ("a" in x) in the or expression to take advantage of short-circuiting with the or operator.
The lazy evaluation obtained from using map and filter iterators means that temporary data structures containing all the data are not needed in each function call. This won’t have a significant impact in this case since the list only contains six names, but it can affect performance with large sets of data.
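To see that no intermediate collection is built, you can log each function call and inspect the order in which the calls happen. The helper functions shout() and has_four_letters() and the log list are hypothetical additions for this sketch:

```python
log = []

def shout(name):
    log.append(f"upper({name})")
    return name.upper()

def has_four_letters(name):
    log.append(f"check({name})")
    return len(name) == 4

original_names = ["Sarah", "Matt", "Jim"]

names = filter(has_four_letters, original_names)
names = map(shout, names)

# Nothing has run yet because no value has been requested.
calls_before = list(log)

result = list(names)

print(calls_before)
print(result)
print(log)
```

The log shows that "Matt" is converted to uppercase before "Jim" is even checked. Each name travels through the whole pipeline on its own, rather than every stage processing the full dataset before the next stage starts.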
The final example of expressions that are evaluated lazily will focus on reading data from a comma-separated values file, usually referred to as a CSV file. CSV files are a basic spreadsheet file format. They are text files with the .csv file extension that have commas separating values to denote values that belong to different cells in the spreadsheet. Each line ends with the newline character "\n" to show where each row ends.
You can use any CSV file you wish for this section, or you can copy the data below and save it as a new text file with the .csv extension. Name the CSV file superhero_pets.csv and place it in your project folder:
superhero_pets.csv
Pet Name,Species,Superpower,Favorite Snack,Hero Owner
Whiskertron,Cat,Teleportation,Tuna,Catwoman
Flashpaw,Dog,Super Speed,Peanut Butter,The Flash
Mystique,Squirrel,Illusion,Nuts,Doctor Strange
Quackstorm,Duck,Weather Control,Bread crumbs,Storm
Bark Knight,Dog,Darkness Manipulation,Bacon,Batman

You’ll explore two ways of reading data from this CSV file. In the first version, you’ll open the file and use the .readlines() method for file objects:

>>> import pprint
>>> with open("superhero_pets.csv", encoding="utf-8") as file:
...     data = file.readlines()
...
>>> pprint.pprint(data)
['Pet Name,Species,Superpower,Favorite Snack,Hero Owner\n',
 'Whiskertron,Cat,Teleportation,Tuna,Catwoman\n',
 'Flashpaw,Dog,Super Speed,Peanut Butter,The Flash\n',
 'Mystique,Squirrel,Illusion,Nuts,Doctor Strange\n',
 'Quackstorm,Duck,Weather Control,Bread crumbs,Storm\n',
 'Bark Knight,Dog,Darkness Manipulation,Bacon,Batman\n']
>>> print(type(data))
<class 'list'>

You import pprint to enable pretty printing of large data structures. Once you open the CSV file using the with context manager, specifying the file’s encoding, you call the .readlines() method for the open file. This method returns a list that contains all the data in the spreadsheet. Each item in the list is a string containing all the elements in a row.
This evaluation is eager since .readlines() extracts all the contents of the spreadsheet and stores them in a list. This spreadsheet doesn’t contain a lot of data. However, this route could lead to significant pressure on memory resources if you’re reading large amounts of data.
Instead, you can use Python’s csv module, which is part of the standard library. To simplify this code in the REPL, you can open the file without using a with context manager. However, you should remember to close the file when you do so. In general, you should use with to open files whenever possible:
>>> import csv
>>> file = open("superhero_pets.csv", encoding="utf-8", newline="")
>>> data = csv.reader(file)
>>> data
<_csv.reader object at 0x117a830d0>
You add the named argument newline="" when opening the file to use with the csv module to ensure that any newlines within fields are dealt with correctly. The object returned by csv.reader() is not a list but an iterator. You’ve encountered iterators enough times already in this article to know what to expect.
The contents of the spreadsheet aren’t stored in a data structure in the Python program. Instead, Python will lazily fetch each line when it’s needed, getting the data directly from the file, which is still open:
>>> next(data)
['Pet Name', 'Species', 'Superpower', 'Favorite Snack', 'Hero Owner']
>>> next(data)
['Whiskertron', 'Cat', 'Teleportation', 'Tuna', 'Catwoman']
>>> next(data)
['Flashpaw', 'Dog', 'Super Speed', 'Peanut Butter', 'The Flash']
The first call to next() triggers the evaluation of the first item of the data iterator. This is the first row of the spreadsheet, which is the header row. You call next() another two times to fetch the first two rows of data.
You can use a for loop to iterate through the rest of the iterator, and evaluate the remaining items:
>>> for row in data:
...     print(row)
...
['Mystique', 'Squirrel', 'Illusion', 'Nuts', 'Doctor Strange']
['Quackstorm', 'Duck', 'Weather Control', 'Bread crumbs', 'Storm']
['Bark Knight', 'Dog', 'Darkness Manipulation', 'Bacon', 'Batman']
>>> file.close()
You evaluated the header and the first two rows in earlier code. Therefore, the for loop only has the final three rows to iterate through. And it’s good practice to close the file since you’re not using a with statement.
The reader() function in the csv module enables you to evaluate the spreadsheet rows lazily by fetching each row only when it’s needed. However, calling .readlines() on an open file evaluates the rows eagerly by fetching them all immediately.
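To see the difference without needing a file on disk, here’s a minimal sketch that feeds csv.reader() from a generator of lines. The tracked_lines() helper and the lines_read list are hypothetical names introduced only for this illustration. They record each line as it’s handed to the reader:

```python
import csv
import io

# A small in-memory stand-in for the first lines of superhero_pets.csv
csv_text = (
    "Pet Name,Species,Superpower,Favorite Snack,Hero Owner\n"
    "Whiskertron,Cat,Teleportation,Tuna,Catwoman\n"
    "Flashpaw,Dog,Super Speed,Peanut Butter,The Flash\n"
)

# Eager: .readlines() builds a list with every line straight away
eager_rows = io.StringIO(csv_text).readlines()

lines_read = []  # Records each line as the reader asks for it


def tracked_lines(text):
    """Yield the lines of text one at a time, logging each one."""
    for line in text.splitlines(keepends=True):
        lines_read.append(line)
        yield line


# Lazy: csv.reader() accepts any iterable of strings and reads nothing yet
reader = csv.reader(tracked_lines(csv_text))
lines_before = len(lines_read)

header = next(reader)  # Fetches and parses only the first line
lines_after = len(lines_read)

print(len(eager_rows), lines_before, lines_after)
print(header)
```

The eager version holds every line at once, while the reader has only touched a single line after the first call to next().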
Lazy evaluation of expressions also enables data structures with infinite elements. Infinite data structures can’t be achieved through eager evaluation since it’s not possible to generate and store infinite elements in memory! However, when elements are generated on demand, as in lazy evaluation, it’s possible to have an object that represents an infinite number of elements.
The itertools module has several tools that can be used to create infinite iterables. One of these is itertools.count(), which yields sequential numbers indefinitely. You can set the starting value and the step size when you create a count iterator:
>>> import itertools
>>> quarters = itertools.count(start=0, step=0.25)
>>> for _ in range(8):
...     print(next(quarters))
...
0
0.25
0.5
0.75
1.0
1.25
1.5
1.75
The iterator quarters will yield values 0.25 larger than the previous one and will keep yielding values forever. However, none of these values is generated when you define quarters. Each value is generated when it’s needed, such as by calling next() or as part of an iteration process, such as a for loop.
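If you’d rather not call next() by hand, one option is itertools.islice(), which takes a finite number of items from an iterator and is itself lazy. The following is a small sketch of that approach, with variable names chosen for this example:

```python
import itertools

quarters = itertools.count(start=0, step=0.25)

# islice() stops after four items without exhausting the count iterator
first_four = list(itertools.islice(quarters, 4))

# The count iterator keeps its place, so the next value continues the series
next_value = next(quarters)

print(first_four)
print(next_value)
```

Because quarters is never exhausted, you can keep taking further slices from it whenever you need more values.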
Another tool you can use to create infinite iterators is itertools.cycle(). You can explore this tool with the list of team member names you used earlier in this tutorial to create a rota for who’s in charge of getting coffee in the morning. You decide you don’t want to regenerate the rota every week, so you create an infinite iterator that cycles through the names:
>>> names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
>>> rota = itertools.cycle(names)
>>> rota
<itertools.cycle object at 0x1156be340>
The object returned by itertools.cycle() is an iterator. Therefore, it doesn’t create all its elements when it’s first created. Instead, it generates values when they’re needed:
>>> next(rota)
'Sarah'
>>> next(rota)
'Matt'
>>> next(rota)
'Jim'
>>> next(rota)
'Denise'
>>> next(rota)
'Kate'
>>> next(rota)
'Sarah'
>>> next(rota)
'Matt'
The cycle iterator rota starts yielding each name from the original list names. When all names have been yielded once, the iterator starts yielding names from the beginning of the list again. This iterator will never run out of values to yield since it will restart from the beginning of the list each time it reaches the last name.
This is an object with an infinite number of elements. However, only five strings are stored in memory since there are only five names in the original list.
The iterator rota is iterable, like all iterators. Therefore, you can use it as part of a for loop statement. However, this now creates an infinite loop since the for loop never receives a StopIteration exception to trigger the end of the loop.
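In practice, you’d normally bound a loop over an infinite iterator yourself. The sketch below shows two possible ways of doing that, using break and zip(). The seven-day limit and the weekdays list are assumptions made for this example:

```python
import itertools

names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
rota = itertools.cycle(names)

# Option 1: break out of the loop yourself after a set number of days
schedule = []
for day, name in enumerate(rota, start=1):
    schedule.append((day, name))
    if day == 7:
        break

# Option 2: pair the infinite iterator with a finite one using zip(),
# which stops as soon as the shorter iterable runs out
weekdays = ["Mon", "Tue", "Wed"]
next_days = list(zip(weekdays, rota))

print(schedule)
print(next_days)
```

The second option continues from where the first loop stopped because rota keeps its position between uses.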
You can also achieve infinite data structures using generator functions. You can recreate the rota iterator by first defining the generator function generate_rota():
>>> def generate_rota(iterable):
...     index = 0
...     length = len(iterable)
...     while True:
...         yield iterable[index]
...         if index == length - 1:
...             index = 0
...         else:
...             index += 1
...
>>> rota = generate_rota(names)
>>> for _ in range(12):
...     print(next(rota))
...
Sarah
Matt
Jim
Denise
Kate
Sarah
Matt
Jim
Denise
Kate
Sarah
Matt
In the generator function generate_rota(), you manually manage the index to fetch items from the iterable, increasing the value after each item is yielded and resetting it to zero when you reach the end of the iterable. The generator function includes a while True statement, which makes this an infinite data structure.
In this example, the generator function replicates behavior you can achieve with itertools.cycle(). However, you can create any generator with custom requirements using this technique.
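As one example of a custom requirement, suppose some team members are away and should be skipped. The sketch below is a hypothetical variation, not part of the original rota code. The function name generate_rota_skipping() and its away parameter are invented for this illustration:

```python
import itertools


def generate_rota_skipping(names, away=()):
    """Yield names in order forever, skipping anyone listed in away."""
    available = [name for name in names if name not in away]
    if not available:
        return  # Nothing to yield, so the generator ends immediately
    index = 0
    while True:
        yield available[index]
        index = (index + 1) % len(available)  # Wrap around at the end


names = ["Sarah", "Matt", "Jim", "Denise", "Kate"]
rota = generate_rota_skipping(names, away={"Jim"})

# Take a finite number of names from the infinite generator
first_six = list(itertools.islice(rota, 6))
print(first_six)
```

The modulo operation replaces the if and else branches that reset the index, and the early return avoids an endless loop that never yields when nobody is available.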
You can revisit an earlier example to explore one of the main advantages of lazy evaluation. You created a list and a generator object with several outcomes from a coin toss earlier in this tutorial. In this version, you’ll create one million coin tosses in each one:
>>> import random
>>> coin_toss_list = [
...     "Heads" if random.random() > 0.5 else "Tails"
...     for _ in range(1_000_000)
... ]
>>> coin_toss_gen = (
...     "Heads" if random.random() > 0.5 else "Tails"
...     for _ in range(1_000_000)
... )
>>> import sys
>>> sys.getsizeof(coin_toss_list)
8448728
>>> sys.getsizeof(coin_toss_gen)
200
You create a list and a generator object. Both objects represent one million strings with either "Heads" or "Tails". However, the list takes up over eight million bytes of memory, whereas the generator uses only 200 bytes. You may get a slightly different number of bytes depending on the Python version you’re using.
The list contains all of the one million strings, whereas the generator doesn’t since it will generate these values when they’re needed. When you have large amounts of data, using eager evaluation to define data structures may put pressure on memory resources in your program and affect performance.
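The memory saving carries through when you consume the generator, as long as you process one value at a time. Here’s a minimal sketch that counts heads with sum() without ever building a list. The smaller number of tosses and the fixed seed are assumptions made to keep the example quick and repeatable:

```python
import random

random.seed(42)  # Fixed seed so repeated runs give the same tosses

number_of_tosses = 100_000
coin_toss_gen = (
    "Heads" if random.random() > 0.5 else "Tails"
    for _ in range(number_of_tosses)
)

# sum() pulls one toss at a time, so no list of strings is ever built
heads = sum(toss == "Heads" for toss in coin_toss_gen)
tails = number_of_tosses - heads

leftover = list(coin_toss_gen)  # Empty: the generator is now exhausted

print(heads, tails, leftover)
```

Each toss is created, counted, and discarded before the next one is generated.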
This example also shows another advantage of using lazy evaluation when you create a data structure. You could use the conditional expression that returns "Heads" or "Tails" at random directly in the code whenever you need it. However, creating a generator might be a better option.
Since you included the logic of how to create the values you need in the generator expression, you can use a more declarative style of coding in the rest of your code. You state what you want to achieve without focusing on how to achieve it. This can make your code more readable.
Another advantage of lazy evaluation is the performance gains you could achieve by avoiding the evaluation of expressions that you don’t need. These benefits become noticeable in programs that evaluate large numbers of expressions.
You can demonstrate this performance benefit using the timeit module in Python’s standard library. You can explore this with the short-circuit evaluation when you use the and operator. The following two expressions are similar and return the same truthiness:
>>> import random
>>> random.randint(0, 1) and random.randint(0, 10)
1
>>> random.randint(0, 10) and random.randint(0, 1)
8
These expressions return a truthy value if both calls to random.randint() return non-zero values. They will return 0 if at least one call returns 0. However, random.randint(0, 1) is more likely to return 0 than random.randint(0, 10).
Therefore, if you need to evaluate this expression repeatedly in your code, the first version is more efficient due to short-circuit evaluation. You can time how long it takes to evaluate these expressions many times:
>>> import timeit
>>> timeit.repeat(
...     "random.randint(0, 1) and random.randint(0, 10)",
...     number=1_000_000,
...     globals=globals(),
... )
[0.39701350000177626, 0.37251866700171377, 0.3730850419997296, 0.3731833749989164, 0.3740811660027248]
>>> timeit.repeat(
...     "random.randint(0, 10) and random.randint(0, 1)",
...     number=1_000_000,
...     globals=globals(),
... )
[0.504747375001898, 0.4694556670001475, 0.4706860409969522, 0.4841222920003929, 0.47349566599950776]
The output shows the time it takes for one million evaluations of each expression. There are five separate timings for each expression. The first version is the one that has random.randint(0, 1) as its first operand, and it runs quicker than the second one, which has the operands switched around.
The evaluation of the and expression short-circuits when the first random.randint() call returns 0. Since random.randint(0, 1) has a 50 percent chance of returning 0, roughly half the evaluations of the and expression will only call the first random.randint().
When random.randint(0, 10) is the first operand, the expression’s evaluation will short-circuit only about once in every eleven times it runs, since random.randint(0, 10) can return eleven possible values.
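You can check these proportions empirically by counting how often the second operand is actually evaluated. The sketch below wraps the second call in a hypothetical helper, second_operand(), that increments a counter. The helper, the counter dictionary, and the number of runs are all assumptions made for this illustration:

```python
import random

random.seed(0)  # Fixed seed so the counts are reproducible

counter = {"second_calls": 0}


def second_operand(high):
    """Stand-in for the second randint() call that counts its own calls."""
    counter["second_calls"] += 1
    return random.randint(0, high)


runs = 100_000

# Version 1: randint(0, 1) comes first, so about half the runs short-circuit
counter["second_calls"] = 0
for _ in range(runs):
    random.randint(0, 1) and second_operand(10)
fraction_small_first = counter["second_calls"] / runs

# Version 2: randint(0, 10) comes first, so about 1 run in 11 short-circuits
counter["second_calls"] = 0
for _ in range(runs):
    random.randint(0, 10) and second_operand(1)
fraction_large_first = counter["second_calls"] / runs

print(fraction_small_first, fraction_large_first)
```

The first fraction should be close to one half and the second close to ten elevenths, which matches the reasoning behind the timing difference.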
The advantages of reducing memory consumption and improving performance can be significant in some projects where demands on resources matter. However, there are some disadvantages to lazy evaluation. You’ll explore these in the next section.
Lazy evaluation reduces memory requirements and unnecessary operations by delaying the evaluation. However, this delay can also make debugging harder. If there’s an error in an expression that’s evaluated lazily, the exception is not raised right away. Instead, you’ll only encounter the error at a later stage of the code’s execution when the expression is evaluated.
To demonstrate this, you can return to the list of team members you used earlier in this tutorial. On this occasion, you want to keep track of the points they gained during a team-building exercise:
>>> players = [
...     {"Name": "Sarah", "Games": 4, "Points": 23},
...     {"Name": "Matt", "Games": 7, "Points": 42},
...     {"Name": "Jim", "Games": 1, "Points": 7},
...     {"Name": "Denise", "Games": 0, "Points": 0},
...     {"Name": "Kate", "Games": 5, "Points": 33},
... ]
You create a list of players, and each item in the list is a dictionary. Each dictionary contains three key-value pairs to store the player’s name, the number of games they played, and the total number of points they scored.
You’re interested in the average number of points per game for each player, so you create a generator with this value for each player:
>>> average_points_per_game = (
...     item["Points"] / item["Games"]
...     for item in players
... )
>>> average_points_per_game
<generator object <genexpr> at 0x11566a880>
The generator expression is evaluated lazily. Therefore, the required values are not evaluated right away. Now, you can start calling next() to fetch the average number of points per game for each player:
>>> next(average_points_per_game)
5.75
>>> next(average_points_per_game)
6.0
>>> next(average_points_per_game)
7.0
>>> next(average_points_per_game)
Traceback (most recent call last):
  ...
  File "<input>", line 1, in <module>
  File "<input>", line 1, in <genexpr>
ZeroDivisionError: division by zero
Your code evaluates and returns the values for the first three players. However, it raises a ZeroDivisionError when it tries to evaluate the fourth value. Denise didn’t enjoy the team-building event and didn’t participate in any of the games. Therefore, she played zero games and scored zero points. The division operation in your generator expression raises an exception in this case.
Eager evaluation would raise this error at the point you create the object. You can replace the parentheses with square brackets to create a list comprehension instead of a generator:
>>> average_points_per_game = [
...     item["Points"] / item["Games"]
...     for item in players
... ]
Traceback (most recent call last):
  ...
  File "<input>", line 1, in <module>
  File "<input>", line 1, in <listcomp>
ZeroDivisionError: division by zero
The error is raised immediately in this scenario. Errors that surface late are harder to identify and fix, which makes debugging more difficult. A popular third-party Python library, TensorFlow, shifted from lazy evaluation to eager evaluation as the default option to facilitate debugging. Users can then turn on lazy evaluation using a decorator once they complete the debugging process.
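The sketch below shows, in script form, where a delayed error surfaces and one possible way to guard against it. The shortened players list and the if item["Games"] filter are assumptions made for this illustration, and other ways of handling missing games would work as well:

```python
players = [
    {"Name": "Sarah", "Games": 4, "Points": 23},
    {"Name": "Denise", "Games": 0, "Points": 0},
    {"Name": "Kate", "Games": 5, "Points": 33},
]

# Creating the generator succeeds even though one division will fail
averages = (item["Points"] / item["Games"] for item in players)

results = []
try:
    for value in averages:
        results.append(value)
except ZeroDivisionError:
    # The error appears here, at consumption time, not at creation time
    error_seen = True
else:
    error_seen = False

# After an exception, the generator is finished, so Kate's value is lost
remaining = list(averages)

# One possible guard: skip players with no games inside the expression
safe_averages = (
    item["Points"] / item["Games"] for item in players if item["Games"]
)
safe_results = list(safe_averages)

print(results, error_seen, remaining, safe_results)
```

A generator that raises an exception can’t be resumed, which is another reason why delayed errors in lazily evaluated expressions can be costly.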
In this tutorial, you learned what lazy evaluation in Python is and how it’s different from eager evaluation. Some expressions aren’t evaluated when the program first encounters them. Instead, they’re evaluated when the values are needed in the program.
This type of evaluation is referred to as lazy evaluation and can lead to more readable code that’s also more memory-efficient and performant. In contrast, eager evaluation is when an expression is evaluated in full immediately.
The ideal evaluation mode depends on several factors. For small data sets, there are no noticeable benefits to using lazy evaluation for memory efficiency and performance. However, the advantages of lazy evaluation become more important for large amounts of data. Lazy evaluation can also make errors and bugs harder to spot and fix.
Lazy evaluation is also not ideal when you’re generating data structures such as iterators and need to use the values repeatedly in your program. This is because you’ll need to generate the values again each time you need them.
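A short sketch makes this limitation concrete. The squares used here are arbitrary example values, not data from earlier in the tutorial:

```python
squares_gen = (number**2 for number in range(5))

first_pass = list(squares_gen)  # Consumes every value
second_pass = list(squares_gen)  # Nothing left: the generator is exhausted

# If you need the values more than once, storing them eagerly is simpler
squares_list = [number**2 for number in range(5)]
total = sum(squares_list)
largest = max(squares_list)

print(first_pass, second_pass, total, largest)
```

The list can be traversed as many times as you like, whereas the generator would need to be recreated before each new pass.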
In Python, lazy evaluation often occurs behind the scenes. However, you’ll also need to decide when to use expressions that are evaluated eagerly or lazily, like when you need to create a list or generator object. Now, you’re equipped with the knowledge to understand how to deal with both types of evaluation.
About Stephen Gruppetta
Stephen obtained a PhD in physics and worked as a physicist in academia for over a decade before becoming a Python educator. He's constantly looking for simple ways to explain complex things in Python.