Sorting a Python Dictionary: Values, Keys, and More

by Ian Currie · Reading time estimate 35m · intermediate · data-structures · stdlib


Sorting a Python dictionary involves organizing its key-value pairs in a specific order. To sort a Python dictionary by its keys, you use the sorted() function combined with .items(). This approach returns a list of tuples sorted by keys, which you can convert back to a dictionary using the dict() constructor. Sorting by values requires specifying a sort key using a lambda function or itemgetter().

By the end of this tutorial, you’ll understand that:

  • You can sort a dictionary by its keys using sorted() with .items() and dict().
  • To sort by values, you use sorted() with a key function like a lambda or itemgetter().
  • Sorting in descending order is possible by setting reverse=True in sorted().
  • For non-comparable keys or values, you use default values or custom sort keys.
  • Python dictionaries can’t be sorted in-place, so you need to create a new sorted dictionary.

Read on to learn how to effectively sort dictionaries using these techniques and the strategic implications of choosing the right data structure for your key-value data. But first, you’ll learn some foundational knowledge that will help you understand how to sort a dictionary in Python.


Rediscovering Dictionary Order in Python

Before Python 3.6, dictionaries were inherently unordered. A Python dictionary is an implementation of the hash table, which is traditionally an unordered data structure.

As a side effect of the compact dictionary implementation in Python 3.6, dictionaries started to conserve insertion order. From 3.7, that insertion order has been guaranteed.

If you wanted to keep an ordered dictionary as a data structure before compact dictionaries, then you could use OrderedDict from the collections module. Similar to the modern compact dictionary, it also keeps insertion order, but neither type of dictionary sorts itself.
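
To make the difference between insertion order and sorted order concrete, here’s a minimal sketch. The names people and ordered_people are only illustrative:

```python
from collections import OrderedDict

# A regular dictionary remembers the order in which keys were inserted.
people = {3: "Jim", 2: "Jack"}
people[4] = "Jane"
people[1] = "Jill"  # The smallest key still ends up last.

print(list(people))  # [3, 2, 4, 1]

# OrderedDict behaves the same way: ordered by insertion, not sorted.
ordered_people = OrderedDict([(3, "Jim"), (2, "Jack")])
ordered_people[1] = "Jill"

print(list(ordered_people))  # [3, 2, 1]
```

In both cases, the keys come back in the order they were added, not in ascending order.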

Another alternative for storing ordered key-value data is to store the pairs as a list of tuples. As you’ll see later in the tutorial, using a list of tuples could be the best choice for your data.

An essential point to understand when sorting dictionaries is that even though they conserve insertion order, they’re not considered a sequence. A dictionary is like a set of key-value pairs, and sets are unordered.
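
One way to see this set-like behavior is through equality checks and indexing. The following sketch uses two hypothetical dictionaries, first and second, that hold the same pairs in a different insertion order:

```python
from collections import OrderedDict

first = {1: "Jill", 2: "Jack"}
second = {2: "Jack", 1: "Jill"}

# Regular dictionaries compare equal regardless of insertion order.
print(first == second)  # True

# Lists, which are sequences, take order into account.
print(list(first.items()) == list(second.items()))  # False

# OrderedDict is the exception: comparing two of them is order-sensitive.
print(OrderedDict(first) == OrderedDict(second))  # False

# There's no positional indexing: [0] looks for a key named 0.
try:
    first[0]
except KeyError as error:
    print(f"KeyError: {error}")  # KeyError: 0
```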

Dictionaries also don’t have much reordering functionality. They’re not like lists, where you can insert elements at any position. In the next section, you’ll explore the consequences of this limitation further.

Understanding What Sorting a Dictionary Really Means

Because dictionaries don’t have much reordering functionality, sorting a dictionary is rarely done in-place. In fact, there are no methods for explicitly moving items in a dictionary.

If you wanted to sort a dictionary in-place, then you’d have to use the del keyword to delete an item from the dictionary and then add it again. Deleting and then adding again effectively moves the key-value pair to the end.
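
As a rough sketch of what that would look like, the example below first moves a single pair with del, and then repeats the idea for every key. The loop uses .pop(), which deletes the key and returns its value in one step. That’s a convenience chosen for this sketch rather than a recommended sorting technique:

```python
people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}

# Deleting a key and adding it again moves the pair to the end.
value = people[3]
del people[3]
people[3] = value
print(people)  # {2: 'Jack', 4: 'Jane', 1: 'Jill', 3: 'Jim'}

# Repeating that for every key, in sorted order, sorts the same
# dictionary object in place. sorted() builds a separate list of keys
# first, so the dictionary isn't mutated while it's being iterated.
for key in sorted(people):
    people[key] = people.pop(key)

print(people)  # {1: 'Jill', 2: 'Jack', 3: 'Jim', 4: 'Jane'}
```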

The OrderedDict class has a specific method to move an item to the end or the start, which may make OrderedDict preferable for keeping a sorted dictionary. However, it’s still not very common and isn’t very performant, to say the least.
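
The method in question is .move_to_end(). Here’s a short illustration of how it behaves, using the same sample data as the rest of the tutorial:

```python
from collections import OrderedDict

people = OrderedDict({3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"})

people.move_to_end(3)  # Move key 3 to the end.
print(list(people))  # [2, 4, 1, 3]

people.move_to_end(1, last=False)  # Move key 1 to the start.
print(list(people))  # [1, 2, 4, 3]
```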

The typical method for sorting dictionaries is to get a dictionary view, sort it, and then cast the resulting list back into a dictionary. So you effectively go from a dictionary to a list and back into a dictionary. Depending on your use case, you may not need to convert the list back into a dictionary.

Note: Sorted dictionaries aren’t a very common pattern. You’ll explore more about that topic later in the tutorial.

With those preliminaries out of the way, you’ll get to sorting dictionaries in the next section.

Sorting Dictionaries in Python

In this section, you’ll be putting together the components of sorting a dictionary so that, in the end, you can master the most common way of sorting a dictionary:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> # Sort by key
>>> dict(sorted(people.items()))
{1: 'Jill', 2: 'Jack', 3: 'Jim', 4: 'Jane'}
>>> # Sort by value
>>> dict(sorted(people.items(), key=lambda item: item[1]))
{2: 'Jack', 4: 'Jane', 1: 'Jill', 3: 'Jim'}

Don’t worry if you don’t understand the snippets above. You’ll review it all step-by-step in the following sections. Along the way, you’ll learn how to use the sorted() function with sort keys, lambda functions, and dictionary constructors.

Using the sorted() Function

The critical function that you’ll use to sort dictionaries is the built-in sorted() function. This function takes an iterable as the main argument, with two optional keyword-only arguments: a key function and a reverse Boolean value.

To illustrate the sorted() function’s behavior in isolation, examine its use on a list of numbers:

Python
>>> numbers = [5, 3, 4, 3, 6, 7, 3, 2, 3, 4, 1]
>>> sorted(numbers)
[1, 2, 3, 3, 3, 3, 4, 4, 5, 6, 7]

As you can see, the sorted() function takes an iterable, sorts comparable elements like numbers in ascending order, and returns a new list. With strings, it sorts them in alphabetical order:

Python
>>> words = ["aa", "ab", "ac", "ba", "cb", "ca"]
>>> sorted(words)
['aa', 'ab', 'ac', 'ba', 'ca', 'cb']

Sorting by numerical or alphabetical precedence is the most common way to sort elements, but maybe you need more control.

Say you want to sort on the second character of each word in the last example. To customize what the sorted() function uses to sort the elements, you can pass in a callback function to the key parameter.

A callback function is a function that’s passed as an argument to another function. For sorted(), you pass it a function that acts as a sort key. The sorted() function will then call back the sort key for every element.

In the following example, the function passed as the key accepts a string and will return the second character of that string:

Python
>>> def select_second_character(word):
...     return word[1]
...
>>> sorted(words, key=select_second_character)
['aa', 'ba', 'ca', 'ab', 'cb', 'ac']

The sorted() function passes every element of the words iterable to the key function and uses the return value for comparison. Using the key means that the sorted() function will compare the second letter instead of comparing the whole string directly.
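
If you’d like to watch the callback happen, one option is to record each call. In this sketch, the seen list is a hypothetical addition for illustration only. It shows that the key function runs once for each element:

```python
words = ["aa", "ab", "ac", "ba", "cb", "ca"]
seen = []

def select_second_character(word):
    seen.append(word)  # Record each call for illustration.
    return word[1]

result = sorted(words, key=select_second_character)

print(len(seen))  # 6
print(result)  # ['aa', 'ba', 'ca', 'ab', 'cb', 'ac']
```

The original words list is left untouched, and the sort key only influences the comparison, not the elements that end up in the result.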

More examples and explanations of the key parameter will come later in the tutorial when you use it to sort dictionaries by values or nested elements.

If you take another look at the results of this last sorting, you may notice the stability of the sorted() function. The three elements, aa, ba, and ca, are equivalent when sorted by their second character. Because they’re equal, the sorted() function conserves their original order. Python guarantees this stability.

Note: Every list also has a .sort() method, which accepts the same key and reverse arguments as the sorted() function. The main difference is that the .sort() method sorts the list in-place and returns None. In contrast, the sorted() function returns a new list, leaving the original list unmodified.

You can also pass reverse=True to the sorting function or method to return the reverse order. Alternatively, you can use the reversed() function to invert the iterable after sorting:

Python
>>> list(reversed([1, 2, 3]))
[3, 2, 1]

If you want to dive deeper into the mechanics of sorting in Python and learn how to sort data types other than dictionaries, then check out the tutorial on how to use sorted() and .sort().
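
The two ways of reversing aren’t quite interchangeable when some elements compare as equal. The sketch below reuses the words example to show the difference. Passing reverse=True preserves stability, while reversed() flips everything, including the elements that tie:

```python
words = ["aa", "ab", "ac", "ba", "cb", "ca"]

def select_second_character(word):
    return word[1]

# reverse=True keeps equal elements in their original order.
with_reverse = sorted(words, key=select_second_character, reverse=True)
print(with_reverse)  # ['ac', 'ab', 'cb', 'aa', 'ba', 'ca']

# reversed() flips the whole result, including runs of equal elements.
flipped = list(reversed(sorted(words, key=select_second_character)))
print(flipped)  # ['ac', 'cb', 'ab', 'ca', 'ba', 'aa']
```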

So, how about dictionaries? You can actually take the dictionary and feed it straight into the sorted() function:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> sorted(people)
[1, 2, 3, 4]

But the default behavior of passing in a dictionary directly to the sorted() function is to take the keys of the dictionary, sort them, and return a list of the keys only. That’s probably not the behavior you had in mind! To preserve all the information in a dictionary, you’ll need to be acquainted with dictionary views.

Getting Keys, Values, or Both From a Dictionary

If you want to conserve all the information from a dictionary when sorting it, the typical first step is to call the .items() method on the dictionary. Calling .items() on the dictionary will provide an iterable of tuples representing the key-value pairs:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> people.items()
dict_items([(3, 'Jim'), (2, 'Jack'), (4, 'Jane'), (1, 'Jill')])

The .items() method returns a read-only dictionary view object, which serves as a window into the dictionary. This view is not a copy or a list. It’s a read-only iterable that’s actually linked to the dictionary it was generated from:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> view = people.items()
>>> people[2] = "Elvis"
>>> view
dict_items([(3, 'Jim'), (2, 'Elvis'), (4, 'Jane'), (1, 'Jill')])

You’ll notice that any updates to the dictionary also get reflected in the view because they’re linked. A view represents a lightweight way to iterate over a dictionary without generating a list first.

Note: You can use .values() to get a view of the values only and .keys() to get one with only the keys.

Crucially, you can use the sorted() function with dictionary views. You call the .items() method and use the result as an argument to the sorted() function. Using .items() keeps all the information from the dictionary:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> sorted(people.items())
[(1, 'Jill'), (2, 'Jack'), (3, 'Jim'), (4, 'Jane')]

This example results in a sorted list of tuples, with each tuple representing a key-value pair of the dictionary.

If you want to end up with a dictionary sorted by values, then you’ve still got two issues. The default behavior still seems to sort by key and not value. The other issue is that you end up with a list of tuples, not a dictionary. First, you’ll figure out how to sort by value.

Understanding How Python Sorts Tuples

When using the .items() method on a dictionary and feeding it into the sorted() function, you’re passing in an iterable of tuples, and the sorted() function compares the entire tuple directly.

When comparing tuples, Python behaves a lot like it’s sorting strings alphabetically. That is, it sorts them lexicographically.

Lexicographical sorting means that if you have two tuples, (1, 2, 4) and (1, 2, 3), then you start by comparing the first item of each tuple. The first item is 1 in both cases, which is equal. The second element, 2, is also identical in both cases. The third elements are 4 and 3, respectively. Since 3 is less than 4, you’ve found which tuple is less than the other.

So, to order the tuples (1, 2, 4) and (1, 2, 3) lexicographically, you would switch their order to (1, 2, 3) and (1, 2, 4).

Because of Python’s lexicographic sorting behavior for tuples, using the .items() method with the sorted() function will always sort by keys unless you use something extra.
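
You can check these comparison rules directly in code. The following sketch is only an illustration, and the pairs list is a hypothetical example of what happens when the first elements tie, which can’t happen with dictionary keys:

```python
# Tuples are compared element by element, from left to right.
print((1, 2, 3) < (1, 2, 4))  # True
print((1, "Jill") < (2, "Jack"))  # True, because 1 < 2 settles it.

# Dictionary keys are unique, so the first element is enough to
# decide the order when sorting a dictionary's items.
people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
print(sorted(people.items()))
# [(1, 'Jill'), (2, 'Jack'), (3, 'Jim'), (4, 'Jane')]

# In a plain list of tuples, a tie on the first element falls through
# to the second element.
pairs = [(1, "Jill"), (1, "Jack"), (0, "Zed")]
print(sorted(pairs))  # [(0, 'Zed'), (1, 'Jack'), (1, 'Jill')]
```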

Using the key Parameter and Lambda Functions

For example, if you want to sort by value, then you have to specify a sort key. A sort key is a way to extract a comparable value. For instance, if you have a pile of books, then you might use the author surname as the sort key. With the sorted() function, you can specify a sort key by passing a callback function as a key argument.

Note: The key argument has nothing to do with a dictionary key!

To see a sort key in action, take a look at this example, which is similar to the one you saw in the section introducing the sorted() function:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> # Sort key
>>> def value_getter(item):
...     return item[1]
...
>>> sorted(people.items(), key=value_getter)
[(2, 'Jack'), (4, 'Jane'), (1, 'Jill'), (3, 'Jim')]
>>> # Or with a lambda function
>>> sorted(people.items(), key=lambda item: item[1])
[(2, 'Jack'), (4, 'Jane'), (1, 'Jill'), (3, 'Jim')]

In this example, you try out two ways of passing a key parameter. The key parameter accepts a callback function. The function can be a normal function identifier or a lambda function. The lambda function in the example is the exact equivalent of the value_getter() function.

Note: Lambda functions are also known as anonymous functions because they don’t have a name. Lambda functions are standard for functions that you’re only using once in your code.

Lambda functions confer no benefit apart from making things more compact, eliminating the need to define a function separately. They keep things nicely contained on the same line:

Python
# With a normal function
def value_getter(item):
    return item[1]

sorted(people.items(), key=value_getter)

# With a lambda function
sorted(people.items(), key=lambda item: item[1])

For basic getter functions like the one in the example, lambdas can come in handy. But lambdas can make your code less readable for anything more complex, so use them with care.

Lambdas can also only ever contain exactly one expression, making any multiline statements like if statements or for loops off limits. You can work around this by using comprehensions and if expressions, for example, but those can make for long and cryptic one-liners.

The key callback function will receive each element of the iterable that it’s sorting. The callback function’s job is to return something that can be compared, such as a number or a string. In this example, you named the function value_getter() because all it does is get the value from a key-value tuple.

Since the default behavior of sorted() with tuples is to sort lexicographically, the key parameter allows you to select a value from the element that it’s comparing.

In the next section, you’ll take sort keys a bit further and use them to sort by a nested value.

Selecting a Nested Value With a Sort Key

You can also go further and use a sort key to select nested values that may or may not be present and return a default value if they’re not present:

Python
data = {
    193: {"name": "John", "age": 30, "skills": {"python": 8, "js": 7}},
    209: {"name": "Bill", "age": 15, "skills": {"python": 6}},
    746: {"name": "Jane", "age": 58, "skills": {"js": 2, "python": 5}},
    109: {"name": "Jill", "age": 83, "skills": {"java": 10}},
    984: {"name": "Jack", "age": 28, "skills": {"c": 8, "assembly": 7}},
    765: {"name": "Penelope", "age": 76, "skills": {"python": 8, "go": 5}},
    598: {"name": "Sylvia", "age": 62, "skills": {"bash": 8, "java": 7}},
    483: {"name": "Anna", "age": 24, "skills": {"js": 10}},
    277: {"name": "Beatriz", "age": 26, "skills": {"python": 2, "js": 4}},
}

def get_relevant_skills(item):
    """Get the sum of Python and JavaScript skill"""
    skills = item[1]["skills"]

    # Return default value that is equivalent to no skill
    return skills.get("python", 0) + skills.get("js", 0)

print(sorted(data.items(), key=get_relevant_skills, reverse=True))

In this example, you have a dictionary with numeric keys and a nested dictionary as a value. You want to sort by the combined Python and JavaScript skills, attributes found in the skills subdictionary.

Part of what makes sorting by the combined skill tricky is that the python and js keys aren’t present in the skills dictionary for all people. The skills dictionary is also nested. You use .get() to read the keys and provide 0 as a default value that’s used for missing skills.

You’ve also used the reverse argument because you want the people with the highest combined Python and JavaScript skills to appear first.

Note: You didn’t use a lambda function in this example. While it’s possible, it would make for a long line of potentially cryptic code:

Python
sorted(
    data.items(),
    key=lambda item: (
        item[1]["skills"].get("python", 0)
        + item[1]["skills"].get("js", 0)
    ),
    reverse=True,
)

A lambda function can only contain one expression, so you repeat the full look-up in the nested skills subdictionary. This inflates the line length considerably.

The lambda function also requires multiple chained square bracket ([]) indices, making it harder to read than necessary. Using a lambda in this example only saves a few lines of code, and the performance difference is negligible. So, in these cases, it usually makes more sense to use a normal function.
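
The same default-value idea helps when the values themselves can’t be compared with each other. As a minimal sketch, assume a hypothetical scores dictionary in which one score is missing and stored as None. Treating None as 0 is an assumption made for this example, so pick whatever default makes sense for your data:

```python
scores = {"Jim": 8, "Jack": None, "Jane": 5, "Jill": 10}

# Comparing None with an integer isn't supported, so this raises TypeError.
try:
    sorted(scores.items(), key=lambda item: item[1])
except TypeError as error:
    print(f"TypeError: {error}")

def score_or_default(item):
    """Treat a missing score as 0 (an assumption for this example)."""
    score = item[1]
    return 0 if score is None else score

by_score = sorted(scores.items(), key=score_or_default)
print(by_score)
# [('Jack', None), ('Jane', 5), ('Jim', 8), ('Jill', 10)]
```

The sort key only affects the comparison. The None value is still there in the result.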

You’ve successfully used a callback function as a sort key to sort a dictionary view by value. That was the hard part. Now there’s only one issue left to solve: converting the list that sorted() yields back into a dictionary.

Converting Back to a Dictionary

The only issue left to address with the default behavior of sorted() is that it returns a list, not a dictionary. There are a few ways to convert a list of tuples back into a dictionary.

You can iterate over the result with a for loop and populate a dictionary on each iteration:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> sorted_people = sorted(people.items(), key=lambda item: item[1])
>>> sorted_people_dict = {}
>>> for key, value in sorted_people:
...     sorted_people_dict[key] = value
...
>>> sorted_people_dict
{2: 'Jack', 4: 'Jane', 1: 'Jill', 3: 'Jim'}

This method gives you absolute control and flexibility in deciding how you want to construct your dictionary. This method can be quite lengthy to type out, though. If you don’t have any special requirements for constructing your dictionary, then you may want to go for a dictionary constructor instead:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> sorted_people = sorted(people.items(), key=lambda item: item[1])
>>> dict(sorted_people)
{2: 'Jack', 4: 'Jane', 1: 'Jill', 3: 'Jim'}

That’s nice and compact! You could also use a dictionary comprehension, but that only makes sense if you want to change the shape of the dictionary or swap the keys and values, for example. In the following comprehension, you swap the keys and values:

Python
>>> {
...     value: key
...     for key, value in sorted(people.items(), key=lambda item: item[1])
... }
{'Jack': 2, 'Jane': 4, 'Jill': 1, 'Jim': 3}

Depending on how familiar you or your team are with comprehensions, this may be less readable than just using a normal for loop.

Congratulations, you’ve got your sorted dictionary! You can now sort it by any criteria that you’d like.
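
To tie the pieces together, here’s one way you might wrap them in a reusable function. The sort_dict() helper and its by parameter are hypothetical names made up for this sketch. They aren’t part of the standard library:

```python
def sort_dict(dictionary, by="key", reverse=False):
    """Return a new dictionary sorted by key or by value.

    Hypothetical helper for illustration, not a standard library function.
    """
    if by not in ("key", "value"):
        raise ValueError("by must be 'key' or 'value'")
    index = 0 if by == "key" else 1
    pairs = sorted(
        dictionary.items(), key=lambda item: item[index], reverse=reverse
    )
    return dict(pairs)

people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}

print(sort_dict(people))
# {1: 'Jill', 2: 'Jack', 3: 'Jim', 4: 'Jane'}

print(sort_dict(people, by="value", reverse=True))
# {3: 'Jim', 1: 'Jill', 4: 'Jane', 2: 'Jack'}
```

Because the function returns a new dictionary, the original people dictionary keeps its insertion order.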

Now that you can sort your dictionary, you might be interested in knowing if there are any performance implications to using a sorted dictionary, or whether there are alternative data structures for key-value data.

Considering Strategic and Performance Issues

In this section, you’ll be taking a quick peek at some performance tweaks, strategic considerations, and questions to ask yourself about how you’ll use your key-value data.

Note: If you decide to go for an ordered collection, check out the Sorted Containers package, which includes a SortedDict.

You’ll be leveraging the timeit module to get some metrics to work with. It’s important to bear in mind that to make any solid conclusions about performance, you need to test on a variety of hardware, and with a variety of sample types and sizes.

Finally, note that you won’t be going into detail about how to use timeit. For that, check out the tutorial on Python timers. You’ll have some examples to play with, though.

Using Special Getter Functions to Increase Performance and Readability

You may have noticed that most of the sort key functions that you’ve used so far aren’t doing very much. All the function does is get a value from a tuple. Making a getter function is such a common pattern that Python’s standard library provides a way to create getter functions that retrieve values more quickly than regular functions.

The itemgetter() function can produce highly efficient versions of getter functions.

You pass itemgetter() an argument, which is typically the key or index position that you want to select. The itemgetter() function will then return a getter object that you call like a function.

That’s right, it’s a function that returns a function. Using the itemgetter() function is another example of working with higher-order functions.

The getter object from itemgetter() will call the .__getitem__() method on the item that’s passed to it. When something makes a call to .__getitem__(), it needs to pass in the key or index of what to get. The argument that’s used for .__getitem__() is the same argument that you passed to itemgetter():

Python
>>> item = ("name", "Guido")
>>> from operator import itemgetter
>>> getter = itemgetter(0)
>>> getter(item)
'name'
>>> getter = itemgetter(1)
>>> getter(item)
'Guido'

In the example, you start off with a tuple, similar to one that you might get as part of a dictionary view.

You make the first getter by passing 0 as an argument to itemgetter(). When the resultant getter receives the tuple, it returns the first item in the tuple, which is the value at index 0. If you call itemgetter() with an argument of 1, then it gets the value at index position 1.
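
Because the getter relies on square-bracket lookup, it isn’t limited to tuples and integer indices. The short sketch below shows a few variations. The person dictionary is a made-up example:

```python
from operator import itemgetter

item = ("name", "Guido")

# itemgetter(1) looks up index 1, just like item[1].
print(itemgetter(1)(item) == item[1])  # True

# The same getter works on any object that supports square brackets,
# including dictionaries, where the argument is a key.
person = {"name": "Guido", "language": "Python"}
print(itemgetter("name")(person))  # Guido

# With several arguments, the getter returns a tuple of the values.
print(itemgetter(1, 0)(item))  # ('Guido', 'name')
```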

You can use this itemgetter as a key for the sorted() function:

Python
>>> from operator import itemgetter
>>> fruit_inventory = [
...     ("banana", 5), ("orange", 15), ("apple", 3), ("kiwi", 0)
... ]
>>> # Sort by key
>>> sorted(fruit_inventory, key=itemgetter(0))
[('apple', 3), ('banana', 5), ('kiwi', 0), ('orange', 15)]
>>> # Sort by value
>>> sorted(fruit_inventory, key=itemgetter(1))
[('kiwi', 0), ('apple', 3), ('banana', 5), ('orange', 15)]
>>> sorted(fruit_inventory, key=itemgetter(2))
Traceback (most recent call last):
  File "<input>", line 1, in <module>
    sorted(fruit_inventory, key=itemgetter(2))
IndexError: tuple index out of range

In this example, you start by using itemgetter() with 0 as an argument. Since it’s operating on each tuple from the fruit_inventory variable, it gets the first element from each tuple. Then the example demonstrates initializing an itemgetter with 1 as an argument, which selects the second item in the tuple.

Finally, the example shows what would happen if you used itemgetter() with 2 as an argument. Since these tuples only have two index positions, trying to get the third element, with index 2, results in an IndexError.

You can use the function produced by itemgetter() in place of the getter functions that you’ve been using up until now:

Python
>>> people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
>>> from operator import itemgetter
>>> sorted(people.items(), key=itemgetter(1))
[(2, 'Jack'), (4, 'Jane'), (1, 'Jill'), (3, 'Jim')]

The itemgetter() function produces a function that has exactly the same effect as the value_getter() function from previous sections. The main reason you’d want to use the function from itemgetter() is because it’s more efficient. In the next section, you’ll start to put some numbers on just how much more efficient it is.

Measuring Performance When Using itemgetter()

So, you end up with a function that behaves like the original value_getter() from the previous sections, except that the version returned from itemgetter() is more efficient. You can use the timeit module to compare their performance:

Python compare_lambda_vs_getter.py
from timeit import timeit

dict_to_order = {
    1: "requests",
    2: "pip",
    3: "jinja",
    4: "setuptools",
    5: "pandas",
    6: "numpy",
    7: "black",
    8: "pillow",
    9: "pyparsing",
    10: "boto3",
    11: "botocore",
    12: "urllib3",
    13: "s3transfer",
    14: "six",
    15: "python-dateutil",
    16: "pyyaml",
    17: "idna",
    18: "certifi",
    19: "typing-extensions",
    20: "charset-normalizer",
    21: "awscli",
    22: "wheel",
    23: "rsa",
}

sorted_with_lambda = "sorted(dict_to_order.items(), key=lambda item: item[1])"
sorted_with_itemgetter = "sorted(dict_to_order.items(), key=itemgetter(1))"

sorted_with_lambda_time = timeit(stmt=sorted_with_lambda, globals=globals())
sorted_with_itemgetter_time = timeit(
    stmt=sorted_with_itemgetter,
    setup="from operator import itemgetter",
    globals=globals(),
)

print(
    f"""\
{sorted_with_lambda_time=:.2f} seconds
{sorted_with_itemgetter_time=:.2f} seconds
itemgetter is {(sorted_with_lambda_time / sorted_with_itemgetter_time):.2f} times faster
"""
)

This code uses the timeit module to compare the sorting processes of the function from itemgetter() and a lambda function.

Running this script from the shell should give you similar results to what’s below:

Shell
$ python compare_lambda_vs_getter.py
sorted_with_lambda_time=1.81 seconds
sorted_with_itemgetter_time=1.29 seconds
itemgetter is 1.41 times faster

Running about 1.4 times faster, which trims roughly 29 percent off the run time in this test, is significant!

Bear in mind that when timing code execution, times can vary significantly between systems. That said, in this case, the ratio should be relatively stable across systems.

From the results of this test, you can see that using itemgetter() is preferable from a performance standpoint. Plus, it’s part of the Python standard library, so there’s no cost to using it.

Note: The difference between using a lambda and a normal function as the sort key is negligible in this test.
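
If you’d like to check these comparisons on your own machine, a quicker variation is sketched below. It’s an assumption-laden sketch rather than a rigorous benchmark. The best_time() helper, the namespace dictionary, and the choice of five repeats of 10,000 runs are all picked arbitrarily for illustration. Taking the minimum of several repeats helps reduce noise from other processes:

```python
from operator import itemgetter
from timeit import repeat

people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}

def value_getter(item):
    return item[1]

def best_time(stmt, namespace):
    """Return the fastest of five runs of 10,000 executions each."""
    return min(repeat(stmt, number=10_000, repeat=5, globals=namespace))

# Names that the timed statements are allowed to see.
namespace = {
    "people": people,
    "value_getter": value_getter,
    "itemgetter": itemgetter,
}

timings = {
    "lambda": best_time(
        "sorted(people.items(), key=lambda item: item[1])", namespace
    ),
    "function": best_time(
        "sorted(people.items(), key=value_getter)", namespace
    ),
    "itemgetter": best_time(
        "sorted(people.items(), key=itemgetter(1))", namespace
    ),
}

for label, seconds in timings.items():
    print(f"{label:<10} {seconds:.4f} seconds")
```

The exact numbers will differ from run to run and from system to system, so treat them as a rough guide only.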

Do you want to compare the performance of some operations that you haven’t covered here? Be sure to share the results by posting them in the comments!

Now you can squeeze a bit more performance out of your dictionary sorting, but it’s worth taking a step back and considering whether using a sorted dictionary as your preferred data structure is the best choice. A sorted dictionary isn’t a very common pattern, after all.

Coming up, you’ll be asking yourself some questions about what you want to do with your sorted dictionary and whether it’s the best data structure for your use case.

Judging Whether You Want to Use a Sorted Dictionary

If you’re considering making a sorted key-value data structure, then there are a few things you might want to take into consideration.

If you’re going to be adding data to a dictionary, and you want it to stay sorted, then you might be better off using a structure like a list of tuples or a list of dictionaries:

Python
# Dictionary
people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}

# List of tuples
people = [
    (3, "Jim"),
    (2, "Jack"),
    (4, "Jane"),
    (1, "Jill"),
]

# List of dictionaries
people = [
    {"id": 3, "name": "Jim"},
    {"id": 2, "name": "Jack"},
    {"id": 4, "name": "Jane"},
    {"id": 1, "name": "Jill"},
]

A list of dictionaries is the most widespread pattern because of its cross-language compatibility, known as language interoperability.

Language interoperability is especially relevant if you create an HTTP REST API, for instance. Making your data available over the Internet will likely mean serializing it in JSON.

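If someone using JavaScript were to consume JSON data from a REST API, then the equivalent data structure would be an object. The kicker is that JavaScript objects don’t reliably keep the order that you serialized. Integer-like keys, for example, are iterated in ascending numeric order, so your ordering could end up scrambled!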

This scrambling behavior would be true for many languages, and objects are even defined in the JSON specification as an unordered data structure. So, if you took care to order your dictionary before serializing to JSON, it wouldn’t matter by the time it got into most other environments.
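
To see what that means in practice, you can serialize the same sorted data both ways with the standard json module. This is just an illustrative sketch, and the people_list name is made up for the example:

```python
import json

people = {3: "Jim", 2: "Jack", 4: "Jane", 1: "Jill"}
sorted_people = dict(sorted(people.items(), key=lambda item: item[1]))

# As a JSON object, the data depends on the consumer keeping key order.
# Note that the integer keys also become strings.
as_object = json.dumps(sorted_people)
print(as_object)
# {"2": "Jack", "4": "Jane", "1": "Jill", "3": "Jim"}

# As a JSON array of objects, the order is part of the data itself.
people_list = [
    {"id": key, "name": name} for key, name in sorted_people.items()
]
as_array = json.dumps(people_list)
print(as_array)

# A round trip through JSON gives back an identical list.
print(json.loads(as_array) == people_list)  # True
```

An array is an ordered structure in JSON, so a consumer in any language can rely on the sequence of the records.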

Note: Signposting an ordered sequence of key-value pairs may not only be relevant for serializing Python dictionaries into JSON. Imagine you have people on your team who are used to other languages. An ordered dictionary might be a foreign concept to them, so you may need to be explicit about the fact that you’ve created an ordered data structure.

One way to be explicit about having an ordered dictionary in Python is to use the aptly named OrderedDict.

Another option is to simply not worry about ordering the data if you don’t need to. Including id, priority, or other equivalent attributes for each object can be enough to express order. If the ordering gets mixed up for any reason, then there’ll always be an unambiguous way to sort it:

Python
people = {
    3: {"priority": 2, "name": "Jim"},
    2: {"priority": 4, "name": "Jack"},
    4: {"priority": 1, "name": "Jane"},
    1: {"priority": 2, "name": "Jill"},
}

With a priority attribute, for instance, it’s clear that Jane should be first in line. Being clear about your intended ordering is nicely in agreement with the old Python adage of explicit is better than implicit, from the Zen of Python.
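
Restoring the order from such an attribute could look like the sketch below. The function names are hypothetical. Since Jim and Jill share the same priority, the second sort key also uses the dictionary key as a tiebreaker, which is an extra assumption made here so that the result doesn’t depend on the order the items arrive in:

```python
people = {
    3: {"priority": 2, "name": "Jim"},
    2: {"priority": 4, "name": "Jack"},
    4: {"priority": 1, "name": "Jane"},
    1: {"priority": 2, "name": "Jill"},
}

def get_priority(item):
    return item[1]["priority"]

def get_priority_then_id(item):
    # Break ties on the dictionary key.
    return item[1]["priority"], item[0]

def names(pairs):
    return [person["name"] for _, person in pairs]

# Ties keep their existing order because sorting is stable.
print(names(sorted(people.items(), key=get_priority)))
# ['Jane', 'Jim', 'Jill', 'Jack']

# With the tiebreaker, the lower id wins among equal priorities.
print(names(sorted(people.items(), key=get_priority_then_id)))
# ['Jane', 'Jill', 'Jim', 'Jack']
```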

What are the performance trade-offs with using a list of dictionaries versus a dictionary of dictionaries, though? In the next section, you’ll start to get some data on just that very question.

Comparing the Performance of Different Data Structures

If performance is a consideration—maybe you’ll be working with large datasets, for example—then you should carefully consider what you’ll be doing with the dictionary.

The two main questions you’ll seek to answer in the next few sections are:

  1. Will you be sorting once and then making lots of lookups?
  2. Will you be sorting many times and making very few lookups?

Once you’ve decided what usage patterns you’ll be subjecting your data structure to, then you can use the timeit module to test the performance. These measurements can vary a lot with the exact shape and size of the data being tested.
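
Before timing anything, it can help to picture what a lookup looks like in each structure. The sketch below uses hypothetical miniature data, people_by_id and people_list, to contrast the two. A dictionary finds a record directly by its key, while a list has to be scanned until a matching record turns up:

```python
# Hypothetical miniature versions of the two structures.
people_by_id = {
    1: {"name": "Jill"},
    2: {"name": "Jack"},
    3: {"name": "Jim"},
}
people_list = [
    {"id": 1, "name": "Jill"},
    {"id": 2, "name": "Jack"},
    {"id": 3, "name": "Jim"},
]

# Dictionary: a single hash lookup by key.
print(people_by_id[3]["name"])  # Jim

# List: scan the items until one matches.
match = next(person for person in people_list if person["id"] == 3)
print(match["name"])  # Jim
```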

In this example, you’ll be pitting a dictionary of dictionaries against a list of dictionaries to see how they differ in terms of performance. You’ll be timing sorting operations and lookup operations with the following sample data:

Python samples.py
dictionary_of_dictionaries = {
    1: {"first_name": "Dorthea", "last_name": "Emmanuele", "age": 29},
    2: {"first_name": "Evelina", "last_name": "Ferras", "age": 91},
    3: {"first_name": "Frederica", "last_name": "Livesay", "age": 99},
    4: {"first_name": "Murray", "last_name": "Linning", "age": 36},
    5: {"first_name": "Annette", "last_name": "Garioch", "age": 93},
    6: {"first_name": "Rozamond", "last_name": "Todd", "age": 36},
    7: {"first_name": "Tiffi", "last_name": "Varian", "age": 28},
    8: {"first_name": "Noland", "last_name": "Cowterd", "age": 51},
    9: {"first_name": "Dyana", "last_name": "Fallows", "age": 100},
    10: {"first_name": "Diahann", "last_name": "Cutchey", "age": 44},
    11: {"first_name": "Georgianne", "last_name": "Steinor", "age": 32},
    12: {"first_name": "Sabina", "last_name": "Lourens", "age": 31},
    13: {"first_name": "Lynde", "last_name": "Colbeck", "age": 35},
    14: {"first_name": "Abdul", "last_name": "Crisall", "age": 84},
    15: {"first_name": "Quintus", "last_name": "Brando", "age": 95},
    16: {"first_name": "Rowena", "last_name": "Geraud", "age": 21},
    17: {"first_name": "Maurice", "last_name": "MacAindreis", "age": 83},
    18: {"first_name": "Pall", "last_name": "O'Cullinane", "age": 79},
    19: {"first_name": "Kermie", "last_name": "Willshere", "age": 20},
    20: {"first_name": "Holli", "last_name": "Tattoo", "age": 88},
}

list_of_dictionaries = [
    {"id": 1, "first_name": "Dorthea", "last_name": "Emmanuele", "age": 29},
    {"id": 2, "first_name": "Evelina", "last_name": "Ferras", "age": 91},
    {"id": 3, "first_name": "Frederica", "last_name": "Livesay", "age": 99},
    {"id": 4, "first_name": "Murray", "last_name": "Linning", "age": 36},
    {"id": 5, "first_name": "Annette", "last_name": "Garioch", "age": 93},
    {"id": 6, "first_name": "Rozamond", "last_name": "Todd", "age": 36},
    {"id": 7, "first_name": "Tiffi", "last_name": "Varian", "age": 28},
    {"id": 8, "first_name": "Noland", "last_name": "Cowterd", "age": 51},
    {"id": 9, "first_name": "Dyana", "last_name": "Fallows", "age": 100},
    {"id": 10, "first_name": "Diahann", "last_name": "Cutchey", "age": 44},
    {"id": 11, "first_name": "Georgianne", "last_name": "Steinor", "age": 32},
    {"id": 12, "first_name": "Sabina", "last_name": "Lourens", "age": 31},
    {"id": 13, "first_name": "Lynde", "last_name": "Colbeck", "age": 35},
    {"id": 14, "first_name": "Abdul", "last_name": "Crisall", "age": 84},
    {"id": 15, "first_name": "Quintus", "last_name": "Brando", "age": 95},
    {"id": 16, "first_name": "Rowena", "last_name": "Geraud", "age": 21},
    {"id": 17, "first_name": "Maurice", "last_name": "MacAindreis", "age": 83},
    {"id": 18, "first_name": "Pall", "last_name": "O'Cullinane", "age": 79},
    {"id": 19, "first_name": "Kermie", "last_name": "Willshere", "age": 20},
    {"id": 20, "first_name": "Holli", "last_name": "Tattoo", "age": 88},
]

Each data structure has the same information, except one is structured as a dictionary of dictionaries, and the other is a list of dictionaries. First up, you’ll be getting some metrics on the performance of sorting these two data structures.

Comparing the Performance of Sorting

In the following code, you’ll be using timeit to compare the time it takes to sort the two data structures by the age attribute:

Python compare_sorting_dict_vs_list.py
from timeit import timeit

from samples import dictionary_of_dictionaries, list_of_dictionaries

sorting_list = "sorted(list_of_dictionaries, key=lambda item: item['age'])"
sorting_dict = """
dict(
    sorted(
        dictionary_of_dictionaries.items(), key=lambda item: item[1]['age']
    )
)
"""

sorting_list_time = timeit(stmt=sorting_list, globals=globals())
sorting_dict_time = timeit(stmt=sorting_dict, globals=globals())

print(
    f"""\
{sorting_list_time=:.2f} seconds
{sorting_dict_time=:.2f} seconds
list is {(sorting_dict_time / sorting_list_time):.2f} times faster
"""
)

This code imports the sample data structures for sorting on the age attribute. It may seem like you aren’t using the imports from samples, but it’s necessary for these samples to be in the global namespace so that the timeit context has access to them.
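To see why the namespace matters, here is a minimal sketch that times a statement referring to a name defined outside the timed string. The `people` data, the `namespace` dictionary, and the small `number` values are assumptions made for illustration. At module level, passing `globals=globals()` as the tutorial script does plays the same role as the explicit dictionary used here.

```python
from timeit import timeit

# Hypothetical sample data standing in for the imported sample structures.
people = [{"id": 1, "age": 29}, {"id": 2, "age": 91}, {"id": 3, "age": 20}]

stmt = "sorted(people, key=lambda item: item['age'])"

# The globals parameter supplies the namespace the statement runs in.
# At module level, globals=globals() would expose `people` the same way.
namespace = {"people": people}
elapsed = timeit(stmt=stmt, globals=namespace, number=1_000)
print(f"{elapsed=:.4f} seconds")

# Without a namespace, the statement can't see `people` and fails.
try:
    timeit(stmt=stmt, number=1)
except NameError as error:
    print(f"NameError: {error}")
    missing_name_raised = True
else:
    missing_name_raised = False
```

The second call fails because the statement is compiled and run inside a namespace that belongs to timeit, where `people` isn't defined.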

Running the code for this test on the command line should provide you with some interesting results:

Shell
$ python compare_sorting_dict_vs_list.py
sorting_list_time=1.15 seconds
sorting_dict_time=2.26 seconds
list is 1.95 times faster

Sorting a list can be almost twice as fast as the process required to sort a dictionary view and then create a new sorted dictionary. So, if you plan on sorting your data very regularly, then a list of tuples might be better than a dictionary for you.

Note: Not many solid conclusions can be drawn from a single dataset like this. Additionally, results can vary wildly with differently sized or shaped data.

These examples are a way for you to dip your toes into the timeit module and start to see how and why you might use it. This will give you some of the tools necessary to benchmark your data structures, to help you decide which data structure to settle on for your key-value pairs.

If you need the extra performance, then go ahead and time your specific data structures. That said, beware of premature optimization!

One of the main overheads when sorting a dictionary, as opposed to a list, is reconstructing the dictionary after sorting it. If you were to take out the outer dict() constructor, then you’d significantly cut the execution time.
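One way to check the cost of rebuilding on your own machine is to time the sort with and without the dict() call. The sketch below is illustrative only: the three-record `people` dictionary, the function names, and the `number` value are assumptions, and the timings you get will depend on your system and data.

```python
from timeit import timeit

# Hypothetical data shaped like the tutorial's dictionary of dictionaries.
people = {
    1: {"first_name": "Dorthea", "age": 29},
    2: {"first_name": "Evelina", "age": 91},
    3: {"first_name": "Kermie", "age": 20},
}


def sort_only():
    """Return a list of (key, value) tuples sorted by age."""
    return sorted(people.items(), key=lambda item: item[1]["age"])


def sort_and_rebuild():
    """Sort by age, then rebuild a dictionary from the sorted tuples."""
    return dict(sort_only())


# timeit also accepts a callable instead of a statement string.
sort_only_time = timeit(sort_only, number=100_000)
sort_and_rebuild_time = timeit(sort_and_rebuild, number=100_000)

print(f"{sort_only_time=:.3f} seconds")
print(f"{sort_and_rebuild_time=:.3f} seconds")
```

If a list of tuples is enough for your use case, then `sort_only()` shows the shape of result you'd work with instead of a dictionary.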

In the next section, you’ll be looking at the time it takes to look up values in a dictionary of dictionaries versus in a list of dictionaries.

Comparing the Performance of Lookups

However, if you plan to use the dictionary to sort your data once and use that dictionary mainly for lookups, then a dictionary will definitely make more sense than a list:

Python compare_lookup_dict_vs_list.py
from timeit import timeit

from samples import dictionary_of_dictionaries, list_of_dictionaries

lookups = [15, 18, 19, 16, 6, 12, 5, 3, 9, 20, 2, 10, 13, 17, 4, 14, 11, 7, 8]

list_setup = """
def get_key_from_list(key):
    for item in list_of_dictionaries:
        if item["id"] == key:
            return item
"""

lookup_list = """
for key in lookups:
    get_key_from_list(key)
"""

lookup_dict = """
for key in lookups:
    dictionary_of_dictionaries[key]
"""

lookup_list_time = timeit(stmt=lookup_list, setup=list_setup, globals=globals())
lookup_dict_time = timeit(stmt=lookup_dict, globals=globals())

print(
    f"""\
{lookup_list_time=:.2f} seconds
{lookup_dict_time=:.2f} seconds
dict is {(lookup_list_time / lookup_dict_time):.2f} times faster
"""
)

This code makes a series of lookups to both the list and the dictionary. You’ll note that with the list, you have to write a special function to make a lookup. The function to make the list lookup involves going through all the list elements one by one until you find the target element, which isn’t ideal.
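If your data arrives as a list but you mostly need lookups by ID, then one option is to build a dictionary index once and use it afterward. The following is a minimal sketch of that idea, not part of the benchmark script. The `records` list and the `index_by_id` name are hypothetical.

```python
# Hypothetical records shaped like the tutorial's list of dictionaries.
records = [
    {"id": 1, "first_name": "Dorthea", "age": 29},
    {"id": 2, "first_name": "Evelina", "age": 91},
    {"id": 3, "first_name": "Frederica", "age": 99},
]


def get_key_from_list(key):
    """Linear scan: check items one by one until the ID matches."""
    for item in records:
        if item["id"] == key:
            return item
    return None


# One pass over the list builds the index. Later lookups are hash-based.
index_by_id = {item["id"]: item for item in records}

print(get_key_from_list(3)["first_name"])
print(index_by_id[3]["first_name"])
```

Both lookups return the same record object. The difference is that the function walks the list each time, while the index pays that cost once when it's built.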

Running this comparison script from the command line should yield a result showing that dictionary lookups are significantly faster:

Shell
$ python compare_lookup_dict_vs_list.py
lookup_list_time=6.73 seconds
lookup_dict_time=0.38 seconds
dict is 17.83 times faster

Nearly eighteen times faster! That’s a whole bunch. So, you certainly want to weigh the blazing speed of dictionary lookups against the data structure’s slower sorting. Bear in mind that this ratio can vary significantly from system to system, not to mention the variation that might come from differently sized dictionaries or lists.

Dictionary lookups are certainly faster, though, no matter how you slice it. That said, if you’re just doing lookups, then you could just as easily do that with a regular unsorted dictionary. Why would you need a sorted dictionary in that case? Leave your use case in the comments!

Note: You could try and optimize list lookups, for example by implementing a binary search algorithm to cut time off the list lookup. However, any benefit will only become noticeable at substantial list sizes.

With list sizes like the ones tested here, using a binary search with the bisect module is significantly slower than a regular for loop.
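For reference, a binary search lookup with bisect might look something like the sketch below. It assumes the records are already sorted by ID and keeps a parallel list of IDs to search in. The data and the names `records`, `ids`, and `get_key_with_bisect()` are hypothetical.

```python
from bisect import bisect_left

# Hypothetical records, already sorted by "id" as binary search requires.
records = [
    {"id": 2, "first_name": "Evelina"},
    {"id": 5, "first_name": "Annette"},
    {"id": 9, "first_name": "Dyana"},
    {"id": 14, "first_name": "Abdul"},
]

# A parallel list of IDs in the same order as the records.
ids = [record["id"] for record in records]


def get_key_with_bisect(key):
    """Return the record whose ID equals key, or None if it's absent."""
    position = bisect_left(ids, key)
    if position < len(ids) and ids[position] == key:
        return records[position]
    return None


print(get_key_with_bisect(9))
print(get_key_with_bisect(3))
```

The extra check after `bisect_left()` is needed because the function returns an insertion point, not a confirmation that the key exists.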

Now you should have a relatively good idea of some trade-offs between two ways to store your key-value data. The conclusion that you can reach is that, most of the time, if you want a sorted data structure, then you should probably steer clear of the dictionary, mainly for language interoperability reasons.

That said, give Grant Jenks’ aforementioned sorted dictionary a try. It uses some ingenious strategies to get around typical performance drawbacks.

Do you have any interesting or performant implementations of a sorted key-value data structure? Share them in the comments, along with your use cases for a sorted dictionary!

Conclusion

You’ve gone from the most basic way to sort a dictionary to a few advanced alternatives that consider performance in sorting key-value pairs.

In this tutorial, you’ve:

  • Reviewed the sorted() function
  • Discovered dictionary views
  • Understood how dictionaries are cast to lists during sorting
  • Specified sort keys to sort a dictionary by value, key, or nested attribute
  • Used dictionary comprehensions and the dict() constructor to rebuild your dictionaries
  • Considered whether a sorted dictionary is the right data structure for your key-value data

You’re now ready to not only sort dictionaries by any criteria you might think of, but also to judge whether the sorted dictionary is the best choice for you.

Share your sorted dictionary use cases and performance comparisons in the comments below!

Frequently Asked Questions

Now that you have some experience with sorting dictionaries in Python, you can use the questions and answers below to check your understanding and recap what you’ve learned.

These FAQs are related to the most important concepts you’ve covered in this tutorial.

You can sort a dictionary by its keys using the sorted() function with the dictionary’s .items() method, and then convert the result back to a dictionary.

To sort a dictionary by its values, use the sorted() function with the .items() method and specify a key parameter with a lambda function to extract the value.

Yes, you can sort a Python dictionary in descending order by passing reverse=True to the sorted() function.
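The three answers above can be sketched together in a few lines. The `ages` dictionary is hypothetical data chosen for illustration:

```python
from operator import itemgetter

# Hypothetical data for illustration.
ages = {"Murray": 36, "Dorthea": 29, "Kermie": 20}

# Tuples from .items() compare by their first element, the key.
by_key = dict(sorted(ages.items()))

# itemgetter(1) picks the value out of each (key, value) tuple.
by_value = dict(sorted(ages.items(), key=itemgetter(1)))

# A lambda works the same way. reverse=True flips the order.
by_value_descending = dict(
    sorted(ages.items(), key=lambda item: item[1], reverse=True)
)

print(by_key)               # {'Dorthea': 29, 'Kermie': 20, 'Murray': 36}
print(by_value)             # {'Kermie': 20, 'Dorthea': 29, 'Murray': 36}
print(by_value_descending)  # {'Murray': 36, 'Dorthea': 29, 'Kermie': 20}
```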

When sorting a dictionary with non-comparable keys or values, you need to provide a custom key function that returns a comparable element for each item.
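As a minimal sketch, suppose some values are None, which can't be compared with integers. The `scores` data and the `score_or_default()` helper are hypothetical, and treating a missing score as 0 is an assumption that may not suit your data:

```python
# Hypothetical data: None can't be ordered against an int with <.
scores = {"Pall": 79, "Holli": None, "Rowena": 21}


def score_or_default(item):
    """Sort key: treat a missing score as 0, an assumed default."""
    _, score = item
    return 0 if score is None else score


# Sorting on the raw values fails because of the None.
try:
    sorted(scores.items(), key=lambda item: item[1])
except TypeError as error:
    print(f"TypeError: {error}")

# The custom key function returns a comparable value for every item.
sorted_scores = dict(sorted(scores.items(), key=score_or_default))
print(sorted_scores)  # {'Holli': None, 'Rowena': 21, 'Pall': 79}
```

The original values are left untouched. Only the sort key substitutes the default.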

No, sorting a dictionary in-place isn’t possible because dictionaries can’t be reordered. Instead, you need to create a new sorted dictionary.


About Ian Currie

Ian is a Python nerd who relies on it for work and much enjoyment.
