Better Everything

Summary

The article discusses the Python defaultdict data structure, explaining its purpose, usage, and advantages over standard dictionaries, particularly in handling missing keys.

Abstract

Python's defaultdict is a subclass of the built-in dict type that provides a default value for missing keys. Unlike regular dictionaries, which raise a KeyError when an undefined key is accessed, defaultdict automatically initializes the key with a default value. This behavior is particularly useful in scenarios where one might otherwise have to check for key existence before proceeding with operations. The article illustrates how to create a defaultdict using either a datatype or a lambda function to specify the default value, and demonstrates its practical applications such as counting occurrences or accumulating values without the need for explicit initialization. The defaultdict can be initialized with various default values, such as an integer for counting, a string for a default message, or a list for collecting items, and it supports all the methods and operations of a regular dictionary. The article also shows how to migrate existing dictionaries into a defaultdict and how to update values for keys that may not exist yet, all without causing a KeyError.

Opinions

  • The author suggests that using a defaultdict is advantageous over a normal dictionary when dealing with missing keys, as it avoids errors and simplifies code.
  • The article conveys that the defaultdict is a powerful tool for programmers who frequently encounter scenarios where they need to manage dynamic key-value pairs without explicit initialization.
  • The use of lambda functions for more complex default values is presented as a flexible feature of defaultdict.
  • The author emphasizes the practicality of defaultdict in real-world applications, such as tracking team budgets or maintaining customer notes, by showcasing examples where it streamlines the code and enhances readability.
  • The article implies that the defaultdict is underutilized, despite its potential to make code more efficient and robust, and encourages readers to consider incorporating it into their Python programming toolkit.

What is a Python defaultdict and when to use it?

Python has a datatype called dictionary (or dict) in which you can map keys to values. But you can also import the similar defaultdict datatype from the collections module of the standard library.

What is a defaultdict in Python and when to use it? Image by catalyststuff on Freepik

A defaultdict is a dictionary that can be given a default value. This means that when you try to work with a key that is not in the dictionary yet, the default value is used.

Missing keys in a dictionary

Suppose we have a normal Python dictionary that maps countries to languages:

language_per_country = {"USA":"English",
                        "Mexico":"Spanish",
                        "France":"French",
                        "UK":"English",
                        "Brazil":"Portuguese"}

Then we can get a country’s language by looking up the country. In a broader sense, we get a value by looking up a key.

print(language_per_country["USA"])
print(language_per_country["Mexico"])

The above code prints:

English
Spanish

But what happens when we try to look up a key that is not in the dictionary?

print(language_per_country["Australia"])

Well, we get an error. More specifically a KeyError, because the key is missing:

Traceback (most recent call last):
  File "C:\Users\BE\test\main.py", line 9, in <module>
    print(language_per_country["Australia"])
          ~~~~~~~~~~~~~~~~~~~~^^^^^^^^^
KeyError: 'Australia'

The advantage of a defaultdict over a normal Python dictionary

It would be nice if, instead of an error occurring, we could just get a default value when a key is missing. That is exactly what a defaultdict does.
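To make the difference concrete, here is a minimal sketch comparing the two approaches. The fallback value "unknown" and the variable names are assumptions made for illustration; how to create a defaultdict is explained in the next section.

```python
from collections import defaultdict

# Plain dict: a missing key has to be handled explicitly on every lookup.
plain = {"USA": "English", "Mexico": "Spanish"}

if "Australia" in plain:
    plain_result = plain["Australia"]
else:
    plain_result = "unknown"  # hypothetical fallback value

# defaultdict: the fallback is configured once, when the dictionary is created.
with_default = defaultdict(lambda: "unknown", plain)
default_result = with_default["Australia"]

print(plain_result)    # unknown
print(default_result)  # unknown
```

With the plain dictionary, the existence check has to be repeated wherever a key might be missing. With the defaultdict, the lookup itself stays a single line.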

How to create a Python defaultdict

To be able to use a defaultdict you have to import it from the collections module. This module is part of Python's standard library, so it comes with Python and doesn't have to be installed. To import it, just add:

from collections import defaultdict

Now you can create a defaultdict.

A defaultdict is initialized with a callable that produces the default value. There are basically 2 common ways to provide one: either a datatype or a lambda function as the argument.

1 — Using a datatype to initialize a defaultdict

By using a datatype when initializing a defaultdict, the default value will be an empty value of that datatype.

Here are some examples:

  • Initializing a defaultdict with int makes 0 the default value. The following code prints 0:

from collections import defaultdict

wins_per_driver_2023 = defaultdict(int)

print(wins_per_driver_2023["Hamilton"])

  • Initializing a defaultdict with str makes the default value an empty string "". The following code prints an empty string:

from collections import defaultdict

f1_champions_per_season = defaultdict(str)

print(f1_champions_per_season[2023])

  • Initializing a defaultdict with list makes the default value an empty list []. The following code prints an empty list:

from collections import defaultdict

wins_list_per_driver_2023 = defaultdict(list)

print(wins_list_per_driver_2023["Hamilton"])
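Why do these datatypes give "empty" defaults? A defaultdict calls whatever it was given, without arguments, each time a key is missing. The short sketch below shows that calling int, str and list with no arguments produces exactly the values listed above. The variable name wins is only illustrative.

```python
from collections import defaultdict

# Calling a type with no arguments yields its "empty" value.
print(int())        # 0
print(repr(str()))  # ''
print(list())       # []

# A defaultdict makes the same no-argument call when a key is missing.
wins = defaultdict(int)
print(wins["Hamilton"] == int())  # True
```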

2 — Using a lambda function to initialize a defaultdict

Apart from empty values you can also use any other value as the default for missing keys in a defaultdict. This can be done with lambda functions.

For instance, to make "English" the default value of a defaultdict you can initialize it with the argument: lambda: "English".

from collections import defaultdict

language_per_country = defaultdict(lambda: "English")

print(language_per_country["Australia"])

The code above prints: English.

Here is an example of making a defaultdict with 1000 as its default value:

from collections import defaultdict

budgets_per_team = defaultdict(lambda: 1000)

print(budgets_per_team[1])

The code above prints: 1000.
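A lambda is just one way to write a function that takes no arguments. As a small sketch, the same default can come from a regular named function. The helper start_budget and the variable names are hypothetical and only serve this example.

```python
from collections import defaultdict

def start_budget():
    """Hypothetical helper: return the default budget."""
    return 1000

# Pass the function itself (no parentheses); defaultdict calls it when needed.
budgets = defaultdict(start_budget)
print(budgets[1])  # 1000

# The function runs once per missing key, so each key gets its own new list.
lists = defaultdict(lambda: [])  # behaves like defaultdict(list)
lists["a"].append(1)
print(lists["b"])  # []
```

A named function can be easier to read than a lambda when the default takes more than a single expression to build.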

How to work with a defaultdict

After initializing a defaultdict with a default value, you can fill it with values like a normal dictionary:

language_per_country["Mexico"] = "Spanish"

If you already have a normal dictionary you can put its keys and values in a defaultdict by looping over its items with the items method:

from collections import defaultdict

language_per_country = defaultdict(lambda: "English")

original_dict = {"USA":"English",
                "Mexico":"Spanish",
                "France":"French",
                "UK":"English",
                "Brazil":"Portuguese"}

for key, value in original_dict.items():
    language_per_country[key] = value
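As an alternative to the loop, the existing dictionary can be passed as a second argument when creating the defaultdict. After the first argument, defaultdict accepts the same arguments as dict. Here is a minimal sketch with a shortened version of the dictionary:

```python
from collections import defaultdict

original_dict = {"USA": "English", "Mexico": "Spanish"}

# First argument: what produces the default. Second argument: initial contents.
language_per_country = defaultdict(lambda: "English", original_dict)

print(language_per_country["Mexico"])     # Spanish
print(language_per_country["Australia"])  # English
```

The defaultdict holds its own copy of the key-value pairs, so original_dict is not changed by lookups on the defaultdict.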

Of course you can look up keys just like with regular dictionaries, as we have already seen. And missing keys won't raise errors:

print(language_per_country['Brazil'])
print(language_per_country['India'])

The above code prints:

Portuguese
English
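One detail worth knowing: looking up a missing key with square brackets also stores that key with the default value. The sketch below illustrates this, and shows that the in operator and the get method do not trigger the default.

```python
from collections import defaultdict

language_per_country = defaultdict(lambda: "English")
language_per_country["Brazil"] = "Portuguese"

print(len(language_per_country))          # 1
print("India" in language_per_country)    # False, a membership test adds nothing
print(language_per_country.get("India"))  # None, get() does not use the default

print(language_per_country["India"])      # English, and "India" is now stored
print(len(language_per_country))          # 2
```

So if you only want to check whether a key exists, use in rather than a square-bracket lookup, otherwise the dictionary grows with every missing key you ask for.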

But you can also easily update values stored at keys.

When you try to update a value at a key that doesn’t exist, the defaultdict will create it before the update is carried out and store the new key-value pair.

Here is an example of giving teams a start budget of 1000. Without explicitly giving teams 2 and 5 a budget of 1000, we can update their budgets starting from the default value of 1000:

from collections import defaultdict

budgets_per_team = defaultdict(lambda: 1000)

budgets_per_team[2] -= 200
budgets_per_team[5] += 500

for team in [1,2,3,4,5]:
    print(budgets_per_team[team])

This code prints:

1000
800
1000
1000
1500
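The same update pattern is what makes counting with defaultdict(int) convenient. Below is a small sketch; the list of race winners is made up for illustration and does not reflect real results.

```python
from collections import defaultdict

# Made-up sample data: the winner of each race.
race_winners = ["Verstappen", "Perez", "Verstappen", "Sainz", "Verstappen"]

wins_per_driver = defaultdict(int)
for driver in race_winners:
    wins_per_driver[driver] += 1  # a driver seen for the first time starts at 0

print(dict(wins_per_driver))
# {'Verstappen': 3, 'Perez': 1, 'Sainz': 1}
```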

Likewise, we can append items to a list at a key that has not yet been entered into a defaultdict, without running into a KeyError:

from collections import defaultdict

wins_per_team = defaultdict(list)

wins_per_team['Red Bull'].append('Bahrain')
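This makes defaultdict(list) a natural fit for grouping items by key. Here is a sketch that groups races by team; the (team, race) pairs are hypothetical sample data.

```python
from collections import defaultdict

# Hypothetical (team, race) pairs, for illustration only.
results = [("Red Bull", "Bahrain"),
           ("Ferrari", "Singapore"),
           ("Red Bull", "Jeddah")]

wins_per_team = defaultdict(list)
for team, race in results:
    wins_per_team[team].append(race)  # the list is created on first use

print(wins_per_team["Red Bull"])  # ['Bahrain', 'Jeddah']
print(wins_per_team["Ferrari"])   # ['Singapore']
```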

You can also work with defaultdicts that have dictionaries as their values:

from collections import defaultdict

notes_per_customer = defaultdict(dict)

notes_per_customer['0001'].update({"delivery":"Monday",
                                   "helpdeskPhone":"12-345-678"})
notes_per_customer['0344'].update({"warning":"Has not paid in 3 months."})
notes_per_customer['0001'].update({"delivery":"Friday"})

print(notes_per_customer)

In the above example we make a defaultdict called notes_per_customer. The default value is an empty dictionary. Without explicitly initializing a key-value pair for customer 0001 we can update its notes-dictionary with the update method.

When a key in a customer's notes-dictionary is used again, the old value is overwritten. That is why the delivery note for customer 0001 ends up as Friday instead of Monday.

The above code example prints:

defaultdict(<class 'dict'>, {'0001': {'delivery': 'Friday', 'helpdeskPhone':
 '12-345-678'}, '0344': {'warning': 'Has not paid in 3 months.'}})
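The printed output starts with defaultdict(<class 'dict'>, ...) because a defaultdict shows its default in its printed form. If you prefer the look of a regular dictionary, one option is to convert it with dict. This is a small sketch; plain_notes is just an illustrative name.

```python
from collections import defaultdict

notes_per_customer = defaultdict(dict)
notes_per_customer["0001"].update({"delivery": "Friday"})

# dict() makes a shallow copy that is a regular dictionary.
plain_notes = dict(notes_per_customer)
print(plain_notes)  # {'0001': {'delivery': 'Friday'}}

# The copy behaves like a normal dict again: missing keys raise a KeyError.
try:
    plain_notes["9999"]
except KeyError:
    print("KeyError for a missing key")
```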

Thank you for reading!
