Does Python still need the map() function?
Having various alternatives, Python’s map() seems to be redundant. So, does Python need it at all?
Don’t worry, this will not be the millionth article on how to use map() in Python. I am not going to tell you that it’s better or worse than a list comprehension or a for loop. I am not going to benchmark it against its corresponding generator or list comprehension. I am not going to claim that using map() will make you look like an advanced Python developer…
You can read about all these things in other articles published on Medium. Even though the built-in map() function is not too popular among Python developers (you will seldom find it in production code), it has gained much popularity among Python authors. Why is that? Maybe because it’s an interesting function, because it resembles functional programming, or because it can be benchmarked against its alternatives, and benchmarking usually draws attention.
Most articles on Python’s map() merely show the how, but not the why: while showing how to use it, they usually fail to explain why one should use it. No wonder, then, that despite its popularity among Python authors, map() seems to be underappreciated among intermediate-level developers.
If you want to learn more about map(), this article is for you. We will discuss the map() function’s place on Python’s code road map, and why it’s worth knowing this function, whether or not you will ever use it.
A few words about map()
map() does something that most Python developers do quite often: it calls a function (actually, a callable) for each element of an iterable.
It takes a callable as an argument, so it is a higher-order function. Since that is a typical feature of functional programming languages, map() conforms to the functional-programming style. For example, you will find a lot of map()’s applications in Lott’s book Functional Python Programming. I’d say, though, that map() has a functional-style API rather than being truly functional. This is because we can use map() with impure functions, that is, functions that have side effects; this is unacceptable in true functional programming.
It’s time to see map() in action.
>>> numbers = [1, 4, 5, 10, 17]
>>> def double(x):
...     return x*2
So, we have a list of numbers, and we have a function that doubles a number. double() works for a single number:
>>> double(10)
20
What will happen if we use double() for numbers?
>>> double(numbers)
[1, 4, 5, 10, 17, 1, 4, 5, 10, 17]
If you know how multiplying a list works in Python, this will not have surprised you. Although this is normal behavior, it is definitely not what we wanted to achieve. Above, double(numbers) applied the function double() to numbers as a whole (as an object). This is not what we want; we want to apply double() to each element of numbers. This makes a big difference, and this is where map() comes in: you can use it when you want to apply a callable to each element of an iterable.
Warning: Some languages use the name map for a hash map; in Python, dictionaries are hash maps. So, when you see the term “map” in another language, first check what it represents. For example, map() in R (from the purrr package) is the counterpart of Python’s map(); but in Go, map is a built-in hash-map type that works similarly to Python’s dict. This was confusing for me when I first started learning Go, but you get used to it after a while.
This is how you should use map():
>>> doubled_numbers = map(double, numbers)
As you see, you provide a callable as the first argument and an iterable as the second argument to map(). It returns a map object (in Python 3; in Python 2 you would get a list):
>>> doubled_numbers #doctest: +ELLIPSIS
<map object at ...>
(Note that I used the #doctest: +ELLIPSIS directive, as this document is covered by doctests. It helped me ensure that all the examples were correct. You can read more about doctesting in the documentation.)
A map object is a lazy iterator that works like a generator. So, even though we passed a list to map(), we did not get a list back but an iterator. Like generators, it is evaluated on demand (lazily). If you want to convert a map object into a list, use the list() function, which will evaluate all the elements:
>>> list(doubled_numbers)
[2, 8, 10, 20, 34]
Or, you can evaluate a map object’s elements in any other way, for example in a for loop. To avoid unpleasant headaches, remember that once such an object has been evaluated, it is exhausted and thus cannot be evaluated again:
>>> list(doubled_numbers)
[]
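The one-pass behavior described above can be sketched as follows. This is only a minimal illustration; the names lazy, first, rest, leftover, and reusable are made up for the example.

```python
numbers = [1, 4, 5, 10, 17]


def double(x):
    return x * 2


# A map object is consumed as you iterate over it.
lazy = map(double, numbers)
first = next(lazy)                # pulls only the first element
rest = [value for value in lazy]  # a loop consumes whatever is left
leftover = list(lazy)             # nothing remains at this point

# If the values are needed more than once, materialize them first.
reusable = list(map(double, numbers))
total = sum(reusable)
largest = max(reusable)

print(first, rest, leftover)
print(total, largest)
```

Converting to a list gives up laziness, so it is a trade-off: it is convenient when the data is small and needed several times, and wasteful when one pass is enough.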
Above, we applied map() to a single iterable, but we can pass as many of them as we want. The function will use them based on their index, that is, first, it will call the callable for the first elements of the iterables (at index 0); then for the second; and so on.
A simple example:
>>> def sum_of_squares(x, y, z):
...     return x**2 + y**2 + z**2
>>> x = range(5)
>>> y = [1, 1, 1, 2, 2]
>>> z = (10, 10, 5, 5, 5)
>>> SoS = map(sum_of_squares, x, y, z)
>>> list(SoS)
[101, 102, 30, 38, 45]
>>> list(map(sum_of_squares, x, x, x))
[0, 3, 12, 27, 48]
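One detail worth knowing when passing several iterables: map() stops as soon as the shortest of them is exhausted. The sketch below illustrates this with made-up inputs (shorter, longer) and a hypothetical helper add(); padding with zeros via itertools.zip_longest is just one possible way to keep all the elements, not the only one.

```python
from itertools import zip_longest


def add(a, b):
    return a + b


shorter = [1, 2, 3]
longer = [10, 20, 30, 40, 50]

# map() stops at the end of the shortest iterable; the extra
# elements of `longer` are silently ignored.
truncated = list(map(add, shorter, longer))

# One possible way to keep every element: pad the shorter input,
# here with zeros, using itertools.zip_longest.
padded = [add(a, b) for a, b in zip_longest(shorter, longer, fillvalue=0)]

print(truncated)
print(padded)
```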
map()'s alternatives
Instead of map(), you can use a generator, for example, through a generator expression:
>>> doubled_numbers_gen = (double(x) for x in numbers)
This provides a lazy iterator, just as map() did. When you need a list, you will do better with the corresponding list comprehension:
>>> doubled_numbers_list = [double(x) for x in numbers]
Which is more readable: the map() version or the generator expression (or a list comprehension)? For me, without a second of hesitation, the generator expression and the list comprehension are clearer, even though I have no problem understanding the map() version. But I know that some people would choose the map() version, especially those who have recently moved to Python from another language that uses a function similar to map().
People often combine map() with lambda functions, which is a good solution when you do not want to reuse the function anywhere else. I think that part of the negative opinion of map() comes from this usage, as lambda functions often make code less readable. In that case, more often than not, a generator expression will be much more readable. Compare the two versions below: one with map() combined with lambda, and another with the corresponding generator expression. This time, we will not use our double() function, but we will define the doubling directly inside the calls:
# map-lambda version
map(lambda x: x*2, numbers)
# generator version
(x*2 for x in numbers)
The two lines lead to the same results, the only difference being the type of the returned objects: the former returns a map object while the latter returns a generator object.
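If you want to verify this yourself, a quick check along these lines should do. It is a sketch only; the variable names mapped, generated, and same_values are made up for the example.

```python
import types

numbers = [1, 4, 5, 10, 17]

mapped = map(lambda x: x * 2, numbers)
generated = (x * 2 for x in numbers)

# The two objects have different types...
is_map = isinstance(mapped, map)
is_generator = isinstance(generated, types.GeneratorType)
map_is_generator = isinstance(mapped, types.GeneratorType)

# ...but they yield exactly the same values.
same_values = list(mapped) == list(generated)

print(type(mapped).__name__, type(generated).__name__)
print(same_values)
```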
Let’s return for a moment to multi-iterable uses of map():
>>> SoS = map(sum_of_squares, x, y, z)
We can rewrite it using a generator expression in the following way:
>>> SoS_gen = (
... sum_of_squares(x_i, y_i, z_i)
... for x_i, y_i, z_i in zip(x, y, z)
... )
>>> list(SoS_gen)
[101, 102, 30, 38, 45]
This time my vote goes to the map() version! In addition to being more concise, it is, in my opinion, much clearer. The generator version utilizes the zip() function; even though it’s a simple function to use, it adds to the complexity of the expression.
So, we don’t need map(), do we?
As follows from the above discussion, there are no situations in which we must use the map() function; instead, we can use a generator expression, a loop, or something else.
Knowing that, do we need map() at all?
Pondering this question, I have come up with three main reasons why we need map() in Python.
Reason 1: Performance
As already mentioned, map() is evaluated lazily. In many cases, however, evaluating map() is quicker than evaluating the corresponding generator expression. Although this does not have to be the case with the corresponding list comprehension, we should remember this when trying to optimize our Python application.
Remember, however, that this is not a general rule, so you should not assume it. Whether or not map() will be quicker in your snippet needs to be checked every time.
Take this reason into account only if even slight differences in performance matter. Otherwise, you will gain very little by using map() at the cost of readability, so you should think twice before going for it. Oftentimes, saving a minute means nothing. Other times, saving a second can mean a lot.
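Since the outcome has to be checked case by case, here is one way you might run such a check with the standard timeit module. Treat it as a sketch under assumptions: the data size, the number of repetitions, and the helper names (with_map, with_generator, with_comprehension) are all arbitrary choices, and the timings will differ between machines and Python versions.

```python
import timeit

numbers = list(range(1_000))


def double(x):
    return x * 2


def with_map():
    return list(map(double, numbers))


def with_generator():
    return list(double(x) for x in numbers)


def with_comprehension():
    return [double(x) for x in numbers]


candidates = (with_map, with_generator, with_comprehension)

# Sanity check: all three versions must produce the same result.
expected = with_comprehension()
assert all(func() == expected for func in candidates)

# Take the best of several repeats to reduce the influence of noise.
timings = {
    func.__name__: min(timeit.repeat(func, number=200, repeat=5))
    for func in candidates
}

for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name}: {seconds:.4f} s")
```

Replace double() and numbers with your own function and data; results for a toy function like this one say little about a real workload.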
Reason 2: Parallelism and threading
When you parallelize your code or use a thread pool, you often end up using functions similar to map(). These include methods such as multiprocessing.Pool.map(), pathos.multiprocessing.ProcessingPool.map(), and concurrent.futures.ThreadPoolExecutor.map(). So, learning to use map() will help you understand how to use these functions. Often, you will want to switch between a parallel and a non-parallel version. You can do so very easily, thanks to the similarity between these functions. Look: