Marcin Kozak

Summary

The article discusses the relevance and utility of Python's map() function in modern Python programming, despite the existence of alternative approaches such as list comprehensions and generator expressions.

Abstract

While Python's map() function is often considered redundant due to the popularity of list comprehensions and generator expressions, the article argues that map() still holds value in the Python ecosystem. It emphasizes map()'s role in functional programming, its performance benefits due to lazy evaluation, and its ease of use for those transitioning from other programming languages. The author also points out its significance in parallel processing and its ability to handle multiple iterables more readably than its alternatives. Although not a necessity, map() is defended as a useful and Pythonic tool that developers should be familiar with, despite proposals to remove it from the language in the past.

Opinions

  • The author believes that map() is underappreciated among intermediate-level Python developers, despite its frequent discussion among Python authors.
  • There is a sentiment that map() can be more readable and concise than alternatives when dealing with multiple iterables.
  • The article suggests that map() can be beneficial for performance optimization in certain scenarios, although this is not a general rule.
  • The author opines that map() serves as a bridge for developers coming from other languages, facilitating the transition to Pythonic code.
  • map() is considered valuable for its role in parallelism and threading, making it easier to switch between parallel and non-parallel code versions.
  • The author expresses that map() should remain a part of Python, citing its various use cases and Pythonic nature, despite Guido van Rossum's past considerations to remove it.

Does Python still need the map() function?

Having various alternatives, Python’s map() seems to be redundant. So, does Python need it at all?

Does Python need the map() function? Photo by Muhammad Haikal Sjukri on Unsplash

Don’t worry, this will not be the millionth article on how to use map() in Python. I am not going to tell you that it’s better or worse than a list comprehension or a for loop. I am not going to benchmark it against its corresponding generator or list comprehension. I am not going to claim that using map() will make you look like an advanced Python developer…

You can read about all these things in other articles published on Medium. Even though the built-in map() function is not too popular among Python developers (you will seldom find it in production code), it has gained much popularity among Python authors (see, e.g., here, here and here). Why is that? Maybe because it’s an interesting function; resembles functional programming; or can be benchmarked against its alternatives, and benchmarking usually draws attention?

Most articles on Python’s map() merely show the how, but not the why: While showing how to use it, they usually fail to explain why one should use it. No wonder, then, that despite its popularity among Python authors, map() seems to be underappreciated among intermediate-level developers.

If you want to learn more about map(), this article is for you. We will discuss the map() function’s place on Python’s code road map, and why it’s worth knowing this function, irrespective of whether you will ever use it or not.

A few words about map()

map() does something that most Python developers do quite often: It calls a function (actually, a callable) for each element of an iterable.

It takes a callable as an argument, so it is a higher-order function. Since that is a typical feature of functional programming languages, map() conforms to the functional-programming style. For example, you will find a lot of map()’s applications in Lott’s book Functional Python Programming. Still, I’d say map() offers a functional-style API rather than being truly functional. This is because we can use map() with impure functions, that is, functions that have side effects; this is unacceptable in true functional programming.

It’s time to see map() in action.

>>> numbers = [1, 4, 5, 10, 17]
>>> def double(x):
...     return x*2

So, we have a list of numbers, and we have a function that doubles a number. double() works for a single number:

>>> double(10)
20

What will happen if we use double() for numbers?

>>> double(numbers)
[1, 4, 5, 10, 17, 1, 4, 5, 10, 17]

If you know how multiplying a list works in Python, this will not have surprised you. Although this is normal behavior, it is definitely not what we wanted to achieve. Above, double(numbers) applied the function double() to numbers as a whole (as an object). This is not what we want; we want to apply double() to each element of numbers. This makes a big difference, and this is where map() comes in: you can use it when you want to apply a callable to each element of an iterable.

Warning: Some languages use the name map for a hash map; in Python, dictionaries are hash maps. So, when you see the term “map” in another language, first check what it represents. For example, map() in R is equivalent to Python’s map(); but in Go, map is a built-in hash-map type that works similarly to Python’s dict. This was confusing for me when I first started learning Go, but after some time you will get used to it.

This is how you should use map():

>>> doubled_numbers = map(double, numbers)

As you see, you provide a callable as the first argument and an iterable as the second argument to map(). It returns a map object (in Python 3, but in Python 2 you would get a list):

>>> doubled_numbers #doctest: +ELLIPSIS
<map object at ...>

(Note that I used #doctest: +ELLIPSIS directive, as this document is covered by doctests. It helped me ensure that all the examples were correct. You can read more about doctesting in the documentation.)

A map object works like a generator: strictly speaking, it is an iterator. So, even though we used a list in map(), we did not get a list but an object that is evaluated on demand (lazily), just as a generator is. If you want to convert a map object into a list, use the list() function, which will evaluate all the elements:

>>> list(doubled_numbers)
[2, 8, 10, 20, 34]

Or, you can evaluate map’s elements in any other way, like in a for loop. In order to avoid unpleasant headaches, remember that once such an object has been evaluated, it’s empty and thus cannot be evaluated anymore:

>>> list(doubled_numbers)
[]
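The lazy, one-shot behavior can be made more explicit with next(). The snippet below is a minimal sketch that reuses double() and numbers from above; the remaining variable names are chosen only for illustration. It also shows one simple way around the exhaustion problem: materialize the results once and keep the list.

```python
numbers = [1, 4, 5, 10, 17]

def double(x):
    return x * 2

doubled = map(double, numbers)

# Nothing has been computed yet; next() pulls one element at a time.
first = next(doubled)     # 2
second = next(doubled)    # 8

# list() consumes whatever is left, after which the map object is exhausted.
rest = list(doubled)      # [10, 20, 34]
leftover = list(doubled)  # []

# To use the results more than once, materialize them once and keep the list.
doubled_list = list(map(double, numbers))
total = sum(doubled_list)    # 74
largest = max(doubled_list)  # 34
```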

Above, we applied map() to a single iterable, but we can pass as many iterables as the callable takes arguments. The function will use them based on their position, that is, first, it will call the callable for the first elements of the iterables (at index 0); then for the second; and so on.

A simple example for this:

>>> def sum_of_squares(x, y, z):
...     return x**2 + y**2 + z**2
>>> x = range(5)
>>> y = [1, 1, 1, 2, 2]
>>> z = (10, 10, 5, 5, 5)
>>> SoS = map(sum_of_squares, x, y, z)
>>> list(SoS)
[101, 102, 30, 38, 45]
>>> list(map(sum_of_squares, x, x, x))
[0, 3, 12, 27, 48]
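One detail worth knowing with several iterables: when they differ in length, map() stops as soon as the shortest one is exhausted, without raising an error. The sketch below shortens y (an assumption made only for this illustration) to show the effect, and adds a simple length check you might run when all inputs are sized sequences.

```python
def sum_of_squares(x, y, z):
    return x**2 + y**2 + z**2

x = range(5)           # five elements
y = [1, 1, 1]          # only three elements (shortened for this illustration)
z = (10, 10, 5, 5, 5)  # five elements

# map() stops silently once y runs out, so only three results are produced.
truncated = list(map(sum_of_squares, x, y, z))
print(truncated)  # [101, 102, 30]

# A simple guard when every input has a known length.
same_length = len({len(x), len(y), len(z)}) == 1
print(same_length)  # False
```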

map()'s alternatives

Instead of map(), you can use a generator, for example, through a generator expression:

>>> doubled_numbers_gen = (double(x) for x in numbers)

This provides a generator, which is lazy just like the object that map() returned. When you need a list, you will do better with the corresponding list comprehension:

>>> doubled_numbers_list = [double(x) for x in numbers]

Which is more readable: the map() version or the generator expression (or a list comprehension)? For me, without a second of hesitation, the generator expression and the list comprehension are clearer, even though I have no problem understanding the map() version. But I know that some people would choose the map() version, especially those who have recently moved to Python from another language that uses a similar function to map().

People often combine map() with lambda functions, which is a good solution when you do not want to reuse the function anywhere else. I think that part of the negative opinion of map() comes from this usage, as lambda functions often make code less readable. In that case, more often than not, a generator expression will be much more readable. Compare the two versions below: one with map() combined with lambda, and another with the corresponding generator expression. This time, we will not use our double() function, but we will define it directly inside the calls:

# map-lambda version
map(lambda x: x*2, numbers)
# generator version
(x*2 for x in numbers)

The two lines lead to the same results, the only difference being the type of the returned objects: the former returns a map object, while the latter returns a generator object.
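A quick check of that claim, as a small sketch using the numbers list from earlier (the variable names are illustrative only):

```python
numbers = [1, 4, 5, 10, 17]

mapped = map(lambda x: x * 2, numbers)
generated = (x * 2 for x in numbers)

# The two objects have different types...
print(type(mapped).__name__)     # map
print(type(generated).__name__)  # generator

# ...but both are lazy iterators that yield the same values.
mapped_values = list(mapped)
generated_values = list(generated)
print(mapped_values == generated_values)  # True
```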

Let’s return for a moment to multi-iterable uses of map():

>>> SoS = map(sum_of_squares, x, y, z)

We can rewrite it using a generator expression in the following way:

>>> SoS_gen = (
...     sum_of_squares(x_i, y_i, z_i)
...     for x_i, y_i, z_i in zip(x, y, z)
... )
>>> list(SoS_gen)
[101, 102, 30, 38, 45]

This time my vote goes to the map() version! In addition to being more concise, it’s in my opinion much clearer. The generator version utilizes the zip() function; even if it’s a simple function to use, it adds to the complexity of the command.

So, we don’t need map(), do we?

As follows from the above discussion, there are no situations in which we must use the map() function; instead, we can use a generator expression, a loop, or something else.

Knowing that, do we need map() at all?

Pondering this question, I have come up with four main reasons why we need map() in Python.

Reason 1: Performance

As already mentioned, map() is evaluated lazily, just like a generator expression. In many cases, however, evaluating map() is quicker than evaluating the corresponding generator expression. Although this does not have to be the case with the corresponding list comprehension, we should remember this when trying to optimize our Python application.

Remember, however, that it’s not a general rule, so you should not assume this. Whether or not map() will be quicker in your snippet needs to be checked every time.

Take this reason into account only if even slight differences in performance matter. Otherwise, you will gain very little by using map() at the cost of readability, so you should think twice before going for it. Oftentimes, saving a minute means nothing. Other times, saving a second can mean a lot.
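One way to run such a check is the standard library's timeit module. The sketch below is only an illustration: the input size, the repetition counts and the with_* helper names are arbitrary assumptions, and the timings will differ between machines and Python versions, so no result is shown.

```python
import timeit

numbers = list(range(1_000))

def double(x):
    return x * 2

def with_map():
    return list(map(double, numbers))

def with_genexp():
    return list(double(x) for x in numbers)

def with_listcomp():
    return [double(x) for x in numbers]

# Take the best of several repeats for each variant to reduce noise.
timings = {
    func.__name__: min(timeit.repeat(func, number=200, repeat=5))
    for func in (with_map, with_genexp, with_listcomp)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

Measure with data and a callable that resemble your real workload; a cheap function such as double() mostly measures call overhead.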

Reason 2: Parallelism and threading

When you parallelize your code or use a thread pool, you often end up using functions similar to map(). This can include methods such as multiprocessing.Pool.map(), pathos.multiprocessing.ProcessingPool.map(), or concurrent.futures.ThreadPoolExecutor.map(). So, learning to use map() will help you understand how to use these functions. Often, you will want to switch between a parallel and non-parallel version. You can do so very easily, thanks to the similarity between these functions. Look:
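A minimal sketch of such a switch, assuming the double() function and the numbers list from earlier and using concurrent.futures.ThreadPoolExecutor from the standard library (the max_workers value is an arbitrary choice for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

numbers = [1, 4, 5, 10, 17]

def double(x):
    return x * 2

# Non-parallel version: the built-in map().
sequential = list(map(double, numbers))

# Threaded version: same callable, same iterable; only the map() being called changes.
with ThreadPoolExecutor(max_workers=2) as executor:
    threaded = list(executor.map(double, numbers))

print(sequential)  # [2, 8, 10, 20, 34]
print(threaded)    # [2, 8, 10, 20, 34]
```

executor.map() returns the results in the order of the inputs, so the two lists are identical. A process pool such as multiprocessing.Pool is used in much the same way, but there the pool should be created under an if __name__ == "__main__": guard.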

Of course, in this simple example, parallelizing does not make sense and will be slower, but I wanted to show you how to do this.

Reason 3: Simplicity for Python newcomers from other languages

This reason is atypical and does not refer to the language itself, but it’s sometimes important. For me, a generator expression is almost always easier to write and more readable. However, when I was quite new to Python, comprehensions were not that easy for me, both to write and to understand. But since I came to Python after 16 years of R programming, I was very familiar with R’s map() function, which works in exactly the same way as does Python’s map(). Then, it was much easier for me to use map() than the corresponding generator expression or list comprehension.

What’s more, familiarity with map() helped me understand comprehensions. I was also able to write Pythonic code; yes, using map() is Pythonic. We know that a third option to achieve the same, which I did not cover here, is a for loop, but it’s seldom a better (or even good) option. Hence, if someone comes to Python and knows how such functions work, it’s much easier for them to write Pythonic code. For instance, someone coming to Python from C will likely use a for loop, which is in such instances considered non-Pythonic.

This means that map() is a bridge between Python and other languages — a bridge that can help others understand the language and write Pythonic code.
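To make the comparison concrete, here are the three styles side by side, as a sketch reusing double() and numbers from earlier (the doubled_* names are illustrative):

```python
numbers = [1, 4, 5, 10, 17]

def double(x):
    return x * 2

# The habit a newcomer from C might bring: build the list in a for loop.
doubled_loop = []
for number in numbers:
    doubled_loop.append(double(number))

# The habit a newcomer from R or a functional language might bring: map().
doubled_map = list(map(double, numbers))

# The form many Python developers reach for first: a list comprehension.
doubled_comp = [double(number) for number in numbers]

print(doubled_loop == doubled_map == doubled_comp)  # True
```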

Reason 4: Readability in the case of several iterables

As shown above, when you want to use a callable for a number of iterables at the same time, map() can be much more readable and concise than the corresponding generator expression. So, even though in simpler situations map() is less readable, in more complex scenarios its readability becomes its strength.

Conclusion

Some say that Python does not need map(). Back in 2005, Guido himself wanted to remove it from Python, along with filter() and reduce(). But today, 17 years later, we can still use it, and I think, and sincerely hope, that this will not change. This brings us to the two crucial questions this article is all about:

Is the map() function a must in Python? No, it is not. You can achieve the same result with other alternatives.

This being the case, does Python still need the map() function? Yes, it does. Not because it is a must, but because it is still used, and it serves various valuable purposes.

I think Python developers should know how to use map(), even if they don’t use it on a regular basis. In some situations, it can help improve performance, parallelize code with only minor changes, or improve code readability. It helps in understanding comprehensions. And it can help developers coming from other languages to use idiomatic Python — because, yes, map() is still Pythonic.

For these very reasons, I think map() deserves its place in the Python code base, even if it is not used too often.
