Tudor Surdoiu

Top 12 most important Python concepts

Photo by Raul Cacho Oses on Unsplash

In this article I am going to explain several important Python concepts and show how they are used. They are very useful in the daily life of a Python engineer and are very likely to show up, in one form or another, in almost any Python interviewer’s repertoire of questions.

Let’s begin:

Generators

The basic idea of a generator is that it lets you write a function that behaves like an iterator, but without the boilerplate that comes with one.

For an iterator we have to create a whole new class that implements the __next__ and __iter__ methods and manages its own state. A generator function, by contrast, is declared almost like a normal function, except that it uses “yield” instead of just “return”. Each value you yield is the value we get from the next step of execution. An important aspect is that calling a generator function does not run its body; it returns a generator object (which is an iterator) that must then be advanced with the “next” function or a for loop. The main advantage of a generator function is that it’s memory efficient: we don’t have to keep all the values in a list to use them, because we can get them one at a time, in a “lazy” manner, from the generator.

An example of a generator function that keeps doubling the given number, one step per call to “next”:

def multiply(n):
    state = n
    while True:
        state = state * 2
        yield state
num = multiply(3)
print(next(num)) # 6
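
For comparison, here is a minimal sketch of the same doubling sequence written both ways, to show the boilerplate a generator saves. The names `Doubler` and `doubler` are made up for this illustration, and `itertools.islice` is used only to take a finite number of values from the infinite sequence.

```python
from itertools import islice


class Doubler:
    """Class-based iterator: state and protocol methods are written by hand."""

    def __init__(self, n):
        self.state = n

    def __iter__(self):
        return self

    def __next__(self):
        self.state *= 2
        return self.state


def doubler(n):
    """Generator version: Python builds the iterator machinery for us."""
    state = n
    while True:
        state *= 2
        yield state


first_three_class = list(islice(Doubler(3), 3))
first_three_gen = list(islice(doubler(3), 3))
print(first_three_class)  # [6, 12, 24]
print(first_three_gen)    # [6, 12, 24]
```

Both versions produce the same values lazily; the generator simply needs far less code.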

Closures

Before we talk about closures we must first understand what a nested function is. So a nested function is a function that is defined in another function, something like this:

def first():
    name = "Marcus"
    def second():
        print(name)

In the above case “second” is a nested function, and “first” is its enclosing scope. A nested function can read all the variables of the enclosing function. However, assigning to one of those names inside the inner function creates a new local variable instead; if we want to rebind the outer variable we must first declare it in the inner function with the nonlocal keyword. There is also a trick we can do here: we can return the inner function and still be able to access the variable from the outer one:

def first():
    name = "Marcus"
    def second():
        print(name)
    return second
newFunction = first()
newFunction() # Marcus

In the above example we use a closure to keep the variable bound to the inner function even after its normal scope has ended. So basically a closure is the inner function together with the context of its enclosing scope. Closures provide a kind of data hiding, because we can only reach an enclosed variable through the returned function.
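
A small sketch of a closure that also modifies its enclosed variable may help here. The function names `make_counter` and `increment` are invented for this example; the point is that each returned function keeps its own private `count`.

```python
def make_counter():
    count = 0  # lives in the enclosing scope of increment()

    def increment():
        nonlocal count  # needed because we rebind count, not just read it
        count += 1
        return count

    return increment


counter = make_counter()
first = counter()
second = counter()
print(first, second)  # 1 2

other = make_counter()  # a separate closure with its own count
print(other())  # 1
```

There is no way to reach `count` from the outside except by calling the returned function, which is the data hiding described above.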

Comprehensions

Comprehensions are a shorter and more concise way to create new collections in Python, most of the time in one line of code, without writing a classic for loop. There are four types of comprehensions: list, dictionary, set and generator comprehensions (the last are officially called generator expressions).

A list comprehension has three components:

  1. The output expression where we can alter the variables
  2. The sequence generation expression with for where we define the sequence
  3. The optional conditional expression that allows us to put an inclusion condition on a variable

The syntax is very similar for each type of comprehension. Below are examples of how we can use them:

new_list = [x for x in range(10)] #list comprehension
new_dict = {x:x+1 for x in range(10)} #dictionary comprehension
new_set = {x for x in range(10)} #set comprehension
new_gen = (x for x in range(10)) #generator comprehension

We can also nest comprehensions to build more complex sequences, or process two lists at the same time by pairing their values with the “zip” function.
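
As a rough illustration of the optional condition, of “zip” and of nesting, consider the sketch below. The sample data (`names`, `ages`, `matrix`) is made up for the example.

```python
names = ["Ana", "Bob", "Cid"]
ages = [31, 17, 45]

# Conditional expression: keep only the even numbers
evens = [x for x in range(10) if x % 2 == 0]

# zip pairs up the elements of two lists position by position
age_by_name = {name: age for name, age in zip(names, ages)}

# Nested comprehension: flatten a list of lists (the outer loop comes first)
matrix = [[1, 2], [3, 4], [5, 6]]
flat = [value for row in matrix for value in row]

print(evens)        # [0, 2, 4, 6, 8]
print(age_by_name)  # {'Ana': 31, 'Bob': 17, 'Cid': 45}
print(flat)         # [1, 2, 3, 4, 5, 6]
```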

Decorators

The by-the-book definition is that a decorator is a function that extends the behavior of another function without modifying it explicitly. A more direct explanation is that a decorator is a function that takes as input the original function we want to change, creates a wrapper function in which we do the extra work and call the input function, and then returns the wrapper function. To give a clearer view of the steps mentioned above, here is an example:

def the_decorator(original_func):
    def wrapper():
        print("Decorating!")
        original_func()
        print("Finished decorating!")
    return wrapper
def my_function():
    print("Hello there!")
my_function = the_decorator(my_function)

As you can see, after decorating the original function we can rebind the original name to the returned wrapper function. We can do this because in Python functions are first-class objects.

There are three things we can do to improve the above implementation. First, Python has a nicer syntax for applying a decorator: the “@” symbol. So instead of calling the decorator function directly and then assigning the result to the old name, we can just do the following:

@the_decorator
def my_function():
    print("Hello there!")

The second thing is more than just syntactic sugar. Most of the time our functions take one or more arguments, and we want decorators that support such functions without having to define a new one every time. The solution is to use “*args” and “**kwargs” in the wrapper function, and to return the original function’s result so the caller still receives it:

def the_decorator(original_func):
    def wrapper(*args, **kwargs):
        print("Decorating!")
        result = original_func(*args, **kwargs)
        print("Finished decorating!")
        return result # pass the original return value back to the caller
    return wrapper

The third improvement helps when we want to use introspection to find details about our function, such as its name or documentation. The problem is that after we apply the decorator, we get the information of the wrapper function instead of the original function’s. To fix this we can use the “@functools.wraps” decorator like this:

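import functools
def the_decorator(original_func):
    @functools.wraps(original_func) # copies __name__, __doc__ etc. onto wrapper
    def wrapper(*args, **kwargs):
        print("Decorating!")
        result = original_func(*args, **kwargs)
        print("Finished decorating!")
        return result # pass the original return value back to the caller
    return wrapper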
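
To see all three improvements working together, here is a self-contained sketch. The decorator name `log_calls` and the decorated function `add` are invented for this illustration.

```python
import functools


def log_calls(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print(f"Calling {func.__name__}")
        result = func(*args, **kwargs)
        print(f"Finished {func.__name__}")
        return result  # pass the original return value back to the caller
    return wrapper


@log_calls
def add(a, b):
    """Return the sum of a and b."""
    return a + b


total = add(2, b=3)
print(total)         # 5
print(add.__name__)  # add
print(add.__doc__)   # Return the sum of a and b.
```

Without `functools.wraps`, the last two lines would report the name and (missing) docstring of `wrapper` instead.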

Context Manager

Context managers are helpful when we have two distinct operations that we want to execute at the beginning and at the end of a block of code. One practical example is allocating and releasing resources: when we want to work with a file, we first have to open it, and after we finish our operations we have to close its file descriptor. In Python we can use a context manager through the “with” statement:

with open("file.txt", "w") as file_handler:
    file_handler.write("Hello there!")

In the above example, after the write instruction finishes, the context manager will close the opened file through the file handler.

We can actually implement our own context manager as a class:

class NewCM(object):
    def __init__(self):
        print('init method called')
    def __enter__(self):
        print('enter method called')
        return self
    def __exit__(self, exc_type, exc_value, exc_traceback):
        print('exit method called')
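
A short sketch of how such a class behaves inside a “with” statement, including the case where the block raises an exception. The class name `Tracked` and the `events` list are assumptions made for this example; they only record the order of the calls.

```python
events = []  # records the order in which things happen


class Tracked:
    def __enter__(self):
        events.append("enter")
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # Called even when the block raises; exc_type is None on success
        events.append(f"exit ({exc_type.__name__ if exc_type else 'no error'})")
        return False  # do not swallow the exception


with Tracked():
    events.append("body")

try:
    with Tracked():
        raise ValueError("boom")
except ValueError:
    events.append("caught")

print(events)
# ['enter', 'body', 'exit (no error)', 'enter', 'exit (ValueError)', 'caught']
```

The second block shows why context managers suit resource cleanup: `__exit__` runs before the exception reaches the `except` clause.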

Slicing

Slicing is one of the simplest and most used Python features. It allows us to select multiple elements from a sequence and then modify them or use them somewhere else. The syntax is pretty straightforward, “list[start:stop:step]”, where “start” is included, “stop” is excluded and all three parts are optional. The best way to understand how it works is to see it at work, so here are several ways we can use slices to select elements:

nums = [1,2,3,4,5,6,7]
#select the first 3 elements
nums[:3]
#select the last 3 elements
nums[-3:]
#select the elements at indexes 1 and 2 (the stop index 3 is excluded)
nums[1:3]
#select the reverse of the list
nums[::-1]

One important aspect is that the returned list is a shallow copy of the original, so if we replace an element in the original sequence the new one remains the same:

nums = [1,2,3,4,5,6,7]
new_nums = nums[:3]
nums[0] = 100
print(nums) # [100, 2, 3, 4, 5, 6, 7]
print(new_nums) # [1, 2, 3]
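
The word “shallow” matters when the elements are themselves mutable. The following sketch, with a made-up list of lists called `rows`, shows that the slice is a new outer list whose elements are still the same inner objects.

```python
rows = [[1, 2], [3, 4], [5, 6]]
first_two = rows[:2]

# Replacing an element of the original does not affect the slice...
rows[0] = [100, 200]
print(first_two)  # [[1, 2], [3, 4]]

# ...but mutating a shared inner list is visible through both
rows[1].append(99)
print(first_two)  # [[1, 2], [3, 4, 99]]
print(first_two[1] is rows[1])  # True
```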

We can also use slicing to change multiple elements of the list or even expand it:

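nums = [1,2,3,4,5,6,7]
#change the first 3 elements
nums[:3] = [101, 102, 103]
# nums is now [101, 102, 103, 4, 5, 6, 7]
#replace the first 3 elements with 5 elements, which expands the list
nums[:3] = [101, 102, 103, 104, 105]
# nums is now [101, 102, 103, 104, 105, 4, 5, 6, 7]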

Multiple Inheritance

This is a simple one, a class in python can be derived from multiple base classes. The syntax is the following:

class BaseOne:
    pass

class BaseTwo:
    pass

class Derived(BaseOne, BaseTwo):
    pass

As such, the Derived class will inherit the features of both the BaseOne class and the BaseTwo class. One thing to remember is that in a multiple inheritance scenario, when we access an attribute it is first searched for in the current class; if it is not found there, the search continues through the base classes, following the order in which they are listed in the class definition, until it reaches the object class. This search order is called the Method Resolution Order (MRO), and Python computes it with the C3 linearization algorithm.
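
A minimal sketch of the lookup order, reusing the class names from the syntax example but with illustrative methods (`greet`, `only_in_two`) added so that the effect is visible:

```python
class BaseOne:
    def greet(self):
        return "BaseOne"


class BaseTwo:
    def greet(self):
        return "BaseTwo"

    def only_in_two(self):
        return "from BaseTwo"


class Derived(BaseOne, BaseTwo):
    pass


d = Derived()
mro_names = [cls.__name__ for cls in Derived.__mro__]
print(mro_names)        # ['Derived', 'BaseOne', 'BaseTwo', 'object']
print(d.greet())        # BaseOne
print(d.only_in_two())  # from BaseTwo
```

Because BaseOne is listed first, its `greet` wins; `only_in_two` is found later in the chain.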

Lambdas

A lambda is a small anonymous function, often defined inline, that can take any number of arguments but whose body is a single expression. It is mostly used where we pass a custom function as a callback or return a new function. Its syntax is rather simple: “lambda arguments : expression”. Here is an example:

add_two = lambda x : x + 2
print(add_two(1))
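
The two typical uses mentioned above, passing a lambda as a callback and returning one from a function, might look like the sketch below. The `people` data and the `make_adder` helper are invented for the example.

```python
people = [("Marcus", 58), ("Augustus", 75), ("Nero", 30)]

# Lambda as a callback: sort by the second item of each tuple
by_age = sorted(people, key=lambda person: person[1])
print(by_age)  # [('Nero', 30), ('Marcus', 58), ('Augustus', 75)]


def make_adder(n):
    # Returning a new function built from a lambda
    return lambda x: x + n


add_five = make_adder(5)
print(add_five(10))  # 15
```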

Namespaces

Namespaces are structures that Python uses to organize the symbolic names of objects (almost everything we work with in Python is an object). One easy way to think about namespaces is to see them as dictionaries that map the name of an object to the actual object. There are three types of namespaces: built-in, global and local.

The built-in namespace is a collection of all of Python’s built-in objects. It is created at the beginning, when the interpreter starts, and is always available.

A global namespace exists for the main program and for every module that we load in our Python program with the “import” statement.

A local namespace is created whenever a function begins executing, and it exists until the function returns. By using nested functions we can have multiple levels of local namespaces, each with its own set of names. Names from an enclosing or global namespace can always be read directly, but if we want to assign to a variable from an enclosing namespace we must declare it with the “nonlocal” keyword, and if we want to assign to a variable from the global namespace we must declare it with the “global” keyword. And because everything is better with an example:

nameOne = "Marcus"
def outer():
    global nameOne
    nameOne = nameOne + " Aurelius"
    nameTwo = "Augustus"
    
    def inner():
        global nameOne
        nameOne = "Emperor " + nameOne
        
        nonlocal nameTwo
        nameTwo = "Emperor " + nameTwo
        
    inner()
    print(nameTwo) # Emperor Augustus

outer()
print(nameOne) # Emperor Marcus Aurelius

As a bonus, if we want to see all the names in the global and current local namespaces, we can use the built-in functions “globals()” and “locals()”, which return dictionaries of those names and their values.
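
A quick sketch of what these two functions return. The names `language`, `describe` and `version` are made up for the example.

```python
language = "python"


def describe():
    version = 3
    local_names = locals()  # the function's local namespace at this point
    return local_names, "language" in globals()


local_names, has_global = describe()
print(local_names)  # {'version': 3}
print(has_global)   # True
```

Note that `locals()` only sees `version` here, because `local_names` itself has not been assigned yet at the moment of the call.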

Metaclasses

In python a metaclass is a class of a class. That sounds pretty strange, and you will probably never use it directly, but it’s nevertheless an important python concept that does a lot of work behind the scenes.

When we ask for the type of any class in Python, or even of built-in types like “int” or “list”, we get “type”. This “type” is a metaclass. When we create a new normal class, the metaclass “type” is used. To check the type of a class instance or of a class (which is also an object, an instance of a metaclass) we can use the “type()” function or the “__class__” attribute.

We can also create custom metaclasses and customize the class creation process. To do this we must first create a new metaclass by inheriting from the “type” metaclass, and then when creating a new class we must specify it with the “metaclass” keyword, or just inherit from a class that has already done so. Example time!:

class NewMeta(type):     
    pass  
class NewClass(metaclass=NewMeta):     
    pass  
class NewSubClass(NewClass):     
    pass
print(type(NewMeta)) #<class 'type'>
print(type(NewClass)) #<class '__main__.NewMeta'>
print(type(NewSubClass)) #<class '__main__.NewMeta'>

Normally when creating a new metaclass we are interested in overriding the “__new__()” method, which is called before “__init__” and returns the new class object. We do this to control the way in which the class is created.
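
As a hedged sketch of what overriding “__new__()” can look like, the metaclass below records every class it creates and stamps an attribute on it. The names `RegisteringMeta`, `registry`, `created_by`, `Plugin` and `CsvPlugin` are all invented for this illustration.

```python
class RegisteringMeta(type):
    registry = []  # names of all classes created through this metaclass

    def __new__(mcs, name, bases, namespace):
        # Let type build the class, then customize it before returning it
        cls = super().__new__(mcs, name, bases, namespace)
        cls.created_by = mcs.__name__
        mcs.registry.append(name)
        return cls


class Plugin(metaclass=RegisteringMeta):
    pass


class CsvPlugin(Plugin):  # inherits the metaclass from Plugin
    pass


print(RegisteringMeta.registry)  # ['Plugin', 'CsvPlugin']
print(CsvPlugin.created_by)      # RegisteringMeta
print(type(CsvPlugin))           # <class '__main__.RegisteringMeta'>
```

The metaclass runs once per class definition, not once per instance, which is what makes it suitable for this kind of registration.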

If you want a more in-depth explanation and extra examples you can visit this page: https://realpython.com/python-metaclasses/.

Multiprocessing

Python supports three types of concurrent execution: multithreading, multiprocessing and asyncio. Multithreading is limited by the Global Interpreter Lock (GIL), a mechanism of the standard CPython interpreter that allows only one thread to execute Python bytecode at any given time in a process. This is the main reason why multiprocessing is preferred when we have CPU-bound code that is highly parallelizable. The third option, asyncio, is an async/await library that is ideal for IO-bound network code.

The multiprocessing module has several classes that we can use to work with processes. One of them is the Process class. We can use it to create and start a new process, and then join it from the parent (that is, wait for it to finish). One small example:

import time
from multiprocessing import Process

def one():
    print('Started one')
    time.sleep(2)
    print('Finished one')

def main():
    p = Process(target=one)
    p.start()
    p.join()


if __name__ == '__main__':
    print('Started main')
    main()
    print('Finished main')

Another important class is Pool. We can use Pool to create multiple processes and distribute a sequence of objects among them, to be processed by a specific function. The newly created Pool object has a “map” method that we can use to distribute the data to the pool of processes:

from multiprocessing import Pool

def showMe(data):
    print(data)


def main():
    values = [2, 4, 6, 8]

    with Pool() as pool:
        pool.map(showMe, values)

if __name__ == '__main__':
    main()

Processes have separate memory address spaces, and as such, by default, one process cannot access a variable from another. There are, however, ways to share data or pass messages between them.

To share data between processes we can use the Value or Array classes to create a shared memory space. We must be careful to synchronize the access to them by using a Lock or by calling the “get_lock()” method on the data object.

The other way to communicate is by passing messages through queues, which is the preferred way because we avoid the need to synchronize data access ourselves. For this we can use the Queue class: some processes add data to the queue while others retrieve it:

from multiprocessing import Queue, Process, current_process


def worker(queue):
    name = current_process().name
    print(f'{name} data received: {queue.get()}')

def main():
    queue = Queue()
    queue.put(0)
    queue.put(1)
    queue.put(2)
    queue.put(3)

    processes = [Process(target=worker, args=(queue,)) for _ in range(4)]

    for p in processes:
        p.start()

    for p in processes:
        p.join()

if __name__ == "__main__":
    main()

Buffer Protocol

The buffer protocol provides a way to work directly with the raw bytes that back certain Python objects. This is very useful for scientific computing, where we deal with large arrays and don’t want to copy the whole dataset when we create new views of the data. One good example of a library that uses the buffer protocol is numpy. I am going to present a simple example of how this works, but for more details I highly encourage you to read the explanation here: https://jakevdp.github.io/blog/2014/05/05/introduction-to-the-python-buffer-protocol/

import array
import numpy as np

arr = array.array('i', range(10)) # array stores data as a contiguous block
npArr = np.asarray(arr) # shares arr's memory through the buffer protocol, no copy
npArr[4] = 555 # this will also change the original arr array
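
The same idea can be tried with the standard library alone, using the built-in “memoryview” type on a bytearray. This is only a small sketch with made-up sample data, not how numpy is implemented internally.

```python
data = bytearray(b"hello world")
view = memoryview(data)   # no bytes are copied here

part = view[6:]           # slicing a memoryview is also copy-free
part[0] = ord("W")        # writes through to the underlying bytearray

print(data)               # bytearray(b'hello World')
print(bytes(part))        # b'World'
```

Changing the view changes the original object, because both refer to the same block of memory.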

Thank you for reading, I hope this helps you and remember to have a nice day! :)
