Amit Chauhan


Enhance Python Performance with Cython, Numba, and Eval()

Easy concepts for data science and software development

Photo by Tudor Baciu on Unsplash

If your Python code runs slowly, this article may help. Here, I will explain three methods, with code examples, that can speed up your code.

Python is a high-level, interpreted language, so every operation passes through several layers before it reaches low-level machine instructions. That overhead makes it slower than compiled languages such as C++.

Table of contents

1. Cython method
2. Numba (JIT) method
3. Eval() method

Cython

Cython is used to improve the speed of a Python program. Suppose you have already tried to speed up your code with NumPy's vectorization approach, but you are working on a data frame and the computation is still very slow. That’s where Cython comes into the picture: it compiles Python-like code to C, which can run much faster.

To distinguish Cython source files from Python files, Cython files use the extension ‘.pyx’.

To install Cython locally, use the command below:

pip install cython

To use Cython, we need two files: a code file (file_name.pyx) and a setup file (setup.py). Our code lives in the “.pyx” file, and setup.py is the build script, similar to a makefile, that cythonizes the “.pyx” file into a compiled extension module.

Example

1. Code file

# first_code.pyx
print("Hello World")

2. Setup file

from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules = cythonize("first_code.pyx")
)

After that, build the Cython code file by running the command below on the command line.

python setup.py build_ext --inplace

Now, import the built module.

>>> import first_code
Hello World
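
The Hello World file shows the build steps but not a speed-up. As a rough sketch that is not part of the original example, the numeric loop below is the kind of code where Cython usually helps, because static C types remove Python object overhead inside the loop. It uses Cython's pure Python mode, so the same file also runs as ordinary Python. The file name sum_squares.py and the function name sum_squares are hypothetical; the file could be built with the same setup.py by passing it to cythonize.

```python
# sum_squares.py (hypothetical file name)
import cython  # installed together with the Cython package


def sum_squares(n: cython.int):
    # Static C types let the compiled loop avoid Python object overhead.
    i: cython.int
    total: cython.longlong = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    # cython.compiled is True only when the compiled extension is running.
    print("compiled:", cython.compiled)
    print(sum_squares(5))  # 0 + 1 + 4 + 9 + 16 = 30
```

Run as a normal script, this reports that it is not compiled; after building and importing the extension, the same function runs as C code.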

Numba (JIT)

Numba is an impressive computation library that runs function code at the machine level. It is a just-in-time (JIT) compiler for Python functions and loops, and it works best on numerical code that uses NumPy arrays. Numba uses decorators to mark the functions that should be compiled just-in-time into machine code.

To install Numba locally, use the command below.

pip install numba

Python and Numba speed Example

Python speed time

import pandas as pd
df = pd.DataFrame({"x": [4, 3, 6, 5],
                   "y": [4, 5, 2, 1]})

# In a new notebook cell (%%timeit must be the first line of its cell)
%%timeit
def add(df):
    df['sum'] = df['x'] + df['y']
    return df
add(df)

# output time
1.12 ms ± 299 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Numba speed time

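import pandas as pd
from numba import jit
df = pd.DataFrame({"x": [4, 3, 6, 5],
                   "y": [4, 5, 2, 1]})

# Numba cannot compile pandas DataFrame operations to machine code,
# so this function runs in object mode (forceobj=True).
@jit(forceobj=True)
def add1(df):
    df['sum'] = df['x'] + df['y']
    return df

# In a new notebook cell (%%timeit must be the first line of its cell)
%%timeit
add1(df)

# output time
878 µs ± 52.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)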
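
Numba's machine-code compilation applies to NumPy arrays and plain loops rather than to DataFrame objects. The sketch below is an assumption about how the same addition could be restructured, not the article's original benchmark: it passes the columns as NumPy arrays to a function decorated with njit. The function name add_arrays is hypothetical, and the try/except fallback is there only so the sketch still runs, without any speed-up, when Numba is not installed.

```python
import numpy as np
import pandas as pd

try:
    from numba import njit
except ImportError:
    # Fallback: leave the function undecorated if Numba is missing.
    def njit(func):
        return func


@njit
def add_arrays(x, y):
    # An explicit loop is slow in plain Python but compiles well with Numba.
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        out[i] = x[i] + y[i]
    return out


df = pd.DataFrame({"x": [4, 3, 6, 5],
                   "y": [4, 5, 2, 1]})

# Pass NumPy arrays, not the DataFrame itself, to the compiled function.
df["sum"] = add_arrays(df["x"].to_numpy(), df["y"].to_numpy())
print(df)
```

The first call includes compilation time, so any timing comparison should measure a second or later call. On a four-row frame the gain is negligible; the benefit shows up on large arrays and heavier loops.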

Eval() method

The eval() method is a built-in function in Python that takes a string, parses it as a Python expression, and returns the result. pandas offers a related function, pd.eval(), which is useful for expressions that involve large arrays or data frames.

The syntax of this method is shown below:

eval(expression[, globals[, locals]])
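
To make the optional globals and locals arguments concrete, here is a minimal sketch. The variable names and the expression are made up for illustration. The dictionaries control which names the expression can see; this limits name lookup but should not be treated as a security sandbox.

```python
expression = "x * 2 + y"

# Names the expression is allowed to use.
restricted_globals = {"__builtins__": {}}
local_names = {"x": 10, "y": 5}

result = eval(expression, restricted_globals, local_names)
print(result)  # 10 * 2 + 5 = 25

# eval() returns the value of the expression; print() itself returns None.
returned = eval("print(55)")
print(returned is None)  # True
```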

Examples

1. Evaluate a Python expression with the eval() method.

x = 'print(55)'
eval(x)
# output:
55

In the above example, eval() took the string x and evaluated it as a Python expression, which printed the number 55.

2. This example shows the speed difference between a normal pandas expression and the same expression evaluated with pd.eval().

import pandas as pd
import numpy as np
nrows, ncols = 40000, 150
df1, df2, df3, df4 = [pd.DataFrame(np.random.randn(nrows, ncols))
                       for _ in range(4)]

The above code creates four pandas data frames. We will compare the speed of the normal expression with that of the eval expression.

%timeit df1 + df2 + df3 + df4
# output:
154 ms ± 27 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

# with eval method
%timeit pd.eval('df1 + df2 + df3 + df4')
# output:
60.4 ms ± 6.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
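
As a small check that both forms compute the same thing, the sketch below compares the two results on smaller frames and also shows DataFrame.eval, which evaluates an expression over a frame's own columns. The function name compare_sums, the frame sizes, and the column name total are assumptions for illustration. How much faster pd.eval runs depends on the frame size and on whether the optional numexpr package is installed.

```python
import numpy as np
import pandas as pd


def compare_sums(nrows=1000, ncols=5, seed=0):
    rng = np.random.default_rng(seed)
    df1, df2, df3, df4 = [pd.DataFrame(rng.standard_normal((nrows, ncols)))
                          for _ in range(4)]
    plain = df1 + df2 + df3 + df4
    # pd.eval resolves df1..df4 from the calling function's local variables.
    evaluated = pd.eval("df1 + df2 + df3 + df4")
    return plain, evaluated


plain, evaluated = compare_sums()
print(np.allclose(plain.to_numpy(), evaluated.to_numpy()))

# DataFrame.eval works on column names and returns a new frame by default.
df = pd.DataFrame({"x": [4, 3, 6, 5],
                   "y": [4, 5, 2, 1]})
df_with_total = df.eval("total = x + y")
print(df_with_total)
```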

Conclusion

Cython, Numba, and eval() each offer a different way to speed up computation in Python.

I hope you like the article. Reach me on my LinkedIn and Twitter.

