Python Programming: Manipulating tabular data efficiently
How to Speed up Pandas by 100x
With great power comes great responsibility.
Pandas is a Python data analysis library for working with tabular data, such as the data stored in spreadsheets and databases. It provides a vast set of functions for manipulating and transforming structured data held in dataframes. In this blog post, we shall discuss three simple tricks for speeding up Pandas operations.
1. Stop using iterrows():
- Data manipulation often requires iterating over dataframe rows. iterrows() is often the go-to option for such use cases. However, it is notoriously slow and can easily be replaced with itertuples().
- Consider a simple (read: trivial) problem of adding two columns of a Pandas dataframe.
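As a rough sketch of that column-addition problem, the snippet below computes the same row-wise sums with both iterators. The dataframe, its column names (`a`, `b`) and the helper function names are made up for illustration; only `iterrows()` and `itertuples()` come from Pandas itself.

```python
import numpy as np
import pandas as pd

# Hypothetical example data: the column names "a" and "b" are illustrative only.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "a": rng.integers(0, 100, size=1_000),
    "b": rng.integers(0, 100, size=1_000),
})


def add_with_iterrows(frame):
    # iterrows() yields (index, Series) pairs, so a new Series is built for every row.
    return [row["a"] + row["b"] for _, row in frame.iterrows()]


def add_with_itertuples(frame):
    # itertuples() yields namedtuples, so fields are read as plain attributes.
    return [row.a + row.b for row in frame.itertuples(index=False)]


sums_iterrows = add_with_iterrows(df)
sums_itertuples = add_with_itertuples(df)
```

Both functions return the same list of sums, so the swap is usually a drop-in change. To see the gap on your own machine, time each call with `timeit` (or `%timeit` in a notebook). The exact speedup depends on the dataframe size and the Pandas version.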












