Abstract
The article "Time Complexity and Its Importance in Python" emphasizes the significance of choosing efficient algorithms for problem-solving in programming. It defines time complexity as the measure of the time an algorithm takes to run, with a focus on the worst-case scenario. The concept of Big-O notation is introduced to quantify time complexity, simplifying the comparison of different algorithms. The article illustrates this with examples of linear search (O(n)), binary search (O(log n)), insertion sort and selection sort (both O(n²)), and merge sort (O(n*logn)). The text underscores the importance of reducing time complexity, especially in data science applications, where large datasets are common, and provides insights into the practical use of these algorithms in Python.
Opinions
The author suggests that reducing time complexity in data science can have a significant real-world impact.
Binary search is presented as a more efficient alternative to linear search for sorted lists, with the opinion that it can drastically reduce search time.
Insertion sort and selection sort are both acknowledged to have a time complexity of O(n²), but the author implies that their simplicity might make them preferable in certain scenarios.
Merge sort is highlighted as a superior sorting method compared to insertion and selection sorts due to its lower time complexity of O(n*logn).
The author expresses that while understanding the fundamentals of sorting and searching algorithms is important, Python programmers often use the built-in sort function for practical purposes.
Nowadays, a single problem can be solved in many different ways, so how do we decide which solution is better? Different algorithms can solve the same problem, so we need to compare them and choose the best one.
But what is an algorithm?
An algorithm is a set of instructions designed to produce a required output. Many different algorithms can give the same output.
To carry out these instructions, a computer needs memory, and it also needs time.
What is Time Complexity?
Time Complexity describes how the time needed to run a program grows as the size of its input grows. Using Time Complexity, we can determine whether a program is efficient or whether we should use another algorithm that takes less time.
Reducing the Time Complexity of an algorithm is often a major challenge in Data Science.
We state the Time Complexity of a program by counting the operations the algorithm performs in the worst-case scenario.
To quantify the Time Complexity, we will use Big-O notation.
Big-O Notation
Let’s start understanding what Big-O means with a short description from Wikipedia:
Big-O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows. The letter O is used because the growth rate of a function is also referred to as the order of the function or order of the program. We will always refer to the order of a function in its worst case.
Let’s say we have a list which consists of integers and we have to check whether the number given by the user is present in that list or not.
l = [1,2,3,6,4,9,10,12]
k = 12
The simple code for this is
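A minimal sketch of such a check is shown here. The function name `contains` is illustrative and is not taken from the original article.

```python
l = [1, 2, 3, 6, 4, 9, 10, 12]
k = 12


def contains(values, target):
    """Return True if target is in values, checking one element at a time."""
    for value in values:       # at most len(values) iterations
        if value == target:    # one comparison per element
            return True
    return False


found = contains(l, k)
print(found)  # True: 12 is the last element, so all 8 elements are checked
```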
Here, the worst case for this algorithm is when the number we are looking for is the last element of the list. The program starts at index 0 and checks whether that element is equal to k, which is one operation; in the worst case, it has to repeat this check for every element in the list.
In general terms, if a list has ‘n’ elements, then in the worst case we have to perform ‘n’ check operations. This is denoted by O(n) [order of n]. So, the time complexity of the above program is O(n).
For a better understanding of how the order depends on n, let’s see one more example.
Given two lists of length ‘n’, check for every pair of elements, one taken from each list, whether their sum is even or odd.
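One possible reading of this task is sketched below, assuming that every element of the first list is paired with every element of the second. The lists `a` and `b` and the function name `pair_sum_parity` are made up for illustration.

```python
a = [1, 2, 3]   # hypothetical first list, n = 3
b = [4, 5, 6]   # hypothetical second list, n = 3


def pair_sum_parity(first, second):
    """Label the sum of every (i, j) pair as 'even' or 'odd'."""
    results = []
    for i in first:            # outer loop runs n times
        for j in second:       # inner loop runs n times for each i
            parity = "even" if (i + j) % 2 == 0 else "odd"
            results.append((i, j, parity))
    return results


pairs = pair_sum_parity(a, b)
print(len(pairs))  # 9 checks for n = 3, i.e. n * n
```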
Now, let’s see how we can determine the order of the above program in terms of n. Assume there are n elements in each of the two lists. For each value of i from the first list, j runs over all ‘n’ elements of the second list, and i itself takes ‘n’ values.
So, the program needs n*n operations, and its time complexity is O(n²).
Whenever you calculate the order of a function and come across an expression like the following:
O(4n³ + 2n² + 3n + 1)
You can ignore the n², n and constant terms, as well as the coefficient 4, and write it as O(n³). For large n, the n³ term grows so much faster than the others that they have a negligible effect on the running time.
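To see why the lower-order terms can be dropped, the short sketch below evaluates the expression for growing n and prints the share contributed by the 4n³ term. The helper name `f` is only for illustration.

```python
def f(n):
    """Operation count from the example: 4n^3 + 2n^2 + 3n + 1."""
    return 4 * n**3 + 2 * n**2 + 3 * n + 1


for n in (10, 100, 1000):
    cubic_share = 4 * n**3 / f(n)   # fraction of the total due to the n^3 term
    print(n, f(n), round(cubic_share, 4))
```

The printed share moves closer to 1 as n grows, which is the sense in which the other terms become negligible.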
This is how we determine the Time Complexity of an algorithm.
In Data Science, we often come across large amounts of data and need to search it for a particular entity or entry. Reducing the time complexity of this type of program can have a huge impact in the real world.
Let’s see some searching techniques that help in reducing the time complexity.
Linear Search
Binary Search
1. Linear Search
Let’s see the example I stated above once again to understand the Linear Search.
We have a list which consists of integers and we have to check whether the number given by the user is present in that list or not.
l = [1,2,3,6,4,9,10,12]
k = 12
A linear search is the most basic kind of search. A linear, or sequential, search inspects each item in a list one by one, from one end to the other, until it finds a match for what you are searching for.
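A linear search that also reports where the match was found could be sketched as follows. The function name `linear_search` and the returned comparison counter are additions for illustration and are not part of the original article.

```python
def linear_search(values, target):
    """Return (index, comparisons); index is -1 if target is absent."""
    comparisons = 0
    for index, value in enumerate(values):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons


l = [1, 2, 3, 6, 4, 9, 10, 12]
print(linear_search(l, 12))  # (7, 8): worst case, every element is compared
print(linear_search(l, 1))   # (0, 1): best case, found on the first try
```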
The Time Complexity of the above program is O(n).
What are its advantages and disadvantages?
The advantage is that the program is very simple and easy to write. In the best-case scenario, the item you are searching for is at the start of the list, and you get the result on the very first try.
The disadvantage is that if the list is very large, as it often is in the real world, and the element you are searching for is the last one in the list, the search has to go through the entire list, which may take a while.
Now, let’s see a more efficient way of searching through a sorted list.
2. Binary Search
Say you have a sorted list and you need to find a value in it. Binary Search refers to dividing a collection of elements into two halves and throwing away one of them at each step of the algorithm. This can significantly reduce the number of comparisons required to find an element.
Don’t forget that, in order to perform Binary Search, the data must be sorted.
Let’s take the same example we used for Linear Search and see how to write the program using Binary Search.
You are given a sorted list with no repeated values and a key that we have to find in the list.
l = [1,2,3,4,6,9,10,12]
k = 3
Here, we split the list into two halves, decide which half can contain the element by comparing it with the first and last elements of each half, ignore the other half, and repeat until we find the number.
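A common way to implement this idea compares the key with the middle element instead of the ends of each half. The sketch below uses that standard formulation, so it is not necessarily the exact program the article refers to, and the function name `binary_search` is illustrative.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1      # target can only be in the right half
        else:
            high = mid - 1     # target can only be in the left half
    return -1


l = [1, 2, 3, 4, 6, 9, 10, 12]
k = 3
print(binary_search(l, k))  # 2
```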
In the above program, we first split the list into two halves at the midpoint:
[1,2,3,4] & [6,9,10,12]
We check which half can contain the required number by comparing it with the first and last elements of each half, and ignore the other half. We then divide the selected half into two halves again and follow the same process, repeating until we find the required element.
The Time Complexity of Binary Search is ‘order of log n’ with base 2, that is, O(log n).
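To put it in perspective, for a list of 1,024 elements Binary Search needs about 10 comparisons in the worst case, while Linear Search may need 1,024. If each comparison took one second, that would be about 10 seconds against about 1,024 seconds (17 minutes).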
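The gap can be checked with a few lines of arithmetic. The sketch below compares n with the approximate number of halving steps, log n with base 2 rounded up; the helper name `halving_steps` is hypothetical, and the exact comparison count of a real implementation may differ by one.

```python
import math


def halving_steps(n):
    """Approximate number of times n items can be halved: ceil(log2(n))."""
    return math.ceil(math.log2(n))


for n in (8, 1024, 1_000_000):
    # worst case: linear search looks at all n items
    print(n, "linear:", n, "binary:", halving_steps(n))
```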
Binary Search is the most commonly used method, but there are also other types of searches such as Jump Search, Interpolation Search and Exponential Search. You can find all about those here.
So far, we have seen how to reduce the Time Complexity of a search by using a different search algorithm on sorted data.
Now, let’s see how we can sort the data. Sorting data is one of the most basic steps in data exploration. There are many sorting algorithms, but we will look at three important ones.
Insertion Sort
Selection Sort
Merge Sort
1. Insertion Sort
The following is the basic definition of insertion sort from Stack Abuse:
An array is partitioned into a “sorted” sub-array and an “unsorted” sub-array. At the beginning, the sorted sub-array contains only the first element of our original array.
The first element in the unsorted array is evaluated so that we can insert it into its proper place in the sorted sub-array.
The insertion is done by moving all elements larger than the new element one position to the right.
Continue doing this until our entire array is sorted.
Let us see an example of Insertion Sort.
The user has a list and has to sort it using insertion sort.
L = [12, 13, 3, 7, 4, 2,1]
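A minimal sketch of insertion sort for this list might look like the following. It works on a copy and shifts larger elements one position to the right; the function name `insertion_sort` is illustrative rather than the author’s original code.

```python
def insertion_sort(values):
    """Return a sorted copy of values using insertion sort."""
    result = list(values)                  # leave the input untouched
    for i in range(1, len(result)):
        current = result[i]                # first element of the unsorted part
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]      # shift larger elements to the right
            j -= 1
        result[j + 1] = current            # insert into its proper place
    return result


L = [12, 13, 3, 7, 4, 2, 1]
print(insertion_sort(L))  # [1, 2, 3, 4, 7, 12, 13]
```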
So, the Time Complexity for the above algorithm is O(n²).
What we have done here is take the first element in the list and put it in the output. We then take the next element, compare it with the elements already in the output, and decide where the new element should be placed.
For example, in the above problem, we first take the first element ‘12’ and put it in the output list. Now we take the next element ‘13’, compare it with the previous elements in the list, i.e., with ‘12’, and put it next to ‘12’. The result so far is [12, 13]. Now, take the next element ‘3’, compare it with the elements ‘12’ & ‘13’, and put it before ‘12’. The result will be [3, 12, 13].
This process continues till the last element and the resultant output will be
[1, 2, 3, 4, 7, 12, 13].
This type of sorting is called Insertion Sort.
2. Selection Sort
For the problem mentioned in Insertion Sort, one straightforward approach is to find the minimum value in the list and place it in the first position, then find the minimum value in the remaining list and place it in the second position, and so on.
This process is called Selection Sort.
Let’s see how we can write code for Selection Sort.
L = [12, 13, 3, 7, 4, 2,1]
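One way to write this, as a sketch, is shown below. It sorts a copy by repeatedly swapping the smallest remaining value into place; the function name `selection_sort` is illustrative.

```python
def selection_sort(values):
    """Return a sorted copy of values using selection sort."""
    result = list(values)                  # leave the input untouched
    n = len(result)
    for i in range(n - 1):
        min_index = i
        for j in range(i + 1, n):          # scan the remaining, unsorted part
            if result[j] < result[min_index]:
                min_index = j
        # move the smallest remaining value into position i
        result[i], result[min_index] = result[min_index], result[i]
    return result


L = [12, 13, 3, 7, 4, 2, 1]
print(selection_sort(L))  # [1, 2, 3, 4, 7, 12, 13]
```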
The Time Complexity for the above program is O(n²). You can learn more about Selection Sort here.
So, we have seen two types of sorting methods with same Time Complexity as O(n²). Is it possible to do better than that?
Yes, the next sorting method Merge Sort will give you better Time Complexity than O(n²).
3. Merge Sort
The basic definition of Merge Sort is beautifully explained by Educative:
Merge sort is one of the most prominent divide-and-conquer sorting algorithms in the modern era. It can be used to sort the values in any data structure such as a list.
Merge sort works by splitting the input list into two halves, repeating the process on those halves, and finally merging the two sorted halves together.
The algorithm first moves from top to bottom, dividing the list into smaller and smaller parts until only the separate elements remain.
From there, it moves back up, ensuring that the merging lists are sorted.
Let’s see an example for Merge Sort.
L = [12, 13, 3, 7, 4, 2,1]
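The split-then-merge process described above can be sketched as follows. The names `merge` and `merge_sort` are illustrative, and this version returns a new list rather than sorting in place.

```python
def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])    # at most one of these two still has items
    merged.extend(right[j:])
    return merged


def merge_sort(values):
    """Return a sorted copy of values using merge sort."""
    if len(values) <= 1:       # a list of 0 or 1 items is already sorted
        return list(values)
    mid = len(values) // 2
    return merge(merge_sort(values[:mid]), merge_sort(values[mid:]))


L = [12, 13, 3, 7, 4, 2, 1]
print(merge_sort(L))  # [1, 2, 3, 4, 7, 12, 13]
```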
The Time Complexity of the above program is O(n*logn), which grows more slowly than the O(n²) of Insertion and Selection Sort.
There is also another sorting algorithm called Quick Sort, which follows the same divide-and-conquer approach as Merge Sort. You can find out about Quick Sort here.
Most of the time, you will be using the built-in sort function in Python. You can read about it here.
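For reference, a short sketch of the two built-in options: `sorted()` returns a new list, while `list.sort()` sorts the list in place. The variable names are illustrative.

```python
L = [12, 13, 3, 7, 4, 2, 1]

ascending = sorted(L)                   # new list, L is unchanged
descending = sorted(L, reverse=True)    # same, largest first

L.sort()                                # sorts L itself and returns None

print(ascending)   # [1, 2, 3, 4, 7, 12, 13]
print(descending)  # [13, 12, 7, 4, 3, 2, 1]
print(L)           # [1, 2, 3, 4, 7, 12, 13]
```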
Conclusion
Having understood the basics of Time Complexity, search algorithms and sorting algorithms, you can now use them when writing Python code or when making existing code more efficient in the real world.