
Sorting Data in Python Pandas
In this tutorial, you will learn how to effectively sort data in a pandas DataFrame using the sort_values() and sort_index() methods.
Sorting a DataFrame by Column Values
You can use the sort_values() method to sort a pandas DataFrame by the values of one or more columns. By default, the sorting is done in ascending order. Here's an example of sorting a DataFrame by a single column:
import pandas as pd
# Create a sample DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 20]}
df = pd.DataFrame(data)
# Sort the DataFrame by the 'Age' column
sorted_df = df.sort_values(by='Age')
print(sorted_df)
Output:
      Name  Age
2  Charlie   20
0    Alice   25
1      Bob   30
You can also sort by multiple columns by passing a list of column names to the by parameter.
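As a minimal sketch of multi-column sorting, the snippet below sorts first by one column and then by a second column within each group. The staff DataFrame and its 'Department' column are hypothetical and used only for illustration. The second call assumes you want a different direction per column, which the ascending parameter supports when given a list of the same length as by.

```python
import pandas as pd

# Hypothetical sample data; the 'Department' column is illustrative only
staff = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Department': ['Sales', 'IT', 'Sales', 'IT'],
    'Age': [25, 30, 20, 28],
})

# Sort by 'Department' first, then by 'Age' within each department
by_dept_age = staff.sort_values(by=['Department', 'Age'])
print(by_dept_age)

# Same columns, but 'Department' ascending and 'Age' descending
mixed_order = staff.sort_values(by=['Department', 'Age'],
                                ascending=[True, False])
print(mixed_order)
```

The order of the names in the list matters: the first column is the primary sort key, and later columns only break ties among rows that share the same earlier values.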
Changing the Sort Order
You can change the sort order to descending by setting the ascending parameter to False. Here's an example:
# Sort the DataFrame by the 'Age' column in descending order
sorted_df_desc = df.sort_values(by='Age', ascending=False)
print(sorted_df_desc)
Output:
      Name  Age
1      Bob   30
0    Alice   25
2  Charlie   20
Sorting by Index
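To sort a DataFrame by its index, you can use the sort_index() method. The sample DataFrame already has its index in order (0, 1, 2), so the result below matches the original; sort_index() is most useful for restoring label order after a call such as sort_values() has rearranged the rows. Here's an example: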
# Sort the DataFrame by its index
sorted_index_df = df.sort_index()
print(sorted_index_df)
Output:
      Name  Age
0    Alice   25
1      Bob   30
2  Charlie   20
Handling Missing Data While Sorting
Pandas provides options to handle missing data while sorting values. You can use the na_position parameter to specify where the NaN values should appear in the sorted result. By default, NaN values are placed at the end.
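To see na_position make a visible difference, the sort column has to contain at least one missing value. The sketch below assumes a hypothetical people DataFrame in which one age is missing (the extra row and the NaN are illustrative, not part of the tutorial's sample data) and compares the default placement with na_position='first'.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing age (illustrative only)
people = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'Dana'],
    'Age': [25, np.nan, 20, 30],
})

# Default behaviour: rows with NaN in 'Age' go to the end
nan_last = people.sort_values(by='Age')
print(nan_last)

# na_position='first' moves rows with NaN in 'Age' to the top
nan_first = people.sort_values(by='Age', na_position='first')
print(nan_first)
```

In both results the non-missing ages stay in ascending order; only the position of Bob's row, which has no age, changes.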
# Sort the DataFrame by the 'Age' column where NaN values appear first
sorted_df_nan_first = df.sort_values(by='Age', na_position='first')
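print(sorted_df_nan_first)
Output:
      Name  Age
2  Charlie   20
0    Alice   25
1      Bob   30
Because the sample DataFrame contains no missing values, this output is identical to the default ascending sort; na_position only changes the result when the sort column contains NaN.
Modifying the DataFrame In-Place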
You can sort a DataFrame in-place by setting the inplace parameter to True. This will modify the original DataFrame instead of returning a new sorted DataFrame.
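One detail that is easy to miss: with inplace=True, sort_values() returns None, so assigning its result to a variable discards the data. The sketch below uses a hypothetical people DataFrame to show this, alongside the common alternative of reassigning the returned DataFrame. The ignore_index=True argument is an optional extra, assumed here to be available (pandas 1.0 or later), that renumbers the rows 0, 1, 2 after sorting.

```python
import pandas as pd

# Hypothetical sample data (illustrative only)
people = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                       'Age': [25, 30, 20]})

# In-place sort: the DataFrame itself is reordered and the call returns None
result = people.sort_values(by='Age', inplace=True)
print(result)
print(list(people['Name']))

# Alternative: keep the original untouched and reassign the sorted copy
original = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'],
                         'Age': [25, 30, 20]})
reassigned = original.sort_values(by='Age', ignore_index=True)
print(reassigned)
```

Reassigning is often easier to reason about in longer scripts because the unsorted data remains available under its original name.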
# Sort the DataFrame in-place by the 'Age' column
df.sort_values(by='Age', inplace=True)
print(df)
Output:
      Name  Age
2  Charlie   20
0    Alice   25
1      Bob   30
By following the examples in this tutorial, you now have a solid understanding of how to effectively sort data in a pandas DataFrame using Python. Happy coding!
