How to Select and Filter Data in Python Pandas
Python pandas library provides several methods for selecting and filtering data, such as loc, iloc, [ ] bracket operator, query, isin, between.

This article will guide you through the essential techniques and functions for data selection and filtering using pandas. Whether you need to extract specific rows or columns or apply conditional filtering, pandas has got you covered. Let’s dive in!
Table of Contents
1. Selecting Columns : [ ] operator, loc, iloc 2. Filtering Rows : [ ] operator, loc, iloc, isin, query, between, string methods 3. Updating Values : loc, iloc, replace
1. Selecting Columns
- [ ] Bracket operator : It allows to select one or multiple columns.
Example:
df[['column_label']]ordf[['column1', 'column2']]
# Selecting a single column
df[['Customer Country']]
# Selecting multiple columns
df[['Customer Country', 'Customer State']]
- loc[ ] : This accessor selects rows and columns by labels.
Example:
df.loc[row_label, column_label]
*** You can also use loc for slicing operations:
df.loc['row1_label':'row2_label' , 'column1_label':'column2_label']
# Using loc for label-based selection
df.loc[:, 'Customer Country':'Customer State']
# Using loc for label-based selection
df.loc[[0,1,2], 'Customer Country':'Customer State']
- iloc[ ] : This accessor selects rows and columns by integer location.
Example:
df.iloc[row_position, column_position]
*** You can also use iloc for slicing operations:
df.iloc['row1_position':'row2_position','col1_position':'col2_position']
# Using iloc for index-based selection
df.iloc[[0,1,2,3] , [3,4,5,6,7,8]]
# or
df.iloc[[0,1,2,3] , 3:9]
# Using iloc for index-based selection
df.iloc[:, 3:8]
2. Filtering Rows
- [ ] Bracket operator : It allows filtering rows based on a condition.
Example:
df[condition]
# Using [] bracket operator for filtering rows# Using [] bracket operator for filtering rows
condition = df['Order Quantity'] > 3
df[condition]
# or
df[df['Order Quantity'] > 3]
- loc[ ] : It filters rows by labels.
Example:
df.loc[condition]
# Using loc for filtering rows
condition = df['Order Quantity'] > 3
df.loc[condition]
# or
df.loc[df['Order Quantity'] > 3]
# Using loc for filtering rows
df.loc[df['Customer Country'] == 'United States']
- iloc() : It filters rows by integer positions.
# Using iloc for filtering rows
df.iloc[[0, 2, 4]]
# Using iloc for filtering rows
df.iloc[:3, :2]
first 3 rows and first 2 columns
- isin([ ]) : It is used to filter data based on a list.
Example:
df[df['column_name'].isin(['value1', 'value2'])]
# Using isin for filtering rows
df[df['Customer Country'].isin(['United States', 'Puerto Rico'])]
# Filter rows based on values in a list and select spesific columns
df[["Customer Id", "Order Region"]][df['Order Region'].isin(['Central America', 'Caribbean'])]
# Using NOT isin for filtering rows
df[~df['Customer Country'].isin(['United States'])]
- query() : This method is used to select data based on a SQL-like expression.
Example:
df.query('condition')
In case your column names contain spaces or special characters, first you should use the rename() function to rename them.
# Rename the columns before performing the query
df.rename(columns={'Order Quantity' : 'Order_Quantity', "Customer Fname" : "Customer_Fname"}, inplace=True)# Using query for filtering rows with a single condition
df.query('Order_Quantity > 3')
# Using query for filtering rows with multiple conditions
df.query('Order_Quantity > 3 and Customer_Fname == "Mary"')
- between() : Filters rows based on values that fall within a specified range.
Example:
df[df['column_name'].between(start, end)]
# Filter rows based on values within a range
df[df['Order Quantity'].between(3, 5)]
- String methods : Filters rows based on string matching conditions.
Example:
str.startswith(),str.endswith(),str.contains()
# Using str.startswith() for filtering rows
df[df['Category Name'].str.startswith('Cardio')]
# Using str.contains() for filtering rows
df[df['Customer Segment'].str.contains('Office')]
3. Updating Values
- loc[ ] : This accessor selects specific rows and columns in the DataFrame and assigns new values.
# Update values in a column based on a condition
df.loc[df['Customer Country'] == 'United States', 'Customer Country'] = 'USA'
- iloc[ ] : This accessor selects specific rows and columns in the DataFrame and assigns new values.
# Update values in a column based on a condition
df.iloc[df['Order Quantity'] > 3, 15] = 'greater than 3'
# or
condition = df['Order Quantity'] > 3
df.iloc[condition, 15] = 'greater than 3'
- replace() : It replaces specific values in a DataFrame with new values.
Ex:
df.['column_name'].replace(old_value, new_value, inplace=True)
# Replace specific values in a column
df['Order Quantity'].replace(5, 'equals 5', inplace=True)
Conclusion
Python pandas provides several functions and techniques for selecting and filtering data within a DataFrame. By mastering these techniques, you’ll be well-equipped to explore and analyze your data effectively.
Remember, practice makes perfect. Thank you for reading!
Check my previous post: Data Cleaning: Most Helpful Python Pandas Methods
Let’s connect on Linkedin.
