Python Fundamentals


10 Python Code Snippets That Solve Tasks Efficiently

Python Pandas Efficiency for Data Manipulation

Pandas is a go-to library for data manipulation in Python. In this article, we’ll explore 10 Pandas code snippets that handle tasks commonly encountered in data analysis, from cleaning and filtering to merging, aggregating, and reshaping. Each snippet is accompanied by a short practical example.


Section 1: Introduction to the Example Dataset

import pandas as pd

# Creating a basic example dataset
data = {
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [25, 30, 22, 35, 28],
    'Salary': [50000, 60000, 45000, 70000, 55000],
    'Department': ['HR', 'Finance', 'IT', 'Marketing', 'HR']
}

df = pd.DataFrame(data)
print(df)

Section 2: 10 Efficient Pandas Code Snippets

2.1 Remove Rows with Missing Values

# Remove rows with missing values
df_cleaned = df.dropna()
print(df_cleaned)
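Since `dropna()` only has a visible effect when values are actually missing, here is a minimal sketch on a hypothetical DataFrame named `df_missing` (the name and its values are illustrative assumptions, not part of the example dataset). It also shows the `subset` parameter, which limits the check to chosen columns.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps, used only to illustrate dropna()
df_missing = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, np.nan, 22],
    'Salary': [50000, 60000, np.nan],
})

# Drop every row that has a missing value in any column
all_complete = df_missing.dropna()

# Drop rows only when 'Age' is missing; gaps in other columns are kept
age_known = df_missing.dropna(subset=['Age'])

print(all_complete)
print(age_known)
```

Restricting the check with `subset` is often preferable when only a few columns are required for the analysis, because it avoids discarding rows over gaps that do not matter.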

2.2 Select Columns Based on Data Types

# Select columns based on data types
numeric_columns = df.select_dtypes(include='number').columns
print(df[numeric_columns])

2.3 Filter Rows Based on Multiple Conditions

# Filter rows based on multiple conditions
filtered_df = df[(df['Age'] > 25) & (df['Department'] == 'HR')]
print(filtered_df)
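The same kind of filter can be written in other ways. The sketch below, which rebuilds a reduced copy of the example data so it runs on its own, shows `DataFrame.query()` for a readable condition string and `Series.isin()` for matching several values at once. The variable names `hr_over_25` and `finance_or_it` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David', 'Emily'],
    'Age': [25, 30, 22, 35, 28],
    'Department': ['HR', 'Finance', 'IT', 'Marketing', 'HR'],
})

# Same condition as the boolean mask, expressed as a query string
hr_over_25 = df.query("Age > 25 and Department == 'HR'")

# isin() keeps rows whose value matches any entry in the list
finance_or_it = df[df['Department'].isin(['Finance', 'IT'])]

print(hr_over_25)
print(finance_or_it)
```

When combining boolean masks directly, each condition needs its own parentheses because `&` and `|` bind more tightly than comparison operators.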

2.4 Convert String Column to DateTime

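# Convert string column to DateTime
# The example dataset has no 'JoinDate' column, so add illustrative date strings first
df['JoinDate'] = ['2020-01-15', '2019-06-01', '2021-03-20', '2018-11-05', '2022-02-10']
df['JoinDate'] = pd.to_datetime(df['JoinDate'], format='%Y-%m-%d')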
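Real-world date columns often contain values that do not match the expected format. As a minimal sketch, assuming a hypothetical Series of date strings named `raw_dates`, passing `errors='coerce'` converts unparseable entries to `NaT` instead of raising an error.

```python
import pandas as pd

# Hypothetical date strings; the last one is deliberately invalid
raw_dates = pd.Series(['2021-03-15', '2020-07-01', 'not a date'])

# errors='coerce' turns unparseable values into NaT instead of raising
parsed = pd.to_datetime(raw_dates, format='%Y-%m-%d', errors='coerce')

# Once the values are datetimes, the .dt accessor exposes date parts
years = parsed.dt.year

print(parsed)
print(years)
```

Checking `parsed.isna()` afterwards is a simple way to find the rows that failed to parse.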

2.5 Calculate Row-wise Sum

# Calculate row-wise sum (numeric columns only; text columns such as 'Department' cannot be summed)
df['Total'] = df[['Age', 'Salary']].sum(axis=1)

2.6 Merge DataFrames on a Specific Column

# Merge DataFrames on a specific column
other_data = {
    'Department': ['HR', 'Finance', 'IT'],
    'Location': ['City1', 'City2', 'City3']
}

other_df = pd.DataFrame(other_data)
merged_df = pd.merge(df, other_df, on='Department', how='left')
print(merged_df)
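With a left merge, rows that find no match in the right-hand table get `NaN` in the new columns. A minimal sketch of how to spot them, using two small hypothetical tables (`employees` and `locations`) and the `indicator=True` option of `pd.merge`, which adds a `_merge` column describing where each row came from:

```python
import pandas as pd

# Small illustrative tables; 'Marketing' has no entry in locations
employees = pd.DataFrame({
    'Name': ['Alice', 'David'],
    'Department': ['HR', 'Marketing'],
})
locations = pd.DataFrame({
    'Department': ['HR', 'Finance'],
    'Location': ['City1', 'City2'],
})

# indicator=True adds a '_merge' column: 'both', 'left_only' or 'right_only'
checked = pd.merge(employees, locations, on='Department', how='left', indicator=True)

# Rows present only in the left table found no matching department
unmatched = checked[checked['_merge'] == 'left_only']

print(checked)
print(unmatched)
```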

2.7 Apply a Function Element-wise

# Apply a function element-wise
df['Bonus'] = df['Salary'].apply(lambda x: x * 0.1)
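For simple arithmetic, `apply` with a lambda is usually not needed: operating on the whole column at once gives the same result and avoids a Python-level call per element. A minimal sketch on a standalone Series (the names `bonus_apply` and `bonus_vectorized` are illustrative):

```python
import pandas as pd

salaries = pd.Series([50000, 60000, 45000], name='Salary')

# Element-wise via apply: the lambda is called once per value
bonus_apply = salaries.apply(lambda x: x * 0.1)

# Vectorized: one operation over the whole column
bonus_vectorized = salaries * 0.1

print(bonus_apply)
print(bonus_vectorized)
```

`apply` remains useful when the per-element logic cannot be expressed with built-in column operations.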

2.8 Create Dummy Variables for Categorical Column

# Create dummy variables for a categorical column
df_dummies = pd.get_dummies(df['Department'], prefix='Department')
print(df_dummies)
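The dummy columns are returned as a separate DataFrame, so a common next step is to attach them to the original data. A minimal sketch on a reduced copy of the example data, using `pd.concat`; the `dtype=int` argument is a choice made here to get 0/1 values rather than booleans.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Emily'],
    'Department': ['HR', 'Finance', 'HR'],
})

# One indicator column per department, encoded as 0/1 integers
dummies = pd.get_dummies(df['Department'], prefix='Department', dtype=int)

# Replace the original text column with its indicator columns
encoded = pd.concat([df.drop(columns='Department'), dummies], axis=1)

print(encoded)
```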

2.9 Group by and Aggregate with Multiple Functions

# Group by and aggregate with multiple functions
grouped_df = df.groupby('Department').agg({'Salary': ['mean', 'sum'], 'Age': 'max'})
print(grouped_df)
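Passing a dictionary with a list of functions produces a two-level column index, which can be awkward to work with later. As an alternative sketch, named aggregation gives each result a flat, explicit column name. The output names `mean_salary`, `total_salary` and `max_age` are illustrative choices.

```python
import pandas as pd

df = pd.DataFrame({
    'Age': [25, 30, 28],
    'Salary': [50000, 60000, 55000],
    'Department': ['HR', 'Finance', 'HR'],
})

# Named aggregation: output_name=(source_column, function)
summary = df.groupby('Department').agg(
    mean_salary=('Salary', 'mean'),
    total_salary=('Salary', 'sum'),
    max_age=('Age', 'max'),
)

print(summary)
```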

2.10 Reshape DataFrame with Melt

# Reshape DataFrame with melt
melted_df = pd.melt(df, id_vars=['Name'], value_vars=['Age', 'Salary'], var_name='Attribute', value_name='Value')
print(melted_df)
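Melting converts wide data to long format; `pivot` reverses the operation. A minimal sketch of the round trip on a reduced copy of the example data (the names `long_df` and `wide_df` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'Salary': [50000, 60000],
})

# Wide to long: one row per (Name, Attribute) pair
long_df = pd.melt(df, id_vars=['Name'], value_vars=['Age', 'Salary'],
                  var_name='Attribute', value_name='Value')

# Long back to wide: one column per distinct Attribute
wide_df = long_df.pivot(index='Name', columns='Attribute', values='Value').reset_index()

print(long_df)
print(wide_df)
```

`pivot` requires each index and column combination to be unique; when duplicates are possible, `pivot_table` with an aggregation function is the usual alternative.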

Conclusion:

These 10 Pandas code snippets offer efficient solutions to common data manipulation tasks. Whether you’re cleaning data, selecting columns, filtering rows, or reshaping your DataFrame, Pandas provides concise and powerful tools. As you integrate these snippets into your data analysis workflows, you’ll experience the streamlined efficiency that Pandas brings to the table.
