Unlocking the Power of Python for Geospatial Analysis with GeoPandas and Folium
A comprehensive guide on how to use these libraries for mapping and analyzing geographical data
In today’s world, location-based information is becoming increasingly important. Whether you are an urban planner, an environmental scientist, or simply someone interested in visualizing your travel history, geospatial analysis can help uncover valuable insights from spatial datasets.
This article introduces two powerful open-source tools, GeoPandas and Folium, which make working with geospatial data in Python much easier.
Part I: Getting Started with GeoPandas
What is GeoPandas?
GeoPandas is a Python library used for manipulating and analyzing geospatial data. It combines the capabilities of pandas, a popular library for data manipulation, with those of shapely, a library for handling geometric objects such as points, lines, and polygons.
By doing so, it allows users to perform complex operations on spatial data using familiar pandas methods such as merge() and join(), while also providing methods specific to geographic features, such as buffering, spatial joins, and reprojection.
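To make that combination concrete, here is a minimal sketch that builds a tiny GeoDataFrame in memory, joins an ordinary attribute table to it with merge(), and then uses a geometry-aware property. The zone names, coordinates, and trip counts are made up purely for illustration.

```python
import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Two hypothetical zones; the coordinates are invented for this example.
zones = gpd.GeoDataFrame(
    {"zone_id": [1, 2], "name": ["A", "B"]},
    geometry=[
        Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]),  # 2 x 2 square
        Polygon([(2, 0), (3, 0), (3, 1), (2, 1)]),  # 1 x 1 square
    ],
)

# A plain pandas table, joined with the familiar merge() method.
trips = pd.DataFrame({"zone_id": [1, 2], "trips": [120, 45]})
zones = zones.merge(trips, on="zone_id")

# Geometry-aware addition from GeoPandas: planar area of each polygon.
zones["area"] = zones.geometry.area
zones["trips_per_area"] = zones["trips"] / zones["area"]

print(zones[["name", "trips", "area", "trips_per_area"]])
```

The result of merge() is still a GeoDataFrame, so the geometry column and its spatial methods remain available after the join.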
Installing GeoPandas
To install GeoPandas along with its dependencies Fiona, Shapely, and pyproj, run the following command (the leading ! is for notebook cells such as Jupyter or Colab; omit it in a terminal):
!pip install geopandas fiona shapely pyproj
Reading Spatial Data into GeoDataFrame
Once installed, we can read various file formats, including ESRI Shapefile (.shp) and GeoJSON (.geojson), directly into a GeoDataFrame, much as one would load a CSV file into a regular DataFrame using pd.read_csv(). Here's an example using the New York City Taxi Zone dataset available at https://data.cityofnewyork.us/:
import geopandas as gpd
taxi_zones = gpd.read_file('taxi_zone_lookup.shp')
taxi_zones.head()
This will display the first few rows of our new GeoDataFrame containing both geometry and attribute columns.
Basic Operations on Geometry Column
Now let’s see some basic functionalities offered by GeoPandas over traditional pandas. We’ll find out the number of zones intersecting with Manhattan Borough:
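from shapely.geometry import box
manhattan_borough_bounds = box(-74.25, 40.7, -73.75, 40.85)  # rough bbox as (minx, miny, maxx, maxy), longitude first; not used below
nyc_boundary = gpd.read_file("nybb.shp")  # load the NYC borough boundary shapefile
manhattan = nyc_boundary[nyc_boundary["BoroName"] == "Manhattan"]  # filter the Manhattan borough (nybb.shp names this column BoroName)
manhattan = manhattan.to_crs(taxi_zones.crs)  # both layers must share a CRS before overlaying
intersection = gpd.overlay(taxi_zones, manhattan, how="intersection")  # parts of taxi zones that fall inside Manhattan
len(intersection)
Output: 26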
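In this run, 26 taxi zones lie partially or fully within Manhattan. The exact count depends on the dataset version and on whether thin slivers along the borough boundary are counted.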
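To see what an intersection actually returns, it can help to try it on shapes small enough to check by hand. The following sketch uses made-up squares rather than the NYC files, and compares a boolean intersects() test with gpd.overlay(), which clips the geometries. The CRS is arbitrary and only set so that both layers match.

```python
import geopandas as gpd
from shapely.geometry import box

# box() takes (minx, miny, maxx, maxy): x before y.
zones = gpd.GeoDataFrame(
    {"zone": ["inside", "straddling", "outside"]},
    geometry=[box(0, 0, 1, 1), box(1.5, 0, 2.5, 1), box(3, 0, 4, 1)],
    crs="EPSG:3857",
)
borough = gpd.GeoDataFrame(
    {"borough": ["demo"]},
    geometry=[box(0, 0, 2, 2)],
    crs="EPSG:3857",
)

# Boolean test: which zones share any part with the borough polygon?
touching = zones[zones.intersects(borough.geometry.iloc[0])]

# Overlay: the clipped pieces of each zone that fall inside the borough,
# carrying the attributes of both layers.
pieces = gpd.overlay(zones, borough, how="intersection")

print(len(touching), len(pieces))
print(sorted(pieces.geometry.area.tolist()))
```

Here both approaches find two zones, but the overlay returns only the clipped part of the straddling zone, which is why its area is half that of the fully contained one.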
Part II: Visualization with Folium
While GeoPandas provides excellent support for geospatial computations, when it comes to interactive web maps, Folium stands out due to its simplicity and flexibility. Let us explore how to create choropleth maps using this package.
Creating Choropleth Maps
Firstly, ensure folium is installed via pip:
!pip install folium
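Next, create a choropleth map of median household income across the USA. The boundary file loaded below contains state polygons, so the income values need to be keyed by state: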
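import json
import folium
import pandas as pd
import requests

url = 'https://raw.githubusercontent.com/python-visualization/folium/master/examples/data/'
state_polygon_url = url + 'us-states.json'
state_geoms = json.loads(requests.get(state_polygon_url).text)  # GeoJSON of US state boundaries
# assuming one row per state with columns 'state' (two-letter code) and 'per capita'
income_df = pd.read_csv('median_household_income_by_zip.csv')
m = folium.Map([39, -98], zoom_start=4)
# folium.Choropleth replaces the older m.choropleth() method; it is called once for the whole table, not once per row
# to use explicit breakpoints such as 35000 and 45000, pass bins=[...] covering the full data range
folium.Choropleth(
    geo_data=state_geoms,
    data=income_df,
    columns=['state', 'per capita'],  # [key column, value column]
    key_on='feature.id',  # us-states.json stores the state code in each feature's id
    fill_color='YlGn',  # a ColorBrewer palette name, not a single colour such as 'green'
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Median Household Income ($)'
).add_to(m)
m
Conclusion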
With just a little knowledge of Python programming, anyone can unlock the potential hidden inside geospatial data through GeoPandas and Folium. With their ease of use and robust functionality, they have become indispensable tools among GIS professionals and enthusiasts alike.





