Geocode with Python
How to convert physical addresses to geographic locations → latitude and longitude
Datasets are rarely complete and often require pre-processing. Imagine a dataset that has only an address column, with no latitude and longitude columns to represent the data geographically. In that case, you need to convert the addresses into a geographic format. The process of converting addresses to geographic coordinates (latitude and longitude) so that they can be mapped is called geocoding.
Geocoding is the computational process of transforming a physical address description to a location on the Earth’s surface (spatial representation in numerical coordinates) — Wikipedia
In this tutorial, I will show you how to perform geocoding in Python with the help of the Geopy and Geopandas libraries. Let us install these libraries with pip, assuming you already have an Anaconda environment set up.
pip install geopandas
pip install geopy
If you would rather not install the libraries and prefer to work directly with the Jupyter notebook that accompanies this tutorial, there is a GitHub link with MyBinder at the bottom of this article. MyBinder provides a containerised environment that lets you experiment with this tutorial directly on the web without any installation. The dataset is also included in this environment, so there is no need to download it.
Geocoding Single Address
To geolocate a single address, you can use the Geopy Python library. Geopy supports several geocoding services that you can choose from, including Google Maps, ArcGIS, AzureMaps, Bing, etc. Some of them require API keys, while others do not.
![](https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*I9Miy4XzfFIewVJe5GclRg.png)
As our first example, we use the Nominatim geocoding service, which is built on top of OpenStreetMap data. Let us geocode a single address, the Eiffel Tower in Paris.
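from geopy.geocoders import Nominatim
# user_agent identifies your application to the Nominatim service
locator = Nominatim(user_agent="myGeocoder")
location = locator.geocode("Champ de Mars, Paris, France")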
We create locator, which holds the geocoding service, Nominatim. Then we call the locator's geocode method with any address, in this example the address of the Eiffel Tower.
Now, we can print out the coordinates of the location we have created.
print("Latitude = {}, Longitude = {}".format(location.latitude, location.longitude))
Latitude = 48.85614465, Longitude = 2.29782039332223
Try some different addresses of your own. In the next section, we will cover how to geocode many addresses from a Pandas DataFrame.
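When you try your own addresses, keep in mind that geopy's geocode method returns None when the service finds no match, so reading location.latitude directly would raise an AttributeError. Here is a minimal sketch of a guard. Note that format_coordinates is a hypothetical helper name used for illustration, not part of geopy:

```python
def format_coordinates(location):
    """Return a printable coordinate string, or a notice when there is no match."""
    # geopy geocoders return None when the service finds no result,
    # so check before touching .latitude / .longitude.
    if location is None:
        return "No match found"
    return "Latitude = {}, Longitude = {}".format(
        location.latitude, location.longitude
    )
```

With the locator created earlier, you could call print(format_coordinates(locator.geocode("Champ de Mars, Paris, France"))).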
Geocoding addresses from Pandas
Let us read the dataset for this tutorial. We use an example dataset of store addresses. The CSV file is available in this link.
Download the CSV file and read it with Pandas.
import pandas as pd
df = pd.read_csv("addresses.csv")
df.head()
The following table shows the first five rows of the DataFrame. As you can see, there are no latitude and longitude columns to map the data.
![](https://cdn-images-1.readmedium.com/v2/resize:fit:800/1*85umjG3K78b9EeUMEKxwFA.png)
We concatenate address columns into one that is appropriate for geocoding. For example, the first address is:
Karlaplan 13,115 20,STOCKHOLM,Stockholms län, Sweden
We can join the address columns in pandas like this to create a single address column for geocoding:
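The exact column names of the CSV are not listed in the text, so the sketch below uses hypothetical names (street, postcode, city, county) on a one-row stand-in DataFrame called sample. Replace them with the real column names from your file. It shows one way to build a comma-separated address string with plain string concatenation:

```python
import pandas as pd

# Stand-in data with hypothetical column names; use your CSV's real columns.
sample = pd.DataFrame({
    "street": ["Karlaplan 13"],
    "postcode": ["115 20"],
    "city": ["STOCKHOLM"],
    "county": ["Stockholms län"],
})

# Cast each part to str so numeric columns (e.g. postcodes) concatenate cleanly,
# then append the country, which is the same for every row.
sample["ADDRESS"] = (
    sample["street"].astype(str) + ","
    + sample["postcode"].astype(str) + ","
    + sample["city"].astype(str) + ","
    + sample["county"].astype(str) + ", Sweden"
)

print(sample["ADDRESS"].iloc[0])
```

Adding the country to every address gives the geocoder more context, which tends to reduce ambiguous matches.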
Once we create the address column, we can start geocoding, as in the code snippet below.
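A rough sketch of this step, under a few assumptions: the combined column is named ADDRESS, and the helper names build_nominatim_geocoder and geocode_addresses are hypothetical, chosen for illustration. It relies on geopy's RateLimiter to pause between requests, since the public Nominatim server allows at most one request per second:

```python
import pandas as pd


def build_nominatim_geocoder(min_delay_seconds=1.0):
    """Return a rate-limited Nominatim geocode callable (needs geopy installed)."""
    from geopy.extra.rate_limiter import RateLimiter
    from geopy.geocoders import Nominatim

    locator = Nominatim(user_agent="myGeocoder")
    # Wait between calls so we stay within the service's request limit.
    return RateLimiter(locator.geocode, min_delay_seconds=min_delay_seconds)


def geocode_addresses(df, geocode, address_col="ADDRESS"):
    """Return a copy of df with latitude and longitude columns added."""
    out = df.copy()
    # One geocoding call per row; each result is a location object or None.
    locations = out[address_col].apply(geocode)
    nan = float("nan")
    # Unmatched addresses have no coordinates, so they fall back to NaN.
    out["latitude"] = locations.apply(lambda loc: getattr(loc, "latitude", nan))
    out["longitude"] = locations.apply(lambda loc: getattr(loc, "longitude", nan))
    return out
```

You would then call geocode_addresses(df, build_nominatim_geocoder()). Passing the geocode callable in as an argument keeps the function independent of any single service, so you can swap in another geopy geocoder or a stub for offline testing.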