Dr. Shouke Wei

Summary

The article provides a step-by-step guide on how to read a dataset stored in Google Drive into a Python environment using the Pandas library.

Abstract

The article titled "How to Read Dataset from Google Drive with Python Pandas" offers a convenient method for accessing datasets stored in Google Drive. It begins by instructing readers to create a shareable link for their dataset with viewer access. The article then guides users through the process of setting up their Python environment, including installing the Pandas library, importing it, and creating a variable with the shared link. It details how to extract the file ID from the URL and construct a direct read URL for the dataset. Finally, it demonstrates how to use pd.read_csv to load the dataset into a Pandas DataFrame, assuming the dataset is in CSV format. The article also promotes an online course for those interested in deepening their knowledge of Python data analysis.

Opinions

  • The author assumes that storing datasets in Google Drive is a common practice among readers with data projects.
  • The author suggests that the method described is particularly useful for those who are not familiar with reading datasets from Google Drive.
  • The use of Jupyter notebook is recommended for demonstrating the reading process, indicating a preference for this IDE.
  • The article implies that readers may benefit from the author's online course to further their skills in Python data analysis and modeling.
  • The author encourages reader engagement by asking for support in the form of claps if the content is found helpful.

How to Read Dataset from Google Drive with Python Pandas

An easy and convenient method to read a dataset from Google Drive

In the previous article, I showed how to read a dataset from GitHub and save it locally. I also have some projects whose datasets are stored in Google Drive. In this article, we will see how to read a dataset from Google Drive. If you store datasets in Google Drive as I do, or plan to, but are not familiar with how to read them from your Drive, please follow the process in this article.

1. Create a Share Link

First, we should create a share link for the dataset. Go to the dataset that you want to share, right-click it, and choose Share from the menu.

Next, choose Anyone with the link under General access.

For the role, choose Viewer, or another role according to your needs.

Then, copy the link and paste it into any text file.

2. Open Python IDE

Open any Python IDE. Here I use a Jupyter notebook to demonstrate the reading process.

(1) install Pandas library

If you have not installed Pandas, just type the following command in a command-line shell or in a Jupyter notebook cell.

pip install pandas

(2) import pandas

import pandas as pd

(3) create url variable

Paste the shared link that we copied as the value of url.

url = 'https://drive.google.com/file/d/17LjfBwzFGcv7IHQ0KCLYQavLsR6aEert/view?usp=sharing'

(4) get the data file ID

It is 17LjfBwzFGcv7IHQ0KCLYQavLsR6aEert, but rather than copying it by hand, we use the url.split method to get it: after splitting the link on '/', the ID is the second element from the end.

file_id = url.split('/')[-2]
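The split above relies on the link ending in /view?usp=sharing. As a minimal sketch of a more defensive alternative, the ID can be matched with a regular expression instead. The helper name extract_file_id is hypothetical, and the second pattern (links of the form ...?id=<ID>) is an assumption about another link shape you may come across, not something covered in this article.

```python
import re

# Patterns for two link shapes. The first is the one used in this article;
# the second (…?id=<ID>) is an assumption about other Drive links.
_FILE_ID_PATTERNS = (
    re.compile(r"/file/d/([A-Za-z0-9_-]+)"),
    re.compile(r"[?&]id=([A-Za-z0-9_-]+)"),
)


def extract_file_id(share_url: str) -> str:
    """Return the file ID found in a Google Drive share link."""
    for pattern in _FILE_ID_PATTERNS:
        match = pattern.search(share_url)
        if match:
            return match.group(1)
    # Fail loudly instead of silently returning the wrong path segment
    raise ValueError(f"No Google Drive file ID found in: {share_url}")


url = 'https://drive.google.com/file/d/17LjfBwzFGcv7IHQ0KCLYQavLsR6aEert/view?usp=sharing'
file_id = extract_file_id(url)  # same ID as url.split('/')[-2] for this link
```

Unlike the index-based split, this version still finds the ID when the trailing /view part is missing, and it raises an error when no ID is present.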

(5) create the read url

The read (download) URL for a file in Google Drive is https://drive.google.com/uc?id= followed by the file ID.

read_url = 'https://drive.google.com/uc?id=' + file_id
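If you build read URLs for several files, the concatenation can be wrapped in a small function. The following is only a sketch: build_read_url is a hypothetical name, and using urlencode from the standard library (so that the ID is always safely encoded as a query parameter) is my own choice here.

```python
from urllib.parse import urlencode


def build_read_url(file_id: str) -> str:
    """Build the Google Drive read URL for a given file ID."""
    # urlencode produces "id=<file_id>" with any unsafe characters escaped
    return "https://drive.google.com/uc?" + urlencode({"id": file_id})


read_url = build_read_url("17LjfBwzFGcv7IHQ0KCLYQavLsR6aEert")
```

For the file ID used in this article, the result is identical to the string built by hand.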

(6) read the data

Finally, we can read the dataset using Pandas methods. In this example, pd.read_csv is used because the dataset in my Google Drive is in .csv format.

# read the data
df = pd.read_csv(read_url)

# display the first 5 rows
df.head()
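To reuse the whole process for other files, steps (4) to (6) can be collected into one function. This is a minimal sketch rather than a definitive implementation: the name read_drive_csv and the optional reader parameter are hypothetical, and it assumes a share link of the /file/d/<ID>/ form and a file in .csv format, as in this article.

```python
import re


def read_drive_csv(share_url, reader=None, **kwargs):
    """Read a CSV file from a Google Drive share link (hypothetical helper)."""
    # Step (4): get the file ID from the share link
    match = re.search(r"/d/([\w-]+)", share_url)
    if match is None:
        raise ValueError(f"No file ID found in: {share_url}")
    # Step (5): create the read URL
    read_url = "https://drive.google.com/uc?id=" + match.group(1)
    # Step (6): read the data, with pd.read_csv as the default reader
    if reader is None:
        import pandas as pd  # imported here so pandas is only needed when used
        reader = pd.read_csv
    return reader(read_url, **kwargs)
```

A call such as df = read_drive_csv(url) then returns the same DataFrame as the steps above. Extra keyword arguments, for example nrows=5, are passed on to pd.read_csv.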

3. Online Course

If you are interested in learning Python data analysis in detail, you are welcome to enroll in one of my courses.

If you like this post or find it helpful, please do not forget to give it a clap to show your support. I appreciate your support!
