Himanshu Sharma

Summary

This article outlines how to use the Opendatasets library in Python to download datasets from Kaggle and Google Drive for use in Data Science projects.

Abstract

The article titled "Downloading datasets from Kaggle and G-Drive" discusses the importance of hands-on practice with diverse datasets for those embarking on a Data Science journey. It introduces Opendatasets, an open-source Python library that simplifies the downloading process of datasets, including well-known ones like Boston, Diabetes, MNIST, and Cifar-10. The article provides step-by-step instructions for installing the library using pip, importing it into a Python script, and downloading datasets from Kaggle, which requires a user's API key and username. It also mentions that the downloaded data is stored in the current working directory and encourages readers to try downloading different datasets, offering support for any difficulties encountered. The article concludes with a call to action for readers to connect with the author via email, LinkedIn, or GitHub, and to explore more of the author's work on Medium.

Opinions

  • The author emphasizes the necessity for Data Science practitioners to work with a variety of datasets to gain a deeper understanding of Machine Learning and Deep Learning.
  • Opendatasets is presented as a solution to the common problem of datasets not being easily accessible or downloadable.
  • The author suggests that using Opendatasets can enhance the learning process by providing easy access to a range of curated datasets.
  • The article is collaborative, acknowledging the contribution of Piyush Ingale, which implies a community-driven approach to sharing knowledge in the field of Data Science.
  • The author offers personal contact information and social profiles, indicating a willingness to engage with the community and support readers beyond the scope of the article.

Downloading datasets from Kaggle and G-Drive

Using Opendatasets for downloading different datasets

Photo by Mika Baumeister on Unsplash

When you start a Data Science journey, you should get as much hands-on practice as possible on different datasets. When I started mine, I began with a dataset named Boston, and I am sure most of you have used it if you are into Data Science. Similarly, there are many datasets that have been in use for a long time, such as the Diabetes dataset, MNIST, Cifar-10, etc.

To get a deeper understanding of Data Science branches like Machine Learning and Deep Learning, we should work on diverse datasets. The problem is that these datasets are often either not easily available or not easily downloadable.

Opendatasets is an open-source Python library that allows us to download datasets from Kaggle and Google Drive in a single line of code. It also includes some curated datasets that can be downloaded easily.

In this article, we will explore how to download datasets using Opendatasets and use them accordingly.

Let’s get started…

Installing required libraries

We will start by installing Opendatasets using pip. The command given below will do that.

!pip install opendatasets

Importing required libraries

In this step, we will import the required library, Opendatasets, along with pandas for loading the data later.

import opendatasets as od
import pandas as pd 

Downloading Datasets

To download datasets from Kaggle, we need our Kaggle username and an API key. The API key can be downloaded from the Kaggle account settings as a JSON file that contains the username and the key. Let’s see how we can download different datasets.
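Before downloading, it can help to confirm that the JSON file from Kaggle really holds both values. The snippet below is a minimal sketch, not part of Opendatasets: the helper name load_kaggle_credentials is made up for illustration, and it assumes the file is named kaggle.json and uses the fields "username" and "key" as described above.

```python
import json
from pathlib import Path


def load_kaggle_credentials(path="kaggle.json"):
    """Read a Kaggle API token file and return (username, key).

    Hypothetical helper for illustration; assumes the JSON file
    contains the fields "username" and "key".
    """
    creds = json.loads(Path(path).read_text(encoding="utf-8"))
    # Collect any required field that is absent or empty
    missing = [field for field in ("username", "key") if not creds.get(field)]
    if missing:
        raise ValueError(f"{path} is missing fields: {', '.join(missing)}")
    return creds["username"], creds["key"]
```

Calling load_kaggle_credentials() in the folder that holds kaggle.json gives you the two values to type in when prompted. As far as I know, Opendatasets can also pick up a kaggle.json placed in the working directory by itself, but treat that as an assumption and check the library's README.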

import opendatasets as od

dataset_url = 'https://www.kaggle.com/deepankurk/flight-take-off-data-jfk-airport'
# Downloads the files to disk; od.download does not return a DataFrame
od.download(dataset_url)
Dataset (Source: By Author)

When you run this code, it asks for your Kaggle username and API key and then downloads the data into your current working directory. You can then load the data and use it as needed. Similarly, you can download any dataset from Kaggle just by passing its URL.

Go ahead and try downloading different datasets. If you run into any difficulty, please let me know in the responses section.
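Loading the downloaded files could look like the sketch below. It assumes that the data lands in a folder named after the last part of the dataset URL and that the dataset ships as CSV files; both are assumptions to verify against your own download. The helper names dataset_dir_from_url, find_csv_files and load_first_csv are hypothetical and not part of Opendatasets.

```python
from pathlib import Path
from urllib.parse import urlparse


def dataset_dir_from_url(dataset_url, data_dir="."):
    """Guess the download folder: data_dir / last path segment of the URL.

    Assumption: the folder is named after the final part of the URL.
    """
    slug = urlparse(dataset_url).path.rstrip("/").split("/")[-1]
    return Path(data_dir) / slug


def find_csv_files(directory):
    """Return all CSV files below `directory`, sorted by path."""
    return sorted(Path(directory).rglob("*.csv"))


def load_first_csv(directory):
    """Load the first CSV file found under `directory` with pandas."""
    import pandas as pd  # imported here so the helpers above need only the stdlib

    csv_files = find_csv_files(directory)
    if not csv_files:
        raise FileNotFoundError(f"No CSV files found under {directory}")
    return pd.read_csv(csv_files[0])
```

With the dataset from this article, df = load_first_csv(dataset_dir_from_url(dataset_url)) would give a DataFrame to explore with df.head(), provided the folder and file layout match the assumptions above.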

This article is in collaboration with Piyush Ingale.

Before You Go

Thanks for reading! If you want to get in touch with me, feel free to reach me at [email protected] or through my LinkedIn profile. You can view my GitHub profile for different data science projects and package tutorials. Also, feel free to explore my profile and read the other articles I have written on Data Science.
