Manpreet Singh

Summary

This web page provides instructions on how to scrape tweets using Python, specifically with the Twint package, which allows for scraping without using Twitter's API.

Abstract

The web page titled "How To Scrape Tweets With Python" introduces the Twint package, a powerful tool for scraping tweets without using Twitter's API. The author provides a brief introduction to Python for those who are new to the programming language. The article then explains how to install the Twint package using pip or by cloning the repository from GitHub. The author also provides examples of Python scripts that demonstrate how to scrape tweets containing specific keywords and pull tweets from specific users. The article ends with a call to action, encouraging readers to explore the Twint package further and share their thoughts on its usage.

Opinions

  • The author believes that Python is an awesome programming language with a ton of capability.
  • The author recommends using the Twint package for scraping tweets due to its ability to scrape without using Twitter's API, allowing for unrestricted data collection.
  • The author suggests that readers check out the Twint GitHub repository for more information on the package.
  • The author encourages readers to share their thoughts and plans for using the Twint package.
  • The author provides a link to their Twitter account for further contact and connection.
  • The author also shares a link to their favorite resources for learning programming.

How To Scrape Tweets With Python

Welcome back! Python is an awesome programming language with a ton of capability. If you're new to Python, check out the link below to learn more about it:

So, let's take a look at an awesome way to scrape tweets with Python! This method uses the Twint package. Here is a link to its GitHub repository:

This tool allows us to scrape tweets without using the Twitter API, so we can collect as much data as we want without the API's restrictions (most of the time). To start using this package, install it with the following pip command:

pip3 install twint

Alternatively, you can clone the repository and install from source using the following commands:

git clone --depth=1 https://github.com/twintproject/twint.git
cd twint
pip3 install . -r requirements.txt
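
Before moving on, it can help to confirm that the install actually landed in the Python environment you plan to run scripts from. The snippet below is a minimal sketch using only the standard library; the helper name is_installed is hypothetical and not part of Twint.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the named top-level package can be found for import."""
    # find_spec returns None when no top-level package with that name is importable
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # "twint" is the import name used in the example scripts in this article
    print("twint installed:", is_installed("twint"))
```

If this prints False, the pip command was probably run against a different Python interpreter than the one running the script.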

At this point, you can either use the CLI or import the package into a Python script. Here is an example script that searches a specific user's tweets for a specific keyword:

import twint

# Configure
c = twint.Config()
c.Username = "realDonaldTrump"
c.Search = "great"

# Run
twint.run.Search(c)

The output of this command would be the following:

Here is another example that targets a different username, pulls 10 tweets from that user, and stores the results as CSV:

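import twint

# Configure
c = twint.Config()

c.Username = "noneprivacy"
# Custom output fields: only the tweet ID and the user bio
c.Custom["tweet"] = ["id"]
c.Custom["user"] = ["bio"]
# Number of tweets to pull
c.Limit = 10
# Save the results as CSV under the output name "none"
c.Store_csv = True
c.Output = "none"

# Run
twint.run.Search(c)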
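
Once the CSV has been written, you will probably want to load it back into Python. Here is a rough sketch using the standard csv module. It assumes the file has a header row with an "id" column (matching the custom tweet field set above). The function name load_tweet_ids and the path passed to it are hypothetical, and the exact location Twint writes to depends on the value of Output, so check where the file lands on your machine first.

```python
import csv


def load_tweet_ids(csv_path, column="id"):
    """Return the non-empty values of one column from a CSV file with a header row."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        # DictReader uses the first row as the column names
        reader = csv.DictReader(handle)
        # Skip rows where the column is missing or empty
        return [row[column] for row in reader if row.get(column)]
```

You would call it with the path to the generated file, for example load_tweet_ids("path/to/your/output.csv"), and get back a list of tweet IDs as strings.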

I would highly recommend checking out the GitHub repository (linked above) to learn even more about this package. Do you plan on using it? I would love to hear your thoughts!

Thanks So Much!

If you have any suggestions, thoughts, or just want to connect, feel free to contact/follow me on Twitter! Also, below is a link to some of my favorite resources for learning programming:

Thanks so much for your support!
