The PyCoach



Web Scraping for Data Science — Is it legal?

Keep this in mind when scraping websites for your data science projects.

Photo by Tingey Injury Law Firm on Unsplash

As a data scientist, you won’t always have a clean dataset ready to use for new projects. In most cases, the first step in a data science project is extracting data from websites with scraping tools that imitate human browsing behavior.

However, many websites see these scraping tools as a threat. Companies like Facebook have objected to web scraping, even when it is done for research purposes.

To start your data science project on the right foot, we’ll look at the scenarios in which scraping data is likely to be considered legal or illegal.

Web Scraping for research projects

Projects with research purposes, such as sentiment analysis, fake news detection, or predictive modeling, might seem harmless, but they can still alarm a company if you obtained the data from its website without permission.

Let’s consider what happened between Facebook and New York University as an example.

On October 16, 2020, Facebook wrote a letter demanding that a New York University (NYU) research project stop collecting data without permission. The NYU project tracked how political ads were targeted by collecting ad targeting data from users. Facebook wrote:

“Scraping tools, no matter how well-intentioned, are not a permissible means of collecting information from us. We understand the intent behind your tool. However, the browser plugin scrapes information in violation of our terms, which are designed to protect people’s privacy.”

Although companies like Facebook don’t allow web scraping, scraping is rarely treated as a serious issue when it is done for research. In the UK, for example, legislation introduced in 2014 defines clear exceptions to copyright for non-commercial research, so web scraping is legal for researchers there.

That being said, Facebook’s response to the New York University project may simply have been an overreaction in the wake of the Cambridge Analytica scandal. In fact, Facebook seemed to back down from its threat to shut down the project after receiving public criticism.

Web Scraping for commercial purposes

When you scrape data from websites, you’re accessing data that might be protected by copyright. If you scrape websites and use the data obtained for commercial purposes, you could get into trouble.

Let’s consider the HiQ and LinkedIn case as an example.

HiQ is a data analytics firm that provides business intelligence based on publicly available data scraped from LinkedIn. LinkedIn objected to this scraping and invoked the Computer Fraud and Abuse Act (CFAA) in a cease-and-desist letter to HiQ.

Although the US Court of Appeals denied LinkedIn’s request, saying that scraping public sites does not violate the CFAA, it should be noted that the decision doesn’t grant HiQ the freedom to use the scraped data for commercial purposes.

Some of the techniques HiQ used for scraping LinkedIn and avoiding IP bans are explained in the article “3 Simple Ways For Web Scraping Without Getting Blocked” (https://readmedium.com/3-simple-ways-for-web-scraping-without-getting-blocked-34f3b1f885d1).

So projects with commercial purposes, like selling scraped data, are not safe. You can, however, make a scraping project profitable without selling any data, as the next section explains.

Web Scraping for trading projects

Trading involves buying and selling things. The goal is to profit from the difference between the buying price and the selling price.

Web scraping for trading usually works in the following way.

Some people build a scraper that returns the price of a specific product, so when the price drops, the program automatically buys the product before it’s sold out. Once the demand for the product grows and the price rises, they resell the product to make a profit.
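The decision logic behind such a scraper can be sketched in a few lines. This is only a minimal illustration, not the approach of any particular trader: the HTML snippet, the `parse_price` and `should_buy` names, and the target price are all hypothetical, and a real scraper would fetch the page and place the order itself.

```python
import re
from typing import Optional

# Hypothetical HTML snippet standing in for a scraped product page.
SAMPLE_HTML = '<span class="price">$1,249.99</span>'


def parse_price(html: str) -> Optional[float]:
    """Return the first dollar amount found in the HTML, or None if there is none."""
    match = re.search(r"\$\s*(\d[\d,]*(?:\.\d+)?)", html)
    if match is None:
        return None
    # Drop thousands separators before converting to a number.
    return float(match.group(1).replace(",", ""))


def should_buy(price: Optional[float], target_price: float) -> bool:
    """Signal a purchase only when a price was found and it is at or below the target."""
    return price is not None and price <= target_price


price = parse_price(SAMPLE_HTML)
print(price, should_buy(price, target_price=1300.0))
```

Keeping the parsing and the buy decision in separate functions makes it easy to test the trigger without touching a live website.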

One of the best types of trading, however, is arbitrage, because it relies only on the price difference between two or more markets: you buy in one market and simultaneously sell in another market at a higher price.
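To make the idea concrete, here is a rough sketch of how scraped prices from several markets could be compared. The market names, prices, and the flat `fee_rate` charged on both legs are assumptions made for illustration; real markets have their own fee structures, and prices can move before both trades complete.

```python
from typing import Dict, Optional, Tuple


def arbitrage_profit(buy_price: float, sell_price: float, fee_rate: float = 0.0) -> float:
    """Per-unit profit after a proportional fee on both the buy and the sell leg."""
    cost = buy_price * (1 + fee_rate)
    revenue = sell_price * (1 - fee_rate)
    return revenue - cost


def find_arbitrage(
    prices: Dict[str, float], fee_rate: float = 0.0
) -> Optional[Tuple[str, str, float]]:
    """Return (buy_market, sell_market, profit_per_unit), or None if no trade is profitable."""
    if len(prices) < 2:
        return None
    # The cheapest market is the best place to buy, the most expensive the best place to sell.
    buy_market = min(prices, key=prices.get)
    sell_market = max(prices, key=prices.get)
    profit = arbitrage_profit(prices[buy_market], prices[sell_market], fee_rate)
    return (buy_market, sell_market, profit) if profit > 0 else None


# Hypothetical prices for the same asset scraped from two markets.
scraped = {"market_a": 100.0, "market_b": 103.0}
print(find_arbitrage(scraped, fee_rate=0.01))
print(find_arbitrage(scraped, fee_rate=0.02))  # fees wipe out the spread here
```

Including fees in the calculation matters: a price gap that looks profitable can disappear once both legs of the trade are charged.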

Arbitrage occurs in exchange rates, sports betting, cryptocurrencies and more. On top of that, arbitrage trading is legal in the United States and is encouraged, as it contributes to market efficiency.

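If you like the idea of making extra money with arbitrage trading, check the article “How to Make Money From Web Scraping Without Selling Data” (https://readmedium.com/how-to-make-money-from-web-scraping-without-selling-data-92c1f961b25).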

Conclusion

In this article, we analyzed the legality of web scraping. We found that research projects shouldn’t run into legal issues, especially in countries like the UK, where legislation makes web scraping legal for researchers. Projects with commercial purposes, however, might infringe copyright. That being said, it’s still possible to profit from scraped data through arbitrage trading.

Above all, it seems that what you’re going to do with the data determines whether scraping a website is legal.
