SoylentNews is people

posted by martyb on Wednesday July 31 2019, @06:36AM
from the Arachnophobia dept.

"Come on, I worked so hard on this project! And this is publicly accessible data! There's certainly a way around this, right? Or else, I did all of this for nothing... Sigh..."

Yep - this is what I said to myself, just after realizing that my ambitious data analysis project could get me into hot water. I intended to deploy a large-scale web crawler to collect data from multiple high profile websites. And then I was planning to publish the results of my analysis for the benefit of everybody. Pretty noble, right? Yes, but also pretty risky.

Interestingly, I've been seeing more and more projects like mine lately. And even more tutorials encouraging some form of web scraping or crawling. But what troubles me is the appalling widespread ignorance on the legal aspect of it.

So this is what this post is all about - understanding the possible consequences of web scraping and crawling. Hopefully, this will help you to avoid any potential problem.

Disclaimer: I'm not a lawyer. I'm simply a programmer who happens to be interested in this topic. You should seek out appropriate professional advice regarding your specific situation.

https://benbernardblog.com/web-scraping-and-crawling-are-perfectly-legal-right/



 
This discussion has been archived. No new comments can be posted.
  • (Score: 3, Insightful) by jbruchon on Thursday August 01 2019, @03:14AM

    by jbruchon (4473) on Thursday August 01 2019, @03:14AM (#873849)

    It's amazing how you got "make a copy (which violates the owner's copyright)" out of "send an HTTP request to the owner for a copy and receive a copy from the owner (you know, the person who's explicitly allowed to make copies and send them to people)." Not a lawyer? If you are, I'd predict that you won't be for long. Next thing you know, you'll be arguing that ad blocking is theft because somehow it's illegal to NOT go get other data after you have already been given a copy of the data by the owner of the data.

    --
    I'm just here to listen to the latest song about butts.