“Web scraping” involves the use of software to collect data from the internet, which can then be sold to other users. See link to Lexology above.
The Ninth Circuit Court of Appeals recently held in hiQ Labs, Inc. v. LinkedIn Corp., No. 17-16783, that LinkedIn cannot restrict automated software from collecting public information on its website. The GDPR takes the opposite approach from the United States with regard to data privacy protection.
The GDPR requires a lawful basis for every collection of personal data. Article 6(1) recognizes six such bases. Of these six, only consent (which is very unlikely in the case of webscraping) and legitimate interests, often described as a legitimate business purpose, are likely to apply. See Article 6(1)(f) of the GDPR and Recital 47.
Under the GDPR, a legitimate business purpose can be simply economic gain, but such a reason will generally not stand if the data subject objects to the collection of the data. The GDPR also builds into its architecture a duty of transparency on the part of the data controller. See GDPR Articles 12-13 generally; and specifically for webscraping issues, Article 14. Pursuant to the duty of transparency, information must be given to the data subject in a concise, transparent, and easily accessible form (Article 12). Article 13 lists the information owed when personal data is collected directly from the data subject. Finally, if the personal data was not collected directly from the data subject, as is the case with webscraping, the webscraper must provide notice to the data subject based on the requirements in Article 14.
Accordingly, webscrapers must notify every data subject whose data they scrape that they have done so and explain how the data subject can have their data removed from the webscraper’s control. This duty can be excused if providing notice would involve a disproportionate effort. That exemption from notification was recently challenged in Poland. In UODO (Poland’s Supervisory Authority) v. Bisnode, the UODO had no sympathy for the webscraper: it imposed a fine of approximately €220,000 and required Bisnode to notify approximately 6 million data subjects, which would have cost Bisnode approximately €8 million. Bisnode chose to delete the data instead, although it is unclear whether it has done so given its pending right to appeal and its stated intent to appeal all the way to the Court of Justice. While an English version of the decision could not be found at the time of writing, it seems as though the UODO was troubled by the notice placed on Bisnode’s website, which informed website users that data had been scraped by Bisnode. Unfortunately for Bisnode, if you did not know your data had been scraped in the first place, how would you know to go to Bisnode’s site to see the notice? Clearly, Bisnode was aware of the GDPR violation and chose to rely on the disproportionate effort exemption. Bisnode gambled and lost, at least so far. Implicit in the decision, however, was that webscraping is not necessarily prohibited in the EU under the GDPR.