Sunday 26 April 2009

The Ethics of Scraping

I’ve recently been given a commission to do a piece of coding that involves grabbing information from another website and displaying it in a different manner. Obviously I won’t say much more about the brief – I don’t want to annoy the customer or expose myself!

The brief got me wondering – how legal is it to grab content from another website? I guessed it was probably another one of the ‘grey areas’ that surround the internets at the moment. However, another issue concerned me slightly more… how ethical is it? My script essentially steals information from a website and uses it in ways that the original publisher has no control over. Since the website in question publishes no open API or content feeds for this purpose, surely they don’t want this information scraped from them? The website being scraped is essentially just a collection of many different items, each published by its own individual owner. Do those owners want their information scraped and re-published?

I’ve heard of another case from a friend (again, I won’t get into details) in which a company is keeping a tight grip on its data. This is sort of understandable, but again the website in question is aggregating many different people’s data. Is the website being anti-competitive by not releasing APIs?

I will sleep better once this script is out of development and off my server – I know I wouldn’t like it if someone else was doing this to my website! However – the customer is always right, and I won’t disappoint!
