Category Archives: Web Scraping
The World Wide Web holds enormous amounts of data. Unfortunately, most of that data is not offered as a direct download. Web scraping works around this by harvesting data from the pages that websites already serve. In practice, web scraping is nothing exotic and is generally legal. Web browsers rely on the Hypertext Transfer Protocol (HTTP) to fetch data, and so does web scraping. The difference is that, instead of simply displaying the page, the user retrieves it, selects the parts of interest, and extracts content and data that were intended for browser display. This article shows how web scraping works and presents tools available in the R programming language for both manual and automated web scraping.
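The retrieve, select and extract steps described above can be sketched in a few lines of R. This is only a minimal illustration, assuming the rvest package is installed; the page content, the `#prices` table id and the variable names are made up for the example. To keep the sketch runnable offline, the HTML is given as a string rather than fetched over HTTP.

```r
library(rvest)

# Hypothetical page source. In a real run, this text would arrive via an
# HTTP GET request, for example read_html("https://example.com/prices")
# (that URL is a placeholder, not a real data source).
page_source <- '<html><body>
  <h1>Fruit prices</h1>
  <table id="prices">
    <tr><th>item</th><th>price</th></tr>
    <tr><td>apple</td><td>1.20</td></tr>
    <tr><td>pear</td><td>0.95</td></tr>
  </table>
</body></html>'

# Retrieve: parse the HTML into a document that can be queried
page <- read_html(page_source)

# Select: pick out nodes with CSS selectors
heading_node <- html_element(page, "h1")
table_node <- html_element(page, "#prices")

# Extract: turn the selected nodes into plain R values
heading <- html_text2(heading_node)
prices <- html_table(table_node)
```

Here `heading` ends up as a character string and `prices` as a data frame with one row per fruit. Swapping the string for a URL in `read_html()` is what turns the sketch into an actual scrape.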