The Java Web Scraping Handbook

A step-by-step guide to web scraping

Web scraping, or crawling, is the art of fetching data from a third-party website by downloading and parsing its HTML code to extract the data you want. It is often tricky: malformed HTML, heavy JavaScript use, and anti-bot techniques all get in the way.
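To make the download, parse, extract cycle concrete, here is a minimal sketch that uses only the JDK (Java 11 or later). The class name TitleScraper, the helper methods fetch and extractTitle, and the default URL https://example.com are assumptions chosen for illustration. The regular expression is a deliberate simplification: it is enough to pull out a page title, but real-world HTML usually calls for a proper HTML parser.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TitleScraper {

    // Matches the contents of the first <title> element,
    // case-insensitively and across line breaks.
    // Assumption: a regex is good enough for this one tag; it is not a general HTML parser.
    private static final Pattern TITLE =
            Pattern.compile("<title[^>]*>(.*?)</title>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Step 1: download the raw HTML of the page as a string.
    static String fetch(String url) throws Exception {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        return client.send(request, HttpResponse.BodyHandlers.ofString()).body();
    }

    // Steps 2 and 3: scan the HTML and extract the piece of data we want.
    static Optional<String> extractTitle(String html) {
        Matcher m = TITLE.matcher(html);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical default target; pass another URL as the first argument.
        String url = args.length > 0 ? args[0] : "https://example.com";
        System.out.println(extractTitle(fetch(url)).orElse("(no title found)"));
    }
}
```

Keeping the download step separate from the extraction step means the extraction logic can be tried against a saved HTML string without touching the network.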

Many companies use it for competitor price monitoring, news aggregation, lead generation, and more.

This book will teach you how to extract data from any website, deal with AJAX- and JavaScript-heavy websites, break CAPTCHAs, deploy your scrapers in the cloud, and apply many other advanced techniques.