RoboBrowser

Scrape the web while browsing



This is a WebKit-powered browser built for web scraping.

It loads the requested webpage, saves the page source to disk, and passes

the saved file's path to a PHP script as the first parameter. You have to

write your own script to collect data from each page. A sample script is

included with the application.



The application also supports JavaScript injection into the loaded webpage

to automate clicks. You can use the preloaded jQuery framework to simulate

clicks and mouse events to create a headless setup.
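
As a rough illustration of the kind of script that could be injected, the
sketch below uses jQuery to fire mouse events on the first element matching
a selector and then clicks it. The function name clickFirstMatch and the
selector 'a.next-page' are hypothetical placeholders, not part of the
application. It also assumes the preloaded jQuery is reachable as
window.jQuery on the loaded page.

```javascript
// Hypothetical helper: simulate a mouse click on the first matching element.
// `$` is passed in explicitly so the function works with whichever jQuery
// instance the page exposes.
function clickFirstMatch($, selector) {
  var target = $(selector).first();
  if (target.length === 0) {
    return false; // nothing to click on this page
  }
  // Fire the mouse events that normally precede a click, in order.
  target.trigger('mouseover');
  target.trigger('mousedown');
  target.trigger('mouseup');
  // Use the native DOM click so that links are actually followed;
  // jQuery's .trigger('click') runs handlers but does not navigate links.
  target[0].click();
  return true;
}

// Browser-only entry point (assumes jQuery is exposed as window.jQuery).
if (typeof window !== 'undefined' && window.jQuery) {
  clickFirstMatch(window.jQuery, 'a.next-page'); // placeholder selector
}
```

The helper returns false when nothing matches, so an injected script can
tell when there is nothing left to click, for example on the last page of
a listing.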



##################################################



Pros:

+ You can watch the scraper and take action when it stalls

+ Solve captchas yourself or enter your account credentials

when a server asks for them

+ Browse webpages like any regular visitor, using

all available web standards. Servers can't tell

whether you're a bot or a human.



Cons:

- Slower than headless spiders

- Uses more system resources

- Not designed for headless execution



Known Issues:

- Due to a renderer bug in CEF, scrollbars are drawn in red.

This is purely visual and doesn't affect the application's behaviour.



- The application sometimes crashes while shutting down. In that case

you have to terminate it manually from the process manager.



- If the application won't start, make sure no other instance

is running and delete the contents of the "cache" folder.



##################################################



Might be useful for scraping results from heavily protected

data sources such as search engines, live stock information,

sports betting results, etc.



##################################################

It's designed for personal use. If you need extra features,

contact me at root@psychip.net



Armagan Corlu aka Psychip

http://psychip.net

Aug 2016