The Process

It’s pretty simple really. I probably could have explained this all in the headline. Let me break it down:

Go to the target site and navigate to the target resource

Look at the traffic in Burp Suite

I really don’t know why I chose this for the example.

Then do this thing (right-click the request in the lower part of the window and pick "Copy as curl command")

You read that correctly.

This is where I cheat and use this neato curl-to-ruby tool.

The product so far is something like this:
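As a rough sketch of what a curl-to-ruby conversion tends to look like: plain Net::HTTP with the copied headers set one by one. This is not the exact output from my request. The URL, header values, and cookie below are placeholders, and I've wrapped things in two small helper methods (`build_request` and `fetch`, names of my own choosing) so the pieces are easier to see.

```ruby
require 'net/http'
require 'uri'

# Placeholder target; swap in the URL from your captured request.
TARGET_URL = 'https://example.com/target/resource'

# Build the GET request with the headers copied from the proxy.
# Every header value here is made up for illustration.
def build_request(uri)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = 'Mozilla/5.0 (X11; Linux x86_64)'
  request['Accept'] = 'text/html'
  request['Cookie'] = 'session=PLACEHOLDER'
  request
end

# Send the request and return the Net::HTTPResponse.
def fetch(url)
  uri = URI.parse(url)
  Net::HTTP.start(uri.hostname, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(build_request(uri))
  end
end

# Usage (hits the network):
#   response = fetch(TARGET_URL)
#   puts response.code
#   puts response.body
```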

Technically, you could now parse the response with something like Nokogiri, but it seems almost silly to mix methods like that. I like Mechanize, so I typically use the generated code as a template and convert it. So yeah, something like this:

It’s beautiful, really.

You may be thinking “why the hell did we do that curl-to-ruby thing at all?” Perfectly understandable, I’ll excuse your rudeness. When you start scraping sites with frustrating technologies under the hood, like .NET (shudder), you start getting ugly POST bodies like this:

This is a tame example.
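If you want to see what's actually inside one of these bodies, decoding it helps. The sketch below uses a shortened, made-up body with the usual ASP.NET WebForms hidden fields (`__VIEWSTATE`, `__EVENTVALIDATION`); the values and the `ctl00$Main$txtSearch` field name are invented, and `decode_body` and `split_fields` are hypothetical helpers.

```ruby
require 'uri'

# A shortened, made-up WebForms-style POST body. Real ones are far longer.
RAW_BODY = '__EVENTTARGET=&__VIEWSTATE=%2FwEPDwUKLTEyMzQ1Njc4OQ%3D%3D' \
           '&__EVENTVALIDATION=%2FwEWAgKPlaceholder&ctl00%24Main%24txtSearch=ruby'

# Decode a URL-encoded body into a name => value hash.
def decode_body(body)
  URI.decode_www_form(body).to_h
end

# Split the fields into framework state (names starting with "__") and
# the fields a person actually filled in.
def split_fields(fields)
  fields.partition { |name, _| name.start_with?('__') }.map(&:to_h)
end

state, user_input = split_fields(decode_body(RAW_BODY))
puts "framework fields: #{state.keys.join(', ')}"
puts "user fields:      #{user_input.keys.join(', ')}"
```

Most of the bulk is framework state; the part you care about is usually a field or two.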

Converting to a friendlier library like Mechanize makes sense from a mental health perspective, but you probably also want something you can maintain that doesn’t consist of 2000+ lines of copy/pasted code. Ah, yeah, I guess both of those are the same point.