In this post I’m going to explore web scraping in Rust through a basic Hacker News CLI. My hope is to point out resources for future Rustaceans interested in web scraping and to highlight Rust’s viability as a scripting language for everyday use.

Scraping Ecosystem

Typically, when faced with a web scraping task, most people don’t reach for a low-level systems programming language. Given the relative simplicity of scraping, it would appear to be overkill. However, Rust makes this process fairly painless.

The main libraries, or crates, I’ll be utilizing are the following:

reqwest: An easy and powerful Rust HTTP Client

scraper: HTML parsing and querying with CSS selectors

select.rs: A Rust library to extract useful data from HTML documents, suitable for web scraping.

I’ll present a couple of different scripts to get a feel for each crate.

Grabbing All Links

The first script will perform a fairly basic task: grabbing all links from the page. For this, we’ll utilize reqwest and select.rs. As you can see, the syntax is fairly concise and straightforward.