Web scraping is a technique for retrieving data from websites. You fetch a page’s contents, then extract the data you need so you can process it, save it, or simply display it in your app. It comes in handy when the site you want to scrape does not expose a public API. Note that some sites do not allow scraping, so check whether the site permits it before you attempt it.

In Node.js, we can fetch the webpage with an HTTP client such as axios, and use cheerio to extract the data we need from the page.

Here, I will walk through the process of:

Fetching a webpage

Extracting data from a webpage

Displaying the content on a webpage

Saving the data in JSON format

Fetching the webpage

The site we will be scraping is remoteok, a job board that lists remote jobs along with their tags, company names, and categories.

We will use axios to fetch the page, but first we need to install our dependencies:

$ mkdir scraper && cd scraper

$ npm init -y

$ npm install --save axios cheerio

And use it like so:

index.js

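const axios = require("axios");
const cheerio = require("cheerio");

const siteUrl = "https://remoteok.io/";

// Fetch the page and load its HTML into cheerio so we can query it
const fetchData = async () => {
  const result = await axios.get(siteUrl);
  return cheerio.load(result.data);
};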
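
The fetch above assumes the request always succeeds. As a rough sketch of how a failed request could be handled, here is a variant that I am calling fetchPage (a hypothetical name, not part of the original script). The optional client parameter and the 10-second timeout are my own assumptions: the parameter only exists so the function can be tried with a stub instead of a live site, and it falls back to axios when omitted.

```javascript
// Sketch only: `fetchPage` and its `client` parameter are hypothetical
// additions for illustration, not part of the original script.
const fetchPage = async (url, client) => {
  // Fall back to axios when no client is passed (assumes axios is installed).
  const http = client || require("axios");
  try {
    // 10000 ms is an arbitrary timeout picked for this example.
    const result = await http.get(url, { timeout: 10000 });
    return result.data; // the raw HTML string
  } catch (error) {
    // axios sets `error.response` when the server replied with a non-2xx status;
    // it is undefined for network errors and timeouts.
    const status = error.response ? error.response.status : "no response";
    throw new Error(`Failed to fetch ${url}: ${status}`);
  }
};
```

The HTML string it returns can then be passed to cheerio.load, the same way fetchData does with result.data.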

Extracting data from a webpage