

For one of my projects I needed a tool to automatically monitor specific values (number of comments, downloads, etc.) on web pages. After seeing this article on Hacker News I thought it would be interesting to write the tool myself with NodeJS.

Features

Monitor the value of a specific element based on a CSS Selector

Alert for changes on said element

Default URL/selector, or custom ones passed as command line arguments

Prerequisites

NodeJS & NPM

A URL

A CSS Selector (see below how to find it)

Setup

jsdom

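In order to fetch the page’s HTML and select the element by a CSS Selector we are going to use jsdom. jsdom can parse HTML from a string, a file or, in our case, a URL, and then expose a window object similar to the one we have in the browser.

The code in this post uses the callback-based jsdom.env() function, which require("jsdom") only exposes in jsdom versions before 10, so let’s install the 9.x release line by running:

    $ npm install jsdom@9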

CSS Selector

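Finding a unique CSS Selector for a page’s element is quite easy with our browser’s developer tools. In Chrome, for example, right-click the element and choose Inspect, then right-click the highlighted node in the Elements panel and choose Copy > Copy selector.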

Code

Now that we have everything we need, create index.js and let’s jump into the code.

jsdom & variables

First we will include jsdom and declare our basic variables:

index.js

    const jsdom = require("jsdom");

    const delay = 10; // seconds between checks
    let url = 'https://news.ycombinator.com/news';
    let ccsSelector = '#score_13635230';
    let previousValue;

checkValue function

Now let’s create a function that gets the HTML and finds the value we want.

We will pass two arguments to jsdom.env(): the HTML’s source (our URL) and a callback function. In the callback we have access to the window object, so it’s easy to extract the value with native DOM methods.

index.js

    function checkValue(){
        jsdom.env(
            url,
            function (err, window) {
                const newValue = window.document.querySelector(ccsSelector).textContent;
            }
        );
    }

Handling the new value

Now that we have the value we have to decide what to do with it. There are 3 possibilities:

It’s the first time we checked the value, so let’s print it

The value changed since the last time we checked it, so let’s print both the old and the new value

The value didn’t change, so do nothing

Let’s write this as code and add it to our callback:

index.js

    function checkValue(){
        jsdom.env(
            url,
            function (err, window) {
                const newValue = window.document.querySelector(ccsSelector).textContent;

                if( typeof previousValue==='undefined' ){
                    console.log(`The value is ${newValue}`);
                }
                else if( previousValue!==newValue ){
                    console.log(`Value changed from ${previousValue} to ${newValue}`);
                }

                previousValue = newValue;
            }
        );
    }
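The three cases can also be pulled out of the callback into a small pure function, which makes the decision easy to check without fetching a page. This is only a sketch: describeChange is a hypothetical helper name, not something in the original index.js.

```javascript
// Hypothetical helper (not part of the original index.js).
// Returns the message to print, or null when nothing should be printed.
function describeChange(previousValue, newValue) {
    if (typeof previousValue === 'undefined') {
        // First check: there is nothing to compare against yet.
        return `The value is ${newValue}`;
    }
    if (previousValue !== newValue) {
        // The value differs from the one seen on the previous check.
        return `Value changed from ${previousValue} to ${newValue}`;
    }
    // Unchanged: stay silent.
    return null;
}
```

Inside the callback you would then print the result only when it is not null, and still assign previousValue = newValue afterwards.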

Exiting on error

If jsdom returns an error, let’s print it and stop the program:

index.js - checkValue()

    ...
    function (err, window) {
        if( err ){
            console.log(err);
            process.exit();
        }

        const newValue = window.document.querySelector(ccsSelector).textContent;
    ...
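One case the err check does not cover: querySelector() returns null when no element matches the selector, and reading .textContent from null throws a TypeError. A minimal sketch of a guard, assuming a hypothetical helper called readText that accepts any object with a querySelector() method:

```javascript
// Hypothetical helper (not part of the original index.js).
// `doc` is anything with a querySelector() method, such as
// window.document inside the jsdom callback.
function readText(doc, selector) {
    const element = doc.querySelector(selector);
    if (element === null) {
        // No element matched the selector.
        return null;
    }
    return element.textContent;
}
```

In the callback, readText(window.document, ccsSelector) could replace the direct querySelector call, with a null result handled the same way as err (print a message and exit).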

Repeating the check

The checkValue function is ready, so let’s make it run repeatedly with setInterval().

setInterval() waits for the specified delay before the first call, so we will invoke the function manually the first time.

index.js

    ...
    checkValue();
    setInterval(checkValue, delay*1000);
    ...
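If you find yourself repeating this "call once, then on a timer" pattern, it can be wrapped in a tiny function. This is just a sketch; runNowAndRepeat is a made-up name, and it takes the delay in seconds to match the delay variable above.

```javascript
// Hypothetical helper (not part of the original index.js).
// Runs `fn` once right away, then again every `delaySeconds` seconds.
// Returns the timer so the caller can stop it with clearInterval().
function runNowAndRepeat(fn, delaySeconds) {
    fn();
    return setInterval(fn, delaySeconds * 1000);
}
```

With it, the two lines above would become runNowAndRepeat(checkValue, delay).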

Our code so far

index.js

    const jsdom = require("jsdom");

    const delay = 10; // seconds between checks
    let url = 'https://news.ycombinator.com/news';
    let ccsSelector = '#score_13635230';
    let previousValue;

    checkValue();
    setInterval(checkValue, delay*1000);

    function checkValue(){
        jsdom.env(
            url,
            function (err, window) {
                if( err ){
                    console.log(err);
                    process.exit();
                }

                const newValue = window.document.querySelector(ccsSelector).textContent;

                if( typeof previousValue==='undefined' ){
                    console.log(`The value is ${newValue}`);
                }
                else if( previousValue!==newValue ){
                    console.log(`Value changed from ${previousValue} to ${newValue}`);
                }

                previousValue = newValue;
            }
        );
    }

Let’s run it with:

    $ node index

After a while you should see something like:

    The value is 326 points
    Value changed from 326 points to 327 points
    Value changed from 327 points to 328 points
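Note that the values are compared as text ("326 points"), which is all the tool needs to detect a change. If you would rather work with the number itself, one possible approach is to pull the first run of digits out of the text. firstInteger below is a hypothetical helper, and it assumes the value contains a plain non-negative integer.

```javascript
// Hypothetical helper (not part of the original index.js).
// Returns the first run of digits in `text` as a number,
// or null when the text contains no digits at all.
function firstInteger(text) {
    const match = /\d+/.exec(text);
    return match ? Number(match[0]) : null;
}
```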

Checking if the URL contains the ‘http://’ prefix

In order for jsdom to treat our string as a URL and fetch the HTML, we have to make sure that it starts with the ‘http://’ or ‘https://’ prefix.

index.js

    ...
    let previousValue;

    if( url.indexOf('http')!==0 ){
        url = 'http://'+url;
    }

    checkValue();
    setInterval(checkValue, delay*1000);
    ...
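The indexOf('http') check only looks at the first four characters, so a host name that happens to start with "http" (httpbin.org, for example) is left without a prefix. A slightly stricter variant, sketched here as a hypothetical normalizeUrl function, matches the whole scheme instead:

```javascript
// Hypothetical helper (not part of the original index.js).
// Adds "http://" unless the input already starts with a full
// "http://" or "https://" scheme (case-insensitive).
function normalizeUrl(input) {
    if (/^https?:\/\//i.test(input)) {
        return input;
    }
    return 'http://' + input;
}
```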

Dynamic URL/selector from command line arguments

In Node we can find the arguments in the process.argv array. The first 2 values are the path to the node executable and the path to the script being run (node and index in the command node index), so our values will be in the next 2 items ([2] & [3]).

index.js

    ...
    let previousValue;

    if( process.argv.length==4 ){
        url = process.argv[2];
        ccsSelector = process.argv[3];
    }

    if( url.indexOf('http')!==0 ){
        url = 'http://'+url;
    ...

Now we can override the default URL/selector when starting the program:

    $ node index "https://example.com" "#someSelector"
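The same argument handling can be written as a function that takes the argv array and the defaults and returns the values to use, which keeps it independent of the real process.argv. This is a sketch with a made-up name (resolveTargets); it mirrors the rule above that both arguments must be given for the defaults to be overridden.

```javascript
// Hypothetical helper (not part of the original index.js).
// argv[0] is the node executable and argv[1] the script path,
// so user-supplied values start at index 2.
function resolveTargets(argv, defaults) {
    if (argv.length === 4) {
        return { url: argv[2], selector: argv[3] };
    }
    // Anything other than exactly two arguments falls back to the defaults.
    return { url: defaults.url, selector: defaults.selector };
}
```

In index.js it would be called as resolveTargets(process.argv, { url: url, selector: ccsSelector }).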

Final code

index.js

    const jsdom = require("jsdom");

    const delay = 10; // seconds between checks
    let url = 'https://news.ycombinator.com/news';
    let ccsSelector = '#score_13635230';
    let previousValue;

    // Override the defaults when a URL and a selector are both given
    if( process.argv.length==4 ){
        url = process.argv[2];
        ccsSelector = process.argv[3];
    }

    if( url.indexOf('http')!==0 ){
        url = 'http://'+url;
    }

    checkValue();
    setInterval(checkValue, delay*1000);

    function checkValue(){
        jsdom.env(
            url,
            function (err, window) {
                if( err ){
                    console.log(err);
                    process.exit();
                }

                const newValue = window.document.querySelector(ccsSelector).textContent;

                if( typeof previousValue==='undefined' ){
                    console.log(`The value is ${newValue}`);
                }
                else if( previousValue!==newValue ){
                    console.log(`Value changed from ${previousValue} to ${newValue}`);
                }

                previousValue = newValue;
            }
        );
    }

Closing

You can find the final code with comments on the GitHub repo. Future improvements will probably be a frontend that gets updated on changes via sockets (already under development) and simultaneous checking of many web pages.