I needed a project idea to try out some things I learned while reading Kyle Simpson’s Functional-Light JavaScript, and I’ve been on an Overwatch bender, so I thought, “hey, why not that?” I highly recommend checking that book out if you’re interested in learning some functional programming ideas and philosophies. It covers a lot of ground while actively trying not to intimidate the reader with big words and theory.

Background

In order to get player stats for an Overwatch account, you need to go to the player profile page. There is no Overwatch API provided by Blizzard, so if you want that information in something like JSON, you have to scrape the site yourself or use an existing third-party Overwatch API like OWAPI.

To get a nice JSON representation of a user’s stats, you have to:

1. figure out which URL to go to for a given user

2. pull down the HTML document from that URL

3. find the relevant stats data by using a tool like cheerio to parse the document, and organize it into a friendlier form, like a JavaScript object

These three tasks could all fit within a single function, or at least the first two.

Function Composition

Consider placing each task in its own function and chaining them such that the output of the first is the input to the second function and the second function’s output feeds into the third.

In this case, each function has a much smaller responsibility. buildUrl is only concerned with consuming player information like battletag and region and producing a URL string that points to that player’s resource. getPage simply takes a URL and returns a Promise that resolves to the text of the HTML page at the given URL. getStatsFromDocument is only concerned with consuming an HTML document and producing a new object with stats information from the given document. You can think of each of these functions as small building blocks that you can compose together to create a larger abstraction called getPlayerStats, which takes the same input as buildUrl and produces parsed stats data.
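Here is a rough sketch of how those three building blocks might look. The URL scheme (an example.com address, with the # in the battletag swapped for a dash) is made up for illustration, and the parser is a placeholder that only grabs the page title so the snippet runs without cheerio. getPage relies on the global fetch available in Node 18+ and browsers.

```javascript
// Hypothetical URL scheme, for illustration only. The real profile URL
// format has to be looked up on the site itself.
const buildUrl = ({ battletag, region }) =>
  `https://example.com/career/${region}/${battletag.replace('#', '-')}`;

// Takes a URL and returns a Promise that resolves to the page's HTML text.
const getPage = (url) =>
  fetch(url).then((response) => response.text());

// Placeholder parser: a real version would walk the document with a tool
// like cheerio. This one only extracts the <title> text.
const getStatsFromDocument = (html) => {
  const match = html.match(/<title>([^<]*)<\/title>/i);
  return { title: match ? match[1].trim() : null };
};

// Each output feeds the next input. The last step waits on the Promise.
const getPlayerStats = (player) =>
  getPage(buildUrl(player)).then(getStatsFromDocument);
```

Because getPage is asynchronous, getPlayerStats returns a Promise too, so a caller would use something like getPlayerStats(player).then(console.log).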

It turns out that chaining or piping functions together is a pretty common pattern in functional programming, so there are a few utilities you can use to compose functions generically, instead of writing a new function definition that chains specific functions together. compose and pipe are utilities that accept a variadic list of functions and produce a new function that chains the given functions together. The only difference between the two is the order in which the functions are executed. compose executes the functions from right to left, which mirrors the way functions look when they’re nested in code: compose(last, second, first) creates a function that runs last(second(first(x))). pipe reverses that order, or better said, it executes the functions in the same order they’re passed to pipe, much like UNIX pipes. pipe(buildUrl, getDataFromUrl, convertHTMLToObject) will execute buildUrl, pass the result to getDataFromUrl, and so on. If that’s more intuitive to you than compose, then go crazy.

Equivalent ways of writing a getPlayerStats function.
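As a minimal sketch, compose and pipe can each be written as a one-line reduce. The three step functions below are synchronous stand-ins (not the real network and parsing code) so the snippet runs on its own, and pipeAsync is a hypothetical variant for when a step such as getPage returns a Promise.

```javascript
// pipe runs the functions left to right; compose runs them right to left.
const pipe = (...fns) => (input) =>
  fns.reduce((acc, fn) => fn(acc), input);
const compose = (...fns) => (input) =>
  fns.reduceRight((acc, fn) => fn(acc), input);

// Promise-aware variant: each step waits for the previous one to resolve.
const pipeAsync = (...fns) => (input) =>
  fns.reduce((promise, fn) => promise.then(fn), Promise.resolve(input));

// Synchronous stand-ins for the three steps, for illustration only.
const buildUrl = ({ battletag, region }) => `/${region}/${battletag}`;
const getPage = (url) => `<html>${url}</html>`;
const getStatsFromDocument = (html) => ({ htmlLength: html.length });

// Three equivalent ways of writing getPlayerStats.
const getPlayerStatsNested = (player) =>
  getStatsFromDocument(getPage(buildUrl(player)));
const getPlayerStatsPiped = pipe(buildUrl, getPage, getStatsFromDocument);
const getPlayerStatsComposed = compose(getStatsFromDocument, getPage, buildUrl);
```

With a real, Promise-returning getPage, pipeAsync(buildUrl, getPage, getStatsFromDocument) would be the drop-in replacement for the piped version.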

Getting information from the HTML

The meat and potatoes of the scraper is figuring out how different pieces of data are laid out in the HTML and organizing them in a way that’s easy to work with programmatically.

The first portion of the page contains information like hero name, games won, profile image, and profile level. There’s not a lot of repeated structure here, so we can probably write a single function to handle all of that.

The second portion of the page divides information based on quickplay and competitive play, but each section uses the same layout, so we can probably use the same logic and pass in the quickplay or competitive container element as the context to grab data from. The second portion is further divided into a Top Heroes and a Career Stats section. The Top Heroes section displays a list of top hero stats categorized by a stat like Time Played or Games Won, whereas the Career Stats section displays a list of stats categorized by hero. Both of these sections use a select dropdown to filter the stats displayed. Luckily, selecting an option from the dropdown only changes which information is displayed on the page, meaning that all of the information we need exists somewhere on the page; it just isn’t visible.

Unfortunately, the element that contains all Top Heroes stats for the Time Played stat does not contain the name of the stats category it represents. Instead it has a data-category-id which contains some weird value like overwatch.guid.0x0860000000000021. Not super helpful. After a little more digging, it appears that options in the Top Heroes select dropdown have value attributes that match the data-category-id on the actual stats containers.

We can make a function that takes a select element and a guid and returns the name that guid represents.
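A minimal sketch of that lookup, with one big assumption: to keep it dependency-free, the select element is modeled as a plain array of { value, text } objects, one per option. With cheerio you would query the option elements instead. The second guid and both category pairings in the sample data are made up for illustration.

```javascript
// Stand-in for a parsed <select>: each entry mirrors one <option>'s value
// attribute and its visible text. Sample data is illustrative only.
const topHeroesSelect = [
  { value: 'overwatch.guid.0x0860000000000021', text: 'Time Played' },
  { value: 'overwatch.guid.example-games-won', text: 'Games Won' },
];

// Finds the option whose value matches the guid and returns its text,
// or undefined when no option matches.
const guidToText = (select, guid) => {
  const option = select.find((opt) => opt.value === guid);
  return option ? option.text : undefined;
};

const categoryName = guidToText(
  topHeroesSelect,
  'overwatch.guid.0x0860000000000021'
);
```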

When converting Top Heroes or Career Stats categories from guids to names, guidToText will most likely be called many times with the same select argument. We can curry guidToText, meaning that when we give it the select argument, it returns a function that takes a guid and then calls guidToText with the select given in the first function application and the guid given in the second. You can also give the returned function a name, like getTopHeroCategoryName when the curried guidToText is called with the Top Heroes select dropdown, and pass that function around.
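Currying by hand is just a function that returns a function. This sketch again models the select as a plain array of { value, text } option objects (an assumption to avoid needing a DOM library), and the guids are placeholders.

```javascript
// Stand-in for the Top Heroes <select>. Placeholder guids, illustrative only.
const topHeroesSelect = [
  { value: 'guid-time-played', text: 'Time Played' },
  { value: 'guid-games-won', text: 'Games Won' },
];

const guidToText = (select, guid) => {
  const option = select.find((opt) => opt.value === guid);
  return option ? option.text : undefined;
};

// First application fixes the select, second application supplies the guid.
const curriedGuidToText = (select) => (guid) => guidToText(select, guid);

// A named, unary lookup that can be passed around.
const getTopHeroCategoryName = curriedGuidToText(topHeroesSelect);

const categoryNames = ['guid-games-won', 'guid-time-played']
  .map(getTopHeroCategoryName);
```

Because the inner function only accepts a guid, the extra index and array arguments that map passes along are simply ignored.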

Since getTopHeroCategoryName is now a unary function, it’s easy to use with things like map, filter, and reduce.

const getCareerStatsCategoryName = curriedGuidToText(careerStatsSelect);

const careerGuids = ['0x02E0000000000004', '0x02E0000000000005', '0x02E0000000000007'];

// [Mercy, Hanzo, Reinhardt]
const heroNames = careerGuids.map(getCareerStatsCategoryName);

This new function can be used wherever a stats section needs to look up a category name from a dropdown. On the Overwatch stats page this happens in several places, so we can put that function to work again and again.

Now that we have a way to look up category names, we can turn each stats section into some JS data structure. Let’s start with Top Heroes.

One way to organize the Top Heroes data that might make sense is an object with a key for each stat category, where each value is a list of objects holding a hero and that hero’s value for the category.
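To make the target shape concrete, here is what such an object could look like. The heroes and numbers are invented sample values, not real stats.

```javascript
// Desired shape: category name -> list of { hero, value } objects.
// All values below are made-up sample data.
const topHeroes = {
  'Time Played': [
    { hero: 'Mercy', value: '12 hours' },
    { hero: 'Hanzo', value: '3 hours' },
  ],
  'Games Won': [
    { hero: 'Mercy', value: '20' },
    { hero: 'Hanzo', value: '4' },
  ],
};

// Looking up a category gives the ranked list for that stat.
const mostPlayedHero = topHeroes['Time Played'][0].hero;
```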

We can build a function called heroNodeToObject, which will take a DOM Node that contains the hero name and the value of the stat and create a JS object with hero and value keys.

For each Top Heroes category we can map the list of data nodes underneath that category container and get a list of objects for each category. This handles the value part of our desired output, but now we need a way to associate it with keys for the category names.
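A sketch of heroNodeToObject and the mapping step. Since the real markup isn't shown here, each DOM node is modeled as a plain object whose nameText and valueText fields (hypothetical names) hold the text of the child elements carrying the hero name and the stat value.

```javascript
// Stand-ins for the hero nodes under one category container.
// Field names and sample values are assumptions for illustration.
const timePlayedNodes = [
  { nameText: ' Mercy ', valueText: '12 hours' },
  { nameText: 'Hanzo', valueText: ' 3 hours ' },
];

// Turns one node into a { hero, value } object, trimming stray whitespace.
const heroNodeToObject = (node) => ({
  hero: node.nameText.trim(),
  value: node.valueText.trim(),
});

// One category's worth of data: a list of { hero, value } objects.
const timePlayed = timePlayedNodes.map(heroNodeToObject);
```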

We can grab all category containers used by Top Heroes and write a function called buildTopHeroesObject that reduces the list of all category containers into a single object where the category name is the key and the list of heroes is the value. We can also use our curried guidToText function to translate the data-category-id value on each category container into its proper name.
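Putting those pieces together might look something like the sketch below. As before, DOM elements are modeled as plain objects (categoryId standing in for the data-category-id attribute, heroNodes for the child nodes), the guids are placeholders, and making buildTopHeroesObject take the name lookup as its first argument is one possible design, not necessarily the one in the repo.

```javascript
// Stand-ins for the select dropdown and the category containers.
const topHeroesSelect = [
  { value: 'guid-time-played', text: 'Time Played' },
  { value: 'guid-games-won', text: 'Games Won' },
];
const categoryContainers = [
  {
    categoryId: 'guid-time-played',
    heroNodes: [
      { nameText: 'Mercy', valueText: '12 hours' },
      { nameText: 'Hanzo', valueText: '3 hours' },
    ],
  },
  {
    categoryId: 'guid-games-won',
    heroNodes: [{ nameText: 'Mercy', valueText: '20' }],
  },
];

const guidToText = (select, guid) => {
  const option = select.find((opt) => opt.value === guid);
  return option ? option.text : undefined;
};
const curriedGuidToText = (select) => (guid) => guidToText(select, guid);
const heroNodeToObject = (node) => ({
  hero: node.nameText,
  value: node.valueText,
});

// Reducer factory: given a name lookup, returns a reducer that adds one
// category (name -> hero list) to the accumulated object.
const buildTopHeroesObject = (getCategoryName) => (topHeroes, container) => ({
  ...topHeroes,
  [getCategoryName(container.categoryId)]:
    container.heroNodes.map(heroNodeToObject),
});

const topHeroes = categoryContainers.reduce(
  buildTopHeroesObject(curriedGuidToText(topHeroesSelect)),
  {}
);
```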

For Career Stats, it would be helpful to have an object that maps hero names to nested objects of stat names to stat values. This follows a similar pattern to the approach we took for Top Heroes. Start out by defining a function that extracts a stat name and value from a table row. We can call it makeHeroStatsPair, and it can take the first and second table data nodes (td) from a table row (tr). Given an HTML table for a career stats category, we can create a list of pairs of the form [stat_name, stat_value]. Ideally we want an object with key/value pairs, and lodash’s fromPairs does a nice job of turning our list of pairs into an object of key/value pairs.
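A small sketch of that step. Each td is modeled as a plain { text } object (an assumption standing in for a real DOM node), the stat names and values are invented, and the built-in Object.fromEntries stands in for lodash’s fromPairs, which does the same job for a list of [key, value] pairs.

```javascript
// Stand-in for the rows of one career stats table: each row is a pair of
// td-like objects. Sample data is made up.
const tableRows = [
  [{ text: ' Eliminations ' }, { text: '120' }],
  [{ text: 'Deaths' }, { text: ' 45 ' }],
];

// Takes the first and second cell of a row and returns [statName, statValue].
const makeHeroStatsPair = (nameCell, valueCell) => [
  nameCell.text.trim(),
  valueCell.text.trim(),
];

const pairs = tableRows.map(([nameCell, valueCell]) =>
  makeHeroStatsPair(nameCell, valueCell)
);

// Built-in equivalent of lodash's fromPairs for this use.
const heroStats = Object.fromEntries(pairs);
```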

From there we need to reduce all career stats categories into a single object that maps hero names to the stats objects built from makeHeroStatsPair’s pairs. Our reducer function, buildCareerStatsObject, follows a similar pattern: it looks up the name for the container’s data-category-id and sets that as a key on the careerStats object, with the value being the stats object for that category.
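The reducer could be sketched like this. The containers, rows, and cells are plain-object stand-ins for DOM nodes, rowsToStats is a hypothetical helper name, the stat names and values are invented, and Object.fromEntries again substitutes for lodash’s fromPairs.

```javascript
// Stand-ins for the Career Stats select and its category containers.
const careerStatsSelect = [
  { value: '0x02E0000000000004', text: 'Mercy' },
  { value: '0x02E0000000000005', text: 'Hanzo' },
];
const careerContainers = [
  {
    categoryId: '0x02E0000000000004',
    rows: [
      [{ text: 'Healing Done' }, { text: '9000' }],
      [{ text: 'Deaths' }, { text: '5' }],
    ],
  },
  {
    categoryId: '0x02E0000000000005',
    rows: [[{ text: 'Eliminations' }, { text: '30' }]],
  },
];

// Name lookup with the select already baked in.
const getHeroName = (guid) => {
  const option = careerStatsSelect.find((opt) => opt.value === guid);
  return option ? option.text : undefined;
};

const makeHeroStatsPair = (nameCell, valueCell) => [
  nameCell.text,
  valueCell.text,
];

// Hypothetical helper: turns a table's rows into a { statName: value } object.
const rowsToStats = (rows) =>
  Object.fromEntries(
    rows.map(([nameCell, valueCell]) => makeHeroStatsPair(nameCell, valueCell))
  );

// Reducer: adds one hero's stats object to the accumulated careerStats.
const buildCareerStatsObject = (careerStats, container) => ({
  ...careerStats,
  [getHeroName(container.categoryId)]: rowsToStats(container.rows),
});

const careerStats = careerContainers.reduce(buildCareerStatsObject, {});
```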

To put this all together you can check out the overwat repo on GitHub. PRs are welcome.

Also check out Functional-Light JavaScript.