Sometimes I'm curious about something on the web. Maybe it's a table with numbers and I'd like an arithmetic average of them. Or, in this case, someone says that "Project Euler isn't as maths-y as people say." Immediately I want to look at the titles of a random sample of a few Project Euler challenges to see how mathsy they really are. I could do all this manually, but I could also automate it because I'm a programmer.

Preparation

Since challenges on Project Euler are indexed by numbers 1 through 512, I know I need a bunch of random numbers to pick out random challenges. System.Random to the rescue!

    import Control.Monad
    import System.Random

    main = do
      numbers <- take 10 . randomRs (1,512) <$> getStdGen
      forM_ numbers $ \i -> do
        print (i :: Int)

This should be pretty self-explanatory. numbers is a list of 10 random numbers between 1 and 512 (inclusive), drawn from the global standard generator. I loop through them and print them all. As it turns out, the Int type annotation is necessary, because otherwise GHC doesn't know if I want Ints, Integers, Doubles or anything else vaguely number-y.
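One thing randomRs does not promise is distinct numbers, so the same challenge can show up twice in a sample. Here is a minimal sketch of one way around that, assuming the sample size is no larger than the range. The name sampleDistinct is made up for illustration, and mkStdGen 42 is only there to make the run repeatable.

```haskell
import Data.List (nub)
import System.Random (StdGen, mkStdGen, randomRs)

-- | Draw k distinct numbers from the inclusive range (lo, hi).
-- Hypothetical helper, not part of the original program.
-- Assumes k <= hi - lo + 1; otherwise `take` never finishes,
-- because the range runs out of unseen values.
sampleDistinct :: Int -> (Int, Int) -> StdGen -> [Int]
sampleDistinct k range = take k . nub . randomRs range

main :: IO ()
main = do
  -- A fixed seed gives the same sample on every run;
  -- swap in getStdGen to use the global generator instead.
  let numbers = sampleDistinct 10 (1, 512) (mkStdGen 42)
  mapM_ print numbers
```

Because sampleDistinct has an explicit type signature, the inline Int annotation is no longer needed at the call site.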

Download the Page

Now that we have the random numbers, let's download the challenges corresponding to those numbers! This is easy as pie with wreq. The only thing we change (besides imports) is the loop body.

    import Control.Concurrent
    import Control.Lens
    import Control.Monad
    import Network.Wreq
    import System.Random

    problem n = "https://projecteuler.net/problem=" ++ show (n :: Int)

    main = do
      numbers <- take 10 . randomRs (1,512) <$> getStdGen
      forM_ numbers $ \i -> do
        response <- get (problem i)
        print (response ^. responseBody)
        threadDelay 2000000

First we use the wreq function get to make a GET request for a problem. (The Int annotation in problem is there for the same reason as before: without it, GHC can't tell which numeric type we want.) We store the result in the response variable. Then we print the responseBody field of the response. Finally we sleep for two seconds (threadDelay counts in microseconds) after each request to be nice toward the server.
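The "do something, then wait" pattern in the loop body can be pulled out into its own function. The sketch below is one way to do that, with no network access so it runs anywhere. The names problemUrl and forEachPolitely are made up for illustration. In the real program the action passed in would be the wreq request.

```haskell
import Control.Concurrent (threadDelay)
import Control.Monad (forM_)

-- | URL for challenge n. The explicit signature replaces the
-- inline (n :: Int) annotation used in the post.
problemUrl :: Int -> String
problemUrl n = "https://projecteuler.net/problem=" ++ show n

-- | Run an action for every item, pausing after each one.
-- The delay is in microseconds, as threadDelay expects.
-- Hypothetical helper, not part of wreq or base.
forEachPolitely :: Int -> [a] -> (a -> IO ()) -> IO ()
forEachPolitely delayMicros items action =
  forM_ items $ \item -> do
    action item
    threadDelay delayMicros

main :: IO ()
main =
  -- Stand-in action: print the URL instead of fetching it.
  forEachPolitely 2000000 [17, 34, 93] (putStrLn . problemUrl)
```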

Get the Titles

Just dumping the HTML of the page, as we have done now, isn't particularly productive. We would like to extract the title of the challenge and print out only that to make it easier to read the data. This requires a small modification of the loop body again, plus some imports – notably bringing taggy-lens, which does most of the heavy lifting, into scope.

    {-# LANGUAGE OverloadedStrings #-}

    import Control.Concurrent
    import Control.Lens
    import Control.Monad
    import qualified Data.Text.IO as T
    import Data.Text.Lazy.Encoding
    import Network.Wreq
    import System.Random
    import Text.Taggy.Lens

    problem n = "https://projecteuler.net/problem=" ++ show (n :: Int)

    main = do
      numbers <- take 10 . randomRs (1,512) <$> getStdGen
      forM_ numbers $ \i -> do
        response <- get (problem i)
        T.putStrLn (response ^. responseBody . to decodeUtf8 . title)
        threadDelay 2000000

    title = html . allNamed (only "h2") . contents

I know the title of the challenge is in the only <h2> tag on the page, so I create a lens title which drills down into the HTML, then into all <h2> tags, and their contents. The lens combinator ^. will turn them all into a single text value (by concatenation), which I then print.
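The concatenation behaviour of ^. is easy to see on plain data. This is a small sketch using only lens and an ordinary list of strings, where the list stands in for the text chunks a fold like title might focus on. The names chunks, joined, separate and firstOnly are made up for illustration.

```haskell
import Control.Lens

-- Stand-in for several pieces of text focused by one fold.
chunks :: [String]
chunks = ["Digit ", "factorials"]

-- ^. combines every focus with the Monoid instance,
-- which for strings means concatenation.
joined :: String
joined = chunks ^. traverse

-- ^.. keeps each focus as a separate list element.
separate :: [String]
separate = chunks ^.. traverse

-- ^? returns only the first focus, or Nothing if there is none.
firstOnly :: Maybe String
firstOnly = chunks ^? traverse

main :: IO ()
main = do
  putStrLn joined
  print separate
  print firstOnly
```

If a page ever had more than one <h2>, ^.. or ^? would probably be the safer choice, since ^. would silently glue the titles together.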

Wrapping Up

And that's it, really. What's so great about this is how easily the lenses that do the extraction work combine. It's like writing jQuery except in a real language! The combination of wreq and taggy-lens works great in the interactive interpreter too! In fact, that's how I came up with the access string:

    responseBody . to decodeUtf8 . html . allNamed (only "h2") . contents

I just started with the first bit and then added step after step until I had focused on the data I wanted.

So... what's the result?

    Scary Sphere
    Digit factorials
    Compromise or persist
    Number letter counts
    The Ackermann function
    Lowest-cost Search
    Combined Volume of Cuboids
    Arithmetic expressions
    Remainder of polynomial division
    Composites with prime repunit property

Pretty mathsy, I'd say.