Please don’t hit me, Haskell does a great job of that already.

I love Haskell for the same reasons I love Dark Souls. Fantastic and inscrutable lore, a great combat (er, type) system, a cliff-wall difficulty curve, and unending punishment.

I want to collect some statistics from the GitHub API.

Step One - Stack

I download stack and start a project:

    > cd /home/jack/programming && stack new github-stats && cd github-stats
    Downloading template "new-template" to create project "github-stats" in github-stats/ ...
    ......
    All done.

So far so good. Does it work?

    > stack build && stack exec -- github-stats-exe
    github-stats-0.1.0.0: configure
    .....
    Registering github-stats-0.1.0.0...
    someFunc

Awww yisss. This is going to be so easy!

Step Two - HTTPS GET Request

Now I need to query the GitHub API. This isn’t my first rodeo, so I generate a personal access token from GitHub and copy it to a local file. What query should I run first? How about the count for all ASM tetris repositories? Poking around the docs comes up with:

    GET https://api.github.com/search/repositories?q=tetris+language:assembly&sort=stars&order=desc
    User-Agent: steveshogren
    Authorization: token PUT_TOKEN_HERE

    {.. "total_count": 354}

Easy life. Now how do you GET a resource in Haskell? Ah, Network.HTTP! I copy the front page sample into src/Lib.hs:

    module Lib
        ( someFunc
        ) where

    x = simpleHTTP (getRequest "https://www.github.com/") >>= fmap (take 100) . getResponseBody

    someFunc :: IO ()
    someFunc = print x

So simple! This is why I laugh at my NodeJS-loving friends!

    > stack build
    src/Lib.hs:5:5: Not in scope: ‘simpleHTTP’
    src/Lib.hs:5:17: Not in scope: ‘getRequest’
    src/Lib.hs:5:77: Not in scope: ‘getResponseBody’
    Compilation failed.

Doesn’t compile. Durp, Hackage is a package library, so I need to add this to my cabal file. What is the name of the package? HTTP-4000? HTTP-4000.3.2? Nothing on Hackage seems to indicate what goes into the cabal file. I discover it is just HTTP through trial and error. I update my cabal file… in all three build-depends…?

    build-depends: base >= 4.7 && < 5
                 , HTTP

Hrm, same error.

    > stack build
    src/Lib.hs:5:5: Not in scope: ‘simpleHTTP’
    src/Lib.hs:5:17: Not in scope: ‘getRequest’
    src/Lib.hs:5:77: Not in scope: ‘getResponseBody’
    Compilation failed.

Oh, durp, I’d need an import. (WHY ISN’T THIS IN THE CODE SAMPLE?!) Also, print doesn’t work on an IO action; I need to bind the result and hand it to putStrLn.

    import Network.HTTP

    x = simpleHTTP (getRequest "https://www.github.com/") >>= fmap (take 100) . getResponseBody

    someFunc :: IO ()
    someFunc = x >>= putStrLn

Here goes!!!

    > stack build && stack exec -- github-stats-exe
    github-stats-exe: user error (https not supported)

Wat. Further inspection of the docs shows a line WAAY DOWN in paragraph 5.

NOTE: This package only supports HTTP;

When playing Dark Souls (er, programming Haskell), sometimes the best move is to run away. I search again. Searching for "haskell https request" returns “http-conduit” as the best choice. After adding http-conduit to my cabal file, I come up with this beast without any surprises:

    query :: IO String
    query = do
      initReq <- parseUrl "https://api.github.com/search/repositories"
      let r = initReq
                { method = "GET"
                , requestHeaders = [(hUserAgent, "steveshogren")
                                  , (hAuthorization, "token PUT_TOKEN_HERE")]}
      let request = setQueryString
                      [("q", Just "tetris+language:assembly")
                      ,("order", Just "desc")
                      ,("sort", Just "stars")] r
      manager <- newManager tlsManagerSettings
      res <- httpLbs request manager
      return . show . responseBody $ res

    someFunc :: IO ()
    someFunc = do
      query >>= putStrLn

Huzzah! Results! I’m getting back a monster string of json data.

“\“{\\“total_count\\”:66, ….}\”

Step Three - Parsing JSON

Time to parse this mega JSON string. Aeson seems to be the biggest contender. To use Aeson and get the total_count value from the return, I needed the following additions:

    {-# LANGUAGE OverloadedStrings #-}
    {-# LANGUAGE DeriveGeneric #-}

    import GHC.Generics
    import Data.Aeson

    data ResultCount = ResultCount {
      total_count :: Int }
      deriving (Generic, Show)

    instance ToJSON ResultCount
    instance FromJSON ResultCount

ResultCount allows me to use decode from aeson instead of show to parse the “total_count” from the JSON response into an Int. Sure enough, it does!

    {-# LANGUAGE OverloadedStrings #-}
    {-# LANGUAGE DeriveGeneric #-}
    module Lib
        ( someFunc
        ) where

    import Control.Monad
    import Network
    import Network.HTTP.Conduit
    import Network.HTTP.Types.Header
    import GHC.Generics
    import Data.Aeson

    data ResultCount = ResultCount {
      total_count :: Int }
      deriving (Generic, Show)

    instance ToJSON ResultCount
    instance FromJSON ResultCount

    query :: IO (Maybe Int)
    query = do
      initReq <- parseUrl "https://api.github.com/search/repositories"
      let r = initReq
                { method = "GET"
                , requestHeaders = [(hUserAgent, "steveshogren")
                                  , (hAuthorization, "token PUT_TOKEN_HERE")]}
      let request = setQueryString
                      [("q", Just "tetris+language:assembly")
                      ,("order", Just "desc")
                      ,("sort", Just "stars")] r
      manager <- newManager tlsManagerSettings
      res <- httpLbs request manager
      return . liftM total_count . decode . responseBody $ res

    someFunc :: IO ()
    someFunc = query >>= print

It prints Just 66. Success! Wait. 66 isn’t the same count I got when running from the browser. I check again. Sure enough, the browser comes up with a totally different count.

Maybe the query request isn’t correct? Adding a print request on line 31 after building the request shows:

    Request {
      host = "api.github.com"
      port = 443
      secure = True
      requestHeaders = [("User-Agent","steveshogren"),("Authorization","token PUT_TOKEN_HERE")]
      path = "/search/repositories"
      queryString = "?q=tetris%2Blanguage%3Aassembly&order=desc&sort=stars"
      method = "GET"
      proxy = Nothing
      rawBody = False
      redirectCount = 10
      responseTimeout = Just (-3425)
      requestVersion = HTTP/1.1
    }

The queryString isn’t right! It encoded my + and :! After an hour of reading through docs and researching URL encoding specs, it dawns on me: in a query string, + is itself an encoded space, so the + I typed was being sent as a literal plus sign.

No face-palm gif could ever represent the sheer magnitude of my current emotions… You’ll have to use your imagination.

I change my query to "tetris language:assembly" and the right count comes back! Just 354
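For anyone who wants to see the encoding issue in isolation, here is a minimal sketch of percent-encoding a query value. The encodeQueryValue helper is hypothetical and written only for illustration (ASCII input only). It is not how http-conduit does it internally, and writing a space as %20 is one valid choice rather than a claim about what the library emits.

```haskell
import Data.Char (isAlphaNum, isAscii, ord, toUpper)
import Numeric (showHex)

-- | Percent-encode a single query-string value.
-- Hypothetical helper for illustration; assumes ASCII input.
encodeQueryValue :: String -> String
encodeQueryValue = concatMap encodeChar
  where
    encodeChar c
      | isUnreserved c = [c]
      | otherwise      = '%' : twoHexDigits (ord c)

    -- RFC 3986 unreserved characters pass through untouched.
    isUnreserved c = (isAscii c && isAlphaNum c) || c `elem` "-_.~"

    -- Upper-case hex, left-padded to two digits.
    twoHexDigits :: Int -> String
    twoHexDigits n = case map toUpper (showHex n "") of
      [d] -> ['0', d]
      ds  -> ds
```

With this helper, encodeQueryValue "tetris+language:assembly" gives "tetris%2Blanguage%3Aassembly", which matches the queryString in the request dump, while encodeQueryValue "tetris language:assembly" gives "tetris%20language%3Aassembly". The literal + is treated as data and escaped, so the server never sees it as a separator.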

I finally have something that correctly fetches a count of repositories from GitHub and parses it into an Int. After over four hours of Dark Souls (er, Haskell) punishment, we deserve to enjoy a bonfire!

Edit: Bonus Round!

Thanks to Chris Allen and /u/JeanParker for pointing me towards wreq, which weirdly didn’t come up when I looked around for libs yesterday. Yep, it was 6th on the Google when searching for "haskell https get". Network.HTTP takes the top three results, and that doesn’t even do https.

¯\(ツ)/¯

Armed with their helpful suggestions, I knocked this out this morning.

    -- OverloadedStrings is needed for the Text and header-name literals below.
    {-# LANGUAGE OverloadedStrings #-}

    import Network.Wreq
    import Control.Lens
    import Data.Aeson
    import Data.Aeson.Lens
    import qualified Data.Text as T
    import qualified Data.ByteString.Char8 as BS

    opts :: String -> String -> Options
    opts lang token = defaults & param "q" .~ [T.pack $ "tetris language:" ++ lang]
                               & param "order" .~ ["desc"]
                               & param "sort" .~ ["stars"]
                               & header "Authorization" .~ [BS.pack $ "token " ++ token]

    query lang = do
      token <- readFile "token"
      r <- getWith (opts lang token) "https://api.github.com/search/repositories"
      return $ r ^? responseBody . key "total_count" . _Number

MUCH better. This includes reading my token from a file called “token” so I don’t accidentally commit it. It also builds up the query options from inputs, which was the next step. Thanks y’all.
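One thing to watch when reading the token with readFile: it returns the whole file, so if the file ends with a newline (many editors add one), that newline ends up in the Authorization header value. Here is a minimal sketch of trimming it first, using only base. The names trim and readAuthHeader are hypothetical, and the "token " prefix simply follows the code in this post.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Drop leading and trailing whitespace, including a final newline.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Read a token file and build the Authorization header value.
-- Hypothetical helper; the path is whatever file holds the token.
readAuthHeader :: FilePath -> IO String
readAuthHeader path = do
  raw <- readFile path
  return ("token " ++ trim raw)
```

In the wreq version, the result of readAuthHeader "token" could be passed through BS.pack and used as the header value in place of the string built inside opts.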