Controlling Chromium in Haskell



Driving Chromium using its built-in WebSockets server

Published on September 1, 2013 under the tag haskell

Introduction

The Chromium browser has a built-in WebSockets server which can be used to control it. I first heard about this through an issue with my WebSockets project.

Since I am currently rewriting the WebSockets library to use io-streams, I wanted to give this a try and turn it into a blogpost. As a sidenote: the port is going great, io-streams is a very nice library, and I managed to solve a whole lot of issues in the library (mostly exception handling stuff).

Note that this code uses an as-yet unreleased version of the WebSockets library. If you want to try it, you can get it from the GitHub repo: check out the io-streams branch.

This file is written in literate Haskell so we will have a few boilerplate declarations and imports first:

    {-# LANGUAGE OverloadedStrings #-}
    module Main
        ( main
        ) where

    import           Control.Applicative  ((<$>))
    import           Control.Monad        (forever, mzero)
    import           Control.Monad.Trans  (liftIO)
    import           Data.Aeson           (FromJSON (..), ToJSON (..), (.:), (.=))
    import qualified Data.Aeson           as A
    import qualified Data.Map             as M
    import           Data.Maybe           (fromMaybe)
    import qualified Data.Text.IO         as T
    import qualified Data.Vector          as V
    import qualified Network.HTTP.Conduit as Http
    import qualified Network.URI          as Uri
    import qualified Network.WebSockets   as WS

Locating the WebSockets server

To enable the WebSockets server, Chromium must be launched with the --remote-debugging-port flag set:

chromium --remote-debugging-port=9160

Now, in order to connect to the built-in WebSockets server, we have to know its URI and this requires some extra code. We will first use cURL to demonstrate this:

    $ curl localhost:9160/json
    [ {
       "description": "",
       ...
       "webSocketDebuggerUrl": "ws://localhost:9160/devtools/page/8937C189-5CED-8E34-E26E-A389641FE8FF"
    } ]

That webSocketDebuggerUrl is the one we want. Let us write some Haskell code to automate obtaining it.

We create a datatype to hold this info. Currently, we are only interested in a single field:

    data ChromiumPageInfo = ChromiumPageInfo
        { chromiumDebuggerUrl :: String
        } deriving (Show)

We will use aeson to parse the JSON. We need a FromJSON instance for our datatype:

    instance FromJSON ChromiumPageInfo where
        parseJSON (A.Object obj) =
            ChromiumPageInfo <$> obj .: "webSocketDebuggerUrl"
        parseJSON _              = mzero
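To see the instance at work without a running browser, here is a minimal standalone sketch that decodes a hand-written JSON string. The `sample` and `incomplete` values are made up for illustration (trimmed versions of the curl output, with a placeholder page id), and the sketch assumes a reasonably recent GHC and aeson.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Control.Monad              (mzero)
import           Data.Aeson                 (FromJSON (..), (.:))
import qualified Data.Aeson                 as A
import qualified Data.ByteString.Lazy.Char8 as BL

data ChromiumPageInfo = ChromiumPageInfo
    { chromiumDebuggerUrl :: String
    } deriving (Show)

instance FromJSON ChromiumPageInfo where
    parseJSON (A.Object obj) =
        ChromiumPageInfo <$> obj .: "webSocketDebuggerUrl"
    parseJSON _              = mzero

-- Hypothetical sample: one page entry with a placeholder page id.
sample :: BL.ByteString
sample = BL.pack
    "[{\"description\": \"\", \"webSocketDebuggerUrl\": \"ws://localhost:9160/devtools/page/EXAMPLE\"}]"

-- Hypothetical sample: an entry that lacks the field we need.
incomplete :: BL.ByteString
incomplete = BL.pack "[{\"description\": \"\"}]"

-- Decode a page listing and keep only the debugger URLs.
debuggerUrls :: BL.ByteString -> Maybe [String]
debuggerUrls = fmap (map chromiumDebuggerUrl) . A.decode

main :: IO ()
main = do
    print (debuggerUrls sample)      -- the URL, wrapped in Just and a list
    print (debuggerUrls incomplete)  -- Nothing: the required key is missing
```

Extra fields in the JSON are simply ignored, while a missing webSocketDebuggerUrl makes the whole decode fail.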

The http-conduit library can be used to do what we just did using curl:

    getChromiumPageInfo :: Int -> IO [ChromiumPageInfo]
    getChromiumPageInfo port = do
        response <- Http.withManager $ \manager -> Http.httpLbs request manager
        case A.decode (Http.responseBody response) of
            Nothing -> error "getChromiumPageInfo: Parse error"
            Just ci -> return ci
      where
        request = Http.def
            { Http.host = "localhost"
            , Http.port = port
            , Http.path = "/json"
            }

One remaining issue is that the JSON contains the WebSockets URL as a single string, and the WebSockets library expects a (host, port, path) triple. Luckily for us, the standard network library has a Network.URI module which makes this task pretty simple:

    parseUri :: String -> (String, Int, String)
    parseUri uri = fromMaybe (error "parseUri: Invalid URI") $ do
        u    <- Uri.parseURI uri
        auth <- Uri.uriAuthority u
        let port = case Uri.uriPort auth of (':' : str) -> read str; _ -> 80
        return (Uri.uriRegName auth, port, Uri.uriPath u)
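The function above calls error on an invalid URI and uses read for the port, which is fine for a demo. As a sketch of an alternative, the hypothetical parseUriMaybe below (not part of the original code) returns Nothing in both failure cases instead. It assumes the Network.URI module is available, which in newer setups is provided by the network-uri package.

```haskell
import qualified Network.URI as Uri
import           Text.Read   (readMaybe)

-- | Hypothetical total variant of 'parseUri': yields Nothing when the URI
-- does not parse, has no authority part, or has a malformed port.
parseUriMaybe :: String -> Maybe (String, Int, String)
parseUriMaybe uri = do
    u    <- Uri.parseURI uri
    auth <- Uri.uriAuthority u
    port <- case Uri.uriPort auth of
        ""          -> Just 80        -- no port given: same default as above
        (':' : str) -> readMaybe str  -- ":9160" becomes 9160
        _           -> Nothing
    return (Uri.uriRegName auth, port, Uri.uriPath u)

main :: IO ()
main = do
    -- The URL from the curl output earlier in this post.
    print $ parseUriMaybe
        "ws://localhost:9160/devtools/page/8937C189-5CED-8E34-E26E-A389641FE8FF"
    -- Not an absolute URI, so this prints Nothing.
    print $ parseUriMaybe "not a uri"
```

For the first call this gives the triple ("localhost", 9160, "/devtools/page/8937C189-5CED-8E34-E26E-A389641FE8FF"), which is exactly the shape the WebSockets client wants.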

Once we are connected to Chromium, we will be sending commands to it. A simple Haskell datatype can be used to model these commands:

    data Command = Command
        { commandId     :: Int
        , commandMethod :: String
        , commandParams :: [(String, String)]
        } deriving (Show)

We use the aeson library again here, to convert these commands into JSON data:

    instance ToJSON Command where
        toJSON cmd = A.object
            [ "id"     .= commandId cmd
            , "method" .= commandMethod cmd
            , "params" .= M.fromList (commandParams cmd)
            ]
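To check what actually goes over the wire, the standalone sketch below encodes the Page.navigate command used later in this post and compares it to a hand-built JSON value. The names `example` and `expected` are introduced only for this illustration. The order of the keys in the printed output may differ between aeson versions, so the comparison is done on decoded values rather than on the raw string.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import           Data.Aeson                 (ToJSON (..), (.=))
import qualified Data.Aeson                 as A
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Map                   as M

data Command = Command
    { commandId     :: Int
    , commandMethod :: String
    , commandParams :: [(String, String)]
    } deriving (Show)

instance ToJSON Command where
    toJSON cmd = A.object
        [ "id"     .= commandId cmd
        , "method" .= commandMethod cmd
        , "params" .= M.fromList (commandParams cmd)
        ]

-- Hypothetical example value: the same command that main sends.
example :: Command
example = Command 1 "Page.navigate" [("url", "http://haskell.org")]

-- The JSON structure we expect: params become a nested object.
expected :: A.Value
expected = A.object
    [ "id"     .= (1 :: Int)
    , "method" .= ("Page.navigate" :: String)
    , "params" .= A.object ["url" .= ("http://haskell.org" :: String)]
    ]

main :: IO ()
main = do
    BL.putStrLn (A.encode example)                        -- raw JSON text
    print (A.decode (A.encode example) == Just expected)  -- prints True
```

Going through Data.Map turns the association list into a JSON object, at the cost of only supporting string-valued parameters.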

What is left is a simple main function to tie it all together.

    main :: IO ()
    main = do
        (ci : _) <- getChromiumPageInfo 9160
        let (host, port, path) = parseUri (chromiumDebuggerUrl ci)
        WS.runClient host port path $ \conn -> do
            -- Send an example command
            WS.sendTextData conn $ A.encode $ Command
                { commandId     = 1
                , commandMethod = "Page.navigate"
                , commandParams = [("url", "http://haskell.org")]
                }

            -- Print output to the screen
            forever $ do
                msg <- WS.receiveData conn
                liftIO $ T.putStrLn msg

Conclusion

This is a very simple example of what you can do with Haskell and Chromium, but I think there are some pretty interesting opportunities to be found here. For example, I wonder if it would be possible to create a simple Selenium-like framework for web application testing in Haskell.

Thanks to Gilles J. for a quick proofread and Ilya Grigorik for this inspiring blogpost!