We only need to implement three methods, onStart, onStop and receive, but first we need to create a Java class with gen-class.

Since we want to play nice with the rest of the JVM ecosystem (Java, Scala, etc.) and gen-class has no way to declare generic type parameters, we specify AbstractWikipediaEditReceiver in Java, where the Receiver<WikipediaEditEvent> type can be spelled out.

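    package com.wjoel.spark.streaming.wikiedits;

    import org.apache.spark.storage.StorageLevel;
    import org.apache.spark.streaming.receiver.Receiver;

    // Pins the generic type parameter so that Java and Scala callers see it.
    public abstract class AbstractWikipediaEditReceiver extends Receiver<WikipediaEditEvent> {
        public AbstractWikipediaEditReceiver(StorageLevel storageLevel) {
            super(storageLevel);
        }
    }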

From the previous post you may recall that :constructors maps the argument types of each of our constructors to the argument types of the superclass constructor it calls. We have three constructors: one without arguments, in which case we'll choose a random nickname when connecting; one with the nickname specified; and one with both the nickname and the storage level for our RDDs.

    (:gen-class
     :name com.wjoel.spark.streaming.wikiedits.WikipediaEditReceiver
     :extends com.wjoel.spark.streaming.wikiedits.AbstractWikipediaEditReceiver
     :init init
     :state state
     :prefix "receiver-"
     :constructors {[] [org.apache.spark.storage.StorageLevel]
                    [String] [org.apache.spark.storage.StorageLevel]
                    [String org.apache.spark.storage.StorageLevel] [org.apache.spark.storage.StorageLevel]}
     :main false)
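With :init init and :prefix "receiver-", gen-class expects a function named receiver-init that takes the constructor arguments and returns a vector of the superclass constructor arguments and the initial state. The following is a minimal sketch of that contract, not the code from the actual receiver. The names default-storage-level and random-nick are hypothetical, and a keyword stands in for a real Spark StorageLevel so that the example runs without Spark on the classpath.

```clojure
;; Assumption: placeholder value. The real receiver would use a Spark
;; StorageLevel object here.
(def default-storage-level :memory-only)

;; Hypothetical helper: picks a nickname when the caller gives none.
(defn random-nick []
  (str "spark-" (rand-int 1000000)))

;; One arity per entry in :constructors. Each returns
;; [[superclass-constructor-args] state].
(defn receiver-init
  ([] (receiver-init (random-nick) default-storage-level))
  ([nick] (receiver-init nick default-storage-level))
  ([nick storage-level]
   ;; The state field is final, so anything that changes later
   ;; (such as a worker thread) lives inside an atom.
   [[storage-level] (atom {:nick nick})]))
```

All three arities end up passing exactly one argument to the superclass, which matches the right-hand side of every entry in the :constructors map.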

The IRC library we use has a default adapter for receiving events. All the messages we are interested in will be sent as private messages, so we can use proxy to implement only onPrivmsg, which calls a message handling function for each message received.

    (defn make-irc-events-listener [message-fn]
      (proxy [IRCEventAdapter] []
        (onPrivmsg [target user msg]
          (message-fn msg))))

Connecting to the IRC server is easy but tedious. Once we have a connection we can add this listener and use the store method of our Receiver to pass the events back to Spark.

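    ;; Fragment: the IRC connection this is called on is presumably supplied
    ;; by an enclosing form that is not shown here.
    (.addIRCEventListener
     (make-irc-events-listener
      (fn [msg]
        (when-let [edit-event (edit-event-message->edit-event msg)]
          (.store this edit-event)))))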

edit-event-message->edit-event uses a regexp to extract the different fields from the message and create a WikipediaEditEvent JavaBean which we have created using clj-bean, as described at the end of the previous post.
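To illustrate the regexp approach, here is a rough sketch that parses a simplified message into a plain map. The message layout, the pattern and the names edit-pattern and parse-edit-message are all assumptions made for this example. The real messages and the real function may differ, and the real function builds a WikipediaEditEvent bean rather than a map.

```clojure
;; Assumed, simplified layout:
;;   [[Title]] FLAGS URL * User * (+123) summary
(def edit-pattern
  #"\[\[(.+?)\]\] (\S*) (\S+) \* (.+?) \* \(([+-]\d+)\) (.*)")

(defn parse-edit-message
  "Returns a map of the extracted fields, or nil if msg does not match."
  [msg]
  (when-let [[_ title flags url user diff summary]
             (re-matches edit-pattern msg)]
    {:title     title
     :flags     flags
     :url       url
     :user      user
     :byte-diff (Long/parseLong diff)
     :summary   summary}))

;; Example call with a made-up message:
(parse-edit-message
 "[[Clojure]] M https://example.org/diff * Alice * (+42) fix typo")
```

Returning nil for anything that doesn't match is what lets the listener wrap the call in when-let and skip messages that aren't edits.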

We need to start a thread in onStart and clean up in onStop.
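The thread handling can be sketched independently of Spark. This assumes the receiver's state is an atom holding a map, and the names start-worker! and stop-worker! are hypothetical. In the generated class, receiver-onStart and receiver-onStop would call something like these with (.state this).

```clojure
(defn start-worker!
  "Runs receive-fn on a new daemon thread and remembers the thread in state."
  [state ^Runnable receive-fn]
  (let [worker (Thread. receive-fn "wikiedits-receiver")]
    (.setDaemon worker true)
    (swap! state assoc :thread worker)
    (.start worker)
    worker))

(defn stop-worker!
  "Interrupts the worker thread, if there is one, and forgets it."
  [state]
  (when-let [^Thread worker (:thread @state)]
    (.interrupt worker)
    (swap! state dissoc :thread)))
```

Starting a separate thread matters because onStart is expected to return quickly instead of blocking while messages arrive. The real onStop would also need to close the IRC connection, which this sketch leaves out.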