Monadic Generators in OCaml¶

Generators are one of those features that have been heavily popularized by today’s fashionable languages such as Python and ES6. So it’s not the first time I’ve encountered programmers who are curious about OCaml bring up with questions along the lines of:

Why doesn’t OCaml support generators?

Are there alternatives to generators in OCaml?

How can I emulate generators in OCaml?

In this post, I’d like to briefly address these questions.

Where are the Generators?¶ I can only speculate but it seems self evident, OCaml doesn’t have native support for generators because it doesn’t really need it. There’s quite a few satisfactory ways of emulating them to the point where the strong justification for adding a feature to the language just isn’t there. With that being said, it’s well known that people are working on algebraic effects for OCaml and it’s my understanding that it will be possible to implement generators in direct style which would be indistinguishable from direct support in the compiler. So this feature is effectively a strict superset of generators and should settle this issue once and for all.

Alternatives to Generators¶ Maybe I’m just old fashioned, but I find myself rarely needing generators. I usually find the bog standard lazy lists to be more convenient to work with. Where my implementation of choice the one is the one that comes with containers. Here’s a simple lazy list of primes: (* save to file.ml and run as $ utop file.ml *) # require "containers.iter" ;; module Llist = CCLazy_list let primes = let open Llist in let rec next_prime n = try for i = 2 to int_of_float ( sqrt ( float_of_int n )) do if n mod i = 0 then raise_notrace Exit done ; Cons ( n , lazy ( next_prime ( n + 1 ))) with Exit -> next_prime ( n + 1 ) in Lazy . from_val @@ Cons ( 2 , lazy ( next_prime 3 )) let () = let n = 100_000 in let primes = Llist . length ( Llist . take n primes ) in Printf . printf "Found: %d primes

" primes OK, maybe this isn’t the prettiest code, but I’ve picked a bad example. Lazy lists usually work much better. Promise. The other candidate I usually reach for is the gen type: type ' a gen = unit -> ' a option Which comes with its own useful gen library. The main advantage of this type is that it’s structural. We can consume and produce gens without ever depending on a library. ocaml-re uses this to great effect here for example. The disadvantage of this type is that implementing generators usually involves ugly mutable code. Oh well, if I had a problem with this kind of mutation, I’d be using Haskell. let primes () = let prime = ref 1 in let rec loop () = incr prime ; try for i = 2 to int_of_float ( sqrt ( float_of_int ! prime )) do if ! prime mod i = 0 then raise_notrace Exit done ; Some ! prime with Exit -> loop () in loop let () = let prime_gen = primes () in for _ = 0 to 100_00 do assert ( prime_gen () != None ); done ; print_endline "There's more than 100_000 primes" Note the type of this function: val primes : unit -> ( unit -> int option ) We must not forget to create a new generator whenever we want to iterate over our generator from scratch.

Really Faking Generators¶ While the approaches above were similar, they’re still not capturing the essence of generators - the ability to yield control back to the caller along with a value anywhere inside a computation. How do we approach this problem? Like every good functional programmer: with Monads! Here’s the signature we’re aiming to implement: module type Gen_intf = sig type ( ' result , ' yield ) t val return : ' a -> ( ' a , _) t val yield : ' s -> ( unit , ' s ) t val (>>=) : ( ' a , ' s ) t -> ( ' a -> ( ' b , ' s ) t ) -> ( ' b , ' s ) t val iter : ( ' result , ' a ) t -> f :( ' a -> unit ) -> ' result end With this signature, we’re modeling generators as computations that can produce a final 'result` value and possibly many intermediate 'yield' values along the way. These computations can of course be sequenced (the essence of monads), and finally executed to produce the final a result along with running a callback for every yielded value. Representing this behavior with a type we arrive at a generator: type ( ' result , ' yield ) t = | Done of ' result | Yield of ( unit -> ' yield * ( ' result , ' yield ) t ) A constructor for terminating the generator and continuing it along with a yielded value. Note that we need to guard every yield behind a thunk. Otherwise we will find it quite hard to pause our generator mid way. Since this a monad instance, let return a = Done a let rec (>>=) x f = match x with | Done r -> f r | Yield t -> Yield ( fun () -> let ( y , next ) = t () in ( y , next >>= f ) ) where return trivial and (>>=) is simply sequencing a generator along with a computation that produces a generator from the result. If you really want to get the hang of it, try implementing join from scratch: val join : (( ' a , ' s ) t , ' s ) -> ( ' a , ' s ) t Now, finally we need yield to generate values: let yield r = Yield ( fun () -> ( r , return () )) Let’s briefly contrast this to our return above. While return created a generator that yielded no values and produced some result, yield creates a generator that yields a single value and produces a trivial result. With this kit, we can construct generators: (* assuming that all these functions are in Gen*) let primes : ( unit , int ) Gen . t = let open Gen in let rec loop n = try for i = 2 to int_of_float ( sqrt ( float_of_int n )) do if n mod i = 0 then raise_notrace Exit done ; yield n >>= fun () -> loop ( n + 1 ) with Exit -> loop ( n + 1 ) in loop 2 We’re almost done here, all that remains is a way to run a generator: let rec iter x ~ f = match x with | Done s -> s | Yield t -> let ( y , x ) = t () in f y ; iter x ~ f While this is a useful function in general, it doesn’t quite work in our case since our generator doesn’t terminate. No matter, we can have a variant that will only consume up to a constant number of yields: let take : type a b . ( a , b ) t -> int -> b list = fun t n -> let rec loop acc t n = match t , n with | Done _, _ | _, 0 -> List . rev acc | Yield t , n -> let ( a , next ) = t () in loop ( a :: acc ) next ( n - 1 ) in loop [] t n As an exercise, one can also implement a generator-returning version of this: val take : ( ' a , ' b ) t -> int -> ( unit , ' b ) t And finally: ignore ( Gen . take primes 100_000 )