Jane Street Style Guide

Generalities

This guide documents what you could call Jane Street’s house style. It’s not an absolute guide to how code is written everywhere here; different groups make different decisions in some cases. We’ll document some of the variations here, while noting which one of those we think of as the house style.

Even though the house style isn’t universally followed, it’s a good default, and it’s the style we use in our most foundational libraries, including Base , Core , Async and Incremental . Indeed, those libraries are generally good places to look at for examples of good practice.

Formatting

Indentation should follow the rules of ocp-indent . This is enforced by default by Jenga, and is included in Jane Street default vim and emacs configs, but you can also achieve it by using ocp-indent with the JaneStreet ruleset.

Formatting should follow the rules of ocamlformat with the JaneStreet profile.

90 characters is the maximum line length.

Naming

Identifiers (constructors, variables, structures, type names, …) are written using underscores to separate words, not in CamelCase . So write num_apples not numApples and Foo_bar , not FooBar .

Identifiers that exist for short scopes should be short, e.g., List . iter ~ f :( fun x -> x + 1 ) numbers whereas identifiers that live for large scopes should be larger and more descriptive.

Boolean-returning functions should have predicates for names. For instance, is_valid is better than check_validity .

Use unsafe to indicate memory-unsafety only. e.g., the following, from Base.Array , is an appropriate use of unsafe. (** Unsafe version of [set]. Can cause arbitrary behavior when used for an out-of-bounds array access. *) external unsafe_set : ' a t -> int -> ' a -> unit = "%array_unsafe_set" This function, however, from Base.Int_intf , uses unchecked instead of unsafe to indicate a function that may have bad behavior in certain circumstances, but is not memory unsafe. (** [of_float_unchecked] truncates the given floating point number to an integer, rounding towards zero. The result is unspecified if the argument is nan or falls outside the range of representable integers. *) val of_float_unchecked : float -> t

Use OCamldoc style comments – starting with (** – in mli’s. Make sure to use square-brackets to enclose small OCaml values, and {[ ]} to enclose larger blocks.

Avoid comments that add no useful information to the type and function name. i.e., avoid this: (** Compares two strings *) val compare : string -> string -> int Whereas this would be better. (** [compare x y] Compares two strings lexicographically *) val compare : string -> string -> int If you really can’t find anything useful to add with a comment, it’s acceptable to have a comment that is redundant with the type and name, particularly in broadly-aimed libraries like Base and Core. This reflects the fact that it’s considered gauche in some circles to have functions with no documentation whatsoever. As an example, in the Bytes module, we might have this (** [length t] returns the number of bytes in [t]. *) val length : t -> int even though the content is largely duplicative.

Let-syntax

When programming with monadic and applicatives, the use of let%bind is generally preferred in cases where an explicit variable is bound. For example, prefer this:

let % bind x = foo y in let % bind z = bar x in return x + z

to

foo y >>= fun x -> bar x >>= fun z -> return x + z

That said, even when using let%bind or let%map , infix operators are still useful when operating in a point-free style, i.e., when not binding variables. So, we might write.

let % bind x = m >>| Model . x in foo x

rather than

let % bind x = ( let % map m = m in Model . x m ) in foo x

Opening Modules

Top-Level

Only open modules with a clear and standard interface, and open all such modules before defining anything else, e.g.,

open Core open Async let snoo () = ...

not:

let now () = ... (* shadowed below by opening Time *) open Core open Time

Local “open”

Even using a local “open” will pollute the namespace of an expression and risk silently shadowing a variable of the same type. When you do use a local-open, you should aim to keep the scope small. This notation:

let f g = Time . ( now () < lockout_time )

is generally preferable to this one:

let f g = let open Time in now () < lockout_time

because the scope is more clearly delimited.

That said, when the interface being opened has a standard and widely understood API, then the let open syntax is preferred.

let open Option . Monad_infix in load_config () >>= create

Some modules provide an Infix module or an O module which is specifically designed for local opens.

Signatures

Most modules should contain a single type named t . For instance, the String modules defines String.t . When you’re reading the String interface and you see t , think ‘string’ in your head.

Prefer functions that return explicit options (or errors) over throwing exceptions. If your function routinely raises an exception, put _exn in the name. For example: val create : string -> t option val create_exn : string -> t

Functions in a module M should typically take M.t as their first argument. An exception to this is that optional arguments should be placed before the M.t argument if that is the sole positional argument, to allow the optional arguments to be erased.

Prefer standard signature includes to hand-written interfaces. E.g., prefer this: include Comparable . S with type t := t over this: val t_of_sexp : Sexp . t -> t val sexp_of_t : t -> Sexp . t

Most comments should be for users of a module, not implementers, which means that most comments should be in the mli . We place comments above the function signature, module, or record field described. Small comments can also go to the right of a line of code.

Defensive Programming

Always annotate the type of ignored values. That way the compiler will complain if the type changes. For example, imagine what happens to ignore ( Unix . wait ` Any ); when val wait : [ ` Any | ` Pid of Pid . t ] -> Pid . t changes to return the exit code instead of raising on non-zero: val wait : [ ` Any | ` Pid of Pid . t ] -> Pid . t * Exit . t It’s better to write: ignore ( Unix . wait ` Any : Pid . t ); The same logic applies to underscores, except where the type is more-or-less pinned down anyway.

If a function takes two arguments of the same type and the arguments are used differently, they should usually be labeled to avoid accidental permutation. Notable exceptions include List.append and (-) where the order is sufficiently clear. For example: val send : source : User . t -> dest : User . t -> unit

Avoid catch-all cases in pattern matches. For example, prefer this: let position_change = function | Execution e -> Dir . sign ( Execution . dir e ) * Execution . size e | Ack _ | Out _ | Reject _ -> 0 to this. let position_change = function | Execution e -> Dir . sign ( Execution . dir e ) * Execution . size e | _ -> 0 Both are correct, but the former will produce an error if Correction is added to the type being matched on, and the latter won’t.

Optional arguments should typically only be used for functions that are called in many different places. That’s because optional arguments make your code less explicit, which makes the call sites harder to understand, and it makes it easy to forget to specify the argument in a case where it’s required. A good rule of thumb is to avoid optional arguments for functions that are not exposed in your module interface.

Directory Names

Use dashes (“ - ”) in for multi-word directory names, instead of underscores (“ _ ”). Thus:

~/local/trunk/trader-tools/

instead of:

~/local/trunk/broken_directory_name/

Exceptions

It is OK for a function to raise an exception in an exceptional circumstance. We often, but not always, suffix such functions with _exn . Although raising is fine, one should not write code that depends on which exception is raised. That is, one should never declare exceptions or match on them. Doing so is problematic in a large code base because the type system doesn’t track which exception is raised. If callers of a function need to discriminate among error cases, then the function should return a variant distinguishing the cases.

Instead of declaring new exceptions, one should use a function that implicitly constructs an exception, e.g.:

raise_s [% message "something bad happened" ( t : t )]

Tests

Test your code. Some codebases do better with having unit tests for every function, some are better with end-to-end tests, but generally speaking all significant changes should have tests demonstrating their effect.

Tests can sometimes go in the same file, but typically should go in a separate test library. This is preferable for a number of reasons: For one thing, it encourages you to write tests against the exposed API of your code, which is usually the right approach. It also allows you to draw on more dependencies in your test code than you might want to link in to your production code.

Expect tests are the preferred way of writing tests. This doesn’t mean that you shouldn’t use Quickcheck or other forms of property tests; it’s just that let%expect_test is a single umbrella under which you can write all of these kinds of tests conveniently.

Private submodules