The release of OCaml 4.09.0 is particularly significant for us at OCaml Labs, as it represents a phase shift in our development efforts towards integrating multicore parallelism into the language. For the past few years, we have been implementing multicore as a branch based off released versions of the compiler. We finished rebasing it to OCaml 4.06.1 in April and since then have been working on upstreaming a series of incremental changes to OCaml itself.

OCaml 4.09.0 is the first release in which multicore patches appear in a released version of the compiler. This is not the full multicore feature set, but rather the groundwork needed before parallelism can be introduced into the runtime. You can now expect to see a regular set of incremental changes towards multicore in every release of OCaml as we ramp up our upstreaming efforts.

One decision we have taken recently is to spend our time on upstreaming changes rather than on further rebases to more recent versions of the compiler. If someone does have a pressing need for a rebase to OCaml 4.08 or 4.09, then please get in touch with me, but bear in mind that it’s a significant amount of work and so will need to be justified with a use case.

In the meantime, here’s a summary of what some of those patches are, and what to expect in future releases:

4.09.0

In the upcoming multicore GC, object headers (tags and lengths) must be immutable because multiple threads scan the heap simultaneously; any mutation could violate heap invariants in another thread and cause corruption. Therefore, Obj.truncate (#2279) and Obj.set_tag (#1725) have now been deprecated, and all uses have been removed from the standard library.
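
As a minimal sketch of what moving away from Obj.truncate can look like in user code, the snippet below copies the used prefix of an array into a fresh one rather than shrinking the original block in place. The helper name take_prefix is hypothetical and chosen for illustration; this is not how the standard library itself was changed.

```ocaml
(* Hypothetical helper: return the first [n] elements of [arr] as a fresh
   array, instead of rewriting the header of [arr] with Obj.truncate. *)
let take_prefix (arr : 'a array) (n : int) : 'a array =
  if n < 0 || n > Array.length arr then invalid_arg "take_prefix";
  Array.sub arr 0 n

let () =
  (* An over-allocated buffer of which only two slots are in use. *)
  let buf = Array.make 8 0 in
  buf.(0) <- 1;
  buf.(1) <- 2;
  let used = take_prefix buf 2 in
  (* The copy has the shorter length; the original block is untouched. *)
  assert (Array.length used = 2);
  assert (Array.length buf = 8)
```

The copy costs an allocation, but the header of the original block is never modified, which is the property the multicore GC relies on.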

Values can be passed from OCaml to C by registering them under a known name using the Callback.register function. They can later be retrieved from C using caml_named_value, which returns a value* that can then be dereferenced. OCaml 4.09.0 modifies the C return type to const value* to indicate that the C code cannot use the returned pointer to mutate the registered value (#2293). The ability to mutate a value using the raw pointer returned by caml_named_value is incompatible with the upcoming multicore GC, and rarely (never?) used in existing single-core OCaml code.
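
For reference, the OCaml side of this mechanism might look like the sketch below. The function add and the name "add_ints" are made up for illustration. On the C side, the closure would be looked up with caml_named_value("add_ints"); from 4.09.0 the result should be stored in a const value * and only read, for example by dereferencing it and passing it to caml_callback2.

```ocaml
(* Hypothetical function that C code would like to call. *)
let add (x : int) (y : int) : int = x + y

(* Register the closure under a name that C code can look up later.
   The name "add_ints" is an arbitrary choice for this example. *)
let () = Callback.register "add_ints" add

(* The OCaml function itself is unchanged by being registered. *)
let () = assert (add 1 2 = 3)
```

C code that only ever reads through the returned pointer should need nothing more than the added const qualifier on its local variable.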

Ongoing for 4.10.0~dev

This is the next release, which will be branched imminently now that OCaml 4.09.0 has been released.

Variables that are global in the OCaml runtime need to be duplicated per-domain in multicore, since each parallel thread of execution maintains its own table of domain-local variables. OCaml 4.10.0 moves all such global C variables into a “domain state” table (#8713). While the change does not introduce any API changes, it significantly alters code generation by reserving a register that was previously used as the exception pointer in every CPU backend for quickly accessing the domain state table. It is therefore a syntactically heavy change, but shouldn’t modify the semantics of your code. If you do notice any oddities when testing OCaml 4.10~dev when it is released as a beta, please do report a reproduction case upstream.
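
The idea can be modelled in a few lines of OCaml. This is only an illustrative sketch: the real change is in the C runtime and the native code generators, and the record fields below are hypothetical rather than the actual layout of the domain state table. Runtime state that would otherwise live in process-wide globals is bundled into one record per domain, and every operation reaches it through an explicit handle, which plays the role of the reserved register.

```ocaml
(* Illustrative model only: these fields are hypothetical and do not
   mirror the runtime's real domain state table. *)
type domain_state = {
  mutable alloc_words : int;   (* stands in for per-domain allocation state *)
  mutable handler_depth : int; (* stands in for exception handler state *)
}

let make_state () = { alloc_words = 0; handler_depth = 0 }

(* Operations take the state explicitly instead of touching a global. *)
let record_alloc (st : domain_state) (words : int) : unit =
  st.alloc_words <- st.alloc_words + words

let () =
  (* Two independent states, as two domains would have. *)
  let d0 = make_state () and d1 = make_state () in
  record_alloc d0 3;
  record_alloc d1 5;
  record_alloc d0 1;
  (* Updates through one handle never leak into the other. *)
  assert (d0.alloc_words = 4);
  assert (d1.alloc_words = 5)
```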

(bonus change) While emerging from a deep rabbit hole of fixing thread stack overflow detection and reentrant marshalling by ensuring that allocation functions do not trigger OCaml callbacks when invoked from C, we discovered that major GC hooks could also interact with the GC heap. This is now forbidden (#8711) in OCaml 4.10.0. No code was found in the wild that did not already conform to this restriction, and it is generally safer this way for the multicore GC as well.

Ongoing for 4.11.0~dev

As 4.10 is about to be branched, we are working on the next set of features to push upstream into OCaml 4.11:

Better safe points (#187)

Tracing and deprecating the instrumented runtime

Converging on the representation of closures in bytecode and native code

Modifying GC colors to suit multicore

As always, these chunks of ongoing work are subject to change as the technical design process is quite iterative and dependent on benchmarking results, but are hopefully useful for you to know!