In the previous post, I showed how to manually rewrite a Python function into “combinator form”, and then apply YC to it, to get a serialisable version of the original function.

In this post, I show how to do this automatically. We shall derive the function Combinator, which can be applied to any python function and return an equivalent function which is a combinator (and thus can be serialised).

If the above doesn’t make sense, you should really read the first three posts:

Automatically Rewriting to Combinator Form

Consider the factorial function from the previous article:

def makefact():

def fact(n):

return fact(n-1) * n if n > 0 else 1

return fact

Let’s dump it:

>>> fact1 = makefact()

>>> dumpf(fact1)

<function fact at 0x7ff8c3c288c0>

#<function fact at 0x7ff8c3c288c0> — Function in its own closure, stop traverse

Ok. So we’re seeing fact1 is the inner function fact, with itself in its own closure (to be bound to the name “fact” in its body).

To convert fact1 into combinator form, what we’d like to do is to replace the fact in the closure with another function. We want to do this in a wrapper; then it’d look a lot like combinator form. Something like this:

def MakeCombinatorForm(r):

def h(g):

return CopyFunctionWithClosureReplace(r, g)

return h

What’s this do? Well, calling MakeCombinatorForm on a function r should give us a new function h, which is the combinator form of r as detailed in the last article. ie: we can get the serialisable combinator for r by doing YC(MakeCombinatorForm(r)).

To do this, we need to make a copy of r, in the closure of which we replace r with g, the argument to h.

But can we do that?

Copying Functions, meddling with Closures.

Here’s a simple implementation of CopyFunctionWithClosureReplace.

def make_cell(value):

return (lambda x: lambda: x)(value).__closure__[0] def CopyClosureWithReplace(f, g, foundf):

if f.__closure__:

newValues = [

(

(

g # (a)

if value == f else

(

value # (b)

if value in foundf else

CopyFunctionWithClosureReplace(

value, g, foundf

) # (c)

)

)

if isinstance(value, types.FunctionType) else

value # (d)

)

for value in [

lcell.cell_contents

for lcell in f.__closure__

]

]

return tuple([make_cell(value) for value in newValues])

else:

return None



def CopyFunctionWithClosureReplace(f, g, foundf = []):

return types.FunctionType(

f.__code__,

f.__globals__,

closure=CopyClosureWithReplace(

f, g, foundf + [f]

)

)

The gist of these functions: We make a copy of the function f, replacing f with g everywhere we see it in its own closure, transitively.

We have to make copies, because functions and their closures are immutable.

Here’s a quick explanation of what’s happening in CopyClosureWithReplace. We’re iterating through the closure of the function, either taking items directly from the old closure to the new one, or altering them if necessary.

For each closure element, I’ve labelled the four different scenarios with letters in the code.

(a): The value is the function f we are looking to replace in the closure with g. So, we replace it with g.

(b): The value is a function other than f, but we have already seen it in our traversal of the tree. ie: it is a different cycle to the one we are looking for. We wont find anything new by traversing it (it’s a cycle), so just keep the value the same, and don’t traverse further. Note that this is here for safety, and if this code is ever actually executed, the result will not be a closure. I’ll talk about what we need to do to fix this later. It doesn’t affect anything we’ve seen so far.

(c): The value is a function other than f, and we haven’t seen it before. It might have f in its closure (transitively), so we copy it recursively using CopyFunctionWithClosureReplace. This is the recursive call that makes this a tree traversal.

(d): The value is not a function, so don’t tamper with it.

And what on god’s green earth does make_cell do? Well, it’s a lovely little koan that I cribbed from elsewhere (I forget where). Basically, a closure cell is a wrapper around closure values, needed in creating a closure, and there’s no constructor for them. However, you can make a throwaway function where the value you want appears in its closure, inside a closure cell, then just grab that cell. That’s what’s happening.

Anyway, let’s use this thing. Let’s try making a copy of factorial this way:

>>> factcomb = MakeCombinatorForm(makefact())

>>> f2 = YC(factcomb)

>>> f2(6)

720

That’s super cool! Notice that unlike in the last article, we didn’t have to rewrite fact into combinator form; this time MakeCombinatorForm did it for us.

Or did it?

>>> dumpf(f2)

<function fact at 0x7ff8c3c38aa0>

#<function <lambda> at 0x7ff8c3c389b0>

##<function <lambda> at 0x7ff8c3c38668>

###<function h at 0x7ff8c3c38500>

####<function fact at 0x7ff8c3c38848>

#####<function fact at 0x7ff8c3c38848> — Function in its own closure, stop traverse

Whoops, that’s not a combinator! What went wrong?

Closures are grabby

>>> dumpf(factcomb)

<function h at 0x7ff8c3c28ed8>

#<function fact at 0x7ff8c3c28f50>

##<function fact at 0x7ff8c3c28f50> — Function in its own closure, stop traverse

Our original “combinator form” didn’t seem to be in combinator form at all. If you look at h, you can see why;

def MakeCombinatorForm(r):

def h(g):

return CopyFunctionWithClosureReplace(r, g)

return h

While h returns a new copy of r with r replaced by g in the closure, it references the original r via lexical scope. So, r ends up in its closure. The original r is not a combinator. So neither is h, and neither is f2.

Introducing the placeholder

We need to do something to break things up, so that r is not directly referenced in h. To do this, we need a placeholder.

Here’s a modified version of MakeCombinatorForm:

def MakeCombinatorForm(r):

lid = str(uuid.uuid4())

def __placeholder__():

return lid rp = CopyFunctionWithClosureReplace(r, r, __placeholder__) def h(g):

return CopyFunctionWithClosureReplace(rp, __placeholder__, g) return h

Let’s talk it through.

First, notice that calls to CopyFunctionWithClosureReplace have been modified to take three arguments. Now, the first argument is the function to copy, the second one is the function to look for, and the third one is the one to replace the second argument with. So a little more generic. I’ll present the new version below.

The function __placeholder__ needs to be set in place, then later replaced, It turns out that we end up with separate instances of __placeholder__ when we want to test for it (ie: function equality doesn’t help us), so it’s returning a UUID; we’ll use this to check if it’s the same __placeholder__.

(This matters because you can have multiple separate recursions which need to be handled with multiple separate __placeholder__ functions, see below)

Next we create a copy of r, called rp. It is r, with all references to r replaced with __placeholder__.

Lastly, inside h, we copy rp, replacing the __placeholder__ with g. This achieves exactly what the previous version did, but notice that we don’t reference r itself inside h. We only reference rp, __placeholder__, and g. None of those are recursive, so h shouldn’t be either. Let’s check it out:

>>> f1 = makefact()

>>> f3 = YC(MakeCombinatorForm(f1))

>>> f3(6)

720

>>> dumpf(f3)

<function fact at 0x7f329d58ac80>

#<function <lambda> at 0x7f329d58ab90>

##<function <lambda> at 0x7f329d58a758>

###<function h at 0x7f329d58a5f0>

####<function __placeholder__ at 0x7f329d58ac08>

####<function fact at 0x7f329d58acf8>

#####<function __placeholder__ at 0x7f329d58ac08>

>>> import dill

>>> ser = dill.dumps(f3)

>>> f4 = dill.loads(ser)

>>> f4(6)

720

That’s it! Success!

Just recapping what’s been achieved: makefact() returns us a recursive inner function. YC(MakeCombinatorForm(makefact()) makes us an equivalent function which is a combinator. We can then serialise and deserialise that function. The achievement here is that instead of manually performing a combinator rewrite on makefact(), we’ve got an automatic transformation happening.

I should point out here that f3 above actually is recursive at runtime. We haven’t altered the original code, so that recursive form is inescapable. However, the placeholder logic is allowing us to remove the recursive structure until we actually call f3, at which time the recursion is reassembled.

So, you will still get the bad side-effects of recursion, for example you’ll hit limits if you recurse too deeply. But that’s fine for this usage; if we were happy with f1, we were already ok with that.

Oh, I said I’d give you the expanded version of CopyFunctionWithClosureReplace:

def functionsEqual(f1, f2):

return f1 == f2 or \

(

f1.__name__ == "__placeholder__" and

f2.__name__ == "__placeholder__" and

f1() == f2()

) def CopyClosureWithReplace(f, freplace, g, foundf):

if f.__closure__:

newValues = [

(

(

g # (a)

if functionsEqual(value, freplace) else

(

value # (b)

if value in foundf else

CopyFunctionWithClosureReplace(

value, freplace, g, foundf

) # (c)

)

)

if isinstance(value, types.FunctionType) else

value # (d)

)

for value in [

lcell.cell_contents

for lcell in f.__closure__

]

]

return tuple([make_cell(value) for value in newValues])

else:

return None



def CopyFunctionWithClosureReplace(f, freplace, g, foundf = []):

return types.FunctionType(

f.__code__,

f.__globals__,

closure=CopyClosureWithReplace(

f, freplace, g, foundf + [f]

),

)

Not much different, just the extra function parameter freplace, and a change to look for freplace instead of f in CopyFunctionWithClosureReplace.

Oh, and you’ll see the function comparison has been refactored out into its own function, which does a special extra check for the __placeholder__.

The function Combinator

So, our function Combinator could just be this:

def Combinator(f):

return YC(MakeCombinatorForm(f))

And that’ll work almost everywhere. However, there are a couple of problems:

1: It’d be really nice if Combinator could be careful enough to only perform the transformation if f actually requires it.

2: Remember clause (b) in CopyClosureWithReplace()? It’s detecting completely separate cycles which aren’t being fixed by this process, and ignoring them. We should really do something about those.

Only transforming when necessary

To do this, we need an initial check for whether f is a combinator already.

def Combinator(f):

if HasRecursions(f):

f2 = YC(MakeCombinatorForm(f))

else:

f2 = f

return f2

Makes sense. Here’s a simple HasRecursions:

def HasRecursions(f, foundf=[]):

retval = False

if f and isinstance(f, types.FunctionType):

retval = f in foundf

if not retval and f.__closure__:

closureValues = [lcell.cell_contents

for lcell in f.__closure__]

for value in closureValues:

retval = HasRecursions(value, foundf + [f])

if retval:

break

return retval

This is a simple traverse of the closure graph looking for cycles.

Dealing with separate cycles

In Combinator(f) we are dealing with recursion involving f. But what if f isn’t recursive, but instead some function g is recursive, which is somewhere in f’s closure graph?

Check out this example:

def makefactouter():

def fact(n):

return fact(n-1) * n if n > 0 else 1 def fact2(n):

return fact2(n-1) * n if n > 0 else 1 def makefact():

fact2(4) return fact return makefact

That looks obtuse, but these forms really do occur if you use a functional style in anger. Okay, let’s dump it.

>>> mf = makefactouter()

>>> dumpf(mf)

<function makefact at 0x7f329b306398>

#<function fact at 0x7f329d58a668>

##<function fact at 0x7f329d58a668> — Function in its own closure, stop traverse

#<function fact2 at 0x7f329b306848>

##<function fact2 at 0x7f329b306848> — Function in its own closure, stop traverse

Recursive double trouble. This isn’t mutual recursion, it’s two separate instances of recursion. Worse, neither of them actually involved makefact, so if we call Combinator(mf) now, we’ll get this:

>>> mf2 = Combinator(mf)

>>> dumpf (mf2)

<function makefact at 0x7f329b29bed8>

#<function fact at 0x7f329b29b5f0>

##<function fact at 0x7f329b29b230>

###<function fact at 0x7f329b2b9d70>

####<function fact at 0x7f329b2b9d70> — Function in its own closure, stop traverse

#<function fact2 at 0x7f329b29b668>

##<function fact2 at 0x7f329b29b848>

###<function fact2 at 0x7f329b2b9e60>

####<function fact2 at 0x7f329b2b9e60> — Function in its own closure, stop traverse

We still have cycles.

We can fix this with a simple tweak to CopyClosureWithReplace. Remember at b, we were detecting separate cycles and ignoring them? What we really need to do is make them into combinators. We can do that with the function Combinator.

def CopyClosureWithReplace(f, freplace, g, foundf):

if f.__closure__:

newValues = [

(

(

g # (a)

if functionsEqual(value, freplace) else

(

Combinator(value) # (b)

if value in foundf else

CopyFunctionWithClosureReplace(value, freplace, g, foundf) # (c)

)

)

if isinstance(value, types.FunctionType) else

value # (d)

)

for value in [

lcell.cell_contents

for lcell in f.__closure__

]

]

return tuple([make_cell(value) for value in newValues])

else:

return None

Ok, let’s try that troublesome scenario again:

>>> mf3 = Combinator(mf)

>>> dumpf(mf3)

<function makefact at 0x7f329b2c7050>

#<function fact at 0x7f329b2b4c80>

##<function fact at 0x7f329b2b4c08>

###<function <lambda> at 0x7f329b2b4b90>

####<function <lambda> at 0x7f329b2b4b18>

#####<function h at 0x7f329b2b4050>

######<function __placeholder__ at 0x7f329b2b40c8>

#######c5bfdda9–8f24–41f4–8899-a5c895684716

######<function fact at 0x7f329b2b4230>

#######<function __placeholder__ at 0x7f329b2b41b8>

########c5bfdda9–8f24–41f4–8899-a5c895684716

#<function fact2 at 0x7f329b2c70c8>

##<function fact2 at 0x7f329b2b4e60>

###<function <lambda> at 0x7f329b2b4f50>

####<function <lambda> at 0x7f329b2b4ed8>

#####<function h at 0x7f329b2b42a8>

######<function __placeholder__ at 0x7f329b2b4cf8>

#######5c4a1f0c-e2ce-4b45–9f8e-cce487fae87d

######<function fact2 at 0x7f329b2b4de8>

#######<function __placeholder__ at 0x7f329b2b4d70>

########5c4a1f0c-e2ce-4b45–9f8e-cce487fae87d

Alright, that’s big, but it’s a combinator. Does it work?

>>> mf3()(5)

120

Cool! Does it serialise?

>>> ser = dill.dumps(mf3)

>>> mf4 = dill.loads(ser)

>>> mf4()(5)

120

Well, hot damn.

All the code

So where did we get to? Here’s the final, finished code:

import types

import uuid def make_cell(value):

return (lambda x: lambda: x)(value).__closure__[0] def functionsEqual(f1, f2):

return f1 == f2 or \

(

f1.__name__ == "__placeholder__" and

f2.__name__ == "__placeholder__" and

f1() == f2()

)



def CopyClosureWithReplace(f, freplace, g, foundf):

if f.__closure__:

newValues = [

(

(

g # (a)

if functionsEqual(value, freplace) else

(

Combinator(value) # (b)

if value in foundf else

CopyFunctionWithClosureReplace(

value, freplace, g, foundf

) # (c)

)

)

if isinstance(value, types.FunctionType) else

value # (d)

)

for value in [

lcell.cell_contents

for lcell in f.__closure__

]

]

return tuple([make_cell(value) for value in newValues])

else:

return None



def CopyFunctionWithClosureReplace(f, freplace, g, foundf = []):

return types.FunctionType(

f.__code__,

f.__globals__,

closure=CopyClosureWithReplace(

f, freplace, g, foundf + [f]

)

) def MakeCombinatorForm(r):

lid = str(uuid.uuid4())

def __placeholder__():

return lid rp = CopyFunctionWithClosureReplace(r, r, __placeholder__) def h(g):

return CopyFunctionWithClosureReplace(

rp, __placeholder__, g

)

return h def HasRecursions(f, foundf=[]):

retval = False

if f and isinstance(f, types.FunctionType):

retval = f in foundf

if not retval and f.__closure__:

closureValues = [lcell.cell_contents

for lcell in f.__closure__]

for value in closureValues:

retval = HasRecursions(value, foundf + [f])

if retval:

break

return retval def YC(f):

return (

lambda x: x(x)

)(

lambda y:

f(

lambda *args, **kwargs:

y(y)(*args, **kwargs)

)

) def Combinator(f):

if HasRecursions(f):

f2 = YC(MakeCombinatorForm(f))

else:

f2 = f return f2

There’s not all that much of it, so that’s a good sign? I hope so.

Anyway, a final recap:

If you’ve manually rewritten your recursive functions in combinator form, then you can just go

r1 = YC(rcomb)

If you’ve want this done automatically, you can use Combinator, like this

r2 = Combinator(r)

r wont serialise, but r1 and r2 will.

—

This code needs its own repo, and I’d like to make a version of dill that incorporates Combinator transparently. I’ll do those things, then make a follow up post with details. Thanks for reading this!