

An easy way to improve performance is to call something fewer times, which requires understanding how many times something gets called. One topic I find myself regularly explaining is how lambda expressions under let expressions affect sharing. Consider the two following examples:

Example 1

    f x y = sqrt x + y

    result = f 1 2 + f 1 4

Example 2

    f x = let sqrt_x = sqrt x in \y -> sqrt_x + y

    result = let f_1 = f 1 in f_1 2 + f_1 4

In each example, how many times is sqrt executed to compute result? (Assume no advanced optimisations - these often break down on larger examples.)

In Example 1 we execute sqrt twice, while in Example 2 we execute sqrt once. To go from Example 1 to Example 2 we need to make two changes:

Step 1: Rewrite f to compute sqrt after one argument instead of two.

Step 2: Rewrite result to share the result of f with one argument.

Performing either rewrite alone will still result in sqrt being executed twice.

Let's take a look at the original definition of f:

    f x y = sqrt x + y

Rewriting this function in English, we can describe it as:

    given x and y, compute sqrt x + y

But the computation of sqrt x does not depend on y. If the computation of sqrt x is expensive, and if we know the function will often be called with the same x for many different values of y, it is better to describe it as:

    given x, compute sqrt x, then given y, add that value to y

The Haskell syntax for this description is:

    f = \x -> let sqrt_x = sqrt x in \y -> sqrt_x + y

Which would usually be written in the equivalent declaration form as:

    f x = let sqrt_x = sqrt x in \y -> sqrt_x + y

If we look at the definition of result:

    result = f 1 2 + f 1 4

We see that the subexpression f 1 occurs twice. We can perform common subexpression elimination (CSE) and write:

    result = let f_1 = f 1 in f_1 2 + f_1 4

With the original definition of f, commoning up f 1 would have had no performance benefit - after f was applied to 1 argument it did nothing but wait for the second argument. However, with the revised definition of f, the value f_1 will create the computation of sqrt 1, which will be performed only once when executed by f_1 2 and f_1 4.

This optimisation technique can be described as:

Step 1: Rewrite the function to perform some computation before all arguments are supplied.

Step 2: Share the partially applied function.

Crucially, the function in Step 1 must take its arguments in an order that allows computation to be performed incrementally.

In previous versions of Hoogle, the function I wrote to resolve type synonyms (e.g. type String = [Char]) was:

    resolveSynonyms :: [Synonym] -> TypeSig -> TypeSig

Given a list of type synonyms and a type signature, it returns the type signature with all synonyms expanded out. However, searching through a list of type synonyms is expensive - it is more efficient to compute a table allowing fast lookup by synonym name. Therefore, I used the optimisation technique above to write:

    resolveSynonyms synonyms = let info = buildSynonymTable synonyms in \x -> transformSynonyms info x

This technique worked well, especially given that the list of synonyms was usually constant. However, from simply looking at the type signatures, someone else is unlikely to guess that resolveSynonyms should be partially applied where possible. An alternative is to make the sharing more explicit in the types, and provide:

    data SynonymTable

    buildSynonymTable :: [Synonym] -> SynonymTable

    resolveSynonyms :: SynonymTable -> TypeSig -> TypeSig

The disadvantage is the increase in the size of the API - we have gone from one function to two functions and a data type. Something that used to take one function call now takes two.

I think all Haskell programmers benefit from understanding how the interaction of lambda and let affects sharing. Pushing lambda under let is often a useful optimisation technique, particularly when the resulting function is used in a map. However, I wouldn't usually recommend exporting public APIs that rely on partial application to get acceptable performance - it's too hard to discover.
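
For readers who want to check the sqrt counts from Example 1 and Example 2 themselves, one option is to wrap sqrt in Debug.Trace. The sketch below is only an illustration: tracedSqrt, f1, f2, result1 and result2 are names invented here (f1 and f2 stand for the two definitions of f), and it assumes the program is run in GHCi or compiled without optimisation, since an optimising compiler may change how much is shared.

```haskell
import Debug.Trace (trace)

-- Hypothetical stand-in for an expensive computation: behaves like sqrt,
-- but writes a message to stderr each time it is actually evaluated.
tracedSqrt :: Double -> Double
tracedSqrt x = trace ("evaluating sqrt " ++ show x) (sqrt x)

-- Example 1: sqrt is only reached once both arguments are supplied.
f1 :: Double -> Double -> Double
f1 x y = tracedSqrt x + y

result1 :: Double
result1 = f1 1 2 + f1 1 4

-- Example 2: the lambda sits under the let, so sqrt_x belongs to the
-- function returned by (f2 x) and is shared by every later call of it.
f2 :: Double -> Double -> Double
f2 x = let sqrt_x = tracedSqrt x in \y -> sqrt_x + y

result2 :: Double
result2 = let f_1 = f2 1 in f_1 2 + f_1 4

main :: IO ()
main = do
  putStrLn "Example 1:"
  print result1
  putStrLn "Example 2:"
  print result2
```

Under those assumptions the trace message should appear twice for Example 1 and once for Example 2. Swapping result2 for f2 1 2 + f2 1 4 (Step 1 without Step 2) should bring the second message back, which matches the claim that either rewrite alone is not enough.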
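
The point about using the partially applied function in a map can be sketched as follows. This is not Hoogle's code: Synonym, TypeSig and SynonymTable are replaced by heavily simplified, hypothetical definitions (a type signature is just a list of type names), transformSynonyms is a guess at a minimal single-pass expansion, and the trace call is there only to show when the table gets built.

```haskell
import qualified Data.Map as Map
import Debug.Trace (trace)

-- Hypothetical, simplified stand-ins for the real types: a synonym maps
-- a name to its expansion, and a signature is a list of type names.
type Synonym = (String, String)
type TypeSig = [String]

newtype SynonymTable = SynonymTable (Map.Map String String)

-- Build the lookup table; the trace message shows when this work happens.
buildSynonymTable :: [Synonym] -> SynonymTable
buildSynonymTable synonyms =
  trace "building synonym table" (SynonymTable (Map.fromList synonyms))

-- Replace each name that has a table entry by its expansion (one pass only).
transformSynonyms :: SynonymTable -> TypeSig -> TypeSig
transformSynonyms (SynonymTable table) =
  map (\name -> Map.findWithDefault name name table)

-- Lambda under let: the table belongs to the partially applied function.
resolveSynonyms :: [Synonym] -> TypeSig -> TypeSig
resolveSynonyms synonyms =
  let info = buildSynonymTable synonyms
  in \x -> transformSynonyms info x

main :: IO ()
main = do
  let synonyms = [("String", "[Char]"), ("Rational", "Ratio Integer")]
      sigs = [["Int", "String"], ["Rational", "Bool"], ["String", "String"]]
  -- map receives (resolveSynonyms synonyms) as a single shared function,
  -- so the table should be built once rather than once per signature.
  print (map (resolveSynonyms synonyms) sigs)
```

When run, the "building synonym table" message should appear once even though three signatures are resolved. Writing resolveSynonyms synonyms x = transformSynonyms (buildSynonymTable synonyms) x instead would typically rebuild the table for each signature, which is exactly the behaviour a caller cannot see from the type alone.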