equivalent of EXPLAIN PLAN with GHC?

joshua: > Hello, I'm quite new to Haskell, but experienced in other languages (C, > Python, Ruby, SQL, etc). I am interested in Haskell because I've heard > that the language is capable of lots of optimizations based on laziness, > and I want to learn more about that. > > I dug in with Project Euler problem #1, and wrote: > > main = print (show (sum [x | x <- [3..999], x `mod` 3 == 0 || x `mod` 5 == 0])) > > So far so good, but I want to have some way of observing what > optimizations GHC has performed. Most notably, in this example I want > to know if the list was ever actually constructed in memory. The "sum" > function only needs the elements one at a time, in order, so if Haskell > and GHC are everything I've heard about them, I would fully expect the > list construction to be optimized out. :) > > Unfortunately I was not able to see any way of examining ghc's output to > determine whether it had performed this optimization. The C it produced > with '-fvia-C -C' was totally unreadable -- it looked like something > from the IOCCC. :( I couldn't find any way to match up any of the code > it had generated with code I had written. My attempts to objdump the > binaries was similarly unproductive. > > Is there any kind of intermediate form that a person can examine to see > how their code is being optimized? Anything like EXPLAIN PLAN in SQL? > It would make it much easier to understand the kinds of optimizations > Haskell can perform. > > I'm not looking so much for profiling -- obvious this program is trivial > and takes no time. I just want to better understand what kind of > optimizations are possible given Haskell's language model. > So the optimization you're looking for here is fusion of some kind. GHC ships with build/foldr fusion, and there are libraries for an alternative system, stream fusion. GHC uses an intermediate representation called 'Core', which is a mini-Haskell, essentially, that is optimized repeatedly via type-preserving transformations. You can inspect this with a number of tools, including "ghc-core" (available on Hackage). Now, you're example uses a list comprehension (which is translated into an enumFromTo call, and a call to filter. It also uses a call to sum, which is a non-fusing left-fold under build/foldr fusion, but fuses under stream fusion. I'll desugar your code explicitly, and translate the calls from build/foldr to stream-fusible functions: $ cabal install uvector import Data.Array.Vector main = print . sumU . filterU (\x -> x `mod` 3 == 0 || x `mod` 5 == 0) $ enumFromToU 3 (999 :: Int) Running through ghc-core we see: 146 PreInlineUnconditionally 320 PostInlineUnconditionally 84 UnfoldingDone 18 RuleFired 7 +# 1 -# 1 <=# 2 ==#->case 1 ># 3 SC:$wfold0 1 fromIntegral/Int->Int 2 streamU/unstreamU 12 LetFloatFromLet 1 EtaReduction 210 BetaReduction 8 CaseOfCase 94 KnownBranch 4 CaseMerge 5 CaseElim 6 CaseIdentity 1 FillInCaseDefault 18 SimplifierDone Showing what transformations happened. Notably, 2 occurences of the "streamU/unstreamU" transformation, to remove intermediate structures. The final code looks like: $s$wfold :: Int# -> Int# $s$wfold = \ (sc_s19l :: Int#) -> case modInt# (-9223372036854775807) 3 of wild21_a14L { __DEFAULT -> case modInt# (-9223372036854775807) 5 of wild211_X159 { __DEFAULT -> $wfold sc_s19l (-9223372036854775806); 0 -> $wfold (+# sc_s19l (-9223372036854775807)) (-9223372036854775806) }; 0 -> $wfold (+# sc_s19l (-9223372036854775807)) (-9223372036854775806) } $wfold :: Int# -> Int# -> Int# $wfold = \ (ww_s189 :: Int#) (ww1_s18d :: Int#) -> case ># ww1_s18d 999 of wild_a15N { False -> case ww1_s18d of wild2_a14K { __DEFAULT -> case modInt# wild2_a14K 3 of wild21_a14L { __DEFAULT -> case modInt# wild2_a14K 5 of wild211_X159 { __DEFAULT -> $wfold ww_s189 (+# wild2_a14K 1); 0 -> $wfold (+# ww_s189 wild2_a14K) (+# wild2_a14K 1) }; 0 -> $wfold (+# ww_s189 wild2_a14K) (+# wild2_a14K 1) }; (-9223372036854775808) -> case modInt# (-9223372036854775808) 3 of wild21_a14N { __DEFAULT -> case lvl_r19G of wild211_a14x { False -> $s$wfold ww_s189; True -> $s$wfold (+# ww_s189 (-9223372036854775808)) }; 0 -> $s$wfold (+# ww_s189 (-9223372036854775808)) } }; True -> ww_s189 } Which might take a while to understand, but the key thing is the types at the top level are Int# -- they don't allocate [Int], but instead use unboxed, machine-level int values. Optimization succesful. There are papers and manuals describing "GHC Core" - the intermediate form - on haskell.org. For fusion, google for "deforestation" or "stream fusion". -- Don