A Type Inference and Elaboration Technique for Removing Intermediate Data Structures
Deforestation is a compiler optimization that removes intermediate data structure allocations from functional programs to improve their efficiency. This is an old idea, but previous approaches have proved limited or impractical — they either only worked on compositions of predefined combinators (shortcut fusion), or involved the aggressive unfolding of recursive definitions until a depth limit was reached or a reoccurring pattern was found to tie the recursive knot, resulting in impractical algorithmic complexity and large amounts of code duplication. We present Lumberhack, a general-purpose deforestation approach for purely functional call-by-value programs. Lumberhack uses subtype inference to reason about data structure production and consumption and uses an elaboration pass to fuse the corresponding recursive definitions. It fuses large classes of mutually recursive definitions while avoiding much of the unproductive (and sometimes counter-productive) code duplication inherent in previous approaches. We prove the soundness of Lumberhack using logical relations and experimentally demonstrate significant speedups in the standard nofib benchmark suite. We manually adapted nofib programs to call-by-value semantics and compiled them using the OCaml compiler. The average speedup over the 38 benchmarked programs is 8.2% while the average code size increases by just about 1.79x. In particular, 19 programs see their performance mostly unchanged, 17 programs improve significantly (by an average speedup of 16.6%), and only three programs visibly worsen (by an average slowdown of 1.8%). As a point of comparison, we measured that the well-proven but semi-manual list fusion technique of the Glasgow Haskell Compiler (GHC), which only works for call-by-need programs, had an average speedup of 6.5%. Our technique is in its infancy still and misses many deforestation opportunities. We are confident that further refinements to the core technique will yield greater performance improvements in the future.