Deforestation: Techniques for Removing Intermediate Data Structures

PhD Qualifying Examination


Title: "Deforestation: Techniques for Removing Intermediate Data Structures"

by

Mr. Yijia CHEN


Abstract:

Deforestation is a compiler optimization that removes intermediate data 
structure allocations from functional programs to improve their efficiency. 
Functional programming style usually encourages programmers to write smaller 
functions and reuse them to build larger ones. This makes code more modular 
and helps to improve the clarity and robustness of programs because 
programmers can focus on smaller tasks and then compose those functions 
later. However, because values need to be created and passed between 
different functions to "glue" them together, programming in this style often 
leads to more allocations of intermediate data structures, which harms the 
efficiency of programs. This motivates the idea of deforestation, which aims 
at removing unnecessary allocations of data structures. In this article, we 
study various approaches to tackle the problem by following two general 
frameworks. The first class of methods falls into the unfold/fold framework, 
where function definitions are extensively inlined in the unfolding process 
to approximate the execution of input programs, and output programs are then 
extracted with the help of the folding process, which ties recursive knots 
and generates more efficient function definitions. The second class of 
methods belongs to the shortcut fusion framework, where special combinators 
capturing the essence of producing and consuming data structures are 
designed. These combinators can usually be deforested easily, and general 
data structure manipulating functions can be rewritten in them to be 
transformed. We illustrate the core ideas for both frameworks, demonstrate a 
comparison between them, and propose a novel method leveraging subtype 
inference to solve the problem.


Date:                   Wednesday, 3 July 2024

Time:                   9:00am - 11:00am

Venue:                  Room 3494
                        Lifts 25/26

Committee Members:      Dr. Lionel Parreaux (Supervisor)
                        Dr. Jiasi Shen (Chairperson)
                        Prof. Shing-Chi Cheung
                        Dr. Amir Goharshady