Deforestation: Techniques for Removing Intermediate Data Structures

PhD Qualifying Examination


Title: "Deforestation: Techniques for Removing Intermediate Data Structures"

by

Mr. Yijia CHEN


Abstract:

Deforestation is a compiler optimization that removes intermediate data 
structure allocations from functional programs to improve their efficiency. 
Functional programming style usually encourages programmers to write smaller 
functions and reuse them to build larger ones. This makes code more modular and 
helps to improve the clarity and robustness of programs because programmers can 
focus on smaller tasks and then compose those functions later. However, because 
values need to be created and passed between different functions to "glue" them 
together, programming in this style often leads to more allocations of 
intermediate data structures, which harms the efficiency of programs. This 
motivates the idea of deforestation, which aims at removing unnecessary 
allocations of data structures. In this article, we study various approaches to 
tackle the problem by following two general frameworks. The first class of 
methods falls into the unfold/fold framework, where function definitions are 
extensively inlined in the unfolding process to approximate the execution of 
input programs, and output programs are then extracted with the help of the 
folding process, which ties recursive knots and generates more efficient 
function definitions. The second class of methods belongs to the shortcut 
fusion framework, where special combinators capturing the essence of producing 
and consuming data structures are designed. These combinators can usually be 
deforested easily, and general data structure manipulating functions can be 
rewritten in them to be transformed. We illustrate the core ideas for both 
frameworks, demonstrate a comparison between them, and propose a novel method 
leveraging subtype inference to solve the problem.


Date:                   Wednesday, 3 July 2024

Time:                   9:00am - 11:00am

Venue:                  Room 3494
                        Lifts 25/26

Committee Members:      Dr. Lionel Parreaux (Supervisor)
                        Dr. Jiasi Shen (Chairperson)
                        Prof. Shing-Chi Cheung
                        Dr. Amir Goharshady