Semantic Soundness for Algebraic Union, Intersection, Negation, and Equi-recursive Types
Boolean-algebraic subtyping (BAS) is a powerful subtyping approach introduced in 2022 as the “secret sauce” enabling backtracking-free principal type inference in the MLstruct research language, a structurally-typed functional programming language with tagged records, tag inheritance, record subtyping, and tag-based pattern matching. By supporting distributive intersection, union, negation, and equi-recursive types, MLscript can express powerful programming patterns, such as subtyped extensible variants, without needing row variables. But the use of atypical subtyping rules that partially violate the usual understanding of intersection and union types, the mutual distributivity between them, and the complexity of coinductive reasoning for equi-recursive types have collectively made the study of BAS difficult. The syntactic soundness proofs provided in the original work are dauntingly complicated and long-winded, obscuring the intuitive meaning of BAS.
In this paper, we distill the simple essence of Boolean-algebraic subtyping: we discover that BAS can be understood under the light of five characteristic Boolean homomorphisms defined on types in context. Two of these map to power sets of simpler objects; the rest map back to types, but under an unguarded coinductive assumptions context. Together, these five homomorphisms let us prove rather directly that BAS is sound, in that it does not relate constructors of incompatible shapes. These homomorphisms are characteristic in the sense that they are sufficient to capture the meaning of subyping: we prove that if an inequality holds between two types under all five homomorphisms, then subtyping holds between these two types in the original context. This directly suggests a new subtyping decision procedure for BAS, which avoids some inefficiencies in the original algorithm, although it still has exponential worst-case time complexity. We prove that the subtyping problem is in fact NP-hard even without recursive types. Finally, we discover that BAS is already powerful enough to encode the removal of a field from a type. This enables us to support extensible records through one new term form and one new typing rule, but, perhaps surprisingly, no changes to subtyping at all.
Our new approach to the semantics of Boolean-algebraic subtyping sheds some light on the core of MLstruct’s type system. It could be adapted to other languages with algebraically-flavored subtyping rules, such as Scala 3 and Ceylon, making their design and verification more approachable. Tellingly, all our subtyping soundness proofs fit inside the main body of this paper, with only some administrative lemmas relegated to the appendix.