# Choice Calculus

The choice calculus is a simple, formal language for representing *variation*. It supports research in areas that deal with variation in software, such as software product lines. Within this domain, it intends to fulfill a role similar to the lambda calculus in programming languages research, providing a minimal basis for communicating ideas and sharing results.

The core choice calculus is very small—just two constructs, *choices* and generic abstract syntax tree—but it can be easily extended with new language features that support different ways of developing, maintaining, and analyzing variability. The core choice calculus is also generic, emphasizing that variation can often be treated orthogonally to other concerns, but it can be instantiated by different object languages and data types when they interact fundamentally with variability.

The primary reference for the choice calculus is the original TOSEM paper [3], however, the most comprehensive and up-to-date introduction is my PhD thesis [1]. The tutorial from the 2011 GTTSE Summer School [2] is also a good (and hopefully fun) entry point. The tutorial promotes the idea of “variation anywhere” and of generically lifting non-variational algorithms to variational ones.

Using the choice calculus we have developed safe transformations, quality criteria, and new abstraction techniques for variational programs. We have also developed a notion of variational types and extended the Damas-Milner algorithm to perform efficient type inference on a variational lambda calculus [8,9].

The choice calculus also has its own web page: choicecalculus.org.

## Introductions to the Choice Calculus

- The Choice Calculus: A Formal Language of Variation[Abstract, PDF]PhD thesis, Oregon State University, 2013
In this thesis I present the choice calculus, a formal language for representing variation in software and other structured artifacts. The choice calculus is intended to support variation research in a way similar to the lambda calculus in programming language research. Specifically, it provides a simple formal basis for presenting, proving, and communicating theoretical results. It can serve as a common language of discourse for researchers working on different views of similar problems and provide a shared back end in tools.

This thesis collects a large amount of work on the choice calculus. It defines the syntax and denotational semantics of the language along with modular language extensions that add features important to variation research. It presents several theoretical results related to the choice calculus, such as an equivalence relation that supports semantics-preserving transformations of choice calculus expressions, and a type system for ensuring that an expression is well formed. It also presents a Haskell DSEL based on the choice calculus for exploring the concept of variational programming.

- Variation Programming with the Choice Calculus[Abstract, PDF, Code]Generative and Transformational Techniques in Software Engineering IV (GTTSE 2011), Revised and Extended Papers, LNCS vol. 7680, Springer, 2013, 55–100
The choice calculus provides a language for representing and transforming variation in software and other structured documents. Variability is captured in localized choices between alternatives. The space of all variations is organized by dimensions, which provide scoping and structure to choices. The variation space can be reduced through a process of selection, which eliminates a dimension and resolves all of its associated choices by replacing each with one of their alternatives. The choice calculus also allows the definition of arbitrary functions for the flexible construction and transformation of all kinds of variation structures. In this tutorial we will first present the motivation, general ideas, and principles that underlie the choice calculus. This is followed by a closer look at the semantics. We will then present practical applications based on several small example scenarios and consider the concepts of “variation programming” and “variation querying”. The practical applications involve work with a Haskell library that supports variation programming and experimentation with the choice calculus.

- The Choice Calculus: A Representation for Software Variation[Abstract, PDF]ACM Trans. on Software Engineering and Methodology (TOSEM), vol. 21, num. 1, 2011, 6:1–6:27
Many areas of computer science are concerned with some form of variation in software–from managing changes to software over time, to supporting families of related artifacts. We present the choice calculus, a fundamental representation for software variation that can serve as a common language of discourse for variation research, filling a role similar to the lambda calculus in programming language research. We also develop an associated theory of software variation, including sound transformations of variation artifacts, the definition of strategic normal forms, and a design theory for variation structures, which will support the development of better algorithms and tools.

## Applications of the Choice Calculus

- Variational Databases[Abstract, PDF]Int. Symp. on Database Programming Languages (DBPL), ACM, 2017, 11:1–11:4
Data variations are prevalent in real-world applications. For example, software vendors handle variations in the business requirements, conventions, and environmental settings of a software product using hundreds of features each combination of which creates a different version of the product. In database-backed software, the database of each version may have a different schema and different content. Variations in the value and representation of each element in a dataset give rise to numerous variants in these applications. Users often would like to express information needs over all such variants. For example, a software vendor would like to perform common tests over all versions of its product, e.g., whether each relation in a relational database has a primary key. Hence, users need a query interface that hides the variational nature of the data and processes a query over all variations of a dataset. We propose a novel abstraction called a

*variational database*that provides a compact and structured representation of general forms of data variations and enables users to query database variations easily. - Concepts, Operations, and Feasibility of a Projection-Based Variation Control System[Abstract, PDF, Code/Data]IEEE Int. Conf. on Software Maintenance and Evolution (ICSME), 2016, 323–333
Highly configurable software often uses preprocessor annotations to handle variability. However, understanding, maintaining, and evolving code with such annotations is difficult, mainly because a developer has to work with all variants at a time. Dedicated methods and tools that allow working on a subset of all variants could ease the engineering of highly configurable software. We investigate the potential of one kind of such tools: projection-based variation control systems integrated with projectional editors. For such systems we aim to understand: (i) what end-user operations they need to support, and (ii) whether they can realize the actual evolution of real-world, highly configurable software. We conduct an experiment that investigates variability-related evolution patterns and that evaluates the feasibility of a projection-based variation control system by replaying parts of the history of a real-world 3D printer firmware project. Among others, we show that the prototype variation control system does indeed support the evolution of a highly configurable system and that in general, it does not degrade the code.

- Projectional Editing of Variational SoftwareACM SIGPLAN Int. Conf. on Generative Programming and Component Engineering (GPCE), 2014, 29–38Best paper[Abstract, PDF]
Editing the source code of variational software is complicated by the presence of variation annotations, such as #ifdef statements, and by code that is only included in some configurations. When editing some configurations and not others, it would be easier to edit a simplified version of the source code that includes only the configurations we currently care about. In this paper, we present a projectional editing model for variational software. Using our approach, a programmer can partially configure a variational program, edit this simplified view of the code, and then automatically update the original, fully variational source code. The model is based on an isolation principle where edits affect only the variants that are visible in the view. We show that this principle has several nice properties that are suggested by related work on bidirectional transformations.

- Extending Type Inference to Variational Programs[Abstract, PDF]ACM Trans. on Programming Languages and Systems (TOPLAS), vol. 36, num. 1, 2014, 1:1–1:54
Through the use of conditional compilation and related tools, many software projects can be used to generate a huge number of related programs. The problem of typing such variational software is difficult. The brute-force strategy of generating all variants and typing each one individually is (1) usually infeasible for efficiency reasons and (2) produces results that do not map well to the underlying variational program. Recent research has focused mainly on efficiency and addressed only the problem of type checking. In this work we tackle the more general problem of variational type inference and introduce variational types to represent the result of typing a variational program. We introduce the variational lambda calculus (VLC) as a formal foundation for research on typing variational programs. We define a type system for VLC in which VLC expressions are mapped to correspondingly variational types. We show that the type system is correct by proving that the typing of expressions is preserved over the process of variation elimination, which eventually results in a plain lambda calculus expression and its corresponding type. We identify a set of equivalence rules for variational types and prove that the type unification problem modulo these equivalence rules is unitary and decidable; we also present a sound and complete unification algorithm. Based on the unification algorithm, the variational type inference algorithm is an extension of algorithm W. We show that it is sound and complete and computes principal types. We also consider the extension of VLC with sum types, a necessary feature for supporting variational data types, and demonstrate that the previous theoretical results also hold under this extension. Finally, we characterize the complexity of variational type inference and demonstrate the efficiency gains over the brute-force strategy.

- An Error-Tolerant Type System for Variational Lambda Calculus[Abstract, PDF]ACM SIGPLAN Int. Conf. on Functional Programming (ICFP), 2012, 29–40
Conditional compilation and software product line technologies make it possible to generate a huge number of different programs from a single software project. Typing each of these programs individually is usually impossible due to the sheer number of possible variants. Our previous work has addressed this problem with a type system for variational lambda calculus (VLC), an extension of lambda calculus with basic constructs for introducing and organizing variation. Although our type inference algorithm is more efficient than the brute-force strategy of inferring the types of each variant individually, it is less robust since type inference will fail for the entire variational expression if any one variant contains a type error. In this work, we extend our type system to operate on VLC expressions containing type errors. This extension directly supports locating ill-typed variants and the incremental development of variational programs. It also has many subtle implications for the unification of variational types. We show that our extended type system possesses a principal typing property and that the underlying unification problem is unitary. Our unification algorithm computes partial unifiers that lead to result types that (1) contain errors in as few variants as possible and (2) are most general. Finally, we perform an empirical evaluation to determine the overhead of this extension compared to our previous work, to demonstrate the improvements over the brute-force approach, and to explore the effects of various error distributions on the inference process.

- #ifdef Confirmed Harmful: Promoting Understandable Software Variation[Abstract, PDF]IEEE Int. Symp. on Visual Languages and Human-Centric Computing (VL/HCC), 2011, 143–150
Maintaining variation in software is a difficult problem that poses serious challenges for the understanding and editing of software artifacts. Although the C preprocessor (CPP) is often the default tool used to introduce variability to software, because of its simplicity and flexibility, it is infamous for its obtrusive syntax and has been blamed for reducing the comprehensibility and maintainability of software. In this paper, we address this problem by developing a prototype for managing software variation at the source code level. We evaluate the difference between our prototype and CPP with a user study, which indicates that the prototype helps users reason about variational code faster and more accurately than CPP. Our results also support the research of others, providing evidence for the effectiveness of related tools, such as CIDE and FeatureCommander.

- Program Fields for Continuous Software[Abstract, PDF]ACM SIGSOFT Workshop on the Future of Software Engineering Research (FoSER), 2010, 105–108
We propose program fields, a formal representation for groups of related programs, as a new abstraction to support future software engineering research in several areas. We will discuss opportunities offered by program fields and research questions that have to be addressed.

## Extensions to the Choice Calculus

- A Calculus for Variational Programming[Abstract, PDF]European Conf. on Object-Oriented Programming (ECOOP), LIPIcs vol. 56, 2016, 6:1–6:28
Variation is ubiquitous in software. Many applications can benefit from making this variation explicit, then manipulating and computing with it directly—a technique we call “variational programming”. This idea has been independently discovered in several application domains, such as efficiently analyzing and verifying software product lines, combining bounded and symbolic model-checking, and computing with alternative privacy profiles. Although these domains share similar core problems, and there are also many similarities in the solutions, there is no dedicated programming language support for variational programming. This makes the various implementations tedious, prone to errors, hard to maintain and reuse, and difficult to compare.

In this paper we present a calculus that forms the basis of a programming language with explicit support for representing, manipulating, and computing with variation in programs and data. We illustrate how such a language can simplify the implementation of variational programming tasks. We present the syntax and semantics of the core calculus, a sound type system, and a type inference algorithm that produces principal types.

- Formula Choice Calculus[Abstract, PDF, Code]Int. Workshop on Feature-Oriented Software Development (FOSD), ACM, 2016, 49–57
The choice calculus is a simple metalanguage and associated theory that has been successfully applied to several problems of interest to the feature-oriented software development community. The formal presentation of the choice calculus essentially restricts variation points, called choices, to vary based on the inclusion or not of a single feature, while in practice variation points may depend on several features. Therefore, in both theoretical applications of the choice calculus, and in tools inspired by the choice calculus, the syntax of choices has often been generalized to depend on an arbitrary propositional formula of features. The purpose of this paper is to put this syntactic generalization on more solid footing by also generalizing the associated theory. Specifically, after defining the syntax and denotational semantics of the formula choice calculus (FCC), we define and prove the soundness of a syntactic equivalence relation on FCC expressions. This effort validates previous work which has implicitly assumed the soundness of rules in the equivalence relation, and also reveals several rules that are specific to FCC. Finally, we describe some further generalizations to FCC and their limits.

- Adding Configuration to the Choice Calculus[Abstract, PDF]Int. Workshop on Variability Modelling of Software-Intensive Systems (VaMoS), ACM, 2013, 13:1–13:8
The choice calculus is a formal language for representing variation in software artifacts. Variability is organized in the choice calculus through the use of dimensions, where each dimension represents a decision that must be made in order to obtain a particular variant. However, the process of selecting alternatives from dimensions was relegated to an external operation. This precludes many interesting variation and reuse patterns, such as nested product lines, and theoretical results, such as a syntactic description of configuration, that would be possible if selection were a part of the language itself.

In this paper we add a selection operation to the choice calculus and illustrate how that increases the expressiveness of the calculus. We investigate some alternative semantics of this operation and study their impact and utility. Specifically, we will examine selection in the context of static and dynamically scoped dimension declarations, as a well as a modest and greedy form of dimension elimination. We also present a design for a type system to ensure configuration safety and modularity of nested product lines.

- A Calculus for Modeling and Implementing Variation[Abstract, PDF]ACM SIGPLAN Int. Conf. on Generative Programming and Component Engineering (GPCE), 2012, 132–140
We present a formal calculus for modeling and implementing variation in software. It unifies the compositional and annotative approaches to feature implementation and supports the development of abstractions that can be used to directly relate feature models to their implementation. Since the compositional and annotative approaches are complementary, the calculus enables implementers to use the best combination of tools for the job and focus on inherent feature interactions, rather than those introduced by biases in the representation. The calculus also supports the abstraction of recurring variational patterns and provides a metaprogramming platform for organizing variation in artifacts.

## Variational Data Structures

- A Choice of Variational Stacks: Exploring Variational Data Structures[Abstract, PDF]Int. Workshop on Variability Modelling of Software-Intensive Systems (VaMoS), ACM, 2017, 28–35
Many applications require not only representing variability in software and data, but also computing with it. To do so efficiently requires variational data structures that make the variability explicit in the underlying data and the operations used to manipulate it. Variational data structures have been developed ad hoc for many applications, but there is little general understanding of how to design them or what tradeoffs exist among them. In this paper, we strive for a more systematic exploration and analysis of a variational data structure. We want to know how different design decisions affect the performance and scalability of a variational data structure, and what properties of the underlying data and operation sequences need to be considered. Specifically, we study several alternative designs of a variational stack, a data structure that supports efficiently representing and computing with multiple variants of a plain stack, and that is a common building block in many algorithms. The different variational stacks are presented as a small product line organized by three design decisions. We analyze how these design decisions affect the performance of a variational stack with different usage profiles. Finally, we evaluate how these design decisions affect the performance of the variational stack in a real-world scenario: in the interpreter when executing real software containing variability.

- Variational Data Structures: Exploring Trade-Offs in Computing With Variability[Abstract, PDF]ACM SIGPLAN Symp. on New Ideas in Programming and Reflections on Software (Onward!), 2014, 213–226
Variation is everywhere, and in the construction and analysis of customizable software it is paramount. In this context, there arises a need for variational data structures for efficiently representing and computing with related variants of an underlying data type. So far, variational data structures have been explored and developed ad hoc. This paper is a first attempt and a call to action for systematic and foundational research in this area. Research on variational data structures will benefit not only customizable software, but many other application domains that must cope with variability. In this paper, we show how support for variation can be understood as a general and orthogonal property of data types, data structures, and algorithms. We begin a systematic exploration of basic variational data structures, exploring the tradeoffs among different implementations. Finally, we retrospectively analyze the design decisions in our own previous work where we have independently encountered problems requiring variational data structures.

- An Abstract Representation of Variational Graphs[Abstract, PDF]Int. Workshop on Feature-Oriented Software Development (FOSD), ACM, 2013, 25–32
In the context of software product lines, there is often a need to represent graphs containing variability. For example, extending traditional modeling techniques or program analyses to variational software requires a corresponding notion of variational graphs. In this paper, we introduce a general model of variational graphs and a theoretical framework for discussing variational graph algorithms. Specifically, we present an abstract syntax based on tagging for succinctly representing variational graphs and other data types relevant to variational graph algorithms, such as variational sets and paths. We demonstrate how (non-variational) graph algorithms can be generalized to operate on variational graphs, to accept variational inputs, and produce variational outputs. Finally, we discuss a filtering operation on variational graphs and how this interacts with variational graph algorithms.