Experimental game theory refers to the use of game theoretic models in simulations and experiments to understand strategic behavior. It is an increasingly important research tool in many fields, including economics, biology and many social sciences, but computer support for such projects is primarily found in custom programs written in general purpose languages.
Here we present Hagl (short for "Haskell game language", and loosely intended to evoke the homophone "haggle"), a domain-specific language embedded in Haskell, intended to drastically reduce the development time of such experiments, and make the definition and exploration of games and strategies simple and fun.
Spreadsheets are popular end-user programming tools. Many people use spreadsheet-computed values to make critical decisions, so spreadsheets must be correct. Proven software engineering principles can assist the construction and maintenance of dependable spreadsheets. However, how can we make this practical for end users? One way is to exploit spreadsheets' idiosyncratic structure to translate software engineering concepts such as type checking and debugging to an end-user programming domain. The simplified computational model and the spatial embedding of formulas, which provides rich contextual information, can simplify these concepts, leading to effective tools for end users.
Spreadsheets are widely used, and studies have shown that most end-user spreadsheets contain non-trivial errors. Most of the currently available tools that try to mitigate this problem require varying levels of user intervention. This paper presents a system, called UCheck, that detects errors in spreadsheets automatically. UCheck carries out automatic header and unit inference, and reports unit errors to the users. UCheck is based on two static analyses phases that infer header and unit information for all cells in a spreadsheet.
We have tested UCheck on a wide variety of spreadsheets and found that it works accurately and reliably. The system was also used in a continuing education course for high school teachers, conducted through Oregon State University, aimed at making the participants aware of the need for quality control in the creation of spreadsheets.
Paper.pdf (548K)
Labels in spreadsheets can be exploited for finding errors in spreadsheet formulas. Previous approaches have either used the positional information of labels or their interpretation as dimension for checking the consistency of formulas. In this paper we demonstrate how these two approaches can be combined. We have formalized a combined reasoning system and have implemented a corresponding prototype system. We have evaluated the system on the EUSES spreadsheet corpus. The evaluation has demonstrated that adding a syntactic, spatial analysis to a dimension inference can significantly improve the rate of detected errors.
Paper.pdf (1.2M)
Continuing our research in explanation-oriented language design, we present a domain-specific visual language for explaining probabilistic reasoning. Programs in this language, called explanation objects, can be manipulated according to a set of laws to automatically generate many equivalent explanation instances. We argue that this increases the explanatory power of our language by allowing a user to view a problem from many different perspectives.
Paper.pdf (272K)
Experimental game theory is an increasingly important research tool in many fields, providing insight into strategic behavior through simulation and experimentation on game theoretic models. Unfortunately, despite relying heavily on automation, this approach has not been well supported by tools. Here we present our continuing work on Hagl, a domain-specific language embedded in Haskell, intended to drastically reduce the development time of such experiments and support a highly explorative research style. In this paper we present a fundamental redesign of the underlying game representation in Hagl. These changes allow us to better leverage domain knowledge by allowing different classes of games to be represented differently, exploiting existing domain representations and algorithms. In particular, we show how this supports analytical extensions to Hagl, and makes strategies for state-based games vastly simpler and more efficient.
Paper.pdf (212K)
We propose a new focus in language design where languages provide constructs that not only describe the computation of results, but also produce explanations of how and why those results were obtained. We posit that if users are to understand computations produced by a language, that language should provide explanations to the user. As an example of such an explanation-oriented language we present a domain-specific language for explaining probabilistic reasoning, a domain that is not well understood by non-experts. We show the design of the DSL in several steps. Based on a story-telling metaphor of explanations, we identify generic constructs for building stories out of events, and obtaining explanations by applying stories to specific examples. These generic constructs are then adapted to the particular explanation domain of probabilistic reasoning. Finally, we develop a visual notation for explaining probabilistic reasoning.
Paper.pdf (592K)
The design of hardware-software systems is a complex and difficult task exacerbated by the very different tools used by designers in each field. Even in small projects, tracking the impact, motivation and context of individual design decisions between designers and over time quickly becomes intractable. In an attempt to bridge this gap, we present a general, low-level model of the system design process. We formally define the concept of a design decision, and provide a hierarchical representation of both the design space and the context in which decisions are made. This model can serve as a foundation for software-hardware system design tools which will help designers cooperate more efficiently and effectively. We provide a high-level example of the use of such a system in a design problem provided through collaboration with NASA.
Paper.pdf (380K)
Based on (1) research into mutation testing for general purpose programming languages, and (2) spreadsheet errors that have been reported in literature, we have developed a suite of mutation operators for spreadsheets. We present an evaluation of the mutation adequacy of du-adequate test suites generated by a constraint-based automatic test-case generation system we have developed in previous work. The results of the evaluation suggest additional constraints that can be incorporated into the system to target mutation adequacy.
In addition to being useful in mutation testing of spreadsheets, the operators can be used in the evaluation of error-detection tools and also for seeding spreadsheets with errors for empirical studies. We describe two case studies where the suite of mutation operators helped us carry out such empirical evaluations. The main contribution of this paper is the suite of mutation operators for spreadsheets that can (1) help with systematic mutation testing of spreadsheets, and (2) be used for carrying out empirical evaluations of spreadsheet tools.
Paper.pdf (2.4M)
Spreadsheets are among the most widely used programming systems in the world. Individuals and businesses use spreadsheets for a wide variety of applications, ranging from performing simple calculations to building complex financial models. In this article, we first discuss how spreadsheet programs are actually functional programs. We then describe concepts in spreadsheet programming, followed by a brief history of spreadsheet systems. Widespread use of spreadsheets, coupled with their high error-proneness and the impact of spreadsheet errors, has motivated research into techniques aimed at the prevention, detection, and correction of errors in spreadsheets. We present an overview of research effort that seeks to rectify this problem.
Paper.pdf (660K)
We present an error-detection and -correction approach for spreadsheets that automatically generates questions about input/output pairs and, depending on the feedback given by the user, proposes changes to the spreadsheet that would correct detected errors. This approach combines and integrates previous work on automatic test-case generation and goal-directed debugging. We have implemented this method as an extension to MS Excel. We carried out an evaluation of the system using spreadsheets seeded with faults using mutation operators. The evaluation shows among other things that up to 93% of the first-order mutants and 98% of the second-order mutants were detected by the system using the automatically generated test cases.
Paper.pdf (780K)
We present a reasoning system for inferring dimension information in spreadsheets. This system can be used to check the consistency of spreadsheet formulas and can be employed to detect errors in spreadsheets. We have prototypically implemented the system as an add-in to Excel. In an evaluation of this implementation we were able to detect dimension errors in almost 50% of the investigated spreadsheets, which shows (i) that the system works reliably in practice and (ii) that dimension information can be well exploited to uncover errors in spreadsheets.
Paper.pdf (258K)
We present a visual language for strategies in game theory, which has potential applications in economics, social sciences, and in general science education. This language facilitates explanations of strategies by visually representing the interaction of players' strategies with game execution. We have utilized the cognitive dimensions framework in the design phase and recognized the need for a new cognitive dimension of "traceability" that considers how well a language can represent the execution of a program. We consider how traceability interacts with other cognitive dimensions and demonstrate its use in analyzing existing languages. We conclude that the design of a visual representation for execution traces should be an integral part of the design of visual languages because understanding a program is often tightly coupled to its execution.
Paper.pdf (399K)
The Inverse Ocean Modeling (IOM) system constructs and runs weak-constraint, four-dimensional variational data assimilation (W4DVAR) for any dynamical model and any observing array. The dynamics and the observing algorithms may be nonlinear but must be functionally smooth. The user need only provide the model and the observing algorithms, together with an interpolation scheme that relates the model numerics to the observer's coordinates. All other model-dependent elements of the Inverse Ocean Modeling assimilation algorithm, including adjoint generators and Monte Carlo estimates of posteriors, have been derived and coded as templates in Parametric FORTRAN. This language has been developed for the IOM but has wider application in scientific programming. Guided by the Parametric FORTRAN templates, and by model information entered via a graphical user interface (GUI), the IOM generates conventional FORTRAN code for each of the many algorithm elements, customized to the user's model. The IOM also runs the various W4DVAR assimilations, which are monitored by the GUI. The system is supported by a Web site that includes interactive tutorials for the assimilation algorithm.
In previous work we have developed a system that automatically checks for unit errors in spreadsheets. In this paper we describe our experiences using the system in a workshop on spreadsheet safety aimed at high school teachers and students. We present the results from a think-aloud study we conducted with five high school teachers and one high school student as the subjects. The study is the first ever to investigate the usability of a type system in spreadsheets. We discovered that end users can effectively use the system to debug a variety of errors in their spreadsheets. This result is encouraging given that type systems are difficult even for programmers. The subjects had difficulty debugging "non-local" unit errors. Guided by the results of the study we devised new methods to improve the error-location inference. We also extended the system to generate change suggestions for cells with unit errors, which when applied, would correct unit errors. These extensions solved the problem that the study revealed in the original system.
Paper.pdf (680K)
Using spreadsheets is the preferred method to calculate, display or store anything that fits into a table-like structure. They are often used by end users to create applications. But they have one critical drawback - they are very error-prone. To reduce the error-proneness, we purpose a new way of object-oriented modeling of spreadsheets prior to using them. These spreadsheet models, termed ClassSheets, are used to generate concrete spreadsheets on the instance level. By this approach sources of errors are reduced and spreadsheet applications are easier to understand.
In previous work we have tried to transfer ideas that have been successful in general-purpose programming languages and mainstream software engineering into the realm of spreadsheets, which is one important example of an end-user programming environment. More specifically, we have addressed the questions of how to employ the concepts of type checking, program generation and maintenance, and testing in spreadsheets. While the primary objective of our work has been to offer improvements for end-user productivity, we have tried to follow two particular principles to guide our research.
The dominant share of software development costs is spent on software maintenance, particularly the process of updating programs in response to changing requirements. Currently, such program changes tend to be performed using text editors, an unreliable method that often causes many errors. In addition to syntax and type errors, logical errors can be easily introduced since text editors cannot guarantee that changes are performed consistently over the whole program. All these errors can cause a correct and perfectly running program to become instantly unusable. It is not surprising that this situation exists because the ``text-editor method'' reveals a low-level view of programs that fails to reflect the structure of programs.
We address this problem by pursuing a programming-language-based approach to program updates. To this end we discuss in this paper the design and requirements of an update language for expressing update programs. We identify as the essential part of any update language a \emph{scope update} that performs coordinated update of the definition and all uses of a symbol. As the underlying basis for update languages, we define an update calculus for updating lambda-calculus programs. We develop a type system for the update calculus that infers the possible type changes that can be caused by an update program. We demonstrate that type-safe update programs that fulfill certain structural constraints preserve the type-correctness of lambda terms. The update calculus can serve as a basis for higher-level update languages, such as for Haskell or Java.
Paper.pdf (384K)
Parametric Fortran is an extension of Fortran that supports defining Fortran program templates by allowing the parameterization of arbitrary Fortran constructs. A Fortran program template can be translated into a regular Fortran program guided by values for the parameters. This paper describes the design, implementation, and applications of Parametric Fortran. Parametric Fortran is particularly useful in scientific computing. The applications include defining generic functions, removing duplicated code, and automatic differentiation. The described techniques have been successfully employed in a project that implements a generic inverse ocean modeling system.
Paper.pdf (415K)
We present a spreadsheet debugger targeted at end users. Whenever the computed output of a cell is incorrect, the user can supply an expected value for a cell, which is employed by the system to generate a list of change suggestions for formulas that, when applied, would result in the user-specified output. The change suggestions are ranked using a set of heuristics.
In previous work, we had presented the system as a proof of concept. In this paper, we describe a systematic evaluation of the effectiveness of inferred change suggestions and the employed ranking heuristics. Based on the results of the evaluation we have extended both, the change inference process and the ranking of suggestions. An evaluation of the improved system shows that change inference process and the ranking heuristics have both been substantially improved and that the system performs effectively.
Paper.pdf (282K)
Spreadsheets are widely used, and studies have shown that most end-user spreadsheets contain non-trivial errors. Most of the currently available tools that try to mitigate this problem require varying levels of user intervention. This paper presents a system, called UCheck, that detects errors in spreadsheets automatically. UCheck carries out automatic header and unit inference, and reports unit errors to the users. UCheck is based on two static analyses phases that infer header and unit information for all cells in a spreadsheet.
We have tested UCheck on a wide variety of spreadsheets and found that it works accurately and reliably. The system was also used in a continuing education course for high school teachers, conducted through Oregon State University, aimed at making the participants aware of the need for quality control in the creation of spreadsheets.
Paper.pdf (2M)
Haskell programmers who deal with complex data types often need to apply functions to specific nodes deeply nested inside of terms. Typically, implementations for those applications require so-called boilerplate code, which recursively visits the nodes and carries the functions to the places where they need to be applied. The scrap-your-boilerplate approach proposed by Lämmel and Peyton Jones tries to solve this problem by defining a general traversal design pattern that performs the traversal automatically so that the programmers can focus on the code that performs the actual transformation.
In practice we often encounter applications that require variations of the recursion schema and call for more sophisticated generic traversals. Defining such traversals from scratch requires a profound understanding of the underlying mechanism and is everything but trivial.
In this paper we analyze the problem domain of recursive traversal strategies, by integrating and extending previous approaches. We then extend the scrap-your-boilerplate approach by rich traversal strategies and by a combination of transformations and accumulations, which leads to a comprehensive recursive traversal library in a statically typed framework.
We define a two-layer library targeted at general programmers and programmers with knowledge in traversal strategies. The high-level interface defines a universal combinator that can be customized to different one-pass traversal strategies with different coverage and different traversal order. The lower-layer interface provides a set of primitives that can be used for defining more sophisticated traversal strategies such as fixpoint traversals. The interface is simple and succinct. Like the original scrap-your-boilerplate approach, it makes use of rank-2 polymorphism and functional dependencies, implemented in GHC.
Paper.pdf (196K)
In this paper we present a system that helps users test their spreadsheets using automatically generated test cases. The system generates the test cases by backward propagation and solution of constraints on cell values. These constraints are obtained from the formula of the cell that is being tested when we try to execute all feasible du-paths within the formula. AutoTest generates test cases that execute all feasible du-pairs. If infeasible du-associations are present in the spreadsheet, the system is capable of detecting and reporting all of these to the user. We also present a comparative evaluation of our approach against the "Help Me Test" mechanism in Forms/3 and show that our approach is faster and produces test suites that give better du-coverage.
Paper.pdf (522K)
Although researchers have developed several ways to reason about the location of faults in spreadsheets, no single form of reasoning is without limitations. Multiple types of errors can appear in spreadsheets, and various fault localization techniques differ in the kinds of errors that they are effective in locating. In this paper, we report empirical results from an emerging system that attempts to improve fault localization for end-user programmers by sharing the results of the reasoning systems found in WYSIWYT and UCheck. By evaluating the visual feedback from each fault localization system, we shed light on where these different forms of reasoning and combinations of them complement - and contradict - one another, and which heuristics can be used to generate the best advice from a combination of these systems.
Paper.pdf (934K)
Spreadsheets are the most popular programming systems in use today. Since spreadsheets are visual, first-order functional languages, research into the foundations of spreadsheets is therefore a highly relevant topic for the principles and, in particular, the practice, of declarative programming. Since the error rate in spreadsheets is very high and since those errors have significant impact, methods and tools that can help detect and remove errors from spreadsheets are very much needed. Type systems have traditionally played a strong role in detecting errors in programming languages, and it is therefore reasonable to ask whether type systems could not be helpful in improving the current situation of spreadsheet programming. In this paper we introduce a type system and a type inference algorithm for spreadsheets and demonstrate how this algorithm and the underlying typing concept can identify programming errors in spreadsheets. In addition, we also demonstrate how the type inference algorithm can be employed to infer models, or specifications, for spreadsheets, which can be used to prevent future errors in spreadsheets.
Paper.pdf (1M)
We present a study investigating the performance of a system for automatically inferring spreadsheet templates. These templates allow users to safely edit spreadsheets, that is, certain kinds of errors such as range, reference, and type errors can be provably prevented. Since the inference of templates is inherently ambiguous, such a study is required to demonstrate the effectiveness of any such automatic system. The study results show that the system considered performs significantly better than subjects with intermediate to expert level programming expertise. These results are important because the translation of existing spreadsheets into a system based on safety-guaranteeing templates cannot be performed without automatic support. We carried out post-hoc analyses of the video recordings of the subjects' interactions with the spreadsheets and found that expert-level subjects spend less time and inspect fewer cells in the spreadsheet and develop more accurate templates than less experienced subjects.
Paper.pdf (1M)
Although researchers have developed several ways to reason about the location of faults in spreadsheets, no single form of reasoning is without limitations. Multiple types of errors can appear in spreadsheets, and various fault localization techniques differ in the kinds of errors that they are effective in locating. Because end users who debug spreadsheets consistently follow the advice of fault localization systems, it is important to ensure that fault localization feedback corresponds as closely as possible to where the faults actually appear.
In this paper, we describe an emerging system that attempts to improve fault localization for end-user programmers by sharing the results of the reasoning systems found in WYSIWYT and UCheck. By understanding the strengths and weaknesses of the reasoning found in each system, we expect to identify where different forms of reasoning complement one another, when different forms of reasoning contradict one another, and which heuristics can be used to select the best advice from each system. By using multiple forms of reasoning in conjunction with heuristics to choose among recommendations from each system, we expect to produce unified fault localization feedback whose combination is better than the sum of the parts.
Paper.pdf (524K)
A huge discrepancy between theory and practice exists in one popular application area of functional programming-spreadsheets. Although spreadsheets are the most frequently used (functional) programs, they fall short of the quality level that is expected of functional programs, which is evidenced by the fact that existing spreadsheets contain many errors, some of which have serious impacts.
We have developed a template specification language that allows the definition of spreadsheet templates that describe possible spreadsheet evolutions. This language is based on a table calculus that formally captures the process of creating and modifying spreadsheets. We have developed a type system for this calculus that can prevent type, reference, and omission errors from occurring in spreadsheets. On the basis of the table calculus we have developed Gencel, a system for generating reliable spreadsheets. We have implemented a prototype version of Gencel as an extension of Excel.
Paper.pdf (693K)
We describe a type-inference algorithm that is based on labeling nodes with type information in a graph that represents type constraints. This algorithm produces the same results as the famous algorithm of Milner, but is much simpler to use, which is of importance especially for teaching type systems and type inference.
The proposed algorithm employs a more concise notation and yields inferences that are shorter than applications of the traditional algorithm. Simplifications result, in particular, from three facts: (1) We do not have to maintain an explicit type environment throughout the algorithm because the type environment is represented implicitly through node labels. (2) The use of unification is simplified through label propagation along graph edges. (3) The typing decisions in our algorithm are dependency-driven (and not syntax-directed), which reduces notational overhead and bookkeeping.
Paper.pdf (217K)
Many scientific applications benefit from simulation. However, programming languages used in simulation, such as C++ or Matlab, approach problems from a deterministic procedural view, which seems to differ, in general, from many scientists' mental representation. We apply a domain-specific language for probabilistic programming to the biological field of gene modeling, showing how the mental-model gap may be bridged. Our system assisted biologists in developing a model for genome evolution by separating the concerns of model and simulation and providing implicit probabilistic non-determinism.
Paper.pdf (199K)
Parametric Fortran is an extension of Fortran that supports the construction of generic programs by allowing the parameterization of arbitrary Fortran constructs. A parameterized Fortran program can be translated into a regular Fortran program guided by values for the parameters. This paper describes the extensions of Parametric Fortran by two new language features, accessors and list parameters. These features particularly address the code duplication problem, which is a major problem in the context of scientific computing. The described techniques have been successfully employed in a project that implements a generic inverse ocean modeling system.
Paper.pdf (147K)
At the heart of functional programming rests the principle of referential transparency, which in particular means that a function f applied to a value x always yields one and the same value y=f(x). This principle seems to be violated when contemplating the use of functions to describe probabilistic events, such as rolling a die: It is not clear at all what exactly the outcome will be, and neither is it guaranteed that the same value will be produced repeatedly. However, these two seemingly incompatible notions can be reconciled if probabilistic values are encapsulated in a data type.
In this paper, we will demonstrate such an approach by describing a probabilistic functional programming (PFP) library for Haskell. We will show that the proposed approach not only facilitates probabilistic programming in functional languages, but in particular can lead to very concise programs and simulations. In particular, a major advantage of our system is that simulations can be specified independently from their method of execution. That is, we can either fully simulate or randomize any simulation without altering the code which defines it.
Paper.pdf (160K)
Spreadsheets are widely used in all kinds of business applications. Numerous studies have shown that they contain many errors that sometimes have dramatic impacts. One reason for this situation is the low-level, cell-oriented development process of spreadsheets. We improve this process by introducing and formalizing a higher-level object-oriented model termed ClassSheet. While still following the tabular look-and-feel of spreadsheets, ClassSheets allow the developer to express explicitly business object structures within a spreadsheet, which is achieved by integrating concepts from the UML (Unified Modeling Language). A stepwise automatic transformation process generates a spreadsheet application that is consistent with the ClassSheet model. Thus, by deploying the formal underpinning of ClassSheets, a large variety of errors can be prevented that occur in many existing spreadsheet applications today. The presented ClassSheet approach links spreadsheet applications to the object-oriented modeling world and advocates an automatic model-driven development process for spreadsheet applications of high quality.
Paper.pdf (607K)
We present a semi-automatic debugger for spreadsheet systems that is specifically targeted at end-user programmers. Users can report expected values for cells that yield incorrect results. The system then generates change suggestions that could correct the error. Users can interactively explore, apply, refine, or reject these change suggestions. The computation of change suggestions is based on a formal inference system that propagates expected values backwards across formulas. The system is fully integrated into Microsoft Excel and can be used to automatically detect and correct various kinds of errors in spreadsheets. Test results show that the system works accurately and reliably.
Paper.pdf (1M)
We introduce a visual specification language for spreadsheets that allows the definition of spreadsheet templates. A spreadsheet generator can automatically create Excel spreadsheets from these templates together with customized update operations. It can be shown that spreadsheets created in this way are free from a large class of errors, such as reference, omission, and type errors. We present a formal definition of the visual language for templates and describe the process of generating spreadsheets from templates. In addition, we present an editor for templates and analyze the editor using the Cognitive Dimensions framework.
Paper.pdf (500K)
In previous work we have designed and implemented an automatic reasoning system for spreadsheets, called UCheck, that infers unit information for cells in a spreadsheet. Based on this unit information, UCheck can identify cells in the spreadsheet that contain erroneous formulas. However, the information about an erroneous cell is reported to the user currently in a rather crude way by simply coloring the cell, which does not tell anything about the nature of error and thus offers no help to the user as to how to fix it. In this paper we describe an extension of UCheck, called UFix, which improves the error messages reported to the spreadsheet user dramatically. The approach essentially consists of three steps: First, we identify different categories of spreadsheet errors from an end-user's perspective. Second, we map units that indicate erroneous formulas to these error categories. Finally, we create customized error messages from the unit information and the identified error category. In many cases, these error messages also provide suggestions on how to fix the reported errors.
Paper.pdf (1M)
Existing spreadsheet systems allow users to change cells arbitrarily, which is a major source of spreadsheet errors. We propose a system that prevents errors in spreadsheets by restricting spreadsheet updates to only those that are logically and technically correct. The system is based on the concept of table specifications that describe the principal structure of the initial spreadsheet and all of its future versions. We have developed a program generator that translates a table specification into an initial spreadsheet together with customized update operations for changing cells and inserting/deleting rows and columns for this particular specification. We have designed a type system for table specifications that ensures the following form of "spreadsheet maintenance safety": Update operations that are generated from a type-correct table specification are proved to transform the spreadsheet only according to the specification and to never produce any omission, reference, or type errors. Finally, we have developed a prototype as an extension to Excel, which has been shown by a preliminary usability study to be well accepted by end users.
Paper.pdf (235K)
We present a program-generation approach to address a software-reuse challenge in the area of scientific computing. More specifically, we describe the design of a program generator for the specification of subroutines that can be generic in the dimensions of arrays, parameter lists, and called subroutines. We describe the application of that approach to a real-world problem in scientific computing, which requires the generic description of inverse ocean modeling tools. In addition to a compiler that can transform generic specifications into efficient Fortran code for models, we have also developed a type system that can identify possible errors already in the specifications. This type system is important for the acceptance of the program generator among scientists because it prevents a large class of errors in the generated code.
Paper.pdf (363K)
This paper describes the design and implementation of a unit and header inference system for spreadsheets. The system is based on a formal model of units that we have described in previous work. Since the unit inference depends on information about headers in a spreadsheet, a realistic unit inference system requires a method for automatically determining headers. The present paper describes (1) several spatial-analysis algorithms for header inference, (2) a framework that facilitates the integration of different algorithms, and (3) the implementation of the system. The combined header and unit inference system is fully integrated into Microsoft Excel and can be used to automatically identify various kinds of errors in spreadsheets. Test results show that the system works accurately and reliably.
Paper.pdf (1M)
The structure of monadic functional programs allows the integration of many different features by just changing the definition of the monad and not the rest of the program, which is a desirable feature from a software engineering and software maintenance point of view. We describe an algorithm for the automatic transformation of a group of functions into such a monadic form. We identify two correctness criteria and argue that the proposed transformation is at least correct in the sense that transformed programs yield the same results as the original programs modulo monad constructors. The translation of a set of functions into monadic form is in most cases only a first step toward an extension of a program by new features. The extended behavior can be realized by choosing an appropriate monad type and by inserting monadic actions into the functions that have been transformed into monadic form. We demonstrate an approach to the integration of monadic actions that is based on the idea of specifying context-dependent rewritings.
Paper.pdf (255K)
We describe the design and implementation of a program generator that can produce extensions of Fortran that are specialized to support the programming of particular applications. Extensions are specified through parameter structures that can be referred to in Fortran programs to specify the dependency of program parts on these parameters. By providing parameter values, a parameterized Fortran program can be translated into a regular Fortran program. We describe as a real-world application of this program generator the implementation of a generic inverse ocean modeling tool. The program generator is implemented in Haskell and makes use of sophisticated features, such as multi-parameter type classes, existential types, and generic programming extensions and thus represents the application of an advanced applicative language to a real-world problem.
Paper.pdf (150K)
In this article we will illustrate with an example that modern functional programming languages like Haskell can be used effectively for programming search problems, in contrast to the widespread belief that Prolog is much better suited for tasks like these.
Paper.pdf (53K)
Existing spatiotemporal data models and query languages offer only basic support to query changes of data. In particular, although these systems often allow the formulation of queries that ask for changes at particular time points, they fall short of expressing queries for sequences of such changes. In this chapter we propose the concept of spatiotemporal patterns as a systematic and scalable concept to query developments of objects and their relationships. Based on our previous work on spatiotemporal predicates, we outline the design of spatiotemporal patterns as a query mechanism to characterize complex object behaviors in space and time. We will not present a fully-fledged design. Instead, we will focus on deriving constraints that will allow spatiotemporal patterns to become well-designed composable abstractions that can be smoothly integrated into spatiotemporal query languages. Spatiotemporal patterns can be applied in many different areas of science, for example, in geosciences, geophysics, meteorology, ecology, and environmental studies. Since users in these areas typically do not have extended formal computer training, it is often difficult for them to use advanced query languages. A visual notation for spatiotemporal patterns can help solving this problem. In particular, since spatial objects and their relationships have a natural graphical representation, a visual notation can express relationships in many cases implicitly where textual notations require the explicit application of operations and predicates. Based on our work on the visualization of spatiotemporal predicates, we will sketch the design of a visual language to formulate spatiotemporal patterns.
Paper.pdf (232K)
Existing solutions to data and schema integration require user interaction/input to generate a data transformation between two different schemas. These approaches are not appropriate in situations where many data transformations are needed or where data transformations have to be generated frequently. We describe an approach to an automatic XML-transformation generator that is based on a theory of information-preserving and -approximating XML operations. Our approach builds on a formal semantics for XML operations and their associated DTD transformation and on an axiomatic theory of information preservation and approximation. This combination enables the inference of a sequence of XML transformations by a search algorithm based on the operations' DTD transformations.
Paper.ps.gz (160K), Paper.pdf (135K)
We demonstrate an approach to transform keyword queries automatically into queries that combine keywords appropriately by boolean operations, such as and and or. Our approach is based on an analysis of relationships between the keywords using a taxonomy. The transformed queries will be sent to a search engine, and the returned results will be presented to the user. We evaluate the effectiveness of our approach by comparing the precision of the results returned for the generated query with the precision of the result for the original query. Our experiments indicate that our approach can improve the precision of the results considerably.
Paper.ps.gz (152K), Paper.pdf (256K)
Many software maintenance problems are caused by using text editors to change programs. A more systematic and reliable way of performing program updates is to express changes with an update language. In particular, updates should preserve the syntax- and type-correctness of the transformed object programs. We describe an update calculus that can be used to update lambda-calculus programs. We develop a type system for the update language that infers the possible type changes that can be caused by an update program. We demonstrate that type-safe update programs that fulfill certain structural constraints preserve the type-correctness of lambda terms.
Paper.ps.gz (164K), Paper.pdf (159K)
We present a visual language for querying and transforming XML data. The language is based on a visual document metaphor and the notion of document patterns and rules. It combines a dynamic form-based interface for defining queries and transformation rules with powerful pattern matching capabilities and offers thus a highly expressive visual language. The design of the visual language is specifically targeted at end users.
Paper.ps.gz (114K), Paper.pdf (193K)
We develop DBMS data models and query languages to deal with geometries changing over time. In contrast to most of the earlier work on this subject, these models and languages are capable of handling continuously changing geometries, or moving objects. We focus on two basic abstractions called moving point and moving region. A moving point can represent an entity for which only the position in space is relevant. A moving region captures moving as well as growing or shrinking regions. Examples for moving points are people, polar bears, cars, trains, or air planes; examples for moving regions are hurricanes, forest fires, or oil spills in the sea. The main line of research presented in this chapter takes a data type oriented approach. The idea is to view moving points and moving regions as three-dimensional (2D-space + time) or higher-dimensional entities whose structure and behavior is captured by modeling them as abstract data types. These data types can then be integrated as attribute types into relational, object-oriented, or other DBMS data models; they can be implemented as extension packages ("data blades") for suitable extensible DBMSs.
Queries about objects that change their spatial attributes over time become particularly interesting when they ask for changes in the spatial relationships between different objects. We propose a visual notation that is able to describe scenarios of changing object relationships. The visual language is based on the idea to analyze two-dimensional traces of moving objects to infer a temporal development of their mutual spatial relationships. We motivate the language design by successively simplifying object traces to their intrinsic properties. The notation can be effectively translated in to a calculus of spatio-temporal predicates that formally characterizes the evolution of spatial relationships. We also outline a user interface that supports specifications by menus and a drawing editor. The visual notation can be used directly as a visual query interface to spatio-temporal databases, or it can provide predicate specifications that can be integrated into textual query languages leading to heterogeneous languages.
Paper.ps.gz (258K), Paper.pdf (443K)
We describe the design of a rule-based language for expressing changes to Haskell programs in a systematic and reliable way. The update language essentially offers update commands for all constructs of the object language (a subset of Haskell). The update language can be translated into a core calculus consisting of a small set of basic updates and update combinators. The key construct of the core calculus is a scope update mechanism that allows (and enforces) update specifications for the definition of a symbol together with all of its uses. The type of an update program is given by the possible type changes it can cause for an object programs. We have developed a type-change inference system to automatically infer type changes for updates. Updates for which a type change can be successfully inferred and that satisfy an additional structural condition can be shown to preserve type correctness of object programs. In this paper we define the Haskell Update Language HULA and give a translation into the core update calculus. We illustrate HULA and its translation into the core calculus by several examples.
Paper.ps.gz (71K), Paper.pdf (134K)
In contrast to the field view of spatial data that basically views spatial data as a mapping from points into some features, the object view clusters points by features and their values into spatial objects of type point, line, or region, which can then be integrated as abstract data types into a database system. We can apply the ADT idea to model spatio-temporal data and to integrate such ADTs into data models. The basic idea is very simple and starts from the observation that anything that changes over time can be regarded as a function that maps from time into the data of consideration. The integration of spatio-temporal data types into data models supports the definition of a query languages because the interaction between ADT operations and general querying constructs is constrained to a few well-defined places, so that we are immediately able to pose spatio-temporal queries in a straightforward way.
Paper.ps.gz (96K), Paper.pdf (180K)
Integrating spatio-temporal data as abstract data types into already existing data models is a promising approach to creating spatio-temporal query languages. Based on a formal foundation presented elsewhere, we present the main aspects of an SQL-like, spatio-temporal query language, called STQL. As one of its essential features, STQL allows to query and to retrieve moving objects which describe continuous evolutions of spatial objects over time. We consider spatio-temporal operations that are particularly useful in formulating queries, such as the temporal lifting of spatial operations, the projection into space and time, selection, and aggregation. Another important class of queries is concerned with developments, which are changes of spatial relationships over time. Based on the notion of spatio-temporal predicates we provide a framework in STQL that allows a user to build more and more complex predicates starting with a small set of elementary ones. We also describe a visual notation to express developments.
Paper.ps.gz (74K), Paper.pdf (127K)
We have been working on a unit system for end-user spreadsheets that is based on the concrete notion of units instead of the abstract concept of types. In previous work, we defined such a system formally. In this paper, we describe a visual system to support the formal reasoning in two ways. First, it supports communicating and explaining the unit inference process to users. Second and more important, our approach allows users to change the system's reasoning by adding and customizing the system's inference rules.
Paper.ps.gz (92K), Paper.pdf (93K)
We define a unit system for end-user spreadsheets that is based on the concrete notion of units instead of the abstract concept of types. Units are derived from header information given by spreadsheets. The unit system contains concepts, such as dependent units, multiple units, and unit generalization, that allow the classification of spreadsheet contents on a more fine-grained level than types do. Also, because communication with the end user happens only in terms of objects that are contained in the spreadsheet, our system does not require end users to learn new abstract concepts of type systems.
Paper.ps.gz (225K), Paper.pdf (193K)
This paper investigates temporal changes of topological relationships and thereby integrates two important research areas: first, two-dimensional topological relationships that have been investigated quite intensively, and second, the change of spatial information over time. We investigate spatio-temporal predicates, which describe developments of well-known spatial topological relationships. A framework is developed in which spatio-temporal predicates can be obtained by temporal aggregation of elementary spatial predicates and sequential composition. We compare our framework with two other possible approaches: one is based on the observation that spatio-temporal objects correspond to three-dimensional spatial objects for which existing topological predicates can be exploited. The other approach is to consider possible transitions between spatial configurations. These considerations help to identify a canonical set of spatio-temporal predicates.
Paper.ps.gz (239K), Paper.pdf (405K)
We propose to view programs as abstract data types and to perform program changes by applying well-defined operations on programs. The ADT view of programs goes beyond the approach of syntax-directed editors and proof-editors since it is possible to combine basic update operations into larger update programs that can be stored and reused. It is crucial for the design of update operations and their composition to know which properties they can preserve when they are applied to a program.
In this paper we argue in favor of the abstract data type view of programs, and present a general framework in which different programming languages, update languages, and properties can be studied.
Paper.ps.gz (54K), Paper.pdf (120K)
We propose a new style of writing graph algorithms in functional languages which is based on an alternative view of graphs as inductively defined data types. We show how this graph model can be implemented efficiently, and then we demonstrate how graph algorithms can be succinctly given by recursive function definitions based on the inductive graph view. We also regard this as a contribution to the teaching of algorithm and data structures in functional languages since we can use the functional-style graph algorithms instead of the imperative algorithms that are dominant today.
Paper.ps.gz (125K), Paper.pdf (297K)
We propose three extensions to patterns and pattern matching in Haskell. The first, pattern guards, allows the guards of a guarded equation to match patterns and bind variables, as well as to test boolean condition. For this we introduce a natural generalization of guard expressions to guard qualifiers. A frequently-occurring special case is that a function should be applied to a matched value, and the result of this is to be matched against another pattern. For this we introduce a syntactic abbreviation, transformational patterns, that is particularly useful when dealing with views. These proposals can be implemented with very modest syntactic and implementation cost. They are upward compatible with Haskell; all existing programs will continue to work. We also offer a third, much more speculative proposal, which provides the transformational-pattern construct with additional power to explicitly catch pattern match failure. We demonstrate the usefulness of the proposed extension by several examples, in particular, we compare our proposal with views, and we also discuss the use of the new patterns in combination with equational reasoning.
Paper.dvi.gz (32K), Paper.ps.gz (72K), Paper.pdf (203K)
We propose a form-based interface to expresses XML queries and transformations by so-called "document patterns" that describe properties of the requested information and optionally specify how the found results should be reformatted or restructured. The interface is targeted at casual users who want a fast and easy way to find information in XML data resources. By using dynamic forms an intuitive and easy-to-use interface is obtained that can be used to solve a wide spectrum of tasks, ranging from simple selections and projections to advanced data restructuring tasks. The interface is especially suited for end users since it can be used without having to learn a programming or query language and without knowing anything about (query or XML) language syntax, DTDs or schemas. Nevertheless, DTDs can be well exploited, in particular, on the user interface level to support the semi-automatic construction of queries.
Paper.ps.gz (146K), Paper.pdf (108K)
XML is becoming one of the most influential standards concerning data exchange and Web-presentations. In this paper we present a visual language for querying and transforming XML data. The language is based on a visual document metaphor and the notion of document patterns. It combines an intuitive, dynamic form-based interface for defining queries and transformation rules with powerful pattern matching capabilities and offers thus a highly expressive yet easy to use visual language. Providing visual language support for XML not only helps end users, it is also a big opportunity for the VL community to receive greater attention.
Paper.ps.gz (54K), Paper.pdf (120K)
We show how to define recursion operators for random access data types, that is, ADTs that offer random access to their elements, and how algorithms on arrays and on graphs can be expressed by these operators. The approach is essentially based on a representation of ADTs as bialgebras that allows catamorphisms between ADTs to be defined by composing one ADT's algebra with the other ADT's coalgebra. The extension to indexed data types enables the development of specific recursion schemes, which are, in particular, suited to express a large class of graph algorithms.
Paper.ps.gz (127K), Paper.pdf (227K), Long Version.ps.gz (136K), Long Version.pdf (252K)
Maps are a fundamental metaphor in GIS. We introduce several new operations on maps that go far beyond well-known operations like overlay or reclassification. In particular, we identify and generalize operations that are of practical interest for spatial analysis and that can be useful in many GIS applications. We give a precise definition of these operations based on a formal model of spatial partitions. This provides a theoretical foundation for maps which also serves as a specification for implementations.
Paper.ps.gz (96K), Paper.pdf (178K)
In this paper we propose a visual interface for the specification of predicates to be used in queries on spatio-temporal databases. The approach is based on a visual specification method for temporally changing spatial situations. This extends existing concepts for visual spatial query languages, which are only capable of querying static spatial situations. We outline a preliminary user interface that supports the specification on an intuitive and easily manageable level, and we describe the design of the underlying visual language. The visual notation can be used directly as a visual query interface to spatio-temporal databases, or it can provide predicate specifications that can be integrated into textual query languages leading to heterogeneous languages.
Paper.ps.gz (72K), Paper.pdf (188K)
Spatio-temporal databases deal with geometries changing over time. The goal
of our work is to provide a DBMS data model and query language capable of
handling such time-dependent geometries, including those changing continuously
which describe moving objects. Two fundamental abstractions are moving point
and moving region, describing objects for which only the time-dependent
position, or position and extent, are of interest, respectively. We propose to
represent such time-dependent geometries as attribute data types with suitable
operations, that is, to provide an abstract data type extension to a DBMS data
model and query language.
Paper.ps.gz (143K),
Paper.pdf (518K)
The Voronoi diagram is a famous structure of computational geometry.
We show that there is a straightforward equivalent in graph theory
which can be efficiently computed. In particular, we give two
algorithms for the computation of graph Voronoi diagrams, prove a
lower bound on the problem, and we identify cases where the algorithms
presented are optimal. The space requirement of a graph Voronoi
diagram is modest, since it needs no more space than the graph itself.
Paper.ps.gz (123K),
Paper.pdf (252K)
The formal treatment of visual languages is often based on
graph representations. Since the matter of discourse is
visual languages, it would be convenient if the formal
manipulations could be performed in a visual way. We
introduce visual graphs to support this goal. In a visual
graph some nodes are shown as geometric figures, and
some edges are represented by geometric relationships
between these figures. We investigate mappings between
visual and abstract graphs and show their application in
semantics definitions for visual languages and in formal
manipulations of visual programs.
Paper.ps.gz (72K),
Paper.pdf (67K)
We introduce a visual language for the specification of
temporally changing spatial situations. The main idea is
to represent spatio-temporal (ST) objects in a two-dimensional
way by their trace. The intersections of these traces with
other objects are interpreted and translated into sequences
of spatial and spatio-temporal predicates, called developments,
that can then be used, for example, to query spatio-temporal
databases.
Summary.ps.gz (29K),
Summary.pdf (15K)
We define a formal model of spatio-temporal partitions which can be
used to model temporally changing maps. We investigate new applications
and generalizations of operations that are well-known for static, spatial
maps. We then define a small set of operations on spatio-temporal
partitions that are powerful enough to express all these tasks and more.
Spatio-temporal partitions combine the general notion of temporal objects
and the powerful spatial partition abstraction into a new, highly expressive
spatio-temporal data modeling tool.
Paper.ps.gz (73K),
Paper.pdf (256K)
Integrating spatio-temporal data as abstract data types into already
existing data models is a promising approach to creating spatio-temporal
query languages. In this context, an important new class of queries
can be identified which is concerned with developments of spatial objects
over time, that is, queries ask especially for changes in spatial relationships.
Based on a definition of the notion of spatio-temporal predicate we provide
a framework which allows to build more and more complex predicates starting
with a small set of elementary ones. These predicates can be well
used to characterize developments. We show how these concepts can
be realized within the relational data model. In particular, we demonstrate
how SQL can be extended to enable the querying of developments.
Paper.dvi.gz (27K),
Paper.ps.gz (38K),
Paper.pdf (133K)
Spatio-temporal databases deal with geometries changing over time.
In general, geometries cannot only change in discrete steps, but continuously,
and we are talking about moving objects. If only the position in
space of an object is relevant, then moving point is a basic abstraction;
if also the extent is of interest, then the moving region abstraction
captures moving as well as growing or shrinking regions. We propose
a new line of research where moving points and moving regions are viewed
as three-dimensional (2D space + time) or higher-dimensional entities whose
structure and behavior is captured by modeling them as abstract data types.
Such types can be integrated as base (attribute) data types into relational,
object-oriented, or other DBMS data models; they can be implemented as
data blades, cartridges, etc. for extensible DBMSs. We expect
these spatio-temporal data types to play a similarly fundamental role for
spatio-temporal databases as spatial data types have played for spatial
databases. The paper explains the approach and discusses several
fundamental issues and questions related to it that need to be clarified
before delving into specific designs of spatio-temporal algebras.
Paper.ps.gz (86K),
Paper.pdf (101K)
In this paper we describe the discrete interval encoding tree
for storing subsets of types having a total order and a predecessor and
a successor function. We consider for simplicity only the case for integer
sets; the generalization is not difficult. The discrete interval encoding
tree is based on the observation that the set of integers
{ i| a<=i<=b}
can be perfectly represented by the closed interval [ a, b]. The
general idea is to represent a set by a binary search tree of integers
in which maximal adjacent subsets are each represented by an interval.
For instance, inserting the sequence of numbers [6, 9, 2, 13, 8, 14, 10,
7, 5] into a binary search tree, respectively, into an discrete interval
encoding tree results in the tree structures shown below:
Paper.dvi.gz (10K),
Paper.ps.gz (30K),
Paper.pdf (149K)
The effective use of visual languages requires a precise understanding
of their meaning. Moreover, it is impossible to prove properties of visual
languages like soundness of transformation rules or correctness results
without having a formal language definition. Although this sounds obvious,
it is surprising that only little work has been done about the semantics
of visual languages, and even worse, there is no general framework available
for the semantics specification of different visual languages. We present
such a framework that is based on a rather general notion of abstract visual
syntax. This framework allows a logical as well as a denotational approach
to visual semantics, and it facilitates the formal reasoning about visual
languages and their properties. We illustrate the concepts of the proposed
approach by defining abstract syntax and semantics for the visual languages
VEX, Show and Tell, and Euler Circles. We demonstrate the semantics in
action by proving a rule for visual reasoning with Euler Circles and by
showing the correctness of a Show and Tell program.
Paper.ps.gz (87K),
Paper.pdf (97K)
We show how to define fold operators for abstract data types. The main
idea is to represent an ADT by a bialgebra, that is, an algebra/coalgebra
pair with a common carrier. A program operating on an ADT is given by a
mapping to another ADT. Such a mapping, called metamorphism, is
basically a composition of the algebra of the second with the coalgebra
of the first ADT. We investigate some properties of metamorphisms, and
we show that the metamorphic programming style offers far-reaching opportunities
for program optimization that cover and even extend those known for algebraic
data types.
Paper.dvi.gz (27K),
Paper.ps.gz (53K),
Paper.pdf (224K)
Haskell Source Files and Related Papers
Currently, there are strong efforts to integrate spatial and temporal
database technology into spatio-temporal database systems. This paper
views the topic from a rather fundamental perspective and makes several
contributions. First, it reviews existing temporal and spatial data models
and presents a completely new approach to temporal data modeling based
on the very general notion of temporal object. The definition of temporal
objects is centered around the observation that anything that changes over
time can be expressed as a function over time. For the modeling of spatial
objects the well known concept of spatial data types is employed. As specific
subclasses, linear temporal and spatial objects are identified. Second,
the paper proposes the database embedding of temporal objects by means
of the abstract data type approach to the integration of complex objects
into databases. Furthermore, we make statements about the expressiveness
of different temporal and spatial database embeddings. Third, we consider
the combination of temporal and spatial objects into spatio-temporal objects
in (relational) databases. We explain various alternatives for spatio-temporal
data models and databases and compare their expressiveness. Spatio-temporal
objects turn out to be specific instances of temporal objects.
Paper.ps.gz (75K),
Paper.pdf (69K)
Spatio-temporal databases deal with geometries changing over time. In
general, geometries cannot only change in discrete steps, but continuously,
and we are talking about moving objects. If only the position in space
of an object is relevant, then moving point is a basic abstraction; if
also the extent is of interest, then the moving region abstraction captures
moving as well as growing or shrinking regions. We propose a new line of
research where moving points and moving regions are viewed as three-dimensional
(2D space + time) or higher-dimensional entities whose structure and behavior
is captured by modeling them as abstract data types. Such types can be
integrated as base (attribute) data types into relational, object-oriented,
or other DBMS data models; they can be implemented as data blades, cartridges,
etc. for extensible DBMSs. We expect these spatio-temporal data types to
play a similarly fundamental role for spatio-temporal databases as spatial
data types have played for spatial databases. In this paper we consider
the need for modeling spatio-temporal data types on two different levels
of abstraction.
Paper.ps.gz (53K),
Paper.pdf (57K)
We demonstrate a systematic way of introducing state monads to improve
the efficiency of data structures. When programs are expressed as transformations
of abstract data types - which becomes possible by suitable definitions
of ADTs and ADT fold operations, we can observe restricted usage patterns
for ADTs, and this can be exploited to derive (i) partial data structures
correctly implementing the ADTs and (ii) imperative realizations of these
partial data structures. With a library of ADTs together with (some of)
their imperative implementations, efficient versions of functional programs
can be obtained without being concerned or even knowing about state monads.
As one example we demonstrate the optimization of a graph ADT resulting
in an optimal imperative implementation of depth-first search.
Paper.dvi.gz (29K),
Paper.ps.gz (55K),
Paper.pdf (239K)
Haskell Source Files and Related Papers
In this paper we introduce visual graphs as an intermediate representation
between concrete visual syntax and abstract graph syntax. In a visual graph
some nodes are shown as geometric figures, and some edges are represented
by geometric relationships between these figures. By carefully designing
visual graphs and corresponding mappings to abstract syntax graphs, semantics
definitions can, at least partially, employ a visual notation while still
being based on abstract syntax. Visual semantics thus offers the "best
of both worlds" by integrating abstract syntax and visual notation. We
also demonstrate that these concepts can be used to give visual semantics
for traditional textual formalisms. As an example we provide a visual definition
of Turing machines.
Summary.ps.gz (31K),
Summary.pdf (20K)
Although maps and partitions are ubiquitous in geographical information
systems and spatial databases, there is only little work investigating
their foundations. We give a rigorous definition for spatial partitions
and propose partitions as a generic spatial data type that can be used
to model arbitrary maps and to support spatial analysis. We identify a
set of three powerful operations on partitions and show that the type of
partitions is closed under them. These basic operators are sufficient to
express all known application-specific operations. Moreover, many map operations
will be considerably generalized in our framework. We also indicate that
partitions can be effectively used as a meta-model to describe other spatial
data types.
Paper.ps.gz (80K),
Paper.pdf (92K)
Functional programs, by nature, operate on functional, or persistent,
data structures. Therefore, persistent graphs are a prerequisite to express
functional graph algorithms. In this paper we describe two implementations
of persistent graphs and compare their running times on different graph
problems. Both implementations essentially represent graphs as adjacency
lists. The first structure uses the version tree implementation of functional
arrays to make adjacency lists persistent. An array cache of the newest
graph version together with a time stamping technique for speeding up deletions
makes it asymptotically optimal for a class of graph algorithms that use
graphs in a single-threaded way. The second approach uses balanced search
trees to implement adjacency lists. For both structures we also consider
several variations, e.g., ignoring edge labels or predecessor information.
Paper.ps.gz (60K),
Paper.pdf (68K)
The effective use of visual languages requires a precise understanding
of their meaning. Moreover, it is impossible to prove properties of visual
languages like soundness of transformation rules or correctness results
without having a formal language definition. Although this sounds obvious,
it is surprising that only little work has been done about the semantics
of visual languages, and even worse, there is no general framework available
for the semantics specification of different visual languages. We present
such a framework that is based on a rather general notion of abstract visual
syntax. This framework allows a logical as well as a denotational approach
to visual semantics, and it facilitates the formal reasoning about visual
languages and their properties. We illustrate the concepts of the proposed
approach by defining abstract syntax and semantics for the visual languages
VEX, Show and Tell, and Euler Circles. For the latter we also prove a rule
for visual reasoning.
Paper.ps.gz (80K),
Paper.pdf (95K)
We propose a separation of visual syntax into concrete and abstract
syntax, much like it is often done for textual languages. Here the focus
is on visual programming languages; the impact on visual languages in general
is not clear by now. We suggest to use unstructured labeled multi-graphs
as abstract visual syntax and show how this facilitates semantics definitions
and transformations of visual languages. In particular, disregarding structural
constraints on the abstract syntax level makes such manipulations simple
and powerful. Moreover, we demonstrate that - in contrast to the traditional
monolithic graph definition - an inductive view of graphs provides a very
convenient way to move in a structured (and declarative) way through graphs.
Again this supports the simplicity of descriptions for transformations
and semantics.
Paper.ps.gz (54K),
Paper.pdf (57K)
In many geographical applications there is a need to model spatial phenomena
not simply by sharp objects, but rather through indeterminate or vague
concepts. To support such applications we present a model of vague regions
which covers and extends previous approaches. The formal framework is based
on a general exact model of spatial data types. On the one hand, this simplifies
the definition of the vague model since we can build upon already existing
theory of spatial data types, on the other hand, this approach facilitates
the migration from exact to vague models. Moreover, exact spatial data
types are subsumed as a special case of the presented vague concepts. We
present examples and show how they are represented within our framework.
We give a formal definition of basic operations and predicates which particularly
allow a more fine-grained investigation of spatial situations than in the
pure exact case. We also demonstrate the integration of the presented concepts
into an SQL-like query language.
Paper.ps.gz (87K),
Paper.pdf (117K)
Graph algorithms expressed in functional languages often suffer from
their inherited imperative, state-based style. In particular, this impedes
formal program manipulation. We show how to model persistent graphs in
functional languages by graph constructors. This provides a decompositional
view of graphs which is very close to that of data types and leads to a
``more functional'' formulation of graph algorithms. Graph constructors
enable the definition of general fold operations for graphs. We present
a promotion theorem for one of these folds that allows program fusion and
the elimination of intermediate results. Fusion is not restricted to the
elimination of tree-like structures, and we prove another theorem that
facilitates the elimination of intermediate graphs. We describe an ML-implementation
of persistent graphs which efficiently supports the presented fold operators.
For example, depth-first-search expressed by a fold over a functional graph
has the same complexity as the corresponding imperative algorithm.
Paper.ps.gz (62K),
Paper.pdf (242K)
ML-Source.tar.gz (7K) ... for an improved
implementation, have a look at the
Functional Graph Library
Active patterns apply preprocessing functions to data type values before
they are matched. This is of use for unfree data types where more than
one representation exists for an abstract value: In many cases there is
a specific representation for which function definitions become very simple,
and active patterns just allow to assume this specific representation in
function definitions. We define the semantics of active patterns and describe
their implementation.
Paper.dvi.gz (30K),
Paper.ps.gz (57K),
Paper.pdf (245K)
After more than a decade of research, visual languages have still not
become everyday programming tools. On a short term, an integration of visual
languages with well-established (textual) programming languages may be
more likely to meet the actual requirements of practical software development
than the highly ambitious goal of creating purely visual languages. In
such an integration each paradigm can support the other where it is superior.
Particularly attractive is the use of visual expressions for the description
of domain-specific data structures in combination with textual notations
for abstract control structures. In addition to a basic framework for heterogeneous
languages, we outline the design of a development system that allows rapid
prototyping of implementations of heterogeneous languages. Examples will
be presented from the domains of logical, functional, and procedural languages.
Paper.ps.gz (66K),
Paper.pdf (63K)
A new data structure is presented which facilitates the search for shortest
paths in spatially embedded planar networks in a worst-case time of O(l
log r) where l is the number of edges in the shortest path
to be found and r is an upper bound on the number of so-called ``cross
edges'' (these are edges connecting, for any node v, different shortest
path subtrees rooted at v's successors). The data structure is based
on the idea to identify shortest path subtrees with the regions in the
plane they cover. In the worst-case the space requirement is O(rn),
which, in general, is O(n^2), but for regularly shaped networks
it is expected to be only O(n\sqrt(n)). A decomposition of
graphs into bi-connected components can be used to reduce the sizes of
the trees to be encoded and to reduce the complexity of the regions for
these trees. The decomposition also simplifies the algorithm for computing
encoding regions, which is based on minimum link paths in polygons. Approximations
for region boundaries can effectively be utilized to speed up the process
of shortest path reconstruction: For a realistically constrained class
of networks, i.e., networks in which the ratio of the shortest path distance
between any two points to the Euclidean distance between these points is
bounded by a constant, it is shown that an average searching time of O(l)
can be achieved.
Algorithms can be regarded as instructions for building and/or changing
data structures. In order to describe algorithms visually we need (i) a
visual representation of data structures, and (ii) a (visual) representation
of modifications.
The variables i and j below the array denote indices which
are interpreted by (nested) loops over A: Here, by default, the
first loop for i ranges from 1 to n, and the inner loop for
j
runs from n down to i. The lower bound of this loop is given
by the condition under the arrow, j> i, which also specifies
the direction of the loop (> means descending). The condition above the
arrow restricts the action described by the right side: The array elements
y
and z are only exchanged if y< z. Since the label
z
is located directly adjacent to y both labels are interpreted as
adjacent array elements.
The icon above the box describes how to apply the defined function.
The action rule inside the box initially binds x to A[1]
(since the label x is located at the beginning of the array icon)
and describes three array comprehensions: [a \in A | a<
x],
[b \in A | b=x], and [c \in
A
| c> x]. The pattern on the the right side appends the three
new arrays defined by the array comprehensions where the quicksort function
is recursively applied to the first and to the third one.
Observing that networks are ubiquitous in applications for spatial databases,
we define a new data model and query language that especially supports
graph structures. This model integrates concepts of functional data modeling
with order-sorted algebra. Besides object and data type hierarchies graphs
are available as an explicit modeling tool, and graph operations are part
of the query language. Graphs have three classes of components, namely
nodes, edges, and explicit paths. These are at the same time object types
within the object type hierarchy and can be used like any other type. Explicit
paths are useful because ``real world'' objects often correspond to paths
in a network. Furthermore, a dynamic generalization concept is introduced
to handle heterogeneous collections of objects in a query. In connection
with spatial data types this leads to powerful modeling and querying capabilities
for spatial databases, in particular for spatially embedded networks such
as highways, rivers, public transport, and so forth. We use multi-level
order-sorted algebra as a formal framework for the specification of our
model. Roughly spoken, the first level algebra defines types and operations
of the query language whereas the second level algebra defines kinds (collections
of types) and type constructors as functions between kinds and so provides
the types that can be used at the first level.
Paper.ps.gz (123K),
Paper.pdf (147K),
Long Version (Report).ps.gz (156K),
Long Version (Report).pdf (181K)
This thesis investigates the integration of graph structures into spatial
databases. We consider three different perspectives:
We propose to use order-sorted algebras (OSA) on multiple levels to
describe languages together with their type systems. It is demonstrated
that even advanced aspects can be modeled, including, parametric polymorphism,
complex relationships between different sorts of an operation's rank, the
specification of a variable number of parameters for operations, and type
constructors using values (and not only types) as arguments.
Paper.dvi.gz (16K),
Paper.ps.gz (44K),
Paper.pdf (221K)
We encourage a specific view of graph algorithms, which can be paraphrased
by "iterate over the graph elements in a specific order and perform computations
in each step". Data structures will be used to define the processing order,
and we will introduce recursive mapping/array definitions as a vehicle
for specifying the desired computations. Being concerned with the problem
of implementing graph algorithms, we outline the extension of a functional
programming language by graph types and operations. In particular, we explicate
an exploration operator that essentially embodies the proposed view of
algorithms. Fortunately, the resulting specifications of algorithms, in
addition to being compact and declarative, are expected to have an almost
as efficient implementation as their imperative counterparts.
Paper.dvi.gz (24K),
Paper.ps.gz (55K),
Paper.pdf (246K)
We present a functional DBPL in the style of FP that facilitates the
definition of precise semantics and opens up opportunities for far-reaching
optimizations. The language is integrated into a functional data model,
which is extended by arbitrary type hierarchies and complex objects. Thus
we are able to provide the clarity of FP-like programs together with the
full power of semantic data modeling. To give an impression of the special
facilities for optimizing functional database languages, we point out some
laws not presented before which enable access path selection already on
the algebraic level of optimization. The algebraic way of access path optimization
also gives new insights into optimization strategies.
Paper.ps.gz (68K),
Paper.pdf (217K)
The Graph Voronoi Diagram with Applications
Martin Erwig.
Networks, Vol 36, No. 3, 156-163, 2000
Visual Graphs
Martin Erwig
15th IEEE Symp. on Visual Languages
(VL'99), 122-129, 1999
Visual Specifications of Spatio-Temporal Developments
Martin Erwig and Markus Schneider
15th IEEE Symp. on Visual Languages
(VL'99), 187-188, 1999
The Honeycomb Model of Spatio-Temporal Partitions
Martin Erwig and Markus Schneider
Int. Workshop on Spatio-Temporal Database Management
(STDBM'99),
LNCS
1678, 39-59, 1999
Developments in Spatio-Temporal Query Languages
Martin Erwig and Markus Schneider
IEEE Int. Workshop on Spatio-Temporal Data Models and Languages,
441-449, 1999
Spatio-Temporal Data Types: An Approach to Modeling and Querying Moving
Objects in Databases
Martin Erwig, Ralf H. Güting, Markus Schneider, and Michalis Vazirgiannis.
GeoInformatica, Vol. 3, No. 3, 269-296, 1999
Diets for Fat Sets
Martin Erwig.
Journal of Functional Programming, Vol. 8, No. 6, 627-632, 1998

Abstract Syntax and Semantics of Visual Languages
Martin Erwig.
Journal of Visual Languages and Computing, Vol. 9, No. 5, 461-483, 1998
Categorical Programming with Abstract Data Types
Martin Erwig.
7th Int. Conf. on Algebraic Methodology and Software Technology
(AMAST'98),
LNCS
1548, 406-421, 1998
Temporal Objects for Spatio-Temporal Data Models and a Comparison of Their
Representations
Martin Erwig, Markus Schneider, and Ralf H. Güting
Int. Workshop on Advances in Database Technologies,
LNCS
1552, 454-465, 1998
Abstract and Discrete Modeling of Spatio-Temporal Data Types
Martin Erwig, Ralf H. Güting, Markus Schneider, and Michalis Vazirgiannis.
6th ACM Symp. on Geographic Information Systems
(ACMGIS'98),
131-136, 1998
The Categorical Imperative - Or: How to Hide Your State Monads
Martin Erwig.
10th Int. Workshop on Implementation of Functional Languages
(IFL'98), 1-25, 1998
Visual Semantics - Or: What You See is What You Compute
Martin Erwig.
14th IEEE Symp. on Visual Languages
(VL'98), 96-97, 1998
Partition and Conquer
Martin Erwig and Markus Schneider.
3rd Int. Conf. on Spatial Information Theory
(COSIT'97),
LNCS
1329, 389-408, 1997
Fully Persistent Graphs - Which One to Choose?
Martin Erwig.
9th Int. Workshop on Implementation of Functional Languages
(IFL'97),
LNCS
1467, 123-140, 1997
Semantics of Visual Languages
Martin Erwig.
13th IEEE Symp. on Visual Languages
(VL'97), 304-311, 1997
Abstract Visual Syntax
Martin Erwig.
2nd IEEE Int. Workshop on Theory of Visual Languages
(TVL'97), 15-25, 1997
Vague Regions
Martin Erwig and Markus Schneider.
5th Int. Symp. on Advances in Spatial Databases
(SSD'97),
LNCS
1262, 298-320, 1997
Functional Programming with Graphs
Martin Erwig.
2nd ACM SIGPLAN Int. Conf. on Functional Programming
(ICFP'97),
52-65, 1997
(also in: ACM SIGPLAN Notices, Vol. 32,No. 8, Aug
1997, pp. 52-65)
Active Patterns
Martin Erwig.
8th Int. Workshop on Implementation of Functional Languages
(IFL'96),
LNCS
1268, 21-40, 1996
Heterogeneous Visual Languages - Integrating Visual and Textual Programming
Martin Erwig and Bernd Meyer. 11th IEEE Symp. on Visual Languages
(VL'95),
318-325, 1995
Encoding Shortest Paths in Spatial Networks
Martin Erwig.
Networks, Vol. 26,291-303, 1995
DEAL - A Language for Depicting Algorithms
Martin Erwig. 10th IEEE Symp. on Visual Languages (VL'94),184-185,
1994
![]()

Explicit Graphs in a Functional Model for Spatial Databases
Martin Erwig and Ralf H. Güting.
IEEE Transactions on Knowledge and Data Engineering, Vol. 5,No. 6, 787-804, 1994
Graphs in Spatial Databases
Martin Erwig. Doctoral Thesis, FernUniversität Hagen, 1994
Specifying Type Systems with Multi-Level Order-Sorted Algebra
Martin Erwig.
3rd Int. Conf. on Algebraic Methodology and Software Technology (AMAST'93),177-184, 1993
Graph Algorithms = Iteration + Data Structures? The Structure of Graph
Algorithms and a Corresponding Style of Programming
Martin Erwig. 18th Int. Workshop on Graph Theoretic Concepts in Computer
Science (WG'92),
LNCS
657, 277-292, 1992
A Functional DBPL Revealing High Level Optimizations
Martin Erwig and Udo W. Lipeck.
3rd Int. Workshop on Database Programming Languages (DBPL3),306-321, 1991
last change: July 4, 2009
Martin Erwig
 erwig@eecs.oregonstate.edu