Research Issues in Spatio-Temporal Learning

The final discussion at the workshop identified the following points as areas for future research or as important questions to consider:

Benchmark Data Sets. Good benchmark data sets can spur research and help measure progress.
Generic Tools. As good algorithms and methodologies are developed, we need to deliver generic spatio-temporal learning tools that make these methods accessible to practitioners.
What could form the basis of a generic tool? The divide-and-conquer schema is general, but it does not adequately model spatio-temporal correlations. HMMs, conditional random fields, and related probabilistic models show promise of being general, and they do model spatio-temporal correlations adequately. Graph transformer networks are very general and also support many different kinds of global loss functions. How hard is it to make global gradient descent work in these networks?
Literature on Transducers. We need to read the literature on finite-state transducers. There is a useful library package called libfsm.
Understandable Models. The primary goal of many data mining applications is to understand the data (rather than necessarily to obtain optimal predictive accuracy). What methods are appropriate for this?
Alignment. Often, the input sequence x and the output sequence y are not of the same length and must be aligned somehow. This is a challenge for sliding window and HMM methods.
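To make the alignment problem concrete, here is a minimal sketch of dynamic time warping (DTW), one classical way to align two sequences of different lengths. DTW was not necessarily discussed at the workshop; it is used here only as an illustration. The function name `dtw_align` and the absolute-difference distance are assumptions chosen for the example.

```python
def dtw_align(x, y, dist=lambda a, b: abs(a - b)):
    """Align two non-empty sequences with dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index
    pairs matching x[i] to y[j]. Illustrative sketch only.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(x[i - 1], y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # x[i-1] repeats a match
                                 cost[i][j - 1],      # y[j-1] repeats a match
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover which elements were matched.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


total, path = dtw_align([1, 2, 3], [1, 2, 2, 3])
```

In this example the second element of x is matched to two elements of y, which is exactly the kind of many-to-one correspondence that a fixed-width sliding window cannot express directly.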
Relational Learning. Temporal and spatial problems often involve reasoning about the relationships among objects, events, etc. Can relational learning methods be applied? In the other direction, temporal and spatial learning studies special kinds of relationships. Can these be generalized to relationships with arbitrary topology?
Optimizing Non-Local Loss Functions (Especially On-Line). Many problems involve non-local loss functions (i.e., loss functions that involve predictions at several points in time or space). The talks in the workshop ignored this for the most part (with the exception of graph transformer networks). How can sliding window and probabilistic methods optimize non-local loss functions?
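As a small illustration of the difference between local and non-local losses, the sketch below compares a per-position Hamming loss with a segment-level loss that depends on whole runs of labels. The segment-level F1 loss and the function names are assumptions chosen for the example, not measures proposed at the workshop.

```python
from itertools import groupby


def hamming_loss(y_true, y_pred):
    """Local loss: a sum of independent per-position errors."""
    return sum(t != p for t, p in zip(y_true, y_pred))


def segments(labels):
    """Return the set of maximal runs as (label, start, end) triples."""
    segs, start = set(), 0
    for label, run in groupby(labels):
        length = len(list(run))
        segs.add((label, start, start + length))
        start += length
    return segs


def segment_f1_loss(y_true, y_pred):
    """Non-local loss: 1 - F1 over exactly matching segments."""
    true_segs, pred_segs = segments(y_true), segments(y_pred)
    hits = len(true_segs & pred_segs)
    if hits == 0:
        return 1.0
    precision = hits / len(pred_segs)
    recall = hits / len(true_segs)
    return 1.0 - 2 * precision * recall / (precision + recall)


y_true = [0, 0, 0, 1, 1, 1, 0, 0, 0]
pred_a = [0, 0, 1, 1, 1, 1, 0, 0, 0]  # one boundary shifted by one step
pred_b = [0, 0, 0, 1, 0, 1, 0, 0, 0]  # one error that splits a segment

ham_a, ham_b = hamming_loss(y_true, pred_a), hamming_loss(y_true, pred_b)
seg_a, seg_b = segment_f1_loss(y_true, pred_a), segment_f1_loss(y_true, pred_b)
```

Both predictions make exactly one per-position error, yet their segment-level losses differ. A method that minimizes the per-position loss therefore cannot tell them apart.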
Evaluation Methods. How should temporal and spatial methods be evaluated? What are good evaluation metrics? What are some of the issues involved? For example, what are the appropriate methods for cross-validation or hold-out validation in temporal and spatial problems?
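One issue is that ordinary random cross-validation can place test points before training points in time, which leaks future information into the model. The sketch below shows a forward-chaining split, one possible answer for temporal data, in which every training set strictly precedes its test set. The function name and the `gap` parameter (a buffer of discarded points between training and test) are assumptions for illustration.

```python
def forward_chaining_splits(n_samples, n_folds, gap=0):
    """Yield (train_indices, test_indices) with training before testing.

    Samples are assumed to be indexed in temporal order. `gap` points
    are skipped between the end of training and the start of testing
    to reduce leakage from temporally correlated neighbors.
    """
    fold_size = n_samples // (n_folds + 1)
    if fold_size < 1:
        raise ValueError("not enough samples for the requested folds")
    for k in range(1, n_folds + 1):
        train_end = k * fold_size
        test_start = train_end + gap
        test_end = min(test_start + fold_size, n_samples)
        if test_start >= test_end:
            break  # the gap pushed the test block past the data
        yield list(range(train_end)), list(range(test_start, test_end))


splits = list(forward_chaining_splits(n_samples=10, n_folds=4, gap=1))
```

Spatial data raise the analogous problem: nearby locations are correlated, so held-out points may need to be grouped into contiguous blocks rather than sampled individually.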
Active Learning. A naive approach to active learning would simply apply standard methods to label entire (x,y) pairs. But perhaps active learning should consider specific spatial regions or temporal sections?
Partial Orderings. Temporal and sequential data exhibits a total ordering. Are there applications that involve partial orderings? What methods are needed to handle such problems?