Research Issues in Spatio-Temporal Learning
The final discussion at the workshop identified the following points
as areas for future research or important questions to be considered:
- Benchmark Data Sets. Good benchmark data sets can spur research
and help measure progress.
- Generic Tools. As good algorithms and methodologies are
developed, we need to deliver generic spatio-temporal learning tools
that make these methods accessible to practitioners.
- What could form the basis of a generic tool? The
divide-and-conquer schema is general, but does not adequately model
spatio-temporal correlations. HMMs, conditional random fields, and
related probabilistic models show promise of being general and do
adequately model spatio-temporal correlations. Graph transformer
networks are very general and also support many different kinds of
global loss functions. How hard is it to make global gradient descent
work in these networks?
- Literature on Transducers. We need to read the literature on
finite-state transducers. There is a useful library package called libfsm.
- Understandable models. The primary goal of many data mining
applications is to understand the data (rather than necessarily to
obtain optimal predictive accuracy). What methods are appropriate for
this?
- Alignment. Often, the input sequence x and the output
sequence y are not of the same length and must be aligned
somehow. This is a challenge for sliding window and HMM methods.
- Relational Learning. Temporal and spatial problems often
involve reasoning about the relationships among objects, events, etc.
Can relational learning methods be applied? In the other direction,
temporal and spatial learning studies special kinds of relationships.
Can these be generalized to relationships with arbitrary topology?
- Optimizing non-local loss functions (especially on-line).
Many problems involve non-local loss functions (i.e., loss functions
that involve predictions at several points in time or space). The
talks in the workshop ignored this for the most part (with the
exception of graph transformer networks). How can sliding window and
probabilistic methods optimize non-local loss functions?
- Evaluation Methods. How should temporal and spatial methods
be evaluated? What are good evaluation metrics? What are some of the
issues involved? For example, what are the appropriate methods for
cross-validation or hold-out validation in temporal and spatial problems?
- Active Learning. A naive approach to active learning would
simply apply standard methods to label entire (x,y) pairs. But
perhaps active learning should consider specific spatial regions or
temporal sections?
- Partial Orderings. Temporal and sequential data exhibit a
total ordering. Are there applications that involve partial
orderings? What methods are needed to handle such problems?
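The Alignment item above notes that the input sequence x and the output sequence y may differ in length. One classical way to align two such sequences is dynamic time warping; it is shown here only as an illustration and is not a method the workshop discussion named. The sketch below assumes numeric sequences and an absolute-difference local cost; the function name dtw_alignment and the dist parameter are hypothetical.

```python
def dtw_alignment(x, y, dist=lambda a, b: abs(a - b)):
    """Return (total_cost, path) for a dynamic time warping alignment.

    path is a list of (i, j) index pairs matching x[i] to y[j].
    The default dist is an assumption; any non-negative local cost works.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] = cheapest alignment of x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = min(cost[i - 1][j - 1],  # advance in both sequences
                       cost[i - 1][j],      # advance in x only
                       cost[i][j - 1])      # advance in y only
            cost[i][j] = dist(x[i - 1], y[j - 1]) + step

    # Trace back from the end to recover one optimal alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


total, path = dtw_alignment([1, 2, 3], [1, 2, 2, 3])
```

In this small example the second element of x is matched to two consecutive elements of y, which is the kind of many-to-one correspondence that fixed-width sliding windows cannot express directly.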
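To make the distinction in the item on non-local loss functions concrete, the following sketch contrasts a loss that decomposes over individual positions with one that depends on the whole predicted sequence. Both function names are hypothetical, and the two losses are chosen only as simple examples; the discussion did not single them out.

```python
def hamming_loss(y_true, y_pred):
    """Local loss: the fraction of positions labeled incorrectly.

    It is a sum of per-position terms, so each position can be
    optimized independently of the others.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def sequence_zero_one_loss(y_true, y_pred):
    """Non-local loss: 1.0 unless the entire sequence is correct.

    It cannot be written as a sum of independent per-position terms.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("sequences must be non-empty and of equal length")
    return 0.0 if list(y_true) == list(y_pred) else 1.0


truth = list("AAABBB")
guess = list("AAABBA")  # one wrong label at the final position
local = hamming_loss(truth, guess)
whole = sequence_zero_one_loss(truth, guess)
```

A single wrong label costs one sixth under the local loss but the full penalty under the whole-sequence loss, so a learner tuned for one may behave poorly under the other.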
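On the cross-validation question raised under Evaluation Methods: randomly shuffled folds let a model train on observations that come after the ones it is tested on. One common alternative for temporal data is forward chaining, in which each fold trains on a prefix of the series and tests on the block that follows. The sketch below is a minimal version under that assumption; the function name and the gap parameter (time steps dropped between training and test blocks to reduce leakage from correlated neighbours) are hypothetical.

```python
def forward_chaining_splits(n_samples, n_folds, gap=0):
    """Yield (train_indices, test_indices) pairs ordered in time.

    Every training index precedes every test index. The series is cut
    into n_folds + 1 blocks; fold k trains on the first k blocks and
    tests on the next one (the last fold also takes any remainder).
    """
    if n_folds < 1 or n_samples < n_folds + 1:
        raise ValueError("need at least n_folds + 1 samples")
    block = n_samples // (n_folds + 1)
    for k in range(1, n_folds + 1):
        train_end = k * block
        test_start = train_end + gap
        test_end = n_samples if k == n_folds else (k + 1) * block
        if test_start >= test_end:
            continue  # the gap consumed this fold's whole test block
        yield list(range(train_end)), list(range(test_start, test_end))


splits = list(forward_chaining_splits(10, 4))
```

Spatial data raises the analogous concern that nearby locations are correlated, which this one-dimensional sketch does not address.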