Research on anomaly detection has been held back by the lack of good benchmark problems. Existing benchmarks are typically either proprietary or highly artificial. Furthermore, they provide no way to systematically vary important problem dimensions. To address these issues, we developed a benchmarking methodology based on repurposing supervised learning data sets from the UCI repository.
The basic approach is described in this paper: Emmott, A. F., Das, S., Dietterich, T. G., Fern, A., & Wong, W.-K. (2013). Systematic construction of anomaly detection benchmarks from real data. ODD '13: Proceedings of the ACM SIGKDD Workshop on Outlier Detection and Description (pp. 16-21).
A journal paper is in preparation. The entire collection will be made available after the journal paper is accepted for publication.
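
To give a rough sense of what repurposing a supervised data set can look like, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the procedure from the paper: the helper name `make_anomaly_benchmark`, its parameters, and the choice to control only the anomaly rate are all hypothetical. The sketch treats some classes as candidate anomalies, treats the remaining classes as normal, and subsamples the candidates to reach a target anomaly rate.

```python
import numpy as np


def make_anomaly_benchmark(X, y, anomaly_classes, anomaly_rate, seed=0):
    """Relabel a supervised data set as an anomaly detection benchmark.

    Hypothetical helper (not from the paper): points whose class is in
    `anomaly_classes` become candidate anomalies, all other points are
    treated as normal. Candidates are subsampled so that anomalies make
    up roughly `anomaly_rate` of the returned benchmark.
    """
    if not 0.0 < anomaly_rate < 1.0:
        raise ValueError("anomaly_rate must be strictly between 0 and 1")

    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)

    is_candidate = np.isin(y, list(anomaly_classes))
    normal_idx = np.flatnonzero(~is_candidate)
    candidate_idx = np.flatnonzero(is_candidate)

    # Choose k anomalies so that k / (k + n_normal) is about anomaly_rate,
    # capped by the number of candidate anomalies that actually exist.
    n_normal = normal_idx.size
    k = int(round(anomaly_rate * n_normal / (1.0 - anomaly_rate)))
    k = min(k, candidate_idx.size)
    chosen = rng.choice(candidate_idx, size=k, replace=False)

    # Keep every normal point plus the sampled anomalies, in random order.
    keep = np.concatenate([normal_idx, chosen])
    rng.shuffle(keep)
    return X[keep], is_candidate[keep]


# Usage on synthetic data (a stand-in for a real UCI data set).
demo_rng = np.random.default_rng(1)
X_demo = demo_rng.normal(size=(1000, 4))
y_demo = demo_rng.integers(0, 3, size=1000)
X_bench, is_anomaly = make_anomaly_benchmark(
    X_demo, y_demo, anomaly_classes={2}, anomaly_rate=0.05
)
```

The returned boolean array serves as ground truth for evaluating a detector. Calling the helper with different `anomaly_rate` values or different `anomaly_classes` yields a family of benchmarks from a single source data set, which is the kind of control over problem dimensions that fixed benchmarks lack.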