


The hoofed animals dataset consists of 200 images showing a total of 715 animals belonging to the following six categories: cows, horses, sheep, goats, camels, and deer. It is intended to help evaluate the capability of an object recognition system to: (i) detect, recognize, and segment all instances of the categories present in the images, and (ii) establish relationships among the categories in terms of, e.g., their similarity, mutual containment, co-occurrence, and shared subparts.

The hoofed animals dataset is designed to complement currently popular benchmarks, such as Caltech-256 and PASCAL. The major deficiencies of these datasets are that their images typically contain a single, prominently featured object from an object category, and that the categories used differ significantly in appearance and topology. In contrast, the hoofed animals dataset contains very similar categories, and therefore challenges an algorithm to resolve subtle cross-category differences. Since the animals are similar, they share a number of similar parts (e.g., horses and deer have similar limbs). The animals also have category-specific, discriminative subparts that allow for categorization (e.g., only deer have antlers). This makes the dataset suitable for evaluating which inter-category relationships an algorithm is capable of capturing (e.g., similarity in terms of shared parts, or taxonomy in terms of shared and unshared parts). Another increase in complexity over popular benchmark datasets is that each image may contain multiple instances of multiple categories. Other challenges include the following: (i) the animals are articulated, non-rigid objects; (ii) they appear at different scales across the dataset; and (iii) they may be partially occluded amidst clutter.


HoofedAnimals.zip contains a total of 200 original images in PGM format, and the corresponding set of manually segmented images in PPM format. It also contains the file ground_truth.txt, which details how many instances of each category appear in each image. This ground-truth report is organized as a 200x6 matrix, where rows correspond to the images, in the order they are enumerated, and columns correspond to the categories in the following order: cows, horses, sheep, goats, camels, and deer. For example, the value 4 in row 16 and column 2 indicates that image 16.pgm contains 4 horses. The following table shows the number of occurrences of each category in the dataset:
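The row and column layout of ground_truth.txt described above can be read with a short Python sketch such as the one below. The exact text layout of the file is not specified here, so the sketch assumes one row per line with six integer counts separated by whitespace or commas; the function names are hypothetical and used only for illustration.

```python
# Column order as stated in the dataset description.
CATEGORIES = ["cows", "horses", "sheep", "goats", "camels", "deer"]


def parse_ground_truth(text):
    """Parse the count matrix into a list of rows, one row per image.

    Assumption: each non-blank line holds six integers separated by
    whitespace or commas. Adjust the splitting if the real file differs.
    """
    rows = []
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != len(CATEGORIES):
            raise ValueError(
                f"expected {len(CATEGORIES)} columns, got {len(fields)}"
            )
        rows.append([int(field) for field in fields])
    return rows


def count_in_image(rows, image_number, category):
    """Return the count for a 1-based image number and a category name."""
    return rows[image_number - 1][CATEGORIES.index(category)]


def totals_per_category(rows):
    """Sum each column to get the number of occurrences per category."""
    return {
        name: sum(row[index] for row in rows)
        for index, name in enumerate(CATEGORIES)
    }
```

With the real file, `parse_ground_truth(open("ground_truth.txt").read())` would be expected to return 200 rows, and `count_in_image(rows, 16, "horses")` would correspond to the "row 16, column 2" lookup in the example above.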


Both masks and outer contours of manually segmented animals are provided as ground truth. The masks of different animals are marked with different (R,G,B) values: cows are red (255,0,0); horses are green (0,255,0); sheep are blue (0,0,255); goats are magenta (255,0,255); camels are yellow (255,255,0); and deer are cyan (0,255,255). The background pixels that do not belong to the animals are black (0,0,0). When an image shows multiple instances of one category, their masks have decreasing intensities: the nonzero RGB values are decreased by 10 for each additional instance. For example, an image that shows 2 sheep and 5 goats has the following masks: (0,0,255), (0,0,245), (255,0,255), (245,0,245), (235,0,235), (225,0,225), and (215,0,215).
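The color coding above can be inverted to recover the category and instance index of a mask pixel. The following Python sketch assumes pixels are available as (R,G,B) integer tuples (reading the PPM files is left out) and covers only the mask colors; how the outer contours are encoded is not described here, so any other color is rejected. The function names are hypothetical.

```python
# Which channels are nonzero identifies the category.
BASE_PATTERNS = {
    (1, 0, 0): "cows",    # red
    (0, 1, 0): "horses",  # green
    (0, 0, 1): "sheep",   # blue
    (1, 0, 1): "goats",   # magenta
    (1, 1, 0): "camels",  # yellow
    (0, 1, 1): "deer",    # cyan
}


def decode_mask_color(rgb):
    """Map an (R, G, B) mask value to (category, instance_index).

    Returns None for the black background. instance_index is 0 for the
    full-intensity mask (255) and grows by 1 for each step of 10.
    """
    rgb = tuple(rgb)
    if rgb == (0, 0, 0):
        return None
    pattern = tuple(1 if channel > 0 else 0 for channel in rgb)
    category = BASE_PATTERNS.get(pattern)
    levels = {channel for channel in rgb if channel > 0}
    if category is None or len(levels) != 1:
        raise ValueError(f"not a recognized mask color: {rgb}")
    index, remainder = divmod(255 - levels.pop(), 10)
    if remainder != 0:
        raise ValueError(f"intensity is not 255 minus a multiple of 10: {rgb}")
    return category, index


def count_instances(pixels):
    """Count distinct instances per category from an iterable of pixels."""
    seen = set()
    for rgb in pixels:
        decoded = decode_mask_color(rgb)
        if decoded is not None:
            seen.add(decoded)
    counts = {}
    for category, _ in seen:
        counts[category] = counts.get(category, 0) + 1
    return counts
```

Applied to the seven mask colors of the example above, `count_instances` yields 2 sheep and 5 goats, which can be cross-checked against the corresponding row of ground_truth.txt.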


N. Ahuja and S. Todorovic, "Learning the taxonomy and models of categories present in arbitrary images," in Proc. IEEE Int. Conf. Computer Vision (ICCV 2007), Rio de Janeiro, Brazil, 2007.