Species Prediction Accuracy


Table 1 shows the percentage accuracy obtained in predicting the species of isolated leaves. The training set (i.e. the template database) for this result consisted only of isolated leaves of the six species introduced earlier, with twenty isolated leaves of each species. The overall accuracy obtained for isolated leaves is 96%.

 

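| Species                | Total leaves | Classified as AC | as AG | as AM | as AN | as QG | as QK | Correctly classified | Accuracy |
|------------------------|--------------|------------------|-------|-------|-------|-------|-------|----------------------|----------|
| Acer Circinatum (AC)   | 40           | 40               | 0     | 0     | 0     | 0     | 0     | 40                   | 100%     |
| Acer Glabrum (AG)      | 40           | 0                | 38    | 2     | 0     | 0     | 0     | 38                   | 95%      |
| Acer Macrophyllum (AM) | 40           | 2                | 0     | 35    | 0     | 3     | 0     | 35                   | 88%      |
| Acer Negundo (AN)      | 18           | 1                | 0     | 0     | 17    | 0     | 0     | 17                   | 94%      |
| Quercus Garryana (QG)  | 40           | 1                | 0     | 0     | 0     | 39    | 0     | 39                   | 98%      |
| Quercus Kelloggii (QK) | 40           | 0                | 0     | 0     | 0     | 0     | 40    | 40                   | 100%     |

Table 1: Predicting species of isolated leaves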
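The per-species and overall accuracy figures can be recomputed from the confusion counts in Table 1. The following is a minimal sketch, not the authors' code; the names `LABELS`, `TABLE_1`, `per_species_accuracy` and `overall_accuracy` are illustrative assumptions. It assumes that overall accuracy means correctly classified leaves divided by all test leaves.

```python
# Confusion counts copied from Table 1.
# Rows are the true species, columns the predicted species (same order as LABELS).
LABELS = ["AC", "AG", "AM", "AN", "QG", "QK"]
TABLE_1 = [
    [40, 0, 0, 0, 0, 0],
    [0, 38, 2, 0, 0, 0],
    [2, 0, 35, 0, 3, 0],
    [1, 0, 0, 17, 0, 0],
    [1, 0, 0, 0, 39, 0],
    [0, 0, 0, 0, 0, 40],
]


def per_species_accuracy(matrix):
    """Fraction of each species' leaves that landed on the diagonal."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Correctly classified leaves divided by all test leaves (assumed definition)."""
    correct = sum(row[i] for i, row in enumerate(matrix))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    for label, acc in zip(LABELS, per_species_accuracy(TABLE_1)):
        print(f"{label}: {acc:.1%}")
    print(f"overall: {overall_accuracy(TABLE_1):.1%}")
```

Under this definition the overall accuracy is 209/218, which rounds to the 96% reported above. The same functions apply to the counts in the other tables.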

Table 2 shows the percentage accuracy for predicting the plant species of samples from the herbarium. The training set for this case is the same as the one used above. The test set (i.e. the set of unknown images) was constructed from processed, digitized samples from the herbarium. The overall accuracy in this case is 59%.

 

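| Species                | Total leaves | Classified as AC | as AG | as AM | as AN | as QG | as QK | Correctly classified | Accuracy |
|------------------------|--------------|------------------|-------|-------|-------|-------|-------|----------------------|----------|
| Acer Circinatum (AC)   | 30           | 25               | 3     | 2     | 0     | 0     | 0     | 25                   | 83%      |
| Acer Glabrum (AG)      | 30           | 1                | 20    | 4     | 1     | 4     | 0     | 20                   | 67%      |
| Acer Macrophyllum (AM) | 30           | 0                | 16    | 14    | 0     | 0     | 0     | 14                   | 47%      |
| Acer Negundo (AN)      | 8            | 1                | 4     | 1     | 2     | 0     | 0     | 2                    | 25%      |
| Quercus Garryana (QG)  | 30           | 4                | 2     | 4     | 0     | 19    | 1     | 19                   | 63%      |
| Quercus Kelloggii (QK) | 20           | 2                | 3     | 4     | 0     | 3     | 8     | 8                    | 40%      |

Table 2: Predicting species of digitized herbarium sample images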

Table 3 shows the percentage accuracy for predicting the plant species of isolated leaves. In this case, however, the training set consists of leaf shapes extracted from the digitized herbarium samples. The overall accuracy in this case is 61%.

 

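| Species                | Total leaves | Classified as AC | as AG | as AM | as AN | as QG | as QK | Correctly classified | Accuracy |
|------------------------|--------------|------------------|-------|-------|-------|-------|-------|----------------------|----------|
| Acer Circinatum (AC)   | 40           | 34               | 1     | 1     | 0     | 4     | 0     | 34                   | 85%      |
| Acer Glabrum (AG)      | 40           | 7                | 33    | 0     | 0     | 0     | 0     | 33                   | 83%      |
| Acer Macrophyllum (AM) | 40           | 18               | 7     | 9     | 0     | 5     | 1     | 9                    | 23%      |
| Acer Negundo (AN)      | 18           | 15               | 1     | 2     | 0     | 0     | 0     | 0                    | 0%       |
| Quercus Garryana (QG)  | 40           | 3                | 2     | 0     | 0     | 32    | 3     | 32                   | 80%      |
| Quercus Kelloggii (QK) | 40           | 3                | 0     | 11    | 0     | 0     | 26    | 26                   | 65%      |

Table 3: Predicting species of isolated leaves with training templates consisting of only herbarium sample images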
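To see where the accuracy in Table 3 is lost, it can help to rank the off-diagonal counts. The sketch below is an illustrative reading aid only; `TABLE_3`, `LABELS` and `top_confusions` are assumed names, not part of the original system.

```python
# Confusion counts copied from Table 3.
# Rows are the true species, columns the predicted species (same order as LABELS).
LABELS = ["AC", "AG", "AM", "AN", "QG", "QK"]
TABLE_3 = [
    [34, 1, 1, 0, 4, 0],
    [7, 33, 0, 0, 0, 0],
    [18, 7, 9, 0, 5, 1],
    [15, 1, 2, 0, 0, 0],
    [3, 2, 0, 0, 32, 3],
    [3, 0, 11, 0, 0, 26],
]


def top_confusions(matrix, labels, n=3):
    """Return the n largest misclassification counts as (count, true, predicted)."""
    pairs = [
        (count, labels[i], labels[j])
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
        if i != j and count > 0
    ]
    # Sort by count, largest first; ties keep their table order (stable sort).
    pairs.sort(key=lambda p: -p[0])
    return pairs[:n]


if __name__ == "__main__":
    for count, true_label, predicted in top_confusions(TABLE_3, LABELS):
        print(f"{count} leaves of {true_label} were classified as {predicted}")
```

For Table 3 the three largest entries are AM classified as AC (18 leaves), AN classified as AC (15 leaves) and QK classified as AM (11 leaves).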

 


Precision-Recall Plot

The standard measure of performance for information retrieval systems is the precision-recall plot. Consider a query to an information retrieval system (in this case, an image of an isolated leaf). We can view the information retrieval system as computing a ranking of all the documents (i.e. herbarium samples) in the database and returning the top K most relevant documents. In our application, the user wants the most relevant documents to be the ones from the same species as the query. The “precision” of the retrieval is the percentage of the K documents that belong to the correct species. The “recall” of the retrieval is the percentage of all documents for the correct species that are included in the top K retrieved documents. There is always a precision-recall tradeoff: If the information retrieval system returns the entire set of documents, then recall will be 100%, but precision will be very low. If the system returns just one document from the correct class, then precision will be 100%, but recall will be very low. The tradeoff can be visualized by plotting precision and recall as K is increased from one to some maximum value.
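The definitions above can be written down directly. The following is a minimal sketch under stated assumptions: the retrieval system is represented only by the species labels of its ranked results, and the function names and the example ranking are hypothetical, not taken from the original code.

```python
def precision_recall_at_k(ranked_species, query_species, k):
    """Precision and recall of the top-k results for one query.

    ranked_species: species labels of all database documents, best match first.
    query_species:  the species of the query image (the "correct" species).
    """
    if not 1 <= k <= len(ranked_species):
        raise ValueError("k must be between 1 and the number of ranked documents")
    relevant_total = sum(1 for s in ranked_species if s == query_species)
    if relevant_total == 0:
        raise ValueError("the database holds no documents of the query species")
    hits = sum(1 for s in ranked_species[:k] if s == query_species)
    precision = hits / k            # share of the k returned documents that are correct
    recall = hits / relevant_total  # share of all correct documents that were returned
    return precision, recall


def precision_recall_curve(ranked_species, query_species):
    """(precision, recall) pairs as K grows from 1 to the full database size."""
    return [
        precision_recall_at_k(ranked_species, query_species, k)
        for k in range(1, len(ranked_species) + 1)
    ]


if __name__ == "__main__":
    # Hypothetical ranking returned for a query leaf of species "AC".
    ranking = ["AC", "AC", "AG", "AC", "QG", "AM"]
    for k, (p, r) in enumerate(precision_recall_curve(ranking, "AC"), start=1):
        print(f"K={k}: precision={p:.2f} recall={r:.2f}")
```

In this made-up ranking, K = 1 gives precision 1.0 with recall 1/3, while returning all six documents gives recall 1.0 with precision 0.5, which is the tradeoff described above.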

Figure 1 shows the precision-recall plot for the isolated leaf classification of each of the six plant species individually. Figure 2 shows the curves for all species combined and the curves for all species belonging to the genera Acer and Quercus. The template database consists of 20 isolated leaves of each species. For small values of recall (i.e. small settings of K), precision is over 90%. Precision gradually decreases as K is increased.

Figure 3 shows the precision in retrieving isolated leaves when herbarium samples are used as queries. Acer Negundo has been omitted from the plots because too few good-quality herbarium samples were available. Figure 4 shows the precision in retrieving herbarium samples from the database when isolated leaves are used as query images. Note that the precision has been calculated only for small values of recall, since the database consists of a huge collection of unfiltered boundaries from the herbarium samples. That is, the boundaries also include stems, flowers and other unwanted shapes.

 

Figure 1: Precision-recall curves for isolated leaves

Figure 2: Precision-recall plot for the two genera and the overall performance

Figure 3: Precision-recall for retrieval of isolated leaves with herbarium samples as query

Figure 4: Precision-recall for retrieval of herbarium samples with isolated leaves as queries

