Improving the GBIF Backbone matching
In GBIF, occurrence records are matched to a taxon in a backbone taxonomy using the species match API. This is important to reduce spelling variations and to create consistent metrics and searches according to a single classification and synonymy. Over the past years we have been alerted to various bad matches. Most of the reported issues refer to a false fuzzy match for a name missing in our backbone.

In order to improve the taxonomic classification of occurrence records, we are undertaking two activities. The first is to improve the algorithms we use to fuzzily match names, and the second will be to improve the algorithms used to assemble the backbone taxonomy itself. Here I explain some of the work currently underway to tackle the former, which is visible on the test environment.

1. Name parsing of undetermined species

In occurrences we see many partly undetermined names such as Lucanus spec. Erroneously, these rank markers have been treated as real s...