Posts

Showing posts from March, 2015

Improving the GBIF Backbone matching

In GBIF, occurrence records are matched to a taxon in a backbone taxonomy using the species match API. This is important to reduce spelling variations and to create consistent metrics and searches according to a single classification and synonymy. Over the past years we have been alerted to various bad matches. Most of the reported issues refer to a false fuzzy match for a name missing in our backbone. In order to improve the taxonomic classification of occurrence records, we are undertaking two activities. The first is to improve the algorithms we use to fuzzily match names, and the second will be to improve the algorithms used to assemble the backbone taxonomy itself. Here I explain some of the work currently underway to tackle the former, which is visible on the test environment.

1. Name parsing of undetermined species

In occurrences we see many names with a partly undetermined name such as Lucanus spec. Erroneously, these rank markers have been treated as real s...

IPT v2.2 – Making data citable through DataCite

GBIF is pleased to release IPT 2.2, now capable of automatically connecting with either DataCite or EZID to assign DOIs to datasets. This new feature makes biodiversity data easier to access on the Web and facilitates tracking its re-use.

DataCite integration explained

DataCite specialises in assigning DOIs to datasets. It was established in 2009 with three fundamental goals (1): establish easier access to research data on the Internet; increase acceptance of research data as citable contributions to the scholarly record; and support research data archiving to permit results to be verified and re-purposed for future study.

EZID is hosted by the California Digital Library (a founding member of DataCite) and adds services on top of the DataCite DOI infrastructure, such as their own easy-to-use programming interface. To integrate with DataCite and further these three goals for biodiversity ...