Showing posts from June, 2017

GBIF Name Parser

The GBIF name parser has long been a fundamental GBIF library for parsing a scientific name string into a structured representation of a name. It has been refined over many years based on actual name strings encountered in the GBIF occurrence and checklist indices. Over the years the major design goals have not changed much and can be summarised as follows:

- extract canonical, code-relevant name parts
- populate only the ParsedName class of the GBIF API
- ignore any superfluous name parts irrelevant to the code, e.g. species authorships in infraspecific names, infrageneric placements of species or superfluous infraspecific parts in quadrinomials
- deal with a wide variety of names that the ParsedName class can represent
  - cultivar names
  - bacterial strains & candidate names
  - virus names
  - named hybrids
  - taxon concept references, sensu lato/stricto or aggregates
  - legacy ranks
- extract notes often found in names:
  - nomenclatural remarks
  - determination notes like aff.
  - partially determined species, e.g. ...
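
To make some of the design goals above more concrete, here is a minimal toy sketch in Java. It is not the GBIF name parser and does not use its API: `ToyNameParser`, `ToyParsedName` and the token rules are all hypothetical and invented for illustration. It assumes a whitespace-separated name that starts with a capitalised genus, and it simply drops every token that is not a lowercase epithet, which is far cruder than what a real parser has to do (for example, it discards all authorships, not only the superfluous ones).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy illustration only; NOT the GBIF name parser and NOT its ParsedName class. */
public class ToyNameParser {

    /** Hypothetical structured result holding only the code-relevant parts. */
    public static final class ToyParsedName {
        public final String genus;
        public final String specificEpithet;      // null for a bare genus
        public final String infraspecificEpithet; // null if absent
        public final String note;                 // determination note such as "aff.", null if absent

        ToyParsedName(String genus, String specificEpithet,
                      String infraspecificEpithet, String note) {
            this.genus = genus;
            this.specificEpithet = specificEpithet;
            this.infraspecificEpithet = infraspecificEpithet;
            this.note = note;
        }

        /** Joins genus and epithets, leaving out ranks, authorships and notes. */
        public String canonicalName() {
            StringBuilder sb = new StringBuilder(genus);
            if (specificEpithet != null) sb.append(' ').append(specificEpithet);
            if (infraspecificEpithet != null) sb.append(' ').append(infraspecificEpithet);
            return sb.toString();
        }
    }

    // Assumed, deliberately tiny vocabularies for this sketch.
    private static final Set<String> NOTES =
            new HashSet<>(Arrays.asList("aff.", "cf."));
    private static final Set<String> AUTHOR_WORDS =
            new HashSet<>(Arrays.asList("ex", "et", "de", "van", "von"));

    public static ToyParsedName parse(String name) {
        String[] tokens = name.trim().split("\\s+");
        if (!tokens[0].matches("[A-Z][a-z]+")) {
            throw new IllegalArgumentException("Not a name this sketch understands: " + name);
        }
        String note = null;
        List<String> epithets = new ArrayList<>();
        for (int i = 1; i < tokens.length; i++) {
            String t = tokens[i];
            if (NOTES.contains(t)) {
                note = t; // keep the determination note separately
            } else if (t.matches("[a-z-]+") && !AUTHOR_WORDS.contains(t)) {
                epithets.add(t); // lowercase word: treat as an epithet
            }
            // Anything else (rank markers like "subsp.", authors, years) is dropped.
        }
        // First epithet is the species; only the LAST further epithet is kept,
        // so intermediate infraspecific parts of a quadrinomial are ignored.
        String species = epithets.isEmpty() ? null : epithets.get(0);
        String infra = epithets.size() > 1 ? epithets.get(epithets.size() - 1) : null;
        return new ToyParsedName(tokens[0], species, infra, note);
    }
}
```

With these rules, `ToyNameParser.parse("Pinus nigra J.F.Arnold subsp. laricio (Poir.) Maire").canonicalName()` gives `Pinus nigra laricio`, the placeholder quadrinomial `Aus bus subsp. cus var. dus` collapses to `Aus bus dus`, and `Abies aff. alba` yields the canonical name `Abies alba` with `aff.` stored in `note`.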