Showing posts from December, 2012

"I noticed that the GBIF data portal has fewer records than it used to – what happened?"

If you are a regular user of the GBIF data portal at http://data.gbif.org, or keep an eye on the numbers given at http://www.gbif.org, you may have noticed that the number of indexed records took a dip, from well over 389 million records to a little more than 383 million. Why would that be?

The main reason is that software and processing upgrades have made it easier to spot duplicates and old, no longer published versions of records and datasets. Since the previous version of the data index, some major removal of such records has taken place:

- Several publishers migrated their datasets from other publishing tools to the Integrated Publishing Toolkit (IPT) and Darwin Core Archive, and in the process identified and removed duplicate records in the published source data. As an additional effect, the use of Darwin Core Archives in publishing allows the indexing process to automatically remove records from the index that ar...