BioCASe now producing DarwinCore Archives
Guest post from Jörg Holetschek, Botanic Garden and Botanical Museum Berlin-Dahlem.

For years, the traditional way of sharing occurrence data with GBIF has been through web services. Data publishers have used one of the existing provider software packages (DiGIR, BioCASe or TAPIR Link) to expose their data as a DiGIR-, BioCASe- or TAPIR-compliant web service. Biodiversity networks such as GBIF use harvesters to crawl and index the records published by these services. This approach works fine for small and medium-sized datasets, but runs into difficulties when record numbers hit the millions: harvesting can take days and puts a heavy load on both the publisher and the crawler.

To overcome this, GBIF recently introduced DarwinCore Archives, which store all information of a dataset to be published in a single file. Because GBIF ingests this file directly, the time-consuming back-and-forth communication between data provider and harvester is eliminated, speeding up the process and reducing load f...