Profiling memory usage of various String collections
To compare the memory footprint and performance of different Java options for keeping simple string lookups in memory, I filled several java.util collection classes with the same list of strings and measured how much their memory usage differs. I then loaded the same data into two in-memory Lucene indices using Lucene's RAMDirectory. Finally, I evaluated an embedded H2 database, both file-based and in-memory.

The data consists of 1,573,345 scientific name strings, the longest being about 150 characters. The original uncompressed text file is 31.6MB (8.1MB zipped). To also test ID lookup with java.util.Map and the key-value pair (KVP) Lucene index, the row number of each name was used as its ID.

The test machine was an 8-core 3GHz Mac Pro with 5GB RAM, running Java 6 with a 2GB heap (-Xmx2g) on 64-bit Mac OS X. Here are the shortened results, using System.currentTimeMillis() for timing and JProfiler to inspect deep object copies in heap dumps (a seriously memory intensive thing too in s...
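
For readers who want to try a similar comparison without a profiler, here is a minimal sketch of the setup described above. It is not the code used for these results. It swaps JProfiler for a crude heap delta taken from java.lang.Runtime, generates synthetic strings instead of loading the real name file, and uses a 0-based row number as the ID, which is an assumption. The class and method names (CollectionFootprint, syntheticNames, indexByRow, measure) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Function;

public class CollectionFootprint {

    // Hypothetical stand-in for the real name file: generates distinct strings.
    static List<String> syntheticNames(int count) {
        List<String> names = new ArrayList<String>(count);
        for (int i = 0; i < count; i++) {
            names.add("Genus" + (i / 100) + " species" + i);
        }
        return names;
    }

    // Rough heap usage in bytes. System.gc() is only a hint to the JVM,
    // so the numbers are approximate and vary between runs.
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Maps each name to its row number (0-based here, an assumption).
    static Map<String, Integer> indexByRow(List<String> names) {
        Map<String, Integer> byName = new HashMap<String, Integer>();
        for (int row = 0; row < names.size(); row++) {
            byName.put(names.get(row), row);
        }
        return byName;
    }

    // Builds one structure, then prints its load time and approximate heap growth.
    static long measure(String label, List<String> names,
                        Function<List<String>, Object> build) {
        long heapBefore = usedHeap();
        long start = System.currentTimeMillis();
        Object structure = build.apply(names);
        long elapsed = System.currentTimeMillis() - start;
        long deltaBytes = usedHeap() - heapBefore;
        // Using 'structure' here keeps it reachable while the heap is sampled.
        System.out.println(label + " (" + structure.getClass().getSimpleName()
                + "): ~" + (deltaBytes / 1024) + " KB, " + elapsed + " ms");
        return deltaBytes;
    }

    public static void main(String[] args) {
        List<String> names = syntheticNames(200000);
        measure("list copy", names, n -> new ArrayList<String>(n));
        measure("hash set", names, n -> new HashSet<String>(n));
        measure("tree set", names, n -> new TreeSet<String>(n));
        measure("name to row map", names, n -> indexByRow(n));
    }
}
```

Because the strings themselves are shared between the source list and each collection, the deltas printed here mostly reflect the overhead of the collection's own entries and nodes, not the character data. A heap-dump based tool such as JProfiler gives a more reliable picture than this kind of before/after sampling.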