In the current SVN version three global features have been re-visited in terms of serialization. This was necessary as the index of the web demo with 300k images already exceed 1.5 GB.
|EdgeHistogram||320 bytes||40 bytes|
|JCD||1344 bytes||84 bytes|
|ColorLayout||14336 bytes||504 bytes|
This significant reduction in space leads to (i) smaller indexes, (ii) reduced I/O time, and (iii) therefore, to faster search.
How was this done? Basically it’s clever organization of bytes. In the case of JCD the histogram has 168 entries, each in [0,127], so basically half a byte.Therefore, you can stuff 2 of these values into one byte, but you have to take care of the fact, that Java only supports bit-wise operations on ints and bytes are signed. So the trick is to create an integer in [0, 2^8-1] and then subtract 128 to get it into byte range. The inverse is done for reading. The rest is common bit shifting.