Tag Archives: bytes

Serialization of LIRE global features updated

In the current SVN version three global features have been re-visited in terms of serialization. This was necessary as the index of the web demo with 300k images already exceed 1.5 GB.

Feature prior now
EdgeHistogram 320 bytes 40 bytes
JCD 1344 bytes 84 bytes
ColorLayout 14336 bytes 504 bytes

This significant reduction in space leads to (i) smaller indexes, (ii) reduced I/O time, and (iii) therefore, to faster search.

How was this done? Basically it’s clever organization of bytes. In the case of JCD the histogram has 168 entries, each in [0,127], so basically half a byte.Therefore, you can stuff 2 of these values into one byte, but you have to take care of the fact, that Java only supports bit-wise operations on ints and bytes are signed. So the trick is to create an integer in [0, 2^8-1] and then subtract 128 to get it into byte range. The inverse is done for reading. The rest is common bit shifting.

The code can be seen either in the JCD.java file in the SVN, or in the snippet at pastebin.com for your convenience.