Monthly Archives: December 2012

Lire 0.9.3 released

I just uploaded Lire 0.9.3 to the all new Google Code page. This is the first version with full support for Lucene 4.0. Run time and memory performance are comparable to the version using Lucene 3.6. I’ve made several improvements in terms of speed and memory consumption along the way, mostly within the CEDD feature. Also I’ve added two new features:

  • JointHistogram – a 64 bit RGB color histogram joined with pixel rank in the 8-neighborhood, normalized with max-norm, quantized to [0,127], and JSD for a distance function
  • Opponent Histogram – a 64 bit histogram utilizing the opponent color space, normalized with max-norm, quantized to [0,127], and JSD for a distance function

Both features are fast in extraction (the second one naturally being faster as it does not investigate the neighborhood) and yield nice, visually very similar results in search. See also the image below showing 4 queries, each with the new features. The first one of a pair is always based on JointHistogram, the second is based on the OpponentHistogram (click ko see full size).

Samples-JointHistogram

 

I also changed the Histogram interface to double[] as the double type is so much faster than float in 64 bit Oracle Java 7 VM. Major bug fix was in the JSD dissimilarity function. So many histograms now turned to use JSD instead of L1, depending on whether they performed better in the SIMPLIcity data set (see TestWang.java in the sources).

Final addition is the Lire-SimpleApplication, which provides two classes for indexing and search with CEDD, ready to compile with all libraries and an Ant build file. This may — hopefully — help those that still seek Java enlightenment :-D

Finally this just leaves to say to all of you: Merry Christmas and a Happy New Year!

News on LIRE performance

In the course of finishing the book, I reviewed several aspects of the LIRE code and came across some bugs, including one with the Jensen-Shannon divergence. This dissimilarity measure has never been used actively in any features as it didn’t work out in retrieval evaluation the way it was meant to. After two hours staring at the code the realization finally came. In Java the short if statement, “x ? y : z” is overruled by almost any operator including ‘+’. Hence,

System.out.print(true ? 1: 0 + 1) prints '1',

while

System.out.print((true ? 1: 0) + 1) prints '2'

With this problem identified I was finally able to fix the implementation of the Jensen-Shannon divergence implementation and came to new retrieval evaluation results on the SIMPLIcity data set:

map p@10 error rate
Color Histogram – JSD 0,450 0,704 0,191
Joint Histogram – JSD 0,453 0,691 0,196
Color Correlogram 0,475 0,725 0,171
Color Layout 0,439 0,610 0,309
Edge Histogram 0,333 0,500 0,401
CEDD 0,506 0,710 0,178
JCD 0,510 0,719 0,177
FCTH 0,499 0,703 0,209

Note that the color histogram in the first row now performs similarly to the “good” descriptors in terms of precision at ten and error rate. Also note that a new feature creeped in: Joint Histogram. This is a histogram combining pixel rank and RGB-64 color.

All the new stuff can be found in SVN and in the nightly builds (starting tomorrow :)