
Lire 0.9.3 released

I just uploaded Lire 0.9.3 to the all-new Google Code page. This is the first version with full support for Lucene 4.0. Runtime and memory performance are comparable to the version using Lucene 3.6; I've made several improvements in speed and memory consumption along the way, mostly within the CEDD feature. I've also added two new features:

  • JointHistogram – a 64-bin RGB color histogram joined with the pixel rank in the 8-neighborhood, normalized with the max-norm, quantized to [0, 127], with JSD as distance function
  • OpponentHistogram – a 64-bin histogram in the opponent color space, normalized with the max-norm, quantized to [0, 127], with JSD as distance function (see the sketch below)
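
To illustrate the idea behind the second feature, here is a minimal, self-contained sketch of a 64-bin opponent color histogram: 4 bins per opponent axis, max-norm scaled to [0, 127]. The binning and scaling details are simplifying assumptions for illustration, not LIRE's exact implementation:

    import java.awt.image.BufferedImage;

    public class OpponentHistogramSketch {
        // Quantize v from [min, max] into one of four bins (0..3).
        private static int bin4(double v, double min, double max) {
            int b = (int) (4.0 * (v - min) / (max - min + 1e-9));
            return Math.min(3, Math.max(0, b));
        }

        public static int[] extract(BufferedImage img) {
            double[] hist = new double[64];
            for (int y = 0; y < img.getHeight(); y++) {
                for (int x = 0; x < img.getWidth(); x++) {
                    int rgb = img.getRGB(x, y);
                    int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                    // Opponent color space axes
                    double o1 = (r - g) / Math.sqrt(2);          // in [-255/sqrt(2), 255/sqrt(2)]
                    double o2 = (r + g - 2 * b) / Math.sqrt(6);  // in [-510/sqrt(6), 510/sqrt(6)]
                    double o3 = (r + g + b) / Math.sqrt(3);      // in [0, 765/sqrt(3)]
                    int idx = 16 * bin4(o1, -255 / Math.sqrt(2), 255 / Math.sqrt(2))
                            + 4 * bin4(o2, -510 / Math.sqrt(6), 510 / Math.sqrt(6))
                            + bin4(o3, 0, 765 / Math.sqrt(3));
                    hist[idx]++;
                }
            }
            // Max-norm, then quantize to [0, 127]
            double max = 0;
            for (double h : hist) max = Math.max(max, h);
            int[] quantized = new int[64];
            for (int i = 0; i < 64; i++)
                quantized[i] = (int) Math.round(127.0 * hist[i] / Math.max(max, 1));
            return quantized;
        }
    }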

Both features are fast to extract (the second one naturally faster, as it does not investigate the neighborhood) and yield visually very similar results in search. See also the image below, showing four queries, each with the new features: the first of each pair is based on JointHistogram, the second on OpponentHistogram.

[Figure: Samples-JointHistogram – four sample queries, each shown with JointHistogram and OpponentHistogram results]

I also changed the histogram interface to double[], as the double type is much faster than float on the 64-bit Oracle Java 7 VM. A major bug fix concerned the JSD dissimilarity function. As a result, many features now use JSD instead of L1, depending on which performed better on the SIMPLIcity data set (see TestWang.java in the sources).
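
Using the new features boils down to extracting them from BufferedImage instances and comparing them. A minimal sketch, assuming the LireFeature interface of this release (extract(), getDistance(), getDoubleHistogram()) and the class names from the LIRE sources:

    import java.io.File;
    import javax.imageio.ImageIO;
    import net.semanticmetadata.lire.imageanalysis.JointHistogram;

    public class DistanceSketch {
        public static void main(String[] args) throws Exception {
            JointHistogram f1 = new JointHistogram();
            JointHistogram f2 = new JointHistogram();
            f1.extract(ImageIO.read(new File(args[0])));
            f2.extract(ImageIO.read(new File(args[1])));
            // The distance is JSD on the underlying double[] histograms.
            System.out.println("distance = " + f1.getDistance(f2));
            System.out.println("bins = " + f1.getDoubleHistogram().length);
        }
    }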

The final addition is the Lire-SimpleApplication, which provides two classes for indexing and search with CEDD, ready to compile with all libraries and an Ant build file. This may, hopefully, help those who still seek Java enlightenment 😀
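
The gist of the two classes looks roughly like the following sketch, assuming the DocumentBuilderFactory / ImageSearcherFactory API of this release and Lucene 4.0; analyzers, batching and error handling are simplified:

    import java.io.File;
    import javax.imageio.ImageIO;
    import net.semanticmetadata.lire.DocumentBuilder;
    import net.semanticmetadata.lire.DocumentBuilderFactory;
    import net.semanticmetadata.lire.ImageSearchHits;
    import net.semanticmetadata.lire.ImageSearcher;
    import net.semanticmetadata.lire.ImageSearcherFactory;
    import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.FSDirectory;
    import org.apache.lucene.util.Version;

    public class CeddSketch {
        public static void main(String[] args) throws Exception {
            // Index a single image with CEDD ...
            DocumentBuilder builder = DocumentBuilderFactory.getCEDDDocumentBuilder();
            IndexWriter writer = new IndexWriter(FSDirectory.open(new File("index")),
                    new IndexWriterConfig(Version.LUCENE_40, new WhitespaceAnalyzer(Version.LUCENE_40)));
            writer.addDocument(builder.createDocument(ImageIO.read(new File(args[0])), args[0]));
            writer.close();

            // ... and search for its ten nearest neighbors.
            IndexReader reader = DirectoryReader.open(FSDirectory.open(new File("index")));
            ImageSearcher searcher = ImageSearcherFactory.createCEDDImageSearcher(10);
            ImageSearchHits hits = searcher.search(ImageIO.read(new File(args[0])), reader);
            for (int i = 0; i < hits.length(); i++) {
                System.out.println(hits.score(i) + ": "
                        + hits.doc(i).getValues(DocumentBuilder.FIELD_NAME_IDENTIFIER)[0]);
            }
        }
    }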

Finally, all that is left to say to all of you is: Merry Christmas and a Happy New Year!

News on LIRE performance

In the course of finishing the book, I reviewed several aspects of the LIRE code and came across some bugs, including one in the Jensen-Shannon divergence. This dissimilarity measure had never been used actively in any feature, as it didn't work out in retrieval evaluations the way it was meant to. After two hours of staring at the code the realization finally came: in Java, the conditional operator "x ? y : z" has lower precedence than almost any other operator, including '+'. Hence,

System.out.print(true ? 1 : 0 + 1) prints '1',

while

System.out.print((true ? 1 : 0) + 1) prints '2'.
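
The first expression parses as true ? 1 : (0 + 1), so the trailing '+ 1' silently disappears into the else branch. In a sum of conditional terms, as in the divergence computation, each term therefore needs its own parentheses. A minimal sketch of the corrected computation, assuming two normalized double[] histograms of equal length (close in spirit to, but not literally, the LIRE code):

    // Jensen-Shannon divergence of two normalized histograms.
    // Each conditional term is parenthesized; without the parentheses
    // the '+' would bind into the else branches and change the result.
    public static double jsd(double[] h1, double[] h2) {
        double sum = 0d;
        for (int i = 0; i < h1.length; i++) {
            sum += (h1[i] > 0 ? (h1[i] / 2d) * Math.log((2d * h1[i]) / (h1[i] + h2[i])) : 0)
                 + (h2[i] > 0 ? (h2[i] / 2d) * Math.log((2d * h2[i]) / (h1[i] + h2[i])) : 0);
        }
        return sum;
    }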

With this problem identified I was finally able to fix the Jensen-Shannon divergence implementation, which led to new retrieval evaluation results on the SIMPLIcity data set:

Descriptor              MAP     P@10    Error rate
Color Histogram – JSD   0.450   0.704   0.191
Joint Histogram – JSD   0.453   0.691   0.196
Color Correlogram       0.475   0.725   0.171
Color Layout            0.439   0.610   0.309
Edge Histogram          0.333   0.500   0.401
CEDD                    0.506   0.710   0.178
JCD                     0.510   0.719   0.177
FCTH                    0.499   0.703   0.209

Note that the color histogram in the first row now performs similarly to the "good" descriptors in terms of precision at ten and error rate. Also note that a new feature crept in: Joint Histogram, a histogram combining pixel rank and RGB-64 color.

All the new stuff can be found in SVN and in the nightly builds (starting tomorrow) 🙂

Difference in face detection with default training sets

Recently I posted binaries and packaged libraries for face detection based on OpenCV and OpenIMAJ here and here. Both employ similar algorithms to detect faces in photos. As this is based on supervised classification, not only the algorithm but also the training set employed has a strong influence on the actual precision (and recall) of the results. So, out of interest, I took a look at how well the results of the two libraries correlate:

          imaj_20  imaj_40  opencv_
imaj_20   1.000    0.933    0.695
imaj_40   0.933    1.000    0.706
opencv_   0.695    0.706    1.000

The table above shows the Pearson correlation between the face detection results with the default models of OpenIMAJ (with a minimum face size of 20 and 40 pixels, respectively) and OpenCV. As can be seen, the results correlate but are not the same. The conclusion: make sure you check which one to use for your application, and possibly train a model yourself (as actually recommended by the documentation of both libraries).

This experiment was done on just 171 images, but experiments with larger data sets have shown similar results.
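
For reference, detection with OpenIMAJ's default frontal-face cascade boils down to a few lines. A minimal sketch, assuming OpenIMAJ's face detection module; the constructor argument is the minimum face size in pixels, i.e. the 20 and 40 from the table:

    import java.io.File;
    import java.util.List;
    import org.openimaj.image.FImage;
    import org.openimaj.image.ImageUtilities;
    import org.openimaj.image.processing.face.detection.DetectedFace;
    import org.openimaj.image.processing.face.detection.HaarCascadeDetector;

    public class DetectionSketch {
        public static void main(String[] args) throws Exception {
            // Default cascade, minimum face size of 40 pixels (cf. imaj_40 above)
            HaarCascadeDetector detector = new HaarCascadeDetector(40);
            FImage image = ImageUtilities.readF(new File(args[0]));
            List<DetectedFace> faces = detector.detectFaces(image);
            for (DetectedFace face : faces)
                System.out.println("Face at " + face.getBounds());
        }
    }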

WIAMIS 2012 started off well :)

WIAMIS 2012 started in the morning, and the first keynote was by Prof. Mubarak Shah from the University of Central Florida. He talked about primitives for the detection of human actions. Especially the visualization of his ideas and approaches was really great! Right now the retrieval session is under way.

My own presentation on user intentions in video production is scheduled for Friday as the very last presentation, just before the closing remarks.

Social Media, Tagging and Images Semantics

Recently there has been quite a buzz around the whole social media topic. Many researchers saw indications that the willingness of people to share and annotate content might lead to new ways of indexing, searching and consuming multimedia. The biggest problem with the buzz is … that it's BIG 🙂 Many research groups produced ever more papers, and with the rising number of papers the scientific impact of each got smaller and smaller. However, Neela Sawant, Jia Li and James Z. Wang took a close look at more than 200 papers and provide a survey of part of the topic in the article "Automatic image semantic interpretation using social action and tagging data" in the Multimedia Tools and Applications journal.


Visual Attention in Lire

While preparing my multimedia information systems lecture I finally got around to implementing the visual attention model of Stentiford myself. I just checked in the sources (SVN). The algorithm gives really nice results considering how simple it is to implement. You can see an example in the following figure: on the left hand side is the original image, on the right hand side a visualization of the attention map. The light areas (especially the white ones) are deemed centers of attention. Sky and sand are, so to speak, just random noise (there is a lot of "random" in this approach).
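
The core idea fits in a few lines: a pixel gets attention if randomly chosen neighborhood "forks" around it rarely match the same forks around randomly chosen other pixels. A condensed sketch of that idea, with hypothetical parameter choices; this is my reading of the model, not the checked-in LIRE code:

    import java.awt.image.BufferedImage;
    import java.util.Random;

    public class AttentionSketch {
        // Returns an attention map; higher values mean more attention.
        public static int[][] attentionMap(BufferedImage img, int trials,
                                           int forkSize, int radius, int threshold) {
            Random rnd = new Random();
            int w = img.getWidth(), h = img.getHeight();
            int[][] map = new int[w][h];
            for (int x = radius; x < w - radius; x++) {
                for (int y = radius; y < h - radius; y++) {
                    for (int t = 0; t < trials; t++) {
                        // Random fork: a handful of offsets around the pixel
                        int[] dx = new int[forkSize], dy = new int[forkSize];
                        for (int i = 0; i < forkSize; i++) {
                            dx[i] = rnd.nextInt(2 * radius + 1) - radius;
                            dy[i] = rnd.nextInt(2 * radius + 1) - radius;
                        }
                        // Random comparison pixel
                        int cx = radius + rnd.nextInt(w - 2 * radius);
                        int cy = radius + rnd.nextInt(h - 2 * radius);
                        // A mismatching fork means the neighborhood is rare -> attention
                        if (!matches(img, x, y, cx, cy, dx, dy, threshold)) map[x][y]++;
                    }
                }
            }
            return map;
        }

        private static boolean matches(BufferedImage img, int x1, int y1, int x2, int y2,
                                       int[] dx, int[] dy, int threshold) {
            for (int i = 0; i < dx.length; i++) {
                int p = img.getRGB(x1 + dx[i], y1 + dy[i]);
                int q = img.getRGB(x2 + dx[i], y2 + dy[i]);
                int dr = ((p >> 16) & 0xFF) - ((q >> 16) & 0xFF);
                int dg = ((p >> 8) & 0xFF) - ((q >> 8) & 0xFF);
                int db = (p & 0xFF) - (q & 0xFF);
                if (Math.abs(dr) + Math.abs(dg) + Math.abs(db) > threshold) return false;
            }
            return true;
        }
    }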


Lire 0.8 released

I just released LIRe v0.8. LIRe – Lucene Image Retrieval – is a Java library for easy content-based image retrieval. Based on Lucene, it doesn't need a database and works reliably and rather fast. The major change in this version is the support for Lucene 3.0.1, which has a changed API and better performance on some operating systems. A critical bug was fixed in the Tamura feature implementation; it now definitely performs better 🙂 Hidden in the depths of the code there is an implementation of the approximate fast indexing approach of G. Amato. It copes with the problem of linear search and provides a method for fast approximate retrieval in huge repositories (millions?). Unfortunately I haven't tested with millions, just with tens of thousands, which proves that it works but doesn't show how fast.


Lire v0.8 is on its way … just some more tests

I just checked in my latest code for LIRe and it looks like it's nearly ready for the v0.8 release. Major changes include the use of Lucene 3.0.1, some bug fixes on descriptors, several new test files (including one that shows how to do an LSA on image features) and, of course, an updated demo application. While everything needs a bit more testing as well as a documentation update, I can offer a pre-compiled demo here. All changed and added sources can be found in the SVN.


Best Paper Award

“SPCD – SPATIAL COLOR DISTRIBUTION DESCRIPTOR – A Fuzzy Rule based Composite Descriptor Appropriate for Hand Drawn Color Sketches Retrieval” received the best paper award at the 2nd International Conference on Agents and Artificial Intelligence (ICAART), Valencia, Spain, January 22-24, 2010.

Congratulations to Savvas!



Lire: Submission to the ACM Multimedia Open Source Contest 2008

I recently submitted Lire and LireDemo to the ACM Multimedia Open Source Software Competition 2008. As I'd really like to go there, I hope it will be judged a relevant contribution and a demo at ACM Multimedia is requested. Note that I've integrated a new feature in LireDemo for the ACM Multimedia submission: it is now easier to test Lire by just indexing random photos from Flickr. Simply hit the "Index" button without giving a directory of images, and the download will start automatically.
