The current LireDemo 0.9.4 beta release features a new indexing routine, which is much faster than the old one. It’s based on the producer-consumer principle and makes (hopefully) optimal use of I/O and up to 8 cores of a system. Moreover, the new PHOG feature implementation is included and you can give it a try. Furthermore, JCD, FCTH and CEDD got a more compact representation of their descriptors and use much less storage space now. Smaller changes include parameter tuning on several descriptors. All the changes have been documented in the CHANGES.txt file in the SVN.
In the current SVN version three global features have been revisited in terms of serialization. This was necessary as the index of the web demo with 300k images already exceeded 1.5 GB.
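For readers unfamiliar with the producer-consumer principle, here is a minimal sketch of the idea. It is not the LireDemo indexing code: the class name, the queue size and the extractFeature stand-in are assumptions made for illustration. One producer thread feeds a bounded queue (where a real indexer would do the I/O), and several consumer threads take items off the queue and do the CPU-heavy work.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class ProducerConsumerSketch {
    // Sentinel that tells a consumer to stop ("poison pill").
    private static final String POISON = "\u0000END";

    /** Processes every path with numConsumers worker threads; returns path -> "feature". */
    static Map<String, Integer> index(List<String> paths, int numConsumers) throws InterruptedException {
        // Bounded queue: the producer blocks when the consumers cannot keep up.
        BlockingQueue<String> queue = new LinkedBlockingQueue<>(100);
        Map<String, Integer> results = new ConcurrentHashMap<>();

        Thread producer = new Thread(() -> {
            try {
                // A real indexer would read the image data from disk here.
                for (String p : paths) queue.put(p);
                // One sentinel per consumer so that every consumer terminates.
                for (int i = 0; i < numConsumers; i++) queue.put(POISON);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread[] consumers = new Thread[numConsumers];
        for (int i = 0; i < numConsumers; i++) {
            consumers[i] = new Thread(() -> {
                try {
                    for (String p = queue.take(); !p.equals(POISON); p = queue.take()) {
                        results.put(p, extractFeature(p));
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            consumers[i].start();
        }
        producer.start();
        producer.join();
        for (Thread c : consumers) c.join();
        return results;
    }

    // Hypothetical stand-in for the expensive feature extraction step.
    static int extractFeature(String path) {
        return path.length();
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Integer> r = index(Arrays.asList("a.jpg", "bb.jpg", "ccc.jpg"), 4);
        System.out.println(r.size() + " images processed");
    }
}
```

The bounded queue is the design choice that matters here: it keeps the reader from running far ahead of the extractors and filling up memory.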
This significant reduction in space leads to (i) smaller indexes, (ii) reduced I/O time, and (iii) therefore, to faster search.
How was this done? Basically it’s clever organization of bytes. In the case of JCD the histogram has 168 entries, each in [0,127], so basically half a byte. Therefore, you can stuff 2 of these values into one byte, but you have to take care of the fact that Java only supports bit-wise operations on ints and that bytes are signed. So the trick is to create an integer in [0, 2^8-1] and then subtract 128 to get it into byte range. The inverse is done for reading. The rest is common bit shifting.
The code can be seen either in the JCD.java file in the SVN, or in the snippet at pastebin.com for your convenience.
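The packing trick described above can be sketched as follows. This is an illustration, not the code from JCD.java: the class and method names are made up, and it assumes that every histogram value has already been quantized to fit into 4 bits (0 to 15), which is what storing two values per byte requires.

```java
import java.util.Arrays;

class NibblePacking {
    /** Packs values in [0,15] two per byte; an odd-length input is padded with a trailing 0. */
    static byte[] pack(int[] values) {
        byte[] out = new byte[(values.length + 1) / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = values[2 * i];
            int lo = (2 * i + 1 < values.length) ? values[2 * i + 1] : 0;
            int combined = (hi << 4) | lo;     // integer in [0, 255]
            out[i] = (byte) (combined - 128);  // shift into the signed byte range [-128, 127]
        }
        return out;
    }

    /** Inverse of pack; length is the number of values that were originally packed. */
    static int[] unpack(byte[] packed, int length) {
        int[] values = new int[length];
        for (int i = 0; i < length; i++) {
            int combined = packed[i / 2] + 128; // back to [0, 255]
            values[i] = (i % 2 == 0) ? (combined >> 4) : (combined & 0x0F);
        }
        return values;
    }

    public static void main(String[] args) {
        int[] histogram = {0, 15, 7, 3, 12};
        byte[] packed = pack(histogram);
        System.out.println(packed.length + " bytes for " + histogram.length + " values");
        System.out.println(Arrays.toString(unpack(packed, histogram.length)));
    }
}
```

With this layout a 168-entry histogram needs 84 bytes.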
The LIRE web demo now includes an RGB color histogram as well as the MPEG-7 edge histogram implementation. The color histogram works well, for instance, for line art, such as this query. The edge histogram works fine for clear, global edge distributions like queries such as this one. However, it performs differently from PHOG. An example of the difference is this PHOG query compared to the corresponding edge histogram query. The image below shows both queries.
A new web based LIRE demo is online. Within this demo you are able to search in an index of 300,000 images from the MIRFLICKR data set. Currently only queries from within the index are allowed, so no custom query images can be uploaded. The backend is plain LIRE, so there’s no search server or the like, and it’s the current SVN version. Search is done based on hashing, so the results are approximate, but they are immediately there. Also it’s just a selection of global features, but it’s enough to get the idea. The image below shows the result of two example searches.
The Kindle version of our book “Visual Information Retrieval using Java and LIRE” is now available on amazon.com (as well as Amazon in Germany, France, Italy, and Canada). It’s a good deal at $10 (or something like 7.90 €) for the book, which is far cheaper than the PDF version and the paperback.
The realization that setting up the project is not too trivial led to the video how-to. It’s available on YouTube and shows all steps from (an already started) fresh IntelliJ IDEA to running a JUnit test for LIRE. Make sure you watch the video in 1080p / full HD to be able to read all the text.
With the implementation of the PHOG descriptor I ran into the situation that no well-performing Canny Edge Detector in pure Java was available. “Pure” in my case means that it just takes a Java BufferedImage instance and computes the edges. Therefore, I had to implement my own.
As a result there is now a “simple implementation” available as part of LIRE. It takes a BufferedImage and returns another BufferedImage, which contains all the edges as black pixels, while the non-edges are white. Thresholds can be changed and the blurring filter used for preprocessing can be changed in code. Usage is dead simple:
BufferedImage in = ImageIO.read(new File("testdata/wang-1000/128.jpg"));
CannyEdgeDetector ced = new CannyEdgeDetector(in, 40, 80);
ImageIO.write(ced.filter(), "png", new File("out.png"));
The result is the picture below:
Yesterday I checked in the latest LIRE revision featuring the PHOG descriptor. It basically goes along image edge lines (using the Canny Edge Detector) and makes a fuzzy histogram of gradient directions. Furthermore, it does that on different pyramid levels, meaning that the image is split up like a quad-tree and all sub-images get their histogram. All histograms of levels & sub-images are concatenated and used for retrieval. First tests on the SIMPLIcity data set have shown that the current configuration of PHOG included in LIRE outperforms the EdgeHistogram descriptor.
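To make the pyramid idea more concrete, here is a rough sketch of how such a descriptor can be assembled. It is not the PHOG class from LIRE: it assumes that an edge mask and the gradient directions have already been computed, and the number of bins, the linear "fuzzy" vote splitting and the missing normalization are simplifications chosen for illustration.

```java
class PhogSketch {
    /**
     * @param edge   edge[y][x] is true where the edge detector fired
     * @param angle  angle[y][x] is the gradient direction in [0, PI)
     * @param levels number of pyramid levels (level l splits the image into 2^l x 2^l cells)
     * @param bins   number of direction bins per histogram
     */
    static double[] phog(boolean[][] edge, double[][] angle, int levels, int bins) {
        int h = edge.length, w = edge[0].length;
        int cells = 0;
        for (int l = 0; l < levels; l++) cells += (1 << l) * (1 << l); // 1 + 4 + 16 + ...
        double[] result = new double[cells * bins];
        int offset = 0; // start of the current level inside the concatenated vector
        for (int l = 0; l < levels; l++) {
            int n = 1 << l; // n x n grid on this level
            for (int y = 0; y < h; y++) {
                for (int x = 0; x < w; x++) {
                    if (!edge[y][x]) continue; // only edge pixels vote
                    int cell = (y * n / h) * n + (x * n / w);
                    double pos = angle[y][x] / Math.PI * bins; // fractional bin position
                    int b0 = (int) Math.floor(pos) % bins;
                    int b1 = (b0 + 1) % bins; // directions wrap around
                    double frac = pos - Math.floor(pos);
                    int base = offset + cell * bins;
                    // "Fuzzy" vote: split between the two neighbouring bins.
                    result[base + b0] += 1.0 - frac;
                    result[base + b1] += frac;
                }
            }
            offset += n * n * bins;
        }
        return result;
    }

    public static void main(String[] args) {
        boolean[][] edge = new boolean[4][4];
        double[][] angle = new double[4][4];
        edge[0][0] = true;          // one horizontal-direction edge pixel
        edge[3][3] = true;
        angle[3][3] = Math.PI / 8;  // halfway between bin 0 and bin 1 for 4 bins
        double[] descriptor = phog(edge, angle, 2, 4);
        System.out.println("descriptor length: " + descriptor.length);
    }
}
```

With two levels and four bins the vector has (1 + 4) * 4 = 20 entries. A real implementation would typically normalize the vector before comparing descriptors.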
You can find the latest version of LIRE in the SVN & in the nightly builds.
- A. Bosch, A. Zisserman & X. Munoz. 2007. Representing shape with a spatial pyramid kernel. In Proceedings of CIVR ’07 — [DOI] [PDF]
People lately asked whether LIRE can do more than linear search and I always answered: Yes, it should … but you know I never tried. But: Finally I got around to indexing the MIR-FLICKR data set and some of my Flickr-crawled photos and ended up with an index of 1,443,613 images. I used CEDD as main feature and a hashing algorithm to put multiple hashes per image into Lucene, to be interpreted as words. By tuning similarity, employing a Boolean query, and adding a re-rank step I ended up with a pretty decent approximate retrieval scheme, which is much faster and does not lose too many images on the way, which means the method has an acceptable recall. The image below shows the numbers along with a sample query. Linear search took more than a minute, while the hashing based approach did (nearly) the same thing in less than a second. Note that this is just a sequential, straightforward approach, so no optimization has been done to the performance. Also the hashing approach has not yet been investigated in detail, i.e. there are some parameters that still need some tuning … but let’s say it’s a step in the right direction.
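The general scheme (several hashes per image stored as words, a Boolean OR over the query's words, then a re-rank of the candidates) can be sketched without Lucene. Everything specific in this sketch is an assumption for illustration: random hyperplane hashing stands in for the unnamed hashing algorithm, a plain in-memory map stands in for the Lucene index, and L1 distance stands in for the CEDD distance function.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class HashedSearchSketch {
    private final double[][][] planes; // [hash][bit][dimension] random hyperplanes
    private final Map<String, List<Integer>> postings = new HashMap<>(); // "word" -> image ids
    private final List<double[]> features = new ArrayList<>();

    HashedSearchSketch(int dim, int numHashes, int bitsPerHash, long seed) {
        Random rnd = new Random(seed);
        planes = new double[numHashes][bitsPerHash][dim];
        for (double[][] hash : planes)
            for (double[] plane : hash)
                for (int d = 0; d < dim; d++) plane[d] = rnd.nextGaussian();
    }

    /** Turns a feature vector into one "word" per hash function. */
    String[] words(double[] f) {
        String[] w = new String[planes.length];
        for (int h = 0; h < planes.length; h++) {
            int code = 0;
            for (double[] plane : planes[h]) {
                double dot = 0;
                for (int d = 0; d < f.length; d++) dot += plane[d] * f[d];
                code = (code << 1) | (dot >= 0 ? 1 : 0);
            }
            w[h] = h + "_" + code;
        }
        return w;
    }

    void add(double[] f) {
        int id = features.size();
        features.add(f);
        for (String w : words(f)) postings.computeIfAbsent(w, k -> new ArrayList<>()).add(id);
    }

    /** OR query over the hash words, keep the best maxCandidates, re-rank them by true distance. */
    List<Integer> search(double[] query, int maxCandidates, int k) {
        Map<Integer, Integer> votes = new HashMap<>();
        for (String w : words(query))
            for (int id : postings.getOrDefault(w, Collections.emptyList()))
                votes.merge(id, 1, Integer::sum);
        List<Integer> candidates = new ArrayList<>(votes.keySet());
        candidates.sort((a, b) -> votes.get(b) - votes.get(a)); // most shared words first
        if (candidates.size() > maxCandidates)
            candidates = new ArrayList<>(candidates.subList(0, maxCandidates));
        candidates.sort(Comparator.comparingDouble(id -> l1(query, features.get(id)))); // re-rank
        return new ArrayList<>(candidates.subList(0, Math.min(k, candidates.size())));
    }

    static double l1(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += Math.abs(a[i] - b[i]);
        return sum;
    }

    public static void main(String[] args) {
        HashedSearchSketch index = new HashedSearchSketch(3, 8, 4, 42L);
        index.add(new double[]{1, 0, 0});
        index.add(new double[]{0, 1, 0});
        index.add(new double[]{0, 0, 1});
        System.out.println(index.search(new double[]{0, 1, 0}, 10, 2));
    }
}
```

Only images that share at least one hash word with the query are ever compared exactly, which is where the speed-up over linear search comes from, and also why some true neighbours can be missed.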
LIRE is not a sleeping beauty, so there’s something going on in the SVN. I recently checked in updates on Lucene (now 4.2) and Commons Math (now 3.1.1). Also I removed some deprecation things still left from Lucene 3.x.
The most notable addition, however, is the Extractor / Indexor class pair. They are command line applications that allow you to extract global image features from images, put them into an intermediate data file and then, with the help of Indexor, write them to an index. All images are referenced relative to the intermediate data file, so this approach can be used to preprocess a whole lot of images from different computers on a network file system. Extractor also uses a file list of images as input (one image per line) and can therefore easily be run in parallel. Just split your global file list into n smaller, non-overlapping ones and run n Extractor instances. As the extraction part is the slow one, this should allow for a significant speed-up if used in parallel.
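Splitting the global file list can be done with any tool. As a small sketch, the following helper (not part of LIRE; the class name and the output naming scheme are made up) distributes the lines round-robin over n lists, so every image ends up in exactly one of them.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class FileListSplitter {
    /** Distributes the lines round-robin over n lists; each line lands in exactly one list. */
    static List<List<String>> split(List<String> lines, int n) {
        List<List<String>> parts = new ArrayList<>();
        for (int i = 0; i < n; i++) parts.add(new ArrayList<>());
        for (int i = 0; i < lines.size(); i++) parts.get(i % n).add(lines.get(i));
        return parts;
    }

    // Usage: java FileListSplitter <list file> <n>
    // Writes <list file>.0, <list file>.1, ... next to the original list.
    public static void main(String[] args) throws IOException {
        int n = Integer.parseInt(args[1]);
        List<List<String>> parts = split(Files.readAllLines(Paths.get(args[0])), n);
        for (int i = 0; i < n; i++) {
            Files.write(Paths.get(args[0] + "." + i), parts.get(i));
        }
    }
}
```

Each of the resulting files would then serve as the input list for one Extractor instance.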
Extractor is run with
$> Extractor -i <infile> -o <outfile> -c <configfile>
- <infile> gives the images, one per line. Use “dir /s /b *.jpg > list.txt” to create a compatible list on Windows.
- <outfile> gives the location and name of the intermediate data file. Note: It has to be in a folder parent to all images!
- <configfile> gives the list of features as a Java Properties file. The supported features are listed below the post. The properties file looks like:
Indexor is run with
Indexor -i <input-file> -l <index-directory>
- <input-file> is the output file of Extractor, the intermediate data file.
- <index-directory> is the directory of the index to which the images will be added (appended, not overwritten).
Features supported by Extractor: