It’s been more than a year since we made a release, so there have been lots of changes, fixes and new features. Most important of all is the integration of the SIMPLE descriptor, a local feature that works extremely well for content based image retrieval. This has been done with a lot of help from Nektarios Anagnostopoulos and Savvas Chatzichristofis!
Besides that we switched to OpenCV for SURF and SIFT extraction, added numerous bug fixes, updated Lucene to 4.10.2 and much more. Best you give it a try.
I’ve just uploaded LIRE 0.9.4 beta to the Google Code downloads page. This is an intermediate release that reflects several changes within the SVN trunk. Basically I put it online as there are many, many bugs solved in this one and it’s performing much, much faster than the 0.9.3 release. If you want to get the latest version I’d recommend to stick to the SVN. However, currently I’m changing a lot of feature serialization methods, so there’s no guarantee that an index created with 0.9.4 beta will work out with any newer version. Note also that the release does not work with older indexes 😉
Major changes include, but are not limited to:
New features: PHOG, local binary patterns and binary patterns pyramid
Parallel indexing: a producer-consumer based indexing application that makes heavy use of available CPU cores. On a current Intel Core i7 or considerably large Intel Xeon system it is able to reduce extraction to a marginal overhead to disk I/O.
Intermediate byte based feature data files: a new way to extract features in a distributed way
In-memory cached ImageSearcher: as long as there is enough memory all linear searching is done in memory without much disk I/O (cp. class GenericFastImageSearcher and set caching to true)
Approximate indexing based on hashing: tests with 1.5 million led to search time < 300ms (cp. GenericDocumentBuilder with hashing set to true and BitSamplingImageSearcher)
Footprint of many global descriptors has been significantly reduced. Examples: EdgeHistogram 40 bytes, ColorLayout 504 bytes, FCTH 96 bytes, …
New unit test for benchmarking features on the UCID data set.
The current LireDemo 0.9.4 beta release features a new indexing routine, which is much faster than the old one. It’s based on the producer-consumer principle and makes — hopefully — optimal use of I/O and up to 8 cores of a system. Moreover, the new PHOG feature implementation is included and you can give it a try. Furthermore JCD, FCTH and CEDD got a more compact representation of their descriptors and use much less storage space now. Several small changes include parameter tuning on several descriptors and so on. All the changes have been documented in the CHANGES.txt file in the SVN.
I just uploaded Lire 0.9.3 to the all new Google Code page. This is the first version with full support for Lucene 4.0. Run time and memory performance are comparable to the version using Lucene 3.6. I’ve made several improvements in terms of speed and memory consumption along the way, mostly within the CEDD feature. Also I’ve added two new features:
JointHistogram – a 64 bit RGB color histogram joined with pixel rank in the 8-neighborhood, normalized with max-norm, quantized to [0,127], and JSD for a distance function
Opponent Histogram – a 64 bit histogram utilizing the opponent color space, normalized with max-norm, quantized to [0,127], and JSD for a distance function
Both features are fast in extraction (the second one naturally being faster as it does not investigate the neighborhood) and yield nice, visually very similar results in search. See also the image below showing 4 queries, each with the new features. The first one of a pair is always based on JointHistogram, the second is based on the OpponentHistogram (click ko see full size).
I also changed the Histogram interface to double as the double type is so much faster than float in 64 bit Oracle Java 7 VM. Major bug fix was in the JSD dissimilarity function. So many histograms now turned to use JSD instead of L1, depending on whether they performed better in the SIMPLIcity data set (see TestWang.java in the sources).
Final addition is the Lire-SimpleApplication, which provides two classes for indexing and search with CEDD, ready to compile with all libraries and an Ant build file. This may — hopefully — help those that still seek Java enlightenment 😀
Finally this just leaves to say to all of you: Merry Christmas and a Happy New Year!
I just submitted my code to the SVN and created a download for Lire 0.9.3_alpha. This version features support for Lucene 4.0, which changed quite a bit in its API. I did not have the time to test the Lucene 3.6 version against the new one, so I actually don’t know which one is faster. I hope the new one, but I fear the old one 😉
This is a pre-release for Lire for Lucene 4.0
Global features (like CEDD, FCTH, ColorLayout, AutoColorCorrelogram and alike) have been tested and considered working. Filters, like the ReRankFilter and the LSAFilter also work. The image shows a search for 10 images with ColorLayout and the results of re-ranking the result list with (i) CEDD and (ii) LSA. Visual words (local features), metric indexes and hashing have not been touched yet, beside making it compile, so I strongly recommend not to use them. However, due to a new weighting approach I assume that the visual word implementation based on Lucene 4.0 will — as soon as it is done — be much better in terms for retrieval performance.
I just uploaded version 0.9.2 of Lire and LireDemo to Google Code. Yes, Google Code! I also migrated (more or less in a under cover action some month ago) the SVN trunk to Google Code and will move on with development there. Main reasons were that ads were getting more and more aggressive over at sf.net and the interface of a Google Code project is so much cleaner and easier to handle from a project manager point of view.
Lire 0.9.2 fixes two bugs in KMeans and GenericImageSearcher. Both were critical. The KMeans fix allows now for the use of the bag of visual words approach. The GenericImageSearcher fix makes search much faster.
LireDemo 0.9.1 was released earlier today. Changes include several bug fixes, whereas the most critical was the one that prevented the indexing process from indexing a whole directory. Also the an SVN version of Lire has been compiled and added.
I just released Lire and Lire Demo in version 0.9 on sourceforge.net. Basically it’s the alpha version with additional speed and stability enhancements for bag of visual words (BoVW) indexing. While this has already been possible in earlier versions I re-furbished vocabulary creation (k-means clustering) and indexing to support up to 4 CPU cores. I also integrated a function to add documents to BoVW indexes incrementally. So a list of major changes since Lire 0.8 includes
Major speed-up due to change and re-write of indexing strategies for local features
Auto color correlation and color histogram features improved
Re-ranking filter based on global features and LSA
Parallel bag of visual words indexing and search supporting SURF and SIFT including incremental index updates (see also in the wiki)
Added functionality to Lire Demo including support for new Lire features and a new result list view
Finally I found some time to go through Lire and fix several of the — for me — most annoying bugs. While this is still work in progress I have a preview with the demo uploaded to sf.net. New features are:
Auto Color Correlogram and Color Histogram features improved
Re-ranking based on different features supported
Enhanced results view
Much faster indexing (parallel, use -server switch for your JVM)
Much faster search (re-write of the searhc code in Lire)
New developer menu for faster switching of search features
Re-ranking of results based on latent semantic analysis
You can find the updated Lire Demo along with a windows launcher here, Mac and Linux users please run it using “java -jar … ” or double click (if your windows manager supports actions like that
I just released LIRe v0.8. LIRe – Lucene Image Retrieval – is a Java library for easy content based image retrieval. Based on Lucene it doesn’t need a database and works reliable and rather fast. Major change in this version is the support of Lucene 3.0.1, which has a changed API and better performance on some OS. A critical bug was fixed in the Tamura feature implementation. It now definitely performs better Hidden in the depths of the code there is an implementation of the approximate fast indexing approach of G. Amato. It copes with the problem of linear search and provides a method for fast approximate retrieval for huge repositories (millions?). Unfortunately I haven’t tested with millions, just with tens thousands, which proves that it works, but it doesn’t show how fast.