Lire - Lucene Image REtrieval

The LIRE (Lucene Image REtrieval) library provides a simple way to create a Lucene index of image features for content-based image retrieval (CBIR). There is no complete list of features, but these are some of them:

Furthermore, methods for searching the index based on Lucene are provided.

The LIRE library started out as part of the Caliph & Emir project and aimed to provide the CBIR features of Caliph & Emir to other Java projects in an easy and lightweight way. In the meantime it has grown into a large and interesting project in its own right.

With LIRE you can easily create an index and search through it. LIRE 1.0 also supports local features based on bag of visual words and the SIMPLE approach; see Builders.

I recommend starting with a look at the SimpleApplication package of LIRE, which covers the most commonly needed functionality, including indexing, search and extraction of image features for use in other applications. It's also a good idea to work on the current SVN version of LIRE; see How to check out and set up LIRE in the IDEA IDE.

Note at this point that LIRE comes with Apache Ant build files named build.xml. You can use their build targets to create the jar from the source code as soon as you have Ant installed, or if you are using an IDE that supports Ant, such as IDEA, Eclipse or NetBeans. Apache Ant can be found at the Apache Ant Project Page.

If you are searching for the Solr plugin of LIRE: it's still under construction. Some global features are working fine, and it's based on Solr 4.10.2. It can be found at BitBucket. It has been reported to work on distributed installations.

How does Lire actually work?

Lire employs global image features for content-based image retrieval. For more information on the underlying methods and techniques, you should consult the basic literature on content-based image retrieval:

Further it uses the Java search engine Lucene to provide
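To make the idea of a global image feature more concrete, here is a minimal sketch in plain Java. It does not use LIRE or any of its feature classes: the class name GlobalHistogramSketch, the 64-bin RGB histogram and the L1 distance are all assumptions chosen for illustration. The point is only that a global feature describes the whole image with one vector, and that two images are compared by a distance between their vectors.

```java
import java.awt.image.BufferedImage;

/** Toy global feature: one normalized 64-bin RGB histogram per image (not a LIRE class). */
class GlobalHistogramSketch {

    /** Quantizes each color channel to 4 levels and counts pixels per (r, g, b) bin. */
    static double[] extract(BufferedImage img) {
        double[] hist = new double[64];
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int r = ((rgb >> 16) & 0xFF) >> 6; // 0..3
                int g = ((rgb >> 8) & 0xFF) >> 6;  // 0..3
                int b = (rgb & 0xFF) >> 6;         // 0..3
                hist[(r << 4) | (g << 2) | b]++;
            }
        }
        // Normalize so that images of different sizes stay comparable.
        double pixels = (double) img.getWidth() * img.getHeight();
        for (int i = 0; i < hist.length; i++) {
            hist[i] /= pixels;
        }
        return hist;
    }

    /** L1 distance between two histograms; 0 means identical color distributions. */
    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Demo helper: an 8x8 image filled with a single color. */
    static BufferedImage solid(int rgb) {
        BufferedImage img = new BufferedImage(8, 8, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < 8; y++) {
            for (int x = 0; x < 8; x++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) {
        double[] red = extract(solid(0xFF0000));
        double[] darkerRed = extract(solid(0xF00000));
        double[] blue = extract(solid(0x0000FF));
        System.out.println(distance(red, darkerRed)); // prints 0.0 (same bin)
        System.out.println(distance(red, blue));      // prints 2.0 (disjoint bins)
    }
}
```

Real descriptors are considerably more elaborate than a coarse color histogram, but they follow the same pattern: extract one vector per image, then rank images by their distance to the query vector.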

Performance

Parallel indexing with the ParallelIndexer, running with 8 threads on an AMD A10 with 4 cores at 4.4 GHz under 64-bit Windows 7 and extracting 7 features at once including hashing, is down to ~180 ms per image. On an Intel Core i7, i.e. the 4770K, it runs a lot faster, and using an SSD speeds up the process even more. On a Core i7, extracting single features with the ParallelIndexer is typically faster than 1 MP images can be read from a (magnetic) hard disk.
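The general pattern behind multi-threaded extraction can be sketched with a fixed thread pool from the Java standard library. This is not the ParallelIndexer and does not reflect its API: the class ParallelExtractionSketch, the stand-in feature meanGray and the in-memory test images are assumptions made so the example stays self-contained. Timings depend entirely on the machine, so none are claimed here.

```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Sketch of feature extraction spread over a fixed number of worker threads. */
class ParallelExtractionSketch {

    /** Stand-in for a real feature extractor: the mean gray value of the image. */
    static double meanGray(BufferedImage img) {
        long sum = 0;
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                sum += (((rgb >> 16) & 0xFF) + ((rgb >> 8) & 0xFF) + (rgb & 0xFF)) / 3;
            }
        }
        return (double) sum / ((long) img.getWidth() * img.getHeight());
    }

    /** Extracts one value per image; results are returned in input order. */
    static List<Double> extractAll(List<BufferedImage> images, int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> futures = new ArrayList<>();
            for (BufferedImage img : images) {
                Callable<Double> task = () -> meanGray(img);
                futures.add(pool.submit(task));
            }
            List<Double> results = new ArrayList<>();
            for (Future<Double> future : futures) {
                results.add(future.get()); // blocks until that image is done
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    /** Demo helper: a uniformly gray image of the given size. */
    static BufferedImage gray(int level, int size) {
        BufferedImage img = new BufferedImage(size, size, BufferedImage.TYPE_INT_RGB);
        int rgb = (level << 16) | (level << 8) | level;
        for (int y = 0; y < size; y++) {
            for (int x = 0; x < size; x++) {
                img.setRGB(x, y, rgb);
            }
        }
        return img;
    }

    public static void main(String[] args) throws Exception {
        List<BufferedImage> images = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            images.add(gray(i, 256));
        }
        long start = System.nanoTime();
        List<Double> features = extractAll(images, 8); // 8 worker threads
        double msPerImage = (System.nanoTime() - start) / 1e6 / images.size();
        System.out.println(features.size() + " images, " + msPerImage + " ms per image");
    }
}
```

In a real indexing run the images come from disk, which is why storage speed (SSD versus magnetic disk) can become the limiting factor once extraction itself is fast enough.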

Search time is a matter of index size: it is down to a few ms for 100k images or fewer and increases linearly with the number of indexed images.
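The linear growth follows from comparing the query against every indexed entry. The sketch below illustrates such a linear scan with a top-k heap in plain Java. It is not LIRE's searcher implementation: the class LinearScanSketch, the in-memory double[][] "index" and the L1 distance are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Sketch of a linear-scan nearest-neighbor search over feature vectors. */
class LinearScanSketch {

    /** L1 distance between two vectors of equal length. */
    static double l1(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Returns the positions of the k nearest vectors, nearest first. */
    static List<Integer> search(double[][] index, double[] query, int k) {
        // Max-heap on distance: the head is the worst of the current best k.
        PriorityQueue<double[]> heap = new PriorityQueue<>(
                Comparator.comparingDouble((double[] e) -> e[1]).reversed());
        for (int i = 0; i < index.length; i++) { // one distance per indexed entry
            heap.add(new double[] {i, l1(index[i], query)});
            if (heap.size() > k) {
                heap.poll(); // drop the current worst candidate
            }
        }
        List<Integer> hits = new ArrayList<>();
        while (!heap.isEmpty()) {
            hits.add(0, (int) heap.poll()[0]); // worst comes out first, so prepend
        }
        return hits;
    }

    public static void main(String[] args) {
        double[][] index = {{0, 0}, {1, 1}, {5, 5}, {0.9, 1}};
        System.out.println(search(index, new double[] {1, 1}, 2)); // prints [1, 3]
    }
}
```

Every query touches each indexed vector exactly once, so doubling the index roughly doubles the search time.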