Lire - Lucene Image REtrieval

The LIRE (Lucene Image REtrieval) library provides a simple way to create a Lucene index of image features for content-based image retrieval (CBIR). The implemented features are:

  • ScalableColor, ColorLayout and EdgeHistogram (MPEG-7)
  • CEDD and FCTH (contributed by Savvas Chatzichristofis)
  • Color histograms (HSV and RGB), Tamura & Gabor, auto color correlogram, JPEG coefficient histogram (common global descriptors)
  • Visual words based on SIFT and SURF
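
To illustrate what a global descriptor looks like, here is a minimal sketch of a coarse RGB color histogram. It is not LIRE's own implementation: the class name RgbHistogramSketch, the method extract and the binsPerChannel parameter are hypothetical names chosen for this example, and only the standard java.awt.image.BufferedImage class is used.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;

/** Hypothetical helper, not part of LIRE: a coarse global RGB color histogram. */
class RgbHistogramSketch {

    /**
     * Quantizes each color channel into binsPerChannel levels, counts the
     * pixels falling into each (r, g, b) bin and normalizes the counts so
     * that all bins sum to 1.
     */
    static double[] extract(BufferedImage image, int binsPerChannel) {
        double[] histogram = new double[binsPerChannel * binsPerChannel * binsPerChannel];
        int width = image.getWidth();
        int height = image.getHeight();
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                int rgb = image.getRGB(x, y);
                // Map each 0..255 channel value to a bin index 0..binsPerChannel-1.
                int r = ((rgb >> 16) & 0xFF) * binsPerChannel / 256;
                int g = ((rgb >> 8) & 0xFF) * binsPerChannel / 256;
                int b = (rgb & 0xFF) * binsPerChannel / 256;
                histogram[(r * binsPerChannel + g) * binsPerChannel + b] += 1.0;
            }
        }
        double pixelCount = (double) width * height;
        for (int i = 0; i < histogram.length; i++) {
            histogram[i] /= pixelCount;
        }
        return histogram;
    }

    public static void main(String[] args) {
        // A tiny 2x2 test image: two red pixels, one blue, one black.
        BufferedImage image = new BufferedImage(2, 2, BufferedImage.TYPE_INT_RGB);
        image.setRGB(0, 0, 0xFF0000);
        image.setRGB(1, 0, 0xFF0000);
        image.setRGB(0, 1, 0x0000FF);
        image.setRGB(1, 1, 0x000000);
        System.out.println(Arrays.toString(extract(image, 2)));
    }
}
```

The resulting vector describes the whole image rather than individual regions, which is why such descriptors are called global. Two images can then be compared by comparing their vectors.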

In addition, LIRE provides methods for searching the index, built on Lucene.

The LIRE library started out as part of the Caliph & Emir project and aimed to provide the CBIR features of Caliph & Emir to other Java projects in an easy and lightweight way. In the meantime it has turned into a big and interesting project in its own right.

With Lire you can easily create an index and search through the index.

How does Lire actually work?

Lire employs global image features for content-based image retrieval. For more information on the underlying methods and techniques, consult the basic literature on content-based image retrieval.

Furthermore, it uses the Java search engine Lucene to provide:

  • linear search (opening each and every indexed document and comparing it to the query feature)
  • approximate indexing (based on the ideas of G. Amato on inverted files for image retrieval)
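
The linear search strategy can be sketched as follows. This is a minimal illustration under stated assumptions, not LIRE's code: features are plain double arrays held in memory instead of Lucene documents, the L1 distance stands in for whatever metric a given descriptor defines, and all names (LinearSearchSketch, Hit, l1Distance, search) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch of linear search over in-memory feature vectors. */
class LinearSearchSketch {

    /** One search result: the image identifier and its distance to the query. */
    static final class Hit {
        final String id;
        final double distance;

        Hit(String id, double distance) {
            this.id = id;
            this.distance = distance;
        }
    }

    /** L1 (city block) distance, used here as a stand-in metric. */
    static double l1Distance(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors differ in length");
        }
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Compares the query to every indexed vector and returns the closest maxHits. */
    static List<Hit> search(double[] query, List<String> ids, List<double[]> features, int maxHits) {
        List<Hit> hits = new ArrayList<>();
        for (int i = 0; i < ids.size(); i++) {
            hits.add(new Hit(ids.get(i), l1Distance(query, features.get(i))));
        }
        hits.sort(Comparator.comparingDouble((Hit h) -> h.distance));
        return new ArrayList<>(hits.subList(0, Math.min(maxHits, hits.size())));
    }

    public static void main(String[] args) {
        List<String> ids = Arrays.asList("a.jpg", "b.jpg", "c.jpg");
        List<double[]> features = Arrays.asList(
                new double[] {0.5, 0.5},
                new double[] {1.0, 0.0},
                new double[] {0.0, 1.0});
        for (Hit hit : search(new double[] {0.9, 0.1}, ids, features, 2)) {
            System.out.println(hit.id + " " + hit.distance);
        }
    }
}
```

Because every indexed entry is visited, the cost of one query grows linearly with the size of the index. Approximate indexing trades some accuracy to avoid that full scan.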

Performance

The performance of Lire was initially tested with a data set of 3890 mixed-size digital photos (1-2 MP) on an AMD Athlon XP 2600 with 1 GB RAM and JSDK 1.5.0_05, running Windows XP. Parameters for the Java VM were -server -Xms256M -Xmx512M.

Test A

Creation (FastDocumentBuilder):

  • 1200 seconds for all files
  • 308 ms per image on average

Searching with default Searcher … (averaged over 50 searches):

  • BufferedImage as input: 341 ms per search
  • Document as input: 64 ms per search

Test B

Creation (with ExtensiveDocumentBuilder):

  • 2813 seconds for all files
  • 723 ms per image on average (note: this is outdated as the number of features has increased with v0.6)

Searching with default Searcher on this index B (averaged over 50 searches):

  • BufferedImage as input: 589 ms per search
  • Document as input: 100 ms per search
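
The per-image averages reported for index creation in Tests A and B follow directly from the total times and the 3890 images in the test set. The short check below reproduces that arithmetic; the class and method names are hypothetical and exist only for this example.

```java
/** Hypothetical helper that reproduces the per-image averages of Tests A and B. */
class IndexingTimeSketch {

    /** Converts a total indexing time in seconds into rounded milliseconds per image. */
    static long averageMillisPerImage(long totalSeconds, int imageCount) {
        return Math.round(totalSeconds * 1000.0 / imageCount);
    }

    public static void main(String[] args) {
        int imageCount = 3890;
        System.out.println(averageMillisPerImage(1200, imageCount)); // Test A: 308
        System.out.println(averageMillisPerImage(2813, imageCount)); // Test B: 723
    }
}
```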

Tips and Tricks

In general, the DocumentBuilder from DocumentBuilderFactory.getFastDocumentBuilder() is the best choice for fast retrieval. The most time-consuming task is extracting the features from the image itself. The FastDocumentBuilder extracts only a single feature, a fast one based on the color distribution in the image (MPEG-7 ColorLayout descriptor).

The DocumentBuilder from DocumentBuilderFactory.getDefaultDocumentBuilder() performs better than the one from DocumentBuilderFactory.getExtensiveDocumentBuilder(), but the latter returns intuitively the best results.

lire/lire.txt · Last modified: 2012/07/06 09:23 by mlux