Searching with Lire in big datasets

October 27, 2011 on 10:17 am | In General, Java, Software

Having received several complaints about the slowness of Lire when searching in 100k+ documents, I took the time to write a small how-to explaining approaches for searching in (relatively) big data sets.

Lire can create indexes with lots of different features (descriptors, like RGB color histograms or CEDD). While this gives flexibility at search time, as we can select the feature when we create a query, the index tends to get bigger and bigger and searches take longer and longer.

With a data set of 121,379 images, the index created with the features selected by default in Lire Demo takes up 14.3 GB on disk. In contrast, an index storing just the CEDD feature along with the image identifier takes up 29 MB, roughly 500 times less.

The size of the index also slows down linear search. On an AMD quad-core computer with 4 GB RAM and Java 1.7, searching the index stripped down to the CEDD feature and the identifier takes roughly 0.33 seconds, while searching the big index takes 7 minutes and 3 seconds.

So if you want to index and search big data sets (more than 100,000 images, for instance) I recommend that you

  • select which features you need,
  • create the index with a minimum set of features,
  • possibly split the index per feature and select the index on the fly instead of the feature, and
  • optionally load the index into RAM.

For more on loading the index to RAM and the option to use local features read on in the developer wiki.
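
To make the idea of a stripped-down, in-memory index more concrete, here is a minimal sketch in plain Java. It does not use Lire's API: the class CompactFeatureIndex, its methods and the L1 distance are assumptions made for illustration only (Lire's features bring their own distance functions). It just shows why an index that holds one feature vector plus an identifier per image can be scanned linearly very quickly once it sits in RAM.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch, not Lire's API: one feature vector per image, kept in RAM. */
class CompactFeatureIndex {

    /** One index entry: image identifier plus a single feature vector. */
    static final class Entry {
        final String id;
        final float[] feature;

        Entry(String id, float[] feature) {
            this.id = id;
            this.feature = feature;
        }
    }

    /** One search result: image identifier plus distance to the query. */
    static final class Hit {
        final String id;
        final double distance;

        Hit(String id, double distance) {
            this.id = id;
            this.distance = distance;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    /** Adds an image with its (single) feature vector to the in-memory index. */
    void add(String id, float[] feature) {
        entries.add(new Entry(id, feature.clone()));
    }

    /** L1 distance, used here only as a stand-in for a feature-specific distance. */
    static double l1(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("feature vectors differ in length");
        }
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }

    /** Linear search: compare the query to every entry and keep the k closest. */
    List<Hit> search(float[] query, int k) {
        List<Hit> hits = new ArrayList<>();
        for (Entry e : entries) {
            hits.add(new Hit(e.id, l1(query, e.feature)));
        }
        hits.sort(Comparator.comparingDouble((Hit h) -> h.distance));
        return new ArrayList<>(hits.subList(0, Math.min(k, hits.size())));
    }

    public static void main(String[] args) {
        CompactFeatureIndex index = new CompactFeatureIndex();
        index.add("img-001.jpg", new float[] {0f, 0f, 1f});
        index.add("img-002.jpg", new float[] {1f, 1f, 0f});
        index.add("img-003.jpg", new float[] {0f, 1f, 1f});
        for (Hit hit : index.search(new float[] {0f, 0f, 1f}, 2)) {
            System.out.println(hit.id + " " + hit.distance);
        }
    }
}
```

Splitting the index per feature would, in this sketch, simply mean keeping one such object per feature and picking the object at query time.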

Lire publication in the top 10 downloads of ACM SIGMM

October 24, 2011 on 1:42 pm | In General, Java, Software

As reported in this month’s SIGMM record, the electronic SIGMM newsletter, a publication about Lire is in the top 10 downloads of the ACM special interest group on multimedia for September 2011.

I co-authored the paper with Savvas Chatzichristofis:

Mathias Lux, Savvas A. Chatzichristofis. Lire: lucene image retrieval: an extensible java CBIR library. In ACM Multimedia 2008

It’s also the paper I recommend citing if Lire is used within a scientific publication, so my thanks also go to the authors citing and thereby pointing to our work!

Lire and Lire Demo v 0.9 released

October 20, 2011 on 12:37 pm | In Dev, General, Java, Multimedia, Software

I just released Lire and Lire Demo in version 0.9 on sourceforge.net. Basically it’s the alpha version with additional speed and stability enhancements for bag of visual words (BoVW) indexing. While BoVW indexing was already possible in earlier versions, I refurbished vocabulary creation (k-means clustering) and indexing to support up to 4 CPU cores. I also integrated a function to add documents to BoVW indexes incrementally. The major changes since Lire 0.8 are:

  • Major speed-up due to change and re-write of indexing strategies for local features
  • Auto color correlogram and color histogram features improved
  • Re-ranking filter based on global features and LSA
  • Parallel bag of visual words indexing and search supporting SURF and SIFT including incremental index updates (see also in the wiki)
  • Added functionality to Lire Demo including support for new Lire features and a new result list view
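
For readers new to bag of visual words, the following is a minimal sketch of the quantization step in plain Java. It is not Lire's implementation: the class BagOfVisualWordsSketch and its methods are hypothetical, and it assumes a vocabulary of cluster centers (for instance from k-means) already exists. Each local descriptor, such as a SURF or SIFT vector, is assigned to its nearest visual word, and the per-image word counts form the histogram that gets indexed. The assignment is run on a parallel stream to hint at how multiple CPU cores can be used.

```java
import java.util.Arrays;

/** Hypothetical sketch of bag of visual words quantization, not Lire's code. */
class BagOfVisualWordsSketch {

    /** Returns the index of the vocabulary entry closest to the descriptor (squared L2). */
    static int nearestWord(double[] descriptor, double[][] vocabulary) {
        if (vocabulary.length == 0) {
            throw new IllegalArgumentException("vocabulary must not be empty");
        }
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int w = 0; w < vocabulary.length; w++) {
            double dist = 0;
            for (int i = 0; i < descriptor.length; i++) {
                double diff = descriptor[i] - vocabulary[w][i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = w;
            }
        }
        return best;
    }

    /** Builds the visual word histogram of one image from its local descriptors. */
    static int[] histogram(double[][] descriptors, double[][] vocabulary) {
        // Assign descriptors to words in parallel; counting is done sequentially.
        int[] words = Arrays.stream(descriptors)
                .parallel()
                .mapToInt(d -> nearestWord(d, vocabulary))
                .toArray();
        int[] counts = new int[vocabulary.length];
        for (int w : words) {
            counts[w]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Toy two-dimensional data; real local descriptors have 64 or 128 dimensions.
        double[][] vocabulary = {{0, 0}, {10, 10}, {0, 10}};
        double[][] descriptors = {{1, 1}, {9, 9}, {0, 2}, {1, 9}};
        System.out.println(Arrays.toString(histogram(descriptors, vocabulary)));
    }
}
```

Adding a document incrementally then amounts to computing such a histogram against the existing vocabulary, without re-running the clustering.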


Lire Demo 0.9 alpha 2 just released

August 5, 2011 on 11:41 am | In Dev, Java, Multimedia, Software

Finally I found some time to go through Lire and fix several of the — for me — most annoying bugs. While this is still work in progress I have a preview with the demo uploaded to sf.net. New features are:

  • Auto Color Correlogram and Color Histogram features improved
  • Re-ranking based on different features supported
  • Enhanced results view
  • Much faster indexing (parallel, use -server switch for your JVM)
  • Much faster search (re-write of the search code in Lire)
  • New developer menu for faster switching of search features
  • Re-ranking of results based on latent semantic analysis

You can find the updated Lire Demo along with a Windows launcher here. Mac and Linux users, please run it using “java -jar … ” or double-click (if your window manager supports actions like that :)

The source is — of course — GPL and available in the SVN.

Visual VM is Part of Java 1.6 Update 7

July 14, 2008 on 1:59 pm | In Development, Java

Java 1.6 u7 was released recently by Sun. While not bringing major changes, it brought along some bug fixes and solved some security issues. However, there is one main addition: VisualVM. This is a really great developer tool: it connects to running VMs and shows “some statistics” about them. Besides memory usage and thread information, it also allows some basic profiling. In my opinion Sun did a good job including VisualVM in the package! Note that this thing is built on the NetBeans Platform ;-)


Lire: Submission to the ACM Multimedia Open Source Contest 2008

June 17, 2008 on 10:00 am | In Development, General, Java, Lire, LireDemo, OpenSource, Research

I recently submitted Lire and LireDemo to the ACM Multimedia Open Source Software Competition 2008. As I’d really like to go there, I hope it will be judged a relevant contribution and that a demo at ACM Multimedia is requested. Note that I’ve integrated a new feature in LireDemo for the ACM Multimedia submission: now it’s easier to test Lire by just indexing random photos from Flickr. If you hit the “Index” button without giving a directory of images, the download will start automatically.


LIRE v0.6 released: New Image Features

June 9, 2008 on 3:35 pm | In Imaging, Java, Lire, LireDemo, Release, Releases, Software

The new release contains three additional features: (i) Tamura texture features, (ii) the Color and Edge Directivity Descriptor (CEDD) and (iii) a configurable color histogram implementation. While the last one was integrated for comparison only, the other two provide additional improvements, especially the CEDD feature. Furthermore, a FastMap implementation was included in the release for optimization of the indexing process in a later release. Also, some bugs were fixed in the MPEG-7 EdgeHistogram descriptor provided in the cbir-library jar file and in color-only search. Note that due to the increased number of features, the extensive document builder, which extracts all available features, needs significantly more time for extraction than in the last release.


Lire SVN build for Java 1.5

May 30, 2008 on 1:29 pm | In CaliphEmir, Dev, Development, Imaging, Java, Lire, LireDemo, Releases

Due to requests I took some time and built a Java 1.5 version instead of the 1.6 versions. A simple compile with 1.5 wouldn’t help as I use the swing layout classes of NetBeans (now integrated in Java 1.6), so imports have to be re-adjusted and the library has to be added. Furthermore I created an explicit build target in Caliph to create a 1.5 version of the cbir jar file. This snapshot works fine with MacOS (as far as I’ve heard) and on Windows.


Lire development: a big next step ..

May 29, 2008 on 9:12 am | In Dev, Development, General, Imaging, Java, Lire, LireDemo, Multimedia, OpenSource, Releases

While it has been quiet around Lire for some time, development has recently been pushed forward. I switched to SVN for development and integrated simple RGB color histograms as a feature for comparison with the MPEG-7 features. Savvas Chatzichristofis contributed the CEDD feature, which works great! Marko Keuschnig and Christian Penz contributed implementations of the Gabor texture feature and the Tamura texture features, where the latter is already in the SVN. I also integrated the new features in LireDemo. A new version, already compiled, can be downloaded here: liredemo-svn-2008-05-29-jdk16.tar.bz2. Note that Java 1.6 is required.

NetBeans 6.1 Released

April 30, 2008 on 12:51 pm | In Development, General, Java, Netbeans, Releases

The new NetBeans IDE 6.1 was released two days ago. Changes are more incremental than fundamental, but it now features support for JavaScript and code completion for JavaDoc. Furthermore, support for MySQL has been added. Release notes can be found here.


© 2004-2010 by Mathias Lux
>> Contents of this page are licensed under the Creative Commons Attribution-Share Alike 3.0 Austria License <<