Run time for search in LIRE depends heavily on the method used for indexing and search. There are two main ways to store data, two corresponding strategies for linear search, and, of course, approximate indexing. The two storage strategies are (i) storing the actual feature vector in a Lucene text field and (ii) using the Lucene DocValues data format. While the former allows for easy access, more flexibility and compression, the latter is much faster when accessing raw byte data. Linear search then needs to open each and every document and compare the query vector to the one stored in the document. For linear search on Lucene text fields, caching boosts performance: the byte data of the feature vectors is read once from the index and kept in memory. For the DocValues storage format, access is fast enough to allow for linear search directly. With approximate indexing, a query string is run against the inverted index, and only the k best matching candidates are re-ranked by linear search to find the n << k actual results. So first a text search is done, then a linear search is performed on far fewer images. In our tests we used k=500 and n=10.
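The two-stage idea can be sketched in plain Java. This is a simplified illustration, not LIRE's actual implementation: class and method names are made up, and Euclidean distance stands in for whichever metric the feature actually defines.

```java
import java.util.Arrays;
import java.util.Comparator;

// Simplified illustration, not LIRE's actual implementation. Euclidean
// distance stands in for whichever metric the feature defines.
public class SearchSketch {

    // Distance between a query feature vector and a stored one.
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return Math.sqrt(s);
    }

    // Linear search: compare the query against every stored vector and
    // return the ids of the n nearest ones.
    static int[] linearSearch(double[] query, double[][] index, int n) {
        Integer[] ids = new Integer[index.length];
        for (int i = 0; i < ids.length; i++) ids[i] = i;
        Arrays.sort(ids, Comparator.comparingDouble(i -> dist(query, index[i])));
        int[] result = new int[n];
        for (int i = 0; i < n; i++) result[i] = ids[i];
        return result;
    }

    // Approximate search: a text query on the inverted index yields k
    // candidate ids; linear re-ranking over just those candidates then
    // picks the n << k final results.
    static int[] approximateSearch(double[] query, double[][] index, int[] candidates, int n) {
        double[][] subset = new double[candidates.length][];
        for (int i = 0; i < candidates.length; i++) subset[i] = index[candidates[i]];
        int[] local = linearSearch(query, subset, n);
        int[] result = new int[n];
        for (int i = 0; i < n; i++) result[i] = candidates[local[i]];
        return result;
    }
}
```

The recall loss of the approximate variant comes entirely from the candidate step: a true neighbor that is not among the k candidates can never appear in the final n results.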
Tests on 499,207 images have shown that at this scale, approximate search already outperforms linear search. The following numbers are search times in ms. Note that the average value per search differs for a different number of test runs due to the context of the runs, i.e. the state of the Java VM, OS processes, file systems, etc., but the trend can be seen.
| Method | ms per search avg. on 10 runs | ms per search avg. on 100 runs |
|---|---|---|
| Cached linear search on text fields (*) | | |
| Not cached linear search on text fields | | |
| Linear search on DocValues | | |
| Approximate search by Metric Spaces (**) | | |
(*) Start-up latency when filling the cache was 6.098 seconds
(**) Recall with 10 results was 0.76 on ten runs and 0.72 on 100 runs
In conclusion, with nearly 500,000 images the DocValues approach might be the best choice, as approximate indexing loses around 25% of the results while not boosting runtime performance that much. Further optimizations would be, for instance, query bundling or index splitting in combination with multithreading.
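The index-splitting idea could look roughly like this. This is a hypothetical sketch, not LIRE code: the index is split into segments, each segment is scanned by its own thread, and the per-segment winners are merged afterwards.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not LIRE code: split the index into segments, scan
// the segments in parallel, then merge the per-segment winners.
public class ParallelScan {

    // Squared Euclidean distance -- sufficient for ranking; stands in for
    // whatever metric the actual feature defines.
    static double dist(double[] a, double[] b) {
        double s = 0;
        for (int i = 0; i < a.length; i++) { double d = a[i] - b[i]; s += d * d; }
        return s;
    }

    // Returns the id of the nearest stored vector, scanning `threads`
    // index segments concurrently.
    static int parallelNearest(double[] query, double[][] index, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> parts = new ArrayList<>();
            int chunk = (index.length + threads - 1) / threads;
            for (int t = 0; t < threads; t++) {
                final int from = t * chunk;
                final int to = Math.min(index.length, from + chunk);
                parts.add(pool.submit(() -> {
                    int best = -1;
                    double bestD = Double.MAX_VALUE;
                    for (int i = from; i < to; i++) {
                        double d = dist(query, index[i]);
                        if (d < bestD) { bestD = d; best = i; }
                    }
                    return best; // -1 if the segment was empty
                }));
            }
            int best = -1;
            double bestD = Double.MAX_VALUE;
            for (Future<Integer> f : parts) {
                int id = f.get();
                if (id >= 0 && dist(query, index[id]) < bestD) {
                    bestD = dist(query, index[id]);
                    best = id;
                }
            }
            return best;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Since linear search is embarrassingly parallel, the segments can be scanned fully independently; only the small merge step at the end is sequential.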
 Gennaro, Claudio, et al. “An approach to content-based image retrieval based on the Lucene search engine library.” Research and Advanced Technology for Digital Libraries. Springer Berlin Heidelberg, 2010. 55-66.
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. (c) Mathias Lux, 2015
I just moved Caliph & Emir to GitHub. Due to necessary bug fixes, which now allow it to run on Java 7 & 8, I took the chance to move it to the sleeker and more focused GitHub portal. I also released the binary version as v0.9.27, tested with Java 7 and 8 on Windows 7. If you use Caliph & Emir for research, please don’t forget to cite
Lux, Mathias. “Caliph & Emir: MPEG-7 photo annotation and retrieval.” Proceedings of the 17th ACM international conference on Multimedia. ACM, 2009.
Hi there! I’m writing this basically in my own interest, but it may save both me and you some time. I currently get a lot of mails regarding LIRE: how to do this, how to use that, or how to fix that. While I totally appreciate the popularity, I still have a “day job” as an associate professor at Klagenfurt University in Austria, and I cannot answer all the mails. However, there are skilled and motivated people subscribed to the mailing list who can answer many of the questions, and there are others who might have the same questions but have to ask them all over again because of those private conversations.
Therefore, I ask for all contact on LIRE support to be handled on the mailing list and not via private messages or mails to me. This helps all of us, as questions will less often be asked twice, and other people get a chance to help.
If the mailing list does not work for you because you need to keep a lid on a not-yet-released project, or you want to have priority support, you can make use of the consulting services I offer in the context of LIRE and content-based retrieval.
Due to a security breach in the wiki, the developer documentation was spammed by a botnet, leading to a warning from my hosting company to get it under control. After doing so by basically taking the wiki offline and replacing it with a static error page, we came up with a new way of handling the developer documentation.
First of all, it’s now based on Markdown files, which are located in the LIRE source SVN. We are using MkDocs, an awesome Python tool, to generate the static HTML files and serve them from the former wiki location.
So how do you contribute now? It’s rather easy: either write a new Markdown file for a topic, or check out an existing one, edit it as you see fit, and send us a patch on the mailing list.
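For orientation, the MkDocs build is driven by a small YAML config; a minimal sketch might look like this (the page file names below are hypothetical examples, not the actual LIRE documentation files):

```yaml
# mkdocs.yml -- minimal sketch; page names are hypothetical examples
site_name: LIRE Developer Documentation
pages:
- [index.md, Home]
- [searching.md, Searching with LIRE]
```

Running `mkdocs build` then renders the Markdown files into static HTML, and `mkdocs serve` previews the site locally while editing.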
It’s been more than a year since we made a release, so there have been lots of changes, fixes and new features. Most important of all is the integration of the SIMPLE descriptor, a local feature that works extremely well for content based image retrieval. This has been done with a lot of help from Nektarios Anagnostopoulos and Savvas Chatzichristofis!
Besides that, we switched to OpenCV for SURF and SIFT extraction, added numerous bug fixes, updated Lucene to 4.10.2, and much more. Best you give it a try.
>> Head over to the downloads
I just put up the new LIRE demo at http://demo-itec.uni-klu.ac.at/liredemo/. It’s based on the LIRE Solr plugin, which now supports arbitrary LIRE features and has been updated to fit the current Solr version 4.10.2.
Check out the new search options for searching for tags in combination with image similarity. Basically, you can use the first parameter box to search for any string (e.g. tags:dog) and then use the sort option below the images to re-rank them according to the similarity of the selected picture.
Btw. thanks go to my department, the Department of Information Technology at the Faculty of Technical Sciences of the Alpen-Adria-Universität Klagenfurt for running the demo on their servers.
Our submission to the interactive Art Track of ACM Multimedia 2014 has been accepted: “Gone: An Interactive Experience for Two People” by Michael Riegler, Mathias Lux, Christian Zellot, Lukas Knoch, Horst Schnattler, Sabrina Napetschnig, Julian Kogler, Claus Degendorfer, Norbert Spot und Manuel Zoderer.
Our project is an interactive installation in which two people interact based solely on audio cues triggered by one person. The other person moves an avatar through a virtual space based on those cues.
The software for the installation is open source and available on Bitbucket.
I just got word that our joint submission with Giuseppe Becchi, Marco Bertini and colleagues from Firenze has been accepted for presentation and publication at the open source software track at ACM Multimedia 2014 in Orlando, FL:
Giuseppe Becchi, Marco Bertini, Lorenzo Cioni, Alberto Del Bimbo, Andrea Ferracani, Daniele Pezzatini and Mathias Lux (2014) Loki+Lire: A Framework to Create Web-Based Multimedia Search Engines, in Proceedings ACM Multimedia 2014, Orlando, FL (to appear)
Check the web site for more information: Loki: A Cross-Media Search Engine
The power of crowds – leveraging a large number of human contributors and the capabilities of human computation – has enormous potential to address key challenges in the area of multimedia research. Crowdsourcing offers a time- and resource-efficient method for collecting large volumes of input for system design and evaluation, making it possible to optimize multimedia systems more rapidly and to address human factors more effectively. At present, crowdsourcing remains notoriously difficult to exploit effectively in multimedia settings: the challenge arises from the fact that a community of users or workers is a complex and dynamic system highly sensitive to changes in the form and the parameterization of their activities.
The submission deadline has been extended to July 15, 2014.
The third CrowdMM workshop takes place in Orlando, FL, co-located with ACM Multimedia 2014. For more information, topics and important dates visit: http://www.crowdmm.org/call-for-papers/