People have lately asked whether LIRE can do more than linear search, and I always answered: yes, it should … but you know, I never tried. Finally I came around to indexing the MIR-FLICKR data set along with some of my Flickr-crawled photos and ended up with an index of 1,443,613 images. I used CEDD as the main feature and a hashing algorithm to put multiple hashes per image into Lucene, to be interpreted as words. By tuning similarity, employing a Boolean query, and adding a re-rank step I ended up with a pretty decent approximate retrieval scheme, which is much faster and does not lose too many images on the way, which means the method has an acceptable recall. The image below shows the numbers along with a sample query. Linear search took more than a minute, while the hashing-based approach did (nearly) the same thing in less than a second. Note that this is just a sequential, straightforward implementation, so no performance optimization has been done. Also, the hashing approach has not yet been investigated in detail, i.e. there are some parameters that still need tuning … but let's say it's a step in the right direction.
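The scheme above (collect candidates that share at least one hash with the query, then re-rank the candidates by the actual feature distance) can be sketched in plain Java. The class and method names below are illustrative stand-ins, not LIRE's actual API, and a toy L1 distance stands in for the CEDD dissimilarity:

```java
import java.util.*;
import java.util.stream.*;

// Sketch of hash-based approximate retrieval: each image carries several hash
// values that act like words in an inverted index; a query is a Boolean OR over
// its hashes, and only the matching candidates get the (expensive) re-ranking.
public class ApproximateSearch {

    // Inverted index: hash value -> ids of images carrying that hash.
    static Map<Integer, List<Integer>> invertedIndex = new HashMap<>();

    static void index(int imageId, int[] hashes) {
        for (int h : hashes)
            invertedIndex.computeIfAbsent(h, k -> new ArrayList<>()).add(imageId);
    }

    // Boolean-OR lookup: any image sharing at least one hash is a candidate.
    static Set<Integer> candidates(int[] queryHashes) {
        Set<Integer> result = new HashSet<>();
        for (int h : queryHashes)
            result.addAll(invertedIndex.getOrDefault(h, Collections.emptyList()));
        return result;
    }

    // Re-rank step: sort only the candidates by the true feature distance.
    static List<Integer> search(int[] queryHashes, Map<Integer, double[]> features,
                                double[] queryFeature, int maxHits) {
        return candidates(queryHashes).stream()
                .sorted(Comparator.comparingDouble(
                        (Integer id) -> l1(features.get(id), queryFeature)))
                .limit(maxHits)
                .collect(Collectors.toList());
    }

    static double l1(double[] a, double[] b) {
        double d = 0;
        for (int i = 0; i < a.length; i++) d += Math.abs(a[i] - b[i]);
        return d;
    }
}
```

The speed-up comes from the candidate set being much smaller than the whole index, while recall depends on how likely visually similar images are to share a hash.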
LIRE is not a sleeping beauty, so there's something going on in the SVN. I recently checked in updates on Lucene (now 4.2) and Commons Math (now 3.1.1). I also removed some deprecated code left over from Lucene 3.x.
The most notable addition, however, is the Extractor / Indexor class pair. These are command line applications that extract global image features from images, write them to an intermediate data file and then, with the help of Indexor, add them to an index. All images are referenced relative to the intermediate data file, so this approach can be used to preprocess a large number of images from different computers on a network file system. Extractor also takes a file list of images as input (one image per line) and can therefore easily be run in parallel: just split your global file list into n smaller, non-overlapping ones and run n Extractor instances. As the extraction part is the slow one, this should allow for a significant speed-up when run in parallel.
Extractor is run with
$> Extractor -i <infile> -o <outfile> -c <configfile>
- <infile> gives the images, one per line. Use “dir /s /b *.jpg > list.txt” to create a compatible list on Windows.
- <outfile> gives the location and name of the intermediate data file. Note: it has to be in a folder that is a parent of all the images!
- <configfile> gives the list of features as a Java Properties file. The supported features are listed below the post. The properties file looks like:
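The post doesn't show the configuration file itself, so here is a hypothetical minimal example. The exact keys, and whether fully qualified feature class names are required, are assumptions and should be checked against the feature list shipped with LIRE:

```properties
# Hypothetical Extractor config: list the features to extract, one per key.
feature.1=net.semanticmetadata.lire.imageanalysis.CEDD
feature.2=net.semanticmetadata.lire.imageanalysis.FCTH
```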
Indexor is run with
$> Indexor -i <input-file> -l <index-directory>
- <input-file> is the output file of Extractor, the intermediate data file.
- <index-directory> is the directory of the index the images will be added to (the index is appended to, not overwritten)
Features supported by Extractor:
The 11th International Content-Based Multimedia Indexing Workshop aims to bring together the various communities involved in all aspects of content-based multimedia indexing, retrieval, browsing and presentation. Following the ten successful previous events of CBMI (Toulouse 1999, Brescia 2001, Rennes 2003, Riga 2005, Bordeaux 2007, London 2008, Chania 2009, Grenoble 2010, Madrid 2011, and Annecy 2012), the University of Pannonia, Hungary, organizes the 11th Content-Based Multimedia Indexing Workshop on June 17-19, 2013 in the historical town of Veszprém, Hungary, near the spectacular Lake Balaton. The workshop will host invited keynote talks and regular, special and demo sessions with contributed research papers.
For more information see http://cbmi2013.mik.uni-pannon.hu/
Finally! Our book on LIRE is published! It was an incredibly long way to get there, but it was worth it! We are also very happy with our publisher, especially as the ebook PDF version starts as low as $20, so it's definitely more affordable than comparable books! The book gives an introduction to the fields of information retrieval and visual information retrieval and points out selected methods as well as their use and implementation within LIRE. It's intended to be a fully fledged course for those who want either to employ LIRE in their projects or to build upon and extend LIRE. Find the book here.
Visual Information Retrieval using Java and LIRE
Synthesis Lectures on Information Concepts, Retrieval, and Services
January 2013, 112 pages, (doi:10.2200/S00468ED1V01Y201301ICR025)
Mathias Lux (Alpen Adria Universität Klagenfurt, AT)
Oge Marques (Florida Atlantic University, USA)
Abstract. Visual information retrieval (VIR) is an active and vibrant research area, which aims to provide means for organizing, indexing, annotating, and retrieving visual information (images and videos) from large, unstructured repositories.
The goal of VIR is to retrieve matches ranked by their relevance to a given query, which is often expressed as an example image and/or a series of keywords. During its early years (1995-2000), the research efforts were dominated by content-based approaches contributed primarily by the image and video processing community. During the past decade, it was widely recognized that the challenges imposed by the lack of coincidence between an image’s visual contents and its semantic interpretation, also known as semantic gap, required a clever use of textual metadata (in addition to information extracted from the image’s pixel contents) to make image and video retrieval solutions efficient and effective. The need to bridge (or at least narrow) the semantic gap has been one of the driving forces behind current VIR research. Additionally, other related research problems and market opportunities have started to emerge, offering a broad range of exciting problems for computer scientists and engineers to work on.
In this introductory book, we focus on a subset of VIR problems where the media consists of images, and the indexing and retrieval methods are based on the pixel contents of those images — an approach known as content-based image retrieval (CBIR). We present an implementation-oriented overview of CBIR concepts, techniques, algorithms, and figures of merit. Most chapters are supported by examples written in Java, using Lucene (an open-source Java-based indexing and search implementation) and LIRE (Lucene Image REtrieval), an open-source Java-based library for CBIR.
Prompted by comments from a friend, I re-investigated the Tanimoto coefficient implementation of the CEDD class and was able to delete an unnecessary "if" statement. This results in a 10% speed-up in tests of linear search with 150k images. With the new implementation, search takes ~460 ms (averaged over 10 runs) on a 64-bit VM, quad-core, 8 GB RAM, with the index on an SSD. Find the new version in the SVN.
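For reference, a common formulation of the Tanimoto-based dissimilarity for non-negative histograms looks like the sketch below. This is a stand-in illustration, not the exact CEDD code from the SVN:

```java
// Tanimoto-based dissimilarity for non-negative feature histograms:
// d(a, b) = 1 - (a·b) / (|a|^2 + |b|^2 - a·b).
// 0 means identical direction and magnitude, 1 means no overlap at all.
public class Tanimoto {
    public static double tanimotoDistance(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        double denominator = normA + normB - dot;
        if (denominator == 0) return 0; // both vectors all zeros -> identical
        return 1d - dot / denominator;
    }
}
```

Since the whole loop is straight-line arithmetic over two arrays, removing a branch from the inner loop is exactly the kind of change that buys a measurable speed-up in linear search.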
October 21–25, 2013 Barcelona, Spain.
Since the founding of ACM SIGMM in 1993, ACM Multimedia has been the worldwide premier conference and a key world event to display scientific achievements and innovative industrial products in the multimedia field. At ACM Multimedia 2013, we will celebrate its twenty-first iteration with an extensive program consisting of technical sessions covering all aspects of the multimedia field in the form of oral and poster presentations, tutorials, panels, exhibits, demonstrations and workshops. The program brings into focus the principal subjects of investigation, competitions of research teams on challenging problems, and also an interactive art program stimulating artists and computer scientists to meet and discover together the frontiers of artistic communication.
- Abstracts for Papers Due: March 1, 2013
- Full/short Papers Due: March 8, 2013, see here
- Workshop Proposals Due: January 9, 2013, see here
- Tutorials Due: January 15, 2013, see here
More information at http://www.acmmm13.org
I just uploaded Lire 0.9.3 to the all new Google Code page. This is the first version with full support for Lucene 4.0. Run time and memory performance are comparable to the version using Lucene 3.6. I’ve made several improvements in terms of speed and memory consumption along the way, mostly within the CEDD feature. Also I’ve added two new features:
- JointHistogram - a 64 bit RGB color histogram joined with pixel rank in the 8-neighborhood, normalized with max-norm, quantized to [0,127], and JSD for a distance function
- Opponent Histogram - a 64 bit histogram utilizing the opponent color space, normalized with max-norm, quantized to [0,127], and JSD for a distance function
Both features are fast in extraction (the second one naturally being faster, as it does not investigate the neighborhood) and yield nice, visually very similar results in search. See also the image below showing 4 queries, each with the new features. The first one of a pair is always based on JointHistogram, the second on OpponentHistogram (click to see full size).
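A 64-bin opponent color histogram can be sketched as follows: convert each RGB pixel to the opponent color space and quantize each of the three axes to 4 bins (4 × 4 × 4 = 64). The axis ranges and bin boundaries below are assumptions for illustration; LIRE's OpponentHistogram may differ in detail (and omits the max-norm normalization step described above):

```java
import java.awt.image.BufferedImage;

// Sketch of a 64-bin opponent color histogram (counts only, no normalization).
public class OpponentHistogramSketch {
    public static int[] extract(BufferedImage img) {
        int[] histogram = new int[64];
        for (int y = 0; y < img.getHeight(); y++) {
            for (int x = 0; x < img.getWidth(); x++) {
                int rgb = img.getRGB(x, y);
                int r = (rgb >> 16) & 0xFF, g = (rgb >> 8) & 0xFF, b = rgb & 0xFF;
                // Opponent color space: two chromatic axes and one intensity axis.
                double o1 = (r - g) / Math.sqrt(2);
                double o2 = (r + g - 2 * b) / Math.sqrt(6);
                double o3 = (r + g + b) / Math.sqrt(3);
                int q1 = quantize(o1, -255 / Math.sqrt(2), 255 / Math.sqrt(2));
                int q2 = quantize(o2, -510 / Math.sqrt(6), 510 / Math.sqrt(6));
                int q3 = quantize(o3, 0, 765 / Math.sqrt(3));
                histogram[q1 * 16 + q2 * 4 + q3]++;
            }
        }
        return histogram;
    }

    // Map v in [min, max] to one of 4 bins.
    static int quantize(double v, double min, double max) {
        int bin = (int) (4 * (v - min) / (max - min));
        return Math.min(bin, 3);
    }
}
```

JointHistogram would additionally compute a pixel-rank value from the 8-neighborhood per pixel, which is why it is the slower of the two.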
I also changed the Histogram interface to double, as the double type is much faster than float on the 64-bit Oracle Java 7 VM. A major bug fix was in the JSD dissimilarity function, so several histogram features now use JSD instead of L1, depending on which performed better on the SIMPLIcity data set (see TestWang.java in the sources).
The final addition is the Lire-SimpleApplication, which provides two classes for indexing and search with CEDD, ready to compile with all libraries and an Ant build file. This may, hopefully, help those who still seek Java enlightenment.
Finally this just leaves to say to all of you: Merry Christmas and a Happy New Year!
In the course of finishing the book, I reviewed several aspects of the LIRE code and came across some bugs, including one in the Jensen-Shannon divergence. This dissimilarity measure had never been used actively in any features as it didn't work out in retrieval evaluation the way it was meant to. After two hours of staring at the code the realization finally came: in Java, the ternary conditional operator "x ? y : z" has lower precedence than almost any other operator, including '+'. Hence,
System.out.print(true ? 1 : 0 + 1);   // prints '1', parsed as true ? 1 : (0 + 1)
System.out.print((true ? 1 : 0) + 1); // prints '2'
With this problem identified I was finally able to fix the Jensen-Shannon divergence implementation and came to new retrieval evaluation results on the SIMPLIcity data set:
| Color Histogram - JSD | 0.450 | 0.704 | 0.191 |
| Joint Histogram - JSD | 0.453 | 0.691 | 0.196 |
Note that the color histogram in the first row now performs similarly to the "good" descriptors in terms of precision at ten and error rate. Also note that a new feature crept in: Joint Histogram. This is a histogram combining pixel rank and RGB-64 color.
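For reference, the textbook formulation of the divergence can be sketched as below; the LIRE implementation may differ in normalization and weighting:

```java
// Jensen-Shannon divergence between two histograms that each sum to 1:
// JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
// It is symmetric, always finite, and 0 iff P == Q.
public class Jsd {
    public static double jsd(double[] p, double[] q) {
        double d = 0;
        for (int i = 0; i < p.length; i++) {
            double m = (p[i] + q[i]) / 2;
            // Skip zero bins: lim x->0 of x*log(x/m) is 0.
            if (p[i] > 0) d += 0.5 * p[i] * Math.log(p[i] / m);
            if (q[i] > 0) d += 0.5 * q[i] * Math.log(q[i] / m);
        }
        return d;
    }
}
```

Note how easy it is to get the ternary-precedence bug wrong in exactly this kind of per-bin accumulation, where a conditional guards each term of a sum.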
All the new stuff can be found in SVN and in the nightly builds (starting tomorrow).
I just submitted my code to the SVN and created a download for Lire 0.9.3_alpha. This version features support for Lucene 4.0, which changed quite a bit in its API. I did not have the time to test the Lucene 3.6 version against the new one, so I actually don't know which one is faster. I hope it's the new one, but I fear it's the old one.
This is a pre-release for Lire for Lucene 4.0
Global features (like CEDD, FCTH, ColorLayout, AutoColorCorrelogram and the like) have been tested and are considered working. Filters like the ReRankFilter and the LSAFilter also work. The image shows a search for 10 images with ColorLayout and the results of re-ranking the result list with (i) CEDD and (ii) LSA. Visual words (local features), metric indexes and hashing have not been touched yet, besides making them compile, so I strongly recommend not using them. However, due to a new weighting approach I assume that the visual word implementation based on Lucene 4.0 will, as soon as it is done, be much better in terms of retrieval performance.
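The re-ranking described above can be sketched in plain Java. This is a hypothetical stand-in for the actual ReRankFilter, with a toy L1 distance in place of a real feature dissimilarity:

```java
import java.util.*;

// Re-rank an initial result list with a second feature: recompute the distance
// of every hit to the query using the second feature and sort by it. The cheap
// first feature finds candidates; the second one fixes the order.
public class ReRankSketch {
    public static List<Integer> reRank(List<Integer> initialHits,
                                       Map<Integer, double[]> secondFeature,
                                       double[] queryFeature) {
        List<Integer> reRanked = new ArrayList<>(initialHits);
        reRanked.sort(Comparator.comparingDouble(
                (Integer id) -> l1(secondFeature.get(id), queryFeature)));
        return reRanked;
    }

    static double l1(double[] a, double[] b) {
        double d = 0;
        for (int i = 0; i < a.length; i++) d += Math.abs(a[i] - b[i]);
        return d;
    }
}
```

Because only the small initial result list is touched, the second, possibly more expensive feature costs almost nothing at query time.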
- Downloads at the Google Code Page
Topics of interest include, but are not limited to:
– Multimedia content analysis and understanding
– Content-based browsing, indexing and retrieval of images, video and audio
– Advanced descriptors and similarity metrics for multimedia
– Audio and music analysis, and machine listening
– Audio-driven multimedia content analysis
– 2D/3D feature extraction
– Motion analysis and tracking
– Multi-modal analysis for event recognition
– Human activity/action/gesture recognition
– Video/audio-based human behavior analysis
– Emotion-based content classification and organization
– Segmentation and reconstruction of objects in 2D/3D image sequences
– 3D data processing and visualization
– Content summarization and personalization strategies
– Semantic web and social networks
– Advanced interfaces for content analysis and relevance feedback
– Content-based copy detection
– Analysis and tools for content adaptation
– Analysis for coding efficiency and increased error resilience
– Multimedia analysis hardware and middleware
– End-to-end quality of service support
– Multimedia analysis for new and emerging applications
– Advanced multimedia applications
- Proposal for Special Sessions: 4th January 2013
- Notification of Special Sessions Acceptance: 11th January 2013
- Paper Submission: 8th March 2013
- Notification of Papers Acceptance: 3rd May 2013
- Camera-ready Papers: 24th May 2013
See http://wiamis2013.wp.mines-telecom.fr/ for more information.