Category Archives: General

Special Issue on “Multimedia Analysis with Collective Intelligence”

Multimedia analysis has attracted extensive research interest and nowadays forms the basis of a wide variety of applications and services, such as search, recommendation, advertising, and personalization. Existing technical approaches usually need to be guided by prior knowledge, such as labeled data. Collecting such knowledge is rarely easy, however, and the problem becomes even harder when we need to deal with big data. Therefore, many research efforts turn to mining this knowledge from collective intelligence. For example, crowds of grassroots users generate, annotate, and share their own data on public websites such as Facebook, Flickr, and YouTube. Collective intelligence is embedded in these data as well as in their tags, comments, and ratings, and it can be leveraged for multimedia classification, search, recommendation, and more. Moreover, the behavior of users interacting with computers and the web, such as click-through, browsing, and viewing history, also implicitly contains collective intelligence. This widely available collective intelligence offers opportunities to tackle the difficulties in multimedia analysis. This special issue is intended to bring together the strongest research efforts along this direction and introduce them to readers.

Scope

The scope of this special issue is to cover all aspects related to multimedia analysis with collective intelligence. Topics of interest include, but are not limited to:

  • Automatic multimedia data collection and labeling.
  • Interactive multimedia data collection and labeling.
  • Label denoising and refinement.
  • Multimedia feature learning with collective intelligence, including global and local feature extraction, keypoint detection, visual vocabulary construction, feature selection, etc.
  • Multimedia modeling with collective intelligence and applications, including classification, clustering, recommendation, etc.
  • User behavior modeling and mining.
  • Social media pattern recognition and mining.
  • Novel crowdsourcing systems, techniques, and interfaces.

Information for Authors

Authors should prepare their manuscript according to the Guide for Authors available from the online submission page of the ‘Journal of Visual Communication and Image Representation’ at http://ees.elsevier.com/jvci/. When submitting via this page, please select “VSI:CollectiveIntelligence” as the Article Type. Prospective authors should submit high-quality, original manuscripts that have not appeared in, nor are under consideration by, any other journal. All submissions will be peer reviewed following the JVCI reviewing procedures.

Important Dates

  • Manuscript Submission Deadline: February 28, 2017
  • Notification of Acceptance/Rejection: August 15, 2017
  • Final Manuscript Due to JVCI: August 31, 2017
  • Expected Publication Date: December 2017

See also the call for submissions.

ICMR 2017 Deadline Approaching

This year, for the first time, there is an Open Source Software Track at ICMR 2017, and the deadline is approaching fast: January 27, 2017.

ICMR 2017 is calling for papers presenting significant and innovative open source contributions of researchers and software developers to advance the field of multimedia retrieval by providing the community with implementations of middleware, frameworks, toolkits, libraries, players, authoring tools, and other multimedia software. We strongly believe that making available open source resources significantly advances the field by providing a common set of tools for (i) building upon previous work, (ii) improving multimedia research prototypes and (iii) allowing others to replicate research results more easily.


Lire 1.0b4 released

In the wake of integrating Lire into Solr 6.3 and updating it to Java 8, we have put a milestone online: Lire 1.0 beta 4. Building on the last released version (beta 2), it now has a more robust implementation of the metric spaces plugin, a new extraction tool that stores extracted features as Base64-encoded strings, and an indexing tool that can read those strings and add hashes for faster search. As outlined in the last post, Gradle is now the main build system, which makes using Lire for development a whole lot easier.
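The idea of storing extracted features as Base64 strings can be illustrated with plain Java's standard Base64 codec (a minimal sketch, not the actual extraction tool; the feature bytes are made up):

```java
import java.util.Arrays;
import java.util.Base64;

public class FeatureAsBase64 {
    public static void main(String[] args) {
        // A made-up "extracted feature vector" as raw bytes.
        byte[] feature = {12, 34, 56, 78};
        // Encode it for storage in a text file or index field ...
        String stored = Base64.getEncoder().encodeToString(feature);
        // ... and decode it again at indexing time.
        byte[] restored = Base64.getDecoder().decode(stored);
        System.out.println(stored + " -> " + Arrays.toString(restored));
    }
}
```

Encoding binary vectors as text this way makes them safe to pass through text-oriented formats such as XML at the cost of roughly a third more space.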


Gradle as Build System for LIRE & LireSolr

In the process of updating to Lucene version 6.3.0, we switched the main build system to Gradle. The source code is now better organized, following the Maven convention of dividing sources into main and test and keeping resources separate. Gradle allows the use of Maven repositories, automates most tasks out of the box, and can easily be extended using the Groovy programming language.

Gradle also comes with a wrapper, so as long as Java is installed, the build process is fully automated and needs no additional software.

Moreover, build.gradle files can easily be used to import projects into JetBrains IntelliJ IDEA. LireDemo and SimpleApplication are subprojects with their own build.gradle files. Check out the new structure at https://github.com/dermotte/lire.
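As a rough illustration of what such a build script can look like, here is a hypothetical minimal build.gradle (module coordinates and versions are made up for this sketch, not taken from the actual repository):

```groovy
// Hypothetical build.gradle sketch; coordinates and versions are illustrative.
apply plugin: 'java'

repositories {
    mavenCentral()          // Maven repositories are supported out of the box
}

dependencies {
    compile 'org.apache.lucene:lucene-core:6.3.0'   // version illustrative
    testCompile 'junit:junit:4.12'
}
```

With the wrapper checked into the repository, running `./gradlew build` downloads the matching Gradle version and runs the build, so only a Java installation is needed.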

CBMI 2017 – Call For Papers

CBMI aims at bringing together the various communities involved in all aspects of content-based multimedia indexing for retrieval, browsing, management, visualization and analytics.

Authors are encouraged to submit previously unpublished research papers in the broad field of content-based multimedia indexing and applications. In addition to multimedia and social media search and retrieval, we wish to highlight related and equally important issues that build on content-based indexing, such as multimedia content management, user interaction and visualization, media analytics, etc.

Deadlines

  • Full/short paper submission deadline: February 28, 2017
  • Demo paper submission deadline: February 28, 2017
  • Special Session proposals submission: November 16, 2016

more information …

ICMR 2017 – Call for Open Source Software Papers

ICMR 2017 is calling for papers presenting significant and innovative open source contributions that advance the field of multimedia retrieval, including implementations of middleware, frameworks, toolkits, libraries, players, authoring tools, and other multimedia software (see the full call text in the post above).

Each open source software paper should not be longer than 4 pages.

Important Dates

Paper Submission: January 27, 2017
Notification of Acceptance: March 29, 2017
Camera-Ready Papers Due: April 26, 2017

More information …

LIRE Use Case “What Anime is this?”

I do not often hear of applications built with LIRE, but when I do, I really appreciate it. The use case of What Anime is this? is exceptional in many ways. First of all, LIRE is applied very well there and really solves a problem; second, Soruly Ho tuned it to search through over 360 million images on a single server with remarkably reasonable response times.

The web page built by Soruly Ho provides a search interface to (re-)find frames in anime videos. Not being into anime myself, I still know that it is hand-drawn or computer animation and that it is hugely popular among fans … and there are a lot of them.

Soruly Ho was kind enough to compile some background information on his project:

Thanks to the LIRE Solr Integration Project, I was able to develop the first prototype just 12 hours after I met LIRE, without touching a line of the source code! After setting up the web server and Solr, I just had to write a few scripts to put all the pieces together. To analyze the video, I use ffmpeg to extract each frame as a jpg file with the timecode as the file name. Then, ParallelSolrIndexer analyzes all these images and generates an XML file. Before loading this XML into Solr, I use a Python script to put the video path and timecode into the title field. Finally, I wrote a few lines of JavaScript that use the Solr REST API to submit the image URL to the LireRequestHandler. After some magic, it returns a list of matching images sorted by similarity, with the original video path and timecode in the title field. The idea is pretty simple. Every developer can build this.

But scaling is challenging. There are over 15,000 hours of video indexed in my search engine. Assuming they are all 24 fps, there would be 1.3 billion frames in total. This is too big to fit on my server (which is just a high-end PC). Video always plays forward in time, so I use a running window to remove duplicate frames. Unlike real-life video, most anime is actually drawn at 12 fps or less, so this method reduces the number of frames by about 70%. Out of the many feature classes supported by LIRE, I only use the Color Layout Descriptor and drop the others to save space, memory and computation time for analysis. Now, each analyzed frame in my Solr index occupies only 197 bytes. Still, relying solely on one image descriptor already achieves very high accuracy.

Even after such optimization, the remaining 366 million frames are still so many that queries would often time out. So I studied and modified a little bit of the LireRequestHandler. (It is great that LIRE is free and open source!) Instead of using the performance-killing BooleanClause.Occur.SHOULD, I search the hashes with BooleanClause.Occur.MUST one by one until a good match is found. I am only interested in images with similarity > 90%, i.e. there should be at least one common hash if I select 10 out of 100 hash values at random. The search completes in at most 10 iterations; otherwise, I assume there is no match. But randomness is not good, because results are inconsistent and thus cannot be cached. So I ran an analysis of the hash distribution and now always start searching from the least populated hash, so that the similarity calculation is performed on a smaller set of images. The Color Layout Descriptor does not produce an evenly distributed hash on anime: the least populated hash matches only a few frames, while the most populated hash matches over 277 million frames. The last performance issue is keeping a 67.5 GB index with just 32 GB RAM, which I think can be solved simply with more RAM.

The actual source I have modified, along with my hash distribution table, can be found on GitHub.
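The search strategy described above, probing the query's hashes one at a time starting from the least populated and running the expensive similarity computation only on the resulting small candidate set, can be sketched in plain Java. This is a simplified illustration, not the actual modified LireRequestHandler; the index layout and all names are made up:

```java
import java.util.*;

public class HashFirstSearch {
    // Hypothetical inverted index: hash value -> ids of frames containing it.
    static Map<Integer, List<Integer>> index = new HashMap<>();

    // Probe the query's hashes one at a time, least populated hash first,
    // stopping as soon as a non-empty posting list is found.
    static List<Integer> candidates(int[] queryHashes, int maxProbes) {
        Integer[] order = Arrays.stream(queryHashes).boxed().toArray(Integer[]::new);
        // Sort hashes by posting-list length, ascending (least populated first).
        Arrays.sort(order, Comparator.comparingInt(
                (Integer h) -> index.getOrDefault(h, Collections.emptyList()).size()));
        for (int i = 0; i < Math.min(maxProbes, order.length); i++) {
            List<Integer> postings = index.get(order[i]);
            if (postings != null && !postings.isEmpty()) {
                return postings; // small set; run full similarity only on these
            }
        }
        return Collections.emptyList(); // no match after maxProbes probes
    }

    public static void main(String[] args) {
        index.put(7, Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8)); // "popular" hash
        index.put(42, Arrays.asList(3));                      // rare hash
        // The query contains both hashes; the rare one is probed first.
        System.out.println(candidates(new int[]{7, 42}, 10)); // prints [3]
    }
}
```

Starting from the rarest hash keeps the candidate set, and therefore the number of full similarity computations, as small as possible.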

You can try What Anime is this? yourself at https://whatanime.ga/. Thanks to Soruly Ho for sharing his thoughts and building this great search engine!

CBMI 2016 Deadline extended to March 7, 2016

The 14th International Workshop on Content-based Multimedia Indexing aims at bringing together the various communities involved in all aspects of content-based multimedia indexing for retrieval, browsing, visualization and analytics.

In addition to multimedia and social media search and retrieval, we wish to highlight related and equally important issues that build on content-based indexing, such as multimedia content management, user interaction and visualization, media analytics, etc.

Find the call and the new dates at http://cbmi2016.upb.ro/

14th CBMI Deadlines Approaching

CBMI aims at bringing together the various communities involved in all aspects of content-based multimedia indexing for retrieval, browsing, visualization and analytics.

In addition to multimedia and social media search and retrieval, we wish to highlight related and equally important issues that build on content-based indexing, such as multimedia content management, user interaction and visualization, media analytics, etc.

Additional special sessions are planned in areas such as deep learning, medical image retrieval, and eLearning.

Deadlines

  • February 1: Full/short paper submission deadline
  • February 1: Special session paper submission deadline
  • February 29: Demo paper submission deadline

A search runtime analysis of LIRE on 500k images

Run time for search in LIRE heavily depends on the method used for indexing and search. There are two main ways to store the data, two strategies for linear search, and of course approximate indexing. The two storage strategies are (i) storing the actual feature vector in a Lucene text field and (ii) using the Lucene DocValues data format. While the former allows for easy access, more flexibility, and compression, the latter is much faster when accessing raw byte[] data. Linear search then needs to open each and every document and compare the query vector to the one stored in the document. For linear search on Lucene text fields, caching boosts performance: the byte[] data of the feature vectors is read once from the index and kept in memory. The DocValues storage format is fast enough to allow for linear search directly. With approximate indexing, a query string is run against the inverted index and only the first k best-matching candidates are re-ranked by linear search to find the n << k actual results. So a text search is done first, then a linear search over far fewer images [1]. In our tests we used k=500 and n=10.
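The cached linear-search strategy can be sketched in plain Java (a simplified illustration, not LIRE's actual code; the distance measure and data layout are placeholders):

```java
import java.util.*;

public class LinearScan {
    // In-memory cache of feature vectors, as read once from the index.
    static byte[][] cache;

    // Simple L1 distance between two byte vectors (placeholder measure).
    static int distance(byte[] a, byte[] b) {
        int d = 0;
        for (int i = 0; i < a.length; i++) d += Math.abs(a[i] - b[i]);
        return d;
    }

    // Linear search: compare the query against every cached vector and
    // return the ids of the n nearest documents.
    static int[] search(byte[] query, int n) {
        Integer[] ids = new Integer[cache.length];
        for (int i = 0; i < ids.length; i++) ids[i] = i;
        Arrays.sort(ids, Comparator.comparingInt(
                (Integer i) -> distance(query, cache[i])));
        int[] result = new int[Math.min(n, ids.length)];
        for (int i = 0; i < result.length; i++) result[i] = ids[i];
        return result;
    }

    public static void main(String[] args) {
        cache = new byte[][]{{0, 0}, {5, 5}, {1, 1}};
        // Nearest two vectors to (0,0) are documents 0 and 2.
        System.out.println(Arrays.toString(search(new byte[]{0, 0}, 2))); // prints [0, 2]
    }
}
```

A real implementation would keep a bounded priority queue of size n instead of sorting all documents, but the core cost is the same: one distance computation per indexed document, which is exactly what approximate indexing tries to avoid.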

Tests on 499,207 images have shown that at this order of magnitude approximate search already outperforms linear search. The following numbers are given in ms per search. Note that the average value per search differs between runs of different lengths due to the context of the runs, i.e. the state of the Java VM, OS processes, file system caches, etc., but the trend is clear.

Method                                      avg. of 10 runs   avg. of 100 runs
Cached linear search on text fields (*)     969.8 ms          867.5 ms
Not cached linear search on text fields     5,090.7 ms        n.a.
Linear search on DocValues                  636.3 ms          634.1 ms
Approximate search by Metric Spaces (**)    523.9 ms          370.4 ms

(*) Start-up latency when filling the cache was 6.098 seconds

(**) Recall with 10 results was 0.76 over ten runs and 0.72 over 100 runs

In conclusion, at nearly 500,000 images the DocValues approach might be the best choice, as approximate indexing loses around 25% of the results in terms of recall while not boosting runtime performance that much. Further optimizations would include, for instance, query bundling or index splitting combined with multithreading.

[1] Gennaro, Claudio, et al. “An approach to content-based image retrieval based on the Lucene search engine library.” Research and Advanced Technology for Digital Libraries. Springer Berlin Heidelberg, 2010. 55-66.

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License. (c) Mathias Lux, 2015