CfP: 11th International Workshop on Text-based Information Retrieval

Intelligent algorithms for data mining and information retrieval are key technologies for meeting the information needs of our media-centered society. Methods for text-based information retrieval receive special attention because of the important role of written text, the high availability of the World Wide Web, and the enormous impact of Web communities and social media on our lives.

The development of advanced information retrieval solutions requires the understanding and the combination of methods from different research areas, including machine learning, data mining, computational linguistics, artificial intelligence, user interaction and modeling, Web engineering, and distributed systems. This workshop provides a common platform for presenting and discussing new solutions, novel ideas, or specific tools focusing on text-based information retrieval. Contributions are welcome on the following classic and ongoing topics, among others:

  • Theory. Retrieval models, language models, similarity measures, formal analysis
  • Web Search. Ranking, indexing, semantic search, query classification and segmentation, relevance feedback, vertical search
  • Personalization and User Mining. Just-in-time retrieval, personalized retrieval, context detection, profile mining
  • Multilinguality. Cross-language retrieval, machine translation, language identification
  • Evaluation. Corpus construction, experiment design, performance measures
  • Text Mining and Classification. Web mining, text reuse, topic identification, sentiment analysis
  • NLP. Information extraction, text summarization and simplification, named entity recognition, question answering
  • Social Media Analysis. Community mining, social network analysis, trend analysis, information diffusion
  • Information Quality. Text quality assessment, quality-based ranking, readability assessment, trust and author reputation
  • Big Data Text Analytics. Parallel and distributed retrieval, online algorithms, scalability
  • Semantic Web. Meta data analysis and tagging, knowledge extraction, inference, maintenance

The workshop is being held for the eleventh time. In the past, it was characterized by a stimulating atmosphere, and it attracted high-quality contributions from all over the world.

Accepted papers will appear in the proceedings of DEXA’14 Workshops published by the Conference Publishing Services (CPS) of IEEE Computer Society.

Submission Details

  • Submissions to TIR 2014 must be original, unpublished contributions.
  • Papers are limited to 5 pages in IEEE format (two columns, A4) and must be written in English.
  • Submission is made electronically in PDF format using our conference management system ConfDriver.
  • Submitted papers will be peer-reviewed by at least three experts from the related field.
  • At least one author of each accepted paper is required to register for the DEXA’14 conference, attend the workshop, and present the paper.

Important Dates

  • April 24, 2014: Deadline for paper submission (24:00 CET)
  • May 12, 2014: Notification to authors
  • May 20, 2014: Camera-ready copy due
  • September 1 – 5, 2014: DEXA’14 conference

Organizing Committee

  • Maik Anderka (Co-Chair), University of Paderborn, Germany
  • Michael Granitzer (Co-Chair), University of Passau, Germany
  • Benno Stein (Co-Chair), Bauhaus-Universität Weimar, Germany

See also http://tir.webis.de/

ACM MMSys 2014 Dataset Track – Approaching Submission Deadline

As an integral part of the ACM MMSys conference since 2011, the Dataset Track provides an opportunity for researchers and practitioners to make their work available (and citable) to the multimedia community. MMSys encourages and recognizes dataset sharing, and seeks contributions in all areas of multimedia (not limited to MM systems). Authors publishing datasets will benefit by increasing the public awareness of their effort in collecting the datasets.

Submission deadline is Nov. 11th 2013! Make sure not to miss it! See also the Call for Papers

 

ACM Multimedia Presentation & LIRE Solr

Today I gave a talk on LIRE at the ACM Multimedia conference in the open source software competition, currently taking place in Barcelona. It gave me the opportunity to present a local installation of the LIRE Solr plugin and the possibilities thereof. Find the slides of the talk at slideshare: LIRE presentation at the ACM Multimedia Open Source Software Competition 2013

The Solr plugin itself is fully functional for Solr 4.4 and the source is available at https://bitbucket.org/dermotte/liresolr. The markdown document README.md explains what can be done with the plugin and how to install it. In short, it supports content-based search and content-based re-ranking of text searches, and it brings along a custom field implementation and sublinear search based on hashing.
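To illustrate why hashing can make search sublinear, here is a minimal Java sketch of a generic random-hyperplane hash index. The post does not say which hashing scheme the plugin uses, so this is a swapped-in textbook technique and not the plugin's code; `HashIndexSketch` and all of its methods are hypothetical names. Vectors that fall into the same bucket become candidates, and the rest of the index is never touched.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

/** Illustrative only: a generic random-hyperplane hash index, not the LIRE Solr plugin's code. */
final class HashIndexSketch {
    // One random hyperplane per hash bit (assumption: at most 31 bits, so the code fits an int).
    private final double[][] hyperplanes;
    // Maps a hash code to the ids of all vectors that produced it.
    private final Map<Integer, List<Integer>> buckets = new HashMap<>();

    HashIndexSketch(int dimensions, int bits, long seed) {
        Random random = new Random(seed);
        hyperplanes = new double[bits][dimensions];
        for (double[] plane : hyperplanes) {
            for (int d = 0; d < dimensions; d++) {
                plane[d] = random.nextGaussian();
            }
        }
    }

    /** Each bit records on which side of one hyperplane the vector lies. */
    int hash(double[] vector) {
        int code = 0;
        for (int bit = 0; bit < hyperplanes.length; bit++) {
            double dot = 0;
            for (int d = 0; d < vector.length; d++) {
                dot += hyperplanes[bit][d] * vector[d];
            }
            if (dot >= 0) {
                code |= 1 << bit;
            }
        }
        return code;
    }

    /** Stores the id in the bucket selected by the vector's hash code. */
    void add(int id, double[] vector) {
        buckets.computeIfAbsent(hash(vector), k -> new ArrayList<>()).add(id);
    }

    /** Returns only the ids that share the query's bucket. */
    List<Integer> candidates(double[] query) {
        List<Integer> hit = buckets.get(hash(query));
        return hit == null ? Collections.<Integer>emptyList() : hit;
    }
}
```

In a real system the candidates would then be re-ranked with the exact feature distance, and several hash tables would typically be combined to reduce missed neighbors.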

Lire 0.9.4 beta 2 released

The beta update features (i) improvements to local feature handling, i.e., stronger quantization of local feature histograms and several bug fixes, (ii) critical bug fixes for CEDD and JCD, which were not thread-safe, and (iii) improvements to the ParallelExtractor and Indexor classes as well as the intermediate binary format.

Links

CfP: ACM MMSys 2014 Dataset Track

The ACM Multimedia Systems conference (http://www.mmsys.org) provides a forum for researchers, engineers, and scientists to present and share their latest research findings in multimedia systems. While research about specific aspects of multimedia systems is regularly published in the various proceedings and transactions of the networking, operating system, real-time system, and database communities, MMSys aims to cut across these domains in the context of multimedia data types. This provides a unique opportunity to view the intersections and interplay of the various approaches and solutions developed across these domains to deal with multimedia data types. Furthermore, MMSys provides an avenue for communicating research that addresses multimedia systems holistically.

As an integral part of the conference since 2011, the Dataset Track provides an opportunity for researchers and practitioners to make their work available (and citable) to the multimedia community. MMSys encourages and recognizes dataset sharing, and seeks contributions in all areas of multimedia (not limited to MM systems). Authors publishing datasets will benefit by increasing the public awareness of their effort in collecting the datasets.

In particular, authors of datasets accepted for publication will receive:

  • Dataset hosting from MMSys for at least 5 years
  • Citable publication of the dataset description in the proceedings published by ACM
  • 15 minutes oral presentation time at the MMSys 2014 Dataset Track

All submissions will be peer-reviewed by at least two members of the technical program committee of MMSys 2014. Datasets will be evaluated by the committee on the basis of the collection methodology and the value of the dataset as a resource for the research community.

Submission Guidelines 

Authors interested in submitting a dataset should

(A) Make their data available by providing a public URL for download

(B) Write a short paper describing:

  1. motivation for data collection and intended use of the data set,
  2. the format of the data collected, 
  3. the methodology used to collect the dataset, and 
  4. basic characterizing statistics from the dataset.

Papers should be at most 6 pages long (in PDF format) prepared in the ACM style and written in English.

Important dates

  • Data set paper submission deadline: November 11, 2013
  • Notification: December 20, 2013
  • MMSys conference: March 19 – 21, 2014

MMSys Datasets

Previously accepted datasets can be accessed at

Contact

For further queries and extra information, please contact us at mlux@itec.uni-klu.ac.at. Most recent information can be found on http://www.mmsys.org

2013-07-07 (ml): Updated URLs and “2011”

LIRE Lucene 4.x Performance

While I knew that performance did not skyrocket with Lucene 4.0, I finally got around to finding out why. Unfortunately, the field compression technique applied in Lucene 4.x compresses each and every stored field … and decompresses it upon access. This adds considerable overhead when reading the index in a linear way, which is exactly one of the main methods of LIRE.

CompressedFields

The image shows a screenshot of the CPU sampler in VisualVM. 58.7% of the CPU time goes to the LZ4 decompression routine. That’s quite a lot and makes a huge difference for search. If anyone has a workaround of sorts, I’d be happy :)

Update (2013-07-03): With the great help of the people from the lucene-user list I found at least a speed-up. In the current SVN version, there is a novel LireCustomCodec for stored fields, which speeds up decompression a lot. Moreover, there is now an in-memory caching approach implemented in the GenericFastImageSearcher class. It is turned off by default, but it reduces search time by holding image features in memory, at the cost of more memory and a longer init time. It has been tested with up to 1.5M images.
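The caching trade-off can be illustrated with a minimal Java sketch. It uses neither LIRE nor Lucene classes: `FeatureCacheSketch`, `encode`, `decode`, and the byte layout are hypothetical and stand in for whatever the stored fields actually contain. The idea shown is only that each stored feature is decoded once at start-up, so the linear scan at query time works on plain arrays instead of paying the decoding cost per document and per query.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: not the LIRE GenericFastImageSearcher implementation. */
final class FeatureCacheSketch {
    // Decoded feature vectors, held in memory after a one-time load.
    private final List<float[]> cache = new ArrayList<>();

    /** Pays the decoding cost once per document (more init time and memory). */
    FeatureCacheSketch(List<byte[]> storedFields) {
        for (byte[] stored : storedFields) {
            cache.add(decode(stored));
        }
    }

    /** Hypothetical stored form: a feature vector serialized as consecutive floats. */
    static byte[] encode(float[] feature) {
        ByteBuffer buffer = ByteBuffer.allocate(feature.length * Float.BYTES);
        for (float value : feature) {
            buffer.putFloat(value);
        }
        return buffer.array();
    }

    /** Stands in for the expensive read-and-decompress step. */
    static float[] decode(byte[] stored) {
        ByteBuffer buffer = ByteBuffer.wrap(stored);
        float[] feature = new float[stored.length / Float.BYTES];
        for (int i = 0; i < feature.length; i++) {
            feature[i] = buffer.getFloat();
        }
        return feature;
    }

    /** Linear scan over the cached vectors; returns the index of the closest one (L1 distance). */
    int nearest(float[] query) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < cache.size(); i++) {
            double distance = l1(query, cache.get(i));
            if (distance < bestDistance) {
                bestDistance = distance;
                best = i;
            }
        }
        return best;
    }

    static double l1(float[] a, float[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}
```

Without the cache, `decode` would run inside the scan loop for every query, which corresponds to the decompression hot spot visible in the profiler.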

Open Source Software Competition of the ACM MM 2013

The ACM Multimedia Open-Source Software Competition celebrates the invaluable contribution of researchers and software developers who advance the field by providing the community with implementations of codecs, middleware, frameworks, toolkits, libraries, applications, and other multimedia software. This is the sixth year the competition is run as part of the ACM Multimedia program.

To qualify, software must be provided with source code and licensed in such a manner that it can be used free of charge in academic and research settings. For the competition, the software will be built from the sources. All source code, license, installation instructions and other documentation must be available on a public web page. Dependencies on non-open source third-party software are discouraged (with the exception of operating systems and commonly found commercial packages available free of charge). To encourage more diverse participation, previous years’ non-winning entries are welcome to re-submit for the 2013 competition. Student-led efforts are particularly encouraged.

Authors are highly encouraged to prepare as much documentation as possible, including examples of how the provided software might be used, download statistics or other public usage information, etc. Entries will be peer-reviewed to select entries for inclusion in the conference program as well as an overall winning entry, to be recognized formally at ACM Multimedia 2013. The criteria for judging all submissions include broad applicability and potential impact, novelty, technical depth, demo suitability, and other miscellaneous factors (e.g., maturity, popularity, student-led, no dependence on closed source, etc.).

Authors of the winning entry, and possibly additional selected entries, will be invited to demonstrate their software as part of the conference program. In addition, accepted overview papers will be included in the conference proceedings.

more information …

Important Dates

  • Open Source Software Submission Deadline: May 13, 2013
  • Notification of Acceptance: June 30, 2013

Lire web demo updated – two additional global features

The LIRE web demo now includes an RGB color histogram as well as the MPEG-7 edge histogram implementation. The color histogram works well, for instance, for line art, such as this query. The edge histogram works fine for clear, global edge distributions in queries such as this one. However, it performs differently from PHOG. An example of the difference is this PHOG query compared to the corresponding edge histogram query. The image below shows both queries.

PHOG-EH
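For readers unfamiliar with global color features, the following is a minimal Java sketch of an RGB color histogram and an L1 comparison between two histograms. It is not the LIRE implementation: `RgbHistogramSketch`, the packed 0xRRGGBB pixel format, and the number of bins per channel are assumptions made for illustration.

```java
/** Illustrative only: a plain quantized RGB histogram, not LIRE's implementation. */
final class RgbHistogramSketch {

    /**
     * Builds a normalized histogram from packed 0xRRGGBB pixels,
     * using `bins` quantization levels per color channel.
     */
    static double[] histogram(int[] pixels, int bins) {
        double[] hist = new double[bins * bins * bins];
        for (int pixel : pixels) {
            // Map each 0..255 channel value to a bin index in 0..bins-1.
            int r = ((pixel >> 16) & 0xFF) * bins / 256;
            int g = ((pixel >> 8) & 0xFF) * bins / 256;
            int b = (pixel & 0xFF) * bins / 256;
            hist[(r * bins + g) * bins + b] += 1.0;
        }
        // Normalize so that images of different sizes remain comparable.
        if (pixels.length > 0) {
            for (int i = 0; i < hist.length; i++) {
                hist[i] /= pixels.length;
            }
        }
        return hist;
    }

    /** L1 distance: 0 for identical color distributions, 2 for fully disjoint ones. */
    static double l1Distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return sum;
    }
}
```

Because such a histogram ignores where colors occur in the image, it captures the overall color distribution only; edge-based features such as the edge histogram or PHOG describe shape information instead.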