Face Detection in Java

March 15, 2012 on 1:21 pm | In Dev, Java, Multimedia | 4 Comments

Face detection is a common task in image retrieval and management. However, finding a stable, well-maintained and free-to-use Java library for face detection may prove hard. The OpenIMAJ project implements a common approach and yields rather fine results. However, the packaged version of all the JARs used in OpenIMAJ is quite a bunch of classes, making up a 30 MB JAR file.

For those of you just interested in face detection, I compiled and packaged the classes needed for this task in a ~5 MB file. Finding the faces with this library is then a task of just three lines of code:

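// Load the image as a multi-band float image
MBFImage image = ImageUtilities.readMBF(new FileInputStream("image.jpg"));
// Haar cascade detector; 80 is the minimum detection size in pixels
FaceDetector<DetectedFace,FImage> fd = new HaarCascadeDetector(80);
// Detection runs on the grayscale intensity image
List<DetectedFace> faces = fd.detectFaces(Transforms.calculateIntensity(image));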

All the imports needed along with their dependencies are packaged in the facedetect-openimaj.jar file (see archive below).

Files

 

Extracting SURF features – convenience command line utility for Windows

March 2, 2012 on 4:36 pm | In Dev, General, Multimedia, Software | No Comments

Sometimes you just need a small command line utility to extract some local features from an image … and you have no time to set up and compile OpenCV right now. Here’s the solution: I did the task (actually for my students and for me, but you might still use it :)

The utility is absolutely basic stuff. Just start “extractSurf.exe” on Windows 7, give it an image as the first parameter, and it will print the SURF feature descriptors to stdout, each headed by the x and y coordinates and the response value. Source – of course – is also provided … but it’s not magic. It’s all about the convenience of the binary.
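A minimal sketch of how that output could be read back in Java. It assumes each stdout line holds one feature as whitespace-separated numbers in the order x, y, response, descriptor values. The separator is an assumption, and the names SurfLineParser, SurfFeature and parseLine are made up for illustration, so check the actual output of the tool before relying on this.

```java
public class SurfLineParser {

    /** One parsed feature: position, detector response and descriptor vector. */
    public static final class SurfFeature {
        public final double x;
        public final double y;
        public final double response;
        public final double[] descriptor;

        SurfFeature(double x, double y, double response, double[] descriptor) {
            this.x = x;
            this.y = y;
            this.response = response;
            this.descriptor = descriptor;
        }
    }

    /**
     * Parses one output line. ASSUMPTION: values are separated by whitespace
     * and ordered as x, y, response, followed by the descriptor values.
     */
    public static SurfFeature parseLine(String line) {
        String[] tokens = line.trim().split("\\s+");
        if (tokens.length < 4) {
            throw new IllegalArgumentException("Expected x, y, response and at least one descriptor value");
        }
        double x = Double.parseDouble(tokens[0]);
        double y = Double.parseDouble(tokens[1]);
        double response = Double.parseDouble(tokens[2]);
        // Everything after the first three numbers belongs to the descriptor.
        double[] descriptor = new double[tokens.length - 3];
        for (int i = 0; i < descriptor.length; i++) {
            descriptor[i] = Double.parseDouble(tokens[i + 3]);
        }
        return new SurfFeature(x, y, response, descriptor);
    }

    public static void main(String[] args) {
        // Made-up sample line with a shortened 4-value descriptor.
        SurfFeature f = parseLine("12.5 40.0 1530.2 0.01 -0.02 0.30 0.07");
        System.out.println("x=" + f.x + " y=" + f.y + " dims=" + f.descriptor.length);
    }
}
```

To process a whole image, the stdout of the utility could be redirected to a file and fed line by line into parseLine.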

Links to the OpenCV wiki on how to compile the stuff are provided in a small README in the source archive.

Links

Nice but out of reach: popular multimedia platforms

March 1, 2012 on 3:49 pm | In General, Multimedia | 1 Comment

Netflix was reported last year to be the source of nearly 30% of the North American internet backbone traffic. Well, that’s impressive, but it’s something that many non-North Americans can’t relate to … and there’s a simple reason for that: the service is not available in many countries. Several well-known and well-received services are restricted to ranges of IP addresses that are considered to be in a geographic location where users have access to these services. Here is a small but still interesting list of services that obviously have an impact on the usage of the internet, but cannot be accessed in many European countries.

  • Netflix – major video streaming service (subscription based)
  • Pandora – music streaming service / adaptive online radio (ad supported)
  • Hulu – major video streaming service of already aired TV content (ad supported)
  • Vevo – music video streaming service (ad supported). Most of the music videos on Vevo are available on YouTube for Austrians, but most of these music videos are not accessible to Germans.
  • NBC – video streaming service of already aired NBC TV content.
  • ABC – video streaming service of already aired ABC TV content.

 

LIRe presentation and poster at ACM MM 2011

November 29, 2011 on 11:18 pm | In Conference, Dev, General, Multimedia, Software | No Comments

Just finished my presentation at ACM MM’s open source competition in 2011. Many interested researchers and developers came by to discuss ideas and developments. I’m looking forward to turning many of those ideas into code ;)

For those of you interested in the poster, I uploaded it here.

I also uploaded the presentation to slideshare.

Lire and Lire Demo v 0.9 released

October 20, 2011 on 12:37 pm | In Dev, General, Java, Multimedia, Software | No Comments

I just released Lire and Lire Demo in version 0.9 on sourceforge.net. Basically it’s the alpha version with additional speed and stability enhancements for bag of visual words (BoVW) indexing. While this has already been possible in earlier versions, I refurbished vocabulary creation (k-means clustering) and indexing to support up to 4 CPU cores. I also integrated a function to add documents to BoVW indexes incrementally. The major changes since Lire 0.8 include:

  • Major speed-up due to change and re-write of indexing strategies for local features
  • Auto color correlogram and color histogram features improved
  • Re-ranking filter based on global features and LSA
  • Parallel bag of visual words indexing and search supporting SURF and SIFT including incremental index updates (see also in the wiki)
  • Added functionality to Lire Demo including support for new Lire features and a new result list view
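For readers new to bag of visual words, here is a minimal sketch of the quantization step: each local descriptor (SURF or SIFT) is assigned to its nearest vocabulary entry, that is a k-means cluster center, and the image is then described by a histogram of word counts. This is a generic illustration and not Lire’s actual code; the class and method names are made up, and squared Euclidean distance is an assumption.

```java
public class BagOfVisualWords {

    /** Index of the vocabulary entry (cluster center) closest to the descriptor. */
    public static int nearestWord(double[] descriptor, double[][] vocabulary) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int w = 0; w < vocabulary.length; w++) {
            // Squared Euclidean distance is enough for finding the minimum.
            double dist = 0;
            for (int i = 0; i < descriptor.length; i++) {
                double diff = descriptor[i] - vocabulary[w][i];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = w;
            }
        }
        return best;
    }

    /** Counts how often each visual word occurs among the descriptors of one image. */
    public static int[] histogram(double[][] descriptors, double[][] vocabulary) {
        int[] counts = new int[vocabulary.length];
        for (double[] d : descriptors) {
            counts[nearestWord(d, vocabulary)]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Toy 2-dimensional data; real SURF descriptors have 64 dimensions.
        double[][] vocabulary = {{0, 0}, {10, 10}};
        double[][] descriptors = {{1, 1}, {9, 8}, {0, 2}};
        System.out.println(java.util.Arrays.toString(histogram(descriptors, vocabulary)));
    }
}
```

With a fixed vocabulary, a new image only needs this quantization step, which is the general reason documents can be added to such an index without re-clustering.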

Download and try:

Lire Demo 0.9 alpha 2 just released

August 5, 2011 on 11:41 am | In Dev, Java, Multimedia, Software | No Comments

Finally I found some time to go through Lire and fix several of the (for me) most annoying bugs. While this is still a work in progress, I have uploaded a preview with the demo to sf.net. New features are:

  • Auto Color Correlogram and Color Histogram features improved
  • Re-ranking based on different features supported
  • Enhanced results view
  • Much faster indexing (parallel, use -server switch for your JVM)
  • Much faster search (re-write of the search code in Lire)
  • New developer menu for faster switching of search features
  • Re-ranking of results based on latent semantic analysis

You can find the updated Lire Demo along with a Windows launcher here. Mac and Linux users, please run it using “java -jar … ” or double click (if your window manager supports actions like that :)

The source is — of course — GPL and available in the SVN.

Final Call for Papers: Special Issue on Searching Speech

February 7, 2011 on 4:00 pm | In CfP, Conference, Multimedia | 2 Comments

ACM Transactions on Information Systems is soliciting contributions to a special issue on the topic of “Searching Speech”. The special issue will be devoted to algorithms and systems that use speech recognition and other types of spoken audio processing techniques to retrieve information, and, in particular, to provide access to spoken audio content or multimedia content with a speech track.

Submission Deadline: 1 March 2011

The field of spoken content indexing and retrieval has a long history dating back to the development of the first broadcast news retrieval systems in the 1990s. More recently, however, work on searching speech has been moving towards spoken audio that is produced spontaneously and in conversational settings. In contrast to the planned speech that is typical for the broadcast news domain, spontaneous, conversational speech is characterized by high variability and the lack of inherent structure. Domains in which researchers face such challenges include: lectures, meetings, interviews, debates, conversational broadcast (e.g., talk-shows), podcasts, call center recordings, cultural heritage archives, social video on the Web, spoken natural language queries and the Spoken Web.
We invite the submission of papers that describe research in the following areas:

  • Integration of information retrieval algorithms with speech recognition and audio analysis techniques
  • Interfaces and techniques to improve user interaction with speech collections
  • Indexing diverse, large scale collections
  • Search effectiveness and efficiency, including exploitation of additional information sources

For more information see http://tois.acm.org/announcement.html

ACM Multimedia Call for Volunteers out …

September 21, 2010 on 10:19 am | In CfP, Conference, General, Multimedia | No Comments

ACM Multimedia – taking place in Firenze, Italy, in the last week of Oct. – is still in need of student volunteers. I personally think this is a great opportunity to learn about big conferences and the community. If you are interested, be sure to send your bid by Oct. 7th, 2010. Find the call here.

Links

Converting video for flash video players to H.264/AAC

July 10, 2009 on 2:58 pm | In Development, Flash, General, Multimedia | 1 Comment

Have you ever tried to put a video online? Well, actually it is quite easy if you use YouTube. No matter what codec you use, you have a good chance of getting a decent result. If you want to host the video yourself, you basically need a Flash video player (assuming that Flash is the most widespread tool across platforms) like the JW FLVPlayer. Finally, you’ll need to get your video file into a format Flash can play using progressive download (which means you can watch it while downloading, just like on YouTube).

Since Adobe Flash Player 9 Update 3, Flash can play back MP4 files with H.264 video and AAC audio streams [see here], so we can just focus on this one. The first step is to get an ffmpeg version compiled with libx264 and libfaac. You can check this on the command line by executing ffmpeg without parameters:

FFmpeg version SVN-r16573, Copyright (c) 2000-2009 Fabrice Bellard, et al.
configuration: [...] --enable-libfaac --enable-libgsm --enable-libx264 [...]

The libfaac and libx264 entries should be there to support the needed codecs. I used FFmpeg Revision 16537 from this page, which works fine.

If the libraries are there you can proceed to the next step:

ffmpeg -i <inputfile> -b 1024k -vcodec libx264 \
-acodec libfaac -ab 160k <output.mp4>

This converts your input file to the needed MP4 file. You can also change the frame size of the video with the switch “-s”, for instance “-s 320x240”. Take a close look at the switches “-b” and “-ab”, which define the video and audio bitrates. If the sum of both bitrates is too high for the user’s network connection, the user will not be able to watch the video smoothly.
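As a rough worked example of that bitrate budget, here is a small sketch in Java. It treats the “k” suffix as 1000 bit/s and ignores container overhead. The class name, the 1000 kbit/s link speed and the 60 second clip length are made-up illustration values.

```java
public class BitrateBudget {

    /** Total stream bitrate in kbit/s: video plus audio. */
    public static int totalKbps(int videoKbps, int audioKbps) {
        return videoKbps + audioKbps;
    }

    /** Approximate file size in megabytes (10^6 bytes), ignoring container overhead. */
    public static double approxSizeMB(int totalKbps, int seconds) {
        double bits = totalKbps * 1000.0 * seconds;
        return bits / 8 / 1_000_000.0;
    }

    /** True if the connection can keep up with the stream. */
    public static boolean playsSmoothly(int totalKbps, int linkKbps) {
        return totalKbps <= linkKbps;
    }

    public static void main(String[] args) {
        // Values from the ffmpeg call: -b 1024k and -ab 160k.
        int total = totalKbps(1024, 160);
        System.out.println("total kbit/s: " + total);
        System.out.println("size of a 60 s clip in MB: " + approxSizeMB(total, 60));
        // Hypothetical 1000 kbit/s connection.
        System.out.println("smooth on 1000 kbit/s: " + playsSmoothly(total, 1000));
    }
}
```

With the settings above the stream needs about 1184 kbit/s, so a viewer on a slower connection would see the playback stall while the download catches up.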

One might think s/he’s finished, but no … unfortunately progressive download doesn’t work with all too many MP4 files. The atom (== “mp4 metadata unit”) containing the file index (== the description of where the video and the audio stream are located in the file and how they are stored) is at the end of the MP4 file. So the Flash player has to download the whole file before starting the playback, ka-ching!

Fortunately there is an ffmpeg tool called qt-faststart (Linux users will find it in the tools folder of ffmpeg) that moves the index from the end to the start. For Windows users a precompiled binary can be found here. Use this to move the metadata:

qt-faststart <infile.mp4> <outfile.mp4>
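If you want to check whether a file already has its index in front, the following sketch walks the top-level atoms of an MP4 file. Each atom starts with a 4-byte big-endian size followed by a 4-byte type. The index lives in the “moov” atom and the media data in “mdat”. The class and method names are made up for illustration, and this is a simplified check, not a full MP4 parser.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class Mp4IndexCheck {

    /** Returns true if the top-level "moov" atom appears before "mdat". */
    public static boolean isIndexAtStart(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        byte[] type = new byte[4];
        try {
            while (true) {
                long size = data.readInt() & 0xFFFFFFFFL; // unsigned 32-bit size
                data.readFully(type);
                String name = new String(type, StandardCharsets.US_ASCII);
                if (name.equals("moov")) return true;
                if (name.equals("mdat")) return false;
                long headerLength = 8;
                if (size == 1) {            // 64-bit size follows the type
                    size = data.readLong();
                    headerLength = 16;
                }
                if (size == 0) return false; // atom extends to the end of the file
                if (size < headerLength) throw new IOException("Corrupt atom: " + name);
                skipFully(data, size - headerLength);
            }
        } catch (EOFException e) {
            return false; // ran out of data without finding moov first
        }
    }

    /** Convenience overload for in-memory data. */
    public static boolean isIndexAtStart(byte[] bytes) {
        try {
            return isIndexAtStart(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void skipFully(DataInputStream data, long n) throws IOException {
        while (n > 0) {
            long skipped = data.skip(n);
            if (skipped <= 0) {
                if (data.read() < 0) throw new EOFException();
                skipped = 1;
            }
            n -= skipped;
        }
    }

    /** Demo helper: builds fake atoms of 12 bytes each (8-byte header, 4-byte payload). */
    public static byte[] fakeFile(String... types) {
        ByteBuffer buf = ByteBuffer.allocate(types.length * 12); // big-endian by default
        for (String t : types) {
            buf.putInt(12);
            buf.put(t.getBytes(StandardCharsets.US_ASCII));
            buf.putInt(0);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        System.out.println(isIndexAtStart(fakeFile("ftyp", "mdat", "moov")));
        System.out.println(isIndexAtStart(fakeFile("ftyp", "moov", "mdat")));
    }
}
```

For a real file, pass a buffered FileInputStream of the MP4 to isIndexAtStart; only the first few atom headers are read.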

Now you are done with the file. Use for instance the JW FLVPlayer setup wizard to create an HTML snippet. Note that you have to add 19 pixels to the height of your video, as this is the height of the control bar of the player :-D

Music portal reduced to max: Grooveshark

June 25, 2009 on 10:00 am | In General, Multimedia, Streaming | No Comments

Lately it has been getting more and more challenging to hear a song you want online. YouTube filters based on Geo-IP, and samples in online stores get shorter, if they are there at all. But I was pointed to a straightforward portal: Grooveshark. You just search for the song you’d like to hear, press play and there you are. If you want to listen to multiple songs, there’s a queue usable without registration. Nice!

Links


© 2004-2010 by Mathias Lux
>> Contents of this page are licensed under the Creative Commons Attribution-Share Alike 3.0 Austria License <<