Monthly Archives: July 2008

Lire accepted for the Open Source Contest @ ACM MM

Although it’s quite some time since I got the acceptance mail, I forgot to blog the good news: Lire (Lucene Image Retrieval) has been accepted for presentation at ACM Multimedia in the Open Source Contest track. As it is a contest, I assume we have a chance to win something? :)

Visual VM is Part of Java 1.6 Update 7

Java 1.6 u7 was released recently by Sun. While it brings no major changes, it includes some bug fixes and resolves some security issues. However, there is one main addition: VisualVM. This is a really great developer tool: it connects to running VMs and shows “some statistics” about them. Besides memory usage and thread information, it also allows some basic profiling. In my opinion Sun did a good job including VisualVM in the package! Note that this thing is built on the NetBeans Platform ;-)


Best of compiler construction

In general one should not post other people’s errors … but well, I’ll just give the best answers: no questions, no names … and of course, it’s a good laugh :-D

Notation for describing graphs is not quite clear to some people. I also miss the mother nodes: […] nodes in the dependency graph are depending on father nodes, sister nodes and brother nodes […]

When asking for the advantages and disadvantages of a certain approach, I got within the same answer: […] advantage: program is faster, disadvantage: program would be slower […]

Please note that I assume these answers come from the stress of the people taking the test and not from missing knowledge.

Games implementation week for high school students

Today an activity week for high school students at Klagenfurt University ended. The high school students chose a topic and worked a week on an implementation. Due to the success of our computer games course there was a topic on computer games. The two university students, Christian and Daniel, who instructed and helped the high school students throughout the whole week, did a very good job and several impressive arcade games were implemented within the week.

For me it was great to see what interested high school students can manage in one week. I’ll remember and tell students in my course about it :)

Less than 20% of Flickr images tagged …

While writing a scientific paper on tag recommendation I checked – just out of curiosity – the share of images tagged by their uploaders on Flickr. I found out that four out of five images are untagged and that less than 15% of images have two or more tags.

My method and detailed results: In general one would need a random sample for such an investigation, but a truly random sample is hard to obtain without access to the database. Therefore I just grabbed 20,004 images from the RSS feed for recent uploads and counted the number of tagged images. Since it was easy enough, I also computed the confidence intervals:

  • In my sample 3,650 images were tagged with at least one tag, which makes p1=18.25%
    • At a confidence level of 0.99, p1 is in [16.84%, 19.66%].
    • That leaves more than 4 out of 5 images untagged.
  • Also in my sample 2,628 images were tagged with at least two tags, which makes p2=13.14%
    • At a confidence level of 0.99, p2 is in [11.9%, 14.37%].
    • That means that less than 15% of the images have more than one tag.
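For readers who want to redo the arithmetic, here is a minimal Python sketch of one common way to compute such an interval: the normal-approximation (Wald) interval for a proportion. The function name `wald_interval` is made up for this illustration, and the method behind the intervals listed above is not stated, so the bounds printed here need not match them exactly.

```python
import math
from statistics import NormalDist


def wald_interval(successes, n, confidence=0.99):
    """Normal-approximation (Wald) confidence interval for a proportion.

    Returns (p, lower, upper) as fractions between 0 and 1.
    """
    p = successes / n
    # Two-sided interval: put (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, p - half_width, p + half_width


# Counts taken from the post: 20,004 sampled images.
for label, tagged in (("at least one tag", 3650), ("at least two tags", 2628)):
    p, low, high = wald_interval(tagged, 20004)
    print(f"{label}: p = {p:.2%}, 99% interval [{low:.2%}, {high:.2%}]")
```

Keep in mind that any such interval assumes a random sample, which the RSS feed of recent uploads only approximates.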

Finding duplicate code …

I recently found myself in a scenario where I tried to figure out how implementation clusters had implicitly formed within a group of students. All of them were given a task (with 4 subtasks) for a whole semester. Everyone was meant to do the task alone, but collaboration was allowed. However, I needed to know who helped whom and – of course – who helped whom with source code.

A colleague had a similar problem and pointed me to PMD CPD (= PMD Copy & Paste Detector). This tool works lightning fast and has a GUI :) Also it’s open source -> respect!
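To give an idea of what “finding clusters” means here, the following is a rough Python sketch that counts shared runs of lines between submissions. It is not CPD’s algorithm and not what I actually ran; the function names, the window size and the sample data are all invented for this illustration.

```python
from collections import defaultdict
from itertools import combinations


def normalized_lines(source):
    """Drop blank lines and collapse whitespace so formatting changes don't hide copies."""
    return [" ".join(line.split()) for line in source.splitlines() if line.strip()]


def shared_windows(sources, window=3):
    """Map each pair of submission names to the number of distinct
    `window`-line runs that occur in both submissions."""
    owners = defaultdict(set)
    for name, source in sources.items():
        lines = normalized_lines(source)
        for i in range(len(lines) - window + 1):
            owners[tuple(lines[i:i + window])].add(name)

    pairs = defaultdict(int)
    for names in owners.values():
        for a, b in combinations(sorted(names), 2):
            pairs[(a, b)] += 1
    return dict(pairs)


# Hypothetical submissions, keyed by student name.
submissions = {
    "alice": "int a = 1;\nint b = 2;\nreturn a + b;\n",
    "bob": "int a = 1;\n  int b = 2;\nreturn a +  b;\nprint();",
    "carol": "x\ny\nz",
}
print(shared_windows(submissions))
```

Pairs with a high count are candidates for a closer look by hand; a real tool is of course much more robust against renamed variables and reordered code.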


Compiler Construction & Lord of the Rings

Recently I had an idea on compilers and code generation. While generating assembler code (I’m talking about MIPS code) and with it labels, I thought: why not use cool names for labels? Instead of

LB_WH002_001G:
beq $a1, $v1,  LB_WH002_002G
add $a1, $a1, $v1
j  LB_WH002_001G
LB_WH002_002G:
...

one could be more creative and use, for instance, names from Lord of the Rings like:

THE_SHIRE:
beq $a1, $v1,  MORDOR
add $a1, $a1, $v1
j  THE_SHIRE
MORDOR:
...

While this definitely does not increase the readability of assembler code, I still think it doesn’t lessen it much. Instead it introduces “geekiness” to assembler :)

For those who are not creative themselves and need a lot of labels, I recommend the numerous fantasy name generators available on the internet like this, this or this one.
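In a code generator the idea could look roughly like the Python sketch below. `LabelNamer` and `emit_while_loop` are names made up for this post, not part of any real compiler, and the sketch assumes the name list contains only valid assembler identifiers and no entry that already ends in a numeric suffix such as `_1`.

```python
from itertools import count


class LabelNamer:
    """Hands out label names from a fixed list, adding a numeric suffix
    once the list is exhausted so every label stays unique."""

    def __init__(self, names):
        self._names = list(names)
        self._counter = count()

    def fresh(self):
        i = next(self._counter)
        base = self._names[i % len(self._names)]
        round_number = i // len(self._names)
        return base if round_number == 0 else f"{base}_{round_number}"


def emit_while_loop(namer):
    """Emit the MIPS loop from the post with two freshly named labels."""
    start, end = namer.fresh(), namer.fresh()
    return "\n".join([
        f"{start}:",
        f"beq $a1, $v1, {end}",
        "add $a1, $a1, $v1",
        f"j {start}",
        f"{end}:",
    ])


namer = LabelNamer(["THE_SHIRE", "MORDOR", "RIVENDELL"])
print(emit_while_loop(namer))
```

The suffix only kicks in when the compiler needs more labels than there are names, so small programs keep the pure place names.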