<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SemanticMetadata.net &#187; Tagging</title>
	<atom:link href="http://www.semanticmetadata.net/category/tagging/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.semanticmetadata.net</link>
	<description></description>
	<lastBuildDate>Fri, 12 Mar 2010 09:04:50 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.1</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How to get a lot of photos &#8230;</title>
		<link>http://www.semanticmetadata.net/2009/05/14/how-to-get-a-lot-of-photos/</link>
		<comments>http://www.semanticmetadata.net/2009/05/14/how-to-get-a-lot-of-photos/#comments</comments>
		<pubDate>Thu, 14 May 2009 13:43:32 +0000</pubDate>
		<dc:creator>Mathias Lux</dc:creator>
				<category><![CDATA[Dev]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[General]]></category>
		<category><![CDATA[Tagging]]></category>
		<category><![CDATA[flickr]]></category>

		<guid isPermaLink="false">http://www.semanticmetadata.net/?p=450</guid>
		<description><![CDATA[I&#8217;m currently testing a new implementation of an approximate search index for content-based image retrieval. The performance tests in particular have become interesting, as I didn&#8217;t have access to a really large data set. So what to do?
Actually I programmed a lot of spiders and grabbers before, so I knew that there is a lot [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m currently testing a new implementation of an approximate search index for content-based image retrieval. The performance tests in particular have become interesting, as I didn&#8217;t have access to a really large data set. So what to do?</p>
<p>I have programmed a lot of spiders and grabbers before, so I knew that there is a lot of data available on Flickr <img src='http://www.semanticmetadata.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  But I was still looking for an easy way. Here is my approach (using bash, of course):</p>
<p><code>wget -q -O - 'http://api.flickr.com/services/feeds/photos_public.gne?format=atom' | grep -o '.............static.*m.jpg' | wget -i -</code></p>
<p>Why should this work?</p>
<ul>
<li>The first wget command fetches the list of recent public photos as an Atom feed.</li>
<li>The grep command extracts the URLs of all medium sized (suffix &#8220;m.jpg&#8221;) pictures.</li>
<li>The run of dots matches the characters in front of &#8220;static&#8221; (the start of the image URL), and &#8220;static&#8221; itself restricts the matches to the right ones, the real image content.</li>
<li>Finally, the second wget downloads the images from the server.</li>
</ul>
<p>Running this command should fetch about 25 photos in one go. With a bash loop or a cron job you can of course get a lot more, unattended <img src='http://www.semanticmetadata.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
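<p>A minimal sketch of such an unattended loop, under a few assumptions: the function names <code>extract_urls</code> and <code>grab_photos</code> are made up for illustration, the number of rounds and the pause are arbitrary parameters, and the pattern is a slightly stricter variant of the one above that stops at quotes, semicolons and spaces so that each match stays inside a single URL.</p>
<pre>
```shell
#!/usr/bin/env bash
# Sketch only: function names and parameters are illustrative assumptions.
FEED_URL='http://api.flickr.com/services/feeds/photos_public.gne?format=atom'

# Read feed text on stdin and print each matching image URL once.
# Quotes, semicolons and spaces cannot be part of a match, so a match
# never runs from one URL into the next.
extract_urls() {
  grep -o 'http://[^"; ]*static[^"; ]*m\.jpg' | sort -u
}

# grab_photos ROUNDS PAUSE
# Fetch the feed ROUNDS times and wait PAUSE seconds after each round.
# wget -nc skips files that already exist in the current directory.
grab_photos() {
  for round in $(seq 1 "$1"); do
    wget -q -O - "$FEED_URL" | extract_urls | wget -q -nc -i -
    sleep "$2"
  done
}

# Example call (not executed here): ten rounds, one minute apart.
# grab_photos 10 60
```
</pre>
<p>The pause keeps the load on the server low; for a cron job, the body of the loop alone would be enough.</p>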
]]></content:encoded>
			<wfw:commentRss>http://www.semanticmetadata.net/2009/05/14/how-to-get-a-lot-of-photos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Less than 20% of Flickr images tagged &#8230;</title>
		<link>http://www.semanticmetadata.net/2008/07/03/less-than-20-of-flickr-images-tagged/</link>
		<comments>http://www.semanticmetadata.net/2008/07/03/less-than-20-of-flickr-images-tagged/#comments</comments>
		<pubDate>Thu, 03 Jul 2008 12:42:59 +0000</pubDate>
		<dc:creator>Mathias Lux</dc:creator>
				<category><![CDATA[Tagging]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[flickr]]></category>
		<category><![CDATA[Imaging]]></category>
		<category><![CDATA[stats]]></category>

		<guid isPermaLink="false">http://www.semanticmetadata.net/2008/07/03/less-than-20-of-flickr-images-tagged/</guid>
		<description><![CDATA[While writing a scientific paper on tag recommendation I checked &#8211; just out of curiosity &#8211; the share of images tagged by their uploaders on Flickr. I found out that four out of five images are untagged and that fewer than 15% of images have two or more tags.
My method and detailed results: In general [...]]]></description>
			<content:encoded><![CDATA[<p>While writing a scientific paper on tag recommendation I checked &#8211; just out of curiosity &#8211; the share of images tagged by their uploaders on <a href="http://flickr.com">Flickr</a>. I found out that four out of five images are untagged and that fewer than 15% of images have two or more tags.</p>
<p>My method and detailed results: In general one would need a random sample for such an investigation, but a truly random sample is hard to obtain without access to the database. Therefore I just grabbed 20,004 images from the RSS feed for recent uploads and counted the number of tagged images. It was easy enough to compute confidence intervals as well:</p>
<ul>
<li>In my sample, 3,650 images were tagged with at least one tag, which gives p1 = 18.25%.
<ul>
<li>At a confidence level of 0.99, p1 lies in [16.84%, 19.66%].</li>
<li>That leaves more than 4 out of 5 images untagged.</li>
</ul>
</li>
<li>Also in my sample, 2,628 images were tagged with at least two tags, which gives p2 = 13.14%.
<ul>
<li>At a confidence level of 0.99, p2 lies in [11.9%, 14.37%].</li>
<li>That means that fewer than 15% of the images have more than one tag.</li>
</ul>
</li>
</ul>
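<p>For readers who want to redo the arithmetic, here is a small sketch using the normal-approximation (Wald) interval for a proportion. This is an assumption on my side and not necessarily the method behind the intervals quoted above; the function name <code>wald_ci</code> and the value z = 2.576 for a 99% level are choices made for this example.</p>
<pre>
```shell
#!/usr/bin/env bash
# wald_ci TAGGED TOTAL Z
# Print the sample proportion and its normal-approximation interval in percent.
# Assumption: Wald interval, p +/- z * sqrt(p * (1 - p) / n).
wald_ci() {
  awk -v k="$1" -v n="$2" -v z="$3" 'BEGIN {
    p = k / n
    margin = z * sqrt(p * (1 - p) / n)
    printf "%.2f%% [%.2f, %.2f]\n", 100 * p, 100 * (p - margin), 100 * (p + margin)
  }'
}

wald_ci 3650 20004 2.576   # at least one tag:  18.25% [17.54, 18.95]
wald_ci 2628 20004 2.576   # at least two tags: 13.14% [12.52, 13.75]
```
</pre>
<p>The Wald intervals come out narrower than the ones listed above, so the listed intervals are the more conservative ones. Either way the upper bounds stay below 20% and 15%, respectively.</p>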
]]></content:encoded>
			<wfw:commentRss>http://www.semanticmetadata.net/2008/07/03/less-than-20-of-flickr-images-tagged/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Del.icio.us Tag Co-Occurrence Demo App</title>
		<link>http://www.semanticmetadata.net/2008/02/23/delicious-tag-co-occurrence/</link>
		<comments>http://www.semanticmetadata.net/2008/02/23/delicious-tag-co-occurrence/#comments</comments>
		<pubDate>Sat, 23 Feb 2008 09:00:46 +0000</pubDate>
		<dc:creator>Mathias Lux</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Tagging]]></category>
		<category><![CDATA[Web2.0]]></category>
		<category><![CDATA[del.icio.us]]></category>
		<category><![CDATA[folksonomy]]></category>
		<category><![CDATA[funstuff]]></category>
		<category><![CDATA[lsa]]></category>

		<guid isPermaLink="false">http://www.semanticmetadata.net/2008/02/23/delicious-tag-co-occurrence/</guid>
		<description><![CDATA[You probably all know del.icio.us, the social bookmarking service. As I use it a lot and also did some research in this direction recently (see e.g. here), I wanted to try out more. While preparing the Multimedia Information Systems course this semester, I checked to what extent LSA (latent semantic analysis) [...]]]></description>
			<content:encoded><![CDATA[<p>You probably all know <a href="http://del.icio.us">del.icio.us</a>, the social bookmarking service. As I use it a lot and also did some research in this direction recently (see e.g. <a href="http://www.uni-weimar.de/medien/webis/research/tir/tir-07/proceedings/lux07-aspects-of-broad-folksonomies.pdf">here</a>), I wanted to try out more <img src='http://www.semanticmetadata.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  While preparing the Multimedia Information Systems course this semester, I checked to what extent LSA (latent semantic analysis) can be applied to tags and also built a small demo application. The demo fetches the RSS feed for a given user name (leave the name field blank for random names, separate multiple names with commas) and computes a co-occurrence matrix after a latent semantic analysis. Note that the feed contains only the last ~30 entries. You can then select a tag from the combo box to find the related ones.</p>
<p>The tool can be accessed via Java Web Start <a href="http://www.semanticmetadata.net/webstart/folksonomy/launch.jnlp">here</a>. Drop me a line to let me know whether you like it.</p>
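<p>To give a rough idea of what a tag co-occurrence count looks like, here is a small sketch. It covers plain co-occurrence counting only and leaves out the LSA step, it is not the code of the demo, and the function name <code>cooccurrence</code>, the input format (one bookmark per line, tags separated by spaces) and the sample tags are assumptions made for this example.</p>
<pre>
```shell
#!/usr/bin/env bash
# cooccurrence: read one bookmark per line (tags separated by spaces) and
# print "tagA tagB count" for every unordered pair of distinct tags that
# share a bookmark. This is raw counting; no LSA is applied.
cooccurrence() {
  awk '{
    for (i = 1; NF > i; i++)
      for (j = i + 1; NF >= j; j++) {
        a = $i ""; b = $j ""          # force string comparison
        if (a == b) continue          # ignore a tag repeated on one bookmark
        if (a > b) { t = a; a = b; b = t }
        count[a " " b]++
      }
  }
  END { for (pair in count) print pair, count[pair] }' | LC_ALL=C sort
}

# Made-up sample: three bookmarks with their tags.
printf '%s\n' 'java lsa folksonomy' 'java lsa' 'flickr folksonomy' | cooccurrence
# flickr folksonomy 1
# folksonomy java 1
# folksonomy lsa 1
# java lsa 2
```
</pre>
<p>A table like this shows only tags that actually share a bookmark; the point of adding LSA is to also relate tags that never appear together directly.</p>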
<p><strong>Related links:</strong></p>
<ul>
<li><a href="http://www.semanticmetadata.net/webstart/folksonomy/launch.jnlp">Start the demo application</a> (Java Web Start)</li>
<li><a href="http://math.nist.gov/javanumerics/jama/">Jama</a> Matrix Library</li>
<li><a href="http://en.wikipedia.org/wiki/Latent_semantic_analysis">LSA</a> on Wikipedia</li>
<li><a href="http://www.jgoodies.com/freeware/looks/">JGoodies Looks</a> &#8211; used to make it look good <img src='http://www.semanticmetadata.net/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.semanticmetadata.net/2008/02/23/delicious-tag-co-occurrence/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
