<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Some considerations when adding Tags to AMO</title>
	<atom:link href="http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/feed/" rel="self" type="application/rss+xml" />
	<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/</link>
	<description>because at 3am anything sounds good</description>
	<lastBuildDate>Thu, 10 May 2012 20:52:06 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
	<item>
		<title>By: All Night Diner : The Tagging Plan for AMO</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30731</link>
		<dc:creator>All Night Diner : The Tagging Plan for AMO</dc:creator>
		<pubDate>Fri, 06 Mar 2009 20:25:11 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30731</guid>
		<description>[...] thanks for all the great feedback. Something as seemingly simple as tagging gets complex quickly when thought out and the varied [...]</description>
		<content:encoded><![CDATA[<p>[...] thanks for all the great feedback. Something as seemingly simple as tagging gets complex quickly when thought out and the varied [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Austin King</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30661</link>
		<dc:creator>Austin King</dc:creator>
		<pubDate>Thu, 05 Mar 2009 02:12:37 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30661</guid>
		<description>Tags and localization are a tough beast. Untested idea: What about separating tags per locale? This will encourage the use of common idioms within a locale and not pollute the display of other locales. Additionally it may provide some performance benefits on the backend where character set optimizations can be made.</description>
		<content:encoded><![CDATA[<p>Tags and localization are a tough beast. Untested idea: What about separating tags per locale? This will encourage the use of common idioms within a locale and not pollute the display of other locales. Additionally it may provide some performance benefits on the backend where character set optimizations can be made.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: sethb</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30649</link>
		<dc:creator>sethb</dc:creator>
		<pubDate>Wed, 04 Mar 2009 21:44:56 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30649</guid>
		<description>Other suggestions (not ideal, but just to add them):

*  Tag/word suggestion feature to help with normalization (e.g. &quot;Did you mean &#039;résumé&#039;?&quot;)
*  Tag editing or suggested changes after a tag is submitted (e.g. flag-a-tag, community suggesting/editing?)
*  Use Unicode and then have a second step that asks the tagger to enter using the English set above?  This would require friendly UI that somehow explains why we need to do both.  This option essentially combines 1 and 2.
*  Creating another way to identify a tag that doesn&#039;t rely on character.  Can you look for context or usage in meta information?  Or make sure the tag reflects the category where it will live and then ask the user if a potential conflict occurs?

Just some ideas that sprang up when chatting with others.</description>
		<content:encoded><![CDATA[<p>Other suggestions (not ideal, but just to add them):</p>
<p>*  Tag/word suggestion feature to help with normalization (e.g. &#8220;Did you mean &#8216;résumé&#8217;?&#8221;)<br />
*  Tag editing or suggested changes after a tag is submitted (e.g. flag-a-tag, community suggesting/editing?)<br />
*  Use Unicode and then have a second step that asks the tagger to enter using the English set above?  This would require friendly UI that somehow explains why we need to do both.  This option essentially combines 1 and 2.<br />
*  Creating another way to identify a tag that doesn&#8217;t rely on character.  Can you look for context or usage in meta information?  Or make sure the tag reflects the category where it will live and then ask the user if a potential conflict occurs?</p>
<p>Just some ideas that sprang up when chatting with others.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Staś Małolepszy</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30644</link>
		<dc:creator>Staś Małolepszy</dc:creator>
		<pubDate>Wed, 04 Mar 2009 18:37:36 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30644</guid>
		<description>Here&#039;s an idea I talked about in the call earlier today:

When a user wants to tag an add-on, they are presented with a couple of text input fields, laid out vertically, where they can insert the tags, e.g. résumé. Next to each field, horizontally, there is a link saying &quot;add alternative ASCII-only spelling to help searches&quot;. When the user clicks on it, it is replaced by another text input and a &quot;+&quot; sign that adds a further text input field. In those fields, users can type the alternative spelling for their tag, &quot;resume&quot; in this example.

There are two things that can happen next:

1.

The add-on is now tagged with two tags: &quot;résumé&quot; and &quot;resume&quot;, but only &quot;résumé&quot; is a primary tag, meaning that only &quot;résumé&quot; is displayed in the add-on&#039;s tag cloud. &quot;Resume&quot; is an auxiliary tag, used only for searches.

The problem here is the inverse scenario: I believe it is rather unlikely that someone typing &quot;resume&quot; will think of the alternative spelling: &quot;résumé&quot;. Hence the second solution:

2.

As soon as a user provides two spelling versions of one tag, we can use them to create a two-way mapping. Whenever someone searches for &quot;résumé&quot;, their query will first be checked against the mapping, returning &quot;resume&quot;, and the search results can now include add-ons tagged with either spelling. This works the other way round too.

So in fact, instead of mapping Unicode letters onto ASCII letters, we&#039;re mapping words onto words (in both directions: unicode-&gt;ascii and ascii-&gt;unicode). And that&#039;s generated by users, so we have good chances of covering the most popular words first.

Thoughts?</description>
		<content:encoded><![CDATA[<p>Here&#8217;s an idea I talked about in the call earlier today:</p>
<p>When a user wants to tag an add-on, they are presented with a couple of text input fields, laid out vertically, where they can insert the tags, e.g. résumé. Next to each field, horizontally, there is a link saying &#8220;add alternative ASCII-only spelling to help searches&#8221;. When the user clicks on it, it is replaced by another text input and a &#8220;+&#8221; sign that adds a further text input field. In those fields, users can type the alternative spelling for their tag, &#8220;resume&#8221; in this example.</p>
<p>There are two things that can happen next:</p>
<p>1.</p>
<p>The add-on is now tagged with two tags: &#8220;résumé&#8221; and &#8220;resume&#8221;, but only &#8220;résumé&#8221; is a primary tag, meaning that only &#8220;résumé&#8221; is displayed in the add-on&#8217;s tag cloud. &#8220;Resume&#8221; is an auxiliary tag, used only for searches.</p>
<p>The problem here is the inverse scenario: I believe it is rather unlikely that someone typing &#8220;resume&#8221; will think of the alternative spelling: &#8220;résumé&#8221;. Hence the second solution:</p>
<p>2.</p>
<p>As soon as a user provides two spelling versions of one tag, we can use them to create a two-way mapping. Whenever someone searches for &#8220;résumé&#8221;, their query will first be checked against the mapping, returning &#8220;resume&#8221;, and the search results can now include add-ons tagged with either spelling. This works the other way round too.</p>
<p>So in fact, instead of mapping Unicode letters onto ASCII letters, we&#8217;re mapping words onto words (in both directions: unicode-&gt;ascii and ascii-&gt;unicode). And that&#8217;s generated by users, so we have good chances of covering the most popular words first.</p>
<p>Thoughts?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Kaiser</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30636</link>
		<dc:creator>Robert Kaiser</dc:creator>
		<pubDate>Wed, 04 Mar 2009 14:33:24 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30636</guid>
		<description>Why make the separator a space and not a comma? In doing the tagging system for my personal PHP-based community system, I found it&#039;s easiest to use a comma as the separator, as it feels natural to most people (it&#039;s even the separator of lists of attributes in normal written language) and it easily allows for spaces within tags without the workaround of quoting.</description>
		<content:encoded><![CDATA[<p>Why make the separator a space and not a comma? In doing the tagging system for my personal PHP-based community system, I found it&#8217;s easiest to use a comma as the separator, as it feels natural to most people (it&#8217;s even the separator of lists of attributes in normal written language) and it easily allows for spaces within tags without the workaround of quoting.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: kourge</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30618</link>
		<dc:creator>kourge</dc:creator>
		<pubDate>Wed, 04 Mar 2009 04:35:50 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30618</guid>
		<description>From a purely linguistic view, I totally agree with comments #4 and #6 that letters do not translate one to one. &quot;ß&quot; is more of a ligature of the two letters &quot;ss&quot; (the &quot;Eszett&quot; as it is called in German), and &quot;Þ&quot; is the letter &quot;thorn&quot;, which is transliterated to &quot;th&quot; most of the time. (&quot;Thou&quot; used to be spelt &quot;Þu&quot;.) And then there&#039;s the problem of transliterating a letter like &quot;ö&quot;. If the mark is used as a diaeresis, it can simply be removed, but if used in German, it is an umlaut, in which case &quot;ö&quot; should be transliterated into &quot;oe&quot;.

Reckless normalization can be evil. What &lt;em&gt;can&lt;/em&gt; be done is suggestion through normalization; if two phrases / keywords normalize to something really similar, then the similar phrase can be shown, among other similar phrases, as suggestions in an autocomplete drop down.</description>
		<content:encoded><![CDATA[<p>From a purely linguistic view, I totally agree with comments #4 and #6 that letters do not translate one to one. &#8220;ß&#8221; is more of a ligature of the two letters &#8220;ss&#8221; (the &#8220;Eszett&#8221; as it is called in German), and &#8220;Þ&#8221; is the letter &#8220;thorn&#8221;, which is transliterated to &#8220;th&#8221; most of the time. (&#8220;Thou&#8221; used to be spelt &#8220;Þu&#8221;.) And then there&#8217;s the problem of transliterating a letter like &#8220;ö&#8221;. If the mark is used as a diaeresis, it can simply be removed, but if used in German, it is an umlaut, in which case &#8220;ö&#8221; should be transliterated into &#8220;oe&#8221;.</p>
<p>Reckless normalization can be evil. What <em>can</em> be done is suggestion through normalization; if two phrases / keywords normalize to something really similar, then the similar phrase can be shown, among other similar phrases, as suggestions in an autocomplete drop down.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mike Morgan</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30614</link>
		<dc:creator>Mike Morgan</dc:creator>
		<pubDate>Wed, 04 Mar 2009 03:55:09 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30614</guid>
		<description>Tagging to me falls under the same jurisdiction as URLs.  The realm of possible tags across all languages defeats the purpose of tagging, really.

By agreeing on some common nomenclature we could set a precedent and promote simpler tags (you wouldn&#039;t tag something with strange English words).  I don&#039;t think this would work for most other things like web writing, novels, movies, etc., but we&#039;re not talking about War and Peace here.

My opinion may be unpopular, but in many cases it doesn&#039;t make sense to destroy your &quot;hit rate&quot; just to be 100% inclusive.  I think the altruistic approach of &quot;make everything universal at all costs&quot; is definitely situational.  For things like tagging and URLs where uniqueness and visual recognition are paramount I&#039;m not fond of fragmenting otherwise unique and simple phrases into 40 different synonymous alternatives.</description>
		<content:encoded><![CDATA[<p>Tagging to me falls under the same jurisdiction as URLs.  The realm of possible tags across all languages defeats the purpose of tagging, really.</p>
<p>By agreeing on some common nomenclature we could set a precedent and promote simpler tags (you wouldn&#8217;t tag something with strange English words).  I don&#8217;t think this would work for most other things like web writing, novels, movies, etc., but we&#8217;re not talking about War and Peace here.</p>
<p>My opinion may be unpopular, but in many cases it doesn&#8217;t make sense to destroy your &#8220;hit rate&#8221; just to be 100% inclusive.  I think the altruistic approach of &#8220;make everything universal at all costs&#8221; is definitely situational.  For things like tagging and URLs where uniqueness and visual recognition are paramount I&#8217;m not fond of fragmenting otherwise unique and simple phrases into 40 different synonymous alternatives.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: AndyEd</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30590</link>
		<dc:creator>AndyEd</dc:creator>
		<pubDate>Tue, 03 Mar 2009 20:18:45 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30590</guid>
		<description>I&#039;m a big advocate of pruning the tag space as it gets large with essentially &quot;mark as duplicate&quot;.  This can either rewrite the original tag or allow the original to point to the master version of the tag.  This would solve your problem around localized versions as well.</description>
		<content:encoded><![CDATA[<p>I&#8217;m a big advocate of pruning the tag space as it gets large with essentially &#8220;mark as duplicate&#8221;.  This can either rewrite the original tag or allow the original to point to the master version of the tag.  This would solve your problem around localized versions as well.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Anonymous</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30579</link>
		<dc:creator>Anonymous</dc:creator>
		<pubDate>Tue, 03 Mar 2009 18:30:43 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30579</guid>
		<description>You can&#039;t in general translate Unicode characters 1:1 to ASCII.  For instance, ß doesn&#039;t translate to &quot;s&quot;, it translates to &quot;ss&quot;.  Furthermore, you might want to handle multiple transliterations, to allow &quot;ö&quot; to match either &quot;oe&quot; or &quot;o&quot;.</description>
		<content:encoded><![CDATA[<p>You can&#8217;t in general translate Unicode characters 1:1 to ASCII.  For instance, ß doesn&#8217;t translate to &#8220;s&#8221;, it translates to &#8220;ss&#8221;.  Furthermore, you might want to handle multiple transliterations, to allow &#8220;ö&#8221; to match either &#8220;oe&#8221; or &#8220;o&#8221;.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alan</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/comment-page-1/#comment-30557</link>
		<dc:creator>Alan</dc:creator>
		<pubDate>Tue, 03 Mar 2009 14:29:45 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/?p=75#comment-30557</guid>
		<description>As Firefox&#039;s Places uses commas and not spaces to separate tags (removing the need for quote-enclosed spaces), it would be nice to see this standardised across the Mozilla Project. Having to learn tagging rules for each part of the project just seems unnecessary, especially when each feature is designed from the ground up.

A native English speaker myself, I can&#039;t help but feel that limiting the input to US-ASCII (or normalising it to such) goes against the community spirit and essentially implies that non-English speakers are second-class users. Would there be any way we could add language notation to the tags, or do simple things such as hiding Cyrillic/Greek/Ideographs from English users by default and vice versa? I know this would be much harder within, for example, the Romance languages - how do you differentiate French/Spanish/Portuguese/Italian without looking at the characters they use and white- or black-listing them from a massive list?

On the tagging front, another idea is to follow Amazon&#039;s implementation where a product must be tagged with a tag a certain number of times by users before that tag is &#039;accepted&#039;. These tags can be shown to the user to essentially approve (tick a check-box by each tag you agree with and add any of your own) so as to limit tags which may be obscure/incorrect.

As for resume vs. résumé, could we not simply have a list of cognates in English which we apply at the point of tagging to catch those few words which have spelling variations? If not, we&#039;re still going to need a solution to British vs. American spellings. Or will people be encouraged to tag &#039;colour color&#039; etc.?

Just a couple of my thoughts.</description>
		<content:encoded><![CDATA[<p>As Firefox&#8217;s Places uses commas and not spaces to separate tags (removing the need for quote-enclosed spaces), it would be nice to see this standardised across the Mozilla Project. Having to learn tagging rules for each part of the project just seems unnecessary, especially when each feature is designed from the ground up.</p>
<p>A native English speaker myself, I can&#8217;t help but feel that limiting the input to US-ASCII (or normalising it to such) goes against the community spirit and essentially implies that non-English speakers are second-class users. Would there be any way we could add language notation to the tags, or do simple things such as hiding Cyrillic/Greek/Ideographs from English users by default and vice versa? I know this would be much harder within, for example, the Romance languages &#8211; how do you differentiate French/Spanish/Portuguese/Italian without looking at the characters they use and white- or black-listing them from a massive list?</p>
<p>On the tagging front, another idea is to follow Amazon&#8217;s implementation where a product must be tagged with a tag a certain number of times by users before that tag is &#8216;accepted&#8217;. These tags can be shown to the user to essentially approve (tick a check-box by each tag you agree with and add any of your own) so as to limit tags which may be obscure/incorrect.</p>
<p>As for resume vs. résumé, could we not simply have a list of cognates in English which we apply at the point of tagging to catch those few words which have spelling variations? If not, we&#8217;re still going to need a solution to British vs. American spellings. Or will people be encouraged to tag &#8216;colour color&#8217; etc.?</p>
<p>Just a couple of my thoughts.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

