<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: Ten Tips for Website Localization</title>
	<atom:link href="http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/feed/" rel="self" type="application/rss+xml" />
	<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/</link>
	<description>because at 3am anything sounds good</description>
	<pubDate>Sat, 05 Jul 2008 10:44:14 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.5.1</generator>
		<item>
		<title>By: utf-8 guy</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-1333</link>
		<dc:creator>utf-8 guy</dc:creator>
		<pubDate>Wed, 06 Feb 2008 13:25:10 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-1333</guid>
		<description>Some global tips.

1. UTF-8 is it; there are few reasons to use any other encoding. If you don't know what these reasons are, use UTF-8.

2. PHP sucks: it's a web scripting language whose native string functions lack support for multibyte strings. See point 1 and the multibyte tip above. If possible, use a language where strlen returns the number of characters in the string instead of the byte count. If you must use PHP, familiarize yourself with the mbstring (mb_*) or iconv functions.

3. Consider pre-processing localization strings unless an app requires allowing users to switch language at runtime.

4. If you serve UTF-8 and store data as UTF-8, make sure user-submitted data is UTF-8 encoded.

 $valid_utf8 = iconv('UTF-8', 'UTF-8', $user_data);</description>
		<content:encoded><![CDATA[<p>Some global tips.</p>
<p>1. UTF-8 is it; there are few reasons to use any other encoding. If you don&#8217;t know what these reasons are, use UTF-8.</p>
<p>2. PHP sucks: it&#8217;s a web scripting language whose native string functions lack support for multibyte strings. See point 1 and the multibyte tip above. If possible, use a language where strlen returns the number of characters in the string instead of the byte count. If you must use PHP, familiarize yourself with the mbstring (mb_*) or iconv functions.</p>
<p>3. Consider pre-processing localization strings unless an app requires allowing users to switch language at runtime.</p>
<p>4. If you serve UTF-8 and store data as UTF-8, make sure user-submitted data is UTF-8 encoded.</p>
<p> $valid_utf8 = iconv('UTF-8', 'UTF-8', $user_data);</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: matt</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-1324</link>
		<dc:creator>matt</dc:creator>
		<pubDate>Wed, 06 Feb 2008 04:23:06 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-1324</guid>
		<description>Don't you wish these things were everyday knowledge? I'm consistently amazed by how many people know almost nothing about character encoding and the like.</description>
		<content:encoded><![CDATA[<p>Don&#8217;t you wish these things were everyday knowledge? I&#8217;m consistently amazed by how many people know almost nothing about character encoding and the like.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ville Pohjanheimo</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-16</link>
		<dc:creator>Ville Pohjanheimo</dc:creator>
		<pubDate>Tue, 31 Jul 2007 15:31:42 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-16</guid>
		<description>In section "Don't concatenate strings" you make a good point, but you should know that the given better option is hugely flawed too. Plenty of languages have trouble with this scheme. The problems come about as %s is (re)used in other strings. This is a problem in several languages when different %s require different (e.g.) endings or prepositions.

In Finnish (my language) this results in being forced to add strange-sounding modifiers to the sentence. In the given example the English equivalent would be something like "Sometimes I eat the food product %s and sometimes I don't." The point of adding "the food product" is that in Finnish I could now use the basic form of %s, i.e. no ending would be necessary.

An authentic example of the same can be seen on AMO, where the Finnish version replaces the composite string "Browse Extensions by Category" with the English equivalent of "Browse by Category the Add-on type Extensions", where Extensions is of course replaced by Themes etc. when appropriate. The Finnish translation could translate "Themes" and "Extensions" to their correct forms, but it's hard to find all the places where a given "%s" string is used, and there's really no way to know where it's going to be used in the future. Thus an ugly fix like the one above is a necessary eyesore. BTW, I don't mean to belittle the quality of AMO here in any way; the same applies to Firefox et al. too!

The fix would be to use only full sentences whenever possible. That would also fix the annoyance of capital letters mid-sentence: trying to guess where a "%s" string will be placed in a sentence (and thus whether it should grammatically start with a capital letter or not) is a terrible pain and error-prone.</description>
		<content:encoded><![CDATA[<p>In section &#8220;Don&#8217;t concatenate strings&#8221; you make a good point, but you should know that the given better option is hugely flawed too. Plenty of languages have trouble with this scheme. The problems come about as %s is (re)used in other strings. This is a problem in several languages when different %s require different (e.g.) endings or prepositions.</p>
<p>In Finnish (my language) this results in being forced to add strange-sounding modifiers to the sentence. In the given example the English equivalent would be something like &#8220;Sometimes I eat the food product %s and sometimes I don&#8217;t.&#8221; The point of adding &#8220;the food product&#8221; is that in Finnish I could now use the basic form of %s, i.e. no ending would be necessary.</p>
<p>An authentic example of the same can be seen on AMO, where the Finnish version replaces the composite string &#8220;Browse Extensions by Category&#8221; with the English equivalent of &#8220;Browse by Category the Add-on type Extensions&#8221;, where Extensions is of course replaced by Themes etc. when appropriate. The Finnish translation could translate &#8220;Themes&#8221; and &#8220;Extensions&#8221; to their correct forms, but it&#8217;s hard to find all the places where a given &#8220;%s&#8221; string is used, and there&#8217;s really no way to know where it&#8217;s going to be used in the future. Thus an ugly fix like the one above is a necessary eyesore. BTW, I don&#8217;t mean to belittle the quality of AMO here in any way; the same applies to Firefox et al. too!</p>
<p>The fix would be to use only full sentences whenever possible. That would also fix the annoyance of capital letters mid-sentence: trying to guess where a &#8220;%s&#8221; string will be placed in a sentence (and thus whether it should grammatically start with a capital letter or not) is a terrible pain and error-prone.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Wil Clouser</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-13</link>
		<dc:creator>Wil Clouser</dc:creator>
		<pubDate>Fri, 27 Jul 2007 19:02:00 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-13</guid>
		<description>Thanks Fred and Karellen.  I fixed the typos.</description>
		<content:encoded><![CDATA[<p>Thanks Fred and Karellen.  I fixed the typos.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Ten Tips for Website Localization &#124; Flawless Mind</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-12</link>
		<dc:creator>Ten Tips for Website Localization &#124; Flawless Mind</dc:creator>
		<pubDate>Fri, 27 Jul 2007 12:55:19 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-12</guid>
		<description>[...] Ten Tips for Website Localization [...]</description>
		<content:encoded><![CDATA[<p>[...] Ten Tips for Website Localization [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Fred</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-11</link>
		<dc:creator>Fred</dc:creator>
		<pubDate>Fri, 27 Jul 2007 08:58:57 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-11</guid>
		<description>Nicely written, Wil.

Regarding the fallback code: I still believe this is a very ugly solution as it requires code changes (and necessarily patch reviews etc.) twice, once for putting it in and once for removing it. I have to admit, though, that there may not be a better solution out there, at least not while gettext doesn't come with a fallback procedure of its own (beyond its current "display the msgid instead").

(Another short note: either use sprintf() with echo, or just use printf() with no "echo". printf prints on its own.)</description>
		<content:encoded><![CDATA[<p>Nicely written, Wil.</p>
<p>Regarding the fallback code: I still believe this is a very ugly solution as it requires code changes (and necessarily patch reviews etc.) twice, once for putting it in and once for removing it. I have to admit, though, that there may not be a better solution out there, at least not while gettext doesn&#8217;t come with a fallback procedure of its own (beyond its current &#8220;display the msgid instead&#8221;).</p>
<p>(Another short note: either use sprintf() with echo, or just use printf() with no &#8220;echo&#8221;. printf prints on its own.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Barry</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-10</link>
		<dc:creator>Barry</dc:creator>
		<pubDate>Fri, 27 Jul 2007 03:40:03 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-10</guid>
		<description>Good entry.  Me like.

(I think the comments count on your blog is off by one.)</description>
		<content:encoded><![CDATA[<p>Good entry.  Me like.</p>
<p>(I think the comments count on your blog is off by one.)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Robert Kaiser</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-8</link>
		<dc:creator>Robert Kaiser</dc:creator>
		<pubDate>Thu, 26 Jul 2007 20:09:51 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-8</guid>
		<description>Everything would be even better if we had the new L20n framework ready, I guess... We should really get something moving there again... ;-)</description>
		<content:encoded><![CDATA[<p>Everything would be even better if we had the new L20n framework ready, I guess&#8230; We should really get something moving there again&#8230; <img src='http://micropipes.com/blog/wp-includes/images/smilies/icon_wink.gif' alt=';-)' class='wp-smiley' /></p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mozilla Webdev &#187; Blog Archive &#187; Tips for Localization</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-7</link>
		<dc:creator>Mozilla Webdev &#187; Blog Archive &#187; Tips for Localization</dc:creator>
		<pubDate>Thu, 26 Jul 2007 19:43:08 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-7</guid>
		<description>[...] posted a blog that is a great resource for people looking to localize their sites. For those of you who went to our OSCON presentation, this is a great follow-up to the concepts we [...]</description>
		<content:encoded><![CDATA[<p>[...] posted a blog that is a great resource for people looking to localize their sites. For those of you who went to our OSCON presentation, this is a great follow-up to the concepts we [...]</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Karellen</title>
		<link>http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-6</link>
		<dc:creator>Karellen</dc:creator>
		<pubDate>Thu, 26 Jul 2007 17:06:41 +0000</pubDate>
		<guid isPermaLink="false">http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/#comment-6</guid>
		<description>"With the creation of UTF-8, characters from all over the world are assigned numbers between 0 and 65,535."

Actually, UTF-8 can hold not just the BMP, but all Unicode characters from 0x00 to 0x10FFFF, or numbers between 0 and 1,114,111.

UCS-2 is the encoding limited to the first 65,536 code points (0 to 65,535).</description>
		<content:encoded><![CDATA[<p>&#8220;With the creation of UTF-8, characters from all over the world are assigned numbers between 0 and 65,535.&#8221;</p>
<p>Actually, UTF-8 can hold not just the BMP, but all Unicode characters from 0x00 to 0x10FFFF, or numbers between 0 and 1,114,111.</p>
<p>UCS-2 is the encoding limited to the first 65,536 code points (0 to 65,535).</p>
]]></content:encoded>
	</item>
</channel>
</rss>
