<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>All Night Diner &#187; Mozilla</title>
	<atom:link href="http://micropipes.com/blog/tag/mozilla/feed/" rel="self" type="application/rss+xml" />
	<link>http://micropipes.com/blog</link>
	<description>because at 3am anything sounds good</description>
	<lastBuildDate>Mon, 03 May 2010 17:34:44 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>addons.mozilla.org ♥s unit tests.  Again.</title>
		<link>http://micropipes.com/blog/2010/05/03/addons-mozilla-org-%e2%99%a5s-unit-tests-again/</link>
		<comments>http://micropipes.com/blog/2010/05/03/addons-mozilla-org-%e2%99%a5s-unit-tests-again/#comments</comments>
		<pubDate>Mon, 03 May 2010 17:34:44 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[CakePHP]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=142</guid>
		<description><![CDATA[AMO has had an on-again off-again relationship with unit tests.  A little over a year ago we had a thousand unit tests that sort of, mostly, ran.  The problem is, PHP unit testing just isn&#8217;t as good as it should be.  CakePHP relies on SimpleTest, one of the main PHP test suites. [...]]]></description>
			<content:encoded><![CDATA[<p><a href="https://addons.mozilla.org">AMO</a> has had an on-again off-again relationship with unit tests.  A little over a year ago we had <a href="http://micropipes.com/blog/2009/04/09/addonsmozillaorg-celebrates-1000-passing-unit-tests/">a thousand unit tests</a> that sort of, mostly, ran.  The problem is, PHP unit testing just isn&#8217;t as good as it should be.  CakePHP relies on <a href="http://www.simpletest.org/">SimpleTest</a>, one of the main PHP test suites.  It worked relatively well for a small number of tests, but as our suite grew, so did our troubles.</p>
<p>Our main issue was hitting a memory limit or the max execution time.  We hit the limits often for a variety of reasons: some were legitimate bugs, and some came from our attempts to hack around things to make the tests run.  Changing the limits affected the tests themselves, because they ran within the same environment.  There wasn&#8217;t really a concept of fixtures then, although it looks like <a href="http://bakery.cakephp.org/articles/view/testing-models-with-cakephp-1-2-test-suite">CakePHP has stepped up there</a>.  The SimpleTest web runner was hard to use, and the mock objects were sometimes a little too mocked and missing some attributes.</p>
<p>All in all it was a heroic effort to get that many tests, but we didn&#8217;t maintain it because they were so slow to write and difficult to run.  Testing can be a pain to write, sure, but it shouldn&#8217;t be a burden like that.  Enter <a href="http://docs.djangoproject.com/en/dev/topics/testing/">Django&#8217;s testing suite</a> (built on top of <a href="http://docs.python.org/library/unittest.html">Python&#8217;s unittest</a>).  It has most of our complaints handled out of the box.  It&#8217;s very well documented, considers a lot of aspects of testing, supports fixtures, a built-in client, etc.  It&#8217;s a well thought out framework to build tests on.</p>
<p>We&#8217;re being more vigilant about requiring tests this time around, but they also aren&#8217;t as frustrating to write.  When you write them they actually work and they stay working.  Most of what you want is built in already.  For example, I wrote the password reset form we needed on AMO in Django.  With CakePHP and SimpleTest I&#8217;d have no idea how to test that the email was actually working.  It&#8217;s apparently possible <a href="http://www.curioussymbols.com/simplemail/">with a SimpleTest add-on</a> and enough code that I have to scroll in my browser.  With Django&#8217;s test suite the actual code was 5 lines, 3 of which were assertions:</p>
<pre><code class="python">
    def test_request_success(self):
        self.client.post('/en-US/firefox/users/pwreset',
                        {'email': self.user.email})

        eq_(len(mail.outbox), 1)
        assert mail.outbox[0].subject.find('Password reset') == 0
        assert mail.outbox[0].body.find('pwreset/%s' % self.uidb36) > 0
</code></pre>
<p>With the power of the new test suite we&#8217;re once again writing and maintaining our unit tests &#8211; currently at around 390 tests and increasing steadily.  Plenty of people have written about why unit tests are important so I won&#8217;t belabor the point, but I will mention that it&#8217;s a great feeling to be able to commit something and be confident it hasn&#8217;t affected other parts of the site.  It&#8217;s almost as good a feeling when you write your code and a completely different test fails, pointing out a case that you didn&#8217;t even consider but one that would soak up developer time to debug down the road.</p>
<p>Building on a foundation that takes testing seriously is great.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2010/05/03/addons-mozilla-org-%e2%99%a5s-unit-tests-again/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Continuous Integration comes to AMO</title>
		<link>http://micropipes.com/blog/2010/04/07/continuous-integration-comes-to-amo/</link>
		<comments>http://micropipes.com/blog/2010/04/07/continuous-integration-comes-to-amo/#comments</comments>
		<pubDate>Wed, 07 Apr 2010 14:54:13 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=139</guid>
		<description><![CDATA[It&#8217;s time to hail another milestone for AMO in our epic push for improvements in 2010.  This time I&#8217;m happy to announce our Hudson continuous integration server which has been humming along for a few months.

Hudson Integration Screenshot.  Click to enlarge.
AMO is the first Mozilla Webdev site to use continuous integration, and it&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s time to hail another milestone for <a href="https://addons.mozilla.org/">AMO</a> in our <a href="http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/">epic push for improvements in 2010</a>.  This time I&#8217;m happy to announce our <a href="https://hudson.mozilla.org/job/preview.addons.mozilla.org/">Hudson continuous integration server</a> which has been humming along for a few months.</p>
<p><a href="http://micropipes.com/blog/wp-content/img/hudson_screenshot.png"><img src="http://micropipes.com/blog/wp-content/img/hudson_screenshot_small.png" title="Hudson Summary Screenshot" style="border:1px solid #000; padding:10px;" /></a></p>
<caption>Hudson Integration Screenshot.  Click to enlarge.</caption>
<p>AMO is the first Mozilla Webdev site to use continuous integration, and it&#8217;s been a long time coming.  With the way it&#8217;s currently configured we&#8217;ve got <a href="https://hudson.mozilla.org/job/preview.addons.mozilla.org/512/cobertura/?">code coverage trending</a>, <a href="https://hudson.mozilla.org/job/preview.addons.mozilla.org/512/testReport/?">unit test trending</a>, <a href="https://hudson.mozilla.org/job/preview.addons.mozilla.org/512/violations/?">code quality trending</a>, as well as detailed reports for all the above for every single check in.</p>
<p>If anything fails or oversteps a threshold our IRC bot complains and we can get it fixed up quickly.  It&#8217;s a boon to productivity to know that all the code being checked in is being tested automatically, plus it gives everyone a stable state to compare to.</p>
<p>Thanks to everyone that helped get Hudson going, from the <a href="http://blog.hudson-ci.org/">people that write it</a>, to <a href="http://blog.mozilla.com/it/">the IT team that keeps it alive</a>, to <a href="http://blog.mozilla.com/webdev/">the webdev team</a> that helped work out the kinks.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2010/04/07/continuous-integration-comes-to-amo/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Maintaining localization between Python and PHP (it&#8217;s not fun)</title>
		<link>http://micropipes.com/blog/2010/03/08/maintaining-localization-between-python-and-php-its-not-fun/</link>
		<comments>http://micropipes.com/blog/2010/03/08/maintaining-localization-between-python-and-php-its-not-fun/#comments</comments>
		<pubDate>Mon, 08 Mar 2010 22:42:00 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[L10n]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=115</guid>
		<description><![CDATA[I reached my hand into the barrel of problems our migration to Python is going to cause and came up with Localization.  It figures.
First out of the chute was the .po files.  It turns out the actual formatting is different between the two languages.  PHP uses %1$s for its substitutions, but python [...]]]></description>
			<content:encoded><![CDATA[<p>I reached my hand into the barrel of problems <a href="http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/">our migration to Python</a> is going to cause and came up with Localization.  It figures.</p>
<p>First out of the chute were the .po files.  It turns out the actual formatting is different between the two languages.  PHP uses <em>%1$s</em> for its substitutions, but Python uses either named variables like <em>%(num)s</em> or positional indexes like <em>{0}</em>.  For the record, they both support <em>%s</em> when you don&#8217;t need to order the substitutions.<br />
PHP example:<br />
<code>I have %2$s apples and %1$s oranges</code><br />
Python example:<br />
<code>I have {1} apples and {0} oranges</code></p>
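<p>As a rough sketch of the conversion (this is not the actual phppo2pypo code; the function name and regular expression are made up for illustration), the main thing to get right is that PHP counts its positional arguments from 1 while Python counts from 0:</p>

```python
import re

# Matches PHP positional placeholders such as %1$s or %2$d.
PHP_POSITIONAL = re.compile(r"%(\d+)\$[sd]")


def php_to_python(text):
    """Convert PHP placeholders (1-based) to Python str.format fields (0-based)."""
    # Escape literal braces first so str.format() does not misread them.
    escaped = text.replace("{", "{{").replace("}", "}}")
    return PHP_POSITIONAL.sub(lambda m: "{%d}" % (int(m.group(1)) - 1), escaped)


converted = php_to_python("I have %2$s apples and %1$s oranges")
print(converted)                    # I have {1} apples and {0} oranges
print(converted.format("3", "5"))   # I have 5 apples and 3 oranges
```

<p>A real converter also has to handle plural forms and the .po headers, which this sketch ignores.</p>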
<p>Since I&#8217;ve worked with the <a href="http://translate.sourceforge.net/wiki/">Translate Toolkit</a> before, I decided to write a script to convert between the two formats.  If you find yourself in the same unfortunate boat as me, behold<br />
<a href="http://translate.svn.sourceforge.net/viewvc/translate/src/trunk/translate/tools/phppo2pypo.py?view=markup">phppo2pypo</a> and <a href="http://translate.svn.sourceforge.net/viewvc/translate/src/trunk/translate/tools/pypo2phppo.py?view=markup">pypo2phppo</a> to convert between the two types.</p>
<p>Crisis averted, right?  Oh, that&#8217;s just scratching the surface.  Remember <a href="http://micropipes.com/blog/2008/07/09/adding-context-to-amo-po-files/">how happy I was that PHP finally started supporting msgctxt</a>?  Well, Python has had <a href="http://bugs.python.org/issue2504">a patch for it since 2008</a> but no one has bothered to land it.  I wrote a new <a href="http://github.com/clouserw/tower/blob/master/l10n/__init__.py">ugettext() and ungettext()</a> that recognizes context in the .po files.  To use simply do: <em>from l10n import ugettext as _</em> at the top of your file.</p>
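<p>To give an idea of what context-aware lookup involves, here is a minimal sketch. It relies on the GNU gettext convention of storing a contextual message under the key <em>msgctxt + "\x04" + msgid</em>; the function name, the <em>DictTranslations</em> stand-in class, and the sample strings are assumptions for illustration, not the code linked above:</p>

```python
import gettext

# GNU gettext joins msgctxt and msgid with an EOT character in the catalog.
CONTEXT_SEPARATOR = "\x04"


class DictTranslations(gettext.NullTranslations):
    """Hypothetical stand-in for a compiled .mo catalog, backed by a dict."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        # Like real catalogs, return the message unchanged when it is unknown.
        return self._catalog.get(message, message)


def ugettext_with_context(translations, msgid, context=None):
    """Look up msgid, optionally qualified by a msgctxt."""
    if context is None:
        return translations.gettext(msgid)
    key = context + CONTEXT_SEPARATOR + msgid
    translated = translations.gettext(key)
    # An unchanged key means no translation exists; fall back to the bare msgid.
    return msgid if translated == key else translated


catalog = DictTranslations({"verb\x04Post": "Publier", "Post": "Article"})
print(ugettext_with_context(catalog, "Post", context="verb"))  # Publier
print(ugettext_with_context(catalog, "Post"))                  # Article
```

<p>The fallback matters: without it, an untranslated string with a context would be displayed with the context and separator glued to the front.</p>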
<p>Along with adding msgctxt support, those two functions also collapse consecutive white space.  We&#8217;re using <a href="http://jinja.pocoo.org/2/">Jinja2</a> with <a href="http://babel.edgewall.org/">Babel</a> and the <a href="http://jinja.pocoo.org/2/documentation/extensions">i18n extension</a> as our template engine.  Jinja2 has a concept of stripping white space from the beginning or end of a string but does nothing about the middle.  A paragraph of text in a Jinja2 template would look like:<br />
<code><br />
  {% trans -%}Mozilla is providing links to these applications<br />
  as a courtesy, and makes no representations regarding the<br />
  applications or any information related thereto. Any questions,<br />
  complaints or claims regarding the applications must be<br />
  directed to the appropriate software vendor.<br />
  {%- endtrans %}<br />
</code></p>
<p>That&#8217;s a decent looking template, right?  Yeah, well, when Babel extracts that, it includes all the line breaks too, giving you something <a href="http://bitbucket.org/plurk/solace/src/tip/solace/i18n/messages.pot#cl-625">like this</a>.  The localizers would revolt if I sent them that, so I added in auto white-space collapsing.  Getting Babel to use the new functions means <a href="http://github.com/clouserw/tower/blob/master/tower/management/commands/extract.py">a new extraction script</a>.</p>
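<p>The white-space collapsing itself is the easy part. A minimal sketch (the function name is hypothetical, and the real extraction script does more than this):</p>

```python
import re


def collapse_whitespace(message):
    """Replace every run of spaces, tabs, and newlines with a single space."""
    return re.sub(r"\s+", " ", message).strip()


raw = """Mozilla is providing links to these applications
  as a courtesy, and makes no representations regarding the
  applications or any information related thereto."""

print(collapse_whitespace(raw))
```

<p>The same function has to run both at extraction time and at lookup time, otherwise the msgid in the .po file will not match the string the template asks for.</p>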
<p>At this point, we&#8217;re extracting strings from our new code and we can convert between Python and PHP files.  All we need now is a Frankenstein mix of xgettext functions to act as glue.  Meet the <a href="http://github.com/clouserw/tower/blob/master/l10n/management/commands/amalgamate.py">amalgamate script</a> that uses the pypo2phppo script, concatenates the .pot files, and merge-updates each locale&#8217;s .po file.  After that it&#8217;s <a href="http://viewvc.svn.mozilla.org/vc?revision=63671&#038;view=revision">quick tweaks to the build scripts</a> to create z-messages.po files and we&#8217;re done.</p>
<p>So, all that said, the new process for L10n, while we&#8217;re in this transitional phase, is:</p>
<ol>
<li>From the PHP code, run <em>locale/extract-po-remora.sh</em>.  That pulls everything from all the PHP files, creates <em>locale/r-keys.pot</em>, updates the messages.po file for each locale, and compiles them.  Life used to be so simple.</li>
<li>From the Python code, make sure you&#8217;re up to date, then run <em>./manage.py extract</em>.  That will pull everything from the Python code and templates and create <em>locale/z-keys.pot</em>.</li>
<li>Run <em>./manage.py amalgamate</em>.  That will merge the z-keys.pot into the PHP messages.po files.</li>
<li>Localizers can make their changes as usual, and commit back to messages.po.</li>
<li>From PHP, <em>locale/copy-to-zamboni.py locale</em> will create z-messages.po files in the Python format. We could skip right to .mo files, but in case something goes wrong I want to see the .po files.</li>
<li>Then, like today, <em>locale/compile-mo.sh locale</em> will compile all the .po files.</li>
</ol>
<p>After all those steps are done, we&#8217;ve got duplicate .mo files, aside from formatting, and each application can look at its own .mo to get the strings it needs.  All this code is just a big band-aid and there are plenty of things that are more fun than juggling L10n between two applications across two <abbr title="Revision Control Systems">RCS</abbr>s.  But we knew what we were getting into.  I&#8217;ll post something more positive later to help justify it. <img src='http://micropipes.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2010/03/08/maintaining-localization-between-python-and-php-its-not-fun/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>AMO Development Changes in 2010</title>
		<link>http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/</link>
		<comments>http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 21:44:12 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[CakePHP]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[hindsight]]></category>
		<category><![CDATA[L10n]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[SVN]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=98</guid>
		<description><![CDATA[The AMO team met in Mountain View last week to develop a 2010 plan.  We&#8217;ve been wanting to change some key areas of our development flow for a while but we needed to make sure time was budgeted in the overall AMO and Mozilla goals.  As usual, the timeline will be tight, but [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="https://addons.mozilla.org/"><abbr title="addons.mozilla.org">AMO</abbr></a> team met in Mountain View last week to develop a 2010 plan.  We&#8217;ve been wanting to change some key areas of our development flow for a while but we needed to make sure time was budgeted in the overall AMO and Mozilla goals.  As usual, the timeline will be tight, but the AMO developers do amazing work and as our changes are implemented, development should just get faster.  I&#8217;ll give a brief summary of the changes we&#8217;re planning; a lot of discussion went into this and I&#8217;m not going to be able to cover everything here.  If you&#8217;ve been in the AMO calls or reading the notes you probably already know most of this.</p>
<h3>Migrating from CakePHP to Django</h3>
<p>This is a big undertaking and we&#8217;ve been discussing it for quite a while.  We&#8217;re currently the highest-trafficked site on the internet using <a href="http://cakephp.org/">CakePHP</a> and along with that we&#8217;ve run into a lot of frustrating issues.  CakePHP has served AMO well for several years, so it&#8217;s not my intention to bad-mouth it here, but I do want to give a fair summary of why we&#8217;re moving on.  Please also note that <em>AMO is still running on CakePHP 1.1, which is, I think, a year out of date</em>.  Three substantial issues:</p>
<ul>
<li><strong>Useful Database Abstraction Layer:</strong>  CakePHP has a concept of database abstraction, but we didn&#8217;t find it powerful enough.  When it did work it would return enormous nested arrays of data causing massive CPU and memory usage (out of memory errors plague us on AMO).  When it didn&#8217;t work, we&#8217;d end up doing queries directly which kind of defeats the purpose.  We couldn&#8217;t use prepared statements so we&#8217;d have to escape variables ourselves.  There was no effective caching built-in and since we just had huge arrays as a response there was no effective way to invalidate the cache we were using (see: <a href="http://micropipes.com/blog/2008/04/23/caching-is-easy-expiration-is-hard/">Caching is easy; Expiration is hard</a>).  The DB layer should return objects that are easy to cache and easy to invalidate.  The built-in Django database classes (combined with memcache) should work fine for us here.</li>
<li><strong>Effective unit tests:</strong>  I&#8217;ve <a href="http://micropipes.com/blog/2009/04/09/addonsmozillaorg-celebrates-1000-passing-unit-tests/">beat the drum about our unit tests before</a> but the simple matter is that it&#8217;s really difficult to do them right with the tools we are using.  Our test data is already very limited, but if we try to run all our tests right now they&#8217;ll run out of memory (and take forever).  The CakePHP method of mocking controllers and models was inadequate for what we needed and difficult to deal with.  We want our unit tests to run quickly, from the command line, and be independent from each other so there aren&#8217;t intermittent problems to waste our time with.  We&#8217;ll be using Django&#8217;s <a href="http://docs.djangoproject.com/en/dev/topics/testing/">built-in testing framework</a>.</li>
<li><strong>Better debugging:</strong>  Debugging in CakePHP amounts to defining a DEBUG level and seeing what is printed on the screen (usually the giant arrays).  We supplemented this with <a href="http://www.xdebug.org/">Xdebug</a> where we needed it, but that&#8217;s still not enough.  A framework should have excellent logging and on-the-fly debugging that displays a full traceback (often something will fail deep within CakePHP and we&#8217;ll get the file/line where PHP gave up, but not the line in our code that started the problem), the values of variables, the page headers, server settings, SQL that was run, what views and elements are in use, etc.  We&#8217;re planning on using a combination of <a href="http://docs.python.org/library/pdb.html">pdb</a>, <a href="http://ipython.scipy.org/moin/">IPython</a>, and the <a href="http://robhudson.github.com/django-debug-toolbar/">django-debug-toolbar</a> to make all of this easily accessible while developing.</li>
</ul>
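<p>To make the caching point concrete, here is a minimal sketch of what &#8220;easy to cache and easy to invalidate&#8221; means when the database layer hands back one object per row. A plain dict stands in for memcache, and the class, key format, and loader are all hypothetical; none of this is Django API:</p>

```python
class ObjectCache:
    """In-memory stand-in for memcache, keyed by model name and primary key."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(model, pk):
        return "%s:%s" % (model, pk)

    def get_or_load(self, model, pk, loader):
        """Return the cached object, calling loader(pk) only on a miss."""
        k = self.key(model, pk)
        if k not in self._store:
            self._store[k] = loader(pk)
        return self._store[k]

    def invalidate(self, model, pk):
        """Drop exactly one object, e.g. right after it has been saved."""
        self._store.pop(self.key(model, pk), None)


calls = []


def load_addon(pk):
    # Pretend this is the expensive database query.
    calls.append(pk)
    return {"id": pk, "name": "Example add-on"}


cache = ObjectCache()
cache.get_or_load("addon", 1, load_addon)
cache.get_or_load("addon", 1, load_addon)  # served from the cache
cache.invalidate("addon", 1)
cache.get_or_load("addon", 1, load_addon)  # reloaded after invalidation
print(len(calls))  # 2
```

<p>With one giant nested array per page there is no such key to invalidate, which is the problem described in the first bullet.</p>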
<p>Those are the major issues we&#8217;re having right now, but if you want to dig into the comparison some more check out our <a href="https://wiki.mozilla.org/AMO:v4">discussion wiki pages</a>, but realize the majority of discussion happened in person.</p>
<h3>Moving away from <abbr title="Subversion">SVN</abbr></h3>
<p>We moved AMO into SVN in 2006 and it&#8217;s treated us relatively well.  Somewhere along the line, we decided to tag our production versions at a revision of trunk instead of keeping a separate tag and merging changes into it.  It&#8217;s worked for us but it&#8217;s a hard cutoff on code changes, which means that while we&#8217;re in a code freeze no one can check anything in to trunk.  As we begin to branch for larger projects this will become more of a hassle, so I&#8217;m planning on going back to a system where a production tag is created and changes are merged into it as they are ready to go live.</p>
<p>Most of the development team has been using <a href="http://kernel.org/pub/software/scm/git/docs/git-svn.html">git-svn</a> for several months and, aside from the commands being far more verbose, we haven&#8217;t had many complaints.  We&#8217;ve discovered <a href="http://git-scm.com/">Git</a> is a much more powerful development tool and we expect to use it directly starting some time next year.  As of now, we expect to maintain the /locales/ directory in SVN so this change doesn&#8217;t affect localizers but we&#8217;ll keep people notified if there are any changes to that process.</p>
<h3>Continuous Integration</h3>
<p>I mentioned excellent testing being one of the reasons we&#8217;re moving to Django.  Along with that testing is the opportunity for continuous integration.  We plan on using <a href="https://hudson.dev.java.net/">Hudson</a> as the framework for our continuous integration.  With excellent test coverage and quick feedback from Hudson this should drastically lower our regressions and boost our confidence when we deploy.  Speaking of which&#8230;</p>
<h3>Faster Deployment</h3>
<p>For most of 2009 we&#8217;ve pushed on 3-week cycles: 2 weeks of development, then 1 week of <abbr title="Quality Assurance">QA</abbr> and <abbr title="Localization">L10n</abbr>.  Delays and regressions being what they are, I think we averaged a little better than a push a month.  This is a fairly rapid cycle for a lot of development shops, but I feel like it&#8217;s holding us back.  We&#8217;ve heard a lot of success stories about shorter cycles and I&#8217;d like to aim for deploying (optionally, of course) a few times per week.  By shortening the development cycle we reduce the stress on:</p>
<ul>
<li><strong>the developers:</strong>  Everyone likes to see what they&#8217;ve done go out quicker, and smaller patches mean fewer conflicts with others.</li>
<li><strong>the QA team:</strong> Right now we dump 2 weeks of work on them and say we need it done right away.  With smaller cycles they can verify small changes as they go and not be overwhelmed.</li>
<li><strong>the infrastructure team:</strong> Smaller changes mean less to go wrong, and with a continuous integration server and some automation they can have minimal involvement with the whole process.</li>
<li><strong>the localizers:</strong> Every time we release we dump a bunch of changes on these fantastic people and tell them we need them back in a week.  Most of the time they plow forward and get them done on time.  If they don&#8217;t, though, they are stuck waiting for the next 3-week cycle.  If we push often, it&#8217;s not a big deal.</li>
<li><strong>the product managers:</strong> These guys come up with crazy ideas for us to implement and then they stare at graphs and numbers to see if it worked.  With shorter cycles they can get faster feedback about what works and what doesn&#8217;t.</li>
<li><strong>the users:</strong> Faster release cycles mean bugs that are fixed in the repository are fixed on the live site sooner.  &#8217;nuff said.</li>
</ul>
<h3>Process Data Offline</h3>
<p>Much of AMO relies on cron jobs to get things done.  All the statistics, add-on download numbers, how popular an add-on is, all the star rating calculations, any cleanup or maintenance tasks &#8211; these are all run via cron and they are so intensive that the database has trouble keeping up.  We&#8217;re planning on utilizing <a href="http://gearman.org/">Gearman</a> to farm all this work out to other machines in incremental pieces instead of single huge queries.  Any heavy calculating that can be done offline will be moved to these external processors which should help improve the speed of the site and make all our statistics more reliable (as currently the cron jobs have a tendency to fail before they are complete).</p>
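<p>The &#8220;incremental pieces&#8221; idea can be sketched without any job server at all. In the snippet below the batch size and function names are assumptions, and a plain loop stands in for handing each batch to a worker; it is not Gearman API:</p>

```python
def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def average_rating(ratings, batch_size=1000):
    """Compute an average from per-batch partial sums instead of one huge pass."""
    total = 0
    count = 0
    for batch in batches(ratings, batch_size):
        # Each batch is small enough to be processed by a separate worker,
        # and a failed batch can be retried without redoing the others.
        total += sum(batch)
        count += len(batch)
    return total / count if count else 0.0


print(list(batches([1, 2, 3, 4, 5], 2)))          # [[1, 2], [3, 4], [5]]
print(average_rating([5, 4, 3, 4], batch_size=3))  # 4.0
```

<p>The point of the design is that a failure costs one batch of work rather than the whole cron run.</p>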
<h3>Improve the Documentation</h3>
<p>Documentation is a noble goal of many developers but it rarely gets enough attention.  We evaluated our <a href="https://wiki.mozilla.org/AMO:Developers">current documentation</a> and found it is woefully out of date.  Because it lives on a rarely used wiki, it only gets updated when someone tries to use it and sees it&#8217;s not right.  We&#8217;re hoping to change that by moving the developer documentation into the code repository itself.  We&#8217;ll be able to integrate with generated API docs, style the docs however we want, and check in changes right along with our code patches.  When someone checks out a copy of AMO, they&#8217;ll get all the documentation right along with it.  We&#8217;ll use <a href="http://sphinx.pocoo.org/">Sphinx</a> to build the docs.</p>
<p>The outline above details several large, high-level changes but there are a lot of other plans for smaller improvements as well.  This post got a lot longer than I was expecting, but I&#8217;m really excited about the direction AMO is headed for 2010.  As these changes are implemented the site will become more responsive and reliable, and we&#8217;ll be able to adapt to the needs of Mozilla&#8217;s users even faster.  As always, feedback and discussion are welcome and stay tuned for further back end improvements.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/feed/</wfw:commentRss>
		<slash:comments>37</slash:comments>
		</item>
		<item>
		<title>Add-on Localization Completeness Script is on AMO</title>
		<link>http://micropipes.com/blog/2009/10/22/add-on-localization-completeness-script-is-on-amo/</link>
		<comments>http://micropipes.com/blog/2009/10/22/add-on-localization-completeness-script-is-on-amo/#comments</comments>
		<pubDate>Thu, 22 Oct 2009 15:40:18 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[add-ons]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[L10n]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=88</guid>
		<description><![CDATA[The add-on verification suite launched a few months ago and has been refined with each subsequent milestone.  We&#8217;ve changed what it searches for based on feedback and our own findings and earlier this month we made it available to anyone on AMO, not just a hosted add-on&#8217;s authors.
The framework was written in an extensible [...]]]></description>
			<content:encoded><![CDATA[<p>The add-on verification suite launched a few months ago and has been refined with each subsequent milestone.  We&#8217;ve changed what it searches for based on feedback and our own findings and earlier this month we made it <a href="https://addons.mozilla.org/en-US/developers/addon/validate">available to anyone on AMO</a>, not just a hosted add-on&#8217;s authors.</p>
<p>The framework was written in an extensible way so in addition to tweaking the built-in searches, we could also leverage external scripts.  The first such script that is making it to the live site is <a href="http://koala.mozdev.org/drupal/blog/7216">Adrian Kalla&#8217;s</a> <a href="http://hg.mozilla.org/users/akalla_aviary.pl/silme-patched">localization completeness</a> check.  This script attempts to parse and record all the English string files as a baseline.  Then it looks at each locale and reports any missing files, missing translations, or untranslated strings (translations that exist in the locale but are the same as English).</p>
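<p>The three kinds of findings can be illustrated with a small sketch. This is not the actual script; it assumes the string files have already been parsed into dictionaries, and the function name, file names, and sample strings are made up:</p>

```python
def check_locale(baseline, locale):
    """Compare a locale against the English baseline.

    Both arguments map file name -> {string id -> text}.
    """
    report = {"missing_files": [], "missing_strings": [], "untranslated": []}
    for filename, strings in sorted(baseline.items()):
        if filename not in locale:
            report["missing_files"].append(filename)
            continue
        translated = locale[filename]
        for key, english in sorted(strings.items()):
            if key not in translated:
                report["missing_strings"].append((filename, key))
            elif translated[key] == english:
                # Present in the locale but identical to the English text.
                report["untranslated"].append((filename, key))
    return report


en = {
    "main.dtd": {"title": "Options", "save": "Save", "cancel": "Cancel"},
    "about.dtd": {"about": "About"},
}
de = {"main.dtd": {"title": "Optionen", "save": "Save"}}

print(check_locale(en, de))
```

<p>Note that the &#8220;untranslated&#8221; check is a heuristic: some strings are legitimately the same in both languages, so those results are best treated as warnings.</p>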
<p>If you validate an extension now and only have partial L10n coverage, scroll down to the new <abbr title="Localization">L10n</abbr> section and you should see something like this:<br />
<img src="http://micropipes.com/blog/wp-content/img/l10n_validation.png" alt="Screen shot of the validation tool" /></p>
<p>Thanks to RJ and Adrian for doing all the work on this.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/10/22/add-on-localization-completeness-script-is-on-amo/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Using substitution strings in .po files</title>
		<link>http://micropipes.com/blog/2009/09/01/using-substitution-strings-in-po-files/</link>
		<comments>http://micropipes.com/blog/2009/09/01/using-substitution-strings-in-po-files/#comments</comments>
		<pubDate>Tue, 01 Sep 2009 20:08:00 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[hindsight]]></category>
		<category><![CDATA[L10n]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=82</guid>
		<description><![CDATA[A couple years ago I recommended using fake msgid&#8217;s in .po files and was, predictably, met with some argument.   I suggested using this hack because there wasn&#8217;t a standard way to store context in a .po file yet.[1]
Since that time msgctxt has become a standard part of gettext and makes my substitution string [...]]]></description>
			<content:encoded><![CDATA[<p>A couple years ago <a href="http://micropipes.com/blog/2007/07/26/ten-tips-for-website-localization/">I recommended using fake msgid&#8217;s in .po files</a> and was, predictably, met with some argument.   I suggested using this hack because there wasn&#8217;t a standard way to store context in a .po file yet.[1]</p>
<p>Since that time <em>msgctxt</em> has become a standard part of gettext and makes my substitution string recommendation obsolete.  I wanted to officially come out and say: substitution strings are a pain.  The scripts we used made it manageable but finding strings in the code meant searching through the .po and not only was this painful for our developers but I think it confused contributors as well.</p>
<p>In our latest release, we&#8217;ve converted <a href="https://addons.mozilla.org/">AMO</a> to use regular .po files.  On the off chance someone followed my advice and would like to convert their site to regular .po files as well, Zbigniew Braniecki <a href="http://diary.braniecki.net/2009/08/19/amo-loses-accent/">wrote a bit about the process</a> and you can grab his scripts (and read about the troubles I had) at <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=501988">bug 501988</a>.</p>
<p>Since I tagged this post <em>hindsight</em> I guess I should look back and conclude something too.  Would I do it again?  At the time, there were no great alternatives.  So, yeah, I would get more opinions on whether the Gnome/KDE method was better, but I would do it again.  I think it was the best choice out of several poor ones, but I&#8217;m still very happy to be rid of substitution strings.</p>
<p>[1] To be fair, there <em>was</em> a more common way, used by some KDE and Gnome projects, which was to put a delimiter in the msgid and keep the context on one side and the original string on the other.  This is also pretty hacky and you can <a href="http://live.gnome.org/GnomeGoals/MsgctxtMigration">read about Gnome&#8217;s migration away from that method</a> if you&#8217;re curious.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/09/01/using-substitution-strings-in-po-files/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Top 50 searches on addons.mozilla.org</title>
		<link>http://micropipes.com/blog/2009/05/26/top-50-searches-on-addonsmozillaorg/</link>
		<comments>http://micropipes.com/blog/2009/05/26/top-50-searches-on-addonsmozillaorg/#comments</comments>
		<pubDate>Tue, 26 May 2009 21:13:12 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[add-ons]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[statistics]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=80</guid>
		<description><![CDATA[The flight from Portland to San Jose is just about the right length to write some scripts to analyze a bunch of data, make a pretty graph, and then write a blog post drawing fairly obvious conclusions.  Someone on IRC said they were interested in the top search terms being used on addons.mozilla.org so [...]]]></description>
			<content:encoded><![CDATA[<p>The flight from Portland to San Jose is just about the right length to write some scripts to analyze a bunch of data, make a pretty graph, and then write a blog post drawing fairly obvious conclusions.  Someone on <abbr title="Internet Relay Chat">IRC</abbr> said they were interested in the top search terms being used on <a href="https://addons.mozilla.org/">addons.mozilla.org</a> so here we are.</p>
<p>During the week of April 29 to May 5, 2009, there were around 150,000 queries.  Of the top 20 queries on addons.mozilla.org (a quick estimate says they account for around 12% of the total queries on the site) only 7 actually have search terms.  The rest just choose different search options, like category or number of results per page.  If we filter the top queries for ones that include search terms we get a graph that looks like this:</p>
<p><img src="/blog/wp-content/img/amo.searches.graph.05.2009.png" title="Top 50 search terms and their rankings on addons.mozilla.org" /></p>
<p>All the searches on that page are for the <abbr title="English (US)">en-US</abbr> locale unless otherwise noted.  It looks like the majority of searches are for specific add-ons but there are also some popular generic terms like <em>download</em>, <em>gmail</em>, and <em>video</em>.  I think it&#8217;s interesting that German was the only other locale to make the list (and fairly high up on the list).  Maybe the next stats post will be about overall locale use.</p>
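<p>For the curious, the filtering step can be sketched in a few lines of Python.  This is not the script used for the graph above, just a minimal illustration: it assumes each logged request is a URL whose search term lives in a <em>q</em> parameter (the parameter name is a guess), and it lowercases terms so that <em>Gmail</em> and <em>gmail</em> are counted together.</p>
<pre><code class="python">
from collections import Counter
from urllib.parse import parse_qs, urlsplit


def top_search_terms(request_urls, limit=50):
    """Count non-empty search terms and return the most common ones.

    Assumes the term is carried in a 'q' query parameter (hypothetical).
    Requests that only set options such as category are skipped.
    """
    counts = Counter()
    for url in request_urls:
        # parse_qs drops parameters with blank values by default.
        params = parse_qs(urlsplit(url).query)
        for term in params.get('q', []):
            term = term.strip().lower()
            if term:
                counts[term] += 1
    return counts.most_common(limit)
</code></pre>
<p>Requests that carry only options never reach the counter, which is the same filter that separates the 7 real searches from the rest of the top 20.</p>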
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/05/26/top-50-searches-on-addonsmozillaorg/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>addons.mozilla.org Celebrates 1000 (passing) Unit Tests</title>
		<link>http://micropipes.com/blog/2009/04/09/addonsmozillaorg-celebrates-1000-passing-unit-tests/</link>
		<comments>http://micropipes.com/blog/2009/04/09/addonsmozillaorg-celebrates-1000-passing-unit-tests/#comments</comments>
		<pubDate>Thu, 09 Apr 2009 23:16:35 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[testing]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=79</guid>
		<description><![CDATA[We started writing unit tests for AMO a few years ago with the best of intentions.  As the tests grew we started running into memory/timeout problems that prevented us from running the tests.  Other priorities took over and since we couldn&#8217;t run the tests we quit writing them.  The tests got put [...]]]></description>
			<content:encoded><![CDATA[<p>We started writing unit tests for <a href="https://addons.mozilla.org/"><abbr title="addons.mozilla.org">AMO</abbr></a> a few years ago with the best of intentions.  As the tests grew we started running into memory/timeout problems that prevented us from running the tests.  Other priorities took over and since we couldn&#8217;t run the tests we quit writing them.  The tests got put on the back burner, became stale, and were for the most part forgotten (an all too familiar story for most developers).</p>
<p>Over the past few months we&#8217;ve been turning that around.  While it&#8217;s certainly a team effort, it&#8217;s not stretching the truth to say that <a href="http://blog.jeffbalogh.org/">Jeff Balogh</a> has been the driving force behind making sure our framework can scale and getting our old tests running again.  Thanks to his tireless efforts our latest numbers show over <strong>1200 unit tests</strong>, 1065 of which are passing.</p>
<p>In an effort to prevent them from being forgotten again he also <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=479905">created an IRC bot named bosley</a> who tracks the tests and reminds people when they fail.  Expect to see bosley in <a href="irc://irc.mozilla.org/#amo">#amo</a> soon.</p>
<p>Having this many tests, and a bot continuously monitoring them, is a huge milestone for AMO and Mozilla WebDev.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/04/09/addonsmozillaorg-celebrates-1000-passing-unit-tests/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>The Tagging Plan for AMO</title>
		<link>http://micropipes.com/blog/2009/03/06/the-tagging-plan-for-amo/</link>
		<comments>http://micropipes.com/blog/2009/03/06/the-tagging-plan-for-amo/#comments</comments>
		<pubDate>Fri, 06 Mar 2009 20:25:09 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[L10n]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=76</guid>
		<description><![CDATA[Firstly, thanks for all the great feedback.  Something as seemingly simple as tagging gets complex quickly when thought out and the varied perspectives of the community are always great to have.
Allowing full Unicode would let anyone use meaningful tags in their own character sets but would prevent us from offering similar matches and common [...]]]></description>
			<content:encoded><![CDATA[<p>Firstly, thanks for <a href="http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/#comments">all the great feedback</a>.  Something as seemingly simple as tagging gets complex quickly when thought out and the varied perspectives of the community are always great to have.</p>
<p>Allowing full Unicode would let anyone use meaningful tags in their own character sets but would prevent us from offering similar matches and common misspellings.  On the other hand, we support several languages on <a href="https://addons.mozilla.org"><abbr title="addons.mozilla.org">AMO</abbr></a> that don&#8217;t use the Latin alphabet.  If we restricted tags to Latin characters, it stands to reason that those users would search for tags in their own character sets and get no results.  There are pros and cons for each choice but we&#8217;re essentially debating the value of normalization in tagging.</p>
<p>After distilling all the feedback and talking amongst ourselves our overall feeling was that forcing people to convert their input into the Latin alphabet wasn&#8217;t in the users&#8217; best interest.  The <a href="http://www.mozilla.org/about/manifesto">Mozilla Manifesto</a> talks about a global internet that fosters creativity and free expression.  Not supporting a user&#8217;s native language when we have the option to doesn&#8217;t feel like the right path to take.</p>
<p>With that in mind our current plan is as follows:</p>
<ul>
<li>Allow full Unicode in tags to the extent we do everywhere else on AMO.</li>
<li>Do no automatic character normalization.  The option of manual normalization (essentially, marking some tags as equivalent) is left open as a future enhancement.</li>
<li>Do automatic white space and capitalization normalization.  Spaces are displayed on an add-on&#8217;s page, but they are unnecessary when searching or when entering a tag into a <abbr title="Uniform Resource Locator">URL</abbr>.  For example, <em>newyork</em>, <em>new york</em> and <em>New YORk</em> are all equivalent.</li>
<li>A list of suggestions will be provided as the user types.  We may attempt some simplistic character normalization in the suggestions if we can come up with an approach that provides enough value to keep using (perhaps something that is per-language).</li>
<li>White space is trimmed from the beginning and end of tags before they are saved into the database.</li>
<li>Tags are limited to 128 characters and add-ons are limited to 80 tags.</li>
<li>Tags will be comma delimited.  To include a comma in your tag you must use quotation marks.  Quotation marks, whether they are matched or not, are discarded.  Example:  <em>&#8220;Portland, OR&#8221;</em> will become <em>Portland, OR</em> whereas <em>Portland&#8221;, OR</em> will become <em>Portland</em> and <em>OR</em>.</li>
</ul>
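<p>The rules above can be sketched in Python as follows.  This is an illustration of the plan, not the AMO implementation: the function names are made up, dropping empty tags is an assumption, and treating a quotation mark as &#8220;matched&#8221; only when another one follows it is one reading of the examples in the last bullet.</p>
<pre><code class="python">
MAX_TAG_LENGTH = 128      # characters per tag, from the plan above
MAX_TAGS_PER_ADDON = 80   # tags per add-on, from the plan above


def parse_tags(raw):
    """Split comma-delimited input into tags, discarding quotation marks."""
    tags = []
    current = []
    in_quotes = False
    for i, ch in enumerate(raw):
        if ch == '"':
            if in_quotes:
                in_quotes = False
            elif '"' in raw[i + 1:]:
                # Only a quote with a partner later on protects commas.
                in_quotes = True
            continue  # quotes are never stored, matched or not
        if ch == ',' and not in_quotes:
            tags.append(''.join(current))
            current = []
        else:
            current.append(ch)
    tags.append(''.join(current))
    # Trim surrounding white space; dropping empties is an assumption.
    return [tag.strip() for tag in tags if tag.strip()]


def match_key(tag):
    """Key used for searching and URLs: no white space, lower case."""
    return ''.join(tag.split()).lower()


def check_limits(tags):
    """Raise ValueError if the list breaks the count or length limits."""
    if len(tags) > MAX_TAGS_PER_ADDON:
        raise ValueError('too many tags')
    for tag in tags:
        if len(tag) > MAX_TAG_LENGTH:
            raise ValueError('tag too long')
</code></pre>
<p>The tag is stored as typed (after trimming) and <em>match_key</em> is computed separately, so the add-on&#8217;s page can still show <em>New York</em> while searches for <em>newyork</em> find it.</p>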
<p>Additional feedback, as always, is welcome.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/03/06/the-tagging-plan-for-amo/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Some considerations when adding Tags to AMO</title>
		<link>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/</link>
		<comments>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/#comments</comments>
		<pubDate>Mon, 02 Mar 2009 23:43:12 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[CakePHP]]></category>
		<category><![CDATA[L10n]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=75</guid>
		<description><![CDATA[Tags broke into the limelight around the time &#8220;Web 2.0&#8221; was becoming popularized.  They provided a simple but effective way to categorize objects and many sites are using them now.  Despite their proliferation, I haven&#8217;t found any documentation on the internet regarding standards for implementing tags.  
A tag library exists for CakePHP [...]]]></description>
			<content:encoded><![CDATA[<p>Tags broke into the limelight around the time &#8220;Web 2.0&#8221; was becoming popularized.  They provided a simple but effective way to categorize objects and many sites are using them now.  Despite their proliferation, I haven&#8217;t found any documentation on the internet regarding standards for implementing tags.</p>
<p>A <a href="http://bakery.cakephp.org/articles/view/tag-cloud">tag library exists for CakePHP</a> but it, like many others, is too simplistic for what we want.</p>
<p>We&#8217;ve written our tagging goals into a plan but have some technical details we still need to figure out.  While reviewing what we have, a couple of questions arose that we thought people would have opinions on.</p>
<p>1) What should the range of allowed characters be?  Our first instinct was simplicity, something like <em>/[A-Za-z0-9-]/</em> (that is, all English letters and numbers and a dash).  This is easy to handle on our end but leaves out everyone who doesn&#8217;t want to add tags using the English alphabet.  There is some debate about how useful it would be to allow other Unicode characters, particularly when you think about #2 below.</p>
<p>2) Tags are most useful when they are normalized.  By allowing Unicode characters we run the risk of diluting our tag cloud.  For example, resume and résumé are close enough that for our purposes they are equivalent.  If we allow Unicode we&#8217;ll have to deal with converting characters like é to e and vice versa for searches.  At that point we&#8217;ll need a list of &#8220;equivalent&#8221; characters &#8211; not impossible but it will slow things down (both development and speed of a search).  The second question is:  Assuming you think we should allow Unicode characters, what characters are equivalents?  Here is a quick idea from <a href="http://php.oregonstate.edu/manual/en/function.strtr.php">php.net&#8217;s strtr() documentation</a>:</p>
<pre><code class="php">
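// Position-for-position map: each character in $a is replaced by the character at the same index in $b.
$a = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõöøùúûýýþÿŔŕ';
$b = 'aaaaaaaceeeeiiiidnoooooouuuuybsaaaaaaaceeeeiiiidnoooooouuuyybyRr';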
</code></pre>
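<p>For comparison, here is a minimal Python sketch of a different technique that needs no hand-written table: decompose each character with Unicode normalization and drop the combining accent marks.  This is a swapped-in alternative, not the strtr() approach above, and it only covers characters that decompose into a base letter plus accents.  Letters such as Æ, Ø, Ð, Þ and ß pass through unchanged and would still need an explicit mapping.</p>
<pre><code class="python">
import unicodedata


def strip_accents(text):
    """Remove combining accent marks, keeping the base characters."""
    # NFD splits an accented letter into its base letter plus marks.
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(ch for ch in decomposed if not unicodedata.combining(ch))
</code></pre>
<p>With this, <em>résumé</em> and <em>resume</em> reduce to the same search key while the tag itself can still be stored as entered.</p>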
<p>Some other aspects of our current plan are:</p>
<ul>
<li>Tags are not localizable in the same way as other strings on the site (like categories).  There isn&#8217;t anything stopping someone from using &#8220;WebDev&#8221; as a tag or creating a new tag with &#8220;WebDev&#8221; translated in their language.  However, there won&#8217;t be any relationship between the two translated tags.</li>
<li>Tags are separated by spaces.  Spaces within tags are allowed with quotes.</li>
<li>Spaces will be preserved when displaying a tag on the add-on&#8217;s page; however, they will be removed when displaying the tag in a URL and when doing logical operations on the back end, like searching.  This means a search for &#8220;Portland OR&#8221; will be collapsed to &#8220;PortlandOR&#8221; and will match either &#8220;Portland OR&#8221; or &#8220;PortlandOR&#8221; tags.  This is consistent with <a href="http://flickr.com/">Flickr</a>.</li>
<li>If Unicode is allowed we&#8217;ll preserve characters as they are entered even if we are actually searching on their &#8220;equivalents.&#8221;</li>
</ul>
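<p>The space-separated input and the collapsing of spaces can be sketched in Python like this.  It is only an illustration with made-up function names, and it leans on the standard library&#8217;s shell-style splitter, which is stricter than a real tag parser should be: it raises an error on an unmatched quote and also treats apostrophes and backslashes specially.</p>
<pre><code class="python">
import shlex


def split_tags(raw):
    """Split on spaces, keeping quoted phrases together as one tag."""
    return shlex.split(raw)


def collapse_spaces(tag):
    """Form used in URLs and searches: the tag with all spaces removed."""
    return ''.join(tag.split())
</code></pre>
<p>Entering <em>webdev "Portland OR"</em> yields the tags <em>webdev</em> and <em>Portland OR</em>, and both <em>Portland OR</em> and <em>PortlandOR</em> collapse to the same search form.</p>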
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/03/02/some-considerations-when-adding-tags-to-amo/feed/</wfw:commentRss>
		<slash:comments>15</slash:comments>
		</item>
	</channel>
</rss>
