<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>All Night Diner &#187; SVN</title>
	<atom:link href="http://micropipes.com/blog/tag/svn/feed/" rel="self" type="application/rss+xml" />
	<link>http://micropipes.com/blog</link>
	<description>because at 3am anything sounds good</description>
	<lastBuildDate>Mon, 03 May 2010 17:34:44 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>AMO Development Changes in 2010</title>
		<link>http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/</link>
		<comments>http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/#comments</comments>
		<pubDate>Tue, 17 Nov 2009 21:44:12 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[AMO]]></category>
		<category><![CDATA[caching]]></category>
		<category><![CDATA[CakePHP]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[hindsight]]></category>
		<category><![CDATA[L10n]]></category>
		<category><![CDATA[PHP]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[scalability]]></category>
		<category><![CDATA[SVN]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=98</guid>
		<description><![CDATA[The AMO team met in Mountain View last week to develop a 2010 plan.  We&#8217;ve been wanting to change some key areas of our development flow for a while but we needed to make sure time was budgeted in the overall AMO and Mozilla goals.  As usual, the timeline will be tight, but [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="https://addons.mozilla.org/"><abbr title="addons.mozilla.org">AMO</abbr></a> team met in Mountain View last week to develop a 2010 plan.  We&#8217;ve been wanting to change some key areas of our development flow for a while but we needed to make sure time was budgeted in the overall AMO and Mozilla goals.  As usual, the timeline will be tight, but the AMO developers do amazing work and as our changes are implemented, development should just get faster.  I&#8217;ll give a brief summary of the changes we&#8217;re planning; a lot of discussion went into this and I&#8217;m not going to be able to cover everything here.  If you&#8217;ve been in the AMO calls or reading the notes you probably already know most of this.</p>
<h3>Migrating from CakePHP to Django</h3>
<p>This is a big undertaking and we&#8217;ve been discussing it for quite a while.  We&#8217;re currently the highest trafficked site on the internet using <a href="http://cakephp.org/">CakePHP</a> and along with that we&#8217;ve run into a lot of frustrating issues.  CakePHP has served AMO well for several years, so it&#8217;s not my intention to bad-mouth it here, but I do want to give a fair summary of why we&#8217;re moving on.  Please also note that <em>AMO is still running on CakePHP 1.1, which is, I think, a year out of date</em>.  Three substantial issues stand out:</p>
<ul>
<li><strong>Useful Database Abstraction Layer:</strong>  CakePHP has a concept of database abstraction, but we didn&#8217;t find it powerful enough.  When it did work it would return enormous nested arrays of data causing massive CPU and memory usage (out of memory errors plague us on AMO).  When it didn&#8217;t work, we&#8217;d end up doing queries directly which kind of defeats the purpose.  We couldn&#8217;t use prepared statements so we&#8217;d have to escape variables ourselves.  There was no effective caching built-in and since we just had huge arrays as a response there was no effective way to invalidate the cache we were using (see: <a href="http://micropipes.com/blog/2008/04/23/caching-is-easy-expiration-is-hard/">Caching is easy; Expiration is hard</a>).  The DB layer should return objects that are easy to cache and easy to invalidate.  The built-in Django database classes (combined with memcache) should work fine for us here.</li>
<li><strong>Effective unit tests:</strong>  I&#8217;ve <a href="http://micropipes.com/blog/2009/04/09/addonsmozillaorg-celebrates-1000-passing-unit-tests/">beaten the drum about our unit tests before</a> but the simple fact is that it&#8217;s really difficult to do them right with the tools we are using.  Our test data is already very limited, but if we try to run all our tests right now they&#8217;ll run out of memory (and take forever).  The CakePHP method of mocking controllers and models was inadequate for what we needed and difficult to deal with.  We want our unit tests to run quickly, from the command line, and be independent of each other so there aren&#8217;t intermittent problems to waste our time with.  We&#8217;ll be using Django&#8217;s <a href="http://docs.djangoproject.com/en/dev/topics/testing/">built-in testing framework</a>.</li>
<li><strong>Better debugging:</strong>  Debugging in CakePHP amounts to defining a DEBUG level and seeing what is printed on the screen (usually the giant arrays).  We supplemented this with <a href="http://www.xdebug.org/">Xdebug</a> where we needed it, but that&#8217;s still not enough.  A framework should have excellent logging and on-the-fly debugging that displays a full traceback (often something will fail deep within CakePHP and we&#8217;ll get the file/line where PHP gave up, but not the line in our code that started the problem), the values of variables, the page headers, server settings, SQL that was run, what views and elements are in use, etc.  We&#8217;re planning on using a combination of <a href="http://docs.python.org/library/pdb.html">pdb</a>, <a href="http://ipython.scipy.org/moin/">IPython</a>, and the <a href="http://robhudson.github.com/django-debug-toolbar/">django-debug-toolbar</a> to make all of this easily accessible while developing.</li>
</ul>
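<p>To make the &#8220;easy to cache and easy to invalidate&#8221; point concrete, here is a minimal sketch in plain Python.  It is not AMO code and not Django&#8217;s cache API: <em>ObjectCache</em>, its methods, and the <em>load_addon</em> loader are hypothetical names, and a dict stands in for memcache.  The idea is only that one cache key per object lets a single entry be dropped when that object changes.</p>
<pre>
```python
class ObjectCache:
    """Dict-backed stand-in for memcache, with one key per object."""

    def __init__(self):
        self._store = {}

    @staticmethod
    def key(model, pk):
        # One predictable key per object, e.g. "addon:42".
        return "{}:{}".format(model, pk)

    def get(self, model, pk, loader):
        # Call the loader (a database query in real life) only on a miss.
        k = self.key(model, pk)
        if k not in self._store:
            self._store[k] = loader(pk)
        return self._store[k]

    def invalidate(self, model, pk):
        # Drop exactly one object; everything else stays cached.
        self._store.pop(self.key(model, pk), None)


calls = []


def load_addon(pk):
    # Hypothetical loader that records how often the "database" is hit.
    calls.append(pk)
    return {"id": pk, "name": "addon-%d" % pk}


cache = ObjectCache()
cache.get("addon", 1, load_addon)   # miss: loader runs
cache.get("addon", 1, load_addon)   # hit: loader is skipped
cache.invalidate("addon", 1)        # the add-on was edited
cache.get("addon", 1, load_addon)   # miss again: loader runs
```
</pre>
<p>With giant nested arrays as the cached value there is no equivalent of <em>invalidate</em> for a single add-on, which is the problem described above.</p>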
<p>Those are the major issues we&#8217;re having right now.  If you want to dig into the comparison some more, check out our <a href="https://wiki.mozilla.org/AMO:v4">discussion wiki pages</a>, but realize the majority of the discussion happened in person.</p>
<h3>Moving away from <abbr title="Subversion">SVN</abbr></h3>
<p>We moved AMO into SVN in 2006 and it&#8217;s treated us relatively well.  Somewhere along the line, we decided to tag our production versions at a revision of trunk instead of keeping a separate tag and merging changes into it.  It&#8217;s worked for us but it&#8217;s a hard cutoff on code changes, which means that while we&#8217;re in a code freeze no one can check anything in to trunk.  As we begin to branch for larger projects this will become more of a hassle, so I&#8217;m planning on going back to a system where a production tag is created and changes are merged into it as they are ready to go live.</p>
<p>Most of the development team has been using <a href="http://kernel.org/pub/software/scm/git/docs/git-svn.html">git-svn</a> for several months and, aside from the commands being far more verbose, we haven&#8217;t had many complaints.  We&#8217;ve discovered <a href="http://git-scm.com/">Git</a> is a much more powerful development tool and we expect to use it directly starting some time next year.  As of now, we expect to maintain the /locales/ directory in SVN so this change doesn&#8217;t affect localizers but we&#8217;ll keep people notified if there are any changes to that process.</p>
<h3>Continuous Integration</h3>
<p>I mentioned excellent testing being one of the reasons we&#8217;re moving to Django.  Along with that testing is the opportunity for continuous integration.  We plan on using <a href="https://hudson.dev.java.net/">Hudson</a> as the framework for our continuous integration.  With excellent test coverage and quick feedback from Hudson this should drastically lower our regressions and boost our confidence when we deploy.  Speaking of which&#8230;</p>
<h3>Faster Deployment</h3>
<p>For most of 2009 we&#8217;ve pushed on 3-week cycles: 2 weeks of development, then 1 week of <abbr title="Quality Assurance">QA</abbr> and <abbr title="Localization">L10n</abbr>.  Delays and regressions being what they are, I think we averaged a little better than a push a month.  This is a fairly rapid cycle for a lot of development shops, but I feel like it&#8217;s holding us back.  We&#8217;ve heard a lot of success stories about shorter cycles and I&#8217;d like to aim for deploying (optionally, of course) a few times per week.  By shortening the development cycle we reduce the stress on:</p>
<ul>
<li><strong>the developers:</strong>  Everyone likes to see what they&#8217;ve done go out more quickly, and smaller patches mean fewer conflicts with others.</li>
<li><strong>the QA team:</strong> Right now we dump 2 weeks of work on them and say we need it done right away.  With smaller cycles they can verify small changes as they go and not be overwhelmed.</li>
<li><strong>the infrastructure team:</strong> Smaller changes mean less to go wrong, and with a continuous integration server and some automation they can have minimal involvement with the whole process.</li>
<li><strong>the localizers:</strong> Every time we release we dump a bunch of changes on these fantastic people and tell them we need them back in a week.  Most of the time they plow forward and get them done on time.  If they don&#8217;t, though, they are stuck waiting for the next 3-week cycle.  If we push often, it&#8217;s not a big deal.</li>
<li><strong>the product managers:</strong> These guys come up with crazy ideas for us to implement and then they stare at graphs and numbers to see if it worked.  With shorter cycles they can get faster feedback about what works and what doesn&#8217;t.</li>
<li><strong>the users:</strong> Faster release cycles mean bugs that are fixed in the repository are fixed on the live site sooner.  &#8217;nuff said.</li>
</ul>
<h3>Process Data Offline</h3>
<p>Much of AMO relies on cron jobs to get things done.  All the statistics, add-on download numbers, how popular an add-on is, all the star rating calculations, any cleanup or maintenance tasks &#8211; these are all run via cron and they are so intensive that the database has trouble keeping up.  We&#8217;re planning on utilizing <a href="http://gearman.org/">Gearman</a> to farm all this work out to other machines in incremental pieces instead of single huge queries.  Any heavy calculating that can be done offline will be moved to these external processors which should help improve the speed of the site and make all our statistics more reliable (as currently the cron jobs have a tendency to fail before they are complete).</p>
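<p>As a rough illustration of &#8220;incremental pieces instead of single huge queries&#8221;, the sketch below splits a star rating calculation into small batches and combines the partial results.  It is plain Python and does not use Gearman&#8217;s API; <em>chunked</em>, <em>rating_job</em>, and <em>average_rating</em> are hypothetical names, and in a real setup each batch would be handed to a worker on another machine.</p>
<pre>
```python
from itertools import islice


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def rating_job(batch):
    """One small unit of work: a partial (sum, count) for a batch of ratings."""
    return sum(batch), len(batch)


def average_rating(ratings, batch_size=1000):
    """Combine the partial results into one average."""
    total = 0
    count = 0
    for batch in chunked(ratings, batch_size):
        # Assumption: in production this call would be a job sent to a worker.
        part_sum, part_count = rating_job(batch)
        total += part_sum
        count += part_count
    return total / count if count else 0.0
```
</pre>
<p>Because each job only carries a partial sum and a count, a failed batch can be retried on its own instead of restarting the whole calculation.</p>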
<h3>Improve the Documentation</h3>
<p>Documentation is a noble goal of many developers but it rarely gets enough attention.  We evaluated our <a href="https://wiki.mozilla.org/AMO:Developers">current documentation</a> and found it is woefully out of date.  By being on a wiki that is rarely used it doesn&#8217;t get updated except when someone tries to use it and sees it&#8217;s not right.  We&#8217;re hoping to change that by moving the developer documentation into the code repository itself.  We&#8217;ll be able to integrate with generated API docs, style the docs however we want, and check in changes right along with our code patches.  When someone checks out a copy of AMO, they&#8217;ll get all the documentation right along with it.  We&#8217;ll use <a href="http://sphinx.pocoo.org/">Sphinx</a> to build the docs.</p>
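<p>For a sense of what living in the repository could look like, here is a minimal sketch of a Sphinx <em>conf.py</em> checked in alongside the code.  The project name and layout are assumptions for illustration, not our actual configuration; <em>sphinx.ext.autodoc</em> is the Sphinx extension that generates API docs from docstrings.</p>
<pre>
```python
# conf.py: hypothetical Sphinx configuration kept next to the code.
project = "AMO"                      # assumed project name
master_doc = "index"                 # top-level document of the docs tree
extensions = ["sphinx.ext.autodoc"]  # generate API docs from docstrings
```
</pre>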
<p>The outline above details several large, high-level changes but there are a lot of other plans for smaller improvements as well.  This post got a lot longer than I was expecting, but I&#8217;m really excited about the direction AMO is headed for 2010.  As these changes are implemented the site will become more responsive and reliable, and we&#8217;ll be able to adapt to the needs of Mozilla&#8217;s users even faster.  As always, feedback and discussion are welcome and stay tuned for further back end improvements.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2009/11/17/amo-development-changes-in-2010/feed/</wfw:commentRss>
		<slash:comments>37</slash:comments>
		</item>
		<item>
		<title>Committing to SVN securely from a web application</title>
		<link>http://micropipes.com/blog/2008/09/19/committing-to-svn-securely-from-a-web-application/</link>
		<comments>http://micropipes.com/blog/2008/09/19/committing-to-svn-securely-from-a-web-application/#comments</comments>
		<pubDate>Fri, 19 Sep 2008 22:29:58 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[security]]></category>
		<category><![CDATA[SVN]]></category>
		<category><![CDATA[Verbatim]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/?p=56</guid>
		<description><![CDATA[Verbatim is the second project I&#8217;ve been the lead on recently where the requirements included people committing to SVN as themselves via the application.  At first glance this means storing the authentication tokens of the user in plain text since we&#8217;ll need to pass them along to SVN whenever they commit.  I wasn&#8217;t [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://wiki.mozilla.org/Verbatim">Verbatim</a> is the second project I&#8217;ve been the lead on recently where the requirements included people committing to <abbr title="Subversion">SVN</abbr> as themselves via the application.  At first glance this means storing the authentication tokens of the user in plain text since we&#8217;ll need to pass them along to SVN whenever they commit.  I wasn&#8217;t happy with that solution so after a bit of thinking we came up with an idea that leaves everything encrypted and doesn&#8217;t cache any credentials.  It involved minimal code in Verbatim and minor work on the SVN server.</p>
<p>On the SVN server the first thing we did was to create a special Verbatim user that can commit to SVN via SSH using a generated key.  We copied this key to the Verbatim host which allowed us to commit as the verbatim user without typing a username or password.</p>
<p>The only thing that was added to the Verbatim code was <a href="http://sourceforge.net/mailarchive/forum.php?thread_name=ADA2D058-904B-44F0-8301-21334A7B6E02%40mozilla.com&#038;forum_name=translate-pootle">a patch that Dan Schafer cooked up</a> that sets an SVN revision property, <em>translate:author</em>, to the name of the current user.  When the user clicks &#8220;commit&#8221; this property is set and sent along with the commit.</p>
<p>At this point we could commit from the application, but the commit still shows up in SVN as the Verbatim user.  We used <a href="http://svnbook.red-bean.com/en/1.4/svn-book.html#svn.ref.reposhooks">SVN&#8217;s hooks</a> to take the next step.</p>
<p>The first script we changed was the pre-revprop-change hook.  This controls what special revision properties a user can modify when they commit.  <a href="https://bugzilla.mozilla.org/attachment.cgi?id=337775">Our script</a> adds the ability to modify svn:author and translate:author.  Before allowing the modifications the script checks if the user committing is the special verbatim user to prevent anyone from committing as someone else.</p>
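<p>The check described above can be sketched roughly as follows.  This is not the linked script: the account name <em>verbatim</em>, the function names, and the choice to allow only these two properties are assumptions for illustration.  Subversion calls the pre-revprop-change hook with the repository path, revision, user, property name, and action, and a non-zero exit status rejects the change.</p>
<pre>
```python
VERBATIM_USER = "verbatim"                          # assumed account name
ALLOWED_PROPS = {"svn:author", "translate:author"}  # properties named in the post


def allow_change(user, propname):
    """Only the special user may touch the author properties."""
    return user == VERBATIM_USER and propname in ALLOWED_PROPS


def hook_exit_code(argv):
    """Map hook arguments to an exit status: 0 allows, 1 rejects."""
    # Subversion passes: REPOS REV USER PROPNAME ACTION
    user, propname = argv[3], argv[4]
    return 0 if allow_change(user, propname) else 1
```
</pre>
<p>A real hook script would end by passing <em>sys.argv</em> to <em>hook_exit_code</em> and exiting with the result.</p>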
<p>Next we added a <a href="https://bugzilla.mozilla.org/attachment.cgi?id=339184">post-commit script</a> that looks for the translate:author property.  If it&#8217;s found it will take that value, replace svn:author, and remove translate:author; effectively making whatever was in translate:author the real author.  This is a non-versioned change which means there is no commit that needs to happen &#8211; the new author is set immediately.</p>
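<p>The logic of that post-commit step, reduced to its essentials, might look like the sketch below.  It models a revision&#8217;s properties as a plain dict rather than calling Subversion, so <em>promote_author</em> is a hypothetical helper and not the linked script; a real hook would read and write the properties with Subversion&#8217;s own tools.</p>
<pre>
```python
def promote_author(revprops):
    """Return a copy of revprops with translate:author promoted to svn:author."""
    updated = dict(revprops)
    # Remove the temporary property whether or not it holds a value.
    author = updated.pop("translate:author", None)
    if author:
        # Whatever the application recorded becomes the real author.
        updated["svn:author"] = author
    return updated
```
</pre>
<p>For example, a revision committed with svn:author set to the verbatim user and translate:author set to a localizer ends up with only svn:author, now holding the localizer&#8217;s name.  Revisions without translate:author are left untouched.</p>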
<p>With these scripts in place we can commit as anyone from the application and everyone&#8217;s credentials stay encrypted and secure.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2008/09/19/committing-to-svn-securely-from-a-web-application/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>10000 commits and going strong</title>
		<link>http://micropipes.com/blog/2008/02/06/10000-commits-and-going-strong/</link>
		<comments>http://micropipes.com/blog/2008/02/06/10000-commits-and-going-strong/#comments</comments>
		<pubDate>Wed, 06 Feb 2008 08:34:24 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[code]]></category>
		<category><![CDATA[open web]]></category>
		<category><![CDATA[SVN]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/2008/02/06/10000-commits-and-going-strong/</guid>
		<description><![CDATA[Mozilla&#8217;s SVN repository was started on September 2nd, 2006 and just hit 10000 commits.  That&#8217;s an average of over 19 commits a day for 520 days straight!
After my positive experience with python I was gearing up for a script to do some repository analysis when I ran across MPY SVN Stats.  After a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://svn.mozilla.org/">Mozilla&#8217;s SVN repository</a> was started on September 2nd, 2006 and just hit 10000 commits.  That&#8217;s an average of over 19 commits a day for 520 days straight!</p>
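<p>The average is easy to re-check.  In the sketch below the end date is assumed to be the date of this post; counting calendar days that way gives 522 rather than 520, and the average stays above 19 either way.</p>
<pre>
```python
from datetime import date

start = date(2006, 9, 2)   # repository start date given above
posted = date(2008, 2, 6)  # assumed end date: the day of this post
total_commits = 10000

days = (posted - start).days
average = total_commits / days
print(days, round(average, 2))
```
</pre>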
<p>After my <a href="http://micropipes.com/blog/2008/01/02/the-most-worthless-bot-on-irc/">positive experience with python</a> I was gearing up for a script to do some repository analysis when I ran across <a href="http://mpy-svn-stats.berlios.de/">MPY SVN Stats</a>.  After a fast download and a one line command I had charts and tables full of info.  So, here&#8217;s some late night stat work for everyone:</p>
<p>We had a total of 103 people who committed code directly to SVN, 69 of whom had 10 or more commits.  The top 25 committers by total number of commits are:</p>
<blockquote><pre>
No  Author                              Commits    Percentage
1   wclouser#mozilla.com                1558       15.58%
2   fligtar#gmail.com                   739        7.39%
3   steven#silverorange.com             707        7.07%
4   reed#reedloden.com                  639        6.39%
5   pascal.chevrel#mozilla-europe.org   567        5.67%
6   fwenzel#mozilla.com                 551        5.51%
7   nelson#wordmaster.org               524        5.24%
8   mkaply#us.ibm.com                   481        4.81%
9   paul#glaxstar.com                   392        3.92%
10  mgalli#mgalli.com                   364        3.64%
11  morgamic#mozilla.com                311        3.11%
12  dougt#meer.net                      307        3.07%
13  michael.koch#enough.de              179        1.79%
14  ahajdukewycz#mozilla.com            173        1.73%
15  shaver#mozilla.com                  162        1.62%
16  reed#mozilla.com                    158        1.58%
17  abuchanan#mozilla.com               112        1.12%
18  dougt#mozilla.com                   105        1.05%
19  eshepherd#mozilla.com               101        1.01%
20  erik#raincitystudios.com            97         0.97%
21  mfinkle#mozilla.com                 89         0.89%
22  robert#accettura.com                89         0.89%
23  smalolepszy#aviary.pl               82         0.82%
24  oremj#mozilla.com                   76         0.76%
25  tim.babych#gmail.com                58         0.58%
</pre>
</blockquote>
<p>My numbers here (wclouser#mozilla.com) are inflated because I did a lot of the branching/tagging on our projects.  If you throw my number out as an anomaly you&#8217;ll see that no single person has made more than 7.5% of the commits to SVN.  That&#8217;s a great community hard at work!</p>
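<p>The percentages are simply each committer&#8217;s count over the 10000 total.  A quick sketch using the first three rows of the table (the <em>share</em> helper is just for illustration):</p>
<pre>
```python
TOTAL_COMMITS = 10000

# First three rows of the table above.
top_commits = {
    "wclouser#mozilla.com": 1558,
    "fligtar#gmail.com": 739,
    "steven#silverorange.com": 707,
}


def share(commits, total=TOTAL_COMMITS):
    """Percentage of all commits, rounded to two decimals."""
    return round(100.0 * commits / total, 2)


shares = {name: share(count) for name, count in top_commits.items()}

# Largest share once my own inflated number is set aside.
largest_other = max(
    value for name, value in shares.items() if name != "wclouser#mozilla.com"
)
print(shares, largest_other)
```
</pre>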
<p>Thanks for all your help!</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2008/02/06/10000-commits-and-going-strong/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Quick overview of mozilla.com publishing process</title>
		<link>http://micropipes.com/blog/2007/12/19/quick-overview-of-mozillacom-publishing-process/</link>
		<comments>http://micropipes.com/blog/2007/12/19/quick-overview-of-mozillacom-publishing-process/#comments</comments>
		<pubDate>Thu, 20 Dec 2007 02:54:53 +0000</pubDate>
		<dc:creator>Wil Clouser</dc:creator>
				<category><![CDATA[Mozilla]]></category>
		<category><![CDATA[Drupal]]></category>
		<category><![CDATA[SVN]]></category>

		<guid isPermaLink="false">http://micropipes.com/blog/2007/12/19/quick-overview-of-mozillacom-publishing-process/</guid>
		<description><![CDATA[(apologies in advance to anyone without an LDAP account.  A lot of the links here point to content behind authentication.  If you&#8217;re really curious about what&#8217;s going on back there, you&#8217;re welcome to check out the Kubla source.)
Over the past couple of days I&#8217;ve felt a growing confusion about how the web infrastructure [...]]]></description>
			<content:encoded><![CDATA[<p><small>(apologies in advance to anyone without an <abbr title="Lightweight Directory Access Protocol">LDAP</abbr> account.  A lot of the links here point to content behind authentication.  If you&#8217;re really curious about what&#8217;s going on back there, you&#8217;re welcome to check out <a href="http://svn.mozilla.org/projects/kubla/">the Kubla source</a>.)</small></p>
<p>Over the past couple of days I&#8217;ve felt a growing confusion about how the web infrastructure works on <a href="http://www.mozilla.com/">www.mozilla.com</a>.  People are beginning to use <a href="http://wiki.mozilla.org/Kubla">Kubla</a> (it&#8217;s started with a small set and is steadily growing), and they&#8217;re asking some good questions that hopefully I can help answer here:</p>
<p>Firstly, I think everyone with access has already seen this diagram, but bear with me:</p>
<p><img src="http://people.mozilla.com/~clouserw/public/mozilla.com/mozilla_three_tier_publishing_process.png" title="Mozilla.com Three Tier Publishing Process" /></p>
<p>Fresh changes are committed to the first (top) tier &#8211; also referred to as &#8220;trunk&#8221;.  If you&#8217;re <a href="http://wiki.mozilla.org/Kubla:Documentation/Editing_Pages">making the change yourself</a> you&#8217;re actually affecting this tier.  You can preview any of the changes from the first tier on <a href="https://www-trunk.stage.mozilla.com">www-trunk.stage.mozilla.com</a>.</p>
<p>The second tier (also known as &#8220;stage&#8221; or &#8220;staging&#8221;) is where changes are previewed before they are pushed live.  When something is <a href="http://wiki.mozilla.org/Kubla:Documentation/Merging_Pages">pushed from trunk to stage</a> it will be on this second tier and viewable at <a href="https://www.stage.mozilla.com">www.stage.mozilla.com</a>.  To get to this tier, a change needs to be approved by a human (a publisher in Kubla).  If you have access and you want to see the differences between trunk and staging, look at the <a href="https://kubla.mozilla.com/queue/stage">staging queue</a>.</p>
<p>The final tier is our live site (or &#8220;production&#8221;).  Anything pushed on to this tier will be on <a href="http://www.mozilla.com/">www.mozilla.com</a> within an hour, automatically.  To get to this tier, a change needs to be approved by <a href="https://intranet.mozilla.org/Mozilla_com_approvals">an admin in Kubla</a>.  The differences between the staging site and the live site can be seen in the <a href="https://kubla.mozilla.com/queue/prod">production queue</a>.</p>
<p>One of the main things to remember is that changes are not instantaneous.  Moving a change from trunk to stage and stage to production both require human interaction.  The staging site and the production site are updated periodically (Kubla shows an estimated time to next update on the left hand menu when you log in).  For production particularly, this is definitely an estimate.  The updates still need to rsync to our servers and then the aggressive caching in front of them needs to expire.  Generally that&#8217;s an additional 15 minutes.</p>
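<p>The approval flow can be summarized as a tiny state machine.  This is only a sketch of the process described here, not Kubla&#8217;s code: the <em>promote</em> function and the strict role check are assumptions (Kubla may, for instance, let admins approve pushes to stage as well).</p>
<pre>
```python
TIERS = ["trunk", "stage", "production"]

# Role whose approval moves a change onto each tier.
REQUIRED_ROLE = {"stage": "publisher", "production": "admin"}


def promote(current_tier, approver_role):
    """Return the next tier if the approver holds the role that tier needs."""
    index = TIERS.index(current_tier)
    if index == len(TIERS) - 1:
        raise ValueError("change is already on production")
    target = TIERS[index + 1]
    if approver_role != REQUIRED_ROLE[target]:
        raise PermissionError(
            "{} needs approval from a {}".format(target, REQUIRED_ROLE[target])
        )
    return target
```
</pre>
<p>Calling <em>promote("trunk", "publisher")</em> yields stage and <em>promote("stage", "admin")</em> yields production; the periodic site updates and cache expiry then add the delays described above.</p>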
<p>For changes that need to go out together or at a specific time we can speed up some of this process, but it becomes less automatic the faster we want it to go.  For most updates the automatic systems should work fine.  As always, I&#8217;m happy to answer questions if you have them, either about this process or Kubla itself.</p>
]]></content:encoded>
			<wfw:commentRss>http://micropipes.com/blog/2007/12/19/quick-overview-of-mozillacom-publishing-process/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
