AMO Development Changes in 2010

The AMO team met in Mountain View last week to develop a 2010 plan. We've been wanting to change some key areas of our development flow for a while but we needed to make sure time was budgeted in the overall AMO and Mozilla goals. As usual, the timeline will be tight, but the AMO developers do amazing work and as our changes are implemented, development should just get faster. I'll give a brief summary of the changes we're planning; a lot of discussion went into this and I'm not going to be able to cover everything here. If you've been in the AMO calls or reading the notes you probably already know most of this.

Migrating from CakePHP to Django

This is a big undertaking and we've been discussing it for quite a while. We're currently the highest-trafficked site on the internet using CakePHP, and along with that we've run into a lot of frustrating issues. CakePHP has served AMO well for several years, so it's not my intention to bad-mouth it here, but I do want to give a fair summary of why we're moving on. Please also note that AMO is still running on CakePHP 1.1 which is, I think, a year out of date? Three substantial issues:

Those are the major issues we're having right now. If you want to dig into the comparison some more, check out our discussion wiki pages, but realize that the majority of the discussion happened in person.

Moving away from SVN

We moved AMO into SVN in 2006 and it's treated us relatively well. Somewhere along the line, we decided to tag our production versions at a revision of trunk instead of keeping a separate tag and merging changes into it. It's worked for us but it's a hard cutoff on code changes, which means that while we're in a code freeze no one can check anything in to trunk. As we begin to branch for larger projects this will become more of a hassle, so I'm planning on going back to a system where a production tag is created and changes are merged into it as they are ready to go live.

Most of the development team has been using git-svn for several months and, aside from the commands being far more verbose, we haven't had many complaints. We've discovered Git is a much more powerful development tool and we expect to use it directly starting some time next year. As of now, we expect to maintain the /locales/ directory in SVN so this change doesn't affect localizers but we'll keep people notified if there are any changes to that process.

Continuous Integration

I mentioned excellent testing being one of the reasons we're moving to Django. Along with that testing is the opportunity for continuous integration. We plan on using Hudson as the framework for our continuous integration. With excellent test coverage and quick feedback from Hudson this should drastically lower our regressions and boost our confidence when we deploy. Speaking of which...

Faster Deployment

For most of 2009 we've pushed on three-week cycles: two weeks of development, then one week of QA and L10n. Delays and regressions being what they are, I think we averaged a little better than a push a month. This is a fairly rapid cycle for a lot of development shops, but I feel like it's holding us back. We've heard a lot of success stories about shorter cycles and I'd like to aim for deploying (optionally, of course) a few times per week. By shortening the development cycle we reduce the stress of:

Process Data Offline

Much of AMO relies on cron jobs to get things done. All the statistics, add-on download numbers, how popular an add-on is, all the star rating calculations, any cleanup or maintenance tasks - these are all run via cron and they are so intensive that the database has trouble keeping up. We're planning on utilizing Gearman to farm all this work out to other machines in incremental pieces instead of single huge queries. Any heavy calculating that can be done offline will be moved to these external processors which should help improve the speed of the site and make all our statistics more reliable (as currently the cron jobs have a tendency to fail before they are complete).
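As a rough illustration of "incremental pieces instead of single huge queries", here is a minimal Python sketch. It is not our actual job code: the function names, the (add-on ID, download count) row format, and the batch size are all hypothetical, and a plain in-memory queue stands in for the job server so the example runs on its own.

```python
from collections import deque


def split_into_batches(first_id, last_id, batch_size):
    """Yield inclusive (start, end) ID ranges covering first_id..last_id."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    start = first_id
    while start <= last_id:
        end = min(start + batch_size - 1, last_id)
        yield (start, end)
        start = end + 1


def count_downloads(rows, start, end):
    """Hypothetical job body: total downloads for add-on IDs in [start, end]."""
    return sum(count for addon_id, count in rows if start <= addon_id <= end)


def run_jobs(rows, first_id, last_id, batch_size):
    """Queue one small job per ID range and process the jobs one at a time.

    The deque is only a stand-in for a real job server; in production each
    job would be handed to a worker on another machine.
    """
    jobs = deque(split_into_batches(first_id, last_id, batch_size))
    total = 0
    while jobs:
        start, end = jobs.popleft()
        total += count_downloads(rows, start, end)
    return total


# Example data: (add-on ID, download count) pairs.
rows = [(1, 10), (2, 5), (7, 3)]
print(list(split_into_batches(1, 10, 4)))
print(run_jobs(rows, 1, 10, 4))
```

The point of the split is that if one small job fails, only that ID range needs to be retried, rather than the whole run.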

Improve the Documentation

Documentation is a noble goal of many developers but it rarely gets enough attention. We evaluated our current documentation and found it woefully out of date. Because it lives on a rarely used wiki, it doesn't get updated except when someone tries to use it and sees it's not right. We're hoping to change that by moving the developer documentation into the code repository itself. We'll be able to integrate with generated API docs, style the docs however we want, and check in changes right along with our code patches. When someone checks out a copy of AMO, they'll get all the documentation right along with it. We'll use Sphinx to build the docs.
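For anyone who hasn't used Sphinx, the configuration is itself a small Python file that lives alongside the docs. The sketch below is only an assumption of what a minimal one might look like; the file location, project name, and theme are placeholders rather than our actual settings.

```python
# docs/conf.py: a minimal Sphinx configuration (illustrative values only).

# Hypothetical project name; a placeholder, not an actual setting.
project = "AMO developer documentation"

# autodoc pulls API documentation out of Python docstrings.
extensions = ["sphinx.ext.autodoc"]

# Name of the root document, without the file extension.
master_doc = "index"

# A built-in theme; any other theme could be substituted here.
html_theme = "alabaster"
```

With a file like this in a docs directory, running `sphinx-build -b html docs docs/_build` would produce the HTML output, assuming an index document exists next to it.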

The outline above details several large, high-level changes, but there are a lot of other plans for smaller improvements as well. This post got a lot longer than I was expecting, but I'm really excited about the direction AMO is headed for 2010. As these changes are implemented the site will become more responsive and reliable, and we'll be able to adapt to the needs of Mozilla's users even faster. As always, feedback and discussion are welcome, and stay tuned for further back end improvements.

25 Comments

We've just built an Intranet on CakePHP 1.2 and it is miles ahead of 1.1

The problems you list we've never really encountered once we'd got to grips with the framework's way of doing things.

Better debugging comes in the DebugKit plugin: http://www.ohloh.net/p/cakephp-debugkit , although SQL and stacktraces are in there by default.

Caching just works with the Cache::read, and Cache::write functions and various engines (memcached, Xcache, filesystem)

The only disadvantage I've found so far is that performance (requests per second) is so-so, and you really need to scale out with it.
-- Dave, 17 Nov 2009
It's a bit weird to see the move to Django+Git when other Mozilla projects use Mercurial. Surely using Mercurial alongside Django would be way better than Django+Git?
-- Greg, 17 Nov 2009
Just a vote for CakePHP 1.2, since it could solve some of the problems you met, and migrating from 1.1 to 1.2 won't be harder than from CakePHP to Django.
-- kiang, 17 Nov 2009
Moving to CakePHP 1.2 wouldn't solve most of their problems and it would be harder than you think. They modified the framework heavily, so they would have a hard time merging their changes.
Thanks guys for using CakePHP and good luck with the transition.
-- kabturek, 17 Nov 2009
w00t! Looking forward to progress reports. :)
-- Barry, 17 Nov 2009
All of those problems can easily be solved by migrating to Cake 1.2, hell even Cake 1.3. You should also be using memcached and Debug kit.

It's kinda weird that you decide to go with another option, before attempting to fix the current problem.
-- Miles Johnson, 17 Nov 2009
Great news, guys.
It is much better to do the job, not fix problems that are solved already.
-- windock, 17 Nov 2009
@Greg

What do you think we would gain by using Mercurial instead of Git? We don't have to worry about cross-platform issues, which was (IIRC) the killer for git in the Great Mozilla Rebase.

Since a lot of us were using git-svn already, switching to git made more sense than jumping over to hg (and feels much better than fighting with svn). I don't think git vs. hg is going to be a big hurdle in keeping people from contributing code.
-- Jeff Balogh, 17 Nov 2009
@miles, etc.

As far as attempting to fix the current problem - this is precisely what we're doing. We did weigh in upgrading Cake - which for our project is no light matter - with seeking other frameworks.

As someone who's quite proficient with PHP and has been coding PHP for years, and someone who's had a good run with Django in production apps, I can honestly say that this is going to be the right choice for our team. Django is an elegant solution, something that much of our team is familiar with, and has proven itself to work well. Furthermore the Django Community is really great and very supportive of our efforts.

We'll still have a lot of Cake to contend with, but as we clear out old issues, Django should be very exciting, challenging, but ultimately a winner.

Cheers,
-d
-- Dave Dash, 17 Nov 2009
Just for curiosity: why Django?
Of course, it's a great framework. I tried it a few months ago, but I have a problem with Python (I simply don't like it) so I gave up. But there are plenty of other good frameworks available, for PHP (Symfony and Zend being the best choices IMHO) and other languages (of course Rails, but also less known but still powerful tools like Grails for Groovy, Catalyst for Perl), so I wanted to know if the choice was actually a matter of taste (or maybe a few developers who already know Django), or whether it came after looking at the pros and cons of every other choice...
-- Davide, 18 Nov 2009
I personally use CakePHP 1.2 very often. I didn't use it back in the 1.1 days but from looking at old documentation/tutorials, CakePHP 1.1 was very immature. I've made way too many apps with 1.2 and have run some of the craziest queries using their ORM model. It's fine if you want to move to Django but CakePHP 1.2 does fix almost all of the issues mentioned.

-Cake's ORM will only return the data that you need. That's what the Containable and Linkable behavior do.
-Caching can be implemented very quickly by just doing full view caching with certain bits and pieces (like user login at the top of the page) still being rendered depending on the user state etc.. You can even have it cache those views to memcached automatically.
-Simple memcached interface that will let you expire your caches, etc..
-The entire framework has unit testing behind it. If it's used properly, it gets the job done just fine.
-As far as debugging goes, there is the debugkit toolbar someone mentioned but you will see what queries ran in the SQL output and your page will contain a full stack trace in the source view. You could easily spend 30 minutes and fine tune it to your needs since all of the data is there, it's just a matter of printing/logging it.

I haven't used Django so I can't say whether it will suffice or not but from what I've seen, it does look very powerful. Good luck with everything.
-- Giuliano B, 18 Nov 2009
Sorry guys to be the one to say this but, the reasons given in this post show the ineffectiveness of the AMO development team.

I worked at Yahoo and I've used different Java and PHP frameworks in the past, some good, some bad, but that never stopped us from having a continuous build running locally and an integration build running on our server. It never stopped us from unit testing our code, even though we had to extend some components to make it work. The same with debugging, it never stopped us from creating awesome tools to debug our code and monitor the performance of our applications.

A framework doesn't make you a better programmer. You guys should know this. BTW, I use Django as well, and guess what, I'm still working the same way I was working 10 years ago.
-- Matt, 18 Nov 2009
Does it mean that AMO is going to be rewritten in Python? Mozilla uses Mercurial. Why do you want to use a different repository?
-- Pavel Cvrček, 18 Nov 2009
Sounds like good changes - I'm curious about the decision to go with Hudson over buildbot though. Can you explain why you chose Hudson, rather than any of the other integration options out there?
-- Ben B, 18 Nov 2009
Hey, more django python shop :-)

Pitfalls we're seeing with django on the l10n dashboard:

- insertmany is sloooow. Not sure how much of a problem that would be for AMO
- ManyToManyField relations don't always query the way you want. We're currently having hard-coded SQL in there at some places, I still want to try to hack around that with faking a through-model by creating a non-managed model that works on the intermediate table.
- I'd love to hear what you guys hack around.

PS: we're using 1.1 now, fwiw.
-- Axel Hecht, 18 Nov 2009
Hi,

Concerning Mercurial, think again.
There are enormous benefits for developers working in Python and used to Subversion. The learning curve is nonexistent: if you know Subversion, you mostly know Mercurial.
Performance and features are on par with Git and, better, Mercurial is 100% Python and not a complicated mixture of languages like Git. Also, setting up Mercurial is fast and easy.
Also, if you are used to both Git and Subversion, then Mercurial is a no-brainer. You will pick it up with ease.

For the rest, Django is a good choice. Good luck.

Regards,

Richard
-- Richard Lopes, 20 Nov 2009
I've used both Cake (1.2) and Django (0.96-1.1) and I have to congratulate you guys for the move.
I have to warn you though, there's a lot of fun ahead, so be careful :-)

Cheers,
Marcos
-- Marcos, 21 Nov 2009
CakePHP 1.2+ containable behavior + object oriented database results is amazing. Sounds like your programmers jumped the gun and chose not to learn how to maximize your development in Cake. Sad.
-- Max, 23 Nov 2009
Rewriting from scratch? You're making a big mistake.
Things You Should Never Do
-- Jed, 24 Nov 2009
I think the change from cakephp to django is for the better. Anyway keep up the good work!
-- corre@ webbshop, 25 Jul 2010
Why all the attention on porting from one platform to another? If the project has outgrown one system and a shift is inevitable, then all you can do is ensure that the next platform you choose is going to be in line with your growth/development needs.
-- Joe, 31 Jul 2010
CakePHP 1.2+ containable behavior + object oriented database results is amazing. Sounds like your programmers jumped the gun and chose not to learn how to maximize your development in Cake. Sad.
-- Webdesign Köln, 08 Aug 2010
Of course, it’s a great framework. I tried it a few months ago, but I have a problem with Python (I simply don’t like it) so I gave up. But there are plenty of other good frameworks available, for PHP (Symfony and Zend being the best choices IMHO) and other languages (of course Rails, but also less known but still powerful tools like Grails for Groovy, Catalyst for Perl), so I wanted to know if the choice was actually a matter of taste (or maybe a few developers who already know Django), or whether it came after looking at the pros and cons of every other choice
-- Stephen Rivers, 13 Aug 2010
I’m curious about the decision to go with Hudson over buildbot though.
-- Craig Dorsey, 22 Sep 2010
A good point made by Pavel Cvrcek. Why do you want to use a different repository when Mozilla uses Mercurial?
-- Doug from AbWorkouts, 12 Nov 2010
