
Grave Pursuit

I read a book called Hint Fiction last year where the idea was to write a compelling story in 25 words or fewer. My favorite that I can remember was by J. Matthew Zoss:

I’m sorry, but there’s not enough air in here for everyone. I’ll tell them you were a hero.

I wanted to translate that concept into photographs and tell a story within a limit of 3 photos (a generous 3000 words if we’re going by the standard exchange rate). I took the first two photos fairly quickly, but the third took a long time because I had to organize the scene (and find a participant). During some time off a couple of weeks ago I finally finished the trio of photos and completed the tale.

Click on the photo below to see the whole story

Security in Depth; the first layer of addons.mozilla.org

Discussing the security measures of a public facing and popular website is usually taboo. Often owners are unsure they are following best practices, prefer not to draw attention to their site, or hope that they can maintain security through the obscurity of their code. At Mozilla we are fortunate to offer nearly all of the code in the entire company as open source software. addons.mozilla.org is no exception. This means we need to be extra vigilant with the code we write (and a huge thanks to our developers doing code reviews, the security and QA teams testing code, and the community members reporting bugs they find), but it also means I can write posts like this to explain some of the security measures we have implemented and how you can use them to make visitors to your sites safer too.

SSL Encryption: Let’s start off easy. Anytime you go to addons.mozilla.org we redirect you to https://addons.mozilla.org. Assuming you make it through the redirect safely you can be reasonably sure you’re talking to us at that point. Any data sent between your browser and us is encrypted with industry standard encryption. This seems like a freebie (I know you’re thinking, “really? you’re talking about SSL?”), but do a quick search and you’ll find plenty of financial institutions that fail to take even this most basic precaution on pages where you submit the username and password to your accounts.

Alright, let’s get more interesting. AMO has a lot of user uploaded data on it, from images to files (the add-ons themselves) to the files within add-ons (we allow you to browse uploaded add-ons on the site). If a user uploaded malicious JavaScript and we served it from the addons.mozilla.org domain, it would run in the context of that domain, which would give it access to manipulate the site and change or steal user data. We protect ourselves by using an alternate domain for user uploads – static.addons.mozilla.net. By using .net instead of .org we’ve sandboxed user scripts onto their own domain and protected the content on .org. This is industry standard (notice how content you upload to Google is served from www.googleusercontent.com) and gives you a free performance boost as well.

Actually, uploaded images should get special mention. It’s a best practice to always clean and verify user data but this is often overlooked for images. Back in the days of IE6 you could actually run arbitrary JavaScript embedded in the comments of an image. This has since been fixed in the browser but poorly configured servers and applications can still pose a threat. Neal Poole demonstrated a proof of concept on a Mozilla site where he embedded PHP in an image, saved it as “image.php” and uploaded to a site. The site saved it under a /media/gallery/ directory (under the webroot with PHP enabled) and he had arbitrary PHP execution on the server. The lesson learned was always re-encode user images when they are submitted. Even if you’re re-encoding from PNG to PNG, strip the comments – it’s not worth it to find out there was something malicious in them later on.
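To make the “strip the comments” step concrete, here is a minimal Python sketch that walks the chunks of a PNG, drops the text chunks (tEXt, zTXt, iTXt) and discards anything trailing the final IEND chunk. Treat it as an illustration under assumptions, not what AMO runs: the names strip_png_text_chunks and make_chunk are made up for this example, and it is a narrower measure than the full re-encode through an image library recommended above.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNKS = {b"tEXt", b"zTXt", b"iTXt"}  # the chunk types that carry comments


def make_chunk(chunk_type, body):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + body) & 0xFFFFFFFF
    return struct.pack(">I", len(body)) + chunk_type + body + struct.pack(">I", crc)


def strip_png_text_chunks(data):
    """Return a copy of the PNG without text chunks or trailing bytes."""
    if not data.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG file")
    kept = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[pos:pos + 8])
        end = pos + 12 + length
        if end > len(data):
            raise ValueError("truncated chunk")
        body = data[pos + 8:pos + 8 + length]
        (crc,) = struct.unpack(">I", data[pos + 8 + length:end])
        if zlib.crc32(chunk_type + body) & 0xFFFFFFFF != crc:
            raise ValueError("bad CRC in chunk")
        if chunk_type not in TEXT_CHUNKS:
            kept.append(data[pos:end])
        pos = end
        if chunk_type == b"IEND":
            # Anything after IEND is not image data, so it is dropped.
            return b"".join(kept)
    raise ValueError("missing IEND chunk")
```

A caller would read the upload into memory, pass the bytes through strip_png_text_chunks, and store only the returned bytes under a server-chosen filename. A payload hidden in pixel data or another chunk type would survive this, which is why decoding and re-encoding the image is the stronger option.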

For many sites session cookies are one of the most valuable assets behind the actual credentials to log in. AMO protects the session cookie (and most of its cookies) with two very underused options: the Secure and HttpOnly flags. Secure simply means the cookie is only sent over a secure connection – that means that when you go to addons.mozilla.org without typing the https, your cookies (and therefore your session) aren’t sent and won’t be compromised if someone is eavesdropping. HttpOnly means that the cookies are sent with browser traffic to AMO but are inaccessible to client side scripts. If a malicious script is somehow injected into the page this option will prevent it from stealing the session ID. Assuming you don’t need script access to the cookies and are running SSL these are essentially free additional layers of security for your site.
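For anyone who wants to see what those two flags look like on the wire, here is a small sketch using Python’s standard http.cookies module. The cookie name sessionid and the function name session_cookie_header are placeholders chosen for this example, not AMO’s actual values.

```python
from http.cookies import SimpleCookie


def session_cookie_header(session_id):
    """Build a Set-Cookie header with the Secure and HttpOnly flags set."""
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id  # hypothetical cookie name
    morsel = cookie["sessionid"]
    morsel["secure"] = True    # only sent over HTTPS
    morsel["httponly"] = True  # hidden from client side scripts
    morsel["path"] = "/"
    return ("Set-Cookie", morsel.OutputString())
```

In a Django project you would normally not build the header by hand; as far as I know the SESSION_COOKIE_SECURE and SESSION_COOKIE_HTTPONLY settings turn on the same flags for the session cookie.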

Every request to AMO returns a pile of interesting (and sometimes bleeding edge) HTTP headers. If you hit the front page, you’ll see X-Frame-Options: DENY. In a supporting browser, this will prevent someone from putting the AMO site into an <iframe> (which prevents things like clickjacking). The vast majority of sites can add this header for more free security.
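If your framework doesn’t offer a switch for this, a header like X-Frame-Options can be bolted on with a few lines of WSGI middleware. This is a generic sketch rather than AMO’s code; add_security_headers and demo_app are names invented for the example.

```python
def add_security_headers(app, extra_headers):
    """Wrap a WSGI app so every response carries the given headers."""
    def middleware(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Leave a header alone if the app already set it.
            present = {name.lower() for name, _ in headers}
            merged = list(headers) + [
                (name, value)
                for name, value in extra_headers
                if name.lower() not in present
            ]
            return start_response(status, merged, exc_info)
        return app(environ, patched_start_response)
    return middleware


def demo_app(environ, start_response):
    """Stand-in application used only to show the wrapper in action."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


framed_safe_app = add_security_headers(demo_app, [("X-Frame-Options", "DENY")])
```

The wrapper takes a list of headers, so the same function can carry the other response headers a site decides to send.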

In a couple examples above I say that once you get to AMO on SSL you’ll be fine but I conveniently skip all the traffic and redirects up until then. An attempt to keep people safe until they reach that point is the Strict-Transport-Security: max-age=2592000 header. This tells the browser that for the next 30 days, if you type in addons.mozilla.org without https it will automatically send it over SSL before the initial request – no unencrypted traffic at all. Support for this header is not widespread yet, but it’s in all the recent versions of Firefox and I expect support for it to expand.
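The max-age value is just a number of seconds, so 2592000 is 30 days × 86400 seconds. A tiny sketch of building the header (the function name hsts_header is made up for this example; includeSubDomains is an optional directive defined for the header, and this post doesn’t say whether AMO sends it):

```python
from datetime import timedelta


def hsts_header(days, include_subdomains=False):
    """Return a Strict-Transport-Security header valid for `days` days."""
    max_age = int(timedelta(days=days).total_seconds())
    value = f"max-age={max_age}"
    if include_subdomains:
        value += "; includeSubDomains"
    return ("Strict-Transport-Security", value)
```

Calling hsts_header(30) reproduces the header value quoted above. Browsers only honor the header when it arrives over HTTPS, so it belongs on the secure responses, not on the plain HTTP redirect.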

I can’t mention “bleeding edge” and “headers” without a hat tip to the Content Security Policy (specifications). We’ve had it in reporting mode for a couple months as we work out what needs to be adjusted before we turn it on, but once we do, this will (again, in a supporting browser) define specific rules for what domains will have valid assets (like images, JavaScript, CSS, etc.) as well as disallow any inline JavaScript from executing. This essentially locks down XSS attacks even if someone does find a way to inject code into the page. It’s a really exciting development but not for the faint of heart to implement on a complex site at this point in time. There are still some bugs to iron out and some edge cases to clarify in CSP but it’s becoming something to seriously consider. I think Twitter is the most prominent site using it to enforce rules (as opposed to only reporting violations) at this time.
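To give a feel for what a policy looks like, here is a sketch that assembles one from a dictionary. Everything in it is an assumption for illustration: the policy is not AMO’s real one, the /csp-report path is hypothetical, and it uses the directive syntax and the Content-Security-Policy-Report-Only header name that were standardized later, which may differ from what a browser implementing an early draft expects.

```python
def build_csp(directives):
    """Join {directive: [sources]} into a single policy string."""
    return "; ".join(
        f"{name} {' '.join(sources)}" for name, sources in directives.items()
    )


# Illustrative policy only: same-origin by default, images also allowed
# from the user-upload domain, violations reported to a made-up endpoint.
example_policy = {
    "default-src": ["'self'"],
    "img-src": ["'self'", "https://static.addons.mozilla.net"],
    "report-uri": ["/csp-report"],
}

report_only_header = (
    "Content-Security-Policy-Report-Only",
    build_csp(example_policy),
)
```

Sending the policy under the report-only header name is the “reporting mode” described above: the browser reports violations but doesn’t block anything, which lets you tune the rules before enforcing them.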

Whew, that is a pile of text and that’s just covering the extreme front end. I’m going to cut it off there to keep this from running on for pages but if there is interest I’ll write another post about more things AMO does to defend itself and its visitors, and other areas where everyone can consider adding in extra security.

In the meantime, if you want even more best practices the Mozilla Security Team has made a great wiki page for further reading about web security.

AMO 2011 Development Visualized

I was playing around with gource this weekend while watching the TSL 3 Finals and pointed it at addons.mozilla.org’s source repository. I sped it up to display 1 day of commits per second and piped it all to ffmpeg to make a video.

It turned out pretty well so here is addons.mozilla.org development so far for 2011 (in HD!):


(Warning: prefetching is off but if you click play you’re in for an 80MB video.)

The gource docs are easy to read if you want to do this for your project, but for the record this is what I ran:

gource --viewport 1280x720 \
--user-image-dir ~/sandbox/zamboni/.git/avatar/ \
--title "addons.mozilla.org" \
--auto-skip-seconds 1 \
--seconds-per-day 1 \
--start-position .715 \
--max-file-lag 0.5 \
--max-files 5000 \
--camera-mode track \
-o -

Piped to:

ffmpeg -y \
-r 60 \
-f image2pipe \
-vcodec ppm \
-i - \
-vcodec libtheora \
-b 10000K \
~/out.ogv

That gave me a 180MB uncompressed ogv. The uncompressed version looks far nicer, but that’s a lot of bandwidth for a random video so I cut it down with ffmpeg2theora (anyone know the switch to do this directly in the ffmpeg command?):

ffmpeg2theora -v 4 \
~/out.ogv \
--optimize \
--noaudio \
-o amo-2011-development.ogv


getpersonas.com: where it’s from, where it’s going

getpersonas.com was started as a labs project in 2008. The plan was to get a website up and running to show off what lightweight themes were and see if they got any traction. If the site became popular, we’d merge it into AMO in six or ten months and everyone would go back to working on other things. Ha.

As is all too common, way leads on to way, and now here we are three years later. getpersonas.com has become a juggernaut of 3000x200px free expression on the web. There are over 1.25 million registered users on the site, 400,000 personas, and a half million hits a day. The site was built with scaling in mind and, honestly, has needed relatively little attention.

On the other hand, the site lost its owners and maintainers last year. Deb stepped up with some awesome volunteers and contractors to fix minor issues but there are no dedicated developers to keep the site fresh. The web security bounty program late last year wasn’t kind to the old code, and any time devoted to the site turned into trudging through old PHP code to solve overlooked problems from long ago.

We’ve decided that this is the year to finally replace the precarious cron job that has been synchronizing the getpersonas.com and AMO databases for the past 18 months and migrate the site to AMO completely. This is no small undertaking, but we’ve had a lot of time to think about it. ;)

I wrote a migration plan a few weeks ago as a general guide. The searching and listing pages are already at parity with getpersonas.com. The reviewer and author functionality will be added shortly – and if you read the bugs and look at the mockups you’ll see it’s greatly improved. This is a mutually beneficial migration; the personas will be able to leverage AMO features like statistics reporting and collections, and AMO will get a fresh look at reviewing user submitted content and an influx of creative designers.

I snuck in to a personas planning meeting last week and I saw a lot of fun stuff in the pipeline for personas. I’m happy to say migrating them onto AMO will give everyone the server and developer resources to get that new stuff out the door. This will get underway in Q3 of this year.


Welcome to the Landfill

Anyone who has tried to set up AMO knows it’s no walk in the park even with the respectable amount of documentation. There are two big stumbling blocks: the database is large and complex, and a portion of the site functionality is still in PHP. Django’s syncdb can make a database, but the relationships in the data are what’s hard, and trying to load fixtures from the test cases is an exercise in frustration since they may or may not all combine into a useful set of data.

With the launch of landfill.amo[1] we bypass the entire headache. The site started with a clean database and I uploaded an add-on to show it worked, but otherwise it’s empty. It’s compact, fast, and simple to use. The beauty of the site for volunteers and casual developers is that the database and the filesystem are available in their entirety to download. This means you can check out the code, fill in the configuration, import the landfill database and have the site 90% running.[2]

This also solves a second long-standing problem (perhaps a testament to the obscene number of open bugs for AMO right now): localizers couldn’t see the entire site. On landfill, anyone can be an administrator, an editor, or any other permission level they’d like, so they’ll be able to see the entire site.

If you’ve been overwhelmed or frustrated trying to set up AMO in the past, now is a good time to give it another shot. The landfill should just get better with age and use – if a few people register and add some data the available database dumps will get richer.

If there is a part of the site that isn’t working and you need it to be, let me know. Keep in mind this is only the new Python code, so the few parts that are still on PHP (like the admin panel) won’t be available until they are ported. Code is updated near-instantly on commit, and localization changes are updated every 5 minutes.

[1] Forgive the fake certificate. This is a sandbox for developers, y’all know what you’re doing. :)

[2] Honestly, 90% is really all you need. We do a lot of stuff for scalability, statistics, etc. and unless you’re actually working on that part of the site, you don’t need those elements running. Of course, you’re more than welcome to turn them on, I’m just trying to make it easy.


High level perspective on the switch from PHP to Python

It may be fatuous to write this post before we’ve actually finished the transition from PHP to Python, but I started writing a different post and this is what came out. Sometimes that happens.

In January of 2010 we started migrating addons.mozilla.org from CakePHP to Django. It was a controversial decision. Developers were ambivalent to excited, managers were opposed to neutral – a split anyone would expect. When I first talked about it I expected to be able to turn off PHP by the end of the year. It didn’t turn out quite like that.

Fifteen months later we’re still transitioning and it’s still stressful. The toughest part about a major migration like this is that there is only one team that is doing the migration, continuing to add the new features we need, and all the while maintaining the old site. That’s a stressful environment for developers since the interactions between the languages can be complicated, it’s stressful for managers because features take longer to complete, and it’s stressful for users (and QA for that matter) because issues will arise which are hard to reproduce and complicated to explain.

In the midst of all the work of migration, the rest of the company is still working: the security team is announcing bounties on our site which means we need to be vigilant about fixing issues, project management continues to come up with features to be added, the site perseveres in its never-ending quest for a new look and feel, and Firefox 4 is using AMO like never before meaning approaching 10,000 hits per second is a regular day. All of that is specific to the add-ons site, but consider your own company if you’re thinking of going down the same road – what is coming up for your site that will throw a wrench in the works?

The meat and potatoes of it really comes down to: Given the hindsight of today, would the migration be a good idea? There isn’t a right answer for every site, but for AMO we did the right thing[1]. As of today the majority of pages that matter are on Python – there are some admin tools, and some cron jobs, and the occasional semi-obsolete public page that is on PHP, but for the most part, we’re looking really good (less hand waving, more real data). My new (overly optimistic?) plan is to have PHP off by the end of this year. We’ll see.

To give you an idea of man-hours, we’ve had anywhere from 3 to 6 superhero developers working on the site over the past 15 months, and it’s looking like the whole thing will take around 24 months. That’s a big chunk of time for a site that needs to grow and evolve as quickly as popular sites do these days.

So, overall, I think the lesson is: any reasonably sized site is going to have rabbit holes in it. At first glance AMO might look like it’s got a dozen “main” pages, with a couple dozen more supporting pages (and throw in a few more for the admin CRUD). Have a look at that spreadsheet I linked above and you’ll see that’s not even remotely the case. The spreadsheet even ignores sub-pages in a few places and doesn’t include any new features added in the past year. If you’re considering a migration, think it through well. Make a spreadsheet of every URL, identify the complicated areas, and make sure everyone is clear on the timeline and what it means for new features. People will absolutely try to scope creep your migration – make it clear if a section of the site is migrating as-is or can be migrated and redesigned at the same time. Redesigns add complexity for the developers but can earn you some good will with the users and managers and if you’re in this boat you can use all the good will you can get.

May you have the best of luck with your decisions. :)

[1] I’ll write another post about pros/cons of the actual frameworks and platforms. Let’s just assume we’re happy with the technical side of the switch for now.


Status Watch: An add-on for noticing HTTP error codes

Often on complex pages with many assets it can be easy to overlook assets which don’t load. Usually they are minor JS, CSS, or tracking pixels which aren’t noticed until you’ve spent way too long trying to track down the problem (or a month later you log into your stats dashboard to discover you haven’t been collecting stats).

With the launch of the new Add-on Builder (still an alpha product, but usable) I decided to make my first Jetpack to fix this.

In only about 20 lines of code I was able to look for any 4xx or 5xx errors in the HTTP traffic and show a brief notification to the user about what went wrong and where. The builder was a great development experience (lack of documentation aside) and made it a breeze to do something relatively complex.
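The add-on itself is a JavaScript Jetpack, but the core check is simple enough to sketch in a few lines of Python. This is not the Status Watch source: the function names and the (url, status) input shape are assumptions made for the example.

```python
from http import HTTPStatus


def describe_status(code):
    """Return e.g. '404 Not Found', or just the number for unknown codes."""
    try:
        return f"{code} {HTTPStatus(code).phrase}"
    except ValueError:
        return str(code)


def error_notifications(responses):
    """Given (url, status) pairs, return one message per 4xx/5xx response."""
    return [
        f"{describe_status(status)}: {url}"
        for url, status in responses
        if 400 <= status <= 599
    ]
```

Feeding it the responses for a page load would yield one line per failed asset, which is roughly the information the notification shows: what went wrong and where.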

If you’ve wanted a similar add-on, feel free to use Status Watch. It’s a Jetpack so you’ll need Firefox 4, but on the bright side, you won’t even need to restart the browser. Just click and go.

I’ve had some requests to support a whitelist of sites so you don’t get notifications all over the web and I’ll add that when I get time. It’s surprising how many sites have 403s and 404s though.

Update 2011-02-17: Version 2 is on AMO now which supports a whitelist. See the AMO page for details.


A Night in the Emergency Department

Within minutes of my arrival at the Emergency Department a call comes in that an ambulance will arrive shortly transporting a man in cardiac arrest.  Orientation can wait.  Over the next 20 minutes he is given a regimen of drugs.  I follow him to a unit that will try to locate and destroy the clot in his heart.  In the next hour his heart stops four times while technicians put two femoral catheters in his legs and follow a dye through his blood stream.  Eventually they finish what they can do and ship him to the Cardiac Care Unit.  No one knows about permanent damage.  On my way back to the Emergency Department I pass a frantic looking woman with a cell phone.  She’s just spied her teenage daughter running in and cries, “they say it’s his heart and it’s serious.”  I don’t make eye contact.

An older couple accompanies a woman on a stretcher with hematemesis and a severely distended abdomen into room 9.  She’s legally blind and keeps asking if they are in the room.  The old man continually assures her with a soft “I’m here, mom.”  I consider the overflowing landfills briefly as my non-latex glove count hits double digits in an hour.  Another bout of black vomit snaps me back to reality.  Mother Earth can take the hit tonight.  The nurse readies an NG tube while I wipe off the patient’s chin with a warm wash cloth and tell her she looks pretty again.  She smiles.

Every time I walk past room 20 I hear a woman sobbing into the phone.  She was out celebrating tonight and a dozen margaritas later she woke up on a stretcher with a fractured tibia and fibula from tripping over a curb in a parking lot.  Somewhere between the bar and the hospital she’s lost her purse, her clothes, and her self respect.  All she can do is apologize to her mother on the phone through sobbing breaths, over and over.

The hours pass by.  A 22 year old woman has an abscess under her eye; the doctor decides to drain it with a needle instead of a knife because he doesn’t want to cut up a young girl’s face.  A 27 year old male has a seizure because he stopped taking his medicine; he says his doctor never gave it to him.  A woman in room 10 watches hospital security put restraints on her husband so he doesn’t roll off his stretcher or hurt someone.  An 84 year old man with dysphasia (he can’t speak) watches in silent pain as a nurse tries to get an IV started for the third time.  An old man tells jokes to his wife and the assistant who is setting up for an EKG; the only interruption of his smile is every few minutes when he’s curled up and clutching his chest in pain.

The Emergency Department showcases the extremes of the emotional spectrum – the best and the worst of human nature.  In one bed is a 27 year old female with cuts on her wrists who washed down all the pills in her medicine cabinet with a bottle of vodka – 20 hours later she wakes up long enough to ask to go to the bathroom and then passes back out.  Two rooms down is a man with Alzheimer’s, cursing at the nurses for trying to remove his shirt.  His wife tells me that half the time he doesn’t remember his name, but he always remembers how to swear.  She tells him she loves him as she calms him down.  They’ll be together forever.

When I drive away from the ER that night Loveline is on the radio.  It’s some girl complaining that her boyfriend doesn’t try hard enough in bed.  The world’s problems seem trivial.


I became an EMT last year and as a part of the course I had to work a twelve hour shift in the emergency department. I wrote this essay for the class, but it seemed like something to share here also. Read out of context it may sound like I didn’t enjoy the night, but that was definitely not the case – I had a great time and learned a lot. I would definitely volunteer to work there again if they had the room.


md5verify: A script to automatically verify file integrity

I have a lot of files on my computer. Email archives, personal documents, stuff for work, photos I’ve taken…the list goes on – I’m sure most people reading this are in a similar boat. On occasion I’ve found some files to be missing or corrupt which is disturbing but is probably something to be expected. The bad part is, I keep backups, but I rotate them out when they reach a certain age which means if I don’t notice a file is corrupt or missing I’ll eventually lose it forever.

I stayed up late a few nights ago and wrote a script to raise an alert when something has changed. On its first run the script will recursively walk a directory tree, hashing each file and storing the hashes in that file’s directory (in an md5sum-compatible file). On subsequent runs it will begin tracking new files automatically but it will also print messages for missing and changed files. By saving the checksums in each directory it becomes portable – you can copy a directory somewhere else and still be able to verify nothing changed (a quick md5sum -c checksums.txt will let you know).
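The approach is easy to sketch in Python. What follows is a simplified illustration of the idea, not the md5verify script itself: the function names are invented, the checksums.txt filename is borrowed from the md5sum example above, and filenames containing newlines or other characters that md5sum escapes are not handled.

```python
import hashlib
import os

CHECKSUM_FILE = "checksums.txt"


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files don't need to fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def read_checksums(path):
    """Parse 'digest  name' lines into a {name: digest} dict."""
    known = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            for line in handle:
                line = line.rstrip("\n")
                if line:
                    digest, name = line.split("  ", 1)
                    known[name] = digest
    return known


def verify_directory(dirpath, filenames):
    """Check one directory against its checksum file; return problem messages."""
    checksum_path = os.path.join(dirpath, CHECKSUM_FILE)
    known = read_checksums(checksum_path)
    present = sorted(name for name in filenames if name != CHECKSUM_FILE)
    problems = []
    for name in present:
        digest = md5_of(os.path.join(dirpath, name))
        if name not in known:
            known[name] = digest  # new file: start tracking it silently
        elif known[name] != digest:
            problems.append(f"CHANGED {os.path.join(dirpath, name)}")
    for name in sorted(set(known) - set(present)):
        problems.append(f"MISSING {os.path.join(dirpath, name)}")
    if known:
        with open(checksum_path, "w", encoding="utf-8") as handle:
            for name in sorted(known):
                handle.write(f"{known[name]}  {name}\n")
    return problems


def verify_tree(root):
    """Walk the tree under root and collect problems from every directory."""
    problems = []
    for dirpath, _dirnames, filenames in os.walk(root):
        problems.extend(verify_directory(dirpath, filenames))
    return problems
```

A cron wrapper would print whatever verify_tree returns and exit non-zero when the list is non-empty. One design choice in this sketch: the stored hash of a changed or missing file is kept, so the problem is reported on every run until someone restores the file or edits the checksum file by hand.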

By default the script only prints messages when it sees something fishy so it’s perfect to drop into cron and it uses exit statuses so it’ll work for nagios too. I’ve been running it for a few months and have found a couple files that have changed – nothing critical yet but it’s nice to know it’s there.


CLI Split Windows in Vim

I use split windows, both horizontally and vertically, in Vim all the time. I’ve always wanted to be able to split the window and then start a command line shell within that window but up until now that has just been a dream.

My friend sent me a link to conque this morning which I’m so taken with that it’s prompted me to drop coding and write a blog post.

Conque lets me do exactly what I’ve wanted: create shells within Vim. In insert mode you interact with the shell as expected. In normal mode you can scroll back through your history, yank text into buffers, paste and manipulate text. Best thing ever.


Two files open on the left, three shells on the right: IPython, MySQL, and Django’s web server – all with syntax highlighting.