
addons.mozilla.org Celebrates 1000 (passing) Unit Tests

We started writing unit tests for AMO a few years ago with the best of intentions. As the tests grew we started running into memory/timeout problems that prevented us from running the tests. Other priorities took over and since we couldn't run the tests we quit writing them. The tests got put on the back burner, became stale, and were for the most part forgotten (an all too familiar story for most developers). Over the past few months we've been turning that around. While it's certainly a team effort, it's not stretching the truth to say that Jeff Balogh has been the driving force behind making sure our framework can scale and getting our old tests running again. Thanks to his tireless efforts our latest numbers show over 1200 unit tests, 1065 of which are passing. In an effort to prevent them from being forgotten again he also created an IRC bot named bosley who tracks the tests and reminds people when they fail. Expect to see bosley in #amo soon. The number of tests and the continuous monitoring of them is a huge milestone for AMO and Mozilla WebDev.

The Tagging Plan for AMO

Firstly, thanks for all the great feedback. Something as seemingly simple as tagging gets complex quickly when thought out and the varied perspectives of the community are always great to have.

Allowing full Unicode would let anyone use meaningful tags in their own character sets but would prevent us from offering similar matches and common misspellings. On the other hand, we support several languages on AMO that don't use the Latin alphabet. If tags were restricted to the Latin alphabet, it stands to reason that those users would search for tags in their own character sets and would get no results. There are pros and cons for each choice but we're essentially debating the value of normalization in tagging.

After distilling all the feedback and talking amongst ourselves our overall feeling was that forcing people to convert their input into the Latin alphabet wasn't in the users' best interest. The Mozilla Manifesto talks about a global internet that fosters creativity and free expression. Not supporting a user's native language when we have the option to doesn't feel like the right path to take. With that in mind our current plan is as follows:

1) Allow full Unicode in tags to the extent we do everywhere else on AMO.

2) Do no automatic character normalization. The option of manual normalization (essentially, marking some tags as equivalent) is left open as a future enhancement.

3) Do automatic white space and capitalization normalization. Spaces are displayed on an add-on's page but when searching or entering into a URL spaces are unnecessary. For example, newyork, new york and New YORk are all equivalent.

4) A list of suggestions will be provided as the user types. We may attempt some simplistic character normalization in the suggestions if we can come up with a way that provides enough value to continue to use (perhaps something that is per-language).

5) White space is trimmed from the beginning and end of tags before they are saved into the database.

6) Tags are limited to 128 characters and add-ons are limited to 80 tags.

7) Tags will be comma delimited. To include a comma in your tag you must use quotation marks. Quotation marks, whether they are matched or not, are discarded. Example: "Portland, OR" will become Portland, OR whereas Portland", OR will become Portland and OR.

Additional feedback, as always, is welcome.
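To make the whitespace, capitalization, quoting and limit rules in the plan concrete, here is a minimal sketch in Python. It is not AMO's actual code: the function names (parse_tags, search_key, check_limits) are made up for illustration, and rejecting over-limit input with an error is an assumption, since the plan only states the limits themselves.

```python
MAX_TAG_LENGTH = 128      # limit stated in the plan
MAX_TAGS_PER_ADDON = 80   # limit stated in the plan


def parse_tags(raw):
    """Split a comma-delimited string into tags.

    A matched pair of quotation marks protects the commas inside it.
    Quotation marks themselves are always discarded, matched or not.
    """
    tags = []
    current = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == '"':
            closing = raw.find('"', i + 1)
            if closing != -1:
                # Matched pair: keep the quoted text verbatim, commas included.
                current.append(raw[i + 1:closing])
                i = closing + 1
                continue
            # Unmatched quotation mark: fall through and drop it.
        elif ch == ',':
            tags.append("".join(current))
            current = []
        else:
            current.append(ch)
        i += 1
    tags.append("".join(current))
    # Trim leading/trailing white space and drop empty entries.
    return [t.strip() for t in tags if t.strip()]


def search_key(tag):
    """Key used to compare tags: white space and capitalization are ignored.

    No other character normalization is applied (e is not folded into e-acute).
    """
    return "".join(tag.split()).lower()


def check_limits(tags):
    """Assumption: over-limit input is rejected rather than truncated."""
    if len(tags) > MAX_TAGS_PER_ADDON:
        raise ValueError("too many tags for one add-on")
    for tag in tags:
        if len(tag) > MAX_TAG_LENGTH:
            raise ValueError("tag is longer than 128 characters")


print(parse_tags('"Portland, OR"'))   # ['Portland, OR']
print(parse_tags('Portland", OR'))    # ['Portland', 'OR']
print(search_key("New YORk"))         # newyork
```

The sketch uses lower() rather than a more aggressive case fold so that it stays within "capitalization normalization" and does not rewrite characters.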

Some considerations when adding Tags to AMO

Tags broke into the limelight around the time "Web 2.0" was becoming popularized. They provided a simple but effective way to categorize objects and many sites are using them now. Despite their proliferation, I haven't found any documentation on the internet regarding standards for implementing tags. A tag library exists for CakePHP but it, and many others, are too simplistic for what we want. We've written our tagging goals into a plan but have some technical details we still need to figure out. While reviewing what we have a couple questions arose that we thought people would have opinions on.

1) What should the range of allowed characters be? Our first instinct was simplicity, something like /[A-Za-z0-9-]/ (that is, all English letters and numbers and a dash). This is easy to handle on our end but leaves out everyone that doesn't want to add tags using the English alphabet. There is some debate how useful it would be to allow other Unicode characters, particularly when you think about #2 below.

2) Tags are most useful when they are normalized. By allowing Unicode characters we run the risk of diluting our tag cloud. For example, resume and résumé are close enough that for our purposes they are equivalent. If we allow Unicode we'll have to deal with converting characters like é to e and vice versa for searches. At that point we'll need a list of "equivalent" characters - not impossible but it will slow things down (both development and speed of a search). The second question is: Assuming you think we should allow Unicode characters, what characters are equivalents?

Here is a quick idea from php.net's strtr() documentation:
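The strtr() snippet itself is not included here. As a separate illustration of the two questions, here is a rough Python sketch. It is not the php.net approach (a hand-written table of equivalent characters): it swaps in Unicode decomposition to strip accents, which covers cases like é but not every pair people might consider equivalent. The function names are made up for this example.

```python
import re
import unicodedata

# Question 1: the "simple" character range, English letters, digits and a dash.
ASCII_TAG = re.compile(r"[A-Za-z0-9-]+")


def is_simple_tag(tag):
    """True when the whole tag fits the /[A-Za-z0-9-]/ range."""
    return ASCII_TAG.fullmatch(tag) is not None


# Question 2: treating accented and unaccented spellings as equivalent.
def fold_accents(text):
    """Decompose characters (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(is_simple_tag("resume"))    # True
print(is_simple_tag("résumé"))    # False
print(fold_accents("résumé"))     # resume
```

Characters that are not built from a base letter plus an accent (for example ø or ß) pass through fold_accents unchanged, so a real implementation would still need some explicit list of equivalents on top of this.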

Differentiate Bugzilla emails?

Bugzilla is an awesome bug tracker that is used by hundreds of companies. I've got accounts on several projects' trackers and I'm sure many others do also. When I get mail from Bugzilla it's not obvious which project it's from. My email client (Gmail) only shows the "from name" so all I see for these projects is:

Mozilla: bugzilla-daemon
Pootle: bugzilla-daemon
Miro: bugzilla
kernel.org: bugme-daemon
Apache: bugzilla

Wouldn't it make sense to differentiate each project's emails in the from name? Maybe even by default (something like "%SITE_NAME% Bugzilla")? Reed says it's a personal problem because his mail client shows the full address. Am I the only one? :(

How addons.mozilla.org defends against XSS attacks

One of the things that gets a lot of news time these days is XSS. There are a lot of places that explain what it is and how to prevent it but most are oversimplified or don’t provide real world examples. I thought I’d explain a couple of the ways AMO attempts to prevent it.