Using substitution strings in .po files

A couple of years ago I recommended using fake msgids (substitution strings) in .po files and was, predictably, met with some argument. I suggested the hack because, at the time, there was no standard way to store context in a .po file. [1]

Since then, msgctxt has become a standard part of gettext, which makes my substitution string recommendation obsolete. I wanted to officially come out and say: substitution strings are a pain. The scripts we used made them manageable, but finding a string in the code meant searching through the .po file first. Not only was this painful for our developers, I think it confused contributors as well.
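For anyone who hasn't used msgctxt, here is a rough sketch of why it removes the need for fake msgids: the code keeps the readable English string, and the context picks which translation applies. It uses Python's standard gettext module (pgettext needs Python 3.8 or later) rather than anything AMO ran. The build_mo helper, the "status" context, and the French strings are all made up for illustration; a real project would compile its .po files with msgfmt instead of packing a catalog by hand.

```python
import gettext
import io
import struct


def build_mo(entries):
    """Pack a {key: translation} dict into the bytes of a little-endian GNU .mo file.

    Hypothetical helper for this sketch only; msgfmt normally does this job.
    """
    items = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in entries.items())
    count = len(items)
    keys_table = 7 * 4                      # the header is seven 32-bit integers
    values_table = keys_table + count * 8   # each table row is (length, offset)
    strings_start = values_table + count * 8

    rows, blob = [], b""
    for text in [k for k, _ in items] + [v for _, v in items]:
        rows.append((len(text), strings_start + len(blob)))
        blob += text + b"\0"                # strings are NUL-terminated

    header = struct.pack("<7I", 0x950412DE, 0, count, keys_table, values_table, 0, 0)
    return header + b"".join(struct.pack("<2I", *row) for row in rows) + blob


# In a compiled catalog, a msgctxt entry is keyed as context + "\x04" + msgid.
catalog = {
    "": "Content-Type: text/plain; charset=UTF-8\n",  # catalog header entry
    "Open": "Ouvrir",                                   # plain msgid (the verb)
    "status\x04Open": "Ouvert",                         # msgctxt "status", msgid "Open"
}

translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))

verb = translations.gettext("Open")                     # lookup without context
state = translations.pgettext("status", "Open")         # same English, with context
missing = translations.pgettext("no such context", "Open")  # falls back to the msgid

print(verb, state, missing)
```

The last lookup is the part substitution strings couldn't give us: when a translation is missing, the user still sees the English string instead of a fake key.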

In our latest release, we converted AMO to use regular .po files. On the off chance someone followed my advice and would like to convert their site to regular .po files as well, Zbigniew Braniecki wrote a bit about the process and you can grab his scripts (and read about the troubles I had) at bug 501988.

Since I tagged this post "hindsight", I guess I should look back and conclude something too. Would I do it again? At the time, there were no great alternatives. So, yeah, I would get more opinions on whether the Gnome/KDE method was better, but I would do it again. I think it was the best choice out of several poor ones, but that doesn’t mean I’m not very happy to be rid of substitution strings.

[1] To be fair, there was a more common way, used by some KDE and Gnome projects, which was to put a delimiter in the msgid and keep the context on one side and the original string on the other. This is also pretty hacky and you can read about Gnome’s migration away from that method if you’re curious.
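A minimal sketch of that delimiter approach, assuming "|" as the delimiter and context on the left. The function name and the sample catalog are hypothetical, not taken from any Gnome or KDE code. The hacky part is the fallback: when no translation exists, the lookup hands back the whole msgid, so the context has to be stripped off before the string is shown.

```python
def context_gettext(lookup, msgid, delimiter="|"):
    """Translate a "context|string" msgid, stripping the context if untranslated.

    `lookup` is any callable that returns the translation of a msgid, or the
    msgid itself when there is none (which is how gettext behaves).
    """
    translated = lookup(msgid)
    if translated == msgid and delimiter in msgid:
        # No translation found: drop everything up to the first delimiter.
        return msgid.split(delimiter, 1)[1]
    return translated


# Hypothetical catalog: only one of the two contexts has been translated.
table = {"Menu|Open": "Ouvrir"}


def lookup(msgid):
    return table.get(msgid, msgid)


translated = context_gettext(lookup, "Menu|Open")
fallback = context_gettext(lookup, "Status|Open")
print(translated, fallback)
```

Any source string that legitimately contains the delimiter needs special care with this scheme, which is one reason it feels like a hack.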

5 Comments

It's 3am in the mornin... put my key in the door 'n... bodies layin all over the floor n'... im not really sure how they got there, so i guess i musta killed 'dem. KILLED 'DEM.

Sorry, that came to mind after reading your blog title...

Interesting post. So I guess there is no standard way to do substitutions at the moment without dirty hacking?
-- g, 02 Sep 2009

Interesting, we independently came up with the same solution back in 2002-3 for Plone, and we are moving away from it now too.

Thanks for blogging about the scripts and advice — I'm sure it'll come in handy for our i18n people. :)
-- Alexander Limi, 02 Sep 2009

"So I guess there is no standard way to do substitutions at the moment without dirty hacking?"

Substitutions, no. But they shouldn't be necessary with msgctxt.
-- Wil Clouser, 02 Sep 2009

IMHO, it's still a pain to keep n+2 copies of the original English string in your source code repo for n locales (the 2 are the .pot with just English and the copy in the source code itself) as well as keeping long strings or paragraphs of text in the source code. The stuff we use in the Mozilla code is better in keeping only one copy of the English strings, but has other problems. IMHO, only a new library and L10n file format can make this really better, so I long for L20n.
-- Robert Kaiser, 04 Sep 2009

Glad to see them go! In retrospect it's taken quite a while for msgctxt first to be added to gettext and then to get used.

@Robert: it's just space, and it's not as if we suffer from a lack of it at the moment. Unfortunately the alternatives mean continually aligning two files, the English and the translation, which in some instances can be impossible to align. I'd rather suffer the redundancy, knowing that things can't go out of sync and that I can work easily offline.
-- Dwayne Bailey, 07 Sep 2009
