Re: Translations (was: cjl - r30934 et al. - abiword/trunk/po)

From: Chris Leonard <cjlhomeaddress_at_gmail.com>
Date: Thu May 03 2012 - 20:53:54 CEST

On Thu, May 3, 2012 at 4:20 AM, Ingo Brückl <ib@wupperonline.de> wrote:
>
> Chris,
>
>>  "Language-Team: translate-discuss-af@lists.sourceforge.net\n"
>> -"Language: af_ZA\n"
>>  "MIME-Version: 1.0\n"
>>  "Content-Type: text/plain; charset=UTF-8\n"
>>  "Content-Transfer-Encoding: 8bit\n"
>> +"Language: af\n"
>
> Just a nit: xgettext places "Language:" right behind "Language-Team:" where
> it is well placed.
>
> The content of the language field should preferably be an ISO 639 language
> code (lowercase) and (if necessary and the file is named like this) an ISO
> 3166 country code (uppercase, appended by an underscore) plus an optional
> variant designator appended by an at sign. In short, the file name with minus
> sign replaced by underscore. (Please see "6.2 Filling in the Header Entry" in
> the gettext manual.)

Perhaps we read the gettext manual's instructions differently.

http://www.gnu.org/software/gettext/manual/html_node/Header-Entry.html

<quoting>

Language
    Fill in the language code of the language. This can be in one of
    three forms:

        ‘ll’, an ISO 639 two-letter language code (lowercase). See
        Language Codes for the list of codes.

        ‘ll_CC’, where ‘ll’ is an ISO 639 two-letter language code
        (lowercase) and ‘CC’ is an ISO 3166 two-letter country code
        (uppercase). The country code specification is not redundant: Some
        languages have dialects in different countries. For example, ‘de_AT’
        is used for Austria, and ‘pt_BR’ for Brazil. The country code serves
        to distinguish the dialects. See Language Codes and Country Codes for
        the lists of codes.

        ‘ll_CC@variant’, where ‘ll’ is an ISO 639 two-letter language
        code (lowercase), ‘CC’ is an ISO 3166 two-letter country code
        (uppercase), and ‘variant’ is a variant designator. The variant
        designator (lowercase) can be a script designator, such as ‘latin’ or
        ‘cyrillic’.

. . .

So, if your locale name is ‘de_DE.UTF-8’, the language specification
in PO files is just ‘de’.

</quoting>
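
To make the three quoted forms concrete, here is a minimal Python sketch
that checks a "Language:" value against them. The helper name
parse_language_field is hypothetical (it is not part of gettext), and the
pattern covers only the three forms quoted above, nothing else the manual
may allow.

```python
import re

# Hypothetical helper, not part of gettext. The pattern encodes only the
# three forms quoted above: ll, ll_CC and ll_CC@variant.
LANGUAGE_FIELD = re.compile(
    r"(?P<lang>[a-z]{2})"            # ISO 639 two-letter code, lowercase
    r"(?:_(?P<country>[A-Z]{2})"     # optional ISO 3166 code, uppercase
    r"(?:@(?P<variant>[a-z]+))?)?"   # optional lowercase variant designator
)


def parse_language_field(value):
    """Return the parts of a Language: value, or None if it fits no form."""
    match = LANGUAGE_FIELD.fullmatch(value)
    if match is None:
        return None
    return match.groupdict()
```

Under these assumptions, "af" and "af_ZA" both pass, while a full locale
name such as "de_DE.UTF-8" or a hyphenated file-name stem such as "af-ZA"
is rejected.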

The odd thing about the example they chose (de) is that I can readily
identify de_CH as a counterexample, but the point remains: they seem to
be suggesting the shorter ISO 639 code only.

It seems all versions are *permitted*, but it sounds to me like the
short form (ISO 639 only) is preferred unless there is good reason to go
with a longer, glibc-locale-like entry (e.g. en_GB, pt_BR, @latin
variants, etc.).

I know that AbiWord generally uses full glibc-locale-like names for
the PO files in nearly all circumstances, although I am not sure
exactly why. This is obviously needed in certain commonly occurring
cases (en_GB, pt_BR, @latin variants, etc.), but I am not sure what
advantage is gained from specifying af_ZA or sl_SI or cy_GB when no
country-specific (ISO 3166) variants are commonly found in everyday
usage (at least, not AFAICT).

I'm open to being educated otherwise, even if the answer is "that's
just the way we've always done it at AbiWord", but I'd like to learn
whether there is a functional reason for being more specific than
strictly required. There may be some advantages to using the more
general case of just the ISO 639 code (without the ISO 3166 modifier of
glibc locales), again, unless common usage indicates otherwise.

In Sugar Labs Pootle, we generally use the shortest unambiguous code
for setting up languages, but that's just the choice we made.
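
As a rough illustration of what "shortest unambiguous" could mean, here
is a small Python sketch. It is only my assumption of one way to apply
the idea, not how Pootle or AbiWord actually does it, and the function
name shortest_unambiguous is made up for the example.

```python
import re
from collections import defaultdict


def shortest_unambiguous(locales):
    """Map each locale-like name to the shortest code that stays unique.

    Assumption for this sketch: a name is shortened to its bare language
    code only when no other name in the input shares that language and
    the name carries no @variant designator.
    """
    by_lang = defaultdict(set)
    for name in locales:
        # The language code is everything before the first "_" or "@".
        lang = re.split(r"[_@]", name, maxsplit=1)[0]
        by_lang[lang].add(name)

    result = {}
    for lang, names in by_lang.items():
        for name in names:
            alone = len(names) == 1 and "@" not in name
            result[name] = lang if alone else name
    return result
```

Given af_ZA, cy_GB, pt_BR and pt_PT, this would shorten the first two to
af and cy and leave the two pt entries alone. Note that it only looks at
the names it is given, so a lone en_GB would collapse to en; cases like
that still need a human decision based on common usage.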

Just curious.

cjl
Sugar Labs Translation Team Coordinator
Received on Thu May 3 20:54:46 2012
