[libreoffice-l10n] gettext and translations

sophi

[libreoffice-l10n] gettext and translations

Hi all,

I invite all of you to read this message from Caolán. It's a
discussion, so please share your concerns, your solutions, your views :)

https://lists.freedesktop.org/archives/libreoffice/2017-May/077818.html

For those who translate outside of Pootle: msgctxt is to be removed. If
you need it, just tell me how you use it.

Thanks a lot in advance,
Cheers
Sophie
--
Sophie Gautier [hidden email]
GSM: +33683901545
IRC: sophi
Release coordinator
The Document Foundation

--
To unsubscribe e-mail to: [hidden email]
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted
Martin Srebotnjak

Re: [libreoffice-l10n] gettext and translations

Hi, Sophie,

Regarding 16 and 18, I think these are important, or even the most
important, ones.

Number 18 is the crucial one and should be tried in a separate environment,
with an extra instance of Pootle, to test what really happens. It should not
be introduced without testing it on real-world examples.

As for 16, I think it might be important, if I understand correctly how
the gettext functions work (and the scripts that use these commands, like
pomigrate2 etc.). In Slovenian, and probably in other languages, the same
English string is sometimes translated differently. Some languages use
cases, so a word that has one form in English has several forms in other
languages. Also, some English words have different meanings/translations
in other languages depending on context. With the change in 16, I fear that
the translation systems will offer, or even automatically use, the first
available translation of the English string instead of offering a fuzzy
string for translators to check whether the translation is right (i.e. they
will no longer distinguish between different translations of the same
English string, a distinction that is now made by this part of the po file).
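
To make this concern concrete, here is a minimal Python sketch. It is not the code of gettext, Pootle or the Translate Toolkit; it assumes a hypothetical translation memory held in a plain dict, and it borrows the Bulgarian "Number" pair that appears later in this thread. The context labels are only illustrative.

```python
# Hypothetical translation memory: one English string, two valid translations.
# Each entry is (context, English source, translation).
entries = [
    ("noun", "Number", "Номер"),
    ("menu_item", "Number", "Номериране"),
]

# Keyed by the English string only: the first translation silently wins.
tm_by_source = {}
for context, source, target in entries:
    tm_by_source.setdefault(source, target)

# Keyed by (context, English string): both translations survive.
tm_by_context = {(context, source): target for context, source, target in entries}

print(tm_by_source["Number"])
print(tm_by_context[("menu_item", "Number")])
```

With the source-only key, the menu item would be offered the noun translation without any fuzzy flag to warn the translator.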

As I am using my own translation system, built on basic Translate Toolkit
commands and scripts, I am ready to try your test po sets on my
translation memory to see what happens.

Also, the LO version for this, if it goes through, should be decided in
advance; 5.4 is certainly not the one.

Lp, m.

Mihkel Tõnnov

Re: [libreoffice-l10n] gettext and translations

In reply to this post by sophi
2017-06-01 18:14 GMT+03:00 Sophie <[hidden email]>:

> Hi all,
>
> I invite all of you to read this message from Caolán. It's a
> discussion, so please share your concerns, your solutions, your views :)
>
> https://lists.freedesktop.org/archives/libreoffice/2017-May/077818.html
>
> For those who translate outside of Pootle: msgctxt is to be removed. If
> you need it, just tell me how you use it.
>

Hi Sophie, *,

I read Caolán's mail, and while removing clutter is good, I'm really
worried by this proposed (impending?) removal of msgctxt. Namely, does
removing it mean that e.g. "Left", "Edit", "Number", "Print", "None" etc.
could no longer be translated differently (depending on the string's
precise context) within the same module? How "big" would a module be in the
proposed new system anyway? Would all strings in the same
toolbar/panel/dialog/menu/etc. be one module, or all strings in the same
component, like the whole of Writer?

If the answer to my first question is yes, then it would affect the more
inflected languages quite badly - and very, very badly if the answer to my
second question is anything that encompasses more than one single toolbar,
or one single panel, or dialog, or menu. The exact maximum size of one
module will vary by language.

Also, would removing msgctxt really only affect those of us who translate
offline? Doesn't Pootle distinguish strings based on that as well?

In any case, as has been said time and time again: especially for short
strings (one or two words), each and every appearance of the string HAS TO
BE independently translatable, no matter if the string is the same in
English - consider these for example:
"Number" - is it a noun or a verb?
"None" - which gender, number, case, etc. does it have?
"Open/Save/Print" - is it a dialog title or the button?

These are just a few most basic examples off the top of my head. There are
many, many more such strings out there in the Estonian translation alone -
and Estonian doesn't even have the grammatical gender distinction that e.g.
the Slavic languages have. Ask translators of e.g. Slovenian, Bulgarian,
Polish or Russian how many times they have had to fight in various
projects for splitting strings that are identical in English, but need e.g.
different gender forms in translation.

Anyway, since you asked especially for offline use cases, here's mine: when
updating the translation after several thousand words go fuzzy after
each major release, I use msgctxt to quickly identify the (basic) context
of the string - is it a dialog title, label, menu item with/without context,
etc. - and match it to a suitable TM suggestion. If the location of the
string remains unclear from msgctxt, or if it's a new feature, I go find
the actual location based on Key-ID. If I had to find every last one of
the fuzzies via Key-ID, translating LibreOffice would take a lot more time
and effort, and cause a lot more frustration.
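
The offline workflow described above can be sketched roughly as follows. This is only an illustration in Python, not an existing tool: the PO fragment is made up, the function name `fuzzy_by_context` is hypothetical, and the parser handles only single-line fields without escapes, which real PO files do not guarantee.

```python
from collections import defaultdict

# Made-up PO fragment; real LibreOffice entries carry more fields.
PO_TEXT = '''
#, fuzzy
msgctxt "dialog title"
msgid "Print"
msgstr ""

#, fuzzy
msgctxt "button label"
msgid "Print"
msgstr ""

msgctxt "menu item"
msgid "Edit"
msgstr ""
'''


def fuzzy_by_context(po_text):
    """Group the msgids of fuzzy entries by msgctxt (single-line fields only)."""
    groups = defaultdict(list)
    # Entries are assumed to be separated by one blank line.
    for block in po_text.strip().split("\n\n"):
        fuzzy = False
        context = None
        msgid = None
        for line in block.splitlines():
            if line.startswith("#,") and "fuzzy" in line:
                fuzzy = True
            elif line.startswith("msgctxt "):
                context = line[len("msgctxt "):].strip().strip('"')
            elif line.startswith("msgid "):
                msgid = line[len("msgid "):].strip().strip('"')
        if fuzzy and msgid is not None:
            groups[context].append(msgid)
    return dict(groups)


for context, msgids in fuzzy_by_context(PO_TEXT).items():
    print(context, msgids)
```

Without msgctxt, both fuzzy "Print" entries would land in a single group, and each would have to be located through its Key-ID.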

Best regards,
Mihkel
Estonian team

Michael Bauer

Re: [libreoffice-l10n] gettext and translations

In reply to this post by Martin Srebotnjak
Big +1 for what Martin and Mihkel have said about matching existing
translations and not losing the ability to translate to different
targets depending on the context.

If there is going to be wholesale re-organisation as to how the strings
are presented for translation, two other considerations also spring to
mind which I didn't see on the list:
1) Plurals? I think LO still doesn't do plurals properly (but I may be
confusing projects here, apologies if it does). This would also tie in
with the 1 source string » multiple target strings issue.
2) Turning en-US into a to-be-translated locale and making the source
strings just en or some fake locale? That way, en-US can change case,
correct en-US typos and other such stuff which is English specific to
its heart's content without hitting all the other locales at the same time.

Michael


Yury Tarasievich

Re: [libreoffice-l10n] gettext and translations

In reply to this post by Mihkel Tõnnov
I've used the .PO-based workflow from the
beginning of my OOO/LO L10N stint, and yes,
you'd get those problems in such an environment.

You'd just have to keep the IDs for the strings'
translation variants/exceptions/etc. separately.

That was how I was dealing with the problem,
anyway -- last time I looked, there was no easy
way to save this in the .PO files created from the
POT sets published by the OOO/LO teams. I can't
rightly remember, but it seems the extra info was
lost in migration from POT set to POT set.

-Yury

On 01/06/17 22:45, Mihkel Tõnnov wrote:
> 2017-06-01 18:14 GMT+03:00 Sophie <[hidden email]>:
...

Serg Bormant

Re: [libreoffice-l10n] gettext and translations

Another big +1 for what Martin and Mihkel have said about matching
existing translations and not losing the ability to translate to
different targets depending on the context.

It would also be a bad thing if we lost the qtz (KeyIDs) pseudo-locale,
which lets us tell one entry from another.

--
wbr, sb, Russian Team

Mihail Balabanov

RE: [libreoffice-l10n] gettext and translations

Hello,

Yes, Bulgarian is also one of those inflected languages that require different translations to the same English phrase depending on the context. If we lose msgctxt, we must have another way to separate different instances of the same English phrase. Years ago at my workplace, we had this problem with an in-house l10n system and AFAIR we circumvented it by including the context in the original translatable string where necessary and then removing it in a small English ‘translation’, something like this:

Original: Number|noun
English translation: Number
Bulgarian translation: Номер

Original: Number|menu_item
English translation: Number
Bulgarian translation: Номериране

Original: Settings
English translation: <same as the original, so not included in the l10n file>
Bulgarian translation: Настройки

So this could be a fallback if the new system does not support a context/instance ID separate from the translatable text itself. UI code can also automatically strip the part after the escaping character, thus eliminating the need for an English ‘translation’.
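
A minimal Python sketch of this fallback, assuming "|" as the escaping character and a plain dict as the catalog. The helper names `strip_context` and `translate` are hypothetical, and nothing here reflects how LibreOffice actually looks up strings.

```python
SEPARATOR = "|"  # assumed escaping character, as in "Number|noun"


def strip_context(source):
    """Return the display text: everything before the first separator."""
    return source.split(SEPARATOR, 1)[0]


def translate(source, catalog):
    """Look up the full key (text plus context).

    If the catalog has no entry, fall back to the English text with the
    context suffix stripped, so no English 'translation' file is needed.
    """
    return catalog.get(source, strip_context(source))


bulgarian = {
    "Number|noun": "Номер",
    "Number|menu_item": "Номериране",
    "Settings": "Настройки",
}

print(translate("Number|menu_item", bulgarian))
print(translate("Number|noun", {}))  # empty catalog: English fallback
```

One obvious limitation of this sketch: a source string that legitimately contains "|" would be cut short, so a real implementation would need a way to escape the separator.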

I rarely download POs to translate offline, but even in Pootle, msgctxt is often useful to me to decide how to translate short ambiguous phrases instead of searching for them visually in the UI.

Cheers,
Mihail




Mihkel Tõnnov

Re: [libreoffice-l10n] gettext and translations

2017-06-04 22:23 GMT+03:00 <[hidden email]>:

> Hello,
>
> Yes, Bulgarian is also one of those inflected languages that require
> different translations to the same English phrase depending on the context.
> If we lose msgctxt, we must have another way to separate different
> instances of the same English phrase. Years ago at my workplace, we had
> this problem with an in-house l10n system and AFAIR we circumvented it by
> including the context in the original translatable string where necessary
> and then removing it in a small English ‘translation’, something like this:
>
> Original: Number|noun
> English translation: Number
> Bulgarian translation: Номер
>
> Original: Number|menu_item
> English translation: Number
> Bulgarian translation: Номериране
>
> Original: Settings
> English translation: <same as the original, so not included in the l10n
> file>
> Bulgarian translation: Настройки
>
> So this could be a fallback if the new system does not support a
> context/instance ID separate from the translatable text itself. UI
> code can also automatically strip the part after the escaping character,
> thus eliminating the need for an English ‘translation’.
>

Hi,

While I appreciate the ingenuity of this approach, it would be utter
stupidity to replace a well-working msgctxt-based implementation with such
a system in an l10n project the size of LibO :)

Best, Mihkel

Cor Nouws

Re: [libreoffice-l10n] gettext and translations

In reply to this post by Mihail Balabanov
[hidden email] wrote on 04-06-17 21:23:
> Yes, Bulgarian is also one of those inflected languages that require different translations to the same English phrase depending on the context. If we lose msgctxt, we must have another way to separate different instances of the same English

If I read Caolán's post correctly, it would not be a problem to preserve
msgctxt in the new situation.

ciao,
Cor

--
Cor Nouws
GPD key ID: 0xB13480A6 - 591A 30A7 36A0 CE3C 3D28  A038 E49D 7365 B134 80A6
- vrijwilliger http://nl.libreoffice.org
- volunteer http://www.libreoffice.org
- The Document Foundation Membership Committee Member

sophi

Re: [libreoffice-l10n] gettext and translations

Hi all,

Le 04/06/2017 à 22:38, Cor Nouws a écrit :
> [hidden email] wrote on 04-06-17 21:23:
>> Yes, Bulgarian is also one of those inflected languages that require different translations to the same English phrase depending on the context. If we lose msgctxt, we must have another way to separate different instances of the same English
>
> If I read Caolán's post correctly, it would not be a problem to preserve
> msgctxt in the new situation.

Thanks to all for your feedback. We are well aware of the need for
context; as Cor said, msgctxt could be preserved. Also, before going live
with it, several tests will be run to make sure the situation is
OK for us and doesn't mean more work on our side.

Cheers
Sophie

--
Sophie Gautier [hidden email]
GSM: +33683901545
IRC: sophi
Release coordinator
The Document Foundation

Milos Sramek

Re: [libreoffice-l10n] gettext and translations

Hi,
one thing I miss in the current system is the possibility to
translate plurals correctly. gettext supports that. It might be
necessary to change something in the code, but I think it is worth it.
Please think about this.
best
Milos
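
For readers unfamiliar with gettext plurals, here is a rough Python sketch of the idea: the catalog stores several forms per string, and a per-language rule picks one from the count. Slovak is used purely as an assumed example (the message above does not name a language), the rule is the three-form one commonly given for Slovak in Plural-Forms headers, and the strings are illustrative rather than taken from LibreOffice.

```python
def slovak_plural_index(n):
    """Three-form rule (assumed): 1 -> form 0, 2 to 4 -> form 1, otherwise form 2."""
    if n == 1:
        return 0
    if 2 <= n <= 4:
        return 1
    return 2


# Illustrative forms for "%d file(s)"; in a PO file these would be
# msgstr[0], msgstr[1] and msgstr[2] of a single entry.
FORMS = ["%d súbor", "%d súbory", "%d súborov"]


def ngettext_like(n):
    """Pick the plural form for n and fill in the number."""
    return FORMS[slovak_plural_index(n)] % n


for count in (1, 3, 5):
    print(ngettext_like(count))
```

English needs only two forms, which is why code that simply switches between a singular and a plural string cannot be translated correctly into languages like this.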


--
email & jabber: [hidden email]


Valter Mura

Re: [libreoffice-l10n] gettext and translations

In reply to this post by Mihkel Tõnnov


Il 01/06/2017 21:45, Mihkel Tõnnov ha scritto:

> 2017-06-01 18:14 GMT+03:00 Sophie <[hidden email]>:
>
>> Hi all,
>>
>> I invite all of you to read this message from Caolán. It's a
>> discussion, so please share your concerns, your solutions, your views :)
>>
>> https://lists.freedesktop.org/archives/libreoffice/2017-May/077818.html
>>
>> For those who translate outside of Pootle: msgctxt is to be removed. If
>> you need it, just tell me how you use it.
>>
>
> Hi Sophie, *,
>
> I read Caolán's mail, and while removing clutter is good, I'm really
> worried by this proposed (impending?) removal of msgctxt. Namely, does
> removing it mean that e.g. "Left", "Edit", "Number", "Print", "None" etc.
> could no longer be translated differently (depending on the string's
> precise context) within the same module? How "big" would a module be in the
> proposed new system anyway? Would all strings in the same
> toolbar/panel/dialog/menu/etc. be one module, or all strings in the same
> component, like the whole of Writer?

Hi All

I agree with this point: 'msgctxt' often gives useful hints to
translators, especially to those who translate offline.
Of course, if LibreOffice developers don't use it, it would be
useless for us.

What I hope for is the following; look at this example taken from KDE:

#. +> trunk5
#: plugins/generic/skg_advice/skgadviceboardwidget.cpp:40
#, kde-format
msgctxt "Dashboard widget title"
msgid "Advices"
msgstr "Suggerimenti"

#. +> trunk5
#: plugins/generic/skg_advice/skgadviceboardwidget.cpp:57
#, kde-format
msgctxt "Noun, a user action"
msgid "Activate all advice"
msgstr "Attiva tutti i suggerimenti"


This is very useful for us :)
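
As a rough illustration of why msgctxt keeps entries apart, the Python sketch below mimics how GNU gettext catalogs key a contextual entry: the context and the msgid are joined with an EOT character (`\x04`). The dict and the helper names `catalog_key` and `pgettext_like` are hypothetical stand-ins for a compiled catalog; the strings come from the KDE example above.

```python
CONTEXT_SEPARATOR = "\x04"  # EOT character placed between msgctxt and msgid


def catalog_key(msgid, msgctxt=None):
    """Build the lookup key: plain msgid, or msgctxt + EOT + msgid."""
    if msgctxt is None:
        return msgid
    return f"{msgctxt}{CONTEXT_SEPARATOR}{msgid}"


# Hypothetical in-memory catalog holding the two entries shown above.
catalog = {
    catalog_key("Advices", "Dashboard widget title"): "Suggerimenti",
    catalog_key("Activate all advice", "Noun, a user action"): "Attiva tutti i suggerimenti",
}


def pgettext_like(msgctxt, msgid):
    """Return the translation for (msgctxt, msgid), or msgid itself if missing."""
    return catalog.get(catalog_key(msgid, msgctxt), msgid)


print(pgettext_like("Dashboard widget title", "Advices"))
print(pgettext_like("Some other context", "Advices"))  # no entry: falls back to msgid
```

Because the context is part of the key, the same English msgid can appear several times with different translations.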

Ciao
--
Valter
Open Source is better!
LibreOffice: www.libreoffice.org
KDE: www.kde.org
Kubuntu: www.kubuntu.org
OpenSuse: it.opensuse.org
