[libreoffice-l10n] Question about odd use of msgctxt

Chris Leonard

Hello LibreOffice L10n team,

Is there a logical reason that so many of the PO files for LibreOffice
have seemingly extraneous msgctxt comments that are simply a
repetition of the location?

Just one extreme example:

scp2/source/xsltfilter.po

Is it really necessary to translate "XSLT Sample Filters" twice (as
enforced by the msgctxt comment)?  Is there ever going to be a
difference in the way the first and second instance is translated?  Is
this just some hack to address builds in Windows that don't use
gettext properly?

I'm very interested to understand the logic behind this curious
choice.  There are many downsides to this practice (abuse of msgctxt)
which I could describe at greater length if needed.

#. No_K
#: module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
msgctxt "module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text"
msgid "XSLT Sample Filters"
msgstr ""

#. Bq1B
#: module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
msgctxt "module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text"
msgid "XSLT Sample Filters"
msgstr ""



With conventional gettext merging, these would normally appear as a single entry:

#: module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
#: module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
msgid "XSLT Sample Filters"
msgstr ""
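
A minimal Python sketch of why the two forms behave differently, assuming a catalog is modelled as a plain dictionary. The `lookup` helper and the placeholder translations are hypothetical and are not LibreOffice or gettext code. With msgctxt, the key is the pair (context, msgid), so the same English string produces two entries; without it, the key is the msgid alone and one translation serves both locations.

```python
# Illustrative only: model a PO catalog as a dict to show how msgctxt
# changes the lookup key. The translations are placeholders.

NAME_CTX = "module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text"
DESC_CTX = "module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text"

# Style seen in the LibreOffice PO files: one entry per (msgctxt, msgid).
with_context = {
    (NAME_CTX, "XSLT Sample Filters"): "<name translation>",
    (DESC_CTX, "XSLT Sample Filters"): "<description translation>",
}

# Conventional gettext merging: one entry per msgid, several locations.
merged = {
    "XSLT Sample Filters": "<single translation>",
}


def lookup(catalog, msgid, msgctxt=None):
    """Return the translation, falling back to the msgid itself."""
    key = msgid if msgctxt is None else (msgctxt, msgid)
    return catalog.get(key, msgid)


if __name__ == "__main__":
    print(lookup(with_context, "XSLT Sample Filters", NAME_CTX))
    print(lookup(with_context, "XSLT Sample Filters", DESC_CTX))
    print(lookup(merged, "XSLT Sample Filters"))
```

The first catalog needs two translations for one English string; the second needs only one but cannot tell the name from the description.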

--
Unsubscribe instructions: E-mail to [hidden email]
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted

Martin Srebotnjak

Re: [libreoffice-l10n] Question about odd use of msgctxt

Well, the msgctxt fields themselves say that one is the name and the other
is the description. If a language needs more space than is allowed for
the name but can fit it in the description, translators can render
this string in two ways: short and non-descriptive for the name,
long and more descriptive for the description. Even the English string
might one day get longer if the user base says that the name and
description are too vague. So looking at PO files purely in terms of
repeated strings might not be very productive (even if, seen the other
way around, repeating the same strings in PO files is not
productive either).

Regards, m.

2012/7/10 Chris Leonard <[hidden email]>:

> Hello LibreOffice L10n team,
>
> Is there a logical reason that so many of the PO files for LibreOffice
> have seemingly extraneous msgctxt comments that are simply a
> repetition of the location?
>
> Just one extreme example:
>
> scp2/source/xsltfilter.po
>
> Is it really necessary to translate "XSLT Sample Filters" twice (as
> enforced by the msgctxt comment).  Is there ever going to be a
> difference in the way the first and second instance is translated?  Is
> this just some hack to address builds in Windows that don't use
> gettext properly?
>
> I'm very interested to understand the logic behind this curious
> choice.  There are many downsides to this practice (abuse of msgctxt)
> which I could describe at greater length if needed.
>
> #. No_K
> #: module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
> msgctxt "module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text"
> msgid "XSLT Sample Filters"
> msgstr ""
>
> #. Bq1B
> #: module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
> msgctxt "module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text"
> msgid "XSLT Sample Filters"
> msgstr ""
>
>
>
> would normally appear as:
>
> #: module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
> #: module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text
> msgid "XSLT Sample Filters"
> msgstr ""
>

Rimas Kudelis

Re: [libreoffice-l10n] Question about odd use of msgctxt

In reply to this post by Chris Leonard
Hi Chris,

2012.07.10 08:48, Chris Leonard wrote:

> Is there a logical reason that so many of the PO files for LibreOffice
> have seemingly extraneous msgctxt comments that are simply a
> repetition of the location?
>
> Just one extreme example:
>
> scp2/source/xsltfilter.po
>
> Is it really necessary to translate "XSLT Sample Filters" twice (as
> enforced by the msgctxt comment).  Is there ever going to be a
> difference in the way the first and second instance is translated?  Is
> this just some hack to address builds in Windows that don't use
> gettext properly?
>
> I'm very interested to understand the logic behind this curious
> choice.  There are many downsides to this practice (abuse of msgctxt)
> which I could describe at greater length if needed.

in addition to what Martin wrote, there's another thing: LibreOffice
does NOT use those .po files as they are. They are converted to a
different type of L10n resource during compilation, and that other type
of resource is what ends up being used in LibO, not just on Windows, but
in all operating systems. With that in mind, you could also say that our
conversion procedures are simply suboptimal, and that might be true
somewhat, but they are what they are...

I believe there have been thoughts about migrating to gettext properly,
but considering the amount of work it would take and the impact, no
wonder it hasn't happened (or perhaps even started) yet.

Rimas

Olivier Hallot

Re: [libreoffice-l10n] Question about odd use of msgctxt

...Not forgetting that the process must handle C, C++, Java, XML, plus
a bunch of other file types....

On 10-07-2012 03:37, Rimas Kudelis wrote:
> I believe there have been thoughts about migrating to gettext properly,
> but considering the amount of work it would take and the impact, no
> wonder it hasn't happened (or perhaps even started) yet.

--
Olivier Hallot
Founder, Board of Directors Member - The Document Foundation
The Document Foundation, Zimmerstr. 69, 10117 Berlin, Germany
Civilly liable foundation under civil law
Legal details: http://www.documentfoundation.org/imprint
LibreOffice translation leader for Brazilian Portuguese
+55-21-8822-8812


Chris Leonard

Re: [libreoffice-l10n] Question about odd use of msgctxt

In reply to this post by Rimas Kudelis
On Tue, Jul 10, 2012 at 2:37 AM, Rimas Kudelis <[hidden email]> wrote:

> I believe there have been thoughts about migrating to gettext properly, but
> considering the amount of work it would take and the impact, no wonder it
> hasn't happened (or perhaps even started) yet.
>
> Rimas

Thanks for the answer Rimas.  I do understand the challenges of
working solely in gettext with a project as large and diverse as
LibreOffice.

It is nonetheless unfortunate that it is necessary to hack the msgctxt
comment to achieve your L10n.  Not so much for the extra burden it
places on localizers, which can largely be mitigated by working in
Virtaal with its translation memory (for instance); but for its
impact on other string comparison tools like Translate Toolkit.

I came across this while doing cross-project consistency
checks for conflicting translations.  I'm a scientist by training, and
quality-checking data pipelines is just one of those things I feel
compelled to do. I realized that one of my usual tricks was less
effective here than it usually is; however, it still yields some useful
results.

Here is the trick.

1) Download the zip of the entire LibreOffice 3.6 – UI project for a
given language as a "Zip of Directory" from the top of the Translate
tab for a given language.

2) Place it in its own folder, unzip it, and place the collapse36.sh file
in that folder, at the same level as the large group of folders from
the zip file.

3) Run the shell script, which collapses the folder hierarchy, moving
all PO files into a new folder called po and deleting the other folders
as it goes along to clean up.  It renames files with common names by
adding numbers to make them all uniquely named.

sh collapse36.sh

4) You need the Translate Toolkit for this.  Run pocompendium on the
collected PO files.

pocompendium out.po -d po

5) Then run posplit.

posplit out.po

6) Examine the compendium conflicts in the out-fuzzy.po file and you
will often find errors in need of correction.  Not all cases will be
genuine problems, but most are worth investigating. It may reveal the
need for a msgctxt comment (as traditionally used) to distinguish an
ambiguous English word that could use more context to achieve optimal
translation.

If you find references to files with numerical additions (e.g.
src3.po), use the text of collapse36.sh to map each one back to its original
name and location in the file hierarchy.
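
The collapse36.sh script itself is not reproduced in this thread. As a rough illustration of step 3, here is a Python sketch under stated assumptions: the function name `collapse_po_tree`, the `<stem><n>.po` numbering scheme, and the returned mapping are choices made for this example, not taken from the script. Unlike the script, it copies files rather than moving them and deleting folders, so the original tree is left intact.

```python
import shutil
from pathlib import Path


def collapse_po_tree(root, dest_name="po"):
    """Copy every .po file under root into root/dest_name.

    Files whose names collide get a number appended (src.po, src1.po, ...).
    Returns {new file name: original path relative to root}, so a numbered
    name can be traced back to where it came from.
    """
    root = Path(root)
    dest = root / dest_name
    dest.mkdir(exist_ok=True)
    mapping = {}
    # Sort for a stable, repeatable numbering between runs.
    for src in sorted(root.rglob("*.po")):
        if dest in src.parents:
            continue  # skip files already collected in the target folder
        candidate = src.name
        n = 0
        while candidate in mapping:
            n += 1
            candidate = f"{src.stem}{n}{src.suffix}"
        shutil.copy2(src, dest / candidate)
        mapping[candidate] = str(src.relative_to(root))
    return mapping


if __name__ == "__main__":
    # Run from the folder that holds the unzipped project directories.
    for new_name, original in collapse_po_tree(".").items():
        print(f"{new_name} <- {original}")
```

Keeping the mapping as data, rather than reading it back out of the script text, makes it easier to trace a numbered file to its source.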

Unfortunately because pocompendium respects the location-based msgctxt
comments LO uses extensively, this is less effective than it could be,
but it still identifies opportunities for improving consistency of
translation across the project.
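
One way to sidestep the location-based msgctxt is to group entries by msgid alone and report any msgid that has more than one distinct translation. The following is a minimal sketch of that idea, not the Translate Toolkit implementation. The parser `parse_po` is a deliberate simplification: it assumes entries are separated by blank lines, handles only msgctxt/msgid/msgstr (no plurals or obsolete entries), and relies on PO string escapes being valid Python string escapes. Both function names are made up for this example.

```python
import ast
from collections import defaultdict


def parse_po(text):
    """Yield (msgctxt, msgid, msgstr) for each blank-line-separated entry."""
    entry, field = {}, None
    for line in text.splitlines() + [""]:  # trailing "" flushes the last entry
        line = line.strip()
        if not line:
            if "msgid" in entry:
                yield entry.get("msgctxt"), entry["msgid"], entry.get("msgstr", "")
            entry, field = {}, None
        elif line.startswith("#"):
            continue  # comments and location lines are ignored
        elif line.startswith('"') and field:
            entry[field] += ast.literal_eval(line)  # continuation line
        else:
            keyword, _, rest = line.partition(" ")
            if keyword in ("msgctxt", "msgid", "msgstr"):
                field = keyword
                entry[field] = ast.literal_eval(rest)
            else:
                field = None  # unsupported keyword such as msgid_plural


def find_conflicts(entries):
    """Return {msgid: set of translations} for msgids translated
    in more than one way, ignoring msgctxt entirely."""
    seen = defaultdict(set)
    for _ctxt, msgid, msgstr in entries:
        if msgid and msgstr:  # skip the header and untranslated entries
            seen[msgid].add(msgstr)
    return {msgid: strs for msgid, strs in seen.items() if len(strs) > 1}


if __name__ == "__main__":
    sample = '''
msgctxt "a.ulf#NAME"
msgid "Filters"
msgstr "translation A"

msgctxt "a.ulf#DESC"
msgid "Filters"
msgstr "translation B"
'''
    print(find_conflicts(parse_po(sample)))
```

As with the compendium approach, a reported conflict is only a candidate for review; two different translations of the same English string can both be correct.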

Regards,

cjl

F Wolff

Re: [libreoffice-l10n] Question about odd use of msgctxt

On Wed, 2012-07-11 at 11:01 -0400, Chris Leonard wrote:

> On Tue, Jul 10, 2012 at 2:37 AM, Rimas Kudelis <[hidden email]> wrote:
>
> > I believe there have been thoughts about migrating to gettext properly, but
> > considering the amount of work it would take and the impact, no wonder it
> > hasn't happened (or perhaps even started) yet.
> >
> > Rimas
>
> Thanks for the answer Rimas.  I do understand the challenges of
> working solely in gettext with a project as large and diverse as
> LibreOffice.
>
> It is nonetheless unfortunate that it is necessary to hack the msgctxt
> comment to achieve your L10n.  Not so much for the extra burden it
> places on localizers, which can largely be mitigated by working in
> Virtaal with its translation memory (for instance); but for its
> impact on other string comparison tools like Translate Toolkit.

Sounds good, except that using msgctxt like this for OOo/LibO
localisation was invented by us in the Translate Toolkit :-)

When running oo2po, you can specify if you want to use msgctxt like
this, or to merge in the traditional gettext way:
http://translate.sourceforge.net/wiki/toolkit/duplicates_duplicatestyle

The idea is to make it possible to translate things differently if
needed. If things are merged, it is not possible, while the code base
supports it.
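
To illustrate the two duplicate styles in the abstract, here is a small Python sketch. It is not the oo2po code; the function names and the `(location, source)` input shape are assumptions made for this example. The msgctxt style emits one entry per location, while the merge style folds identical source strings into a single entry that lists every location.

```python
def entries_msgctxt(units):
    """One PO entry per unit; the location doubles as the msgctxt.

    No escaping is done, so sources must not contain quotes or backslashes.
    """
    return [
        f'#: {loc}\nmsgctxt "{loc}"\nmsgid "{src}"\nmsgstr ""\n'
        for loc, src in units
    ]


def entries_merged(units):
    """One PO entry per distinct source string, listing all its locations."""
    by_source = {}  # dicts keep insertion order, so output order is stable
    for loc, src in units:
        by_source.setdefault(src, []).append(loc)
    return [
        "".join(f"#: {loc}\n" for loc in locs) + f'msgid "{src}"\nmsgstr ""\n'
        for src, locs in by_source.items()
    ]


if __name__ == "__main__":
    units = [
        ("module_xsltfilter.ulf#STR_NAME_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text",
         "XSLT Sample Filters"),
        ("module_xsltfilter.ulf#STR_DESC_MODULE_OPTIONAL_XSLTFILTERSAMPLES.LngText.text",
         "XSLT Sample Filters"),
    ]
    print("\n".join(entries_msgctxt(units)))
    print("\n".join(entries_merged(units)))
```

With merged entries, the name and the description can no longer receive different translations, which is the trade-off described above.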

> I came across this during a process of doing cross-project consistency
> checks for conflicting translations.  I'm a scientist by training and
> data pipelining quality checks is just one of those things I feel
> compelled to do. I realized that one of my usual tricks was less
> effective than it usually is, however, it still yields some useful
> results.

This sounds similar to what poconflicts does, or did I misread it? I think
poconflicts should ignore the msgctxt as you want, but maybe I'm
misunderstanding what you are trying to do.

Friedel

--
Recently on my blog:
http://translate.org.za/blogs/friedel/en/content/localisation-guide-now-available-spanish


Chris Leonard

Re: [libreoffice-l10n] Question about odd use of msgctxt

On Wed, Jul 11, 2012 at 12:56 PM, F Wolff <[hidden email]> wrote:

> This sounds similar to what poconflicts does, or did I misread it? I think
> poconflicts should ignore the msgctxt as you want, but maybe I'm
> misunderstanding what you are trying to do.
>
> Friedel

Friedel,

I've used poconflicts, but I must reluctantly admit that I'm not very
fond of its output format (many individual files) and the need to
post-process that folder to get something more human-readable.  Maybe
I'm just lazy.

The pocompendium trick may be quick and dirty, but it catches a lot
and quickly provides an easily reviewable single PO file with a "trail
of breadcrumbs" back to the original PO.

cjl
