[libreoffice-l10n] many fuzzy strings due to xml tags move in 5.2 help files

sophi

Hi all,

As you may have read in the minutes of the ESC call, Olivier reported an
issue with the help files and translation for 5.2:
+ Many strings are the same content but with xml tags placed differently
    + same visual result
    + issues with translators ahead, must find support for a script
      to fix this
       => need a script to undo this; or un-fuzzy-them.

and I reacted to that:
* l10n (Sophie)
    + will ask on the l10n list wrt. scripting
        + developer support appreciated.
        + how urgent is it ?
            + since 5.2 is due in August - so plenty.
        + is it only po files, or UI files too ? (JanI)
            + only help files; for pt_BR: 57k new words to review.
            + can compare 5.1 vs. master dbs (JanI)
AI:     + unwind / script changes here (Christian)

So, to make it more explicit: is there somebody in our group able to
write a script that will remove the fuzzy flag from strings when it's
only an xml tag that has been moved?
Thanks in advance :)
Cheers
Sophie
--
Sophie Gautier [hidden email]
GSM: +33683901545
IRC: sophi
Co-founder - Release coordinator
The Document Foundation

--
To unsubscribe e-mail to: [hidden email]
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted
Christian Lohmaier (klammer)

Hi *,

On Fri, Mar 11, 2016 at 3:03 PM, Sophie <[hidden email]> wrote:
> Hi all,
>
> + Many strings are the same content but with xml tags placed differently

Most fuzzy strings are not caused by xml tags being placed differently
(where xml tags are different, it was to fix validation errors); most
fuzzy strings are due to changes in message context.

for example:
-<paragraph role="bascode" id="par_id3154685" xml-lang="en-US" l10n="U" oldref="4">ChDrive Text As String</paragraph>
+<paragraph id="par_id3154685" role="bascode" xml-lang="en-US">ChDrive Text As String</paragraph>

→ The effective change is the removal of the obsolete attributes l10n
and oldref. What matters here is oldref, as that was part of the po
files:

-#. Hew7C
+#. rkzEY
 #: 03020402.xhp
 msgctxt ""
 "03020402.xhp\n"
 "par_id3154685\n"
-"4\n"
 "help.text"
 msgid "ChDrive Text As String"
 msgstr ""

→ The oldref gets removed from the message context, making it a new
entity/new string. As there's an identical source string, pootle
reuses the translation and marks it fuzzy.
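
[Editor's sketch] The effect on the message context can be illustrated in a few lines of Python. The helper `build_msgctxt` is hypothetical: it only mirrors the line layout visible in the po excerpt above and is not the tool that actually generates the help po files.

```python
def build_msgctxt(xhp_file, paragraph_id, oldref=None, suffix="help.text"):
    """Join the context lines the way they appear in the po excerpt.

    Hypothetical helper: the layout (file, paragraph id, optional oldref,
    "help.text") is copied from the excerpt, nothing more.
    """
    parts = [xhp_file, paragraph_id]
    if oldref is not None:
        parts.append(str(oldref))
    parts.append(suffix)
    return "\n".join(parts)


msgid = "ChDrive Text As String"

# Context before the cleanup (oldref="4" still present) and after it.
old_ctxt = build_msgctxt("03020402.xhp", "par_id3154685", oldref=4)
new_ctxt = build_msgctxt("03020402.xhp", "par_id3154685")

# An entry is identified by (msgctxt, msgid). The msgid is unchanged,
# but the context differs, so the entry counts as a new string.
is_same_entry = (old_ctxt, msgid) == (new_ctxt, msgid)
print(is_same_entry)
```

Because the source text is identical, the old translation is offered as a suggestion, which is why the entry shows up as fuzzy rather than untranslated.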

>             + only help files; for pt_BR: 57k new words to review.
>             + can compare 5.1 vs. master dbs (JanI)
> AI:     + unwind / script changes here (Christian)
>
> So, to make it more explicit: is there somebody in our group able to
> write a script that will remove the fuzzy flag from strings when it's
> only an xml tag that has been moved?

See above: it is not about xml tags, but about changes to the message
context. Writing a script to un-fuzzy automatically is possible, but
not entirely trivial.

For each fuzzy string, look among the translations marked obsolete for
an entry with "same context + one additional line (consisting of only a
number)" and with exactly the same msgid and translation string; if
one exists, remove the fuzzy marker.
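
[Editor's sketch] The rule above could be written roughly as follows. This is a minimal sketch under stated assumptions, not the script that was eventually run: the `Entry` class and both function names are invented for illustration, and po entries are assumed to be already parsed into memory.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Simplified stand-in for a parsed po entry (assumption)."""
    msgctxt: str
    msgid: str
    msgstr: str
    fuzzy: bool = False
    obsolete: bool = False


def contexts_minus_one_number(msgctxt):
    """Yield each context obtained by dropping exactly one all-digit line."""
    lines = msgctxt.split("\n")
    for i, line in enumerate(lines):
        if line.isdigit():
            yield "\n".join(lines[:i] + lines[i + 1:])


def unfuzzy(entries):
    """Clear the fuzzy flag where an obsolete twin exists; return the count."""
    # Index obsolete entries by the context they would have without the number.
    twins = set()
    for entry in entries:
        if entry.obsolete:
            for reduced in contexts_minus_one_number(entry.msgctxt):
                twins.add((reduced, entry.msgid, entry.msgstr))

    cleared = 0
    for entry in entries:
        key = (entry.msgctxt, entry.msgid, entry.msgstr)
        if entry.fuzzy and not entry.obsolete and key in twins:
            entry.fuzzy = False
            cleared += 1
    return cleared
```

Matching on the translation string as well as the msgid means an entry stays fuzzy whenever its translation differs from the obsolete one, which is the conservative choice.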

But as there is time until August, it is not urgent (it doesn't need to
be done this or next week IMHO). Remember: if we didn't have master
projects, you wouldn't know about this right now :-))

Just avoid basic/shared (that's where nearly all changes of this kind
were made) and you won't be bothered by those fuzzy ones. Of course,
the reason it was done is that files in basic/shared had lots of
syntax/validation errors, so even after the fuzzy ones are gone, there
will still be some strings to translate, but far fewer.

ciao
Christian

Stanislav Horáček

Hi,

On 14.3.2016 at 18:35, Christian Lohmaier wrote:

> Hi *,
>
> On Fri, Mar 11, 2016 at 3:03 PM, Sophie <[hidden email]> wrote:
>> Hi all,
>>
>> + Many strings are the same content but with xml tags placed differently
>
> Most fuzzy strings are not caused by xml tags being placed differently
> (where xml tags are different, it was to fix validation errors); most
> fuzzy strings are due to changes in message context.

There are also hundreds of strings with differently placed tags, and
for them it is not enough to just accept the fuzzy strings - typically
<switchinline><caseinline><emph></emph></caseinline></switchinline> has
been changed to
<emph><switchinline><caseinline></caseinline></switchinline></emph>.

Are these strings planned to be translated automatically as well?
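
[Editor's sketch] For the simplest form of this change, the same move could in principle be applied to a translated string with a regular expression. The sketch below is an assumption-laden illustration, not an existing tool: the function name is made up, and it only handles a single <caseinline> wrapping a single <emph> with no nested tags. Real help strings (for example ones with several case branches) would need more care.

```python
import re

# <switchinline ...><caseinline ...><emph>TEXT</emph></caseinline></switchinline>
# Only the exact nesting mentioned in the mail is matched; TEXT must not
# contain further tags.
INNER_EMPH = re.compile(
    r"<switchinline(?P<sw_attrs>[^>]*)>"
    r"<caseinline(?P<case_attrs>[^>]*)>"
    r"<emph>(?P<text>[^<]*)</emph>"
    r"</caseinline></switchinline>"
)


def move_emph_outside(translated):
    """Rewrite the inner <emph> so that it wraps the whole <switchinline>."""
    return INNER_EMPH.sub(
        r"<emph><switchinline\g<sw_attrs>><caseinline\g<case_attrs>>"
        r"\g<text></caseinline></switchinline></emph>",
        translated,
    )
```

Strings that do not match the pattern are returned unchanged, so they would still need manual review.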

>
> But as there is time until August, it is not urgent (it doesn't need to
> be done this or next week IMHO). Remember: if we didn't have master
> projects, you wouldn't know about this right now :-))

This is not completely true - the issue with the "l10n" and "oldref"
attributes changing the context also affected 5.1 strings; it is just
more visible now because of the massive automated changes.

Best regards,
Stanislav

Martin Srebotnjak

It's an old story. Developers of OOo/LO think localizers are enjoying this
kind of manual work. As if it was a mandala or knitting.
So they rework tags every now and then, without caring about our feelings.

But we are not enjoying it, and the latest thousands of changes of the
tag kind must be applied to the translated strings by a script. Not our
work.

So we have enough time to do what we do - localize. Amen.

Best regards, m.

Valter Mura

On 19 March 2016 at 19:02:07 CET, Martin Srebotnjak <[hidden email]> wrote:

>It's an old story. Developers of OOo/LO think localizers are enjoying
>this kind of manual work. As if it was a mandala or knitting.
>So they rework tags every now and then, without caring about our
>feelings.
>
>But we are not enjoying it, and the latest thousands of changes of the
>tag kind must be applied to the translated strings by a script. Not our
>work.
>
>So we have enough time to do what we do - localize. Amen.
>
>Best regards, m.

Hi All

I definitely agree with Martin. :)

Ciao
Valter
--
Open Source is better!
Sent from my Android device with K-9 Mail.

Yury

In reply to this post by Martin Srebotnjak
By my estimates (I'm looking at the KBabel
stats, which aren't perfect), the last three years
(mid-2013 to end of 2015) brought about 100% overall
"change" (untranslatedness) in the UI strings corpus
(up to 30K units). Of course, this includes
strings going fuzzy without any real change in
content, but confirming fuzzy units is still
real work.

JFYI.

-Yury

On 19/03/16 21:02, Martin Srebotnjak wrote:
> It's an old story. Developers of OOo/LO think localizers are enjoying this
> kind of manual work. As if it was a mandala or knitting.
> So they rework tags every now and then, without caring about our feelings.
...

Christian Lohmaier (klammer)

In reply to this post by Christian Lohmaier (klammer)
Hi Martin, *,

On Fri, Apr 15, 2016 at 11:12 PM, Martin Srebotnjak <[hidden email]> wrote:
> Hi, Christian,
>
> I have several questions to organize the work of the Slovenian l10n team for
> the 5.2 release:
> - When will the changing end so you can run magic script on the l10n po
> files to change these affected fuzzy strings to fully localized?

I ran the script at the end of last week, so the cases where an
obsolete attribute was removed from the message context have had the
fuzzy flag removed (provided the string had the same translation in
5.1; it is still fuzzy if the translation was changed in the meantime).

> For example, will that milestone be the alpha1 release or beta1? Any later
> date does not make it possible for l10n teams to finish other help files
> localization, as we do not know what files to localize now and which not
> because they will be made localized by magic scripts;

Remember that you wouldn't have a 5.2 translation process at all in the
old scheme. It is only because we have master that you can translate
stuff already... If you are asking when string freeze is: that is with
RC1 (for 5.2 there are a total of 4 RCs scheduled).
Feature freeze is with beta1 (the week after next).

> - Could you please run this script(s) on the l10n projects that are not
> using Pootle for translation?

Only if absolutely necessary.
helpfuzz.yml is a file that lists the files that were changed with the
template updates. You don't need to restrict the run to those, but
touching every file would have increased the time needed for the
processing (not so much for the change in the po files, but for
updating the database in pootle afterwards). Not all of those files
have the described obsolete-attribute removal, so only a subset is
affected.
The hepfuzzy.yml content:

http://pastie.org/10834259

The script to process the po files:

http://pastie.org/10834263

Leave out the syncing to/from disk to pootle. The script expects the
translations in translations/libo_help/<language> and
translations/libo51_help/<language> respectively.
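
[Editor's sketch] For teams running this outside Pootle, the directory layout described above could be walked along these lines. This is only an illustration of the layout: `po_pairs` and its parameters are hypothetical names, the list of relative paths is passed in as a plain Python list because the format of the yml file is not shown here, and the linked script remains the authoritative version.

```python
from pathlib import Path


def po_pairs(root, language, relative_files):
    """Yield (master_po, libo51_po) paths for files present in both trees.

    root is the directory that contains "translations/"; relative_files
    is a list of po paths relative to the language directory (assumption).
    """
    master_dir = Path(root) / "translations" / "libo_help" / language
    old_dir = Path(root) / "translations" / "libo51_help" / language
    for rel in relative_files:
        master_po = master_dir / rel
        old_po = old_dir / rel
        # Only files that exist on both sides can be compared.
        if master_po.is_file() and old_po.is_file():
            yield master_po, old_po
```

Each yielded pair can then be compared entry by entry to decide which fuzzy flags are safe to clear.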
