[libreoffice-l10n] Sort Order in Calc

Tadele Assefa

[libreoffice-l10n] Sort Order in Calc

Dear All,

Our language, Sidama, uses the Latin script. However, it has additional
consonants that are written as two-letter combinations. E.g. ph is
considered one consonant, and so are sh, ch, zh, ts, and ny. Each of these
consonants is ordered directly after the single letter it begins with. Our
alphabet reads:
a, b, c, ch, d, dh, e, f, g, h, i, j, k, l, m, n, ny, o, p, etc.

As a result, we want to sort strings using this order. E.g. the words
cala, chala, cola should be sorted as cala, cola, chala, because ch has to
come after c.

So, how can we include this behavior in LO?
--
Regards,

*Tadele*
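
To make the requested order concrete, here is a minimal Python sketch of a
sort key that treats each two-letter consonant as a single letter placed
directly after its first letter. It only illustrates the expected result and
is not how LibreOffice sorts. The digraph list is taken from the post (plus
dh from the quoted alphabet) and may be incomplete, and the name sidama_key
is hypothetical. The sketch also assumes that every occurrence of such a
letter pair is the digraph.

```python
# Digraphs named in the post, plus "dh" from the quoted alphabet.
# Assumption: the list may be incomplete (the alphabet ends with "etc").
DIGRAPHS = {"ch", "dh", "ny", "ph", "sh", "ts", "zh"}


def sidama_key(word):
    """Return a sort key in which each digraph counts as one letter that
    sorts directly after its first letter (c < ch < d, n < ny < o, ...)."""
    text = word.lower()
    key = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        if pair in DIGRAPHS:
            # (first letter, 1) sorts after every (first letter, 0).
            key.append((pair[0], 1))
            i += 2
        else:
            key.append((text[i], 0))
            i += 1
    return key


words = ["chala", "cola", "cala"]
print(sorted(words, key=sidama_key))  # ['cala', 'cola', 'chala']
```

Because chala starts with the letter ch rather than c, it sorts after every
word that starts with a plain c, which is the order asked for above.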

--
To unsubscribe e-mail to: [hidden email]
Problems? http://www.libreoffice.org/get-help/mailing-lists/how-to-unsubscribe/
Posting guidelines + more: http://wiki.documentfoundation.org/Netiquette
List archive: http://listarchives.libreoffice.org/global/l10n/
All messages sent to this list will be publicly archived and cannot be deleted

Andras Timar

Re: [libreoffice-l10n] Sort Order in Calc

Hello,

On Fri, Jun 14, 2013 at 7:53 AM, Tadele Assefa <[hidden email]> wrote:

> Dear All,
>
> Our language, Sidama, uses the Latin script. However, it has additional
> consonants that are written as two-letter combinations. E.g. ph is
> considered one consonant, and so are sh, ch, zh, ts, and ny. Each of these
> consonants is ordered directly after the single letter it begins with. Our
> alphabet reads:
> a, b, c, ch, d, dh, e, f, g, h, i, j, k, l, m, n, ny, o, p, etc.
>
> As a result, we want to sort strings using this order. E.g. the words
> cala, chala, cola should be sorted as cala, cola, chala, because ch has to
> come after c.
>
> So, how can we include this behavior in LO?

You need to add the rules to the LC_INDEX section of your locale.
(i18npool/source/localedata/data/sid_ET.xml)

See a working solution (Hungarian) at:
http://opengrok.libreoffice.org/xref/core/i18npool/source/localedata/data/hu_HU.xml#205

If you can't submit a patch, please file a bug.

Best regards,
Andras

Yaron Shahrabani

Re: [libreoffice-l10n] Sort Order in Calc

*The Hungarian LC_INDEX*

  <LC_INDEX>
    <IndexKey phonetic="false" default="true" unoid="charset">A(A, Á) B C {Cs} D {DZ} {DZS} E(E, É) F G {Gy} H I(I, Í) J-L {Ly} M-N {Ny} O(O, Ó) Ő(Ö, Ő) P-S {Sz} T {Ty} U(U, Ú) Ű(Ü, Ű) V-Z {Zs}</IndexKey>
    <UnicodeScript>0</UnicodeScript>
    <UnicodeScript>1</UnicodeScript>
    <UnicodeScript>2</UnicodeScript>
    <UnicodeScript>3</UnicodeScript>
    <FollowPageWord>p.</FollowPageWord>
    <FollowPageWord>pp.</FollowPageWord>
  </LC_INDEX>

*LC_INDEX for Sidama*

  <LC_INDEX>
    <IndexKey phonetic="false" default="true" unoid="alphanumeric">A-Z</IndexKey>
    <UnicodeScript>0</UnicodeScript>
    <UnicodeScript>1</UnicodeScript>
    <FollowPageWord>STP</FollowPageWord>
    <FollowPageWord>StO</FollowPageWord>
  </LC_INDEX>

I couldn't find any documentation, but I'm guessing you should first change
the unoid value to charset. What is Sidama's status regarding Unicode?

Yaron Shahrabani

<Hebrew translator>



Yaron Shahrabani

Re: [libreoffice-l10n] Sort Order in Calc

Ignore the Unicode question, you've already answered it ☺

Yaron Shahrabani

<Hebrew translator>



Jānis

Re: [libreoffice-l10n] Sort Order in Calc


Quoting Yaron Shahrabani <[hidden email]>,
Fri, 14 Jun 2013 10:06:49 +0300:

> Ignore the Unicode question, you've already answered it ☺

To my mind, it would be much better to submit the locale data through
http://www.it46.se/localegen/, since there is then a chance your locale data
will spread a lot wider.

Of course, your language must have a corresponding language code in, for
example, ISO 639-3. Is it Sidamo
(http://www.ethnologue.com/language/sid) or Sidama?

Janis
--
http://dict.dv.lv
http://tehvi.dv.lv



Eike Rathke

Re: [libreoffice-l10n] Sort Order in Calc

In reply to this post by Andras Timar
Hi,

On Friday, 2013-06-14 08:31:11 +0200, Andras Timar wrote:

> On Fri, Jun 14, 2013 at 7:53 AM, Tadele Assefa <[hidden email]> wrote:
> > Our language, Sidama, uses the Latin script. However, it has additional
> > consonants that are written as two-letter combinations. E.g. ph is
> > considered one consonant, and so are sh, ch, zh, ts, and ny. Each of these
> > consonants is ordered directly after the single letter it begins with. Our
> > alphabet reads:
> > a, b, c, ch, d, dh, e, f, g, h, i, j, k, l, m, n, ny, o, p, etc.
> >
> > As a result, we want to sort strings using this order. E.g. the words
> > cala, chala, cola should be sorted as cala, cola, chala, because ch has to
> > come after c.
> >
> > So, how can we include this behavior in LO?
>
> You need to add the rules to the LC_INDEX section of your locale.
> (i18npool/source/localedata/data/sid_ET.xml)

Well, yes, but the IndexKey element is only used for Writer's index
table feature. General sorting uses collation, which defaults to ICU's
Unicode collation rules. If Unicode did not define these exceptions for
'sid', or ICU doesn't implement them yet, then we'd have to add a
language-specific rule to i18npool/source/collator/data/
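
As a rough illustration of what such a language-specific rule could look
like, the Python sketch below prints ICU-style tailoring rules that place
each two-letter consonant after its first letter. It rests on assumptions:
that the collator data would accept ICU rule syntax, that the digraph list
from the original post (plus dh) is complete, and that case variants should
differ only at the tertiary level. The helper name tailoring_rule is
hypothetical.

```python
# Digraphs named in the thread. Assumption: the list may be incomplete,
# because the alphabet in the original post ends with "etc".
DIGRAPHS = ["ch", "dh", "ny", "ph", "sh", "ts", "zh"]


def tailoring_rule(digraph):
    """Build one ICU-style rule that sorts the digraph after its first letter."""
    first = digraph[0]
    title = digraph.capitalize()  # e.g. "Ch"
    upper = digraph.upper()       # e.g. "CH"
    # "&x" resets to letter x, "<" is a primary (letter-level) difference,
    # "<<<" is a tertiary (case-level) difference.
    return f"&{first} < {digraph} <<< {title} <<< {upper}"


rules = "\n".join(tailoring_rule(d) for d in DIGRAPHS)
print(rules)  # first line: &c < ch <<< Ch <<< CH
```

Whether rules like these are needed at all depends on what ICU already
provides for 'sid', as noted above.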


> See a working solution (Hungarian) at:
> http://opengrok.libreoffice.org/xref/core/i18npool/source/localedata/data/hu_HU.xml#205

Btw, why does that opengrok instance not understand UTF-8?

  Eike

--
LibreOffice Calc developer. Number formatter stricken i18n transpositionizer.
GPG key ID: 0x65632D3A - 2265 D7F3 A7B0 95CC 3918  630B 6A6C D5B7 6563 2D3A
For key transition see http://erack.de/key-transition-2013-01-10.txt.asc
Support the FSFE, care about Free Software! https://fsfe.org/support/?erack


Tadele Assefa

Re: [libreoffice-l10n] Sort Order in Calc

Hello Andras,

I have filed a bug at [1] with a modified locale for the sort order.

[1]  https://bugs.freedesktop.org/show_bug.cgi?id=65809

Regards,





--
Regards,
___________________________
Tadele Assefa
Managing Director

Cell: +25-911-84-13-84
*Think Green – Please do not print this email unless you really need to*
