Arabic diacritics (Tashkil)

Posted: Mon May 14, 2018 7:50 am
by Fatimah
Hello,

I don't know if this is the right place for my problem, so I'm sorry if it isn't.

There is a problem with how Ren'Py handles "tashkil" for the Arabic language.
arabic problem.PNG
The underlined word is where the problem is. Whenever I use "tashkil", Ren'Py disconnects the letters and places each diacritic one letter ahead of where it belongs.

This is how the word should be displayed:
arabic.PNG
Hope it is clear.

Thank you!

Re: Arabic diacritics (Tashkil)

Posted: Mon May 14, 2018 6:46 pm
by PyTom
(This is more appropriate for the Q&A section, so I'll move it.)

Are you setting config.rtl = True in your game? That's needed for proper Arabic text shaping support. Please try:

config.rtl = True

Re: Arabic diacritics (Tashkil)

Posted: Tue May 15, 2018 1:26 am
by Fatimah
Yes, I do have config.rtl = True in my options file.

Re: Arabic diacritics (Tashkil)

Posted: Sun May 20, 2018 10:44 am
by PyTom
That's odd. Can you type that line of text into the forum here, so I can copy and paste it into Ren'Py and see what happens? I don't know the first thing about how to input Arabic.

Re: Arabic diacritics (Tashkil)

Posted: Mon May 21, 2018 6:30 am
by Fatimah
Sure!

ما هذا المكان يا أبي؟ و لِمَ مجيئنا إلى هنا؟

Re: Arabic diacritics (Tashkil)

Posted: Thu May 24, 2018 12:14 am
by PyTom
Okay, I spent a few hours looking into this, but unfortunately I don't have good news for you. I'm going to explain my understanding of things - I'm sure you already know this, but I'm kind of hoping that I got something wrong that would make things easier on us. Forgive me if this sounds like I'm talking down here - this is sort of rubber duck debugging, me talking this out to myself in public in the hope someone will come up to me and say I got something wrong.

So, these are Arabic diacritics, which are used to indicate missing vowels and consonant length. These tend to be used in religious texts (which I'm guessing have obscure words) and works written for children (who need to be able to match word sounds with written words), but not as part of general writing meant to be consumed by adults.

The problem is that placing these diacritics is hard. There are fairly complicated rules that need to be obeyed, since it's reasonable that multiple diacritics can be applied to a single character, and the location of the diacritic will vary based on the character it's applied to and any other diacritics that are present.
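To make the placement problem a bit more concrete, here is a minimal Python sketch (plain Python, not Ren'Py code) that inspects the diacritized word from the sample line above. It assumes that word is the sequence lam, kasra, meem, fatha. The helper names `describe` and `split_clusters` are made up for illustration. It only shows that each diacritic is a separate combining character that has to be attached to the letter before it; it does not do any shaping or positioning.

```python
import unicodedata

# Assumed code points for the diacritized word in the sample line:
# lam, kasra, meem, fatha.
word = "\u0644\u0650\u0645\u064E"


def describe(text):
    """Return (code point, Unicode name, category, combining class) per character."""
    return [
        (
            "U+%04X" % ord(ch),
            unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch),
        )
        for ch in text
    ]


def split_clusters(text):
    """Group each base letter with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if unicodedata.combining(ch) and clusters:
            # A non-zero combining class means this is a mark: attach it
            # to the cluster started by the previous base letter.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


for row in describe(word):
    print(row)

print(len(split_clusters(word)), "clusters")
```

A renderer has to position every mark relative to its own cluster, and the right offset depends on the shape the base letter takes in context, which is where the complexity comes from.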

I think this is beyond what I could add to Ren'Py in any sort of reasonable timeframe, at least with any sort of degree of correctness. It would basically require a rewrite of how Ren'Py handles text, to make it use a library like HarfBuzz, and everything else that entails. I won't rule out doing that someday, but that someday is going to be a long time off.

The one thing I could do would be to give you a text tag that takes in a series of characters, and returns instructions to tell Ren'Py how to move the last character. So it might see U+FEE2 (meem final form) and U+064E (fatha), and it would know to move U+064E left and down by so many pixels when that happens.

If that's interesting, let me know, and I could add that feature. And if I got something wrong here, please let me know. But apart from that, I don't think I can fix this in a reasonable timeframe.
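
The lookup behind such a text tag could be sketched roughly like this. This is plain Python and not an existing Ren'Py API: the names `OFFSETS` and `diacritic_offset` are hypothetical, and the pixel values are placeholders, not measurements from any real font. Negative x means left and positive y means down.

```python
# Hypothetical table: (base glyph, diacritic) -> (dx, dy) in pixels.
# The numbers are placeholders and would have to be tuned per font and size.
OFFSETS = {
    ("\uFEE2", "\u064E"): (-6, 2),  # meem final form + fatha: left and down
}

# Used when no rule matches: leave the character where it is.
DEFAULT_OFFSET = (0, 0)


def diacritic_offset(chars):
    """Return how far to move the last character of `chars`.

    This sketch only looks at the last two characters (base + diacritic).
    Stacked diacritics would need a key covering the whole cluster.
    """
    if len(chars) < 2:
        return DEFAULT_OFFSET
    return OFFSETS.get((chars[-2], chars[-1]), DEFAULT_OFFSET)


print(diacritic_offset("\uFEE2\u064E"))
```

The table would need an entry for every letter form and diacritic combination that appears in the script, so this is a workaround and not real shaping.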

Re: Arabic diacritics (Tashkil)

Posted: Sun May 27, 2018 3:37 am
by Fatimah
That's okay, Tom.
Thank you for taking the time to look into this.

As for the diacritics, they are mainly there to tell the reader how to pronounce the word. Every letter in Arabic has three sounds, depending on these diacritics. Native speakers have no problem reading text without them, but sometimes they are a necessity: two words can be spelled the same and still have different meanings because of the diacritics. That's why they are important.
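
A small Python illustration of that last point, assuming the standard `unicodedata` module and a made-up helper name `strip_tashkil`: the two words below differ only in their diacritics, so removing the marks leaves the same three letters.

```python
import unicodedata


def strip_tashkil(text):
    """Drop every combining mark, keeping only the base letters."""
    return "".join(ch for ch in text if not unicodedata.combining(ch))


# ain + fatha, lam + kasra, meem + fatha: "he knew"
knew = "\u0639\u064E\u0644\u0650\u0645\u064E"
# ain + kasra, lam + sukun, meem: "knowledge"
knowledge = "\u0639\u0650\u0644\u0652\u0645"

print(knew != knowledge)
print(strip_tashkil(knew) == strip_tashkil(knowledge))
```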

I would be very interested in the text tag.

Hopefully this feature will be added to Ren'Py one day.