Turkish Character (Coding) Problem?


27 replies to this topic

#1 eozen81

  • Senior Member
  • 55 posts

+2
Neutral

Posted 25 February 2013 - 18:57

Hello,

I am from Turkey and I need your help with a technical issue: Turkish characters in subtitles are not displayed correctly. As you can see in the picture below, the Turkish characters in the subtitles are rendered with the wrong glyphs. Because of this problem, many people cannot watch with Turkish subtitles properly.

I tried to explain the problem with the descriptions in the picture below. A friend of mine advised me to write here because, according to him, the team would need to change the C++ code and build a new image to fix this problem. Any help is much appreciated, and thousands of Turkish people will be able to use their devices properly. ;)

Any ideas or help?

Posted Image

Re: Turkish Character (Coding) Problem? #2 Erik Slagter

  • PLi® Core member
  • 46,969 posts

+541
Excellent

Posted 25 February 2013 - 19:03

What subtitles are we talking about exactly?

- subtitles in live and recorded recordings using a teletext page (DVB-TXT)
- subtitles in live and recorded recordings using DVB subtitling
- separate srt file
- subtitles embedded into mkv or mp4 container

I guess #3 but I want to be certain.

SRT files don't have metadata for encoding (nor language). Enigma interprets them as UTF-8, but if the file isn't UTF-8 to begin with, it will show garbage like you show here.
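To illustrate the point above, here is a minimal Python sketch (not Enigma's actual code). The sample sentence and the choice of Windows-1254 as the file's real encoding are assumptions made for the example. It shows that such bytes are not valid UTF-8, and that reading them with the wrong 8-bit code page silently produces wrong letters.

```python
# Made-up Turkish subtitle line, used only for illustration.
text = "Günaydın, nasılsın? Teşekkürler."

# Assumption: the .srt file was saved as Windows-1254, not UTF-8.
raw = text.encode("cp1254")

# A strict UTF-8 decoder rejects these bytes.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False

# Decoding with a different 8-bit code page does not fail,
# but the Turkish-specific letters come out wrong.
garbled = raw.decode("iso8859_15")

print("valid UTF-8:", is_valid_utf8)
print("wrong code page:", garbled)
print("right code page:", raw.decode("cp1254"))
```

The letters that Turkish shares with Western European languages (ü, ö, ç) survive the wrong code page, which is why only some characters look broken.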

* Wavefrontier T90 with 28E/23E/19E/13E via SCR switches 2 x 2 x 6 user bands
I don't read PM -> if you have something to ask or to report, do it in the forum so others can benefit. I don't take freelance jobs.
Ik lees geen PM -> als je iets te vragen of te melden hebt, doe het op het forum, zodat anderen er ook wat aan hebben.


Re: Turkish Character (Coding) Problem? #3 Taykun345

  • Senior Member
  • 1,297 posts

+41
Good

Posted 25 February 2013 - 19:34

This is a known issue with no solution so far. Many users are affected, not only you guys from Turkey. The letters šžč are also a problem, for example.
Enigma is doing something wrong, as many other media players don't have problems with ANSI-encoded subtitles.
Army MoodBlue HD skin modification by me: https://github.com/T...-MoodBlueHD-mod
Matrix10 MH-HD2 skin modification by me: https://github.com/B...-MX-HD2-OpenPli
MetrixHD skin modification by me: https://github.com/T...xHD-WPstyle-mod
Slovenian translation for OpenPLi E2: https://github.com/T...ion-for-OpenPLi

Re: Turkish Character (Coding) Problem? #4 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 25 February 2013 - 20:13

This is a known issue with no solution so far. Many users are affected, not only you guys from Turkey. The letters šžč are also a problem, for example.
Enigma is doing something wrong, as many other media players don't have problems with ANSI-encoded subtitles.


Enigma can handle ANSI encoded subtitles.

See here (Long Version): http://openpli.org/f...097#entry310097

(Short version)

1. Install the code page

opkg install eglibc-gconv-iso8859-7

2. Change /usr/bin/enigma2.sh and set the variable GST_SUBTITLE_ENCODING

Change the line from LD_PRELOAD=$LIBS /usr/bin/enigma2 to LD_PRELOAD=$LIBS GST_SUBTITLE_ENCODING="ISO-8859-7" /usr/bin/enigma2


The above sample is for the Greek ISO character set, but you can easily configure it for other ISO character sets.
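The edit from step 2 can be sketched in Python as a plain text substitution. This is only an illustration: the function name and the sample script contents are hypothetical, and the real /usr/bin/enigma2.sh may look different, in which case the function leaves the text unchanged.

```python
def set_subtitle_encoding(script_text: str, encoding: str) -> str:
    """Add GST_SUBTITLE_ENCODING to the enigma2 launch line, if that line is present."""
    old = "LD_PRELOAD=$LIBS /usr/bin/enigma2"
    new = f'LD_PRELOAD=$LIBS GST_SUBTITLE_ENCODING="{encoding}" /usr/bin/enigma2'
    if old not in script_text:
        # Already patched, or the script differs from the line quoted in the post.
        return script_text
    return script_text.replace(old, new)


# Hypothetical script contents, for demonstration only.
sample = "#!/bin/sh\nLD_PRELOAD=$LIBS /usr/bin/enigma2\n"
patched = set_subtitle_encoding(sample, "ISO-8859-7")
print(patched)
```

Running the function a second time on its own output changes nothing, so it is safe to apply repeatedly.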


A plugin that sets GST_SUBTITLE_ENCODING and installs the code page (if missing) would probably be needed for non-advanced users.


Good luck.

Edited by athoik, 25 February 2013 - 20:14.

Wavefield T90: 0.8W - 1.9E - 4.8E - 13E - 16E - 19.2E - 23.5E - 26E - 33E - 39E - 42E - 45E on EMP Centauri DiseqC 16/1
Unamed: 13E Quattro - 9E Quattro on IKUSI MS-0916

Re: Turkish Character (Coding) Problem? #5 Erik Slagter

  • PLi® Core member
  • 46,969 posts

+541
Excellent

Posted 25 February 2013 - 20:17

This is a known issue with no solution so far. Many users are affected, not only you guys from Turkey. The letters šžč are also a problem, for example.
Enigma is doing something wrong, as many other media players don't have problems with ANSI-encoded subtitles.

ANSI encoded??? Do you mean ASCII? Contrary to popular belief, ASCII does not define any accented characters! Or do you mean ISO-8859?



Re: Turkish Character (Coding) Problem? #6 eozen81

  • Senior Member
  • 55 posts

+2
Neutral

Posted 25 February 2013 - 20:30

What subtitles are we talking about exactly?


Channels that carry subtitles (like the SKY IT channels on 13E) + standard Blu-ray rip or DVD rip media files with srt files.

Guys, so do we have a solution for that? I am confused by the messages above. :huh:

Re: Turkish Character (Coding) Problem? #7 Erik Slagter

  • PLi® Core member
  • 46,969 posts

+541
Excellent

Posted 25 February 2013 - 20:38

SKY IT 13E = DVB subs or DVB-TXT subs
"rips" = srt

These are handled in a completely different way within enigma, so it is very important to know.

If the encoding is wrong for DVB subs / DVB-TXT subs, then they are being broadcast in the wrong encoding.
If srt subtitles are displayed wrong, the file is actually in an encoding other than UTF-8, which is what enigma uses; because an srt file cannot specify its encoding, the text shows up wrong.

I bet the srt files are in some Windows code page (1251 or similar). That's definitely not UTF-8. That's also the main cause of web sites showing wrong characters.
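A quick way to check that guess for a given file is to test whether its bytes decode as UTF-8. The helper below is a minimal sketch; the name is_utf8 is made up for this example, and a legacy-encoded file can in rare cases pass the check by coincidence.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (pure ASCII also counts)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# The same letter stored two ways: as UTF-8 and as Windows-1254.
print(is_utf8("ş".encode("utf-8")))
print(is_utf8("ş".encode("cp1254")))
```

For a real file, pass the result of reading it in binary mode, for example open(path, "rb").read().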



Re: Turkish Character (Coding) Problem? #8 eozen81

  • Senior Member
  • 55 posts

+2
Neutral

Posted 25 February 2013 - 20:42

When I change the subtitle color in a skin's skin.xml file, both the SKY IT 13E subtitles and the srt subtitles of rips get the same color.

By DVB subtitles I mean BBC HD on 13E.

Re: Turkish Character (Coding) Problem? #9 Erik Slagter

  • PLi® Core member
  • 46,969 posts

+541
Excellent

Posted 25 February 2013 - 20:42

1. Install the code page

opkg install eglibc-gconv-iso8859-7


So there it is ;) ISO/IEC encoding for Turkish. That's ISO Latin encoding for GREEK! That's definitely not UTF-8.



Re: Turkish Character (Coding) Problem? #10 Erik Slagter

  • PLi® Core member
  • 46,969 posts

+541
Excellent

Posted 25 February 2013 - 20:47

When I change the subtitle color in a skin's skin.xml file, both the SKY IT 13E subtitles and the srt subtitles of rips get the same color.

Yes, that is intended. It's still a different path within Enigma.

By DVB subtitles I mean BBC HD on 13E.

This channel has DVB (graphical) subtitling for Turkish. As the subtitling is graphical, the encoding cannot be incorrect, and indeed, it looks like proper Turkish. As far as I can see, there is no teletext (DVB-TXT) subtitling.



Re: Turkish Character (Coding) Problem? #11 eozen81

  • Senior Member
  • 55 posts

+2
Neutral

Posted 25 February 2013 - 21:23

@Erik, thank you so much for your help here.

But unfortunately, I ran the command "opkg install eglibc-gconv-iso8859-7" and restarted my device, and the character issue is still there. :huh: I ran the command again to make sure, and the package is already installed on the device. What could the problem be?

Posted Image

Re: Turkish Character (Coding) Problem? #12 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 25 February 2013 - 21:29


This is a known issue with no solution so far. Many users are affected, not only you guys from Turkey. The letters šžč are also a problem, for example.
Enigma is doing something wrong, as many other media players don't have problems with ANSI-encoded subtitles.

ANSI encoded??? Do you mean ASCII? Contrary to popular belief, ASCII does not define any accented characters! Or do you mean ISO-8859?


ISO character sets.

Re: Turkish Character (Coding) Problem? #13 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 25 February 2013 - 21:36

Read Here for character sets: http://en.wikipedia....acter_encodings

The above example (ISO-8859-7) is working for Greek SRT (NON UTF-8) files.

For Turkish you need:

ISO 8859-3 Western Europe and South European (Turkish, Maltese plus Esperanto) OR
ISO 8859-9 Western Europe with amended Turkish character set OR
Windows-1254 for Turkish
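The three candidates are not interchangeable. As a small Python sketch (the letter list is just a sample chosen for this example), the code below encodes the Turkish-specific letters with each one and prints the resulting bytes. ISO 8859-9 and Windows-1254 agree on these letters, while ISO 8859-3 stores ğ, ı, İ and ş (and their other-case forms) at different byte values, so the setting has to match how the file was actually saved.

```python
# Letters that Turkish adds to the basic Latin alphabet (sample for illustration).
turkish_letters = "çÇğĞıİöÖşŞüÜ"

# Python codec names for ISO 8859-3, ISO 8859-9 and Windows-1254.
encodings = ["iso8859_3", "iso8859_9", "cp1254"]

encoded = {name: turkish_letters.encode(name) for name in encodings}

for name, data in encoded.items():
    # bytes.hex with a separator needs Python 3.8 or newer.
    print(f"{name:10s} {data.hex(' ')}")
```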

Edit1. What is ANSI ... http://www.sttmedia.com/unicode-ansi

Edited by athoik, 25 February 2013 - 21:40.


Re: Turkish Character (Coding) Problem? #14 pieterg

  • PLi® Core member
  • 32,766 posts

+245
Excellent

Posted 25 February 2013 - 21:42

A plugin that sets GST_SUBTITLE_ENCODING and installs the code page (if missing) would probably be needed for non-advanced users.


Then you'd end up switching the subtitle encoding (and restarting e2) for every different movie.
Actually, to avoid all this, they invented utf-8. Goodbye to all those different encodings ;)

Re: Turkish Character (Coding) Problem? #15 eozen81

  • Senior Member
  • 55 posts

+2
Neutral

Posted 25 February 2013 - 21:57

So far I have run the commands below via telnet and restarted the device, but no luck :( Could you please advise what else I should run via telnet?

opkg install eglibc-gconv-iso8859-7
opkg install eglibc-gconv-iso8859-9
opkg install eglibc-gconv-iso8859-3

Re: Turkish Character (Coding) Problem? #16 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 25 February 2013 - 22:05


A plugin that sets GST_SUBTITLE_ENCODING and installs the code page (if missing) would probably be needed for non-advanced users.


Then you'd end up switching the subtitle encoding (and restarting e2) for every different movie.
Actually, to avoid all this, they invented utf-8. Goodbye to all those different encodings ;)


I agree UTF-8 is much better. But since this is a feature everybody wants, why not use it? All my srt files are in Greek ISO. This solution only helps people who use a single ISO encoding, but it helps.

So far I have run the commands below via telnet and restarted the device, but no luck :( Could you please advise what else I should run via telnet?

opkg install eglibc-gconv-iso8859-7
opkg install eglibc-gconv-iso8859-9
opkg install eglibc-gconv-iso8859-3


By default enigma2 assumes ISO-8859-15.

Encoding to assume if input subtitles are not in UTF-8 encoding. If not set, the GST_SUBTITLE_ENCODING environment variable will be checked for an encoding to use. If that is not set either, ISO-8859-15 will be assumed.
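The lookup order in the quoted description can be written down as a short Python sketch. This is not GStreamer's or Enigma's code; the function name and its parameters are made up for the example, with "configured" standing in for the explicitly set encoding that the quote says is checked first.

```python
import os
from typing import Mapping, Optional


def pick_subtitle_encoding(data: bytes,
                           configured: Optional[str] = None,
                           environ: Optional[Mapping[str, str]] = None) -> str:
    """Mimic the fallback order from the quoted description."""
    env = os.environ if environ is None else environ
    # 1. Valid UTF-8 input needs no fallback at all.
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        pass
    # 2. An explicitly configured encoding wins next.
    if configured:
        return configured
    # 3. Then the environment variable, and finally the built-in default.
    return env.get("GST_SUBTITLE_ENCODING") or "ISO-8859-15"


legacy = "ş".encode("cp1254")
print(pick_subtitle_encoding(legacy, environ={}))
print(pick_subtitle_encoding(legacy, environ={"GST_SUBTITLE_ENCODING": "ISO-8859-9"}))
```

This is why installing the gconv packages alone changed nothing: without the variable set, non-UTF-8 files are still read with the default.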


2. Change /usr/bin/enigma2.sh and set the variable GST_SUBTITLE_ENCODING

Change the line from LD_PRELOAD=$LIBS /usr/bin/enigma2 to LD_PRELOAD=$LIBS GST_SUBTITLE_ENCODING="ISO-8859-9" /usr/bin/enigma2

or

Change the line from LD_PRELOAD=$LIBS /usr/bin/enigma2 to LD_PRELOAD=$LIBS GST_SUBTITLE_ENCODING="ISO-8859-3" /usr/bin/enigma2

Edited by athoik, 25 February 2013 - 22:08.


Re: Turkish Character (Coding) Problem? #17 eozen81

  • Senior Member
  • 55 posts

+2
Neutral

Posted 25 February 2013 - 22:16

Thank you so much for your help here.

I want to thank you on behalf of lots of Turkish people who will be able to watch their movies with subtitles from now on.

You all made my day here, thanks are really not enough :D

Re: Turkish Character (Coding) Problem? #18 Erik Slagter

  • PLi® Core member
  • 46,969 posts

+541
Excellent

Posted 26 February 2013 - 15:59

Edit1. What is ANSI ... http://www.sttmedia.com/unicode-ansi

There is no "ANSI" encoding, you mean ISO/IEC "latin" extensions.



Re: Turkish Character (Coding) Problem? #19 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 26 February 2013 - 16:19

Yes, when I say ANSI I mean an 8-bit character set.


...The term "ANSI" as used to signify Windows code pages is a historical reference, but is nowadays a misnomer that continues to persist in the Windows community...

http://blogs.msdn.co.../31/144893.aspx

Re: Turkish Character (Coding) Problem? #20 Erik Slagter

  • PLi® Core member
  • 46,969 posts

+541
Excellent

Posted 26 February 2013 - 18:57

Hmm, interesting, that's new to me (maybe that's because I don't use windows). I am familiar with "windows code pages" though.

Anyway, IF someone were to use non-UTF-8, then at least let it be ISO-8859; all the rest is hackish...

I think the real solution is to convert files in all these "weird" encodings to UTF8, there are tools to do that.
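As one possible way to do that conversion, here is a minimal Python sketch. The function name, the file names in the usage comment and the default source encoding (Windows-1254, picked because this thread is about Turkish) are all assumptions for the example; the source encoding has to be known or guessed, since the srt file does not record it.

```python
from pathlib import Path


def convert_srt_to_utf8(src: str, dst: str, source_encoding: str = "cp1254") -> None:
    """Rewrite a subtitle file as UTF-8, assuming source_encoding if it is not UTF-8 already."""
    data = Path(src).read_bytes()
    try:
        # Already valid UTF-8: keep the text unchanged.
        text = data.decode("utf-8")
    except UnicodeDecodeError:
        # Raises UnicodeDecodeError again if the assumed encoding cannot decode the bytes.
        text = data.decode(source_encoding)
    Path(dst).write_bytes(text.encode("utf-8"))


# Example call with hypothetical file names:
# convert_srt_to_utf8("movie.srt", "movie.utf8.srt", source_encoding="cp1254")
```

Once the file is UTF-8, no per-movie encoding setting or restart is needed.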

Edited by Erik Slagter, 26 February 2013 - 18:59.



