Subtitle and Italic

Enigma2

80 replies to this topic

#1 libolibo

  • Senior Member
  • 81 posts

0
Neutral

Posted 25 January 2011 - 22:25

Hi

Let's say I have a mymovie.mkv and a mymovie.srt
In this srt I have some text marked as <i>italic</i>

How can it be that the sub parser/render actually renders the <i> tag on screen?

I have looked into it and in /usr/share/enigma2/skin_subtitles.xml actually we do have settings for Subtitle_Italic. Is it gst-plugin-subparse rendering it wrong?

I am wondering if you experience the same problem...

Re: Subtitle and Italic #2 ficaz

  • Senior Member
  • 177 posts

+1
Neutral

Posted 26 January 2011 - 11:21

I have the same problem too.

Re: Subtitle and Italic #3 littlesat

  • PLi® Core member
  • 57,084 posts

+698
Excellent

Posted 26 January 2011 - 11:28

Currently the subtitles cannot be rendered in italic, as this has not (yet) been coded correctly by DMM... We did make a lot of changes to the teletext subtitles.

What should be done is that in eSubtitle.cpp the font is switched to a proper italic font -or- what could also be an improvement is to simply remove all the tags.

Do you have an example of those subtitles (srt file)?

Re: Subtitle and Italic #4 littlesat

  • PLi® Core member
  • 57,084 posts

+698
Excellent

Posted 26 January 2011 - 13:23

Do you really see the <i> in the subtitle? When the <i> and </i> tags are at the beginning and at the end of the line, it should work fine. When a tag is in the middle, it seems that the line is cut at the beginning and at the end, and then we see <i>'s...

See the code in /lib/gui/esubtitle.cpp
262						   face = Subtitle_Regular;
263						   ePangoSubtitlePageElement &element = m_pango_page.m_elements[i];
264						   std::string text = element.m_pango_line;
265						   std::string::size_type loc = text.find("<", 0 );
266						   if ( loc != std::string::npos )
267						   {
268								 switch (char(text.at(1)))
269								 {
270								 case 'i':
271									    face = Subtitle_Italic;
272									    break;
273								 case 'b':
274									    face = Subtitle_Bold;
275									    break;
276								 }
277								 text = text.substr(3, text.length()-7);
278						   }
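To see what this snippet does with different inputs, here is a minimal standalone sketch of the same steps. The `Face` enum, the `Styled` struct and the function name `strip_like_original` are made up for illustration and are not part of Enigma2; only the find/switch/substr logic is taken from the lines quoted above.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the face constants used in the quoted snippet (assumed, not the real definitions).
enum Face { Subtitle_Regular, Subtitle_Italic, Subtitle_Bold };

struct Styled {
    Face face;
    std::string text;
};

// Same steps as the quoted snippet, isolated so they can be run on a plain string.
Styled strip_like_original(std::string text)
{
    Face face = Subtitle_Regular;
    std::string::size_type loc = text.find("<", 0);
    if (loc != std::string::npos)
    {
        // Looks at index 1 regardless of where '<' was actually found.
        switch (text.at(1))
        {
        case 'i': face = Subtitle_Italic; break;
        case 'b': face = Subtitle_Bold; break;
        }
        // Drops 3 characters from the front and 4 from the back,
        // which only fits a line of the exact form "<x>...</x>".
        text = text.substr(3, text.length() - 7);
    }
    return {face, text};
}
```

With `"<i>Move it.</i>"` this yields the italic face and `"Move it."`. With `"He said <i>no</i>"` the face stays regular and the text becomes `"said <i>no"`, which matches the observation that a tag in the middle leads to a line cut at both ends with the tag still visible.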


Re: Subtitle and Italic #5 libolibo

  • Senior Member
  • 81 posts

0
Neutral

Posted 26 January 2011 - 21:14

This is rendered on screen as <i>
Let me know if you want the entire .srt file to try it yourself

23
00:01:14,648 --> 00:01:16,248
<i>Move it.</i>
<i>you dare fall.</i>

Re: Subtitle and Italic #6 libolibo

  • Senior Member
  • 81 posts

0
Neutral

Posted 26 January 2011 - 21:37

Anyway I am quite confused. What do we use to render subtitles?

1) the code in /lib/gui/esubtitle.cpp in the Enigma2 repository

or

2) the code in gst/subparse/gstsubparse.c in the gst-plugins-base repository (gstreamer) (git://anongit.freedesktop.org/gstreamer/gst-plugins-base)

If both are used, remember that gstsubparse escapes the <i> into "&lt;i&gt;"
So the code at line 265 in /lib/gui/esubtitle.cpp will never match < as the char at position 0, since that char is &.
I refer to this line:
std::string::size_type loc = text.find("<", 0 );

This is stated in gst/subparse/gstsubparse.c at line 719, in the comment on the function subrip_unescape_formatting
/* we want to escape text in general, but retain basic markup like
* <i></i>, <u></u>, and <b></b>. The easiest and safest way is to
* just unescape a white list of allowed markups again after
* escaping everything (the text between these simple markers isn't
* necessarily escaped, so it seems best to do it like this) */
static void
subrip_unescape_formatting (gchar * txt)
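The approach described in that comment (escape everything, then unescape a white list of tags) can be sketched as follows. This is not the GStreamer code, which works on `gchar *` in place; it is a std::string illustration, and `replace_all_copy`, `escape_all` and `unescape_whitelist` are hypothetical names.

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of `from` in `s` with `to` (hypothetical helper).
std::string replace_all_copy(std::string s, const std::string& from, const std::string& to)
{
    if (from.empty()) return s;
    for (std::string::size_type pos = 0;
         (pos = s.find(from, pos)) != std::string::npos;
         pos += to.length())
        s.replace(pos, from.length(), to);
    return s;
}

// Step 1: escape all markup-significant characters. '&' goes first so the
// entities produced for '<' and '>' are not escaped a second time.
std::string escape_all(std::string s)
{
    s = replace_all_copy(s, "&", "&amp;");
    s = replace_all_copy(s, "<", "&lt;");
    s = replace_all_copy(s, ">", "&gt;");
    return s;
}

// Step 2: turn only the white-listed tags back into real markup.
std::string unescape_whitelist(std::string s)
{
    const char* tags[] = {"i", "u", "b"};
    for (const char* t : tags) {
        std::string tag(t);
        s = replace_all_copy(s, "&lt;" + tag + "&gt;", "<" + tag + ">");
        s = replace_all_copy(s, "&lt;/" + tag + "&gt;", "</" + tag + ">");
    }
    return s;
}
```

For example, `unescape_whitelist(escape_all("<i>Tom & Jerry</i> <x>"))` keeps the `<i>` pair as markup while the ampersand and the unknown `<x>` stay escaped.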


Re: Subtitle and Italic #7 littlesat

  • PLi® Core member
  • 57,084 posts

+698
Excellent

Posted 26 January 2011 - 21:42

You indeed discovered the bug (induced by DMM)...

Re: Subtitle and Italic #8 libolibo

  • Senior Member
  • 81 posts

0
Neutral

Posted 27 January 2011 - 08:05

Then I deserve at least one golden star instead of the blue one I have now :-)

Re: Subtitle and Italic #10 pieterg

  • PLi® Core member
  • 32,766 posts

+245
Excellent

Posted 27 January 2011 - 10:42

you might be on to the problem, but your conclusion is not entirely correct;

std::string::size_type loc = text.find("<", 0 );

finds the location of the first occurrence of '<' in the text. It doesn't just check the first character of the string.

Re: Subtitle and Italic #11 littlesat

  • PLi® Core member
  • 57,084 posts

+698
Excellent

Posted 27 January 2011 - 11:25

But afterwards it removes the first three characters and the last four (7-3=4)... this code seems a bit dubious ;-)

Re: Subtitle and Italic #12 libolibo

  • Senior Member
  • 81 posts

0
Neutral

Posted 27 January 2011 - 11:34

@pieterg
the std::string::size_type loc is used in the next line for an if statement so actually it's not just a find.
std::string::size_type loc = text.find("<", 0 );
if ( loc != std::string::npos )
{
	 switch (char(text.at(1)))
	 {
	 case 'i':
		   face = Subtitle_Italic;

so if the string starts with an & the switch is never triggered and the string is never parsed for italic or bold

Re: Subtitle and Italic #13 pieterg

  • PLi® Core member
  • 32,766 posts

+245
Excellent

Posted 27 January 2011 - 11:36

ok, your conclusion seemed to be about the find, I hadn't looked at the remainder of the code yet ;)

Re: Subtitle and Italic #14 littlesat

  • PLi® Core member
  • 57,084 posts

+698
Excellent

Posted 27 January 2011 - 11:38

So in short, you mean the < has been replaced by an &-code earlier on... Further down in the code some &-codes are converted back... Probably we should move that conversion above this code, and in addition add something so that the < and / also end up in a proper format.

Also, removing characters from the beginning and the end is strange... we could instead remove <i>, </i>, <b> and </b> from the complete string afterwards.

But then the whole line is still either italic, bold or normal. We cannot (yet) style just a part of it.
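A minimal sketch of that idea, assuming the text has already been unescaped: strip the four tags wherever they occur and pick a single face for the whole line. The `Face` enum and the names `erase_all` and `strip_tags` are illustrative only, and letting italic win over bold is an arbitrary choice.

```cpp
#include <cassert>
#include <string>

// Stand-ins for the face constants (assumed, not the real definitions).
enum Face { Subtitle_Regular, Subtitle_Italic, Subtitle_Bold };

// Erase every occurrence of `tag` from `s`; returns true if any was found.
bool erase_all(std::string& s, const std::string& tag)
{
    bool found = false;
    for (std::string::size_type pos; (pos = s.find(tag)) != std::string::npos; ) {
        s.erase(pos, tag.length());
        found = true;
    }
    return found;
}

// Strip <i>, </i>, <b>, </b> anywhere in the line and choose one face for
// the whole line. Italic wins if both kinds are present.
Face strip_tags(std::string& text)
{
    bool italic = erase_all(text, "<i>");
    bool bold = erase_all(text, "<b>");
    erase_all(text, "</i>");
    erase_all(text, "</b>");
    if (italic) return Subtitle_Italic;
    if (bold) return Subtitle_Bold;
    return Subtitle_Regular;
}
```

Called on `"He said <i>no</i> twice"`, this leaves `"He said no twice"` and returns the italic face, so the tags no longer show up on screen even though the whole line is styled.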

Re: Subtitle and Italic #15 libolibo

  • Senior Member
  • 81 posts

0
Neutral

Posted 27 January 2011 - 12:24

We could do some regexp matching, trying to match &lt;i&gt; in addition to finding < at position 0.
That way we can handle both escaped and unescaped strings.
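One way this could look in C++ is with std::regex from the standard library. This is only a sketch: `leading_tag` is a hypothetical name, and it only answers whether the line starts with an italic or bold tag in either form.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns 'i' or 'b' if the line starts with that tag, either raw ("<i>")
// or escaped ("&lt;i&gt;"), and '\0' otherwise.
char leading_tag(const std::string& text)
{
    static const std::regex re("^(?:<|&lt;)([ib])(?:>|&gt;)");
    std::smatch m;
    if (std::regex_search(text, m, re))
        return m[1].str()[0];
    return '\0';
}
```

Both `leading_tag("<i>Move it.</i>")` and `leading_tag("&lt;i&gt;Move it.&lt;/i&gt;")` give `'i'`, while a line that starts with ordinary text gives `'\0'`.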

I am quite rusty at C++ and quite spoiled with Ruby.. so I am not sure how to write it myself :-)

Re: Subtitle and Italic #16 littlesat

  • PLi® Core member
  • 57,084 posts

+698
Excellent

Posted 27 January 2011 - 12:33

Just a suggestion....

But somewhere else the < and > etc. are changed into &-codes... Why not remove that manipulation there? Then all the replace_all calls could be removed here. I have not yet found where the characters are changed into &-codes.

In addition, I would replace 'text = text.substr(3, text.length()-7);' with simply removing <i>, <b>, </i> and </b>.

						   text = replace_all(text, "&apos;", "'");
						   text = replace_all(text, "&quot;", "\"");
						   text = replace_all(text, "&lt;", "<");
						   text = replace_all(text, "&gt;", ">");
						   text = replace_all(text, "&amp;", "&");
						   std::string::size_type loc = text.find("<", 0 );
						   if ( loc != std::string::npos )
						   {
								 switch (char(text.at(1)))
								 {
								 case 'i':
									    face = Subtitle_Italic;
									    break;
								 case 'b':
									    face = Subtitle_Bold;
									    break;
								 }
								 text = text.substr(3, text.length()-7);
						   }
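Two details matter when unescaping like this: each entity should be matched in full, including the trailing semicolon, and "&amp;" is safest handled last so that an escaped ampersand followed by "lt;" is not decoded twice. A small standalone sketch, where `replace_all_copy` is a hypothetical stand-in for the replace_all helper and `unescape_entities` is an illustrative name:

```cpp
#include <cassert>
#include <string>

// Replace every occurrence of `from` in `s` with `to` (hypothetical stand-in).
std::string replace_all_copy(std::string s, const std::string& from, const std::string& to)
{
    if (from.empty()) return s;
    for (std::string::size_type pos = 0;
         (pos = s.find(from, pos)) != std::string::npos;
         pos += to.length())
        s.replace(pos, from.length(), to);
    return s;
}

std::string unescape_entities(std::string text)
{
    text = replace_all_copy(text, "&apos;", "'");
    text = replace_all_copy(text, "&quot;", "\"");
    text = replace_all_copy(text, "&lt;", "<");
    text = replace_all_copy(text, "&gt;", ">");
    // "&amp;" last: doing it earlier would turn "&amp;lt;" into "&lt;"
    // and a later step would then turn that into "<".
    text = replace_all_copy(text, "&amp;", "&");
    return text;
}
```

With this order, `"&lt;i&gt;Tom &amp; Jerry&lt;/i&gt;"` becomes `"<i>Tom & Jerry</i>"`, and the literal text `"&amp;lt;"` becomes `"&lt;"` rather than `"<"`.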


Re: Subtitle and Italic #17 pieterg

  • PLi® Core member
  • 32,766 posts

+245
Excellent

Posted 27 January 2011 - 12:58

quite sure there already is a (eString?) function to remove html escapes?

Re: Subtitle and Italic #18 libolibo

  • Senior Member
  • 81 posts

0
Neutral

Posted 27 January 2011 - 13:54

I think littlesat's patch deserves a try!

Re: Subtitle and Italic #19 littlesat

  • PLi® Core member
  • 57,084 posts

+698
Excellent

Posted 27 January 2011 - 14:18

This was just a brainstorm... but I think it is better to get rid of the &-codes completely, as they are not there in the original file either. Sometimes a quick fix is a dirty fix.

Re: Subtitle and Italic #20 pieterg

  • PLi® Core member
  • 32,766 posts

+245
Excellent

Posted 27 January 2011 - 14:30

Just a suggestion....


yes, that looks like the quickest workaround.
I did not find any eString HTML conversion routines; most likely they only existed in e1's eString, as e2 doesn't do HTML in the C++ part.

But somewhere else the < and > etc. are changed into &-codes... Why not remove that manipulation there? Then all the replace_all calls could be removed here. I have not yet found where the characters are changed into &-codes.


No, that's probably a gstreamer element. We'd rather not patch that.

In addition, I would replace 'text = text.substr(3, text.length()-7);' with simply removing <i>, <b>, </i> and </b>.


yes, the current approach is very error-prone. But a proper pango parser would require a lot more code.
On the other hand, coding a slightly better parser would allow us to introduce \i \b escape codes, just like we do with colors...
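As a rough idea of what a "slightly better parser" could look like, the sketch below splits one line into runs of uniform style, so that only part of a line can be italic or bold. It is not a pango parser and not existing Enigma2 code; `Run` and `split_runs` are hypothetical names, and how the runs would be turned into escape codes for the renderer is left open.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Run {
    std::string text;
    bool italic;
    bool bold;
};

// Split one subtitle line into runs of uniform style. Only <i>, </i>, <b>
// and </b> are recognised; anything else, including a stray '<', is kept as text.
std::vector<Run> split_runs(const std::string& line)
{
    std::vector<Run> runs;
    std::string current;
    bool italic = false, bold = false;

    // Close the run collected so far, if it contains any text.
    auto flush = [&]() {
        if (!current.empty()) {
            runs.push_back(Run{current, italic, bold});
            current.clear();
        }
    };

    for (std::string::size_type i = 0; i < line.length(); ) {
        if (line.compare(i, 3, "<i>") == 0)       { flush(); italic = true;  i += 3; }
        else if (line.compare(i, 4, "</i>") == 0) { flush(); italic = false; i += 4; }
        else if (line.compare(i, 3, "<b>") == 0)  { flush(); bold = true;    i += 3; }
        else if (line.compare(i, 4, "</b>") == 0) { flush(); bold = false;   i += 4; }
        else                                      { current += line[i++]; }
    }
    flush();
    return runs;
}
```

For `"He said <i>no</i> twice"` this produces three runs: `"He said "` (regular), `"no"` (italic) and `" twice"` (regular).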


