Help needed with a subtitle file (comma's in the time stamps got lost)


15 replies to this topic

#1 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 09:56

I synced a subtitle file (Dutch subtitles) with the recording I made from ITV of the show Vera.

After I finished, I saved the file and closed the sync program.

Now the file refuses to load in anything that reads srt files.

Opening it in PSPad revealed this:

1
00:02:38605 --> 00:02:41994
Blanke vrouw, begin 20.
Verwonding aan de zijkant van het hoofd.

Of course this must be:

1
00:02:38,605 --> 00:02:41,994
Blanke vrouw, begin 20.
Verwonding aan de zijkant van het hoofd.

I can always resync the file, but that takes half an hour of concentrated work.

Can somebody help with a script or a method for putting the commas back into this file?

 

Willy

 

 

 


~~Rytec Team~~
Maxytec Multibox SE OpenPli (used as mediaplayer)
Mutant HD2400 OpenPli
Vu+ Duo OpenPli (backup)

Synology NAS

Sat: 13E, 19.2E, 23.5E and 28.2E
*Pli/Rytec EPG POWERED*


Re: Help needed with a subtitle file (comma's in the time stamps got lost) #2 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 7 May 2016 - 10:29

Here you are:

>>> import re
>>> d=open("/tmp/s").read()
>>> d
'1\n00:02:38605 --> 00:02:41994\nBlanke vrouw, begin 20.\nVerwonding aan de zijkant van het hoofd.\n'
>>> re.sub("(\d\d:\d\d:\d\d\d)(\d\d)", r"\1,\2", d)
'1\n00:02:386,05 --> 00:02:419,94\nBlanke vrouw, begin 20.\nVerwonding aan de zijkant van het hoofd.\n'

Wavefield T90: 0.8W - 1.9E - 4.8E - 13E - 16E - 19.2E - 23.5E - 26E - 33E - 39E - 42E - 45E on EMP Centauri DiseqC 16/1
Unamed: 13E Quattro - 9E Quattro on IKUSI MS-0916

Re: Help needed with a subtitle file (comma's in the time stamps got lost) #3 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 11:13

Is this a Linux script?

 

Willy




Re: Help needed with a subtitle file (comma's in the time stamps got lost) #4 WanWizard

  • PLi® Core member
  • 68,761 posts

+1,742
Excellent

Posted 7 May 2016 - 11:15

It is Python code.


Currently in use: VU+ Duo 4K (2xFBC S2), VU+ Solo 4K (1xFBC S2), uClan Usytm 4K Pro (S2+T2), Octagon SF8008 (S2+T2), Zgemma H9.2H (S2+T2)

Due to my bad health, I will not be very active at times and may be slow to respond. I will not read the forum or PM on a regular basis.

Many answers to your question can be found in our new and improved wiki.


Re: Help needed with a subtitle file (comma's in the time stamps got lost) #5 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 12:28

How can I use this?

(I have Python installed on my Windows machine and also on Ubuntu, and if everything else fails, on the Enigma receiver.)

 

Willy




Re: Help needed with a subtitle file (comma's in the time stamps got lost) #6 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 12:44

root@et9x00:/var/volatile/tmp# python comma.py test.srt
  File "comma.py", line 1
    >>> import re
     ^
SyntaxError: invalid syntax

Willy




Re: Help needed with a subtitle file (comma's in the time stamps got lost) #7 WanWizard

  • PLi® Core member
  • 68,761 posts

+1,742
Excellent

Posted 7 May 2016 - 12:59

What does your comma.py look like?

The ">>>" in the example is the Python interactive prompt, not part of the code!




Re: Help needed with a subtitle file (comma's in the time stamps got lost) #8 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 7 May 2016 - 13:08

Use this one to create the script.

#!/usr/bin/python

import os
import re
import sys

f = sys.argv[1]
d = open(f).read()
d = re.sub("(\d\d:\d\d:\d\d\d)(\d\d)", r"\1,\2", d)
open(f + '.new', 'w').write(d)


Re: Help needed with a subtitle file (comma's in the time stamps got lost) #9 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 15:35

Thanks. This is the result:

1
00:02:386,05 --> 00:02:419,94
Blanke vrouw, begin 20.
Verwonding aan de zijkant van het hoofd.
2
00:02:420,34 --> 00:02:441,88
- Wond van een stomp voorwerp.
- Hebben we een naam?3
00:02:443,27 --> 00:02:468,78
Nog niets. Geen telefoon, geen tas.4
00:02:478,75 --> 00:02:504,67
- Tijd van overlijden?
- Ergens in het afgelopen uur.

Willy




Re: Help needed with a subtitle file (comma's in the time stamps got lost) #10 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 15:39

Sorry, that is not exactly correct:

 

00:02:386,05 --> 00:02:419,94

 

Should be:

 

00:02:38,605 --> 00:02:41,994

 

I have no idea how to modify it to get it right.

 

Willy


Edited by doglover, 7 May 2016 - 15:46.



Re: Help needed with a subtitle file (comma's in the time stamps got lost) #11 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 7 May 2016 - 17:28

Check the parentheses: there are two groups. The second one has two digits (\d\d); change it to three and remove one digit from the first group.
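
To make that hint concrete, here is a small sketch that applies both patterns to the sample timing line from the first post. The variable names are only illustrative; the two patterns are the ones discussed in this thread.

```python
import re

line = "00:02:38605 --> 00:02:41994"

# First attempt: group 1 takes three digits after the last colon, group 2 takes two,
# so the comma lands one digit too far to the right.
wrong = re.sub(r"(\d\d:\d\d:\d\d\d)(\d\d)", r"\1,\2", line)

# Adjusted: group 1 ends after the two-digit seconds,
# group 2 is the three-digit milliseconds.
right = re.sub(r"(\d\d:\d\d:\d\d)(\d\d\d)", r"\1,\2", line)

print(wrong)  # 00:02:386,05 --> 00:02:419,94
print(right)  # 00:02:38,605 --> 00:02:41,994
```

The replacement string r"\1,\2" simply writes group 1, a comma, and group 2 back, so where the comma ends up depends entirely on where the first group stops.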


Wavefield T90: 0.8W - 1.9E - 4.8E - 13E - 16E - 19.2E - 23.5E - 26E - 33E - 39E - 42E - 45E on EMP Centauri DiseqC 16/1
Unamed: 13E Quattro - 9E Quattro on IKUSI MS-0916

Re: Help needed with a subtitle file (comma's in the time stamps got lost) #12 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 17:56

This seems to do the trick:

d = re.sub("(\d\d:\d\d:\d\d)(\d\d\d)", r"\1,\2", d)

But it is still Chinese to me.

But the problem is solved, so thanks.

 

Willy
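
For anyone landing here with the same problem, a slightly more defensive variant of the script might look like the sketch below. It is not the script posted above: the function name restore_commas, the UTF-8 encoding, and the restriction to lines containing " --> " are assumptions made for illustration, and the encoding in particular may need changing for your file.

```python
#!/usr/bin/env python3
import re
import sys

# HH:MM:SS directly followed by exactly three digits (i.e. the comma is missing).
MISSING_COMMA = re.compile(r"\b(\d\d:\d\d:\d\d)(\d\d\d)\b")


def restore_commas(text):
    """Insert the comma on SRT timing lines only (lines containing ' --> ')."""
    fixed = []
    for line in text.splitlines(keepends=True):
        if " --> " in line:
            line = MISSING_COMMA.sub(r"\1,\2", line)
        fixed.append(line)
    return "".join(fixed)


if __name__ == "__main__" and len(sys.argv) == 2:
    path = sys.argv[1]
    # Assumption: the file is UTF-8; newline="" keeps the original line endings.
    with open(path, encoding="utf-8", newline="") as src:
        data = src.read()
    with open(path + ".new", "w", encoding="utf-8", newline="") as dst:
        dst.write(restore_commas(data))
```

Usage would be the same as before, e.g. python comma.py test.srt, which writes test.srt.new. Because the pattern needs three digits straight after the seconds, timestamps that already have their comma are left alone, so running it twice does no harm.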


Edited by doglover, 7 May 2016 - 17:57.



Re: Help needed with a subtitle file (comma's in the time stamps got lost) #13 athoik

  • PLi® Core member
  • 8,458 posts

+327
Excellent

Posted 7 May 2016 - 18:03

Follow this tutorial to learn regular expressions: http://regexone.com/

Re: Help needed with a subtitle file (comma's in the time stamps got lost) #14 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 7 May 2016 - 18:16

I need to do that. It would also be handy for parsing websites for EPG grabbing, but I never got around to applying it. Up to now I have managed by selecting the appropriate tags.

But regular expressions are more versatile. I know that.

 

Willy




Re: Help needed with a subtitle file (comma's in the time stamps got lost) #15 doglover

  • Rytec EPG Team
  • 17,049 posts

+642
Excellent

Posted 9 May 2016 - 14:18

Well, I took a crash course (I will need to repeat it and practise a lot before I am fluent) and realized that I could use this in PSPad or Notepad++ to solve the problem as well.

Search : (\d\d:\d\d:\d\d)

Replace: $1,

That did the trick, as long as you do not forget to select the regex button.

 

Willy
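
As a cross-check, the same search/replace can be sketched in Python, where the editor's $1 is written \1. One thing to watch with this shorter pattern is that running it a second time adds a second comma. The lookahead variant below avoids that in Python; whether a given editor accepts lookahead depends on its regex engine, so treat that part as an assumption.

```python
import re

line = "00:02:38605 --> 00:02:41994"

# Same idea as the editor search/replace: capture HH:MM:SS, then append a comma.
once = re.sub(r"(\d\d:\d\d:\d\d)", r"\1,", line)
twice = re.sub(r"(\d\d:\d\d:\d\d)", r"\1,", once)

# Variant: only match when a digit follows directly,
# so a pass over an already fixed line changes nothing.
guarded = re.sub(r"(\d\d:\d\d:\d\d)(?=\d)", r"\1,", once)

print(once)     # 00:02:38,605 --> 00:02:41,994
print(twice)    # 00:02:38,,605 --> 00:02:41,,994
print(guarded)  # 00:02:38,605 --> 00:02:41,994
```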


Edited by doglover, 9 May 2016 - 14:18.



Re: Help needed with a subtitle file (comma's in the time stamps got lost) #16 WanWizard

  • PLi® Core member
  • 68,761 posts

+1,742
Excellent

Posted 9 May 2016 - 20:16

For testing regular expressions, have a look at https://regex101.com/; it might save quite a bit of time in trial and error.




