EPGimport - Modification requested


7 replies to this topic

#1 doglover

  • Rytec EPG Team
  • 17,010 posts

+639
Excellent

Posted 1 July 2022 - 15:19

It sometimes happens that the download of an XMLTV file gets stuck.

The whole EPGimport routine then comes to a halt, and the only way to abort it is to restart enigma.

 

Lately I discovered that sometimes this is caused by some of the download sites.

Contact has been made, a download requested, but nothing is transferred.  Result: hung EPGimport.

 

From my limited knowledge of python, these are the statements for the download:

    def fetchUrl(self, filename):
        if filename.startswith('http:') or filename.startswith('https:') or filename.startswith('ftp:'):
            self.do_download(filename, self.afterDownload, self.downloadFail)
        else:
            self.afterDownload(None, filename, deleteFile=False)

    def do_download(self, sourcefile, afterDownload, downloadFail):
        path = bigStorage(9000000, '/tmp', '/media/cf', '/media/mmc', '/media/usb', '/media/hdd')
        filename = os.path.join(path, 'epgimport')
        ext = os.path.splitext(sourcefile)[1]
        # Keep sensible extension, in particular the compression type
        if ext and len(ext) < 6:
            filename += ext
        sourcefile = sourcefile.encode('utf-8')
        sslcf = SNIFactory(sourcefile) if sourcefile.startswith('https:') else None
        print>>log, "[EPGImport] Downloading: " + sourcefile + " to local path: " + filename
        if self.source.nocheck == 1:
            print>>log, "[EPGImport] Not cheching the server since nocheck is set for it: " + sourcefile
            downloadPage(sourcefile, filename, contextFactory=sslcf).addCallbacks(afterDownload, downloadFail, callbackArgs=(filename, True))
            return filename
        else:
            if self.checkValidServer(sourcefile) == 1:
                downloadPage(sourcefile, filename, contextFactory=sslcf).addCallbacks(afterDownload, downloadFail, callbackArgs=(filename, True))
                return filename
            else:
                self.downloadFail("checkValidServer reject the server")

 

No time-out has been defined.

 

 

I had the same thing happen when updating the download sites.

It connected to the site, but nothing was uploaded.  curl kept on trying forever, and the rest of the uploads never happened.

 

I solved this problem by adding the option --max-time 90 to the curl command for the upload (and for the download of the folder contents, which is used to determine the changed files).

 

The value 90 is in seconds.

 

Of course, when a file won't upload, it is not updated on that site.  But it is updated on the other download sites.

 

My request:

 

Please add a time-out to the file download in EPGimport.  90 seconds will do nicely.

And when a file download times out, treat this as a failed download, so EPGimport knows it has to skip this source and go to the next source for this file.
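
The requested behaviour (a timed-out download counts as a failed download, and the next source is tried) can be sketched roughly as follows. This is a minimal illustration in plain Python 3 with the standard library, not EPGImport's Twisted-based code; the function name fetch_first_available is hypothetical. Note that the timeout of urlopen() applies to each blocking socket operation, so this sketch only catches a server that sends nothing at all.

```python
import urllib.request


def fetch_first_available(urls, timeout=90):
    """Return (url, data) for the first URL that can be downloaded.

    Hypothetical helper, not part of EPGImport. A source that fails or
    times out is skipped and the next alternative is tried.
    """
    errors = []
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return url, resp.read()
        except OSError as exc:
            # URLError and socket timeouts are both OSError subclasses:
            # remember the failure and move on to the next source.
            errors.append((url, exc))
    raise RuntimeError("all sources failed: %r" % (errors,))
```

Called with a list of alternative URLs for the same file, it returns the data from the first one that answers in time.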

 

 

PS: My python knowledge is very limited.


Edited by doglover, 1 July 2022 - 15:22.

~~Rytec Team~~
Maxytec Multibox SE OpenPli (used as mediaplayer)
Mutant HD2400 OpenPli
Vu+ Duo OpenPli (backup)

Synology NAS

Sat: 13E, 19.2E, 23.5E and 28.2E
*Pli/Rytec EPG POWERED*


Re: EPGimport - Modification requested #2 WanWizard

  • PLi® Core member
  • 68,581 posts

+1,738
Excellent

Posted 1 July 2022 - 16:28

You can add a

timeout=90

to the argument list of downloadPage(), but the challenge is how to test it.

 

An expired timeout will trigger a fail.


Currently in use: VU+ Duo 4K (2xFBC S2), VU+ Solo 4K (1xFBC S2), uClan Usytm 4K Pro (S2+T2), Octagon SF8008 (S2+T2), Zgemma H9.2H (S2+T2)

Due to my bad health, I will not be very active at times and may be slow to respond. I will not read the forum or PM on a regular basis.

Many answers to your question can be found in our new and improved wiki.


Re: EPGimport - Modification requested #3 scriptmelvin †

  • PLi® Contributor
  • 720 posts

+46
Good

Posted 1 July 2022 - 17:18

Note that that timeout might be the connection timeout: if the connection is made within the timeout but the file takes an hour to download, downloadPage() will happily sit there for an hour before calling the success callback.

 

Testing could be done on, say, a PHP file that outputs some data, then sleeps for longer than the timeout.
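
The same kind of test server can also be sketched in Python instead of PHP. The following is a minimal example using only the Python 3 standard library; the names StallingHandler and start_stalling_server and the default of 120 seconds are assumptions made for illustration. It sends the headers and 1 KB of data, then sleeps, so a client without a working timeout would hang.

```python
import http.server
import threading
import time


class StallingHandler(http.server.BaseHTTPRequestHandler):
    """Send a little data, then stall for longer than the client timeout."""

    stall_seconds = 120  # assumed value: anything longer than the timeout under test

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        # Announce far more data than is sent before the stall.
        self.send_header("Content-Length", "1000000")
        self.end_headers()
        self.wfile.write(b"x" * 1024)
        self.wfile.flush()
        time.sleep(self.stall_seconds)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def start_stalling_server(port=0, stall_seconds=120):
    """Start the server in a background thread; port 0 picks a free port."""
    handler = type("Handler", (StallingHandler,), {"stall_seconds": stall_seconds})
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

The port actually in use is available as server.server_address[1], and server.shutdown() stops the server again.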


Sorry to inform you this member, my brother, passed away.

Re: EPGimport - Modification requested #4 WanWizard

  • PLi® Core member
  • 68,581 posts

+1,738
Excellent

Posted 1 July 2022 - 17:23

According to the docs it should be a socket timeout, so I would expect it to time out if no data is coming in as well.

 

Note that that won't solve the "1 byte per minute" scenario, which seems to be the problem here: it connects to the webserver just fine, but the webserver doesn't time out. Since it is very unusual to disable timeouts on a webserver, I assume some data is being transferred, just at an extremely low rate.
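
To illustrate the difference between a socket (idle) timeout and a limit on the whole transfer, here is a minimal sketch in plain Python 3 with the standard library. It is not Twisted code and not what EPGImport does; the function name download_with_deadline and its parameters are hypothetical. The idle timeout catches a server that sends nothing, while the overall deadline catches a server that keeps trickling data.

```python
import time
import urllib.request


def download_with_deadline(url, dest, idle_timeout=30, total_timeout=90,
                           chunk_size=65536):
    """Download url to dest, failing on an idle socket or an overall deadline.

    Hypothetical helper for illustration only.
    """
    deadline = time.monotonic() + total_timeout
    with urllib.request.urlopen(url, timeout=idle_timeout) as resp, \
            open(dest, "wb") as out:
        while True:
            # Checked between reads, so the deadline can be overshot by at
            # most one idle_timeout.
            if time.monotonic() > deadline:
                raise TimeoutError("download of %s exceeded %s seconds"
                                   % (url, total_timeout))
            # read1() returns whatever has arrived instead of waiting for a
            # full chunk, so a slow trickle cannot hide from the deadline.
            chunk = resp.read1(chunk_size)
            if not chunk:
                return dest
            out.write(chunk)
```

With idle_timeout alone, a server sending one byte every few seconds would never trigger a failure; the deadline check is what ends such a download.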




Re: EPGimport - Modification requested #5 scriptmelvin †

  • PLi® Contributor
  • 720 posts

+46
Good

Posted 1 July 2022 - 18:11

Then a solution could be: schedule a call (with reactor.callLater()) to the cancel() method of the deferred returned by downloadPage() after the timeout.



Re: EPGimport - Modification requested #6 scriptmelvin †

  • PLi® Contributor
  • 720 posts

+46
Good

Posted 1 July 2022 - 19:47

Something like this:

from twisted.internet import reactor
from twisted.web.client import downloadPage

def downloadPage2(url, file, contextFactory=None, *args, **kwargs):
    # Start the download exactly as downloadPage() would.
    deferred = downloadPage(url, file, contextFactory, *args, **kwargs)

    # If a timeout was given, schedule a cancel of the whole download.
    timeoutCall = None
    if 'timeout' in kwargs:
        timeoutCall = reactor.callLater(kwargs['timeout'], deferred.cancel)

    def _cb(result):
        # The download finished or failed first: drop the pending cancel.
        if timeoutCall is not None and timeoutCall.active():
            timeoutCall.cancel()
        return result

    return deferred.addBoth(_cb)

Usage just like downloadPage(), just add a timeout as @WanWizard describes.



Re: EPGimport - Modification requested #7 doglover

  • Rytec EPG Team
  • 17,010 posts

+639
Excellent

Posted 2 July 2022 - 03:37

I tried WanWizard's suggestion, but set a time-out of 6 seconds.

I defined two files to be downloaded as alternatives.  The first is a 25 MB file, which normally takes 9-10 seconds to download.

The second is a 1.8 MB file, which takes less than 2 seconds to download.

BTW: both files have the same content.  The large one is the uncompressed version.

 

Changed EPGImport.py as follows:

    def do_download(self, sourcefile, afterDownload, downloadFail):
        path = bigStorage(9000000, '/tmp', '/media/cf', '/media/mmc', '/media/usb', '/media/hdd')
        filename = os.path.join(path, 'epgimport')
        ext = os.path.splitext(sourcefile)[1]
        # Keep sensible extension, in particular the compression type
        if ext and len(ext) < 6:
            filename += ext
        sourcefile = sourcefile.encode('utf-8')
        sslcf = SNIFactory(sourcefile) if sourcefile.startswith('https:') else None
        print>>log, "[EPGImport] Downloading: " + sourcefile + " to local path: " + filename
        if self.source.nocheck == 1:
            print>>log, "[EPGImport] Not cheching the server since nocheck is set for it: " + sourcefile
            downloadPage(sourcefile, filename, contextFactory=sslcf, timeout=6).addCallbacks(afterDownload, downloadFail, callbackArgs=(filename, True))
            return filename
        else:
            if self.checkValidServer(sourcefile) == 1:
                downloadPage(sourcefile, filename, contextFactory=sslcf, timeout=6).addCallbacks(afterDownload, downloadFail, callbackArgs=(filename, True))
                return filename
            else:
                self.downloadFail("checkValidServer reject the server")

Result when importing these files:

[EPGImport] Selected source:  ['Test']
sourcesDone():  False None
[EPGImport] nextImport, source= Test
[EPGImport] Downloading: http://epg.vuplus-community.net/WG_PL_Misc.xml to local path: /tmp/epgimport.xml
[EPGImport] Not cheching the server since nocheck is set for it: http://epg.vuplus-community.net/WG_PL_Misc.xml
[EPGImport] download failed: [Failure instance: Traceback (failure with no frames): <class 'twisted.internet.defer.TimeoutError'>: Getting http://epg.vuplus-community.net/WG_PL_Misc.xml took longer than 6 seconds.
]
[EPGImport] Attempting alternative URL
[EPGImport] Downloading: http://epg.vuplus-community.net/rytecPL_Misc.xz to local path: /tmp/epgimport.xz
[EPGImport] Not cheching the server since nocheck is set for it: http://epg.vuplus-community.net/rytecPL_Misc.xz
[EPGImport] afterDownload /tmp/epgimport.xz
[EPGImport] unlink /tmp/epgimport.xz
[EPGImport] Downloading: http://rytecepg.wanwizard.eu/rytec.channels.xml.xz to local path: /tmp/epgimport.xz
[EPGImport] Not cheching the server since nocheck is set for it: http://rytecepg.wanwizard.eu/rytec.channels.xml.xz
[EPGImport] afterChannelDownload /tmp/epgimport.xz
[EPGImport] Using twisted thread
[EPGImport] Parsing channels from '/etc/epgimport/custom.channels.xml'
[EPGImport] Parsing channels from 'rytec.channels.xml.xz'
[EPGImport] INFO: no channel_id_filter.conf file found.
[EPGImport] Parsing channels from 'rytec.channels.xml.xz'
[EPGImport] INFO: no channel_id_filter.conf file found.
[XMLTVConverter] Enumerating event information
Unknown channel:  travelxp.pl
Unknown channel:  to!tv.pl
Unknown channel:  vh1european.pl
Unknown channel:  familysport.pl
[EPGImport] ### thread is ready ### Events: 34214
[EPGImport] imported 34214 events
[EPGImport] Save last import date and count event
[EPGImport] Run check deep standby after import
[EPGImport] #### Finished ####

This is the desired result, and it is also clearly reported in the log.

 

Of course, in the published version the timeout should be set to something like 90, or maybe 60.

 

How do we proceed from here?

Should I propose a change on GitHub, or will somebody with more knowledge do this?
 


Edited by doglover, 2 July 2022 - 03:38.



Re: EPGimport - Modification requested #8 WanWizard

  • PLi® Core member
  • 68,581 posts

+1,738
Excellent

Posted 2 July 2022 - 13:41

Done. Set the timeout to 90.




