a.) If we are trying to find a real solution to the core problem, we should split this discussion off into some development thread.
b.) To make sure we are all discussing the same thing, we should point out some differences between OpenPLi on the one hand and other distros on the other.
This is the boot process of OpenPLi (and older OpenViX/OpenATV/OpenHDF/...):
From power off:
1. Yocto starts to boot. System time is 1970-01-01 0:00 UTC
2. Stuff that uses certs fails in init with "certificate not valid yet"
3. One-shot NTP sync (in yocto) fails even on networked boxes because ntpsync is called too soon after ifup -> Time remains at 1970-01-01 0:00 UTC plus current runtime
4. E2 GUI starts, restores the pseudo-RTC from front panel (which also contains 1970-01-01), checks "now < 2004?" -> True, so do a full transponder sync ONCE, no matter if that time is good or not
From reboot/deep standby:
1. Yocto starts to boot. System time is 1970-01-01 0:00 UTC
2. Stuff that uses certs fails in init with "certificate not valid yet"
3. One-shot NTP sync (in yocto) fails even on networked boxes because ntpsync is called too soon after ifup -> Time remains at 1970-01-01 0:00 UTC plus current runtime
4. E2 GUI starts, restores the pseudo-RTC from front panel, checks "now < 2004?" -> False, so NEVER do a full transponder sync, even if the current system time has been borked by drift before
Problems:
- Cert based stuff on boot fails
- box time can drift wherever it wants to drift to (See the links from one of my previous posts), ONLY recoverable by FULLY POWERING the box OFF
- if the box powers on on a bad transponder, it will get a bad time right from the start and never fix it, also ONLY recoverable by FULLY POWERING the box OFF
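To make the difference between the two cases visible, here is a minimal sketch of the check described above. It is not the actual E2 code; the function name and the example timestamps are made up for illustration, only the "now < 2004?" rule is taken from the description.

```python
from datetime import datetime, timezone

# The "now < 2004?" threshold from the description above, as a Unix timestamp.
YEAR_2004 = datetime(2004, 1, 1, tzinfo=timezone.utc).timestamp()


def needs_transponder_sync(system_time: float) -> bool:
    """Hypothetical helper: sync only if the clock is obviously unset."""
    return system_time < YEAR_2004


# From power off: the front panel also holds 1970, so only the runtime is added.
cold_boot = 0.0 + 42.0

# From reboot/deep standby: the front panel restores a plausible-looking time
# that has drifted (assumed here: three hours off).
drifted = datetime(2017, 12, 27, 15, 0, tzinfo=timezone.utc).timestamp() + 3 * 3600

print(needs_transponder_sync(cold_boot))  # True  -> one full transponder sync
print(needs_transponder_sync(drifted))    # False -> the drift is never corrected
```

The check can only tell "clock unset" from "clock set", it cannot tell "set" from "set correctly", which is why the drift survives everything except a full power off.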
This is the boot process in recent OpenViX/OpenATV/OpenHDF/...:
From any state:
1. Yocto starts to boot. System time is 1970-01-01 0:00 UTC
2. In early boot, fake-hwclock sets the system time to image build time or last shutdown
3. A bit later in boot (After the box drivers were loaded), stb-hwclock restores the pseudo-RTC from front panel as long as it is ahead of the time restored by fake-hwclock and sets system time accordingly
4. Stuff that uses certs succeeds in most cases (Unless it's happening between 2. and 3. AND the cert was issued between image build time/last shutdown and the real "now", very unlikely)
5. E2 GUI starts -> ALWAYS does a full transponder sync ONCE, no matter if that time is good or not
Problems:
- box time can drift wherever it wants to drift to (See the links from one of my previous posts), but will be recovered by ANY E2 restart, even only GUI restart
- if the box powers on on a bad transponder, it will get a bad time right from the start and never fix it, also recoverable only by E2 restart (but GUI restart is sufficient)
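Steps 2 and 3 of this boot process boil down to "never let the clock go backwards". A rough sketch, assuming both values are available as Unix timestamps; the function name is hypothetical and this is not the real fake-hwclock or stb-hwclock code:

```python
def restore_boot_time(fake_hwclock_time: float, front_panel_time: float) -> float:
    """Hypothetical sketch of steps 2 and 3 above.

    Step 2: start from the time saved by fake-hwclock
            (image build time or last shutdown).
    Step 3: take the front panel pseudo-RTC only if it is ahead of that.
    """
    system_time = fake_hwclock_time
    if front_panel_time > system_time:
        system_time = front_panel_time
    return system_time


# Front panel lost its time (1970): keep the fake-hwclock value.
print(restore_boot_time(1514386920.0, 0.0))           # 1514386920.0
# Front panel is ahead of the last shutdown time: use the front panel.
print(restore_boot_time(1514386920.0, 1514390520.0))  # 1514390520.0
```

Either way the system never boots with 1970, so cert checks at least see a time that is no older than the image build or the last shutdown.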
If you look carefully into any decent Linux distro and/or the yocto core, you will notice that the current OpenATV/OpenHDF/OpenViX/... behaviour is in line with normal Linux behaviour and yocto design.
It was previously (and still is in OpenPLi) behaving differently due to quirks and bugs in oe-a and OpenPLi core:
I would say it's rather safe to assume that having a decently proper system time at boot is a good idea, as any Linux distro out there does its best to maintain this.
In systemd, advancing the system clock to at least systemd build time is even a built-in feature.
Tell me whatever you want, but I can't consider a workaround in E2 that relies on everything around it to fail (NTP one shot, systemd) or to be patched to death (save-rtc.sh) as "best practice".
For that reason, I removed the bad workaround in E2 instead ...
https://github.com/O...e341aaae6a07f77
https://github.com/o...151b4ac04616692
https://github.com/o...d63387c1889eb1f
... because the problem is not that Linux distros do their best to maintain a decent system time, but that E2 makes false assumptions about it.
Tests so far have shown that it's the correct direction: boxes without network (NTP) now also get a time sync (transponder) again ... and we got rid of one more bad workaround.
What has not been evaluated so far is whether this might result in one "jump" of the clock under some conditions:
NTP time at boot -> transponder sync on GUI start -> back to NTP time again
The behaviour here might differ between OpenPLi and others though, as from this thread I have learned that OpenPLi users need yet another plugin for NTP, while NTP sync is a built-in feature in at least OpenATV, but probably also OpenViX, OpenBH and others:
OpenATV users can toggle between NTP and transponder sync without the need for additional plugins, so selecting NTP disables transponder sync anyway, but only conditionally (if the time makes sense).
Some thoughts:
The difference between transponder and box time should be "reconsidered" from time to time. It doesn't make sense to ask the transponder about its time, only to ignore it later.
One idea would be to use modulo: Subtract full half hours from the difference. E.g. if the difference is 3800 seconds, it appears more probable to me that the transponder delivers a correct time that is simply in local time, rather than it being entirely wrong.
Subtracting 3600 sec = 1 hr would result in a difference of 200 seconds, which we could still sync with.
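A minimal sketch of that modulo idea, just to show the arithmetic. The names and the 300 second limit are my assumptions for illustration, not anything that exists in E2:

```python
HALF_HOUR = 1800    # seconds; timezone offsets come in multiples of 30 min
MAX_RESIDUAL = 300  # assumed limit for "we could still sync with this"


def residual_offset(diff_seconds: int) -> int:
    """Strip the nearest whole multiple of half an hour from the difference."""
    nearest = round(diff_seconds / HALF_HOUR) * HALF_HOUR
    return diff_seconds - nearest


def looks_like_local_time(diff_seconds: int) -> bool:
    """True if the difference is probably just a timezone offset plus drift."""
    return abs(residual_offset(diff_seconds)) <= MAX_RESIDUAL


print(residual_offset(3800))        # 200  (3800 - 3600)
print(looks_like_local_time(3800))  # True
print(looks_like_local_time(900))   # False, 15 min off is not a timezone
```

Rounding to the nearest multiple (instead of a plain `%`) also handles transponders that are slightly behind, e.g. a difference of 3500 seconds gives a residual of -100.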
If transponder times really are that bad, why sync with them at all? Wouldn't it be more honest then to say "Your cheap-a$$ box vendor was too stingy to include a 20 Cent RTC in your box, you need network for NTP time!"?
If only some transponders deliver a good time, wouldn't it be more elegant to let E2 shortly tune to them at boot, rather than to advise the users to make the box come up with some - maybe unwanted - channel on such a transponder, just to get some good time?
Also: Why not include a list of "known-good" transponders/sat positions for time sync, so that E2 at least re-syncs when switching to them, rather than to preserve the bad offset it stored on first sync on a bad one?
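How such a list could work, as a rough sketch only. The class, the method name and the two positions in the list are placeholders I made up for illustration; nothing like this exists in E2 as far as I know:

```python
# Placeholder list: orbital positions in tenths of a degree east,
# e.g. 192 = 19.2E, 130 = 13.0E. Purely illustrative.
KNOWN_GOOD_POSITIONS = {192, 130}


class TransponderTimeSync:
    """Hypothetical sketch: keep the first offset, but let known-good
    positions overwrite it whenever the box tunes to them."""

    def __init__(self):
        self.offset = None    # transponder time minus system time, in seconds
        self.trusted = False  # did the stored offset come from a known-good position?

    def on_transponder_time(self, position, transponder_time, system_time):
        """Return True if the stored offset was updated."""
        is_known_good = position in KNOWN_GOOD_POSITIONS
        if is_known_good or self.offset is None:
            self.offset = transponder_time - system_time
            self.trusted = is_known_good
            return True
        # Unknown position and we already have an offset: keep what we have.
        return False


sync = TransponderTimeSync()
print(sync.on_transponder_time(1600, 5000, 1000))  # True, first sync (bad offset 4000)
print(sync.on_transponder_time(192, 1010, 1000))   # True, known-good re-sync (offset 10)
print(sync.on_transponder_time(1600, 5000, 1000))  # False, bad position is ignored now
```

The point is only that a bad first sync no longer sticks forever: the next zap to a listed position repairs it.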
From what I learned so far, bad time sync is a problem that doesn't really exist on Astra or Hotbird = most of our users.
Why does the whole world need to suffer from bad code written for down under (It was mentioned that Aussie sat positions DO deliver bad times)?
Edited by SpaceRat, 27 December 2017 - 15:22.