Over the last few days I have been upgrading from 380.65_4 to 380.69_0 on both my RT-AC68U and RT-AC88U (I always keep both routers on the same firmware version). I believe I have discovered a service dependency loop on the AC88U which only occurs under certain conditions.
The main pieces of this puzzle are:
1. The startup of some services (OpenVPN, USB mount) is now delayed until the built-in ntp has successfully set the system time
   - On the AC68U only OpenVPN appears to be delayed
   - On the AC88U both OpenVPN and USB mounting are delayed
2. If ntp fails to set the system time, the services wait for about 12 - 14 minutes and then start anyway (I assume this is a fail-safe)
3. I have disabled the DNS functionality of dnsmasq (dnsmasq.conf.add contains port=0) because I use Entware BIND instead
4. I have installed BIND on an ext2 partition on a USB2 storage device (SanDisk Cruzer Fit 8GB)
Put together, this is what happens at boot on the AC88U:
- dnsmasq starts but DNS functionality is disabled (see 3 above)
- ntp starts and attempts to set the system time (nvram ntp_server0 is set to au.pool.ntp.org) but fails because it cannot resolve the hostname (no DNS)
- USB mounting is delayed because the system time is not yet set
- BIND does not start because the USB mounting is delayed (and without BIND there is still no DNS, which closes the loop)
By the time the fail-safe kicks in (after 12 - 14 minutes) and the USB partition is finally mounted, the services-start script (which also starts Entware) has given up. The script waits only 60 seconds for the USB partition to mount.
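To illustrate the timing problem, here is a minimal sketch of the kind of polling loop a startup script could use to wait for the partition. This is not the actual services-start code; the function name wait_for_path, the marker path and the 900 second timeout are all hypothetical, chosen only to show that a wait longer than the 12 - 14 minute fail-safe would be needed.

```shell
#!/bin/sh
# Sketch only: poll once per second until a path exists or the timeout expires.
# wait_for_path is a hypothetical helper, not part of the firmware.
# Usage: wait_for_path PATH TIMEOUT_SECONDS
# Returns 0 if PATH exists within the timeout, 1 otherwise.
wait_for_path() {
    path=$1
    timeout=$2
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        if [ -e "$path" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # One last check after the final sleep.
    [ -e "$path" ]
}

# Example call (commented out; the path is an assumption, adjust to your setup):
# wait_for_path /tmp/mnt/MYLABEL/entware 900 && echo "partition is available"
```

A longer timeout like this is still only a workaround: boot would simply stall until the fail-safe fires.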
To confirm this issue I set ntp_server0 to an IP address (i.e. bypassing DNS). Everything started up normally and worked as expected. There was no delay in the USB mounting, which led to BIND starting correctly.
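For anyone who wants to check whether their own setup depends on DNS at this point in the boot, a rough sketch follows. The helper is_ipv4_literal is hypothetical (it is not a firmware command) and only checks for a dotted-quad IPv4 address; the nvram lines in the comments assume the usual nvram get/set/commit commands, and 192.0.2.1 is a documentation placeholder, not a real ntp server.

```shell
#!/bin/sh
# Sketch only: return 0 if the argument looks like a dotted-quad IPv4 literal,
# 1 if it is anything else (e.g. a hostname that would need DNS to resolve).
is_ipv4_literal() {
    case $1 in
        # Reject empty input, non-digit/dot characters, and misplaced dots.
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    # Split the address on dots into positional parameters.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Example use on the router (commented out):
# server=$(nvram get ntp_server0)
# if is_ipv4_literal "$server"; then
#     echo "ntp_server0 is an IP address, no DNS needed"
# else
#     echo "ntp_server0 is a hostname, ntp needs working DNS at boot"
# fi
#
# To repeat the test described above (placeholder address):
# nvram set ntp_server0=192.0.2.1
# nvram commit
```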
There are hacks that I could put in place that would work around this (like using an IP address for ntp or starting dnsmasq with DNS functionality then restarting it without DNS just before BIND starts) but these are inelegant and do not address the root cause.
I think this issue may have been introduced in 380.68_0. This is only a guess, based on the following Changelog entry:
- FIXED: OpenVPN instances could potentially start too early at
boot time (before clock was set)
Has anyone else experienced this issue? Can anyone else confirm it?