I think I was the one speculating about different behaviour.
Previously, the router log showed:
Code:
Jul 16 18:08:11 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link DOWN.
Jul 16 18:08:12 WAN_Connection: ISP's DHCP did not function properly.
Jul 16 18:08:15 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link Up at 1000 mbps full duplex
Jul 16 18:08:18 WAN_Connection: WAN(0) link up.
Jul 16 18:08:18 rc_service: wanduck 1119:notify_rc restart_wan_if 0
Jul 16 18:08:18 lldpd[1128]: removal request for address of 10.241.204.243%11, but no knowledge of it
Based on instances of 'restart_wan_if' in 'wanduck.c' I think this corresponds to this segment from line 2384 of wanduck.c:
C:
current_state[wan_unit] = get_wan_state(wan_unit);
if(current_state[wan_unit] != WAN_STATE_CONNECTING && current_state[wan_unit] != WAN_STATE_CONNECTED && current_state[wan_unit] != WAN_STATE_DISABLED){
	snprintf(cmd, sizeof(cmd), "restart_wan_if %d", wan_unit);
	_dprintf("wanduck2: %s.\n", cmd);
	notify_rc_and_wait(cmd);
}
Source: RMerl/asuswrt-merlin on github.com (enhanced version of Asus's router firmware, Asuswrt, legacy code base)
Whereas now, despite the same event (eth0 down for 3 seconds), the router log shows only:
Code:
Aug 7 23:01:27 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link DOWN.
Aug 7 23:01:30 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link Up at 1000 mbps full duplex
Yet I see that wanduck is running:
Code:
admin@RT-AX86U-4168:/tmp/home/root# ps |grep -i wan
1119 admin 11688 S /sbin/wanduck
6839 admin 5436 S grep -i wan
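To compare how long eth0 was actually down in the two logs above, here is a rough sketch that pairs each "Link DOWN" line with the next "Link Up" line and prints the gap in seconds. It assumes log lines shaped exactly like the ones quoted (time in the third field) and that a pair does not straddle midnight; `link_gaps` is just a name I made up.

```shell
#!/bin/sh
# Sketch: print the gap between each "Link DOWN" and the following "Link Up".
# Assumes syslog lines like the ones quoted above (HH:MM:SS in field 3)
# and that a DOWN/Up pair does not cross midnight.
link_gaps() {
    awk '
        # Convert HH:MM:SS to seconds since midnight.
        function secs(t,  p) { split(t, p, ":"); return p[1]*3600 + p[2]*60 + p[3] }
        /Link DOWN/       { down = secs($3); have = 1; next }
        /Link Up/ && have { print $1, $2, $3, "gap=" (secs($3) - down) "s"; have = 0 }
    '
}

# Demo using the kernel lines from the two logs in this post.
link_gaps <<'EOF'
Jul 16 18:08:11 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link DOWN.
Jul 16 18:08:15 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link Up at 1000 mbps full duplex
Aug 7 23:01:27 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link DOWN.
Aug 7 23:01:30 kernel: eth0 (Int switch port: 3) (Logical Port: 3) (phyId: c) Link Up at 1000 mbps full duplex
EOF
# prints:
# Jul 16 18:08:15 gap=4s
# Aug 7 23:01:30 gap=3s
```

On the router itself the same function could be fed the live log, e.g. `link_gaps < /tmp/syslog.log`, assuming that is where the system log lives on this firmware.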
What explains why 'wanduck' didn't react this time? Does the 'wanduck' routine need to catch the link while it is actually down in order to trigger a restart, i.e. detect 'eth0 down' in the short window between eth0 going down and coming back up? If so, then perhaps john9527 is onto something here:
OK, I found it.
The problem is that the kernel saw the drop, but the router firmware (wanduck) never saw it because it was so quick. I'd have to double check, but I think it may be a polling operation to detect WAN down.
The sequence should look like that in post #17.
Could it just be that it happened too quickly for the detection routine? Can I increase the polling frequency? I can't figure out from 'wanduck.c' what controls the polling frequency.
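To make the "too quick for a poller" idea concrete, here is a toy simulation, not wanduck's actual logic. It checks whether a poller that samples every `interval` seconds, starting at `phase`, ever samples inside an outage window. The function name and all the numbers are hypothetical; I don't know wanduck's real polling period.

```shell
#!/bin/sh
# Toy model: a poller samples link state at t = phase, phase+interval, ...
# The link is down for down <= t < up. Report whether any sample lands inside.
# All values are made-up integers in seconds; this is NOT wanduck's real logic.
poll_sees_outage() {
    interval=$1 phase=$2 down=$3 up=$4
    if [ "$interval" -le 0 ]; then
        echo "interval must be positive" >&2
        return 1
    fi
    t=$phase
    while [ "$t" -lt "$up" ]; do
        if [ "$t" -ge "$down" ]; then
            echo "seen at t=$t"
            return 0
        fi
        t=$((t + interval))
    done
    echo "missed"
}

# A 3-second outage from t=11 to t=14:
poll_sees_outage 5 0 11 14   # samples at 0, 5, 10, 15  -> prints "missed"
poll_sees_outage 5 2 11 14   # samples at 2, 7, 12      -> prints "seen at t=12"
poll_sees_outage 1 0 11 14   # samples every second     -> prints "seen at t=11"
```

If the detection really is poll-based with a period longer than the outage, whether a given drop is noticed would come down to timing luck, which would fit seeing a restart one time and nothing the next.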
Or could Diversion be blocking it?
I think a reasonable guess is that when the modem refreshes and eth0 goes down and comes back up, the router ought to restart the WAN, as it tried to the first time round. That attempt failed for some reason, but back then I was using 'redirect all: yes', whereas now I'm using VPN Director, so maybe that helps on that front.
That's interesting about the modem subnet. I suppose I cannot change it because it is set by the ISP? Or is it set by the router, or by the modem itself? I find that confusing.
Next time the issue occurs I will try to see whether I can log into the modem.
I have also enabled the /jffs/scripts/dhcpc-event tweak to get extra info.
As for network monitoring, it looks like this can only be enabled if 'dual wan' is set. And then I would put in, say, 8.8.8.8 or something?
I'd really like to figure out what caused wanduck to fail though - why it ran the first time but not the second time.