
WANFailover Dual WAN Failover Script


Here's another one. Today I restarted the router and the script spent 10 minutes trying to start. There are definitely no problems with the Internet connection itself: once the script gets into working mode there are no errors at all, ping to 1.1.1.1 is about 10 ms, there is no lag on the line, and the speed is 1 Gbps.


The router used to be an AX86U; now it is a TUF-AX5400. I understand that the problem lies with the provider, but apart from the script everything works perfectly.
I have another router on the same provider, but with a static IP, and everything is OK there. Another router on a different provider (also with a static IP) is fine as well.

Hmmm I wonder if your provider has DDoS protections on newly leased DHCP leases. This would prevent ICMP for a short amount of time.
 
Maybe, I don't know. But again, the same provider serves another house where the IP is not internal (10.x.x.x NAT) but a static external one, and everything is OK there.

I had already suspected the router, so I swapped it and reset it, but nothing changed. Tomorrow I will put the old AX86 back.
 
Could you try testing with a static IP? That would possibly confirm the DDoS protection on new DHCP leases.
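One way to check this theory without a static IP would be to count how many ping attempts fail right after a fresh lease before the first one succeeds. A minimal sketch in POSIX sh; the function name attempts_until_reachable and the example probe command are illustrative assumptions and are not part of wan-failover:

```shell
#!/bin/sh
# Hypothetical helper: run a probe command repeatedly and print the number
# of the first attempt that succeeds. Returns non-zero if none succeed.
attempts_until_reachable() {
    probe_cmd=$1       # command run once per attempt
    max_attempts=$2    # give up after this many attempts
    interval=${3:-1}   # pause between attempts, in seconds
    attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if sh -c "$probe_cmd" >/dev/null 2>&1; then
            echo "$attempt"
            return 0
        fi
        sleep "$interval"
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (assumed interface and target, taken from the log in this thread):
#   attempts_until_reachable "ping -c 1 -W 1 -I eth4 1.1.1.1" 600 1
```

If the printed attempt number is consistently large right after a new DHCP lease and 1 at any other time, that would point to the provider rather than the script.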
 
I would have to buy an external IP for a year. It's not a cheap treat ;) It would be easier to change the provider at the problem house ;) Maybe you can still try to identify the problem somehow, or at least add a pause in the checks on wan0 after switching from wan1 to wan0 (at least for test purposes). The delay at router startup does not help: I have already set it to 180 seconds and nothing changed.
 
Perhaps a Failback monitor delay option? So when the router boots up it is fine?
 
This is the initial launch. It seems to me this happens because line 2 is in cold standby: the script makes it hot, that is, it switches to wan1, then tries to switch to wan0, and this is when the problem occurs. So I would ask not for a Failback monitor delay but for a delay in monitoring WAN0 after switching from wan1 to wan0. Do you see what I mean? The script sees that wan0 is working, switches to it, and then for a time specified in the config does not check whether it is working, i.e. does not put it under monitoring.
 

The firmware comes up with WAN1 active and Failback then switches it to WAN0. My script does the same because that is how the firmware loads Dual WAN. Try increasing your Boot Delay Timer to 10 minutes (600).
 
Well, I understand. But still, if you can, please try to implement a delay in monitoring the main line after the script has seen that the line is working and switched to it. Say it sees that WAN0 is connected and pings normally and returns to the main line; it then waits for the specified time, and once that time has passed it tests wan0 again and only then puts it under permanent checks to detect a failure.

Maybe it will help.

I'll set 600 seconds right now.
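To make the request concrete, here is a rough sketch of what such a grace period could look like. It assumes POSIX sh; MONITOR_GRACE_SECONDS and both function names are hypothetical and do not exist in wan-failover, and the real monitor loop may be structured quite differently:

```shell
#!/bin/sh
# Hypothetical setting, NOT an existing wan-failover option:
# how long to ignore failed checks after switching back to WAN0.
MONITOR_GRACE_SECONDS=120

# Succeeds once at least MONITOR_GRACE_SECONDS have passed since the switch.
grace_period_over() {
    switched_at=$1   # epoch seconds when WAN0 became primary again
    now=$2           # current epoch seconds
    [ $((now - switched_at)) -ge "$MONITOR_GRACE_SECONDS" ]
}

# Decide what to do with a failed ping check on the primary WAN.
handle_failed_check() {
    if grace_period_over "$1" "$2"; then
        echo "failover"   # outside the grace period: treat as a real failure
    else
        echo "ignore"     # inside the grace period: do not switch yet
    fi
}
```

With the timestamps passed in explicitly, a failed check 60 seconds after the switch would be ignored, while the same failure 120 seconds or more after the switch would trigger a failover as usual.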
 
Let’s start with the boot delay timer increase for now and go from there. If we need to add an adjustable setting then we can look into that but let’s see what we can do with the settings you have now.
 
OK, I set
WANDISABLEDSLEEPTIMER=90
BOOTDELAYTIMER=600

and restarted the router

Code:
Jul 17 00:29:00 wan-failover.sh: Debug - Locked File: /var/lock/wan-failover.lock
Jul 17 00:29:00 wan-failover.sh: Debug - Trap set to remove /var/lock/wan-failover.lock on exit
Jul 17 00:29:00 wan-failover.sh: Debug - Script Mode: run
Jul 17 00:29:00 wan-failover.sh: Debug - Function: systemcheck
Jul 17 00:29:00 wan-failover.sh: Debug - Log Level: 7
Jul 17 00:29:00 wan-failover.sh: Process ID - 3620
Jul 17 00:29:00 wan-failover.sh: Debug - Function: nvramcheck
Jul 17 00:29:00 wan-failover.sh: Debug - ***NVRAM Check Passed***
Jul 17 00:29:00 wan-failover.sh: Version - v1.5.5
Jul 17 00:29:00 wan-failover.sh: Debug - Firmware: 386.5
Jul 17 00:29:00 wan-failover.sh: Debug - Function: setvariables
Jul 17 00:29:00 wan-failover.sh: Debug - Reading /jffs/configs/wan-failover.conf
Jul 17 00:29:00 wan-failover.sh: Debug - Checking for missing configuration options
Jul 17 00:29:01 wan-failover.sh: Debug - Reading /jffs/configs/wan-failover.conf
Jul 17 00:29:01 wan-failover.sh: Debug - Function: debuglog
Jul 17 00:29:01 wan-failover.sh: Debug - Function: nvramcheck
Jul 17 00:29:01 wan-failover.sh: Debug - ***NVRAM Check Passed***
Jul 17 00:29:01 wan-failover.sh: Debug - Dual WAN Mode: fo
Jul 17 00:29:01 wan-failover.sh: Debug - Dual WAN Interfaces: wan lan
Jul 17 00:29:01 wan-failover.sh: Debug - ASUS Factory Watchdog: 0
Jul 17 00:29:01 wan-failover.sh: Debug - Firewall Enabled: 1
Jul 17 00:29:01 wan-failover.sh: Debug - LEDs Disabled: 0
Jul 17 00:29:01 wan-failover.sh: Debug - QoS Enabled: 0
Jul 17 00:29:01 wan-failover.sh: Debug - DDNS Hostname:
Jul 17 00:29:01 wan-failover.sh: Debug - LAN Hostname: TUF-AX5400-0CB0
Jul 17 00:29:01 wan-failover.sh: Debug - WAN IPv6 Address:
Jul 17 00:29:01 wan-failover.sh: Debug - Default Route: default via 10.128.131.254 dev eth4
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Enabled: 1
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Routing Table Default Route:
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Target IP Rule:
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 IP Address: 10.128.131.80
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Real IP Address: 46.32.87.1
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Real IP Address State: 2
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Gateway IP: 10.128.131.254
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Gateway Interface: eth4
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Interface: eth4
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 State: 2
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Primary Status: 1
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Target IP Address: 1.1.1.1
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Routing Table: 100
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 IP Rule Priority: 100
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Mark: 0x80000000
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 Mask: 0xf0000000
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 From WAN Priority: 200
Jul 17 00:29:01 wan-failover.sh: Debug - WAN0 To WAN Priority: 400
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Enabled: 1
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Routing Table Default Route:
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Target IP Rule:
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 IP Address: 10.100.1.2
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Real IP Address:
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Real IP Address State: 0
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Gateway IP: 10.100.1.1
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Gateway Interface: eth0
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Interface: eth0
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 State: 4
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Primary Status: 0
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Target IP Address: 1.0.0.1
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Routing Table: 200
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 IP Rule Priority: 100
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Mark: 0x90000000
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 Mask: 0xf0000000
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 From WAN Priority: 200
Jul 17 00:29:01 wan-failover.sh: Debug - WAN1 To WAN Priority: 400
Jul 17 00:29:01 wan-failover.sh: Debug - Function: wanstatus
Jul 17 00:29:01 wan-failover.sh: Debug - Function: nvramcheck
Jul 17 00:29:01 wan-failover.sh: Debug - ***NVRAM Check Passed***
Jul 17 00:29:01 wan-failover.sh: Debug - System Uptime: 77 Seconds
Jul 17 00:29:01 wan-failover.sh: Debug - Boot Delay Timer: 600 Seconds
Jul 17 00:29:01 wan-failover.sh: Boot Delay - Waiting for System Uptime to reach 600 seconds
Jul 17 00:29:18 WAN_Connection: WAN was restored.



The script takes time. wan1 is cold.
 

Yes, it is delayed for 600 seconds before it does anything because of the boot delay timer. The script itself will execute, but it will not perform any checks or WAN functions until the delay timer is over. I do see that your WAN0 is connected properly and WAN1 is not, though.
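For anyone following along, the idea behind the boot delay can be illustrated roughly like this. This is only a sketch in POSIX sh, not the script's actual code; the function names are made up for the example, and reading /proc/uptime assumes a Linux-based router:

```shell
#!/bin/sh
# Seconds still to wait before system uptime reaches the configured delay.
remaining_boot_delay() {
    uptime_seconds=$1
    boot_delay=$2
    if [ "$uptime_seconds" -ge "$boot_delay" ]; then
        echo 0
    else
        echo $((boot_delay - uptime_seconds))
    fi
}

# Whole seconds of uptime, taken from the first field of /proc/uptime.
current_uptime() {
    cut -d. -f1 /proc/uptime
}

# Example: hold off all WAN checks until uptime reaches 600 seconds.
#   sleep "$(remaining_boot_delay "$(current_uptime)" 600)"
```

With the values from the log above (System Uptime 77 seconds, Boot Delay Timer 600 seconds), the remaining wait works out to 523 seconds.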
 
Here's the sequel. Not as long as last time, but still not perfect.

After the reboot, but before the script continued, I pinged 1.1.1.1 from the router interface and it responded normally.

If you're interested, I can attach a log from another router with the same provider and the same wan1 device, where everything is OK.
 

That definitely looks like your provider has some DDoS protection for new DHCP leases that restricts ICMP for a short time. Once the boot delay timer expired, your script went through the WAN Status checks and into Failover monitor within 60 seconds, which isn't bad given that it had to restart the WAN1 interface during the checks. You can try decreasing the boot delay timer in 60-second steps until you start experiencing the issue again; that way you can get it as low as possible.
 
Yes, I will try to find the optimal value. But probably tomorrow; it's 1 am here :)

Well, in any case, it turns out that right now I can't use the script on this router, because these long switchovers happen not only at the initial boot but also when wan0 suddenly fails and recovers after some time. The return to wan0 then involves the same unnecessary switching back and forth. :(

It seems to me that it does not depend much on the initial boot time at all. The first switch always fails, and after that it is a matter of luck.

 


You decreased the boot delay timer back to 180 seconds. Also, I noticed in the debug logging that for some reason your router keeps losing its connection to WAN1. It is going to State 4, which means it won't have an IP / Gateway and is disconnected. Any idea why that might be?
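If you want to see how often that happens, the state lines can be pulled out of the debug log with something like the sketch below. It assumes POSIX sh and the log line format shown earlier in this thread; the function names are illustrative, and the syslog path in the example is an assumption about your setup:

```shell
#!/bin/sh
# Print the numeric state from every "WANn State: N" debug line on stdin.
# Lines such as "WANn Real IP Address State: N" are deliberately not matched.
wan_states() {
    iface=$1   # WAN0 or WAN1
    sed -n "s/.*Debug - ${iface} State: \([0-9][0-9]*\) *\$/\1/p"
}

# Count how many times an interface was logged in a given state.
count_state() {
    iface=$1
    state=$2
    wan_states "$iface" | grep -c "^${state}\$"
}

# Example (assumed log location):
#   grep wan-failover.sh /tmp/syslog.log | count_state WAN1 4
```

A count that keeps growing between restarts would confirm that WAN1 is repeatedly dropping to State 4 rather than doing so only once at boot.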
 

This is how it happens on my main router. (The provider is the same and the wan1 device is the same.)

Tomorrow I will go back to the location of the problematic AX86 and once again thoroughly compare all the settings.
 
It seems to have been fixed :) I set wan1 to use a static IP instead of obtaining one automatically, and everything loaded perfectly ;)

Although no, the first check still shows 100% packet loss, but it does not affect further operation. In short, tomorrow I'll settle on the settings.
 

That supports my theory of DDoS protection from your ISP on new DHCP leases.
 
Well, maybe then it's worth adding the ability to delay monitoring of only the recently raised interface? Please think about it; maybe you can implement it somehow. But I repeat, I set a static IP for WAN1 (my device) and not for wan0 (the provider). I'm afraid to change the wan0 settings remotely right now.
Otherwise it turns into a vicious circle: connected, monitoring, error, disconnected, connected, monitoring, error, etc.
 

I’m trying to think of a way to implement this that would be sufficient yet simple and not confusing for the general population using the script. It would be easy for me to create a new variable in the config, give it a default value of 0, and tell you to adjust that setting. I am not saying no; I’m just thinking it through before I proceed. I like to keep the script as user friendly as possible while allowing flexibility for all users. In my opinion we need to test some more things first before jumping to a new setting.
 
