
WANFailover Dual WAN Failover ***v2 Release***

Since the V2 betas, option 1 (Status) doesn't work for me; it just produces no output and I have to Ctrl+C to quit the interface. Everything else seems to work OK though. I tried rolling back to V1.6 and the status works in that version. Router and firmware in my sig.
 
Upgraded V2.0.0-beta4 to the latest beta4 and it appears to be working correctly (not extensively tested yet).

Only error message I got was:

Code:
/jffs/scripts/wan-failover.sh update
wan-failover is up to date - Version: v2.0.0-beta4
***Checksum Failed***
Current Checksum: 38ca6179201da921283c37b6c313e3c0  Valid Checksum: 867506eaa6a00b80f078b7c284c380cf
wan-failover is up to date. Do you want to reinstall wan-failover Version: v2.0.0-beta4? ***Enter Y for Yes or N for No***
> y
wan-failover: Update - wan-failover has reinstalled version: v2.0.0-beta4
wan-failover: Restart - Restarting wan-failover ***This can take up to approximately 1 minute***
wan-failover: Restart - Killing wan-failover Process ID: 21256
wan-failover: Restart - Killed wan-failover Process ID: 21256
wan-failover: Restart - Killing wan-failover Process ID: 21257
wan-failover: Restart - Killed wan-failover Process ID: 21257
wan-failover: Restart - Waiting for wan-failover to restart from Cron Job
wan-failover: Restart - Successfully Restarted wan-failover Process ID(s): 6630 6633
/jffs/scripts/wan-failover.sh: line 5833: syntax error: unterminated quoted string

Option 1 (Status) works for me, although my "Active DNS Servers" shows nothing. Is this because I run my own DHCP and DNS servers?

My "WAN DNS Setting | DNS Server" is populated with my local DNS Servers. Does it get its information from there?
 
Does this allow me to have different QoS settings for my original WAN and the secondary line? My secondary line will be an LTE modem, and I'd prefer not having QoS there, or having the bandwidth limiter be auto instead of capped like on my main line. If it does work, will the different QoS configurations conflict with FlexQoS? Thank you!!
 
Since the V2 betas, option 1 (Status) doesn't work for me; it just produces no output and I have to Ctrl+C to quit the interface. Everything else seems to work OK though. I tried rolling back to V1.6 and the status works in that version. Router and firmware in my sig.
Send me the debug logs so I can look and see what is going on.
 
Does this allow me to have different QoS settings for my original WAN and the secondary line? My secondary line will be an LTE modem, and I'd prefer not having QoS there, or having the bandwidth limiter be auto instead of capped like on my main line. If it does work, will the different QoS configurations conflict with FlexQoS? Thank you!!
Yes, you configure QoS settings for both, and during failovers it reapplies the new settings and restarts the QoS service.
 
Yes, you configure QoS settings for both, and during failovers it reapplies the new settings and restarts the QoS service.
I see. Does it work with FlexQoS, or will it conflict with it? I have an old Cat 4 MiFi which should work over USB. I'm trying to use it as a failover in case my main line goes down, so at least we'll have internet. Also, it would be nice to be able to run a script that sends a message when the internet is down and traffic has moved to the failover, or when the main line is back and traffic has successfully moved back to it. I think Telegram is the easiest way to do this. I hope you'll implement this in the future :) Thank you.
 
I see. Does it work with FlexQoS, or will it conflict with it? I have an old Cat 4 MiFi which should work over USB. I'm trying to use it as a failover in case my main line goes down, so at least we'll have internet. Also, it would be nice to be able to run a script that sends a message when the internet is down and traffic has moved to the failover, or when the main line is back and traffic has successfully moved back to it. I think Telegram is the easiest way to do this. I hope you'll implement this in the future :) Thank you.
I already use amtm email notifications for failover events, and it shouldn’t conflict with FlexQoS.
 
I see. I'll see if I can find my modem then. Also, is the script's thread always changing? I found many threads with different versions. Thanks!
My original thread is locked, so I'm not able to post there. This is a beta release, but this is the current thread at the moment.
 
Upgraded V2.0.0-beta4 to the latest beta4 and it appears to be working correctly (not extensively tested yet).

Only error message I got was:

Code:
/jffs/scripts/wan-failover.sh update
wan-failover is up to date - Version: v2.0.0-beta4
***Checksum Failed***
Current Checksum: 38ca6179201da921283c37b6c313e3c0  Valid Checksum: 867506eaa6a00b80f078b7c284c380cf
wan-failover is up to date. Do you want to reinstall wan-failover Version: v2.0.0-beta4? ***Enter Y for Yes or N for No***
> y
wan-failover: Update - wan-failover has reinstalled version: v2.0.0-beta4
wan-failover: Restart - Restarting wan-failover ***This can take up to approximately 1 minute***
wan-failover: Restart - Killing wan-failover Process ID: 21256
wan-failover: Restart - Killed wan-failover Process ID: 21256
wan-failover: Restart - Killing wan-failover Process ID: 21257
wan-failover: Restart - Killed wan-failover Process ID: 21257
wan-failover: Restart - Waiting for wan-failover to restart from Cron Job
wan-failover: Restart - Successfully Restarted wan-failover Process ID(s): 6630 6633
/jffs/scripts/wan-failover.sh: line 5833: syntax error: unterminated quoted string

Option 1 (Status) works for me, although my "Active DNS Servers" shows nothing. Is this because I run my own DHCP and DNS servers?

My "WAN DNS Setting | DNS Server" is populated with my local DNS Servers. Does it get its information from there?
That is likely why. Do you not have DNS configured on your router at all? The checksum failure just means your copy of the script doesn't match the latest available version or has been modified. I'm trying to hunt down the unterminated quoted string, but I'm not seeing it as of now.
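For anyone who wants to narrow down the two messages above on their own copy, here is a minimal sketch, not how wan-failover does it internally. `sh -n` parses a script without running it and reports syntax errors such as an unterminated quote. `md5sum` prints a 32-character hash like the ones in the update output, but whether the script actually uses MD5 is an assumption. The helper name `check_script` is hypothetical.

```shell
#!/bin/sh
# check_script FILE [EXPECTED_MD5]
# Hypothetical helper, not part of wan-failover.
check_script() {
  file="$1"
  expected="${2:-}"

  # -n: read and parse the commands but do not execute them.
  # The shell prints the offending line number on stderr if parsing fails.
  if ! sh -n "$file"; then
    echo "syntax error"
    return 1
  fi

  # Optional hash comparison (assumes the published checksum is an MD5).
  if [ -n "$expected" ]; then
    actual="$(md5sum "$file" | awk '{print $1}')"
    if [ "$actual" != "$expected" ]; then
      echo "checksum mismatch"
      return 2
    fi
  fi

  echo "ok"
}
```

Usage would look like `check_script /jffs/scripts/wan-failover.sh 867506eaa6a00b80f078b7c284c380cf`, with the hash taken from the "Valid Checksum" field of the update output.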
 
My original thread is locked, so I'm not able to post there. This is a beta release, but this is the current thread at the moment.
Hey. Another question. Is it fine to ping just one address to decide whether WAN0 or WAN1 is up or down? Sometimes a Google or even a Cloudflare IP can go down, or the ISP's link to those particular servers can go down.

Isn't it better to have each WAN ping two addresses? If both time out, then it's down for sure.

Also, my internet sometimes goes down for 10 or 60 seconds; it varies. When that happens, does the script still work properly? That is, will it avoid conflicting with a process that is already running? For example, the main line comes back up while the script is changing the QoS settings to the secondary ones, and before it finishes, the script starts another cycle of moving everything back to the main WAN, which applies the original QoS values, etc.
 
Hey. Another question. Is it fine to ping just one address to decide whether WAN0 or WAN1 is up or down? Sometimes a Google or even a Cloudflare IP can go down, or the ISP's link to those particular servers can go down.

Isn't it better to have each WAN ping two addresses? If both time out, then it's down for sure.

Also, my internet sometimes goes down for 10 or 60 seconds; it varies. When that happens, does the script still work properly? That is, will it avoid conflicting with a process that is already running? For example, the main line comes back up while the script is changing the QoS settings to the secondary ones, and before it finishes, the script starts another cycle of moving everything back to the main WAN, which applies the original QoS values, etc.
You will want to monitor both. I would suggest increasing PINGCOUNT and maybe RECURSIVEPINGCHECK (refer to the readme). If it is in the middle of a failover, it will not start a failback until the failover process is complete; it then checks the status of each interface again, and only then is another failover possible.
 
You will want to monitor both. I would suggest increasing PINGCOUNT and maybe RECURSIVEPINGCHECK (refer to the readme). If it is in the middle of a failover, it will not start a failback until the failover process is complete; it then checks the status of each interface again, and only then is another failover possible.
Thank you for the fast reply!! On the first question, that's not what I meant. I mean that sometimes the target address itself is down, or the ISP's link to that target is down. Doesn't that trigger a false internet-down situation? My suggestion is to add two targets for each WAN connection, so four pings in total each time for two WANs. That way, if one target server is down but the second one is working fine, the script will know that the problem is the target server, not the WAN. That makes it more resilient. Thank you for the script.
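To make the two-target suggestion concrete, here is a rough sketch of the idea, not something wan-failover currently does. A WAN counts as down only when every target fails. The function name `wan_is_up`, the `PING_CMD` override, and the packet count and timeout values are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: treat a WAN as down only when every target fails.
# PING_CMD can be overridden so the logic can be exercised without a network.
PING_CMD="${PING_CMD:-ping}"

# wan_is_up IFACE TARGET [TARGET...]
wan_is_up() {
  iface="$1"
  shift
  for target in "$@"; do
    # -I: send from this interface, -c: packets to send, -W: reply timeout in seconds
    if "$PING_CMD" -I "$iface" -c 3 -W 2 "$target" >/dev/null 2>&1; then
      return 0   # one reachable target is enough to call the WAN up
    fi
  done
  return 1       # every target failed
}
```

For example, `wan_is_up eth0 8.8.8.8 1.1.1.1 || echo "WAN0 looks down"` would only report a failure if both addresses time out on that interface. The trade-off is that a real outage takes longer to detect, since each extra target adds its own timeout.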



Also, I found my old USB modem. It works over USB on my Mac without drivers. I haven't tried it on the router yet, since I don't have a SIM card to try it with. Can I check compatibility without a SIM card by just plugging it into the USB port? It's a MiFi, so technically it has DHCP too. I'm still confused, though, about whether it's considered a USB modem or an Android USB device, since it tethers like a phone would on my laptop (it has DHCP for WebGUI access, etc.).
 
Thank you for the fast reply!! On the first question, that's not what I meant. I mean that sometimes the target address itself is down, or the ISP's link to that target is down. Doesn't that trigger a false internet-down situation? My suggestion is to add two targets for each WAN connection, so four pings in total each time for two WANs. That way, if one target server is down but the second one is working fine, the script will know that the problem is the target server, not the WAN. That makes it more resilient. Thank you for the script.



Also, I found my old USB modem. It works over USB on my Mac without drivers. I haven't tried it on the router yet, since I don't have a SIM card to try it with. Can I check compatibility without a SIM card by just plugging it into the USB port? It's a MiFi, so technically it has DHCP too. I'm still confused, though, about whether it's considered a USB modem or an Android USB device, since it tethers like a phone would on my laptop (it has DHCP for WebGUI access, etc.).
I have implemented and managed load balancers, firewalls, and routers in enterprise environments using HA configurations, and the monitoring IPs have almost always been Google's. I can’t recall a time they went down and caused a false positive, but I have seen a circuit have a partial outage that affected some traffic but not all. I don’t think the extra monitoring is necessary in this sense, and even if a false positive occurs and you have a failover, your network is still up and running. I have been streaming or in meetings when a failover has happened, and it is very quick with only a slight blip, so I don’t think the reward is worth it considering the small risk and impact.
 
I have implemented and managed load balancers, firewalls, and routers in enterprise environments using HA configurations, and the monitoring IPs have almost always been Google's. I can’t recall a time they went down and caused a false positive, but I have seen a circuit have a partial outage that affected some traffic but not all. I don’t think the extra monitoring is necessary in this sense, and even if a false positive occurs and you have a failover, your network is still up and running. I have been streaming or in meetings when a failover has happened, and it is very quick with only a slight blip, so I don’t think the reward is worth it considering the small risk and impact.
Thanks for the info. Also, I'm just wondering, in case you know: do spdmerlin, connmon and vnstat all follow the dual WAN or not? I remember that spdmerlin has a WAN selection for the speed test and that vnstat uses, say, eth0 by default. Do those change, or will they just stop working until it goes back to the normal WAN?

Also, sorry if it's off topic, but do you have experience with USB dual WAN? If the MiFi's USB tethering gives a DHCP address etc., is it technically a USB Android device for the built-in Asus option?

Thanks!!
 
@Ranger802004 I tried installing the script, but it got stuck at "Getting System Settings...". I tried rebooting, etc., with the same result. I'm on an AC86U with 386.9. The first time it runs "Getting System Settings" I can see an init process hogging the CPU for a while; after a few seconds it stops, and then the terminal freezes. I can't even Ctrl+C.
 
@Ranger802004 I tried installing the script, but it got stuck at "Getting System Settings...". I tried rebooting, etc., with the same result. I'm on an AC86U with 386.9. The first time it runs "Getting System Settings" I can see an init process hogging the CPU for a while; after a few seconds it stops, and then the terminal freezes. I can't even Ctrl+C.
I need debug logging so I can see where it’s getting stuck. I’m working on a final beta revision so I’d like to include this.
 
@Ranger802004 ,

I am that edge case user - I am using wan-failover as part of a speedtest based failover...
Thanks for the option to disable the cron job. Works great,

With the previous release (1.6.1-beta3), I have a script that calls wan-failover switchwan when my primary (Starlink) drops to 11Mb or less. This switches to WAN1 (DSL, 12Mbit). I then sleep 10 and call wan-failover kill. This has been working fine.
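For readers who want to picture the wrapper described above, here is a minimal sketch under stated assumptions; it is not the poster's actual script. Only the `switchwan` and `kill` arguments and the 10 second sleep come from the post. The names `should_switch`, `failover_if_slow`, `THRESHOLD_MBPS`, `SETTLE_SECONDS` and `WANFAILOVER` are hypothetical, and the speed value is assumed to come from whatever speed test is already in use.

```shell
#!/bin/sh
# Hypothetical wrapper around the two wan-failover calls described in the post.
THRESHOLD_MBPS="${THRESHOLD_MBPS:-11}"                       # assumed name
SETTLE_SECONDS="${SETTLE_SECONDS:-10}"                       # assumed name
WANFAILOVER="${WANFAILOVER:-/jffs/scripts/wan-failover.sh}"  # assumed name

# should_switch SPEED_MBPS: succeed when the speed is at or below the threshold.
should_switch() {
  # awk compares as numbers, so decimal speeds such as 10.7 work too
  awk -v s="$1" -v t="$THRESHOLD_MBPS" 'BEGIN { exit !(s + 0 <= t + 0) }'
}

# failover_if_slow SPEED_MBPS
failover_if_slow() {
  if should_switch "$1"; then
    "$WANFAILOVER" switchwan   # switch to the other WAN
    sleep "$SETTLE_SECONDS"    # wait, as the post does with "sleep 10"
    "$WANFAILOVER" kill        # then stop wan-failover, as the post describes
  fi
}
```

A call such as `failover_if_slow "$measured_mbps"` would then do nothing above the threshold and run the switch sequence at or below it.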

With this beta, I must be doing something incorrect. I run wan-failover, and from the menu select config and disable the Cron Job (23).

I tried wan-failover initiate. The script hangs. I have done some debugging.

initiate calls systemcheck() which immediately calls getsystemparameters().
In my case, the script hangs in this area (in getsystemparameters):


Code:
while [ -z "${systemparameterssync+x}" ] >/dev/null 2>&1 || [[ "$systemparameterssync" == "0" ]] >/dev/null 2>&1;do
  if [ -z "${systemparameterssync+x}" ] >/dev/null 2>&1;then

Any ideas?
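A loop of this shape spins forever if the flag is never set. As a general illustration only, and not a patch for wan-failover, a wait loop can be given an attempt limit so that a hang surfaces as an error instead of a frozen terminal. The names `wait_for_sync`, `MAX_ATTEMPTS` and `RETRY_DELAY` are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch: retry a command until it succeeds, but give up after
# MAX_ATTEMPTS tries instead of looping forever.
MAX_ATTEMPTS="${MAX_ATTEMPTS:-5}"
RETRY_DELAY="${RETRY_DELAY:-1}"

# wait_for_sync COMMAND [ARGS...]
wait_for_sync() {
  attempts=0
  until "$@"; do
    attempts=$((attempts + 1))
    if [ "$attempts" -ge "$MAX_ATTEMPTS" ]; then
      echo "gave up after $attempts attempts" >&2
      return 1
    fi
    sleep "$RETRY_DELAY"   # pause between tries so the loop does not hog the CPU
  done
  return 0
}
```

Here the command passed in stands for whatever work is supposed to set the sync flag; it needs to return 0 once the work has succeeded.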
 
I need debug logging so I can see where it’s getting stuck. I’m working on a final beta revision so I’d like to include this.
1677855920484.png

1677856000163.png


And again, CTRL+C doesn't work on the active window.
1677856031939.png



Let me know if you need anything !!
 
@Ranger802004 ,

I am that edge case user - I am using wan-failover as part of a speedtest based failover...
Thanks for the option to disable the cron job. Works great,

With the previous release (1.6.1-beta3), I have a script that calls wan-failover switchwan when my primary (Starlink) drops to 11Mb or less. This switches to WAN1 (DSL, 12Mbit). I then sleep 10 and call wan-failover kill. This has been working fine.

With this beta, I must be doing something incorrect. I run wan-failover, and from the menu select config and disable the Cron Job (23).

I tried wan-failover initiate. The script hangs. I have done some debugging.

initiate calls systemcheck() which immediately calls getsystemparameters().
In my case, the script hangs in this area (in getsystemparameters):


Code:
while [ -z "${systemparameterssync+x}" ] >/dev/null 2>&1 || [[ "$systemparameterssync" == "0" ]] >/dev/null 2>&1;do
  if [ -z "${systemparameterssync+x}" ] >/dev/null 2>&1;then

Any ideas?
I just looked at the code, and yeah, mine might be stuck there too. When it's stuck, does Ctrl+C work?
 
