WANFailover Dual WAN Failover Script

OK, here is the output:


And:


Thank you for your patience, but I give up now.
It's cut off here:
Aug 13 17:30:21 wan-failover.sh: WAN Status - Adding default route for wan1 Routing Table via 192.168.8.1 dev eth7
Aug 13 17:30:21 wan-failover.sh: WAN Status - Added default route for wan1 Routing Table via 192.168.8.1 dev eth7
Aug 13 17:30:21 wan-failover.sh: Debug - Checking wan1 for IP Rule to 9.9.9.9
Aug 13 17:30:21 wan-failover.sh: WAN Status - Adding IP Rule for 9.9.9.9 to monitor wan1
Aug 13 17:30:21 wan-failover.sh: WAN Status - Added IP Rule for 9.9.9.9 to monitor wan1
Aug 13 17:30:21 wan-failover.sh: Debug - Recursive Ping Check: 1
Aug 13 17:30:21 wan-failover.sh: Debug - Checking wan1 for packet loss via 9.9.9.9 - Attempt: 1
 
Need logs please
I'll update in the morning and send logs. Rolled back to vanilla beta14 and no more loops!

Sorry @Ranger802004 - gonna shut it down now.

Back to 14b and still no loops. So hope that helps in tracking this down.
 

NOOOOO !!

"You'll have pry my Beta 14e from my Cold Dead Hands" :eek:

Beta 14b was a regression for me, due to issues with not detecting my 4G LTE USB stick removal/replacement (Secondary WAN).
The latest Beta 14 d/e iterations fixed this, are VERY usable and address 95% of my (very) minor issues ...
 
Where is it looping? I really need to see the debug logging to determine what is going on with your situation.
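For anyone gathering the debug logging requested above, here is a minimal sketch for pulling only the script's entries out of a syslog file. The function names are made up for illustration, and the default path /tmp/syslog.log is an assumption about where your router keeps its log, so pass your own path if it differs.

```shell
#!/bin/sh
# Print only the wan-failover.sh entries from a syslog file.
# The default path is an assumption; pass the real location as $1.
failover_log() {
    logfile="${1:-/tmp/syslog.log}"
    grep -F 'wan-failover.sh:' "$logfile"
}

# Count how often the ping check was started. A number that keeps climbing
# within a short time span is a rough hint that the script is looping.
ping_check_count() {
    failover_log "$1" | grep -c 'Recursive Ping Check'
}
```

Something like `failover_log /tmp/syslog.log > /tmp/wan-failover-debug.txt` then gives a file that can be attached to a post.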
 
Aug 14 10:21:00 wan-failover.sh: System Check - ***386.8 is not supported, issues may occur from running this version***

Works great on this version. I think you can add it to the supported ones.
 
That log is just a warning so I'll allow more time for proper testing first.
 
Good day,

Here you go.

- Was running 14b overnight as mentioned.
- Logged on around 7:57
- Did a restart of the service with 14e:

Code:
Aug 14 08:00:00 router wan-failover.sh: System Check - Version: v1.5.6-beta14e

- I don't think it should take over 4 mins to restart
- Looped until I reverted back to 14b:

Code:
Aug 14 08:04:00 router wan-failover.sh: System Check - Version: v1.5.6-beta14b

- This restart stabilized in about 12 seconds

Note: 14b doesn't look like it is applying/logging the QoS settings as v13 once did, but it isn't looping my logger, so I'll pause here until we can figure out what has changed and the root cause.

Logs: https://1drv.ms/t/s!Agv7JzQ1nx3W6yKOR-zLr4yLEgEg?e=i8nO0w (stuck it on OneDrive for now).

Hope it helps, and sorry about the delay.
It looks like your WAN1 interface is unable to ping the WAN1 Target IP even on 14b. I'm assuming 14e is looping because it goes to WAN Disabled, sees that both interfaces are Enabled, in State 2, and plugged in, so it attempts to ping and then returns to WAN Status to make sure the IP Rule and Routes are in place. The differences between 14b and 14e won't affect whether you can ping; that is a different issue altogether.
 
WAN1 interface is unable to ping the WAN1 Target IP

Well WAN1 is in Hot Standby. I tested and it can ping the Target but not in Standby.

[Attachments: 1660511216224.png, 1660511306637.png, 1660511425964.png]


As per the logs. I can install v13 and see if that behaviour changes if that helps. Thoughts appreciated.
 
I'm assuming 14e is looping because it goes to WAN Disabled, sees that both interfaces are Enabled, in State 2, and plugged in, so it attempts to ping and then returns to WAN Status to make sure the IP Rule and Routes are in place. The differences between 14b and 14e won't affect whether you can ping; that is a different issue altogether.

Ok cool, so two issues but one affecting the other, thus the loop. I'll track separately, with looping I guess being priority. Any tests you would like me to do, let me know.

Cheers and thanks again.
 
Yeah, I figured so; if the WAN interface wasn't in hot-standby, the script wouldn't try to ping it. What model router do you have and what firmware are you on? What is your WAN1 Target IP? While the script is looping on 14e, try adding this rule and see if it starts acting correctly.
Replace 8.8.4.4 with your WAN1 Target IP if it is different:
Code:
ip rule add from all iif lo to 8.8.4.4 lookup 200 priority 100
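
To check whether a rule for the target is actually in place after adding it, something like the following could help. It is only a sketch: `rules_for_target` is a made-up helper name, it simply filters the output of `ip rule show`, and the exact output format can vary between iproute2 builds (some systems print a table name instead of the number after `lookup`).

```shell
#!/bin/sh
# Print every rule from an `ip rule show` listing (read from stdin) whose
# "to" selector equals the given IP address.
# Exit status is non-zero when no rule matches.
rules_for_target() {
    awk -v ip="$1" '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "to" && $(i + 1) == ip) { print; found = 1; break }
            }
        }
        END { exit (found ? 0 : 1) }'
}
```

Run it as `ip rule show | rules_for_target 8.8.4.4` and check that the line it prints points at the WAN1 table. If the test rule doesn't help, it can be removed again by repeating the same selectors with `ip rule del` in place of `ip rule add`.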
 
LOL - no need to, but appreciated!
No worries lol, just report back and tell me whether that clears up the looping problem and allows your Failover to actually start actively monitoring. If so, it is quite apparent that several routers have issues with IP rules when an outbound interface is specified. If your results confirm my thought on that, I'm going to update the logic to test with one rule, then add the other and test again.
 
If so, it is quite apparent that several routers have issues with IP rules when an outbound interface is specified. If your results confirm my thought on that, I'm going to update the logic to test with one rule, then add the other and test again.
Guess it didn't so on to other options :D
 
Didn't correct the loop. Bouncing back down again.

1. Confirm the WAN 1 Routing table has a default route.
Code:
ip route show default table 200

2. Try turning the Recursive Ping Check setting up from 1 to, let's say, 3 and restart the script.

3. If the route is there and step 2 doesn't work, try adding this test route:
Code:
ip route add 8.8.4.4 via $(nvram get wan1_gateway) dev $(nvram get wan1_gw_ifname)
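
Steps 1 and 3 above can be wrapped in two small helpers. This is a rough sketch, not part of wan-failover.sh: the function names are hypothetical, the first one only parses the text printed by `ip route show default table 200`, and the second only prints the test-route command, so that an empty nvram value is caught before anything is run.

```shell
#!/bin/sh
# Print "GATEWAY DEVICE" for the first default route that has both a "via"
# and a "dev" field in an `ip route show default table N` listing on stdin.
# Exit status is non-zero when no such route is listed (routes without a
# gateway, such as some point-to-point links, are not reported).
default_route_info() {
    awk '$1 == "default" {
             gw = ""; dev = ""
             for (i = 2; i < NF; i++) {
                 if ($i == "via") gw = $(i + 1)
                 if ($i == "dev") dev = $(i + 1)
             }
             if (gw != "" && dev != "") { print gw, dev; found = 1; exit }
         }
         END { exit (found ? 0 : 1) }'
}

# Echo the test-route command for a target IP, refusing to build it when
# the target, gateway, or interface is empty.
test_route_cmd() {
    target="$1"; gw="$2"; dev="$3"
    if [ -z "$target" ] || [ -z "$gw" ] || [ -z "$dev" ]; then
        echo "missing target, gateway or interface" >&2
        return 1
    fi
    echo "ip route add $target via $gw dev $dev"
}
```

For example, `ip route show default table 200 | default_route_info` covers step 1, and `test_route_cmd 8.8.4.4 "$(nvram get wan1_gateway)" "$(nvram get wan1_gw_ifname)"` prints the command from step 3 for review before it is run. The test route can be taken out again with the same arguments and `ip route del`.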
 