WANFailover Dual WAN Failover Script

Nope it didn't. It remained on 150/15.
I made a slight tweak to the logic for this in beta14 but I added debug logging so if the problem persists please enable debug logging and collect the data for me, thank you.
 
v1.5.6-beta14 Release: ***Disclaimer: This is a beta release and has not been tested***

Manually upgrade to this beta by running the following command: ***Allow for cronjob to relaunch the script***
Clean installation:
Code:
/usr/sbin/curl -s "https://raw.githubusercontent.com/Ranger802004/asusmerlin/main/wan-failover_v1.5.6-beta14.sh" -o "/jffs/scripts/wan-failover.sh" && chmod 755 /jffs/scripts/wan-failover.sh && sh /jffs/scripts/wan-failover.sh install

Upgrade from previous installation:
Code:
/usr/sbin/curl -s "https://raw.githubusercontent.com/Ranger802004/asusmerlin/main/wan-failover_v1.5.6-beta14.sh" -o "/jffs/scripts/wan-failover.sh" && chmod 755 /jffs/scripts/wan-failover.sh && sh /jffs/scripts/wan-failover.sh restart

To revert back to Production Release:
Code:
/jffs/scripts/wan-failover.sh update

Beta Readme

***WARNING*** There are some major changes from v1.5.6-beta9 so if you experience issues please collect debug logs and forward to me via DM!

***WARNING*** If you are using an RT-AX88U, read release notes!


***HIGHLIGHT*** Script will now send emails in Failover Mode if the Primary or Secondary WAN fails or is disabled.

***HIGHLIGHT*** Script will now create an alias as "wan-failover", once script is updated and restarted. Consoles can now use the new alias instead of the full script path "/jffs/scripts/wan-failover.sh". Consoles open while the script is updated may need to be restarted or the following command executed.

Code:
source /jffs/configs/profile.add

Release Notes:
v1.5.6-beta14
Enhancements:
- General optimization
- Added a confirmation prompt to Restart Mode.
- Load Balance Monitor now triggers Service Restart function during failover events.
- Target IPs for both interfaces can now be the same Target IP.
- Added Recursive Ping Check feature. If packet loss is not 0% during a check, the Target IP Addresses will be checked again, up to the number of iterations specified by this setting, before determining a failure or packet loss. RECURSIVEPINGCHECK (Value is in # of iterations). Default: 1
- Moved WAN0_QOS_OVERHEAD, WAN1_QOS_OVERHEAD, WAN0_QOS_ATM, WAN1_QOS_ATM, BOOTDELAYTIMER, PACKETLOSSLOGGING and WANDISABLEDSLEEPTIMER to Optional Configuration. They are no longer required to be set during Config or Installation and will be given Default values that can be modified in the Configuration file.
- Created new Optional Configuration option to specify the ping packet size. PACKETSIZE specifies the packet size in Bytes. Default: 56 Bytes.
- Load Balance Mode will now dynamically update resolv.conf (DNS) for Disconnected WAN Interfaces.
- Created Alias for script as wan-failover to shorten length of commands used in console.
- Enhanced WAN Disabled Logging, will relog every 5 minutes the condition causing the script to be in the Disabled State.
- Added additional logging throughout script.
- Added cleanup function for when script exits to perform cleanup tasks.
- Service Restarts now include restarting enabled OpenVPN Server Instances.
- Target IP Rules will now compensate for the RT-AX88U however this can create conflicts if the Target IPs are the same or are used for other services/scripts.
- Switch WAN Mode will now prompt for confirmation before switching.
- Script will now reset VPNMON-R2 if it is installed and running during Failover
- Enhanced Ping Monitoring to improve failure/packet loss detection time as well as failure and restoration logging notifications.
- An email notification will now be sent if the Primary or Secondary WAN fails or is disabled while in Failover Mode.
- If IPv6 6in4 Service is being used, wan6 service will be restarted during failover events.
- Updated Monitor Mode to dynamically search multiple locations for the System Log Path, for example when Scribe is installed.
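
The Recursive Ping Check and PACKETSIZE entries above can be pictured with a minimal sketch. This is not the code from wan-failover.sh: the function names and the PING_CMD hook are hypothetical, the ping flags (-c, -W, -s) and the summary-line format are assumed to match BusyBox or iputils ping, and RECURSIVEPINGCHECK is read here as the total number of attempts, which the release notes do not spell out.

```shell
#!/bin/sh
# Sketch only: names other than RECURSIVEPINGCHECK and PACKETSIZE are hypothetical.
RECURSIVEPINGCHECK="${RECURSIVEPINGCHECK:-1}"   # iterations (default from the release notes)
PACKETSIZE="${PACKETSIZE:-56}"                  # ping payload size in bytes (default from the release notes)
PING_CMD="${PING_CMD:-ping}"                    # hypothetical hook so ping can be stubbed

# Print the packet-loss percentage found in a ping summary line,
# e.g. "3 packets transmitted, 3 packets received, 0% packet loss".
parse_loss() {
    printf '%s\n' "$1" | sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Return 0 as soon as one iteration reports 0% loss.
# Return 1 only after every iteration reported loss (or no usable output).
check_target() {
    target="$1"
    i=1
    while [ "$i" -le "$RECURSIVEPINGCHECK" ]; do
        out="$("$PING_CMD" -c 3 -W 1 -s "$PACKETSIZE" "$target" 2>/dev/null)"
        loss="$(parse_loss "$out")"
        [ "$loss" = "0" ] && return 0
        i=$((i + 1))
    done
    return 1
}
```

With RECURSIVEPINGCHECK=3, a single lossy check would be retried up to two more times before `check_target 1.1.1.1` reports a failure, which trades slower detection for fewer false alarms.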

Fixes:
- Fixed visual bugs when running Restart Mode.
- YazFi trigger during service restart will no longer run process in the background to prevent issues with script execution of YazFi.
- IP Rules should no longer create conflict with other scripts such as VPNMON unless using the RT-AX88U.
- Resolved issues that prevented 4G USB Devices from properly working in Failover Mode.
- Resolved issue where script would loop from WAN Status to Load Balance Monitor when an interface was disabled.
- Fixed Cron Job deletion during Uninstallation.
- Corrected issue with Failure Detected log not logging if a device was unplugged or powered off from the Router while in Failover Mode.
- Modified Restart Mode logic to better detect PIDs of running instances of the script.
- Fixed issue with USB Devices: if the device is unplugged and plugged back in, the script will now leave Disabled State and go back to WAN Status.
- Email function will check if DDNS is enabled before attempting to use saved DDNS Hostname
- Fixed issue in DNS Switch in Load Balance Mode where WAN1 was using the Status of WAN0.
- Fixed issue where Switch WAN Mode would fail due to missing Status parameters acquired in Run or Manual Mode.
- Fixed issue where WAN Interface would not come out of Cold Standby during WAN Status Check.
- If an amtm email alert fails to send, an email attempt will be made via AIProtection Alerts if properly configured.
- Fixed issue in Load Balance Mode when a Disconnected WAN Interface would cause WAN Failover to error and crash when creating OpenVPN rules when OpenVPN Split Tunneling is Disabled.
- Fixed issue where QoS settings would not apply during WAN Switch
Everything is excellent.
 
Try out beta14 for this issue and report back :)
It should work for me too, but I have only syslog-ng without Scribe, so it is not working.
Could you please add another else branch to the IF for my case?
The log file location is the same, /opt/var/log/messages, but the location to check in the IF is:
/opt/sbin/syslog-ng

Code:
_B88X_>140159/opt/sbin#:ls | grep syslog-ng
-rwxr-xr-x    1 root       14.8K Apr  2 18:00 syslog-ng
-rwxr-xr-x    1 root       24.0K Apr  2 18:00 syslog-ng-ctl
-rwxr-xr-x    1 root       50.5K Apr  2 18:00 syslog-ng-debun

Later edit: or, because syslog-ng is included in Scribe, keep only one IF for syslog-ng.

Thank you very much,
amplatfus
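
The check requested in this post could look roughly like the sketch below. The function name and the optional root-prefix argument are hypothetical, and /tmp/syslog.log as the stock firmware log location is an assumption that does not come from this thread; only /opt/sbin/syslog-ng and /opt/var/log/messages are taken from the post.

```shell
#!/bin/sh
# Sketch only: find_syslog is a hypothetical helper, not part of wan-failover.sh.
# Optional $1 is a root prefix, which makes the lookup easy to try in a scratch directory.
find_syslog() {
    root="${1:-}"
    if [ -x "$root/opt/sbin/syslog-ng" ] && [ -f "$root/opt/var/log/messages" ]; then
        # Entware syslog-ng present (with or without Scribe): use its log file.
        printf '%s\n' "$root/opt/var/log/messages"
    elif [ -f "$root/tmp/syslog.log" ]; then
        # Assumed stock firmware location.
        printf '%s\n' "$root/tmp/syslog.log"
    else
        return 1
    fi
}
```

Testing for the syslog-ng binary rather than for Scribe covers both setups with one branch, which is the simplification suggested in the later edit.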
 
Is this the default location from the firmware or do you have something else installed to have it there?
 
@Stephen Harrington and @Ranger802004

Was just having a look at the code for QoS settings:

Code:
[[ "$(nvram get qos_obw)" -gt "1024" ]] && logger -p 4 -st "${0##*/}" "WAN Switch - QoS Settings: Upload Bandwidth: $(($(nvram get qos_obw)/1024))Mbps" || logger -p 4 -st "${0##*/}" "WAN Switch - QoS Settings: Upload Bandwidth: $(nvram get qos_obw)Kbps"
[[ "$(nvram get qos_ibw)" -gt "1024" ]] && logger -p 4 -st "${0##*/}" "WAN Switch - QoS Settings: Download Bandwidth: $(($(nvram get qos_ibw)/1024))Mbps" || logger -p 4 -st "${0##*/}" "WAN Switch - QoS Settings: Download Bandwidth: $(nvram get qos_ibw)Kbps"

Right after this, a couple of IF/ELSE checks could be added: IF the updated download OR upload bandwidth is greater than 300, THEN disable QoS, ELSE enable it. With Cake, anything above 250/300 will limit the connection to about 250/300. Note that with QoS/Cake off, you also get runner and hw accel re-enabled.

Might need a Global variable for users who want to keep QoS disabled regardless.

Thanks again for the awesome work.
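
The IF/ELSE suggested in this post, including the global override, might be sketched as follows. Everything here is hypothetical: the variable and function names are invented for illustration, the 300 threshold is assumed to be in Mbps, and the inputs are assumed to be in Kbps because the logging code quoted above divides qos_ibw and qos_obw by 1024 to print Mbps. The sketch only prints a decision and deliberately leaves out how QoS would actually be switched.

```shell
#!/bin/sh
# Sketch only: all names below are hypothetical, not settings of wan-failover.sh.
QOS_MAX_MBPS="${QOS_MAX_MBPS:-300}"              # assumed threshold in Mbps
QOS_FORCE_DISABLED="${QOS_FORCE_DISABLED:-0}"    # 1 = keep QoS off regardless of bandwidth

# Usage: qos_decision DOWNLOAD_KBPS UPLOAD_KBPS
# Prints "disable" or "enable".
qos_decision() {
    down_kbps="$1"
    up_kbps="$2"
    limit_kbps=$((QOS_MAX_MBPS * 1024))
    if [ "$QOS_FORCE_DISABLED" = "1" ]; then
        echo disable
    elif [ "$down_kbps" -gt "$limit_kbps" ] || [ "$up_kbps" -gt "$limit_kbps" ]; then
        echo disable
    else
        echo enable
    fi
}
```

On the router this could be called as `qos_decision "$(nvram get qos_ibw)" "$(nvram get qos_obw)"` right after the two logging lines, so a 150/15 line would print `enable`.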
 
Is this the default location from the firmware or do you have something else installed to have it there?
It is an Entware package installed in the default Entware folder, like this:

Code:
_B88X_>194059/tmp/home/root#:opkg install syslog-ng
Package syslog-ng (3.37.1-1) installed in root is up to date.
_B88X_>194125/tmp/home/root#:cd /tmp/mnt/Disk/entware/sbin/

_B88X_>194156/tmp/mnt/Disk/entware/sbin#:ls | grep syslog-ng
-rwxr-xr-x    1 root       14.9K Jul 21 12:14 syslog-ng
-rwxr-xr-x    1 root       24.0K Jul 21 12:14 syslog-ng-ctl
-rwxr-xr-x    1 root       50.6K Jul 21 12:14 syslog-ng-debun
Scribe uses syslog-ng too, which is why the logs are kept in the same /opt/var/log/messages.

Thank you!
 
I published a revision to v1.5.6-beta14, reinstall from the same link and your version should report as v1.5.6-beta14b.

Changes:
- QoS Settings will set to Automatic Settings if WAN Interface Settings are set to 0 in configuration File.
- Updated Monitor Mode to dynamically search for Entware syslog-ng package if installed.
- Fixed issue where if WAN1 was connected but failing ping, the script would loop back and forth from WAN Status to WAN Disabled.
 

Yes, 8 means all. I had 7 (debug), and after changing to 8 (all) like you, it is working for me too.
Code:
Aug 12 23:54:38 src@B88X wan-failover.sh: Debug - Script Mode: cron
Aug 12 23:54:38 src@B88X wan-failover.sh: Debug - Function: cronjob
Aug 12 23:54:38 src@B88X wan-failover.sh: Cron - Creating Cron Job
Aug 12 23:54:38 src@B88X wan-failover.sh: Cron - Created Cron Job
Aug 12 23:54:56 src@B88X wan-failover.sh: Debug - Script Mode: kill
Aug 12 23:54:56 src@B88X wan-failover.sh: Debug - Function: killscript
Aug 12 23:54:56 src@B88X wan-failover.sh: Debug - Calling CronJob to delete jobs
Aug 12 23:54:56 src@B88X wan-failover.sh: Debug - Function: cronjob
Aug 12 23:54:56 src@B88X wan-failover.sh: Cron - Removing Cron Job
Aug 12 23:54:56 src@B88X wan-failover.sh: Cron - Removed Cron Job
Aug 12 23:54:56 src@B88X wan-failover.sh: Kill - Killing wan-failover.sh
Aug 12 23:55:15 src@B88X wan-failover.sh: Debug - Script Mode: cron
Aug 12 23:55:15 src@B88X wan-failover.sh: Debug - Function: cronjob
Aug 12 23:55:15 src@B88X wan-failover.sh: Cron - Creating Cron Job
Aug 12 23:55:15 src@B88X wan-failover.sh: Cron - Created Cron Job
Thank you all :)
 
Hmmmm, that means syslog-ng doesn't use the built-in priority queues. I assign -p 6 for debug, for example, and -p 2 for errors, and that works perfectly for default system logging, but with Scribe or the Entware package it doesn't seem to filter those out.
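
For anyone reading the numbers in this exchange: in standard syslog numbering (RFC 5424), lower values are more severe, so 2 is "crit", 6 is "info" and 7 is "debug". The helper below is a hypothetical lookup for those names only. How the router's log-level setting or syslog-ng filters on them is not shown in this thread, so the sketch makes no claim about that.

```shell
#!/bin/sh
# Sketch only: severity_name is a hypothetical helper.
# Maps a numeric syslog severity (as passed to "logger -p N" in the script) to its RFC 5424 name.
severity_name() {
    case "$1" in
        0) echo emerg ;;
        1) echo alert ;;
        2) echo crit ;;
        3) echo err ;;
        4) echo warning ;;
        5) echo notice ;;
        6) echo info ;;
        7) echo debug ;;
        *) echo unknown; return 1 ;;
    esac
}
```

For example, `severity_name 4` prints `warning`, which is the priority used by the QoS logging lines quoted earlier in the thread.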
 
Published some minor tweaks to v1.5.6-beta14; the reported version will show as v1.5.6-beta14c. I think this should cover everything and be ready for production.
 
In general, all my tests worked out great. Even the email alerts came in as expected: one for the problem, one for the recovery! Super! Enough, I am going to sleep ;)
 
That’s what I have been spending most of my time on: improving the email alerts and avoiding duplicate sends.
 
I'm still getting a loop when clean installing the latest beta14:


My WAN1 is an LTE USB stick with a SIM card.
Any idea why your WAN1 can't ping out? I see where it restarts it to get it into State 2 "Connected" but it can't ping the target IP you set.

Aug 13 15:10:02 wan-failover.sh: WAN Status - Restarting wan1
Aug 13 15:10:02 rc_service: service 8834:notify_rc restart_wan_if 1
Aug 13 15:10:02 custom_script: Running /jffs/scripts/service-event (args: restart wan_if)
Aug 13 15:10:02 custom_script: Running /jffs/scripts/wan-event (args: 1 stopping)
Aug 13 15:10:02 custom_script: Running /jffs/scripts/wan-event (args: 1 init)
Aug 13 15:10:02 wan-failover.sh: Debug - Script Mode: cron
Aug 13 15:10:02 wan-failover.sh: Debug - Function: cronjob
Aug 13 15:10:03 custom_script: Running /jffs/scripts/wan-event (args: 1 init)
Aug 13 15:10:03 custom_script: Running /jffs/scripts/wan-event (args: 1 connecting)
Aug 13 15:10:03 wan-failover.sh: Debug - Script Mode: cron
Aug 13 15:10:03 wan-failover.sh: Debug - Function: cronjob
Aug 13 15:10:03 wan-failover.sh: Debug - Script Mode: cron
Aug 13 15:10:03 wan-failover.sh: Debug - Function: cronjob
Aug 13 15:10:03 wan-failover.sh: Debug - Script Mode: cron
Aug 13 15:10:03 wan-failover.sh: Debug - Function: cronjob
Aug 13 15:10:03 custom_script: Running /jffs/scripts/service-event-end (args: restart wan_if)
Aug 13 15:10:03 custom_script: Running /jffs/scripts/wan-event (args: 1 disconnected)
Aug 13 15:10:03 custom_script: Running /jffs/scripts/wan-event (args: 1 stopped)
Aug 13 15:10:03 wan-failover.sh: Debug - Script Mode: cron
Aug 13 15:10:03 wan-failover.sh: Debug - Function: cronjob
Aug 13 15:10:03 wan-failover.sh: Debug - Script Mode: cron
Aug 13 15:10:03 wan-failover.sh: Debug - Function: cronjob
Aug 13 15:10:03 custom_script: Running /jffs/scripts/wan-event (args: 1 connected)
Aug 13 15:10:03 custom_script: Running /jffs/scripts/firewall-start (args: eth7)
Aug 13 15:10:03 wan: finish adding multi routes
Aug 13 15:10:03 dhcp_client: bound 192.168.8.100/255.255.255.0 via 192.168.8.1 for 86400 seconds.
Aug 13 15:10:03 wan-failover.sh: Debug - Script Mode: cron
Aug 13 15:10:03 wan-failover.sh: Debug - Function: cronjob
Aug 13 15:10:03 wan-failover.sh: Debug - wan1 Post-Restart State: 2

Aug 13 15:10:33 wan-failover.sh: Debug - wan1 Packet Loss: 100%%
Aug 13 15:10:33 wan-failover.sh: WAN Status - wan1 has 100% packet loss ***Verify 1.1.1.1 is a valid server for ICMP Echo Requests***
 
I guess the script does "not properly start WAN1".
My secondary WAN (WAN1) is just on hot-standby instead of connected.
I also tried other ping targets
 
According to the logs it's in State 2, which is Connected. Dual WAN Mode shows this as "Hot-Standby", as opposed to "Cold-Standby", when the interface is the Secondary WAN and not the Primary, so that is correct. The problem is the inability to ping 1.1.1.1 out of WAN1.
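
One way to narrow this down by hand is to ping the target while bound to the WAN1 interface. The sketch below is only an illustration: the function name and the PING_CMD hook are hypothetical, the -I, -c and -W flags are assumed to behave as in BusyBox or iputils ping, and eth7 is assumed to be the WAN1 interface because it appears in the firewall-start line of the log above.

```shell
#!/bin/sh
# Sketch only: wan_ping_test is a hypothetical helper, not part of wan-failover.sh.
PING_CMD="${PING_CMD:-ping}"   # hypothetical hook so ping can be stubbed

# Usage: wan_ping_test IFACE TARGET
# Relies on ping's exit status, which is non-zero when no reply was received,
# so "reachable" means at least one of the three probes was answered.
wan_ping_test() {
    iface="$1"
    target="$2"
    if "$PING_CMD" -I "$iface" -c 3 -W 2 "$target" >/dev/null 2>&1; then
        echo "$iface: $target reachable"
    else
        echo "$iface: $target unreachable"
        return 1
    fi
}
```

Running `wan_ping_test eth7 1.1.1.1` and then the same against another target would help tell a blocked ICMP target apart from a WAN1 link that cannot route out at all.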
 
