WANFailover Dual WAN Failover Script

rlj2 · May 22, 2022

@Ranger802004 I've noticed the logging doesn't seem to be working properly here. I've been looking through the scripts, and I'm not seeing why. For instance, my failback kicked in a few times last night during thunderstorms. The log showed switching to Wan1, but it never showed "Monitoring Wan0 for restoration", and it never showed when it switched back.

Ranger802004 · May 22, 2022

Did it actually switch, or is that not working either? Are you opening the log file or watching it with the tail command?

rlj2 · May 22, 2022

It actually switched, then would switch back a few minutes later. I used tail (which the log cleanup can screw up) and also opened the file. I've never seen it show "Monitoring Wan0 for restoration" or switching back to wan0. I unplugged my primary a few times to check.
I made an empty wan-event except for
#!/bin/sh
echo "$(date "+%D @ %T"): Wan$1 is now $2" >> /tmp/wanstats
Just so I can compare the 2 log files.
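
On the remark that log cleanup can upset tail: a minimal sketch of one likely cause, assuming the cleanup rewrites the log through an in-place edit or a temporary file. Either way the log ends up as a new file, so a plain "tail -f" keeps watching the old one. The temporary file and its contents below are stand-ins, not the script's real log.

```shell
# Demo: an in-place edit replaces the file that a plain "tail -f" is watching.
log="$(mktemp)"                      # stand-in for the real log file
printf 'line 1\nline 2\nline 3\n' > "$log"

inode_before="$(ls -i "$log" | awk '{print $1}')"
sed -i '1d' "$log"                   # trim the oldest line, as a cleanup job might
inode_after="$(ls -i "$log" | awk '{print $1}')"

if [ "$inode_before" != "$inode_after" ]; then
    echo "log file was replaced; follow it by name with tail -F instead"
fi
rm -f "$log"
```

"tail -F" follows the file name rather than the open file, so it picks up the replacement. Whether the router's BusyBox build includes -F is something to check rather than assume.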

Ranger802004 · May 22, 2022

Send me a full log readout so I can follow the flow of what it is doing exactly.

rlj2 · May 22, 2022

I should have saved it, but didn't, so it will be later. The wife is using the TV, so I can't play with the internet right now.

Ranger802004 · May 22, 2022

Lol, no worries. Go ahead and send me the full output of "nvram show | grep -e wan" as well, so I can see if anything else is sticking out. You can send that to my DM for privacy.

Ranger802004 · May 22, 2022

I have updated the script. Please reread the original post and if you have already set up the script, read my notes about deleting your set up and reinstalling with the new method, thank you!

Martinski · May 23, 2022

Ranger802004 said:
Link to Script:
https://raw.githubusercontent.com/Ranger802004/asusmerlin/main/wan-failover.sh

First, let me preface this post by saying that I wouldn't consider myself an expert shell script programmer. I first learned the C-Shell programming language during the early 1990s while in undergraduate school; I learned Bash programming later on as well. Over the decades, I have written many shell scripts for work & personal use, mainly to simplify common CLI terminal sessions, and to automate some regular routine tasks that happen often during my S/W development work. However, I don't write shell scripts frequently enough to consider myself an expert, so the following questions, suggestions & recommendations are made in the spirit of sharing some "nuggets of wisdom" that I've learned along the way; they are certainly not meant as destructive criticism of the work you've done, nor are they meant to diminish what you have accomplished with the script you have shared on this thread.

Usage of the "continue" command??

Bash:

# Set Script Mode
if [ -z "$(echo ${1#})" ] >/dev/null;then
...
...
else
mode="${1#}"
continue
fi

Bash:

if [[ ! -f "$CONFIGFILE" ]] >/dev/null;then
  echo -e "${RED}${0##*/} - No Configuration File Detected - Run Install Mode${NOCOLOR}"
else
continue
exit
fi

As shown above, I noticed that fairly often in your script you use the "continue" command within if-else statements, and I'm wondering what your reasons are for using it that way. AFAIK, the only purpose of the "continue" command is to skip the current iteration when executing *within* a loop (e.g. for, while, and until loops), and I've never seen it used within simple if-else statements so its inclusion seems superfluous & completely unnecessary to me. However, it's possible that I may be unaware of some other syntax forms where the "continue" command serves a purpose. Could you elaborate on what your reasons are (for my own edification)?
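
To illustrate the point, here is a small sketch of "continue" doing its job inside a loop, followed by the mode check written as a plain if/else without it. The set_mode helper and the "help" default are made up for the example and are not taken from the script. Note also that "${1#}" strips an empty prefix, so it expands to the same value as "$1".

```shell
# Inside a loop, "continue" skips to the next iteration.
odds=""
for n in 1 2 3 4 5; do
    if [ $((n % 2)) -eq 0 ]; then
        continue                # skip even numbers
    fi
    odds="$odds$n "
done
echo "odd numbers: $odds"

# Outside a loop there is nothing to skip, so a plain if/else is enough.
# set_mode is a hypothetical helper, not a function from the script.
set_mode() {
    if [ -z "$1" ]; then
        mode="help"             # assumed default when no argument is given
    else
        mode="$1"
    fi
}
set_mode run
echo "mode: $mode"
```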

In your script, you have created a custom function named "kill"

Bash:

# Kill Script
kill ()
{
echo -e "${RED}Killing ${0##*/}...${NOCOLOR}"
  echo "$(date "+%D @ %T"): $0 - Kill: Killing ${0##*/}..." >> $LOGPATH
sleep 3 && killall ${0##*/}
exit
}

It's generally considered bad coding practice to give a custom shell function exactly the same name as a native, built-in command already found in the underlying OS *unless* you are intentionally replacing the built-in command with your own version that improves upon or enhances the existing native command.
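
A short sketch of what the name clash looks like in practice, and of two ways around it: "command" to reach the real kill, or a distinct function name. The name kill_script is hypothetical and only used for illustration.

```shell
# A function named "kill" hides the real command for the rest of the script.
kill() {
    echo "custom kill called with: $*"
}
shadowed="$(kill -0 $$)"        # runs the function, not the real kill

# "command" skips function lookup and reaches the real kill.
# Signal 0 only checks that the process exists; $$ is this shell.
command kill -0 $$
real_status=$?

unset -f kill                   # remove the shadowing function again

# A distinct name avoids the clash entirely.
kill_script() {
    echo "would run: killall ${0##*/}"
}
```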

At one point in the code you have a statement where the script calls itself to create a cron job:

Bash:

# Create Initial Cron Jobs
sh "$0" cron

It's generally considered bad coding practice for a shell script to call itself (and generate another process) *unless* it's for performing some recursive operations that require a controlled repetitive execution of all or part of the script. It's much simpler to call the existing internal function from within the current execution instance of the script.

Bash:

# Create Initial Cron Jobs
cronjob
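
One caveat with the direct call, sketched below under the assumption that the function keeps its current shape: "cronjob()" ends with "exit", and an "exit" inside a function called in-process ends the whole script, not just the function. Using "return" keeps the caller running. The install function and the flag variables are placeholders for illustration only.

```shell
# cronjob as an in-process helper: "return" hands control back to the caller.
cronjob() {
    cron_checked=1              # placeholder for the real crontab logic
    return 0                    # "exit" here would end the whole script
}

# Hypothetical install routine that needs to keep going after the cron step.
install() {
    cronjob                     # direct call, no second process
    install_finished=1
}

install
echo "cron step done: $cron_checked, install finished: $install_finished"
```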

In the current "cronjob()" function you have an unnecessary 'else' statement:

Bash:

# Cronjob
cronjob ()
{
if [ -z "$(crontab -l | grep -e "$0")" ] >/dev/null; then
...
...
else
exit
fi
exit
}

Here's the modified version:

Bash:

# Cronjob
cronjob ()
{
   if [ -z "$(crontab -l | grep -e "$0")" ] >/dev/null; then
   ...
   ...
   fi
   exit
}
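
As a further possible simplification, the test can rely on grep's exit status instead of capturing its output. As far as I can tell, the ">/dev/null" after the closing bracket has no effect, since "[" prints nothing. The sketch below checks a temporary file instead of the live crontab so it can be run anywhere; has_entry and the sample schedule line are assumptions for the example.

```shell
# has_entry FILE TEXT: succeed if TEXT appears literally in FILE.
# -q suppresses output, -F treats TEXT as a fixed string, not a pattern.
has_entry() {
    grep -qF -- "$2" "$1"
}

tab="$(mktemp)"                 # stand-in for the output of "crontab -l"
printf '%s\n' '*/1 * * * * /jffs/scripts/wan-failover.sh run' > "$tab"

if has_entry "$tab" "/jffs/scripts/wan-failover.sh"; then found=yes; else found=no; fi
if has_entry "$tab" "/jffs/scripts/other.sh"; then other=yes; else other=no; fi
echo "wan-failover entry: $found, other entry: $other"
rm -f "$tab"
```

Against the real crontab the same idea would read: crontab -l | grep -qF -- "$0".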

Here is a modified version of the "logclean()" function which not only avoids creating a temporary file but also avoids doing all that work when the logfile has fewer lines than indicated by the LOGNUMBER variable:

Bash:

logclean ()
{
   if [ "$mode" != "logclean" ] || [ ! -f "$LOGPATH" ] ; then exit 1 ; fi

   local NumOfLogLines="$(wc -l "$LOGPATH" | awk '{print $1}')"
   if [ $NumOfLogLines -gt $LOGNUMBER ]
   then
       echo "${0##*/} - Deleting logs older than last $LOGNUMBER messages..."
       echo "$(date "+%D @ %T"): $0 - Log Cleanup: Deleting logs older than last $LOGNUMBER messages..." >> $LOGPATH

       sed -i 1,$((++NumOfLogLines - LOGNUMBER))d "$LOGPATH"
       sleep 1
   fi

   exit 0
}
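
The line arithmetic in the sed call can be checked on a throwaway file. This sketch uses 12 entries and a limit of 5, both made-up numbers, and spells out the "+ 1" that the "++" in the function accounts for: the cleanup message is appended before the trim, so the file is one line longer than the earlier count.

```shell
LOGNUMBER=5                     # assumed limit for the demo
LOGPATH="$(mktemp)"             # throwaway file, not the real log

i=1
while [ "$i" -le 12 ]; do
    echo "entry $i" >> "$LOGPATH"
    i=$((i + 1))
done

count=$(( $(wc -l < "$LOGPATH") ))      # arithmetic strips any padding from wc
if [ "$count" -gt "$LOGNUMBER" ]; then
    echo "cleanup message" >> "$LOGPATH"             # file now has count + 1 lines
    sed -i "1,$((count + 1 - LOGNUMBER))d" "$LOGPATH"
fi

remaining=$(( $(wc -l < "$LOGPATH") ))
first_line="$(head -n 1 "$LOGPATH")"
echo "lines kept: $remaining, oldest kept line: $first_line"
rm -f "$LOGPATH"
```

Note that "sed -i" on GNU and BusyBox still writes a temporary file behind the scenes and renames it over the original, so the saving is in script code rather than in disk activity.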

There are other code sections in the script that would probably benefit from further review and then applying more refactoring techniques, but some parts are not easy to follow because of the formatting style. Also, I think I might be getting on a topic that's beyond the scope of this thread, and perhaps even beyond the scope/purpose of this forum (if I'm not mistaken).

I would like, however, to give one last suggestion. I highly recommend that you reformat the script to include appropriate indentation not only to make it much more readable but also to visually provide clues to the reader as to the structure and logic flows in the script. This is not merely an aesthetic concern. When source code is formatted with proper indentation, it helps to follow much more easily & readily the different possible paths of execution and makes it easier to debug, change, remove, or add lines of code as needed while at the same time minimizing the risk of introducing new bugs. All this is conducive to better software quality. It's especially important when the code has statements placed at multiple nesting levels. The "wanstatus()" function is an example where if reformatted with proper indentation it would certainly make it much easier to follow the logic flows & execution paths found at the different nesting levels.

My 2 cents.

Ranger802004 · May 23, 2022

You have great suggestions, and I don't mind criticism. I'm working on this one solo, so I'm sure there are things I have overlooked. The unnecessary else/continue statements are just a bad habit on my end. This is still an early iteration, so I have definitely been taking feedback and making updates.

As for kill: yes, I wanted to use that name, because a regular kill would not actually kill the script and would instead keep spawning new instances to kill the old ones. Creating a function and changing it to killall ends all running instances of the script, as intended.

I will probably update the cron job call to go straight to the function. I think I copy-pasted that from the install line where it adds the call to the wan-event script and overlooked switching it back to a function.

Good call on the log clean. I definitely prefer it to do less work where possible, so having it do nothing when there are fewer than 1000 lines is an excellent suggestion.

Indentation is on my list, to make the script easier to read, along with cleaning up some of the techniques. This started out as a private script because I hated that WAN failover from ASUS didn't work properly, and then I saw a lot of people were in the same boat, so I figured I'd share. Thank you for taking the time to review my script and for coming up with some really good suggestions that will help improve its quality!

rlj2 · May 23, 2022

I'll mess with this later tonight, but the new script isn't working at all. It leaves the WAN0TARGET variables empty in the config. It also thinks it's always running unless I switch the grep from 1 > 5.
Right now I really cannot get it to do anything.

Also, in your help section, the descriptions for Kill and Cron need to be swapped.

Ranger802004 · May 23, 2022

rlj2 said:
I'll mess with this later tonight, but the new script isn't working at all. It leaves the WAN0TARGET variables empty in the config. It also thinks it's always running unless I switch the grep from 1 > 5.
Then it dies with

admin@RT-AX86U-D7D0:/jffs/scripts# ./wan-failover.sh run
wan-failover.sh - Run Mode
Checking if wan-failover.sh is already running...
RTNETLINK answers: File exists

Also, in your help section, the descriptions for Kill and Cron need to be swapped.

What are you inputting for WAN0TARGET when you install? Also do you still have an old instance of the script running after replacing the file? It could make it think it is already running if they have the same path. I’ll take a look at the descriptions and correct the mishap.

rlj2 · May 23, 2022

I'm entering the addresses it showed, and nothing else is running.

This also kills the script with an error:
admin@RT-AX86U-D7D0:/jffs/scripts# ./wan-failover.sh run
wan-failover.sh - Run Mode
RTNETLINK answers: File exists

Once I temporarily removed the check for whether the script is already running, added the WAN targets, and let it continue past the error from creating a route that already existed, it seems to be working, in manual mode at least.

admin@RT-AX86U-D7D0:/jffs/scripts# ./wan-failover.sh run
wan-failover.sh - Run Mode
RTNETLINK answers: File exists

Done.
Switching to wan1

Done.

Done.

Done.

Done.
RTNETLINK answers: File exists

Done.
RTNETLINK answers: File exists
RTNETLINK answers: File exists
RTNETLINK answers: File exists
RTNETLINK answers: File exists
Switching to wan0

Done.

Done.

Done.

Done.
RTNETLINK answers: File exists
RTNETLINK answers: File exists

Ranger802004 · May 23, 2022

When you get a chance can you send me your log output?

rlj2 · May 23, 2022

I just killed the log again, but the script died at
05/23/22 @ 10:11:31: /jffs/scripts/wan-failover.sh - WAN Status: Creating route 100.64.0.1 via 100.64.0.1 dev eth0...
All I did was add "|| true" at the end of the ip route lines so it didn't error.
I should be busy working, but I keep bouncing back and forth.

Ranger802004 · May 23, 2022

Try using an external IP address such as Google's, 8.8.8.8 / 8.8.4.4, and let me know your results.

rlj2 · May 23, 2022

Where would I put that?

Ranger802004 · May 23, 2022

You can enter it when you run the install and it prompts for the Target IP Address for WAN0 / WAN1, or you can simply open the config file at /jffs/configs/wan-failover.conf, update the following lines, and kill the current instances of the script so it restarts with the new config:

Code:

WAN0TARGET=8.8.8.8
WAN1TARGET=8.8.4.4
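
For anyone who prefers to script that edit, here is a rough sketch. It assumes the config file holds plain KEY=VALUE lines like the two above; set_conf is a hypothetical helper and not part of wan-failover.sh. The demo works on a temporary copy, not on the real file.

```shell
# set_conf FILE KEY VALUE: rewrite an existing KEY=... line, or append one.
# Assumes VALUE contains no "|" or "&", which sed would treat specially.
set_conf() {
    if grep -q "^$2=" "$1"; then
        sed -i "s|^$2=.*|$2=$3|" "$1"
    else
        echo "$2=$3" >> "$1"
    fi
}

conf="$(mktemp)"                # stand-in for /jffs/configs/wan-failover.conf
printf 'WAN0TARGET=\n' > "$conf"

set_conf "$conf" WAN0TARGET 8.8.8.8     # existing (empty) entry is rewritten
set_conf "$conf" WAN1TARGET 8.8.4.4     # missing entry is appended
result="$(cat "$conf")"
echo "$result"
rm -f "$conf"
```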

rlj2 · May 23, 2022

Doesn't seem to cause any errors with that.

Ranger802004 · May 23, 2022

Yeah, I think the problem is that the route wouldn't add because there is already a route for your gateway as the default, and you were trying to use the gateway as the target IP to monitor. I'll expand on this check in a later release.
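
A rough sketch of what such a check could look like. This is a guess at the idea and not the script's actual code: add_target_route is a hypothetical helper, and skipping the route when the target is the gateway is an assumption about the desired behaviour. It also assumes the router's ip command supports "route replace", which creates or overwrites a route and so does not fail with "File exists" the way "route add" does.

```shell
# add_target_route TARGET GATEWAY DEV
# Pins the monitoring target to one WAN without failing on an existing route.
add_target_route() {
    target="$1"; gateway="$2"; dev="$3"
    if [ "$target" = "$gateway" ]; then
        # The gateway is already reachable on its own link; nothing to add.
        echo "skip: $target is the gateway itself"
        return 0
    fi
    # "replace" creates the route or overwrites an existing one.
    ip route replace "$target" via "$gateway" dev "$dev"
}
```

With the values from the log above, add_target_route 100.64.0.1 100.64.0.1 eth0 would skip the route, while add_target_route 8.8.8.8 100.64.0.1 eth0 would install it.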

rlj2 · May 23, 2022

This may be better in my situation anyway; I'll have to watch. My primary is Starlink. Sometimes it will show as connected, but Starlink's ground infrastructure has issues a lot, i.e. no internet past the PoP.
