VPN Failover script

Thank you @Martineau this script is exactly what I was looking for and works perfectly!
Just a side question: if you use the multiconfig option, where you list servers/ports in /jffs/configs/VPN_Failover, how do you handle the OpenVPN configuration files (certs, settings, authorization, etc.) that are usually uploaded for the 5 WebUI clients?

The 'multiconfig' option was designed to be used for round-robin selection from a list of OpenVPN servers (i.e. UDP/TCP sockets) provided by the same VPN ISP.

Code:
# Tricky business of restoring NVRAM and the three cert files 'ca,crt and key'  # v1.12??
# Should the script save a working VPN Client NVRAM config when it requests and successfully starts a VPN Client?
#       nvram show 2>/dev/null | grep vpn_client5_ | sort >/jffs/openvpn/VPN_Failover/vpn_client5_NVRAM
#    or
#       ./nvram-save.sh -i ${THIS_VPN}ovpn.ini -nojffs -nouser
#
#
# Read the NVRAM vars and restore 'em...       # v1.12??
Restore_VPN_NVRAM ${THIS_VPN}_ovpn.sh

# Since '/jffs/openvpn' holds the 'live' keys, use '/jffs/openvpn/VPN_Failover' to restore them
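if [ ! -d /jffs/openvpn/VPN_Failover ]; then
 # First run: create the backup directory and copy the 'live' cert files into it
 mkdir -p /jffs/openvpn/VPN_Failover
 cp -a /jffs/openvpn/vpn_crt_client* /jffs/openvpn/VPN_Failover/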
else
 # Now we can overwrite the target certs....                     # v1.12??
 cp -af /jffs/openvpn/VPN_Failover/vpn_crt_client${THIS_VPN}_ca  /jffs/openvpn/vpn_crt_client${THIS_VPN}_ca
 cp -af /jffs/openvpn/VPN_Failover/vpn_crt_client${THIS_VPN}_crt /jffs/openvpn/vpn_crt_client${THIS_VPN}_crt
 cp -af /jffs/openvpn/VPN_Failover/vpn_crt_client${THIS_VPN}_key /jffs/openvpn/vpn_crt_client${THIS_VPN}_key
fi

I believe for most users, switching between different VPN configs/providers is best handled without invoking the 'multiconfig' option - although there is then of course a limit of five round-robin entities.
 
I need some clarification now. I had my first actual VPN client drop from my provider since I installed VPN_Failover script. Below is output from the syslog-ng filter to a /opt/var/log/chkwan.log.
Code:
Apr  9 08:50:00 RT-AC86U-4608 (VPN_Failover.sh): 31244 VPN Client Monitor: Checking VPN Client 1 connection status....
Apr  9 08:50:05 RT-AC86U-4608 (VPN_Failover.sh): 31244 **VPN Client Monitor: Switching VPN Client 1 to VPN Client 2 (Reason: VPN Client 1 STATE=0;Disconnected)
Apr  9 08:50:05 RT-AC86U-4608 (VPN_Failover.sh): 31244 VPN Client Monitor: Monitoring VPN Client 2 terminated
Apr  9 08:55:00 RT-AC86U-4608 (VPN_Failover.sh): 31997 VPN Client Monitor: Checking VPN Client 1 connection status....
Apr  9 08:55:05 RT-AC86U-4608 (VPN_Failover.sh): 31997 **VPN Client Monitor: Switching VPN Client 1 to VPN Client 2 (Reason: VPN Client 1 STATE=0;Disconnected)
Apr  9 08:55:05 RT-AC86U-4608 (VPN_Failover.sh): 31997 VPN Client Monitor: Monitoring VPN Client 2 terminated
Here is my /jffs/scripts/wan-start script. I have VPN clients configured in slots 1, 3, and 5, per RMerlin's instructions, so that the VPN clients run on a single CPU core that is not the one running the router processes.
Code:
/tmp/home/root# cat /jffs/scripts/wan-start

#!/bin/sh
# check WAN connection every 5 minutes
# reboot router if WAN is down
cru a WAN_Check "*/5 * * * * /jffs/scripts/ChkWAN.sh force once nowait quiet"
# check VPN client connection every 5 minutes
# restart VPN client connection if VPN disconnected
cru a VPN_Failover "*/5 * * * * /jffs/scripts/VPN_Failover.sh 1 ignore=2,4 once"
mtn_dance@RT-AC86U-4608:/tmp/home/root#
Here is what I saw in the router syslog webGUI that did not get filtered by syslog-ng.
Code:
Apr  9 08:49:48 RT-AC86U-4608 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link UP 1000 mbps full duplex
Apr  9 08:50:05 RT-AC86U-4608 (VPN_Failover.sh): 31244 *Warning VPN Client 2 not configured? - auto IGNORED/SKIPPED
Apr  9 08:50:18 RT-AC86U-4608 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link DOWN.
Apr  9 08:50:21 RT-AC86U-4608 kernel: eth0 (Int switch port: 3) (Logical Port: 3) Link UP 1000 mbps full duplex
Apr  9 08:50:22 RT-AC86U-4608 (ChkWAN.sh): 31243 ***ERROR WGET 'http://proof.ovh.net/files/100Mb.dat' transfer FAILED RC=6
Apr  9 08:50:22 RT-AC86U-4608 (ChkWAN.sh): 31243 v1.11 Monitoring WAN connection using PING method to 71.93.52.1 check FAILED
Apr  9 08:50:22 RT-AC86U-4608 (ChkWAN.sh): 31243 Private LAN 192.168.1.1 will be skipped for WAN PING check!
Apr  9 08:50:43 RT-AC86U-4608 (ChkWAN.sh): 31243 ***ERROR WGET 'http://proof.ovh.net/files/100Mb.dat' transfer FAILED RC=6
Apr  9 08:50:43 RT-AC86U-4608 (ChkWAN.sh): 31243 v1.11 Monitoring WAN connection using PING method to 8.8.8.8 check FAILED
Apr  9 08:51:01 RT-AC86U-4608 (ChkWAN.sh): 31243 ***ERROR WGET 'http://proof.ovh.net/files/100Mb.dat' transfer FAILED RC=6
Apr  9 08:51:01 RT-AC86U-4608 (ChkWAN.sh): 31243 v1.11 Monitoring WAN connection using PING method to 1.1.1.1 check FAILED
Apr  9 08:51:31 RT-AC86U-4608 (ChkWAN.sh): 31243 ***ERROR WGET 'http://proof.ovh.net/files/100Mb.dat' transfer FAILED RC=6
Apr  9 08:51:31 RT-AC86U-4608 (ChkWAN.sh): 31243 v1.11 Monitoring WAN connection using PING method to 71.93.52.1 check FAILED
Apr  9 08:51:31 RT-AC86U-4608 (ChkWAN.sh): 31243 Private LAN 192.168.1.1 will be skipped for WAN PING check!
Apr  9 08:55:05 RT-AC86U-4608 (VPN_Failover.sh): 31997 *Warning VPN Client 2 not configured? - auto IGNORED/SKIPPED
At that point I manually started VPN client 1 from the router webGUI. Below is the output of the router syslog in the webGUI.
Code:
Apr  9 08:55:42 RT-AC86U-4608 rc_service: httpds 26426:notify_rc start_vpnclient1
Apr  9 08:55:42 RT-AC86U-4608 custom_script: Running /jffs/scripts/service-event (args: start vpnclient1) - max timeout = 120s
Apr  9 08:55:51 RT-AC86U-4608 custom_script: Running /jffs/scripts/openvpn-event (args: tun11 1500 1553 10.200.0.26 10.200.0.25)
I thought I understood that the command in the wan-start script would monitor client 1; if it is down, ignore 2, then start and monitor 3; if that is down, ignore 4, then start and monitor 5.

At this point I know I don't understand the script options or what options are needed to accomplish using vpn clients 1,3,5 consecutively and ignoring non-configured clients 2 & 4.

Thank you.
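
For anyone following along, the behaviour expected in the post above (step from 1 to 3 to 5, skipping ignored clients) can be sketched in a few lines of shell. This is only an illustration of the expectation, not the logic inside VPN_Failover.sh; the function name `next_client` and the wrap-around from 5 back to 1 are assumptions.

```shell
#!/bin/sh
# next_client CURRENT IGNORE_LIST   (hypothetical helper, not part of VPN_Failover.sh)
# Prints the next VPN client number (1-5) after CURRENT, wrapping round,
# and skipping any number in the comma-separated IGNORE_LIST.
next_client() {
    current=$1
    ignore=",$2,"
    candidate=$current
    tries=0
    # At most four other clients exist, so four attempts cover them all
    while [ "$tries" -lt 4 ]; do
        candidate=$(( candidate % 5 + 1 ))
        case "$ignore" in
            *",$candidate,"*) ;;                  # ignored, keep looking
            *) echo "$candidate"; return 0 ;;     # first usable client
        esac
        tries=$(( tries + 1 ))
    done
    return 1   # every other client is in the ignore list
}
```

Calling `next_client 1 2,4` would print 3, and `next_client 3 2,4` would print 5.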
 
I just reread the output of running the script with the -h parameter, and I *think* I found my error in understanding. I've changed the /jffs/scripts/wan-start line to this:
Code:
cru a VPN_Failover "*/5 * * * * /jffs/scripts/VPN_Failover.sh 1 & once"
Is that what is needed to monitor client 1, if down, start and monitor client 3, and to 5 if 3 goes down? Clients 2 & 4 are not configured, so it appears I don't need the "ignore" option.
 
I am curious whether the 1,3,5 VPN client rule also applies to the AX88U. I have been applying it to my AX88U, and I also used to do it on the AC86U. I would assume yes, given its similarities to the AC86U, but would appreciate the clarification.


 
I would guess probably, but I think you should ask RMerlin, since I do not know the exact internal differences, but the AC86U is dual core and the AX88U is quad core.
 
Yes but it is my understanding that OpenVPN is still handled by a single core on both routers.


 
And thank you as well for reviving this thread. I am very much interested to know how this script will be working for you. I am planning on trying it this weekend.


 
I thought it was running fine until something closed the VPN client 1 connection. The way I had the script configured was incorrect, so it did not bring up client 3 as expected. That is why I posted above, to find where I messed up. I know the script will work from what I see it doing in the log; I just did not give it the correct parameters for my setup. Now that I have scribe (the syslog-ng installer) working, the main webGUI syslog gets cleaned up, so I run the script every 5 minutes instead of twice an hour. I highly recommend this script.
 
If I had to guess here, I would say that the AX88U would be able to handle double the simultaneous OpenVPN instances, being quad-core. But the 1, 3, 5 order should probably be the same. @RMerlin is the authority here, maybe he can weigh in on this too for us.

It should be easy to test? Load them all up, including the even numbered ones and see what breaks. :)
 
I am curious whether the 1,3,5 VPN client rule also applies to the AX88U. I have been applying it to my AX88U, and I also used to do it on the AC86U. I would assume yes, given its similarities to the AC86U, but would appreciate the clarification.

No. Since it's a quad-core router, the CPU core usage order is as follows: 1,2,3,0,1

So, only client 4 shares the same core as most of the system processes.
 
No. Since it's a quad core router, the CPU core usage order is as follow: 1,2,3,4,0

So, only client 5 shares the same core as most of the system processes.

'0', 'zero'? Does this mean it goes back to '1'?
 
This is definitely interesting! Thank you!


 
Assuming equal performance of each VPN client prior to setup, does the usage order offer any advantages in terms of overall performance for each client on the first 4 cores (for example, will client 1 perform better than client 3, etc)?

Thank you!


 
'0', 'zero'? Does this mean it goes back to '1'?

I mis-typed that. I meant 1,2,3,0,1. So, client 4 is the one sharing the same core as most of the other processes.
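
One way to read the two core orders quoted in this thread (clients 1, 3 and 5 kept off the system core on the dual-core AC86U, and 1,2,3,0,1 on the quad-core AX88U) is that client N lands on core N modulo the number of cores. That is only an inference from the numbers posted here, not a description of how the firmware assigns cores, and the function names below are hypothetical.

```shell
#!/bin/sh
# core_for_client CLIENT NCORES   (hypothetical helper)
# Assumed rule: core = client number modulo number of cores.
core_for_client() {
    echo $(( $1 % $2 ))
}

# core_order NCORES
# Prints the cores that clients 1..5 would use under the assumed rule.
core_order() {
    order=""
    for client in 1 2 3 4 5; do
        order="$order$(core_for_client "$client" "$1") "
    done
    echo "${order% }"   # drop the trailing space
}
```

Under this assumption `core_order 4` gives the 1,2,3,0,1 order stated above, and `core_order 2` puts clients 2 and 4 on core 0, which matches the advice to use 1, 3 and 5 on the AC86U.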
 
@Martineau

Any pointers or suggestions as to how I can modify this script to suit the following use case?
  • I just want it to try a curl and if it fails restart the existing VPN client not start another VPN client
The reason is that sometimes the VPN client is running and connected to the VPN server, so things like ping restart won't help, since the ping succeeds. In this state, any clients routed through this VPN show symptoms of a bad connection until I restart the VPN client. I was hoping to find a way, via cron every X minutes, to try a curl and, if that fails or is very slow (hinting at some problem), simply restart the VPN client.

Thanks in advance.
 
Doesn't the following invocation achieve your goal, assuming you only need to monitor VPN client 1?
Code:
./VPN_Failover 1 forcesmall ignore=2,3,4,5
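
For comparison, the curl-then-restart idea from the question could also be sketched as a small standalone function run from cron. This is a rough sketch under assumptions, not part of VPN_Failover.sh: the function name is made up, it assumes VPN client N uses interface tun1N (the log earlier in the thread shows tun11 for client 1), and it assumes `service restart_vpnclientN` restarts the client on this firmware. The CURL and RESTART variables exist only so the commands can be swapped out for a dry run.

```shell
#!/bin/sh
# vpn_health_check CLIENT URL [MAX_SECONDS]   (hypothetical helper)
# Fetches URL through the client's tunnel; on failure or timeout,
# restarts that same VPN client instead of switching to another one.
vpn_health_check() {
    client=$1
    url=$2
    max_time=${3:-10}
    # --fail turns HTTP errors into a non-zero exit status;
    # --max-time makes a very slow transfer count as a failure too.
    if ${CURL:-curl} --silent --fail --max-time "$max_time" \
            --interface "tun1$client" --output /dev/null "$url"; then
        return 0
    fi
    echo "curl via tun1$client failed; restarting VPN client $client" >&2
    # Assumption: this is the restart command on Asuswrt-Merlin
    ${RESTART:-service} "restart_vpnclient$client"
}
```

A dry run such as `CURL=false RESTART=echo vpn_health_check 1 http://example.com` only prints the restart request, which is a cheap way to check the wiring before adding it to cron.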
 
Hi @Martineau, what command do I type to temporarily stop the VPN_Failover.sh script? I currently have it started from services on boot and want to temporarily disable it. Thanks kindly.
 
I assume you are using v1.17 ?

So you can gracefully request the monitor to self-terminate by confirming the semaphore file
Code:
./VPN_Failover.sh status

(VPN_Failover.sh): 30112 v1.17 Started..... [status]

 Active VPN Failover monitor processes

29128 admin 1472 S {VPN_Failover.sh} /bin/sh /jffs/scripts/VPN_Failover.sh 1 delay=120 ignore=2,3,4,5 verbose interval=1200
 6 Jun 10 10:15 /tmp/vpnclient1-VPNFailover Status/PID=29128
and deleting the file e.g. VPN Client 1 monitor
Code:
rm /tmp/vpnclient1-VPNFailover

EDIT: There is the undocumented 'reset' command, but it will terminate ALL monitoring for ALL VPN clients.

NOTE: Simply killing the process
Code:
killall VPN_Failover.sh
will leave an orphaned semaphore file (referring to a non-existent PID)
Code:
(VPN_Failover.sh): 30222 v1.17 Started..... [status]

 Active VPN Failover monitor processes

 6 Jun 10 10:15 /tmp/vpnclient1-VPNFailover Status/PID=29128
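
A quick way to tell a live monitor from an orphaned semaphore file might look like the sketch below. It assumes the semaphore file simply contains the monitor's PID (the 6-byte size and the Status/PID=29128 line above suggest this, but it is not confirmed), and the function name is hypothetical.

```shell
#!/bin/sh
# semaphore_state FILE   (hypothetical helper)
# Prints "running", "orphaned" or "absent" for a semaphore file
# that is assumed to hold the PID of its monitor process.
semaphore_state() {
    file=$1
    [ -f "$file" ] || { echo "absent"; return 0; }
    pid=$(cat "$file")
    # kill -0 sends no signal; it only tests whether the PID can be signalled
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "orphaned"
    fi
}
```

For example, `semaphore_state /tmp/vpnclient1-VPNFailover` would report "orphaned" after a killall like the one shown above, at which point the file can be removed by hand.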
 