
Clarification and Discussion on Recent 388.8 Changes

eibgrad

There are two changes in the 388.8 release that need clarification.

VPN Kill Switch

Initially (and quite some time ago at this point), the OpenVPN kill switch was persistent. IOW, even if the OpenVPN client was disabled/off, the kill switch remained active. However, so many ppl complained about that behavior that it was changed (again, quite some time ago) so that if you purposely disabled (i.e., turned OFF) the OpenVPN client, the kill switch was disabled as well.

Well, w/ 388.8, we're back to the kill switch being persistent! And that has led to some confusion. Particularly since (apparently) we now have more users than ever enabling remote access over the WAN, either for the actual purpose of remote access, or just from the LAN side, if only to take advantage of the cert bound to the WAN's ip/domain-name and avoid any TLS/SSL errors.

As I've reported on several other threads, when remote access is enabled on the GUI, this is implemented in an unusual fashion. For security reasons, the GUI is never bound directly to the WAN. Instead, there is a DNAT rule in the firewall to *redirect* the public IP of the WAN to the LAN ip of the router in order to gain access to the GUI. As a result, if you ever bind the router's LAN ip to the OpenVPN client w/ the VPN Director, either explicitly (e.g., 192.168.1.1) or implicitly (e.g., 192.168.1.0/24), any attempt to reference the WAN ip or domain-name will have its replies sent back over the VPN, and NOT the WAN! Hence, access to the GUI will be lost!
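
To make the explicit vs. implicit distinction concrete, here is a minimal sketch in plain POSIX shell (no router needed) that checks whether a VPN Director source entry would capture the router's own LAN ip. The helper names ip_to_int and ip_in_cidr are made up for illustration, and 192.168.1.1 is only the example address used above.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed (exit 0) if address $1 falls inside $2, where $2 is either
# a bare address (treated as /32) or a CIDR such as 192.168.1.0/24.
ip_in_cidr() {
    ip=$1
    net=${2%/*}
    case "$2" in
        */*) bits=${2#*/} ;;
        *)   bits=32 ;;
    esac
    if [ "$bits" -eq 0 ]; then
        mask=0
    else
        mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
    fi
    [ $(( $(ip_to_int "$ip") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# Example: does a rule for the whole LAN subnet also cover the router?
if ip_in_cidr 192.168.1.1 192.168.1.0/24; then
    echo "router LAN ip is covered by this rule"
fi
```

If the check succeeds for any entry bound to the OpenVPN client, the GUI's replies would follow the VPN, per the explanation above.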

Understand that this has *always* been the way it works. Nothing has changed in this regard w/ this or any prior release.

Also, NONE of this is relevant when accessing the GUI from a client on the LAN side as long as the reference to the GUI is also on the LAN side (e.g., 192.168.1.100 to 192.168.1.1). It's ONLY an issue when you attempt remote access over the WAN, or reference the WAN ip/domain name from the LAN (i.e., NAT loopback).

In the recent past, I suspect this wasn't nearly as much of a problem since a) most users were NOT referencing the WAN ip/domain-name from inside the LAN, and b) the kill-switch wasn't active when the OpenVPN client was OFF. But now we have a situation in which remote access is more common, and the kill switch is persistent. So users are finding themselves unexpectedly locked out of the GUI, at least if they continue to reference the GUI w/ the WAN ip/domain-name, and do NOT disable the kill switch. Ironically, this failure to disable the kill switch when the OpenVPN client was disabled/OFF was why this behavior was previously changed; to satisfy complaints. With this most recent change, we're now revisiting old problems.

IMO, the safest thing to do given the current state of 388.8 is to *always* bind the router's LAN ip to the WAN w/ the VPN Director. Even if you're not otherwise using the VPN Director at this time. This should guarantee the router remains accessible at all times, irrespective of the state of the OpenVPN client, kill switch, or how you reference the GUI (WAN vs. LAN). It just keeps things simple.

OpenVPN ip rule change

Here too we have another change, where the ip rules now limit access of the OpenVPN client's routes to only those devices on the LAN (br0). Prior to 388.8, that access was unconditional. If you needed access to the OpenVPN client's routes from local OpenVPN/Wireguard servers, or even the WAN, that was possible, but no more. I have no idea why that change was made, or if it will remain a part of the on-going firmware. Presumably it was intended to fix some (as yet unknown) problem, but it is going to break any configurations that require such access (as has been shown in the release thread).

The following is a *temporary* fix.

Code:
ip rule del iif br0 lookup ovpnc1 prio 10001
ip rule add lookup ovpnc1 prio 10001

Note: This assumes OpenVPN client #1. For other OpenVPN clients, you need to change it accordingly, including the table name (ovpnc#) and priority (1000#).
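
As a sketch of that substitution, the helper below prints (it does not run) the two commands for a given client number. The name rollback_cmds is hypothetical, and the 1-5 range assumes the usual five OpenVPN client slots; adjust if your model differs.

```shell
#!/bin/sh
# Print the temporary-fix commands for OpenVPN client N.
# Nothing is executed here; review the output before running it.
rollback_cmds() {
    n=$1
    case "$n" in
        [1-5]) ;;
        *) echo "usage: rollback_cmds <1-5>" >&2; return 1 ;;
    esac
    # Table name is ovpnc<N>, priority is 1000<N>, per the note above.
    echo "ip rule del iif br0 lookup ovpnc${n} prio 1000${n}"
    echo "ip rule add lookup ovpnc${n} prio 1000${n}"
}

rollback_cmds 2
```

Once the printed commands look right for your setup, they can be pasted into an SSH session on the router.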

Even so, I don't know the impact of this rollback beyond the fact it fixes previously working configurations. These changes will NOT survive a restart of the OpenVPN client either. Not unless you use an openvpn-event script to apply them. That's why I only recommend this fix for those who are desperate. In most cases, I strongly suggest you revert to 388.7 (or whatever was your prior release) and await further information regarding what will be done (if anything) about the negative impact of these changes.
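
For those who do go the openvpn-event route, the following is only a rough sketch of what such a script might look like, not a tested recipe. It assumes the script is /jffs/scripts/openvpn-event, that OpenVPN exports $dev and $script_type to it, that client N uses device tun1N, and that route-up fires after the firmware has installed its own rules; verify each of these on your router. IP_CMD and reapply_rollback are hypothetical names, with IP_CMD there so the logic can be dry-run (IP_CMD="echo ip").

```shell
#!/bin/sh
# Sketch of an openvpn-event script that reapplies the temporary fix.
# ASSUMPTIONS: $dev / $script_type come from OpenVPN; client N is tun1N.
IP_CMD=${IP_CMD:-ip}

reapply_rollback() {
    dev_name=$1
    event=$2
    case "$dev_name" in
        tun1[1-5]) n=${dev_name#tun1} ;;   # tun11 -> 1, tun12 -> 2, ...
        *) return 0 ;;                     # not an OpenVPN client device
    esac
    # Only act on the route-up event; ignore everything else.
    [ "$event" = "route-up" ] || return 0
    # Drop the br0-only rule (ignore the error if it is already gone),
    # then add the unconditional one.
    $IP_CMD rule del iif br0 lookup "ovpnc${n}" prio "1000${n}" 2>/dev/null
    $IP_CMD rule add lookup "ovpnc${n}" prio "1000${n}"
}

reapply_rollback "${dev:-}" "${script_type:-}"
```

The same caveat applies as for the manual commands: this only restores the pre-388.8 behavior, and its wider impact is unknown.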

I'm just laying out the field here so we have a single point of reference for discussing these changes. Otherwise, it's going to be difficult to deal w/ user after user experiencing the negative impact in different ways, and just creating more and more threads.
 
IMO, the safest thing to do given the current state of 388.8 is to *always* bind the router's LAN ip to the WAN w/ the VPN Director.
That's one option I was considering - hardcoding a WAN rule for the router's LAN IP as destination. Or, making sure that traffic with the whole LAN subnet as its destination address does not get routed through the VPN (which logically should already be the case, but apparently might not be).
 
Seems to me you might as well bind the LAN ip to the WAN. By default, all other router based services are bound directly to the WAN. It's just the GUI that's an exception. And once the VPN Director is activated, all those other services remain bound to the WAN. IOW, in every other respect except the GUI, the router never participates in the OpenVPN client. But if the user naively makes that happen (e.g., 192.168.1.0/24), NOW they've created a problem, and only for the GUI. And I can't imagine any reason why that would be the user's intent. If for some reason it was, then probably they should be forcing ALL the router's services over the VPN, which means NOT using the VPN Director. Everything just gets routed out the VPN, LAN clients and router, except the connection to the OpenVPN server itself over the WAN w/ a static route.

I could be missing something of course. But I suspect there is no perfect solution. Users can come up w/ some pretty wild and unanticipated configurations. You just have to consider what will work best for most users.
 
Seems to me you might as well bind the LAN ip to the WAN.
Can you think of any reason why I shouldn't bind the whole LAN subnet instead of just the LAN IP? I vaguely remember seeing people with the weird scenario where they had problems accessing LAN devices even tho logically it should be switched, not routed.
 
I think one potential issue that might happen if I bind the LAN IP as the source is that the router will no longer be able to obtain the tunnel remote IP through stun. I would need to test that.
 
How is the VPN Director going to work for those same LAN clients if they are also bound to the WAN, esp. when the WAN has the higher priority?
 
I meant binding the destination, not the source. I'm not sure if that would help with the original issue, however. The rule might need to be based on the source IP, which means only the LAN IP can be ruled through the WAN.

I also thought about excluding the WAN IP as destination, but that would be problematic for people whose WAN IP changes all the time. Plus people with CGNAT, Dual WAN, and so on...
 
Well here's another thought. Why not bind the GUI directly to the WAN like any other services and avoid the problem completely? It's that DNAT that's causing the headache. Now even if the user includes the LAN ip of the router in their rules, it has ZERO effect. It's a NOOP. Nothing else the router is currently offering in terms of services ends up being affected by the inclusion of its LAN ip, so I don't see any downside.
 
I'd rather not change that as this was done by Asus, probably for a good reason. Could be due to the dynamic nature of the WAN interface, or to accommodate dual WAN - I don't know. I prefer to avoid making these types of architectural changes when it's theirs.
 
Things are somewhat confusing here, because the 3006 and 3004 code are not fully in sync in terms of generating the rules.

When looking specifically at the 3004 code:

- iif is used if you set redirection to either "NONE" or "ALL"
- No iif is used if you set redirection to "VPNDirector"

When looking at the 3006 code (where the new VPN routing was first developed and implemented):

- iif is not used if you set redirection to either "NONE" or "ALL" (this was removed in the commit I referenced in the other thread)
- iif is not used if you set redirection to "VPNDirector"

So my first thoughts here:

- The 3006 change where I removed the iif from NONE and ALL is missing in 3004
- People who have the issues are not using VPNDirector (I somehow assumed they were), since they have an iif. I just tested my RT-AX86U_PRO, and there was no iif interface specified.
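
For anyone wanting to check which case they are in, here is a small sketch that reads `ip rule show` output on stdin and reports whether the rule for a given table carries an iif. The helper name rule_iif_state is made up, and it assumes the usual iproute2 output format where the table name is the last field (e.g. "10001: from all iif br0 lookup ovpnc1").

```shell
#!/bin/sh
# Classify the ip rule(s) pointing at a routing table, e.g. ovpnc1.
# Prints "conditional" if any such rule has an iif, "unconditional"
# if none does, or "missing" if no rule references the table at all.
rule_iif_state() {
    table=$1
    awk -v t="$table" '
        $NF == t {
            found = 1
            for (i = 1; i <= NF; i++) if ($i == "iif") cond = 1
        }
        END {
            if (!found)    print "missing"
            else if (cond) print "conditional"
            else           print "unconditional"
        }'
}

# On the router this would be used as:
#   ip rule show | rule_iif_state ovpnc1
```

A result of "conditional" matches the br0-only behavior discussed in this thread.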

I trust the 3006 code more there because it's where I did all the development & testing back in May, before porting it to 3004. So, regarding the specific case involving the iif interface, it's probably a bug in 3004, and it should not be there. And the issue experienced by people is quite possibly the very issue I fixed in 3006 with the commit that removed it...

I'll post in the other thread to ask people affected to try switching to VPNDirector, which would remove the interface.
 
So far I cannot reproduce the issue regarding accessing the webui through the WAN IP.
  • Enabled webui WAN access
  • Configured VPN client
  • Enabled VPNDirector + killswitch
  • Created a single rule to redirect the whole LAN subnet through the VPN
  • Killed the OpenVPN client process

Result:
  • Can no longer access the Internet
  • Still can access the router's webui using the WAN IP

Was there an additional element I am missing to reproduce that scenario?
 
Using the WAN ip from where? The LAN side or the internet side of the WAN?

I suspect the problem is when it's the latter. In the former, the DNAT returns to a LAN ip due to NAT loopback, NOT a public IP that is now inaccessible via the VPN routing table due to the prohibit default.
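
A quick way to see whether that prohibit default is in play is to look at the client's routing table. The sketch below only parses text fed on stdin, so it can be tried anywhere; has_prohibit_default is a hypothetical helper name, and ovpnc1 again assumes OpenVPN client #1.

```shell
#!/bin/sh
# Succeed (exit 0) if the routing table listing on stdin contains a
# "prohibit default" route, i.e. the kill switch entry described above.
has_prohibit_default() {
    grep -Eq '^prohibit[[:space:]]+default([[:space:]]|$)'
}

# On the router this would be used as:
#   ip route show table ovpnc1 | has_prohibit_default && echo "kill switch route present"
```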

We were only speculating about ppl using the WAN ip internally for the sake of the cert. It may actually be more likely they're accessing remotely, outside the WAN.
 
Using the WAN ip from where? The LAN side or the internet side of the WAN?
The test was done from the LAN.

I just tested the following:
  • From the WAN, with VPNDirector and the whole subnet redirected: works OK
  • From the WAN, with redirection set to "None" (still with the buggy iif present): works OK
  • From the WAN, with redirection set to "All" (still with the buggy iif present): works OK

The one scenario I haven't tested is in a site2site situation, as I don't have a test setup for that particular scenario.
 
Ppl were having issues w/ site2site, but specifically for the above test, you just need simple remote access via the WAN. Whether the OpenVPN client is or isn't up and running (at least w/ the kill switch enabled), it shouldn't work since the replies will be bound to the VPN, NOT back to the WAN.
 
Ppl were having issues w/ site2site, but specifically for the above test, you just need simple remote access via the WAN.
Then I cannot reproduce it here with any of these four scenarios above.
 
Other scenarios tested:
  • From the LAN, redirection set to "None", accessing WAN IP: works ok
  • From the LAN, redirection set to "All", accessing WAN IP: works ok

Only scenario left to try is in a site2site tunnel, accessing the remote router's interface through its LAN IP - unable to test that at the moment.
 
Since there is a lot to cover and I think it's very hard to cover it all, I'll try to describe my problem again in a few words (for Merlin).

Site A
Site B

OpenVPN as a permanent connection from A to B - all good
Another VPN (IPsec or WireGuard, no matter) hosted on Site B


[Attached image: 1721853150638.png]


left is fw .7
right is fw .8

PS: Internet is not redirected through tunnel

Now:
3004_388.7_0 - All servers hosted on both sites can be accessed through the tunnel made by the VPN connection from Site B - all good, as it's supposed to work
3004_388.8_0 - Servers hosted on Site A can't be accessed via the VPN server hosted on Site B. You must connect to a specific VPN for each site, one by one ... This is not OK

Thanks !
 
Then I cannot reproduce it here with any of these four scenarios above.

I think we're mixing different issues here. There's the issue of remote access to the GUI, vs. users chaining tunnels, where (for example) they have a site2site tunnel (A<->B), w/ A being the OpenVPN client, and they want some OpenVPN client of the OpenVPN server on A to have access to A's local OpenVPN client. If the ip rule is conditional, it won't work.

P.S. I realize you don't have the ability to test the latter. Neither do I. I don't even have a suitable router and firmware, or I'd test it myself.
 
There's the issue of remote access to the GUI, vs. users chaining tunnels,
That's not really the kind of setup that I can easily reproduce here locally.

Does simply removing the interface resolve that specific issue? If yes, then it will be solved once the 3006 fix gets ported to 3004.

There's the issue of remote access to the GUI
For that one I have pretty much reproduced all the possible scenarios I can think of then.
 