RT68U: Dual WAN on Merlin as unreliable as stock ASUS

Nocturnal

Occasional Visitor
Hi,

I was really hoping to retire my Cisco RV042 and use my RT68U to do WAN failover. Nope. Stunningly unreliable in stock form and really the same using Merlin. Frankly I doubt this was ever properly tested, or this feature would have been removed from both firmware trees. The WAN links keep dropping in dual WAN mode even though the same links are stable in a single WAN config.

So I have put my RV042 between DSL and cable modems and the 68U. The link is stable now but because the RV042 is only 10/100 capable my cable speed dropped from 120 Mb/s to about 60 Mb/s. Perhaps the '42 is maxing out before the 100 Mb/s link maxes out, who knows.

I have some syslogs I can post that show how the dual WAN config behaved when I ran it for about 4 hours but it may contain information I shouldn't post publicly. Let me know which lines I should remove, if any, before posting. Actually, let me know first if there is any hope that this may get fixed or if I'm out of luck and need to call Comcast to downgrade my internet to a lower tier because I can't use the current 100+ speed.

I'll add I use Dual WAN in failover mode because my backup is a 3.5 Mb/s DSL/U-verse line. Load balancing makes little sense in this case.

Merlin 374.43_2

Thanks,

Sander
 
Actually, let me know first if there is any hope that this may get fixed or if I'm out of luck and need to call Comcast to downgrade my internet to a lower tier because I can't use the current 100+ speed.

Any fix will have to come from Asus. The Dual WAN code is too complex for me to fully understand (the code isn't commented), so I don't want to spend that much time on it just trying to understand it properly before doing anything to it.
 
Nocturnal, I'm tracking the exact same issue here:

http://forums.smallnetbuilder.com/showthread.php?t=18628

I don't think this has received much attention from Asus because this has been going on since the original release. There are a few older threads covering the same topic but I didn't see any resolution and wouldn't expect to see any if Asus hasn't updated/fixed their code.

Like you, I'm using the device in WAN failover mode because my primary WAN connection is three times faster, both on the download and upload, than my secondary.

My suspicion is that the vast majority of customers aren't using the dual WAN function and because of this and the drive to push out new products, fixes for this problem have been deprioritized.

I bought this router specifically for the dual WAN functionality. The 5 GHz radio is nice, as is the horsepower, but as of right now only my iPhones and iPad have 5 GHz radios. Everything else is 2.4 N.

For what it's worth, I've contacted Gary and he's supposed to get an answer back from engineering as to the status of fixes for this problem.

Unless this is fixed soon, or there is at least a commitment to a fix, this router is going back to Amazon.
 
Thanks to both for the replies. I have tried several times to log a ticket with Asus but their support website was so horribly broken I couldn't even get that far. This was after registering my product and going through all that. I always get a JavaScript popup about something being NULL even though all fields are filled in.

I bought the RT68 because I would rather spend a little more and get a quality product that will last a few years. The Dual WAN capability was the feature that pushed this router to the top of my list. So I bought the wrong router. Too late to return it too so I'll just use it as a lesson. No more Asus routers for me. They look good but quality is marginal and support non-existent.

Thanks Merlin for adding useful features like named static DHCP entries. That was another 'WTH' moment when I saw that my old DIR-655 had a nicer static DHCP reservation page.
 
Nocturnal, Asus engineering provided me with a beta release today that is supposed to address dual WAN operation and issues with Uverse connections (nothing detailed as to what the Uverse fixes are...).

I installed it and four hours later experienced the same PPPoE error message and disconnection of the primary Uverse WAN, failover to the secondary Time Warner WAN, and then fallback to the Uverse WAN.

To be fair, I didn't do a factory reset after installing the beta firmware. I've done that now, but I'm not real hopeful.
 
Texashoser, if ASUS wants more testing on their firmware I'd be willing to give it a try. I have a Comcast cable primary and U-verse failover. If they're interested I'm sure we can use private messages here to exchange email addresses.

Thanks for the update!
 
Go ahead and PM Gary Key - he's a member on this board and part of Asus' technical marketing staff.

I can tell you that after loading this beta firmware - and resetting the device back to factory defaults - this new load isn't performing any better. I'm still getting the same PPPoE error messages and the WAN(0) disconnects even though the ATT Uverse modem doesn't support PPPoE on the LAN side.

My suspicion is that the WAN failover logic is over-engineered/over-sensitive and is disconnecting the primary WAN at the slightest hint of problems/errors (real or imagined) even though the connection is actually working. This would explain why these errors aren't reported when dual WAN is disabled. I.e., there is no reason for the router to attempt to disconnect the WAN connection if it's the only connection available (excepting physical cable disconnect or the inability of the router to obtain a valid DHCP address).
 
Have you tried playing with the watchdog settings under Dual WAN?
 
Yes I have. I've increased/decreased fail/failback counts, frequency of pings, etc. I've even disabled it. None of this prevents the primary WAN connection from dropping with the errors previously posted.

But I'm sure you'd agree that even if disabling it 'fixed' the WAN disconnects, it wouldn't be an acceptable solution. The watchdog function, as I understand Asus has implemented it, is necessary to a) determine if the primary WAN connection has access to the internet, and if not, fail over to the secondary WAN connection and b) determine, after a failover, when the primary WAN connection has restored internet access in order to fail back.
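To make points a) and b) concrete, here is a minimal Python sketch of a failover/failback decision loop driven by consecutive probe results. It is not Asus's wanduck code (which isn't documented in this thread); the names FailoverState and step, and the default counts of 3, are assumptions for illustration only.

```python
from dataclasses import dataclass

@dataclass
class FailoverState:
    active: str = "primary"  # which WAN currently carries traffic
    fails: int = 0           # consecutive failed probes while on primary
    successes: int = 0       # consecutive good probes of primary while on secondary

def step(state, primary_ok, fail_count=3, failback_count=3):
    """Advance the state by one watchdog probe of the primary WAN."""
    if state.active == "primary":
        # a) fail over only after fail_count consecutive failures
        state.fails = 0 if primary_ok else state.fails + 1
        if state.fails >= fail_count:
            state.active, state.fails, state.successes = "secondary", 0, 0
    else:
        # b) fail back only after failback_count consecutive successes
        state.successes = state.successes + 1 if primary_ok else 0
        if state.successes >= failback_count:
            state.active, state.fails, state.successes = "primary", 0, 0
    return state.active

# Usage: three failed probes trigger failover, three good ones trigger failback.
state = FailoverState()
probes = [True, False, False, False, True, True, True]
history = [step(state, ok) for ok in probes]
print(history)
```

With counters like these, a single lost ping should never take down a working primary, which is why the drops described in this thread look like something other than plain probe failures.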
 
On my RV042 I have to use a 'ping' method of link-down detection because in many WAN failure scenarios with cable and DSL modems the Ethernet link stays up even if it can't serve any traffic. In other words the default fail over detection is only for the bluntest and crudest of scenarios and therefore quite useless.

I'm pretty sure the Asus fail over detection works similarly so I also had watchdog enabled in that case. My logs show lots of items like this:

Jul 15 20:57:02 rc_service: wanduck 539:notify_rc restart_wan_if 1
Jul 15 20:57:02 stop_wan(): perform DHCP release

and then 20 seconds later it brings the link back up and logs 'WAN was restored'. These blocks only happen in dual WAN config. In single WAN with the same primary (cable) this doesn't happen.

In any case, I'll ping Gary Key and see if he'd like me to help.
 
I've disabled the watchdog function in the beta release Gary provided me to see if things are more stable. Five hours later - so far so good. Of course, that hardly means anything.

Like you, I agree the watchdog functionality is critical because it's really the only way to know, albeit with limitations, if a particular WAN connection has actual internet access vs. just an established ethernet link with the ISP's modem/CPE.

If disabling this important feature does prevent the WAN disconnects, I'm really curious how pinging a host on the internet would result in DHCP and PPPoE errors, which have nothing to do with programming logic that uses ICMP packets to decide if a WAN connection is usable or not.

This makes me want to do some testing with watchdog enabled... Unplug the ATT line from the modem and see if the watchdog/failover logic actually works.
 
Yeah, disconnecting the phone line from the DSL modem or the cable line from the cable modem is a reasonable test to see if fail over is triggered.

I still think that pinging a DNS server (say 8.8.8.8) or rather a series of DNS servers is needed to have any idea if a link is usable. The physical link being up is not even half the story.
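As a rough sketch of the "series of servers" idea, the link can be treated as usable if any one of several targets answers. The function names, the min_ok parameter, and the Linux-style ping flags are my assumptions, and this sketch does not bind the probe to a particular WAN interface, which a real dual WAN check would need to do.

```python
import subprocess

def ping_once(host, timeout_s=1):
    """Send one ICMP echo with the system ping (Linux-style -c/-W flags assumed)."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

def link_usable(targets, probe=ping_once, min_ok=1):
    """Return True as soon as at least min_ok targets have answered."""
    ok = 0
    for host in targets:
        if probe(host):
            ok += 1
            if ok >= min_ok:
                return True
    return False

# Usage with a fake probe, so the example runs without network access:
# pretend 8.8.8.8 is unreachable but 8.8.4.4 answers.
fake = lambda host: host == "8.8.4.4"
print(link_usable(["8.8.8.8", "8.8.4.4"], probe=fake))
```

Using more than one target means a single server being down or dropping ICMP doesn't get mistaken for a dead link.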
 
I've disabled the watchdog function in the beta release Gary provided me to see if things are more stable. Five hours later - so far so good. Of course, that hardly means anything.

Out of curiosity, what's the build number that you got sent?
 
Same here, on rt-n66u

Really bummed. I was having trouble with the early, Linksys-branded RV042 and decided to give Asus DualWAN a whirl.

Same issue: Failover ON means frequent (brief) WAN failures, on both Asus and Merlin-Asus (praised be the Wizard who's dripping with integrity).

Even after turning Dual-WAN off I had to factory reset to get reliability back.

Now I guess it's on to the RV042 V3 and Asus gets demoted to an access point.

Unless somebody stops me with good news or advice (or a spiritual revelation that has me out the door forever…)
 
Yeah, disconnecting the phone line from the DSL modem or the cable line from the cable modem is a reasonable test to see if fail over is triggered.

I still think that pinging a DNS server (say 8.8.8.8) or rather a series of DNS servers is needed to have any idea if a link is usable. The physical link being up is not even half the story.

You would be better off pinging the gateway of the WAN connection. I've seen this kind of active/passive setup, which only fails over on link failure, even on enterprise devices.
Asus will probably fix it over time, but I suspect that the number of customers who actually use this feature is pretty low.
 
The problem is that the gateway of the WAN connection can change over time and is not always a routable public IP address. And it may or may not respond to ICMP queries. And, unfortunately, the Asus routers only have one global watchdog address you can use instead of allowing you to configure a watchdog address per WAN connection.
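For illustration, this is roughly what a per-WAN watchdog setting could look like, sketched in Python. Nothing here exists in the Asus firmware as far as this thread shows: the wan0/wan1 names, the WATCHDOG_TARGETS table, and the policy of only probing the gateway when it is a public address are all assumptions.

```python
import ipaddress

# Hypothetical per-WAN watchdog targets; names and addresses are illustrative only.
WATCHDOG_TARGETS = {
    "wan0": ["8.8.8.8"],
    "wan1": ["8.8.4.4"],
}

def is_public(addr):
    """True if addr is a globally routable IP address."""
    return ipaddress.ip_address(addr).is_global

def targets_for(wan, gateway=None, table=WATCHDOG_TARGETS):
    """Return probe targets for one WAN, adding the gateway only if it is public."""
    targets = list(table.get(wan, []))
    if gateway is not None and is_public(gateway):
        targets.insert(0, gateway)
    return targets

# Usage: a private modem-side gateway is skipped, a public one is probed first.
print(targets_for("wan0", gateway="192.168.1.254"))
print(targets_for("wan1", gateway="1.1.1.1"))
```

Because the table is keyed by WAN, a changing or unresponsive gateway on one connection doesn't affect how the other connection is checked.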
 
Gary gave me some instructions to downgrade/upgrade firmware, reset to factory defaults, etc. I've done that.

We'll see if this stabilizes anything. Watchdog has been turned back on.

With watchdog disabled, I saw no WAN connection failures in 24 hours, but disabling watchdog is not an acceptable solution.
 
Do let us know! If I read you right, the Asus beta with the right factory reset might be an answer. (Then if that reset was unusual we'll want details.)
 
Two hours into the latest load/factory resets to defaults, still getting WAN(0) disconnects. At least in the beta load I'm running, I'm only seeing this with watchdog enabled.

I'm wondering whether these errors also occur with watchdog disabled, with the firmware simply ignoring them and leaving the WAN connected, or whether they are generated by buggy code that only runs when watchdog is enabled.

Frustrating:

Aug 13 12:59:58 WAN(0) Connection: Detected that the WAN Connection Type was PPPoE. But the PPPoE Setting was not complete.
Aug 13 12:59:58 stop_nat_rules: apply the redirect_rules!
Aug 13 13:02:00 rc_service: wanduck 1273:notify_rc restart_wan_if 0
Aug 13 13:02:04 rc_service: wanduck 1273:notify_rc restart_wan_line 1
Aug 13 13:02:04 start_nat_rules: apply the nat_rules(/tmp/nat_rules_vlan3_vlan3)!
Aug 13 13:02:05 miniupnpd[6798]: received signal 15, good-bye
Aug 13 13:02:06 miniupnpd[7269]: HTTP listening on port 44573
Aug 13 13:02:06 miniupnpd[7269]: Listening for NAT-PMP traffic on port 5351
Aug 13 13:02:27 rc_service: wanduck 1273:notify_rc restart_wan_line 0
Aug 13 13:02:27 start_nat_rules: apply the nat_rules(/tmp/nat_rules_vlan2_vlan2)!
Aug 13 13:02:27 miniupnpd[7269]: received signal 15, good-bye
Aug 13 13:02:27 miniupnpd[7334]: HTTP listening on port 56190
Aug 13 13:02:27 miniupnpd[7334]: Listening for NAT-PMP traffic on port 5351
Aug 13 13:02:33 WAN(0) Connection: WAN was restored.
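For anyone who wants to quantify these blips from their own syslog, here is a small Python sketch that pairs each wanduck restart_wan_if line with the next 'WAN was restored' line and reports the gap. The sample lines are copied from the log above; the function names are made up, and the year is an assumption because syslog timestamps don't carry one.

```python
from datetime import datetime

LOG = """\
Aug 13 13:02:00 rc_service: wanduck 1273:notify_rc restart_wan_if 0
Aug 13 13:02:04 rc_service: wanduck 1273:notify_rc restart_wan_line 1
Aug 13 13:02:27 rc_service: wanduck 1273:notify_rc restart_wan_line 0
Aug 13 13:02:33 WAN(0) Connection: WAN was restored.
"""

def parse_time(line, year):
    """Parse the 15-character syslog timestamp; the year is supplied by the caller."""
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")

def outages(lines, year=2014):
    """Return (start, end, seconds) from each restart_wan_if to the next restore."""
    result = []
    start = None
    for line in lines:
        if start is None and "wanduck" in line and "restart_wan_if" in line:
            start = parse_time(line, year)
        elif start is not None and "WAN was restored" in line:
            end = parse_time(line, year)
            result.append((start, end, (end - start).total_seconds()))
            start = None
    return result

for start, end, seconds in outages(LOG.splitlines()):
    print(f"{start:%H:%M:%S} -> {end:%H:%M:%S}: {seconds:.0f} s")
```

On the excerpt above this reports one 33 second gap, measured from the restart rather than from the earlier PPPoE message at 12:59:58.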
 
