I had a secret fear that my fix from yesterday was only a temporary reprieve from the real problem, and today that fear was confirmed.
PROBLEM STATEMENT
After my network has been up for some amount of time (anywhere from minutes to hours), clients lose the ability to load the RT-AC5300's web interface. It doesn't matter whether the client is wireless or wired, or whether it is connected through a media bridge or directly to the router. At some point, ALL clients on the network lose access to the router's web UI. I THINK this is more than simply a matter of HTTP access to the web UI, though. For example, I THINK I remember pings to the router address failing when the problem happens. (I have confirmed that pings work when things are working, but I need to retest this the next time the problem occurs.)
NETWORK TOPOLOGY
I have an RT-AC5300 set up as my wifi router, and two RT-AC86U devices serving as Media Bridges (there are Ethernet devices on all three floors of my home). I live in a condominium with HOA-provided fiber internet access. No cable modems; I just connect the router's WAN port to an RJ-45 jack in the wall. I am NOT using AiMesh. AC5300 firmware version is: 3.0.0.4.384_81981-g19f55de
OBSERVATIONS SO FAR
- All clients can access the internet, and performance is good, while the problem is occurring. The only issue appears to be the inability to reach the router itself via TCP/IP.
- WORKAROUND: If I disconnect the WAN connection and reboot the router, I can connect a client to the router and load the web interface. At that time I can change the router's base IP address (see my "fix" from yesterday, linked earlier), and then after clients re-connect I can use the web interface for a while- even after re-connecting the WAN cable- but eventually that stops working and I have to repeat the process again.
- The "ASUS Router" iOS app also fails to connect to the router when other clients are having the same problem.
- Rebooting the router alone does NOT fix the problem, even temporarily. The only thing that seems to work is physically disconnecting the WAN cable AND THEN rebooting, after which point I can access the web UI and change the device's IP address.
- I suspected an ARP issue, since the behavior is similar to when there are ARP cache corruptions (when you open a browser and enter the router's IP address, the browser just "spins" for a long time as if trying to resolve the proper destination). However, on Windows clients, the arp tool shows zero collisions/dupes on any of the IPs I've used for the router's address (192.168.1.1, 192.168.10.1).
- The only node on the network where this appears to be a problem is the router. For example, I can access the web interfaces of both of my AC86Us, at least from clients physically connected to them. (Wireless clients can't load the WebUI on the media bridges, only Ethernet clients plugged into the bridge devices can.)
- I only just now learned of the "service restart_httpd" command, and I have SSH turned on, so the next time this happens I will check A) whether I can SSH in at all (if not, that would point to some kind of topology issue or other routing problem) and B) whether restarting the HTTP daemon does anything. I would be surprised if it did, considering that rebooting doesn't even work on its own, but I will try it.
- Also... if I remember right, pings to the router from clients didn't work either, which again would indicate a routing problem. But things are working now (and so are pings), so I need to re-test to confirm this is the case.
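
To make the re-test repeatable, here is a rough sketch of a script a client could run the next time the problem occurs. It tries one ping, then a TCP connection to the SSH port and to the web UI port, and prints a one-line reading of the result. The function names, the ports (22 and 80, the usual defaults) and the wording of the interpretations are my own assumptions, not anything provided by ASUS; adjust them if SSH or the web UI is configured on a different port.

```python
import platform
import socket
import subprocess


def tcp_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, unreachable, and timed-out connections
        return False


def ping_ok(host, timeout=5.0):
    """Send one echo request with the system ping tool; True on exit code 0."""
    # Windows uses -n for the packet count, most other systems use -c.
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def interpret(ping, ssh, http):
    """Map the three probe results to a rough, hypothetical diagnosis."""
    if http:
        # A successful connect only shows the port accepts connections,
        # not that the web UI pages actually load.
        return "web UI port reachable: problem not currently reproducing"
    if ssh:
        return "SSH up but HTTP down: try 'service restart_httpd' over SSH"
    if ping:
        return "ping only: router answers ICMP but neither TCP service responds"
    return "nothing reachable: points to a routing/ARP/topology problem"


def run_checks(host, ssh_port=22, http_port=80):
    """Run all three probes against host and return (ping, ssh, http)."""
    return ping_ok(host), tcp_open(host, ssh_port), tcp_open(host, http_port)
```

For example, `print(interpret(*run_checks("192.168.10.1")))` would test the router at one of the addresses mentioned above. Running it once while everything works gives a baseline to compare against when the web UI stops loading.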