ARP resolution issues

cameloid

I'm experiencing strange issues with ARP resolution/broadcast propagation in my home network.

The LAN configuration is pretty straightforward:

1. Two ASUS routers (RT-AC3100 and RT-AX86U), both in AP mode, both running the latest Asuswrt-Merlin version (386.3_2)
2. Cisco SG110D-05 unmanaged 5-port switch, which connects both APs together
3. A router also connected to the switch (irrelevant to the question though)

The problem is that wireless clients connected to AP #1 and AP #2 often do not "see" each other (pings do not go through, etc.).

In more detail, it typically looks like this:

admin@RT-AX86U-7EB8:/tmp/home/root# arp -n
? (192.168.1.101) at dc:a6:32:34:e2:61 [ether] on br0
? (192.168.1.1) at b4:fb:e4:ca:ea:fe [ether] on br0
? (192.168.1.96) at 4c:1d:96:2b:c8:07 [ether] on br0
? (192.168.1.111) at 98:fe:94:46:79:54 [ether] on br0
? (192.168.1.102) at 84:c5:a6:7c:71:8f [ether] on br0
? (192.168.1.50) at 00:16:e8:99:91:30 [ether] on br0

In other words, 192.168.1.101 and 192.168.1.102 hosts connected to AP #1 do exist in the ARP table of AP #2.
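For anyone who wants to repeat this check without reading the table by eye, here is a minimal sketch that parses `arp -n` output in the format shown above and lists which expected hosts have no resolved entry. The function names are made up for illustration, and SAMPLE is only a trimmed subset of the table above, so it is not a claim about what the AP actually holds.

```python
import re

# Matches lines such as "? (192.168.1.1) at b4:fb:e4:ca:ea:fe [ether] on br0".
# An unresolved entry shows "<incomplete>" in place of the MAC address.
ARP_LINE = re.compile(
    r"\((?P<ip>\d+\.\d+\.\d+\.\d+)\)\s+at\s+"
    r"(?P<mac>[0-9A-Fa-f:]{17}|<incomplete>)"
)


def parse_arp_table(text):
    """Return {ip: mac or None} for every parsable line of `arp -n` output."""
    table = {}
    for line in text.splitlines():
        match = ARP_LINE.search(line)
        if match:
            mac = match.group("mac")
            table[match.group("ip")] = None if mac == "<incomplete>" else mac.lower()
    return table


def missing_hosts(table, expected_ips):
    """List the expected IPs that have no resolved MAC in the table."""
    return [ip for ip in expected_ips if table.get(ip) is None]


# Trimmed, illustrative subset of the output above (assumption: same format).
SAMPLE = """\
? (192.168.1.1) at b4:fb:e4:ca:ea:fe [ether] on br0
? (192.168.1.96) at 4c:1d:96:2b:c8:07 [ether] on br0
? (192.168.1.102) at 84:c5:a6:7c:71:8f [ether] on br0
"""

if __name__ == "__main__":
    table = parse_arp_table(SAMPLE)
    print(missing_hosts(table, ["192.168.1.96", "192.168.1.102"]))
```

Running the same check on both APs and on a wireless client would show where the entries stop appearing.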

However, when I ping them from host 192.168.1.96 (which is connected to AP #2), it fails to resolve their MAC addresses (as if AP #2 were blocking the Ethernet broadcasts sent by 192.168.1.96, or something like that). There are no ARP records for 192.168.1.101 or .102 on the 192.168.1.96 host, and obviously they don't ping:

Pinging 192.168.1.101 with 32 bytes of data
Reply from 192.168.1.96: Destination host unreachable.
Reply from 192.168.1.96: Destination host unreachable.
Reply from 192.168.1.96: Destination host unreachable.
Reply from 192.168.1.96: Destination host unreachable.

Sometimes the problem rectifies itself. Also, it often happens that ARP resolution works for _some_ hosts connected to AP #2 but not for the others.

Am I missing something?
 
Would be helpful if we could eliminate the switch as a source of the problem. Can you temporarily try a different switch? Even if that's only the router's switch.

Is there symmetry here when it comes to .96 not being able to ping .101 and .102? IOW, can .101 and .102 not ping .96 as well? If they can, can .96 now ping .101 and .102?
 
I already tried to replace the switch temporarily, with no result. I'm not 100% sure but it seems like the issue is somehow related to wireless clients only, since the same laptop connected to the same AP #2 via Ethernet pings both .101 and .102 successfully.

There is a symmetry, .101 and .102 are unable to ping .96, and vice versa.
 
Did you previously run one or both of the APs in router mode and have something like YazFi guest wifi configured?
I remember running my router as a router with YazFi and guest SSIDs, then putting the router into AP mode and the guest SSIDs still being present, as if it was still able to at least partially utilize some of the router plugin script functions.

The other thought: if you're only using these as APs, then there should be no value in running the Merlin fork vs. OEM.
Factory reset and install OEM firmware; test again.
 
No, both of them were configured as Access Points from the very beginning.

I didn't try to revert back to OEM firmware yet but I thought that this report might be useful for Merlin since the issue is pretty obvious and easily, reliably reproducible.
 
If it's only affecting wireless, then that's interesting, and suggests it's something specific to the wireless drivers, or how they're configured.

What I suggest is defining static ARP entries for these devices. Let's see if communication is possible in that case, to make sure it's *only* a failure of dynamic ARP. Because if static ARP doesn't work either, then there's more than just an ARP issue here.

I suppose in the worst case, you could use static ARP for the long term, or at least until you find the actual culprit(s). I know it's not convenient. But it's better than nothing at this point.
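For reference, a small sketch of what defining a static entry could look like. It only builds the command strings and does not run anything. The interface name `wlan0` is a placeholder, the MAC in the usage line is the one listed for 192.168.1.102 in the `arp -n` output earlier in the thread, and the commands normally need root/administrator rights. `ip neigh replace ... nud permanent` is the iproute2 form on Linux; `arp -s` with a dash-separated MAC is the Windows form.

```python
import ipaddress
import re

MAC_RE = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$")


def static_arp_commands(ip, mac, linux_dev="wlan0"):
    """Build (not run) static ARP commands for one peer.

    linux_dev is a hypothetical interface name; substitute the real one.
    """
    ip = str(ipaddress.IPv4Address(ip))      # raises ValueError on a bad address
    mac = mac.lower().replace("-", ":")
    if not MAC_RE.match(mac):
        raise ValueError(f"not a MAC address: {mac!r}")
    return {
        # iproute2 syntax: a permanent neighbour entry on the given device
        "linux": f"ip neigh replace {ip} lladdr {mac} dev {linux_dev} nud permanent",
        # Windows arp expects the MAC with dashes
        "windows": f"arp -s {ip} {mac.replace(':', '-')}",
    }


if __name__ == "__main__":
    for system, command in static_arp_commands(
        "192.168.1.102", "84:c5:a6:7c:71:8f"
    ).items():
        print(f"{system}: {command}")
```

The entry has to be added on both ends (each host needs the other's MAC), otherwise the reply direction still depends on dynamic ARP.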
 
To my surprise, static ARPs don't work either. Reverted AP #2 to stock firmware, same result.

I can ping .101 and .102 from AP #2 itself, but not from any wireless clients connected to AP #2.
 
Exactly the same problem. AC86U w/ latest 387.0. Wired clients are good. Wireless clients can't see each other. AP isolation was never turned on. For now I've configured a /32 netmask on the clients so that their traffic goes through the router, which has proper ARP entries.
 
You can't turn off hardware acceleration, and the packets don't go through the CPU, so you can't debug them yourself...
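To illustrate why the /32 workaround sidesteps ARP between clients, here is a small sketch using Python's standard `ipaddress` module. The addresses are borrowed from the first post purely as an example, and `is_on_link` is a made-up helper name. With a /24 mask the peer counts as on-link, so the client ARPs for it directly; with a /32 mask nothing else is on-link, so the client hands the packet to its gateway instead. How the gateway itself is reached with a /32 mask depends on the client OS and is not covered here.

```python
import ipaddress


def is_on_link(local_interface, peer_ip):
    """True if peer_ip falls inside the subnet configured on local_interface.

    local_interface is an address with prefix length, e.g. "192.168.1.96/24".
    """
    iface = ipaddress.ip_interface(local_interface)
    return ipaddress.ip_address(peer_ip) in iface.network


if __name__ == "__main__":
    for config in ("192.168.1.96/24", "192.168.1.96/32"):
        direct = is_on_link(config, "192.168.1.101")
        how = "ARP for the peer directly" if direct else "send via the gateway"
        print(f"{config}: {how}")
```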
 
I have never used either one of these units in AP mode, so this is just a throw out ..... Does either unit have a wireless client isolation setting at all?
 
Interesting, I have the same symptoms and a similar configuration. I have two WiFi routers connected to a wired router, which in turn is connected to my cable modem. All devices on my network can access the internet just fine (always), but frequently one machine cannot access another on the same (only) subnet: ping, ssh, and curl all fail with variations on “no route to host” errors. Waiting a while (sometimes 20-30 minutes) usually allows a host to connect to the other. As with the original poster, the inability to connect is symmetric.

However, for me, it looks like adding static ARP entries to each end (any two machines unable to connect to each other) allows them to connect. As long as the ARP table isn’t cleared, entries deleted, or machine rebooted, my connection issues go away.

If the ARP entry for host 1 on host 2 goes away, those two machines will frequently (but not always) NOT be able to connect. The ARP entries show “incomplete” in cases where the connection doesn’t work.

So, in my case, I know the issue is that ARP resolution, for some reason, doesn’t work. Static ARP entries fix the problem.

How can I track this down? I assume that the device that initiates the connection issues an ARP request. Who answers it: the target machine, or any machine that has an entry for that IP address in its ARP table? And if no ARP response to the broadcast is received, is the request retried?

I suppose I’ll need to run tcpdump to watch the ARP packets. Does that program allow watching ARP packets?
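On the questions above: an ARP request is an Ethernet broadcast asking "who has this IP", and normally only the host that owns the target IP answers, with a unicast reply (proxy ARP setups are the exception). Most stacks retry a few times before leaving the entry incomplete, but the exact count is OS-dependent. tcpdump can filter on ARP with something like `tcpdump -n -e -i <interface> arp`. As a rough illustration of what shows up in such a capture, the sketch below builds and decodes a request frame using only the standard library. The helper names are made up, and the sender MAC/IP are taken from the first post as an example.

```python
import socket
import struct

BROADCAST = "ff:ff:ff:ff:ff:ff"
ETHERTYPE_ARP = 0x0806
ARP_FORMAT = "!HHBBH6s4s6s4s"  # htype, ptype, hlen, plen, oper, sha, spa, tha, tpa


def mac_to_bytes(mac):
    return bytes.fromhex(mac.replace(":", ""))


def build_arp_request(src_mac, src_ip, target_ip):
    """Return the 42-byte Ethernet frame for 'who has target_ip? tell src_ip'."""
    eth = (mac_to_bytes(BROADCAST) + mac_to_bytes(src_mac)
           + struct.pack("!H", ETHERTYPE_ARP))
    arp = struct.pack(
        ARP_FORMAT,
        1, 0x0800, 6, 4,            # Ethernet, IPv4, MAC length, IP length
        1,                          # opcode 1 = request, 2 = reply
        mac_to_bytes(src_mac), socket.inet_aton(src_ip),
        bytes(6),                   # target MAC is unknown, so all zeros
        socket.inet_aton(target_ip),
    )
    return eth + arp


def describe_arp(frame):
    """Decode the fields worth checking in a capture."""
    dst, _src, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_ARP:
        raise ValueError("not an ARP frame")
    fields = struct.unpack(ARP_FORMAT, frame[14:42])
    oper, spa, tpa = fields[4], fields[6], fields[8]
    return {
        "broadcast": dst == b"\xff" * 6,
        "operation": {1: "request", 2: "reply"}.get(oper, "other"),
        "sender_ip": socket.inet_ntoa(spa),
        "target_ip": socket.inet_ntoa(tpa),
    }


if __name__ == "__main__":
    frame = build_arp_request("4c:1d:96:2b:c8:07", "192.168.1.96", "192.168.1.101")
    print(len(frame), describe_arp(frame))
```

When capturing on both the sender and the target, the useful comparison is whether the broadcast request ever reaches the target, and whether a reply makes it back.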
 
One more thing — it seems the original poster is having issues across devices connected wirelessly to different access points. In my case, I’m having the ARP issue across devices connected to the same access point. It is an ASUS RT-AC3100. I’m pretty sure that if I disconnect my other access point (TP-Link Archer), the problem will still persist. Also in my case, neither access point has DHCP enabled. Only my wired router (to which both access points are connected) has its DHCP server enabled.
 
Just to add a note, that I have been struggling with a similar issue with my network, running on a single ASUS RT-AX88U, where I was often unable to ping between certain hosts. It seemed fairly random as to what was going on, but after some investigation, I too determined it was related to missing ARP entries.

But I seem to have stumbled on a possible solution for my particular problem. It seems that it may be something to do with my Guest Network, which was running on 2.4GHz Guest Network 1. I switched it off temporarily for unrelated reasons, and my ARP problem seemed to disappear. I have since re-enabled it on 5GHz Guest Network 2 and so far, after about 24 hours, I'm not seeing any problems.

No idea if this is coincidental and something else has changed, or whether my issues will resurface at some point. Also, if this is the resolution to my issue, I don't know whether it's related to the change from 2.4GHz to 5GHz or the Guest Network change from 1 to 2 (I seem to remember that there is a difference to how these work?). But I thought it worth posting my experience, in case it helps anybody else.
 
I wish it were that simple for me. I disabled my guest network a while back, with no improvement on my issue. I had previously reported that manually setting the ARP binding on all machines (to each other) on my local network made the problem "go away". Unfortunately, it seems that it only happens much more rarely now, but it still does happen sometimes -- namely, that I can't reach my printer (for example) from my laptop despite there being an ARP entry for the printer on my laptop. Or that I can't reach my Ubuntu laptop from my Mac laptop, despite there being ARP entries for the other machine on both machines. I'm quite stymied. My current thought is to replace my ASUS RT-AC3100 wireless router. I've ordered, and am awaiting the arrival of, a couple of ASUS ZenWiFi AX6600 tri-band mesh routers. My plan is to get those up and running, turn off my ASUS RT-AC3100 wireless router, and see if the problem goes away.
 
So do you have your AP receiving a static address? If so, try allowing the AP to receive its address by DHCP assignment instead of statically assigning it. See if that makes a difference when accessing the clients that connect to it.
 
Yes, in my case, my Asus AP has a static address. It doesn't get a DHCP address from my wired router. And I've found the same as you -- machines that are on a wired network have no trouble reaching each other (having gotten DHCP addresses from wired router). But machines that are only wifi connected (having also received DHCP addresses from the same wired router), will intermittently not be able to connect to each other.
 
So I believe what you are experiencing is anomalous behavior that arises when wireless clients connect to an access point whose address is statically assigned, especially if that static address lies within the actual DHCP pool range. In other words, that might be the cause.
 
A couple of things you could try:

Set the DHCP pool higher, for example 192.168.1.10-192.168.1.254, and give your access point a static address within 192.168.1.2-192.168.1.9.

Another possibility: just let the access point get an address from the DHCP server.
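A quick way to sanity-check the first suggestion is to test whether the AP's static address falls inside the DHCP pool. This is a minimal sketch with Python's standard `ipaddress` module; `in_dhcp_pool` is a made-up helper, and the AP addresses in the usage lines are hypothetical examples, not values reported in this thread.

```python
import ipaddress


def in_dhcp_pool(address, pool_start, pool_end):
    """True if address lies within the inclusive range pool_start..pool_end."""
    addr = ipaddress.IPv4Address(address)
    return ipaddress.IPv4Address(pool_start) <= addr <= ipaddress.IPv4Address(pool_end)


if __name__ == "__main__":
    pool = ("192.168.1.10", "192.168.1.254")    # pool from the suggestion above
    for ap_address in ("192.168.1.5", "192.168.1.50"):  # hypothetical AP addresses
        clash = in_dhcp_pool(ap_address, *pool)
        print(ap_address, "is inside the pool" if clash else "is outside the pool")
```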
 
Disabling Guest Network 1 on 2.4 GHz seems to somewhat mitigate the issue. I may do some further debugging.
 
