I wanted to share some findings on a problem I have been banging my head against the desk over recently. I am not 100% sure when it started, but after going back and installing Diversion I noticed that my internet would sometimes drop or pause, Google Homes would say they were disconnected, etc. I would see dnsmasq taking 100% CPU, along with high network processing. This led me to turn off Diversion temporarily, but I kept seeing other issues.
In the logs I would regularly see:
br0: received packet on eth3 with own address as source address
and in most cases it was preceded by a rate-limit notice for the same message, showing thousands of them.
I also noticed that bcmsw_rx would take one full core (~25% usage) at certain times, not always correlating with the messages above. The CPU usage could be reproduced by unplugging my laptop, which is hard-wired directly into the router. It was uncanny: I would unplug it and bcmsw_rx would spin up, then I would wait a bit, plug it back in, and it would spin back down.
Both things made me think something was wrong. I had recently done a factory reset and manual reconfigure, and I was ready to do it again, just in case. I turned off AiProtection, and I turned off Protected Management Frames and Wi-Fi 6 roaming. I turned off Spanning Tree (and rebooted), and still saw the log messages and errors. Then, looking at my bridge information, I would see entries like wds0.1.0 or wds1.0.6, which had the same MAC as the ethX or wireless interfaces. This got me thinking that it was tied to my AiMesh nodes running.
I turned off one node and could still reproduce it. I disconnected the other and boom, the problem went away. I plugged it back in and could reproduce the issue again. I also noticed that at the same time kworker (kernel tasks) would spin up and take ~8-10% CPU on the AiMesh node. Something was generating traffic, since bcmsw_rx seems to run with network traffic.
I tried bringing the wds interfaces down with ifconfig (ifconfig <interface> down) for all the ones created (it varies on each boot, sometimes 4, sometimes 2 or 3). And guess what, it worked: the CPU spikes are no longer reproducible, and the log warnings about the same source address have gone away (apart from the odd one here and there, which I think happens when one of my devices moves from node to node).
I think that when my router comes up, the AiMesh nodes get WDS wireless backhaul channels set up. Sometimes more and sometimes fewer, I would guess depending on timing and whether it detects the ethernet setup. Not sure of the pattern exactly yet. If I follow those wds* connections via ifconfig, I see the sent traffic go up to a few megabytes, so something is being sent. They have no IP address, and they show up in the Wireless Log page as well. They are gone once I take them "down" via ifconfig.
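As an alternative to eyeballing ifconfig, the sent-byte counters could be sampled with something like the sketch below. It assumes the firmware exposes the usual Linux sysfs counters under /sys/class/net (not verified on every model), and `wds_tx_bytes` is a made-up helper name.

```shell
# Print "<interface> <tx_bytes>" for every wds* interface.
# $1 is the sysfs network directory (defaults to /sys/class/net); it is a
# parameter only so the function can be pointed at a test directory.
wds_tx_bytes() {
    base=${1:-/sys/class/net}
    for dir in "$base"/wds*; do
        # Skip the unexpanded glob when no wds interfaces exist.
        [ -r "$dir/statistics/tx_bytes" ] || continue
        echo "${dir##*/} $(cat "$dir/statistics/tx_bytes")"
    done
}

# On the router, sample twice and compare the numbers:
#   wds_tx_bytes; sleep 10; wds_tx_bytes
```

If the second sample is noticeably larger than the first while nothing should be using the wireless backhaul, that matches the behaviour described above.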
So I have made myself a quick script to run on boot (with some delay) to turn these off manually for now.
Code:
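# Disconnect wds interfaces.
# ifconfig without -a lists only interfaces that are up; the interface name
# is the first field of each header line, so match on that field only.
for item in $(ifconfig | awk '$1 ~ /^wds/ {print $1}')
do
    echo "Disconnecting $item ..."
    ifconfig "$item" down
done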
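For the "on boot, with some delay" part, one possible way to wire it up is sketched below. This is only an illustration: `run_delayed` is a made-up helper, the 120-second delay is an arbitrary guess, and /jffs/scripts/wds-down.sh is a hypothetical file name for the script above. On Asuswrt-Merlin the call would typically live in the services-start user script, which needs custom JFFS scripts enabled.

```shell
#!/bin/sh
# Run a command in the background after a delay, so that boot is not held up
# while waiting for the AiMesh nodes to create their wds interfaces.
# Usage: run_delayed <seconds> <command> [args...]
run_delayed() {
    delay=$1
    shift
    ( sleep "$delay"; "$@" ) &
}

# In /jffs/scripts/services-start one might then call (hypothetical path,
# arbitrary delay):
#   run_delayed 120 sh /jffs/scripts/wds-down.sh
```

Running it in the background matters because the boot scripts should return quickly; the delay itself probably needs tuning per setup.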
Will keep watching to see if it comes back or what else happens. Just wanted to share my findings. Unsure if this is a bug or expected behaviour.
---- edit ----
Running Merlin 384.16 alpha 1 - the original one. Have done a couple of full resets and reconfigures as I have played around. This also led me to the understanding that 160 MHz channels stop working once you AiMesh in a 68U and reboot. It seems to stick to just 80 MHz after that, but that makes sense.