Hey all — I’m running OpenSpeedTest with NGINX directly on my Broadcom-based router (Asuswrt). During Speedtest downloads (i.e., when the router is serving data to a wired client), I’m seeing a major performance bottleneck over Ethernet.
Download speeds are much lower than expected, while upload speeds are unaffected. Watching htop, I noticed that the [bcmsw_rx] kernel thread is pegged at 100% on a single core (usually CPU0) for the duration of the download.
As noted above, this wasn’t the case on the firmware release mentioned earlier: there I was able to hit full speed and could see all four CPU cores being utilized during heavy download traffic. Now it looks like only one core is doing all the work, and that core gets saturated. This seems to be a new bottleneck, and I’m wondering if something changed at the kernel or driver level.
From what I understand, [bcmsw_rx] is responsible for handling RX packets from the Broadcom switch. Since it’s a kernel thread, its CPU usage doesn’t show under user-space processes like NGINX — which aligns with what I’m seeing.
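For anyone who wants to check the same thing without htop, here is a rough sketch of how the thread's kernel-mode CPU time and last-used core could be read from /proc/<pid>/stat. It assumes Python 3 is available on the router (for example through Entware, which is an assumption on my part) or that the stat line is copied to a PC. The function names are my own; the field positions (15 for stime, 39 for processor) are taken from the proc(5) man page.

```python
import os

def parse_stat_line(line):
    """Return (comm, stime_ticks, last_cpu) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces,
    # so split on the last ')' instead of on whitespace.
    head, _, tail = line.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is field 3 (state), so field N sits at index N - 3.
    stime = int(fields[15 - 3])   # kernel-mode CPU time, in clock ticks
    cpu = int(fields[39 - 3])     # CPU the task last ran on
    return comm, stime, cpu

def find_threads(name, proc="/proc"):
    """Yield (pid, stime_ticks, last_cpu) for every task whose comm equals name.

    Note: htop shows kernel threads in brackets, but the comm field itself
    has no brackets, so pass "bcmsw_rx" rather than "[bcmsw_rx]".
    """
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as fh:
                comm, stime, cpu = parse_stat_line(fh.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited while scanning, or the line was malformed
        if comm == name:
            yield int(entry), stime, cpu
```

Calling `list(find_threads("bcmsw_rx"))` twice a few seconds apart during a download should show the stime value climbing, and if the thread really is stuck on one core, the same CPU number each time.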
Questions:
- Has anyone seen [bcmsw_rx] bottleneck like this during router TX?
- Could this be due to IRQ/core binding changes in the firmware?
- Any way to spread RX processing across multiple cores or tweak IRQ affinity?
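To make the IRQ question more concrete, below is a minimal sketch of how I'd look at the interrupt distribution. It parses the text of /proc/interrupts, flags IRQs where a single core handled nearly all events, and computes the hex bitmask that /proc/irq/<n>/smp_affinity expects. The function names and the 0.95 threshold are my own choices, and Python 3 on the router is an assumption (the text can also be copied to a PC). I have not confirmed whether this firmware honors smp_affinity writes, or whether [bcmsw_rx] is tied to an IRQ at all.

```python
def parse_interrupts(text):
    """Map IRQ label -> (per-CPU counts, description) from /proc/interrupts text."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        parts = rest.split()
        counts = parts[:ncpu]
        # Summary rows such as "Err:" have fewer columns than CPUs; skip them.
        if len(counts) < ncpu or not all(c.isdigit() for c in counts):
            continue
        table[label.strip()] = ([int(c) for c in counts], " ".join(parts[ncpu:]))
    return table

def single_core_irqs(table, threshold=0.95):
    """Labels of IRQs where one CPU handled at least `threshold` of all events."""
    hot = []
    for label, (counts, _) in table.items():
        total = sum(counts)
        if total and max(counts) / total >= threshold:
            hot.append(label)
    return hot

def affinity_mask(cpus):
    """Hex bitmask for /proc/irq/<n>/smp_affinity (bit i set means CPU i allowed)."""
    return format(sum(1 << c for c in set(cpus)), "x")
```

Feeding it `open("/proc/interrupts").read()` before and after a test run and comparing the counts should show which IRQ, if any, only ever lands on CPU0. `affinity_mask([0, 1, 2, 3])` gives the mask for allowing all four cores.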
Happy to provide logs or additional testing details if it helps. Appreciate any insights!
@ColinTaylor - I know this is an area of your expertise, so hoping you may have some insight as well.
Results:
[htop output]
[top output]