RT-AC68U 802.11 DSCP QoS issue?

brocellous

Hi.

I have an old RT-AC68U router running Asuswrt-Merlin 386.7, and an Arch Linux laptop with Linux 6.8.9 and an Intel AC 8265 wireless card.

I had trouble with severe packet loss (70-99%) to Cloudflare services and web hosts, specifically when connected to the wifi, and I finally discovered it was a consequence of a quirk of Cloudflare's servers, which unconditionally set TOS byte 0x10 (DSCP 0x04) in the IPv4 traffic received by the router. This seems to set the socket priority in Linux for forwarded packets, and some consequence of this modified skb priority results in severe packet loss when transmitting to my laptop specifically; it doesn't affect all wireless clients on the network. Strangely, the packet loss seems to be mitigated by a high enough packet rate, and is more severe with sparse traffic.
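For reference, the arithmetic behind those numbers can be sketched in shell. The DSCP/ECN split is just the standard layout of the TOS byte. The priority lookup is an assumption: it is modeled on the `ip_tos2prio` table that mainline Linux consults via `rt_tos2priority()`, and the router's vendor kernel may differ. The function names are made up for illustration.

```shell
#!/bin/sh
# Split an IPv4 TOS byte into DSCP (upper 6 bits) and ECN (lower 2 bits).
tos_to_dscp() { echo $(( $1 >> 2 )); }
tos_to_ecn()  { echo $(( $1 & 3 )); }

# ASSUMPTION: models mainline Linux's ip_tos2prio table, where bits 3-4 of
# the TOS byte select the priority class. Not verified on the router kernel.
tos_to_prio() {
    case $(( ($1 & 0x18) >> 3 )) in
        0) echo 0 ;;  # TC_PRIO_BESTEFFORT
        1) echo 2 ;;  # TC_PRIO_BULK
        2) echo 6 ;;  # TC_PRIO_INTERACTIVE
        3) echo 4 ;;  # TC_PRIO_INTERACTIVE_BULK
    esac
}

tos=0x10
printf 'TOS 0x%02x -> DSCP 0x%02x, ECN %d, priority %d\n' \
    "$(( $tos ))" "$(tos_to_dscp "$tos")" "$(tos_to_ecn "$tos")" "$(tos_to_prio "$tos")"
# prints: TOS 0x10 -> DSCP 0x04, ECN 0, priority 6
```

Under that assumption, TOS 0x10 maps to priority 6 (TC_PRIO_INTERACTIVE), which would be the value that `CLASSIFY --set-class 0:0` overwrites with 0.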

Anyway, the following iptables rule completely resolves the issue for me:

$ iptables -t mangle -A FORWARD -j CLASSIFY --set-class 0:0

Simply washing the DSCP bits without resetting the priority does not help.

Right now I'm wondering: what is the actual expected effect of this priority? What changes in the transmission that could cause differential treatment by my laptop? On the router, QoS is not enabled in the Adaptive QoS tab. Disabling WME with `wl wme 0` also doesn't seem to have any effect, but the difference with and without the iptables rule is night and day, and the effect is immediate: I can remove and re-add the rule and watch the problem reappear and get fixed instantly.

I wasn't able to discern any difference in the received packets with tcpdump. Any clue what it could be?
 
Without any QoS on the router, I can’t think of a reasonable explanation. What about on the laptop? Any qdiscs or WME there?
 
Not that I'm aware of.

$ tc qdisc show
qdisc noqueue 0: dev lo root refcnt 2
qdisc noqueue 0: dev wlan0 root refcnt 2

I understand that iwlwifi incorporates fq-codel internally in some way, I think to better account for variable airtime, but in any case I think these qdiscs would apply only to egress, and I don't have any trouble with egress traffic. In every case the drop happens to returning packets between the AP and my laptop. To be clear, I _can_ see the dropped replies on the LAN bridge interface with `tcpdump -i br0` on the router before they are (presumably) emitted by the AP.

I have no idea how to characterize the WME/802.11e behavior of the Intel card. I think wl is a Broadcom-specific command? It doesn't seem to be packaged for Arch anyway.
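For what it's worth, the standard WME/WMM mapping from 802.1D user priority to access category is easy to write down, so here is a minimal sketch. Whether the router's Broadcom driver actually takes the user priority from the skb priority is an unverified assumption, and `up_to_ac` is just an illustrative name.

```shell
#!/bin/sh
# Standard WMM mapping from 802.1D user priority (0-7) to access category.
up_to_ac() {
    case "$1" in
        1|2) echo AC_BK ;;  # background
        0|3) echo AC_BE ;;  # best effort
        4|5) echo AC_VI ;;  # video
        6|7) echo AC_VO ;;  # voice
        *)   echo "invalid user priority: $1" >&2; return 1 ;;
    esac
}

# Print the whole table for reference.
for up in 0 1 2 3 4 5 6 7; do
    printf 'UP %d -> %s\n' "$up" "$(up_to_ac "$up")"
done
```

If the driver did use the skb priority this way, a packet forwarded with priority 6 would be queued as voice (AC_VO) and one reset to priority 0 as best effort (AC_BE). That would be one concrete difference on the air, but it is only a guess.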
 
So, I really wasn't expecting any difference, but I can't seem to reproduce the issue after upgrading to 386.13.

The issue was always mitigated by a high enough packet rate, and in practice I would then have to idle for ~60-90 seconds before the issue would be reproducible again. It was finicky, but I was able to reliably reproduce this bug many, many times over several days (it took me a while to find the cause). Now this idling strategy doesn't seem to work. I'll update if it reproduces again.
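In case anyone wants to try the same thing, the idle-then-probe procedure can be sketched roughly like this. It assumes iputils-style `ping` summary output. The target address, idle time, and function names are illustrative defaults, not the exact commands I ran.

```shell
#!/bin/sh
# Extract the packet-loss percentage from a ping summary read on stdin.
loss_pct() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# Hypothetical reproduction helper: stay idle, then send sparse pings
# (one per second) and print the loss percentage.
#   $1 = target host (illustrative default: 1.1.1.1)
#   $2 = idle time in seconds (illustrative default: 90)
repro() {
    target=${1:-1.1.1.1}
    idle=${2:-90}
    sleep "$idle"
    ping -c 20 -i 1 "$target" | loss_pct
}
```

Running `repro 1.1.1.1 90` with and without the iptables rule in place should make the difference obvious if the problem is present.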
 
