Thanks for the extensive testing. Odd that in my particular case, I saw slightly better results with SFQ. I'm on a cable connection, however, so typical latency is quite different from DSL. I also didn't test with a lot of simultaneous streams.
For me, I can get an A regardless of whether I use SFQ or FQ_CODEL. However, I do limit my down/upstream to slightly below their maximum speed, as is recommended in most QoS guides (including OpenWRT's own guide on SQM).
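To illustrate the "slightly below maximum" rule, here is a minimal sketch of the arithmetic. The function name and the 90% figure are my own assumptions for the example, not values from any guide; the 10 Mbps upstream matches the connection used in the tests described below.

```python
def shaped_rate_kbps(measured_kbps: int, percent: int = 90) -> int:
    """Return a shaper limit set slightly below the measured line rate.

    `percent` is an assumed safety margin; tune it for your own connection.
    """
    if not 0 < percent <= 100:
        raise ValueError("percent must be between 1 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return measured_kbps * percent // 100


# Example: a 10 Mbps (10000 kbps) upstream link shaped to 90% of line rate.
print(shaped_rate_kbps(10_000))
```

The idea is simply that the queue should build up in the router, where the scheduler can manage it, rather than in the modem.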
Last night's tests mostly involved having two computers connected to the router. One had a high priority, the other a low priority (within Asus's QoS rules). I then ran speedtest.net on both computers at the same time and compared the results, while running a constant stream of pings to www.google.ca (my ISP has a local proxy for it, so my typical ping to it averages 13-14 ms while the connection is at rest). Both schedulers did equally well at splitting upstream bandwidth between the low and high priority computers (out of 10 Mbps, one was steady at 1 Mbps, the other at 9 Mbps). Downstream still didn't work properly (I would probably need to change Asus's QoS rules to use IFB to help with this, something that's a bit outside of my current expertise level). Ping results were inconclusive (too much variation in the collected data).
I also did some tests using Pingplotter so I'd get a visual report of the latency. Once again, the data was inconclusive.
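When ping data is this noisy, comparing the median and a high percentile per run can be more telling than eyeballing averages. A rough sketch follows, assuming the round-trip times have already been collected into plain lists. The function name is hypothetical and the sample numbers are placeholders, not measurements from my tests.

```python
import statistics


def summarize_rtts(rtts_ms):
    """Summarize round-trip times (in ms) from one ping run."""
    if not rtts_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(rtts_ms)
    n = len(ordered)
    # Nearest-rank 95th percentile: ceil(0.95 * n), done in integer math.
    rank = (95 * n + 99) // 100
    return {
        "median": statistics.median(ordered),
        "p95": ordered[rank - 1],
        # Sample standard deviation as a crude jitter indicator.
        "stdev": statistics.stdev(ordered) if n > 1 else 0.0,
    }


# Placeholder samples for two runs (made-up values, for illustration only).
run_a = [14, 15, 13, 40, 16, 14, 90, 15]
run_b = [14, 13, 15, 22, 14, 16, 25, 15]

for name, samples in (("run A", run_a), ("run B", run_b)):
    print(name, summarize_rtts(samples))
```

The median shows the typical latency under load, while the 95th percentile captures the spikes that an average tends to hide.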
As for my tests using OpenWRT's SQM rather than Asus's QoS, something was totally wrong with it (speeds were cut by 1/3 in both directions).
So far, the kernel changes don't seem to create any issues with the closed source components, so that would allow me to include it on master, and hide/expose the sched control knob based on the router model.