
QoS Mysteries

shelleyevans

Regular Contributor
I am trying to learn the ways of QoS, in order to improve VOIP calling. As recommended, I have installed the great script by @FreshJR, and am using adaptive QoS with manual download/upload bandwidth numbers. I am using stock firmware, because I am running AiMesh. My ultimate goal was to have bufferbloat scores of A+, but I have spent many hours over several months changing the numbers and running speed tests on DSLReports, and can't raise my bufferbloat above an "A". That's good enough that I am ready to call it a day. But I remain extremely baffled by how QoS works. I would love to hear what more experienced members have to say about my observations:
(In the interest of a simpler read, I have edited this original post, with answers that emerged from the conversation below as well as further testing. My final (October 2) edits to these observations: AiMesh did NOT work for me when it came to controlling bufferbloat. I ran AiMesh for a long time, and was forced to use the stock SFQ queue discipline, which gave me very unpredictable results on DSLReports speedtest, and poor VOIP. I returned to Merlin WRT firmware, using FQ_Codel queue discipline and @FreshJR script, and everything is now working beautifully. )

1) Bufferbloat scores on DSLReports are not consistent, even with the same setup on the router. It turns out that many combinations of download/upload bandwidth can sometimes give me an "A" in bufferbloat. In order to get a true sense of how a given download/upload combination performs, I have to run the test multiple times and then do the math-- 90% of the time 410/42 gives me an "A", for example, while 0% of the time 710/42 gives me an "A". Running the tests 8 or more times for each possible combination is unbelievably time consuming. Is this the way the tests are supposed to work, or am I doing something wrong?
Edit: unless somebody corrects me, my answer is, YES, this is how it works. You just have to run the numbers and study them. Painful. The best solution to that turns out to be to make a DSLReports account and start saving all those tests, making sure to NOTE on the test what your settings are when you test (i.e., "AiMesh enabled, 700/40 Bandwidth"). You can go back later and add a note by clicking the "results and share" button.
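
For anyone who wants to do the "run it many times and do the math" step without a spreadsheet, here is a minimal Python sketch. It assumes you keep your own log of (setting, grade) pairs by hand; the function name, the grade ordering, and the sample entries are all made up for illustration and are not anything DSLReports provides.

```python
from collections import defaultdict

# Grades from best to worst, as shown on the speed test.
GRADE_ORDER = ["A+", "A", "B", "C", "D", "F"]

def pass_rate_by_setting(runs, target="A"):
    """Map each (down, up) setting to the share of runs graded `target` or better."""
    cutoff = GRADE_ORDER.index(target)
    passed = defaultdict(int)
    total = defaultdict(int)
    for setting, grade in runs:
        total[setting] += 1
        if GRADE_ORDER.index(grade) <= cutoff:
            passed[setting] += 1
    return {s: passed[s] / total[s] for s in total}

# Made-up log entries: ((download Mbps, upload Mbps), bufferbloat grade)
runs = [
    ((410, 42), "A"), ((410, 42), "A"), ((410, 42), "B"), ((410, 42), "A"),
    ((710, 42), "C"), ((710, 42), "B"), ((710, 42), "C"), ((710, 42), "C"),
]

rates = pass_rate_by_setting(runs)
for setting, rate in sorted(rates.items()):
    print(f"{setting[0]}/{setting[1]}: {rate:.0%} of runs graded A or better")
```

Comparing pass rates per setting is less misleading than comparing single runs, since one lucky "A" says little about how a setting behaves on average.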

2) I can't get a reliable "A" in bufferbloat unless I set my download bandwidth at 410, which is less than half of my rated speed. I have a gigabit connection, rated at 900/35, with average speeds of 740/42. Many people advise using 80-90% of your average speeds, but if I enter those download bandwidth numbers my bufferbloat grade drops to a "C" and sometimes worse. Is it common to have to give up so much of your actual speed to get good bufferbloat scores?
Again, unless someone corrects, the answer is it could be the case, especially if you have really high download that the router has trouble managing, and/or really erratic download speeds, both of which are true for me. I currently set my download speeds at 440, slightly less than 1/2 of my rated ISP speeds. This gives me consistently good results with the setup described above.

3) Does setting my download bandwidth to 410 reduce my actual download speed to 410? Please excuse this apparently ignorant question. I still don't quite understand what happens to the other 250-450 bits of data coming from my ISP. YES

4) Lowering my upload bandwidth causes my bufferbloat grades to drop, sometimes dramatically. EDIT: This turns out not to be true, when using fq_codel queue discipline (Merlin Firmware)! Most of my bufferbloat woes were finally solved by moving away from sfq.

5) The upload bufferbloat (on the little test meter) fluctuates between 20-50ms, and sometimes spikes much higher, no matter what settings I change in the router. EDIT: This turns out not to be true, when using fq_codel queue discipline (Merlin Firmware)! Most of my bufferbloat woes were finally solved by moving away from sfq.

6) Raising my download bandwidth doesn't hurt the download bufferbloat numbers, but sometimes dramatically hurts the upload numbers. Why would raising the download bandwidth negatively impact upload bufferbloat? My bad: it DOESN'T. AT ALL. Once I studied the numbers more closely, I realized that raising download bandwidth hurts download BB, and doesn't do a thing to upload.

Apologies for this long thread, and thanks in advance to anybody who can provide insights. I have read a great deal on this topic, and invested a crazy amount of time trying to understand how QoS works. I'm ready to put the project to bed, but would like to try and make sense of what I have learned.
 
If you are using the factory firmware the freshjr add on for QOS may not work. Actually it may not start on a reboot. Yes, it is possible to get a script to run on factory firmware start but you would be best to factory reset your router and reconfigure it. For most the built in QOS works well.

 
Actually, the script works very well. I did complete factory resets and reconfiguration when I set up AiMesh-- more than once. What I discovered is that I experience glitches in VOIP even with QOS enabled. Since the script had worked beautifully for me when I was using Merlin firmware, I installed it, and everything seems back to normal-- except for the QoS anomalies I describe in my original post. I'm happy with the results I got, but curious about the answers to my above questions. Many of the behaviors (inconsistent results on speedtest, higher bufferbloat on upload) were present before I installed the script, after upgrading my router and installing AiMesh. So I don't think it's the script. :confused:
 
What type of connection do you have? What router model do you have? What is your modem model?
 
Modem: SB8200
AiMesh Router: RT-AC86U running FreshJR's script (I ran the mesh for about a week before installing the script and got similar numbers on DSLReports/speedtest)
AiMesh Nodes: RT-AC68U and RT-AC66U_B1
Connection: Comcast Gigabit Cable, 1000/35
 
When are you running these tests? During high traffic times? DOCSIS connections degrade during high traffic times. They are shared with neighbors. You will get different results depending on the time of the test and network congestion, because those determine the bandwidth available on the bonded channels. Having many neighbors in close proximity, for example in an apartment complex, makes this worse. Is internal network traffic at a minimum while you run these tests?
 
DOCSIS connections degrade during high traffic times. They are shared with neighbors. You will get different results based on available bandwidth for bonded channels depending on time of test, and network congestion.

Actually not that much - DOCSIS is shared, but it's a scheduled MAC, and most providers will present a level of service contracted to. The cable modem "shared with neighbors" thing - that's a myth promoted by the DSL providers... DSL is hub/spoke, but they share bandwidth upstream with all DSLAM's... so it's all the same, and Cable beats DSL like a red-headed stepchild in overall capability - only Fiber is faster than CM's..

Asus Adaptive QOS is a bit broken - and this has been reported many times over the Sub-Forums here...
 
Thanks for both observations. I usually test in the very early morning, before people are using the network (although you never know what's happening behind closed doors!).

Asus Adaptive QOS is a bit broken - and this has been reported many times over the Sub-Forums here...

This is helpful to hear. Possibly in addition I have unreasonable expectations of how much QoS can actually do, given the many variables (ISP, Ping, nodes, etc.). I have been scouring the internet for discussions of bufferbloat, and saw on another forum, for a different brand of router, reports of similarly inconsistent (sometimes a grade of B, sometimes A) and baffling (raising the bandwidth sometimes improves bufferbloat) results. I have also come across more than one seasoned contributor who says, basically, what matters most is your actual experience. Meaning: if I have a bufferbloat score of B but my VOIP is breaking up, I have a problem, no matter what my "grade" is. And, conversely, if my grade is B and my call quality is excellent, time to stop tinkering. :)
 
1) Your speeds are very high. They are both past the limits of the CPU && even the limits of wifi connection itself. Inputted QOS limits have to always be beneath max attainable speeds for QOS to manage bufferbloat properly. QOS doesn't care if the bottleneck is the wifi link or the ISP itself or even the CPU. Simply put, the instantaneous limit has to be lower than max attainable at any given time. The issue with your connection is that it is faster than your wifi link, as the wifi link has a variable speed depending on your distance from the router. It also might be maxing the CPU.

These variable wifi link rates are the reason why a lower value is performing better than your actual attainable value from your ISP. Your CPU also most likely can't push 700 mbps either, let alone 1000mbps.

2) See answer to 1 (Are you testing on a wired ethernet connection by chance?)

3) Whatever speeds you input into QOS will limit overall bandwidth to that value. Any potential above inputted speeds will not be utilized. Think of it as a funnel. Even if your ISP allows a higher flow rate, the QOS restriction placed ahead of it will not utilize the full potential of the funnel allotted to you.

4) No idea why this is happening !! Lower speeds should have a better buffer bloat grade. Potentially check CPU usage and wireless link rate when this is happening to see if those are introducing the bloat.

5) While you are experiencing variance, 50-70 ms of bloat isn't terrible. (For reference I get 1000-3000 of bloat without QOS and 10-40 with QOS)

6) This shouldn't be a correlation, something else might be causing it.
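
To put the funnel idea from answers 1 and 3 into numbers, here is a rough Python sketch. It is only a toy model under my own assumptions: the three capacity figures (ISP, local link, router CPU) are invented for illustration, and the function names are hypothetical, not part of any router firmware or script.

```python
def effective_bottleneck(isp_mbps, link_mbps, cpu_mbps):
    """The slowest stage in the path sets the real ceiling."""
    return min(isp_mbps, link_mbps, cpu_mbps)

def shaper_outcome(qos_limit_mbps, isp_mbps, link_mbps, cpu_mbps):
    """Return (achievable throughput, whether the QoS limit is the tightest point)."""
    bottleneck = effective_bottleneck(isp_mbps, link_mbps, cpu_mbps)
    throughput = min(qos_limit_mbps, bottleneck)
    # QoS can only manage the queue if its limit sits below every other ceiling.
    shaper_in_control = qos_limit_mbps < bottleneck
    return throughput, shaper_in_control

# Illustrative capacities only: ISP 740, wired link 1000, CPU 900 (all Mbps).
print(shaper_outcome(410, 740, 1000, 900))  # limit below every ceiling
print(shaper_outcome(800, 740, 1000, 900))  # limit above what the ISP delivers
```

With the limit at 410 the throughput is capped at 410 and QoS owns the queue. With the limit at 800 the ISP becomes the tightest point, so the queue builds somewhere QoS cannot see it.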

@bbunge There is no reason QOS shouldn't work on stock firmware, that is assuming you don't mess with the settings in routerUI after the script performs its initial run. If you do mess with the webUI after the script has run, the script will remain in an off state until the router is restarted or 3:30 am. I tested this functionality personally. These limits mentioned are simply due to lack of ability of what can trigger the script on stock firmware after meddling with the settings. If this is not what you experience, uninstall the script, and follow the install instructions more closely.

Overall, @shelleyevans, I feel like the download portion of your connection is too much for QOS to handle. The upload portion should be working properly, so IDK what is causing your bufferbloat during that duration.

Did the same occur on RMerlin firmware?

SFQ, the scheduler present on stock ASUS firmware, is not very good

RMerlin had fq-codel available which gave me very good results.

Don't fret over a B grade. QOS personally took me up from an F!!!
 
@FreshJR, thank you for the thoughtful reply! And needless to say, for your script!

1. and 2.
Your speeds are very high. They are both past the limits of the CPU && even the limits of wifi connection itself.
This might explain most of my problems. Since I'm trying to fix VOIP, and all of my ATA's are hard-wired, I always test on a wired connection on the same desktop computer. I discovered early on that my wifi test results are not the same as my hard-wired tests, and that even a wireless AC connection will give me different results than wireless N.

4.
Lower speeds should have a lower buffer bloat grade. Potentially check CPU usage and wireless link rate when this is happening to see if those are introducing the bloat.
I think you mean lower speeds should give a higher grade, right? And lower bufferbloat numbers? Glad to have this confirmed, even though my upload bandwidth numbers defy the rule.

5.
While you are experiencing variance, 50-70 ms of bloat isn't terrible. (For reference I get 1000-3000 of bloat without QOS and 10-40 with QOS)
Good to know. Makes it easier to move on;).

Bottom line, I feel like the download portion of your connection is too much for QOS to handle.

Edit: it might be too much for QoS to handle well, but when I turn QoS OFF, my bufferbloat grades drop to a C. The gigabit service is pretty unstable, with download speeds ranging between 500 and 900. Perhaps that's the problem? In which case it makes sense that setting my download bandwidth lower than the lowest download number would give me better bufferbloat scores...?
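
A quick way to check this hunch is to compare the usual 80-90% rule of thumb against the slowest run actually measured. The Python sketch below does that; the speed samples are invented (chosen to average 740 and dip to 500, like the numbers in this thread), and the function name is just a placeholder.

```python
import statistics

def candidate_limits(samples_mbps, low=0.80, high=0.90):
    """Summarize measured speeds and the limits each rule of thumb would suggest."""
    average = statistics.mean(samples_mbps)
    slowest = min(samples_mbps)
    band = (low * average, high * average)
    return {
        "average": average,
        "rule_of_thumb_band": band,
        "slowest_run": slowest,
        # True when even the bottom of the 80-90% band is above the slowest run.
        "band_exceeds_slowest": band[0] > slowest,
    }

# Made-up download results in Mbps from repeated tests on an unstable line.
samples = [900, 740, 500, 820, 740]
summary = candidate_limits(samples)
low, high = summary["rule_of_thumb_band"]
print(f"average {summary['average']:.0f}, 80-90% band {low:.0f}-{high:.0f}, "
      f"slowest run {summary['slowest_run']}")
```

When the whole 80-90% band sits above the slowest run, the line will sometimes deliver less than the configured limit, and during those moments QoS is no longer the tightest point in the path. That would be consistent with a limit below the slowest run behaving better.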

Some have suggested that with these speeds I should turn QoS off. But with QoS off, I have choppy audio. Turning QoS on allows me to give priority to my ATA devices in the bandwidth monitor, and also priority to VOIP traffic in your script. This SEEMS to make a big difference. Even if QoS isn't working well with these high download speeds, is it possible that it can still successfully prioritize these devices/traffic?

Edit: just saw this--
Did the same occur on RMerlin firmware?
The answer is no. I didn't use fq-codel but I did have better luck adjusting QoS on Merlin, and could set my download bandwidth higher.
 
Some have suggested that with these speeds I should turn QoS off. But with QoS off, I have choppy audio. Turning QoS on allows me to give priority to my ATA devices in the bandwidth monitor, and also priority to VOIP traffic in your script. This SEEMS to make a big difference.

I am in the same camp that with speeds so high that QOS **should** be unnecessary.

But as you experienced if you get choppy audio without QOS, and better performance with QOS, then the solution is clear in that keeping QOS ON is beneficial.

Since your VOIP devices are wired, I would input the limits that you find while testing on the wired devices. Don't worry about the bufferbloat that can be present on wireless devices as it will NOT be present on wired devices.

--

One last thing, after you perform a speedtest:

1) click "Results & Share"
2) on the following page you should see a bar graph of your download/upload speeds, with variance bars overlaid.
3) click on the yellow upload bar and it will expand your upload results.
4) post the upload bloat-over-time graph so I can visualize the performance

qos.png


If you look at my results, my download speed has a nasty spike at the beginning of the test, but looking closely at my download results **not pictured** I see that it settles down soon after the test begins.

EDIT:

I think you mean lower speeds should give a higher grade, right? And lower bufferbloat numbers? Glad to have this confirmed, even though my upload bandwidth numbers defy the rule.

Correct. Lower speeds should give lower bufferbloat which will result in a higher grade. I edited this after proof reading the post, but I have a bad habit of proofreading after I submit the post!!
 
Just for yucks, I ran a couple of speed tests with my download bandwidth set to 1000 and then 410 (my current magic number), and watched the CPU in the router GUI. In both cases during download the CPU usage went up to about the same amount-- ~40% with 1000 and ~30% with 410. That doesn't sound like it's getting overwhelmed....?

Screen Shot 2018-08-19 at 1.59.47 AM.png

That's my upload bufferbloat. I cheated and used a wifi connection, because it's late and I'm far from my desktop. That said, my upload numbers are well below the limits of my wifi adapter, so it shouldn't be too different. And I see lots of spiking, which is what I observe on my desktop. The plot thickens...
 
Those spikes only last a few milliseconds. The average/overall results look good.

Dslreports should more closely look at average speeds as it seems intermittent spikes affect the grade quickly.
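
To show why a brief spike and the typical result can tell different stories, here is a small Python sketch. It treats bloat as latency under load minus the idle baseline, which is my own simplification and not a description of how DSLReports computes its grade. The latency samples are invented.

```python
import statistics

def bloat_summary(idle_ms, loaded_ms):
    """Compare typical and worst-case added latency under load."""
    baseline = statistics.median(idle_ms)
    # Added latency per sample, never below zero.
    bloat = [max(0.0, sample - baseline) for sample in loaded_ms]
    return {"median": statistics.median(bloat), "max": max(bloat)}

# Made-up ping samples in milliseconds: idle, then during an upload test.
idle = [12, 11, 13]
loaded = [30, 28, 32, 95, 29, 31, 30]  # one short spike to 95 ms

result = bloat_summary(idle, loaded)
print(f"median bloat {result['median']} ms, worst spike {result['max']} ms")
```

A grade driven by the worst sample would look much harsher here than one driven by the median, even though six of the seven samples are close together.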

Feel free to test fewer servers simultaneously, but pick ones close to your location to rule out traffic congestion beyond your control.

EDIT:

Let me upload some results without QOS to make you feel better.

Worrying between an A and a B is significantly different than worrying between a B and an F.

qos_bad.png


100-150ms plus performance is poor.
1000+ is ridiculous!
 
I just want to point out it's important to plan your QoS priorities. Your VoIP should be high, if not top, priority, but it's also important that it's not top priority in a container with tons of other services all using your max bandwidth, or essentially all you have is bandwidth limiting going on. Everything in that container will share evenly, sort of. fq_codel helps with this a little, though, by pushing small packets through first. So you want your VoIP at top priority in a container that won't ever be using max bandwidth, if possible; then it can for sure get all the bandwidth your VoIP could ever need.
 
Let me upload some results without QOS to make you feel better.
HAHA. This puts it in immediate perspective! What's interesting about this forum is you have all these really smart people chasing 0 bufferbloat because they can, and/or because they have really intense gaming demands, and it's easy to get the message that without an A+ score your internet will fail. Which leads me to--

I just want to point out its important to plan your qos priorities.
Exactly! Thank you for the reminder. My ONLY priority is to have phone calls that don't sound like ransom threats. ;)
I must remember this! For those following this thread, to educate themselves, I will say that my VOIP is almost entirely ATA based, and those devices have highest priority in bandwidth limiter, which I believe (according to FreshJR's advice in other threads) gives them their own special container right at the top. And the other VOIP (wifi calling, FaceTime, Skype, etc.) is probably in "Other" because I'm using FreshJR's script. We are not a heavy gaming, Snapchatting household, so I think my "Other" container is probably nice and empty. I definitely observed this when I re-installed the script. Wifi calling was BORKED after I set up AiMesh with stock QoS. Once I got the script running again, it was back to normal.

Fqcodel helps with this a little tho by pushing small packets thru 1st.
FWIW, when I was running RT-AC68U with Merlin and script, I could NEVER get fq-codel to work. Some others reported difficulty getting it to work, so I gave up, since sfq allowed me to get the results I needed. Here's what upload bb looked like with Merlin FW and FreshJR script, using the handy bufferbloat over time graph:
8:2:18 900:40 RT-AC68U with Merlin and Script - UPLOAD.png

This is the holy grail I have been chasing since going back to stock. So hats off to @Merlin and @FreshJR for creating a winning combination.

Because I am trying to learn, not just fix my problem, I went back through the shocking number of entries I have created this month, and studied the bb graphs, which I didn't even know existed until @FreshJR pointed them out. I saw that, indeed, QoS works-- here's what happens to my upload bufferbloat without QoS enabled:

8:14:18 AiMESH RT-86U NO QOS UPLOAD.png
This steady climb into the red zone repeats over every single test! Once QoS and the script are applied, all the tests look like this:
8:16:18 AiMesh 410:42 UPLOAD.png
Bufferbloat is definitely being controlled, with spikes. Checking my history, over many tests, my maximum upload bufferbloat seems to fluctuate between 54 and 76, mostly in the 60's, whether I set my upload maximum at 40, 41 or 42. This tells me that bufferbloat can be controlled, but spikes can't, and my current router/mesh/ISP combination seems to be giving ~60 upload bufferbloat at best. (Studying the history, I see that I can make those upload bufferbloat numbers MUCH WORSE by lowering my upload maximum past 40-- the lower I set the threshold, the higher my bufferbloat. No idea why.) There is no correlation, as I thought, between lowering the download threshold and upload bufferbloat. Lower download thresholds give me better BB grades because my download BB improves, from averages in the 30's when the download thresholds are high, to 20's as I lower them.

With that knowledge, and with @FreshJR's observation that 30 is a good number for BB, I think I will raise my download numbers to ~720, which is the recommended 90% of my measured download speed, and call it a day-- unless my VOIP gets strange.

The best part of this whole exercise is that I understand much better how QoS works, and how to use the tools that will help me test it! HUGE thanks to all who have helped along the way. Mission accomplished. (Edit: I have edited the OP to reflect these lessons as succinctly as possible...)
 
The only two things I can add to this conversation are that getting the framing right for your technology (DSL/cable) is important, and the spikes I see in these reports look partially like that. I'm not sure if freshjr's scripts do that? There is enormous technical detail as to why and how, here: https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details
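
As a rough illustration of why framing matters, the Python sketch below compares the rate a shaper would count from IP packet sizes alone against the rate after adding a fixed per-packet overhead. The overhead of 18 bytes and the minimum packet size of 64 bytes are placeholder values often suggested for cable links; treat them as assumptions and check the linked page for your own technology. The function names and the traffic figures are made up.

```python
def wire_bytes(ip_packet_bytes, overhead_bytes, min_packet_bytes=0):
    """Bytes a packet occupies on the link once framing is added."""
    return max(ip_packet_bytes + overhead_bytes, min_packet_bytes)

def wire_rate_mbps(packets_per_second, ip_packet_bytes,
                   overhead_bytes=0, min_packet_bytes=0):
    """Link rate in Mbps consumed by a steady stream of equal-sized packets."""
    size = wire_bytes(ip_packet_bytes, overhead_bytes, min_packet_bytes)
    return packets_per_second * size * 8 / 1e6

# Made-up traffic: 10,000 small packets per second, 100 bytes each.
counted = wire_rate_mbps(10_000, 100)  # what a shaper sees with no framing
actual = wire_rate_mbps(10_000, 100, overhead_bytes=18, min_packet_bytes=64)
print(f"counted {counted:.2f} Mbps, on the wire {actual:.2f} Mbps")
```

The gap grows as packets get smaller, so a shaper that ignores framing can believe it is under its limit while the link itself is already full.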

Secondly, lines do sag, and active monitoring of the physical link speed is needed.

Both of these things are difficult to do today with the data we get back from the underlying drivers and devices upstream. Only evenroute has come close to getting it right so far (and we made it much easier to do both these right with sch_cake, but still active monitoring helps). I continue to push firmware makers to make the right stats available, without a lot of luck. (feel free to nag broadcom/comcast/qualcomm/sagecom and your ISP about it, it's kind of a lonely quest, also nag 'em to add BQL and native fq_codel support)

And @FreshJR ? If ever you are in the bay area, I'd like to buy you a beer. Perhaps several. You're doing a great job over here!
 
Asus QoS is quite buggy and not flexible enough. I also had some bad experiences when using it with VOIP, and my results were similar to yours. Here are some test results from another brand of router with PCQ only (meaning it only guarantees bandwidth, with no priority). I am on 1G ethernet (FTTB); you can see that there is a huge difference when compared to a typical Asus result.
37446346.png

There is enormous technical detail as to why and how, here: https://openwrt.org/docs/guide-user/network/traffic-shaping/sqm-details
This looks like it will be a fun read. Might possibly lead to me being fired from my day job, but very interesting. Thank you! Edit: it WAS a fun read! I was going to say that it should be a sticky, but I'm pretty sure I wouldn't have understood any of it if I hadn't spent all this time tinkering.

And @FreshJR ? If ever you are in the bay area, I'd like to buy you a beer. Perhaps several. You're doing a great job over here!
I'm sure MANY on this forum would agree.
 
Would like recommendation.
Have a Comcast cable modem with four ethernet and one USB connections. Speed is 30 meg down and 5 meg up. Three routers connected two of which are RT-AC68U's. The Asus routers are running John's fork. I would like to set the QOS on the Asus routers but am wondering if I should use speeds lower than the total bandwidth of the modem.


Yes you should. Just follow the QOS Setup instructions in my thread, even if you are not using the script.
(I do recommend it tho)
 
