[Release] FreshJR Adaptive QOS (Improvements / Custom Rules / and Inner workings)

Status
Not open for further replies.
I couldn't reproduce the issue. How fast is your internet connection? What were you doing at the time?

The bucket error doesn't seem to refer to the QOS scheduler buckets overflowing, but rather to too many TCP connections in the TIME_WAIT state. There weren't enough resources to keep track of more connections.

Code:
TIME_WAIT =
(1) The server wants to close the connection and sends you a FIN (terminate connection) message.
(2) You receive that message and send an ACK (acknowledging that message).
(3) You also send a FIN to say that you are terminating this connection.
(4) The server ACKs your FIN, and the connection is then held in TIME_WAIT for a timeout period before it is forgotten.

This is done to prevent potential overlap when a new connection is opened.

QOS should never have dropped that last ACK packet, since that is net control. Even if that packet was dropped, no new connection should have been established between you and speedtest, since the TIME_WAIT period was not finished.
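
To check whether TIME_WAIT entries really are piling up, the tracked connections can be tallied by TCP state. This is only a rough sketch: it assumes the firmware exposes the connection-tracking table as text at /proc/net/nf_conntrack (some builds may use /proc/net/ip_conntrack instead), and the sample entries are made up for illustration.

```shell
#!/bin/sh
# Tally tracked TCP connections by state.
# Assumption: each line looks like a kernel conntrack entry, where the state
# (ESTABLISHED, TIME_WAIT, ...) is the first all-uppercase field after "tcp".
count_tcp_states() {
    awk '
        {
            is_tcp = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "tcp") { is_tcp = 1; continue }
                if (is_tcp && $i ~ /^[A-Z_]+$/) { count[$i]++; break }
            }
        }
        END { for (s in count) print count[s], s }
    ' | sort -rn
}

# Hypothetical sample entries (documentation address ranges, not real hosts).
sample_conntrack() {
    cat <<'EOF'
ipv4     2 tcp      6 98 TIME_WAIT src=192.168.1.20 dst=203.0.113.5 sport=50112 dport=443 src=203.0.113.5 dst=198.51.100.7 sport=443 dport=50112 [ASSURED] mark=0 use=2
ipv4     2 tcp      6 75 TIME_WAIT src=192.168.1.20 dst=203.0.113.5 sport=50113 dport=443 src=203.0.113.5 dst=198.51.100.7 sport=443 dport=50113 [ASSURED] mark=0 use=2
ipv4     2 tcp      6 431990 ESTABLISHED src=192.168.1.30 dst=203.0.113.9 sport=40100 dport=443 src=203.0.113.9 dst=198.51.100.7 sport=443 dport=40100 [ASSURED] mark=0 use=2
ipv4     2 tcp      6 12 TIME_WAIT src=192.168.1.30 dst=203.0.113.9 sport=40101 dport=80 src=203.0.113.9 dst=198.51.100.7 sport=80 dport=40101 [ASSURED] mark=0 use=2
ipv4     2 udp      17 25 src=192.168.1.20 dst=203.0.113.53 sport=5353 dport=53 src=203.0.113.53 dst=198.51.100.7 sport=53 dport=5353 mark=0 use=2
EOF
}

sample_conntrack | count_tcp_states

# On the router itself (the path is an assumption, check your firmware):
#   count_tcp_states < /proc/net/nf_conntrack
```

A TIME_WAIT count far above everything else would fit the theory above.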

Could it be that you were getting DDoS'd, or doing something on your network that created an insane number of connections? Something had to be opening a lot of connections during that time.

I do not think QOS should be able to go rogue with its setup. Your changes were correctly implemented and differed only slightly from mine.

You do not need App Analysis turned on for QOS to function. If you had it on, that imposed a much larger load on a router that was already strained by the high connection count. Not only does each packet get inspected and sorted by QOS, but with App Analysis every connection also has to be looked up, referenced by name, and have its bandwidth tracked. If you were already pushing the limits of active open connections and then added the burden of looking up each connection and tracking its bandwidth, it is no wonder the kernel dropped all services to deal with that amount of work.

If you want to try again and you encounter this issue, check the open connections under Tools in the webUI.
Connections 515 / 300000 - 28 active

If it is high, go to Network Tools in the webUI, open netstat, select netstat-nat, sort by state, and run netstat.
See what demon on your network is unleashing this hell. Note: turn off App Analysis if you have it running, to avoid additional unnecessary load.
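
If you have shell access, a rough way to spot the device responsible is to count tracked connections per originating address. This is a sketch under assumptions: it expects conntrack-style text in which the first `src=` field is the original sender, and the sample entries and the /proc path are illustrative, not taken from anyone's router.

```shell
#!/bin/sh
# Count tracked connections per originating address, busiest first.
# Assumption: the first "src=" field on each line is the original sender.
connections_per_source() {
    awk '
        {
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^src=/) { count[substr($i, 5)]++; break }
            }
        }
        END { for (ip in count) print count[ip], ip }
    ' | sort -rn
}

# Hypothetical sample entries (made-up addresses).
sample_entries() {
    cat <<'EOF'
ipv4     2 tcp      6 98 TIME_WAIT src=192.168.1.50 dst=203.0.113.5 sport=50112 dport=443 src=203.0.113.5 dst=198.51.100.7 sport=443 dport=50112 [ASSURED] mark=0 use=2
ipv4     2 tcp      6 97 TIME_WAIT src=192.168.1.50 dst=203.0.113.6 sport=50113 dport=443 src=203.0.113.6 dst=198.51.100.7 sport=443 dport=50113 [ASSURED] mark=0 use=2
ipv4     2 tcp      6 96 TIME_WAIT src=192.168.1.50 dst=203.0.113.7 sport=50114 dport=443 src=203.0.113.7 dst=198.51.100.7 sport=443 dport=50114 [ASSURED] mark=0 use=2
ipv4     2 tcp      6 431990 ESTABLISHED src=192.168.1.20 dst=203.0.113.9 sport=40100 dport=443 src=203.0.113.9 dst=198.51.100.7 sport=443 dport=40100 [ASSURED] mark=0 use=2
EOF
}

sample_entries | connections_per_source

# On the router itself (the path is an assumption, check your firmware):
#   connections_per_source < /proc/net/nf_conntrack | head -n 5
```

One address holding most of the table is the device to look at first.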

You may have been getting DDoS'd, or you may have an infected computer on your network that is part of a botnet. The kernel panic probably would have been avoided without the QOS overhead, but beyond this theory I have no idea what happened.

If you regularly run that many connections, then you might be over the limits of this router.
 
Thanks for checking it out, I appreciate it. I'll give it another try on a fresh install in the weekend.

Regarding speed: I have a DOCSIS 3.0 cable connection with 150 Mbit downstream and 15 Mbit upstream, and measured speeds are usually slightly higher. I've never had any issues before, nor have I noticed any changes with the new fq_codel discipline QoS with the DOCSIS preset by @RMerlin in the current stable release.

I wasn't doing anything special, just a bit of surfing. Only my laptop, out of 21 devices, was actively in use, besides a baby cam streaming live video on the LAN to my iPad and some low traffic from/to several IoT devices around the house. So nothing out of the ordinary. I haven't found any evidence that I was DDoS'd, as my outgoing connection was as stable as usual. Could changing the variables while the script was in use have caused this, even though I rebooted after I noticed the errors and the errors still filled the logs after the reboot?
 
On the topic of App Analysis, I'm curious: how does it affect QoS? If it's tracking bandwidth, is it, like you said before, only there to show the bandwidth usage of devices and which apps are pulling data (e.g. the Facebook app on a tablet), or is there any other purpose to it? Does it tie into the traffic history logging page? I've always wondered about it.
 
Correct, it just tracks bandwidth per packet pattern match per client so it can display that information to you.
I do not use that information, so I keep it OFF to save/not waste processing power/ram.

The procedure is like this:

Adaptive QOS (ON)
-inspects packet
-packet matches pattern in database
-packet gets marked according to database entry
-marked packet gets sorted into its traffic container

Traffic Analyzer (ON)
-on a packet pattern match, the packet is additionally cross-referenced to the pattern name identifier in the database
-the packet size is added to its pattern identifier counter to show bandwidth (each user has a separate counter per pattern ID)
-this information is processed and shown in the webUI

All in all, the traffic analyzer procedure is optional. Adaptive QOS will work without the traffic analyzer.

Traffic history functions on the same principle but is also optional.
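
To picture the extra bookkeeping the traffic analyzer adds, here is a toy sketch of the "separate counter per user per pattern ID" idea. It is not how the Trend Micro engine is actually implemented. The input format (client, pattern name, packet size) and the pattern names are invented purely for illustration.

```shell
#!/bin/sh
# Toy model of per-client, per-pattern byte counters.
# Input lines (hypothetical format): <client> <pattern> <packet size in bytes>
traffic_totals() {
    awk '{ bytes[$1 " " $2] += $3 }
         END { for (k in bytes) print k, bytes[k] }' | LC_ALL=C sort
}

# Made-up packets: two clients hitting the same pattern get separate counters.
traffic_totals <<'EOF'
192.168.1.20 youtube 1500
192.168.1.20 youtube 1200
192.168.1.30 youtube 900
192.168.1.20 web 400
EOF
```

Every matched packet costs an extra lookup and a counter update, which is the overhead you save by leaving the analyzer off.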
 
Thanks, I appreciate the explanation. It would be awesome if you wrote a guide to Adaptive QoS with explanations of the bells and whistles.
Also, thanks for the tip. I noticed that RAM usage went down after disabling it.
 
Hi. I've been using your script for a couple of days with good results, but Google Photos traffic goes to net control even though I've included the HTTPS filters indicated on page 9. I've also tried filtering port 443 using custom rules, without much success. Any idea? Thanks
 
If it is going to netcontrol that means it is matching some filter rule, but not matching your custom rule.


Out of the box, these two filters point to net control:

-0x80090000 (Management tools / protocols)
-0x80140000 (Network protocol)

You can run

Code:
 tc filter show dev br0 | grep "1:10" -A 1

This shows which filter is being matched: compare the success counters between successive calls and see which one increments.
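
Comparing the counters by eye gets tedious, so here is a small sketch that saves two snapshots and prints how much each counter grew. It assumes the `mark VALUE MASK (success N)` line format that tc prints for these filters. The helper names, the /tmp paths, and the numbers in the demo are made up.

```shell
#!/bin/sh
# Extract "mark success-count" pairs from `tc filter show` output on stdin.
success_counters() {
    awk '/\(success/ { gsub(/[()]/, ""); print $2, $NF }'
}

# Print how much each counter grew between two saved snapshots.
# $1 = earlier snapshot file, $2 = later snapshot file
counter_growth() {
    awk 'NR == FNR { before[$1] = $2; next }
         { print $1, $2 - before[$1] }' "$1" "$2"
}

# Demo with made-up numbers. On the router, feed real output instead:
#   tc filter show dev br0 | grep "1:10" -A 1 | success_counters > /tmp/before
#   (generate some of the traffic in question)
#   tc filter show dev br0 | grep "1:10" -A 1 | success_counters > /tmp/after
#   counter_growth /tmp/before /tmp/after
before=$(mktemp)
after=$(mktemp)
printf '%s\n' ' mark 0x80090000 0x803f0000 (success 1700)' \
              ' mark 0x80140000 0x803f0000 (success 90000)' | success_counters > "$before"
printf '%s\n' ' mark 0x80090000 0x803f0000 (success 1700)' \
              ' mark 0x80140000 0x803f0000 (success 90250)' | success_counters > "$after"
counter_growth "$before" "$after"
rm -f "$before" "$after"
```

The mark whose counter jumps while the traffic is flowing is the filter catching it.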

So while you added 0x8013, which handles the https traffic shown on page 9, I am willing to bet that Google Photos is matching mark 0x8009 or 0x8014 instead. You will have to fix your custom rule so that it performs a match and directs the traffic to your desired container.

Note: the custom rules match EGRESS traffic, meaning traffic physically moving AWAY from the router.

Download EGRESS traffic is all the packets pushed from your router towards the PC.
The ip dst port for download egress is your PC's receiving port. **Template rule
(If your PC is not receiving the photos through 443, the rule will fail.)
The ip src port for download egress is your router's originating port. **Not included as script template
(Due to port forwarding, the router's src port and the PC's receiving port may differ.)

Upload EGRESS traffic is all the packets pushed from your router towards the Google server.
The ip dst port for upload egress is the server's receiving port. **Not included as script template
(If your router is not sending photos to WAN port 443, the rule will fail.)
The ip src port for upload egress is the router's sending port. **Template rule
(If your router is not sending the photos from 443, the rule will fail.) **Example rule

First, double-check your custom rule syntax. Next, see where the traffic is actually going using the router's netstat-nat command. Finally, if you really need ingress traffic matching, that has to be done within iptables. An example of that is included within the script, but for different types of rules.
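
As a rough illustration of the two template directions described above, the sketch below only prints the tc commands and does not run them. It is not the script's actual template: it assumes br0 is the LAN-facing interface and eth0 the WAN-facing one (as in the other rules posted in this thread), and the prio value and the 1:13 flowid are placeholders you would replace with your own.

```shell
#!/bin/sh
# Print (do not run) a port-based egress rule for each direction.
# Assumptions: br0 = LAN side, eth0 = WAN side; PRIO and flowid are placeholders.
PRIO=1   # hypothetical priority, pick one that fits your rule set

# $1 = port the PC receives on, $2 = flowid of the target container
download_rule() {
    echo "tc filter add dev br0 protocol all prio $PRIO u32 match ip dport $1 0xffff flowid $2"
}

# $1 = port the router sends from, $2 = flowid of the target container
upload_rule() {
    echo "tc filter add dev eth0 protocol all prio $PRIO u32 match ip sport $1 0xffff flowid $2"
}

download_rule 443 1:13
upload_rule 443 1:13
```

Printing the commands first lets you check the direction and port before anything touches the live rule set.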
 
@FreshJR I've upgraded to @RMerlin's 380.68_alpha2-g5d6b6dd in the meantime. I did a fresh install. Can I use your script with 380.68 as well, given the amount of changes under the hood between .67 and .68? Anyone running it already on 380.68_alpha2 successfully?
 
I'm running it. Since it's not a GUI-based script it should be fine, and it seems to work for me.
 
Most likely yes. Just try it.

The script just changes the parameter values that were originally passed into the traffic control engine by ASUS. It doesn't really break between updates since it doesn't try to integrate anywhere or install anything.
 
Results:
filter parent 1: protocol all pref 12 u32 fh 800::800 order 2048 key ht 800 bkt 0 flowid 1:10
mark 0x80090000 0x803f0000 (success 1712)
--
filter parent 1: protocol all pref 23 u32 fh 801::800 order 2048 key ht 801 bkt 0 flowid 1:10
mark 0x80140000 0x803f0000 (success 100095)

So, it seems to be
-0x80140000 (Network protocol)
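
For anyone wondering how the two hex numbers on each filter line work: a `mark VALUE MASK` filter matches when the packet's mark ANDed with the mask equals the value. A quick sketch of that arithmetic follows. The packet marks in the loop are invented for illustration and are not taken from real traffic.

```shell
#!/bin/sh
# Succeeds when a packet mark would hit a "mark VALUE MASK" filter,
# i.e. when (mark AND mask) equals value.
# $1 = packet mark, $2 = filter value, $3 = filter mask
mark_matches() {
    [ $(( $1 & $3 )) -eq $(( $2 )) ]
}

# Hypothetical packet marks checked against the 0x80140000 filter.
for mark in 0x8014002a 0x8013002a 0x80090005; do
    if mark_matches "$mark" 0x80140000 0x803f0000; then
        echo "$mark matches 0x80140000"
    else
        echo "$mark does not match"
    fi
done
```

The low 16 bits are masked off, so every packet in that category hits the same filter regardless of what else the mark carries.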

Added rules:
${tc} filter add dev br0 protocol all prio 15 u32 match mark 0x80140000 0x803f0000 flowid ${Web}
${tc} filter add dev br0 protocol all prio 15 u32 match mark 0x80130000 0x803f0000 flowid ${Web}
${tc} filter add dev eth0 protocol all prio 15 u32 match mark 0x40140000 0x403f0000 flowid ${Web}
${tc} filter add dev eth0 protocol all prio 15 u32 match mark 0x40130000 0x403f0000 flowid ${Web}

What I am wondering is: could I run into problems with real network control packets going into the wrong container?

Thanks for your help
 
That's a terrible rule.

You will be moving all your net control traffic into the web container. Net control packets are intended to be processed ASAP for a responsive internet connection.

You should try to figure out a better rule that would only catch the photos instead.

Maybe Google has a set IP range for their photo servers. Try whois/DNS lookups for those domains to see the ranges they resolve to.

https://support.google.com/a/answer/2589954?visit_id=0-636379033563589379-2115053252&hl=en&rd=1

If not, maybe dump the packets and see if they have a unique TOS/DSCP mark or anything else you can filter on.

HTTPS traffic is rough. Trend Micro identifies a surprising amount of https traffic.

Worst case, just filter 443 traffic for packets above a certain size. Get creative.

Note: When replacing existing rules, watch the PRIO definition, since those two rules should not both be prio 15.
8014 was an existing rule located at pref 23. To redefine its container destination you should have used pref 23 instead of 15. By placing it on pref 15 you now have a duplicate of the original on pref 23 (the one on pref 15 will get matched first).

I updated the table on page 9 to make it easier to see which prefs the factory rules are located at. This should be kept in mind when changing factory rule destinations.

Pref 15 was the first blank spot within ASUS's rules, so that's where I put 8013. That was arbitrary; more logically it should be at pref 22.
 
Well. Not enough knowledge...

I'll remove any added rules and will only use your examples.

Thanks again
 
Hi,

Question, when I make any changes to the bandwidth allocations I am met with a string of errors saying "rate must be defined" or something along those lines. I am not sure what I am doing wrong here.

Secondly, where should I place the script once it's up and running so that it starts at router boot?

Thanks,
J
 
If your bandwidth allocation changes are failing, you may have introduced a space into the variable.

There should be NO space before or after the equals sign. There should be NO decimals in the percentages either.
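
The reason is plain shell behaviour: with a space around the equals sign the shell treats the line as a command to run, not an assignment, so the variable is never set, and shell integer arithmetic does not accept decimals. A small sketch of a sanity check you could run on a line before saving it follows. The variable name Example_Percent is invented and is not one of the script's real variables.

```shell
#!/bin/sh
# Accept only NAME=INTEGER: no spaces around "=", no decimal point.
# $1 = the assignment line to check
valid_percent_line() {
    printf '%s\n' "$1" | grep -Eq '^[A-Za-z_][A-Za-z0-9_]*=[0-9]+$'
}

# Example_Percent is a hypothetical variable name used only for this demo.
for line in 'Example_Percent=15' 'Example_Percent = 15' 'Example_Percent=12.5'; do
    if valid_percent_line "$line"; then
        echo "ok:  $line"
    else
        echo "bad: $line"
    fi
done
```

Only the first form passes; the other two are the mistakes described above.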
 
IMG_1001.PNG
IMG_0996.PNG
Screenshot shows QoS statistics while playing COD-IW on Xbox One.

Definitely going to give this a try. I was looking at my QoS statistics while playing Call of Duty on an Xbox One console, and it is classifying all in-game traffic as default. I just found out this was the case. There is a Call of Duty classification under bandwidth monitor: it shows up when I load the game, but disappears when I'm actually in a game, since none of the traffic is correctly identified.

Also, something of note: when I download actual games and add-ons from the Xbox store, it shows as gaming traffic. I had an update yesterday going at 160mb on my 150/20 line that pretty much shut down internet for all my other devices.

I'm wondering if my Trend Micro engine has actually updated... I see this in the logs: Trend Micro forward module 1.0.34
IMG_0999.PNG
However, when I go to the firmware update page I see 1.176?update 8/5/17
IMG_1001.PNG
I'm thinking I need to restore to factory then re-update
 