[Release] FreshJR Adaptive QOS (Improvements / Custom Rules / and Inner workings)

Status
Not open for further replies.
Hi Fresh,

Just woke up and the router had crashed, which it has been doing quite randomly for the last couple of weeks. RT-AC86U, latest stable Merlin, latest script update. The crash seems to have occurred during the nightly QOS check:

Mar 7 03:30:01 adaptive QOS: TC HTB version 3.3
Mar 7 03:30:01 adaptive QOS: Illegal "rate"
Mar 7 03:30:01 adaptive QOS: Usage: ... qdisc add ... htb [default N] [r2q N]
Mar 7 03:30:01 adaptive QOS: [direct_qlen P]
Mar 7 03:30:01 adaptive QOS: default minor id of class to which unclassified packets are sent {0}
Mar 7 03:30:01 adaptive QOS: r2q DRR quantums are computed as rate in Bps/r2q {10}
Mar 7 03:30:01 adaptive QOS: debug string of 16 numbers each 0-3 {0}
Mar 7 03:30:01 adaptive QOS: direct_qlen Limit of the direct queue {in packets}
Mar 7 03:30:01 adaptive QOS: ... class add ... htb rate R1 [burst B1] [mpu B] [overhead O]
Mar 7 03:30:01 adaptive QOS: [prio P] [slot S] [pslot PS]
Mar 7 03:30:01 adaptive QOS: [ceil R2] [cburst B2] [mtu MTU] [quantum Q]
Mar 7 03:30:01 adaptive QOS: rate rate allocated to this class (class can still borrow)
Mar 7 03:30:01 adaptive QOS: burst max bytes burst which can be accumulated during idle period {computed}
Mar 7 03:30:01 adaptive QOS: mpu minimum packet size used in rate computations
Mar 7 03:30:01 adaptive QOS: overhead per-packet size overhead used in rate computations
Mar 7 03:30:01 adaptive QOS: linklay adapting to a linklayer e.g. atm
Mar 7 03:30:01 adaptive QOS: ceil definite upper class rate (no borrows) {rate}
Mar 7 03:30:01 adaptive QOS: cburst burst but for ceil {computed}
Mar 7 03:30:01 adaptive QOS: mtu max packet size we create rate map for {1600}
Mar 7 03:30:01 adaptive QOS: prio priority of leaf; lower are served first {0}
Mar 7 03:30:01 adaptive QOS: quantum how much bytes to serve from leaf at once {use r2q}

This block repeats about 20 times in the log.

Debug:

adaptive QOS: Undf Prio: 2
adaptive QOS: Undf FlowID: 1:13
adaptive QOS: Classes Present: 8
adaptive QOS: Down Band: 32768
adaptive QOS: Up Band : 18432
adaptive QOS: ***********
adaptive QOS: Net = 1:10
adaptive QOS: VOIP = 1:11
adaptive QOS: Gaming = 1:12
adaptive QOS: Others = 1:13
adaptive QOS: Web = 1:14
adaptive QOS: Streaming = 1:15
adaptive QOS: Downloads = 1:16
adaptive QOS: Defaults = 1:17
adaptive QOS: ***********
adaptive QOS: Downrates -- 1638, 6553, 4915, 3276, 3276, 9830, 1638, 1638
adaptive QOS: Downceils -- 32768, 32768, 32768, 32768, 32768, 32768, 32768,
adaptive QOS: Downbursts -- 3198b, 7997b, 3197b, 3198b, 3198b, 3197b, 3198b, 31b
adaptive QOS: DownCbursts -- 39985b, 39985b, 39985b, 39985b, 39985b, 39985b, 39b
adaptive QOS: ***********
adaptive QOS: Uprates -- 921, 3686, 2764, 5529, 1843, 1843, 921, 921
adaptive QOS: Upciels -- 18432, 18432, 18432, 18432, 18432, 18432, 18432, 18432
adaptive QOS: Upbursts -- 3198b, 3198b, 3198b, 3197b, 3198b, 3198b, 3198b, 3198b
adaptive QOS: UpCbursts -- 22394b, 22394b, 22394b, 22394b, 22394b, 22394b, 2239b
 
@Oleg Eremenko

There is something wrong with the calculated rate for "Game Transferring"

You didn't manage to stick a decimal, space, or something else weird in that field did you?

Could you send me the output of

Code:
nvram get fb_email_dbg

The WebUI does sanitize input and limit available text entry.
The terminal version does NOT sanitize text entry and will accept malformed inputs.

So if there is something weird present, I'd like to ask how it got there.
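
To illustrate the kind of check the terminal version is said to be missing, here is a minimal sketch of a whole-number validator. The function name `is_valid_percent` and the 0 to 100 range are assumptions made for illustration; this is not code from the script itself.

```shell
#!/bin/sh
# Hypothetical helper: accept only whole numbers from 0 to 100.
# Rejects empty strings, decimals, spaces, signs, and letters.
is_valid_percent() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;    # empty, or contains a non-digit character
    esac
    # Cap the length first so very long digit strings never reach the numeric test
    [ "${#1}" -le 3 ] && [ "$1" -le 100 ]
}

for value in 20 5.5 "2 0"; do
    if is_valid_percent "$value"; then
        echo "ok: '$value'"
    else
        echo "rejected: '$value'"
    fi
done
```

A prompt that loops until `is_valid_percent` succeeds would keep a decimal or a stray space from ever being stored.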
 
Here's the output:

;;;;;;>;>;>;>;>192.168.1.4>FF>5;20;15;10;10;30;5;5>100;100;100;100;100;100;100;100>5;20;15;30;10;10;5;5>100;100;100;100;100;100;100;100

I've got a PS4 on 192.168.1.4 in the gaming rule, with no spaces or anything else there. Then there is a Chromecast Ultra and a set-top box forced to Streaming (HBO went to Web Surfing here in the Nordics) with selected ports (Chromecast: Both 443; set-top box: TCP 80,443,8080,8443 and Both 123).
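
As a sanity check, the percentages in that string line up with the debug output if each rate is the bandwidth times the percentage, divided by 100 with the remainder dropped. That formula is inferred from the numbers posted in this thread, not taken from the script, and `calc_rate` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: rate = bandwidth * percent / 100, truncated by integer division.
calc_rate() {
    echo $(( $1 * $2 / 100 ))
}

down_band=32768    # "Down Band" value from the debug output
for pct in 5 20 15 10 10 30 5 5; do
    printf '%s ' "$(calc_rate "$down_band" "$pct")"
done
echo
```

With the posted values this reproduces the Downrates line (1638, 6553, 4915, 3276, 3276, 9830, 1638, 1638), which suggests the stored percentages themselves are well formed.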
 

Attachment: Capture.PNG
So malformed rates stored in nvram weren't it. For further diagnosis:

Clear the system log
Restart QoS
Wait 10 minutes
(check for AdaptiveQoS errors on initial setup)

Next manually trigger daily check via: /jffs/scripts/FreshJR_QOS -check
Wait 1 minute
(check for AdaptiveQoS errors on "scheduled" startup)
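
The two "check for errors" steps can be done by counting matching lines in the system log. The sketch below assumes the log lives at /tmp/syslog.log, which is an assumption to verify on your own router; `count_qos_errors` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: count "adaptive QOS" lines that report an illegal tc argument.
# The default path /tmp/syslog.log is an assumption; pass another file as $1.
count_qos_errors() {
    logfile="${1:-/tmp/syslog.log}"
    # grep -c prints the number of matching lines (0 when there are none)
    grep -c 'adaptive QOS: Illegal' "$logfile"
}
```

Run it once after the 10 minute wait and again a minute after the manual `-check`. A non-zero count on the second run would mean the error is repeatable.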

--

This will check if the error is repeatable. If the error is repeatable then fixing it will be simple.

--

Something is wrong with the calculated rate but all the calculated rates are appearing correctly in debug() so I am scratching my head a little bit right now.

Was this the only parameter the errors were complaining about?
Code:
Mar 7 03:30:01 adaptive QOS: Illegal "rate"
 
Mar 7 05:53:42 rc_service: httpd 777:notify_rc restart_qos;restart_firewall
Mar 7 05:53:43 kernel: Initialized Runner Unicast Layer
Mar 7 05:53:43 kernel: Initialized Runner Multicast Layer
Mar 7 05:53:43 kernel: ^[[0;36;44mBroadcom Packet Flow Cache HW acceleration enabled.^[[0m
Mar 7 05:53:43 kernel: Enabled Runner binding to Flow Cache
Mar 7 05:53:43 kernel: Initialized Runner Protocol Layer (800)
Mar 7 05:53:43 kernel: Broadcom Runner Blog Driver Char Driver v0.1 Registered <3009>
Mar 7 05:53:45 BWDPI: fun bitmap = 17f
Mar 7 05:53:46 miniupnpd[32519]: shutting down MiniUPnPd
Mar 7 05:53:46 nat: apply nat rules (/tmp/nat_rules_eth0_eth0)
Mar 7 05:53:46 custom_script: Running /jffs/scripts/firewall-start (args: eth0)
Mar 7 05:53:46 miniupnpd[6432]: HTTP listening on port 41476
Mar 7 05:53:46 miniupnpd[6432]: Listening for NAT-PMP/PCP traffic on port 5351
Mar 7 05:54:10 rc_service: httpd 777:notify_rc restart_qos;restart_firewall
Mar 7 05:54:11 kernel: ^[[0;33;45mBroadcom Packet Flow Cache HW acceleration disabled.^[[0m
Mar 7 05:54:11 kernel: Disabled Runner binding to Flow Cache
Mar 7 05:54:13 BWDPI: fun bitmap = 1ff
Mar 7 05:54:13 A.QoS: qos_count=0, qos_check=0
Mar 7 05:54:16 A.QoS: qos rule is less than 22
Mar 7 05:54:16 A.QoS: restart A.QoS because set_qos_conf / set_qos_on / setup rule fail
Mar 7 05:54:17 A.QoS: qos_count=0, qos_check=1
Mar 7 05:54:20 A.QoS: qos rule is less than 22
Mar 7 05:54:20 A.QoS: restart A.QoS because set_qos_conf / set_qos_on / setup rule fail
Mar 7 05:54:21 A.QoS: qos_count=1, qos_check=1
Mar 7 05:54:24 A.QoS: qos rule is less than 22
Mar 7 05:54:24 A.QoS: restart A.QoS because set_qos_conf / set_qos_on / setup rule fail
Mar 7 05:54:25 A.QoS: qos_count=2, qos_check=1
Mar 7 05:54:28 A.QoS: qos rule is less than 22
Mar 7 05:54:28 A.QoS: restart A.QoS because set_qos_conf / set_qos_on / setup rule fail
Mar 7 05:54:28 miniupnpd[6432]: shutting down MiniUPnPd
Mar 7 05:54:28 nat: apply nat rules (/tmp/nat_rules_eth0_eth0)
Mar 7 05:54:29 custom_script: Running /jffs/scripts/firewall-start (args: eth0)
Mar 7 05:54:29 miniupnpd[15842]: HTTP listening on port 35440
Mar 7 05:54:29 miniupnpd[15842]: Listening for NAT-PMP/PCP traffic on port 5351
Mar 7 05:54:29 adaptive QOS: Applying - Iptable Down Rules
Mar 7 05:54:29 adaptive QOS: Applying - Iptable Up Rules (eth0)
Mar 7 05:54:29 adaptive QOS: TC Modification Delayed Start (5min)
Mar 7 05:54:57 kernel: htb: htb qdisc 15: is non-work-conserving?
Mar 7 05:55:08 kernel: htb: htb qdisc 15: is non-work-conserving?
Mar 7 05:59:30 adaptive QOS: Applying TC Down Rules
Mar 7 05:59:30 adaptive QOS: Applying TC Up Rules
Mar 7 05:59:30 adaptive QOS: Modifying TC Class Rates
Mar 7 06:05:52 kernel: dcd[6345]: unhandled level 3 translation fault (11) at 0x00000000, esr 0x92000007
Mar 7 06:05:52 kernel: pgd = ffffffc00a341000
Mar 7 06:05:52 kernel: [00000000] *pgd=0000000011ce8003, *pud=0000000011ce8003, *pmd=000000000b2fa003, *pte=0000000000000000
Mar 7 06:05:52 kernel: CPU: 1 PID: 6345 Comm: dcd Tainted: P O 4.1.27 #2
Mar 7 06:05:52 kernel: Hardware name: Broadcom-v8A (DT)
Mar 7 06:05:52 kernel: task: ffffffc019320b80 ti: ffffffc0118ac000 task.ti: ffffffc0118ac000
Mar 7 06:05:52 kernel: PC is at 0xf7501f44
Mar 7 06:05:52 kernel: LR is at 0x1dc74
Mar 7 06:05:52 kernel: pc : [<00000000f7501f44>] lr : [<000000000001dc74>] pstate: 600e0010
Mar 7 06:05:52 kernel: sp : 00000000ff9c1958
Mar 7 06:05:52 kernel: x12: 000000000009ff10
Mar 7 06:05:52 kernel: x11: 00000000f67ff024 x10: 00000000000a02b4
Mar 7 06:05:52 kernel: x9 : 00000000f67ffc7c x8 : 00000000000a076c
Mar 7 06:05:52 kernel: x7 : 00000000f67ffcb4 x6 : 00000000000a0766
Mar 7 06:05:52 kernel: x5 : 0000000000000000 x4 : 00000000f67ffc60
Mar 7 06:05:52 kernel: x3 : 0000000000000000 x2 : 0000000000000000
Mar 7 06:05:52 kernel: x1 : 000000000007c674 x0 : 0000000000000000
Mar 7 06:06:34 adaptive QOS: Scheduled Persistence Check -> No modifications necessary

Here's the new log output, apparently it wasn't repeatable? :)

I dunno, the router's been quite wonky lately and I can't quite pinpoint the moment it began, I did a soft factory reset (through UI) a while back, maybe there's a more thorough way?

Each time it crashes it's some different error, sometimes it's ISP's DHCP, sometimes it's too many DNS queries (max 150). I've got 15 devices connected, sometimes the router crashes on DHCP assignment (turning a device on), sometimes it's in the middle of the night.

I think I'll try to run it without QOS enabled for a while, see if it's in that or something else.
 
@Oleg Eremenko

Sure thing, but I did find the cause of the bug you were experiencing and will fix it in v8.5+, in case you ever decide to come back.

--

Update: V8.5 released
-bug in daily scheduled check has been fixed

(Too many updates today, hopefully it's the last)
 
I definitely will, 'cause when everything is up, it works like a charm: I'm on LTE 4G, and bandwidth fluctuates from 65/40 at night to about 36/30 in the evening. With QOS rates set to 32/24 and your script installed, I get A+/A/A+ on DSLReports. Bandwidth usage is often maxed, but everyone is streaming HD without a hitch and PS4 latency stays stable. So I would really like to get it to stay up for more than 24 hours. A very hard reset should probably get the job done. Maybe try with stock firmware.
 
Maybe try with stock firmware.

Would NOT recommend that. While the script DOES work with stock firmware, missing out on codel/fq-codel is a large step back.

I would rather recommend rolling back one RMerlin release to v384.8 if v384.9 continues to have issues for whatever odd reason.

Thanks for the detailed bug report.
 
@FreshJR ... Simply AWESOME work on the latest releases of your QOS adaptation - MANY thanks to you and all the other members who actively supported your efforts in refining and improving an essential add-on to our Asus Routers. Special thanks for backporting the enhanced web interface on the QOS statistics page of the Webgui for versions earlier than 384.9 - and for making it available to stock firmware users [I have a DSL-AC68U at the office where there is only VDSL, no Fibre and no MerlinWare]. :)

I did try Merlin 384.9 [full factory reset etc] on my AC5300 but was unhappy with the Asus closed-source bugs over which RMerlin has no control or means of fixing. After a week on 384.9 my router became unresponsive and would not reboot from the Webgui, so I had to kill it. I recovered to 384.8_2 [full reset again], which has always been rock steady and stable.

Even though I am a gazillion miles from the common servers under DSLReports tests - I get solid A+ scores for all 4 categories tested - thanks to YOU. :):):):).

Edit: hopped over to your Paypal link and donated.
 
There is a bug in your version check code;

Line 2135
tWbEdXK.png


This checks the details of the user who built the firmware, which in my case is my vm
7xC5BbP.png


Instead use;

Code:
if [ "$(uname -o)" != "ASUSWRT-Merlin" ] ; then

(had to post pictures, cloudflare exploit protection wouldn't let me post the code)
 
@Adamm
Got it!

As an extension to the above discussion, should I be pulling the current FW version from "nvram get buildno"?

Is there a rigid syntax that buildno follows?

I have been dancing around trying to extract only the numbers I need while assuming the worst-case scenario.

buildno="User12's v17.2.1.5 Beta4"

should only extract 17.2 in the above scenario
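
One way to get that result is to take the first "digits.digits" pair in the string. This is only a sketch: `extract_version` is a hypothetical helper, and it would still be fooled by a user name that itself contains a decimal number.

```shell
#!/bin/sh
# Hypothetical helper: print the first "major.minor" pair found in a build string.
# If the string has no decimal, fall back to the first number with minor 0.
extract_version() {
    ver="$(printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+' | head -n 1)"
    if [ -z "$ver" ]; then
        major="$(printf '%s\n' "$1" | grep -oE '[0-9]+' | head -n 1)"
        [ -n "$major" ] || return 1     # no digits at all
        ver="${major}.0"
    fi
    echo "$ver"
}

extract_version "User12's v17.2.1.5 Beta4"   # prints 17.2
```

The major and minor parts can then be split with `${ver%%.*}` and `${ver#*.}` if they need to be compared separately.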
 
There's also a bug with the (lack of) UI page mounting. I'll look into it shortly and report back.

EDIT; Looks to be the same bug but in a different area.
 
Same as before. UI page only mounts if it detects RMerlin firmware.

eg.
Code:
  if grep -iq "merlin" /proc/version ; then
 
As an extension to the above discussion, should I be pulling the current FW version from "nvram get buildno"?

Is there a rigid syntax that buildno follows?

That's what I do, but I don't rely on it for anything beyond user convenience.

Code:
echo "FW Version; $(nvram get buildno)_$(nvram get extendno)"
 
Update:
v8.6 released
-correctly detect user compiled builds of RMerlin firmware

Another bug with the mount code :p

In both instances it's wrapped in an if statement that will only execute if the firmware isn't ASUSWRT-Merlin.

Line 2005
Code:
        ##check if should mount QoS_stats page
        if [ "$(uname -o)" != "ASUSWRT-Merlin" ] ; then           
            buildno="$(nvram get buildno)";                                        #Example "User12 v17.2 Beta4"
            if [ "$(echo ${buildno} | tr -cd '.' | wc -c)" -ne 0 ]    ; then                    #if has decimal   
                CV="$(echo ${buildno} | cut -d "." -f 1 | grep -o '[0-9]\+' | tail -1)"        #get first number before decimal --> 17
                MV="$(echo ${buildno} | cut -d "." -f 2 | grep -o '[0-9]\+' | head -1)"        #get first number after decimal  --> 2
            else                                                               
                CV="$(echo ${buildno} | grep -o '[0-9]\+' | head -1)"                        #get first number --> 17
                MV="0"
            fi
            
            if [ "${CV}" -ge "382" ] ; then
                if ! [ "${webpath}" -ef "/www/QoS_Stats.asp" ] ; then
                    mount -o bind "${webpath}" /www/QoS_Stats.asp
                fi
            #elif [ "${CV}" = "384" ] && [ ${MV} -ge "9" ] ; then
            fi
        fi

Line 2185
Code:
    if [ "$(uname -o)" != "ASUSWRT-Merlin" ] ; then                  #Mounts webpage on v384.9+   
        buildno="$(nvram get buildno)";                                        #Example "User12 v17.2 Beta4"
        if [ "$(echo ${buildno} | tr -cd '.' | wc -c)" -ne 0 ]    ; then                    #if has decimal   
            CV="$(echo ${buildno} | cut -d "." -f 1 | grep -o '[0-9]\+' | tail -1)"        #get first number before decimal --> 17
            MV="$(echo ${buildno} | cut -d "." -f 2 | grep -o '[0-9]\+' | head -1)"        #get first number after decimal  --> 2
        else                                                               
            CV="$(echo ${buildno} | grep -o '[0-9]\+' | head -1)"                        #get first number --> 17
            MV="0"
        fi
        
        if [ "${CV}" -ge "382" ] ; then
            if ! [ "${webpath}" -ef "/www/QoS_Stats.asp" ] ; then
                mount -o bind "${webpath}" /www/QoS_Stats.asp
            fi
        #elif [ "${CV}" = "384" ] && [ ${MV} -ge "9" ] ; then
        fi
    fi
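
The commented-out `elif` in both excerpts hints at a "384.9 or newer" test on CV and MV. A minimal sketch of such a major/minor comparison follows; `version_at_least` is a hypothetical helper, not something from the script, and it assumes all four arguments are plain non-negative integers.

```shell
#!/bin/sh
# Hypothetical helper: succeed when version CV.MV is at least REQ_CV.REQ_MV.
# All four arguments are assumed to be plain non-negative integers.
version_at_least() {
    cv="$1"; mv="$2"; req_cv="$3"; req_mv="$4"
    if [ "$cv" -gt "$req_cv" ]; then
        return 0                        # newer major version, minor is irrelevant
    fi
    # same major version: the minor decides; an older major fails here
    [ "$cv" -eq "$req_cv" ] && [ "$mv" -ge "$req_mv" ]
}

if version_at_least 384 9 384 9; then
    echo "384.9 or newer"
fi
```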
 
I got bored over the weekend and forked the repo to tweak the mounting code to fit my setup, and I did wonder about these sections!
 
@Adamm argg, you are right. (It’s late).
The changes from 8.5 -> 8.6 introduced a mount bug affecting ALL users rather than expanding functionality to support user compiled firmwares.

@Jack Yaz originally the code logic was one block of (if NOT RMerlin) while the other two blocks were (IF RMerlin)

I copy pasted the new alternate RMerlin detection block three times and this turned all three logic statements into (if NOT RMerlin) which was not intended. I forgot they originally differed.
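
A sketch of one way to avoid this kind of copy-paste slip: keep the detection in a single function and state the polarity at each call site. `is_merlin` is a hypothetical name, and the "ASUSWRT-Merlin" string is taken from Adamm's suggestion earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: the one place that decides whether this is RMerlin firmware.
# $1 optionally overrides the OS name so the logic can be exercised off-router.
is_merlin() {
    os_name="${1:-$(uname -o)}"
    [ "$os_name" = "ASUSWRT-Merlin" ]
}

# Call sites then read as plain positive or negative checks
if is_merlin; then
    echo "RMerlin firmware detected"
else
    echo "not RMerlin firmware"
fi
```

With this layout, changing how detection works touches one function, and an (if NOT RMerlin) block stays visibly different from an (IF RMerlin) block because only the `!` at the call site differs.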

Pushing another fix...

--

Update v8.7 is out
-fix mount bug introduced in v8.6 changes

Apologies for the update roller coaster today. Stuff kept popping up.
Hopefully everything is now settled for a nice calm stretch of time.
 
At line #193 is this a copy/paste error?
Code:
        if ! [ -z "$ip1_up" ] ; then                                                    #Script Interactively Defined Rule 3
            if [ "$(echo ${ip3_up} | grep -c "both")" -ge "1" ] ; then

#1188 should be
Code:
#read user input
read input

And the break at #2150 probably shouldn't be there.
 