I have a strange issue that has persisted across the last few builds (stable, beta and now the latest test alpha build): accessing the router is slow. For example, SSH connections are prone to timing out and dropping, or pausing for 30+ seconds before suddenly becoming responsive again.
It doesn't appear to affect internet or general network usage (wired or wireless), though I have features such as QoS turned off to reduce the workload.
I believe the problem is that nearly all interrupts are handled by CPU0 instead of being distributed evenly between the two cores.
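To put a number on how lopsided the split is, the per-CPU columns of /proc/interrupts can be summed. The following is a minimal sketch, assuming the usual layout (a header row of CPU names, then one row per IRQ with one count per CPU) and an awk that is available on the router; the function name irq_totals is made up for illustration.

```shell
# Sum the per-CPU interrupt counters.
# Reads /proc/interrupts by default, or a saved copy given as $1.
irq_totals() {
    awk '
        NR == 1 { ncpu = NF; next }        # header row: CPU0 CPU1 ...
        $1 ~ /:$/ {                        # data rows start with "NN:" or "Err:"
            for (i = 2; i <= ncpu + 1 && i <= NF; i++) {
                if ($i !~ /^[0-9]+$/) break    # stop at the controller/device name
                total[i - 2] += $i
            }
        }
        END { for (c = 0; c < ncpu; c++) printf "CPU%d %d\n", c, total[c] }
    ' "${1:-/proc/interrupts}"
}

# Usage on the router:
#   irq_totals
```

Running it twice a few seconds apart and comparing the totals shows which core is taking the interrupts right now, rather than since boot.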
The smp_affinity for every IRQ is set to '3' (cat /proc/irq/179/smp_affinity returns 3). I can manually change the smp_affinity of one of the interrupts to 2, and CPU1 will then process it, but my understanding is that the default value of 3 should cause the interrupts to be shared evenly between the two cores, as it is the binary mask '11'.
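For reference, smp_affinity is a hexadecimal bitmask in which bit N set means CPU N is permitted to service the interrupt. A mask of 3 therefore permits both cores, but as far as I know it does not by itself guarantee an even split: whether delivery is actually spread across the permitted CPUs depends on the interrupt controller and the kernel. Below is a small sketch for decoding a mask, assuming a single-word mask with no commas (as on a two-core router); mask_to_cpus is a hypothetical helper name.

```shell
# Print the CPU numbers enabled in a hexadecimal smp_affinity mask.
# Assumes a single hex word without commas, e.g. "3" or "2".
mask_to_cpus() {
    mask=$(( 0x$1 ))
    cpu=0
    cpus=""
    while [ "$mask" -gt 0 ]; do
        if [ $(( mask & 1 )) -eq 1 ]; then
            cpus="${cpus:+$cpus }$cpu"     # append, space separated
        fi
        mask=$(( mask >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$cpus"
}

# Usage on the router (lists the permitted CPUs for every IRQ):
#   for f in /proc/irq/*/smp_affinity; do
#       echo "$f: $(mask_to_cpus "$(cat "$f")")"
#   done
```

Here mask_to_cpus 3 gives "0 1" (either core) and mask_to_cpus 2 gives "1" (CPU1 only), which matches the manual change described above.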
The CPU/core mappings appear correct:
$ cat /sys/devices/system/cpu/cpu0/topology/core_id
0
$ cat /sys/devices/system/cpu/cpu0/topology/core_siblings
1
$ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings
1
$ cat /sys/devices/system/cpu/cpu1/topology/core_id
0
$ cat /sys/devices/system/cpu/cpu1/topology/core_siblings
2
$ cat /sys/devices/system/cpu/cpu1/topology/thread_siblings
2
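For anyone wanting to compare against their own router, the same topology values can be collected in one pass. This is only a sketch, assuming the standard sysfs layout shown above; show_topology is a made-up name, and the optional argument exists only so the function can be pointed at a different directory.

```shell
# Print core_id, core_siblings and thread_siblings for every CPU.
# Uses /sys/devices/system/cpu by default, or the directory given as $1.
show_topology() {
    base="${1:-/sys/devices/system/cpu}"
    for dir in "$base"/cpu[0-9]*; do
        [ -d "$dir/topology" ] || continue
        for f in core_id core_siblings thread_siblings; do
            if [ -r "$dir/topology/$f" ]; then
                echo "${dir##*/} $f $(cat "$dir/topology/$f")"
            fi
        done
    done
}

# Usage on the router:
#   show_topology
```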
I have assumed that on ASUS routers the IRQ workload should be distributed between the two cores, and that having one core handle everything isn't a deliberate design choice, but I don't have a correctly performing router to check against.
The reason I believe interrupts are causing the performance issues is that in htop kworker/1:2 is the third-highest user of CPU time. That said, mtdblock3 is the highest by a significant margin, which makes me wonder whether the internal flash memory is starting to fail or whether that is expected. Again, I don't have another router to compare against.
Viewing just the kernel workers, you can see kworker/1:2 is doing all the work while the others are doing pretty much nothing.
If anyone is able to check whether what I am seeing is normal behaviour, so I know if I am looking at the right things, that would be much appreciated, as would any advice or guidance on what else to look into.