kmalloc-96 - Memory leak in Kernel

It's good that folks are going in and doing maintenance work - it's a grind, but I think on the whole, it's gotta be done.

But until the leak is confirmed and located, it remains a hypothesis. Increasing memory usage isn't always a leak. It can be a cache gradually filling up, log files slowly taking more space in /tmp (which uses tmpfs), historical data that grows over time, etc...
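
One rough way to tell these cases apart is to compare a few /proc/meminfo fields over time. The sketch below is only an illustration, not something from this thread: the function names are made up, and it assumes the kernel exposes the usual Cached, Shmem and SUnreclaim fields (older kernels may lack some of them, so missing keys are simply skipped). tmpfs pages normally show up under Cached/Shmem, while kernel slab memory that can't be reclaimed shows up under SUnreclaim.

```python
def parse_meminfo(text):
    """Parse the text of /proc/meminfo into a dict of field name -> integer value.

    Most values are in kB; a few counters (e.g. HugePages_Total) have no unit.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_breakdown(fields):
    """Keep only the fields useful for telling cache/tmpfs growth from slab growth."""
    keys = ("MemFree", "Cached", "Shmem", "Slab", "SReclaimable", "SUnreclaim")
    return {k: fields[k] for k in keys if k in fields}
```

Feeding it the contents of /proc/meminfo every so often and watching which number climbs should hint at whether it's cache or tmpfs (Cached/Shmem) or kernel allocations (SUnreclaim) that keep growing.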
 
Very true... and sometimes it's a symptom, and not a cause in and of itself.
 
Have been experimenting with different BWDPI features.

Turning off AiProtection (malicious sites, VP & CC) alone slows down growth - quite obviously. Turning off Traffic Analyzer alone produces an even more obvious result.
 
If that's the case - there's not much to do here - that's all closed source there...
 
Unfortunately there's not much we can do other than having the option to turn it off. I made that decision last night.

I devised a method to accelerate growth of kmalloc-96. There appear to be two major users of this memory pool. My guess is BWDPI and something Netfilter/conntrack related.

After a fresh reboot, with or without BWDPI enabled, the pool can grow and shrink, which is easily observable. So at least BWDPI behaves well initially. Apparently something in BWDPI goes wrong in the subsequent hours, so that less and less memory is released. With BWDPI disabled, the growth and shrinkage stay very responsive to workload.
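
A rough sketch of how "less and less memory is released" could be checked from logged readings: split the samples into windows and see whether the low point of each window keeps climbing. This is only an illustration under assumptions, not the method actually used here; the function names and the window size are hypothetical.

```python
def window_minimums(samples, window):
    """Split samples into consecutive windows and return the minimum of each."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [min(samples[i:i + window]) for i in range(0, len(samples), window)]


def floor_is_rising(samples, window):
    """True if each window's minimum is strictly higher than the previous one.

    A pool that grows and shrinks with workload returns to roughly the same
    floor; a pool that releases less and less shows a steadily rising floor.
    """
    floors = window_minimums(samples, window)
    return len(floors) > 1 and all(b > a for a, b in zip(floors, floors[1:]))
```

A rising floor is still only a hint, since a steadily growing workload would look the same.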

Personally I'm convinced it's a memory leak in BWDPI. At least it's a piece of poorly written code which I can live without. Hopefully, Asus can sort it out with TrendMicro. Disabling BWDPI is a major loss of a key feature IMO.

The numbers so far appear very pleasing to my eyes. Giving it some more time to run, I'll upload a chart later this week.
 
This is with BWDPI completely disabled. The x-axis covers from right after reboot to ~3 days of uptime. Readings are sampled at 5-minute intervals.

Picture1.png
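
For anyone wanting to collect similar readings, here is a minimal sketch of pulling the kmalloc-96 line out of /proc/slabinfo. It is not the script used for the chart above; the function names are hypothetical, it assumes the standard slabinfo column order (name, active_objs, num_objs, objsize), and reading /proc/slabinfo typically requires root.

```python
import time
from collections import namedtuple

SlabStat = namedtuple("SlabStat", "name active_objs num_objs objsize")


def parse_slab(text, cache="kmalloc-96"):
    """Return the SlabStat for one cache from /proc/slabinfo text, or None."""
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 4 and parts[0] == cache:
            return SlabStat(parts[0], int(parts[1]), int(parts[2]), int(parts[3]))
    return None


def slab_bytes(stat):
    """Approximate bytes held by the cache's objects (num_objs * objsize)."""
    return stat.num_objs * stat.objsize


def sample(path="/proc/slabinfo", interval_s=300, count=3):
    """Print `count` readings, `interval_s` seconds apart (300 s = 5 min)."""
    for i in range(count):
        with open(path) as f:
            stat = parse_slab(f.read())
        if stat is not None:
            print(int(time.time()), stat.active_objs, stat.num_objs, slab_bytes(stat))
        if i < count - 1:
            time.sleep(interval_s)
```

Calling sample() on the router (or redirecting its output to a file) would give a timestamped series that can be charted the same way.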
 
I devised a method to accelerate growth of kmalloc-96. There appear to be two major users of this memory pool. My guess is BWDPI and something Netfilter/conntrack related.

Don't have an AsusWRT device handy, but how is nf_conntrack being handled - is there an explicit CT helper assigned to BWDPI? Or is it an implicit auto?

That might explain the leak... since nf is kernel space, and bwdpi shims in...
 
Between bwdpi and ctf.ko, there's a lot of closed source in that flow path, so it's reasonable to expect that state tables might not be released sometimes.. if they can be.

And one cannot do a global flush of the state tables, as this would drop client connections across the board, even though that would probably clean up kmalloc-96 and bring it back to a reasonable level.
 
Don't have an AsusWRT device handy, but how is nf_conntrack being handled - is there an explicit CT helper assigned to BWDPI? Or is it an implicit auto?

That might explain the leak... since nf is kernel space, and bwdpi shims in...

BWDPI consists of three kernel modules (IDP, bw_forward and ct_notification). I think ct_notification is the helper/messenger between BWDPI and conntrack, just guessing from its name. It could also be a CTF helper.

Between bwdpi and ctf.ko, there's a lot of closed source in that flow path, so it's reasonable to expect that state tables might not be released sometimes.. if they can be.

On another day, I actually tried with BWDPI on and CTF off, and it still leaked the same way. I used to blame CTF lol, but it seems BWDPI is worse. My short affair with it started with the 380 FW... what a mistake.

And one cannot do a global flush of the state tables, as this would drop client connections across the board, even though that would probably clean up kmalloc-96 and bring it back to a reasonable level.

Not worth it IMO. I have disabled BWDPI. It mysteriously solved another bug that I've mentioned a couple of times, and which attracted little attention. Also, with BWDPI disabled, the browsing experience is snappier and doesn't deteriorate with time.

While looking at this leak over the past week, I spotted another leak, this time in user space! But I'll stop here, or else people will think I'm bashing Asus FW...
 
FWIW - I don't think it's bashing, at least not in the negative context - bashing bugs perhaps, and this is a win-win.
 
