Viktor Jaep
Part of the Furniture
Hi everyone... I need your advice, please. I have been dealing with this strange issue ever since I got my RT-AC86U and started playing around with scripts. Every so often, at random (it could happen a few times a day, or not for days and days), a simple "nvram get <value>" causes my script to hang indefinitely, until I kill it with CTRL-C. Lately, while troubleshooting this, I will hop on "htop", sort it by 'nvram', and find the culprit "nvram get" task just sitting there, drawing 0% CPU with no activity indicated. When I kill that task in htop, the script continues running without needing a restart.
I have been able to get around this issue by putting a band-aid on the problem, but it doesn't solve the core issue. @eibgrad turned me on to the "timeout <sec>" command, which needs to be installed separately and inserted before each "nvram get <value>" command... which makes it a real PITA to keep adding/removing when sharing these scripts. This does seem to help, and allows the scripts to run indefinitely, since the timeout kicks in whenever an "nvram get" hangs. The other thing I have done is create another script that runs hourly through cru to check whether the main script is hung, and if so, kills it and restarts it. But you *shouldn't* have to go through all this trouble I would think - a router should just work!
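For anyone who wants to try the same band-aid without editing every call site, here is a rough sketch of wrapping it in one function. The names run_with_timeout and nvram_get_safe are made up for this example, and the 5-second limit is an arbitrary assumption. It uses "timeout <sec>" when that command is installed and otherwise falls back to a plain-sh watchdog, so a shared script should behave the same either way. Treat it as a starting point rather than a tested fix.

```shell
#!/bin/sh
# Hypothetical helper (not part of any firmware): run a command and give up
# after a number of seconds. Usage: run_with_timeout SECS command [args...]
run_with_timeout() {
    _rwt_secs="$1"; shift

    # Preferred path: the separately installed "timeout <sec>" command.
    if command -v timeout >/dev/null 2>&1; then
        timeout "$_rwt_secs" "$@"
        return $?
    fi

    # Fallback: start the command in the background, then start a watchdog
    # that kills it if it is still running after the time limit.
    "$@" &
    _rwt_cmd=$!
    # The watchdog's output goes to /dev/null so that $(...) callers are not
    # kept waiting on the watchdog's sleep.
    ( sleep "$_rwt_secs"; kill "$_rwt_cmd" 2>/dev/null ) >/dev/null 2>&1 &
    _rwt_wd=$!

    wait "$_rwt_cmd"
    _rwt_rc=$?
    kill "$_rwt_wd" 2>/dev/null   # command finished (or was killed); stop the watchdog
    return "$_rwt_rc"
}

# Hypothetical wrapper for the call that hangs. The 5-second limit is an
# assumption; a healthy "nvram get" normally returns almost instantly.
nvram_get_safe() {
    run_with_timeout 5 nvram get "$1"
}
```

In a script, each val=$(nvram get some_key) would then become val=$(nvram_get_safe some_key). A non-zero exit status means the call failed or was cut off, so the script can retry or log it instead of hanging.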
I have not found any other references to this anywhere, nor do I see anyone else bringing this issue up as a flaw on their router... and my scripts seem to run flawlessly on other people's routers. So this gets me thinking -- am I the only one with this issue - and why?
1.) An "nvram get" command is pretty low-level -- could there be a hardware issue with my router? I'm not dealing with any overheating, and even have a CPU fan tied directly to it, keeping a cool 53C at all times.
2.) Could there be some sort of conflict where some other program is reading the same values at the same time, or is holding a lock on nvram variables? Like with SQL - concurrent transaction locking/hangups... I'm not sure whether the router can handle concurrent calls, or whether that might be the reason?
3.) Could this be a sign of needing to wipe the router and start from scratch? I've had it for about 1.5yrs. Perhaps it's due time?
4.) Any other software/config-related tweaks I could look at or perform to see if it will alleviate this issue?
I really appreciate your help, input and advice on this!