As @dave14305 alluded to in his post on the thread here, the same issue also happens fairly regularly with the /usr/sbin/wl command, so I don't think the root cause is a hardware flaw or a failing NVRAM chip. At this point, I agree with @RMerlin that it looks more like a deadlock condition: two (or more) competing threads not releasing their corresponding lock/mutex/semaphore appropriately and at the right time, so they end up waiting on each other forever.
I got very curious about this problem last Sunday, so I ended up writing a script that looks for "nvram" & "wl" commands that appear "stuck" and then captures the tree path to their root parent process. I wanted to see whether the same parent processes show up whenever the hangs occur. In my case, the same pair shows up most frequently: YazFi & conn_diag. I think that's probably because they both make frequent calls to the "wl" command, and YazFi to 'nvram get' as well. BTW, while going through the logs generated by the script, I found out that the "cru l" command can get "stuck" as well, because it calls "nvram get http_username" to build the path of the file containing the list of cron jobs (e.g. /var/spool/cron/crontabs/{http_username}).
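To give an idea of the detection approach, here is a rough, hypothetical sketch only, not the actual script: it scans /proc for nvram and wl processes older than an arbitrary age threshold and walks each one's PPid chain up toward init. The STUCK_AGE_SECS threshold and the USER_HZ value are assumptions picked for illustration.

#!/bin/sh
# Hypothetical sketch (NOT the actual script): find "nvram"/"wl" processes
# older than a threshold and print their parent chain up toward init.
STUCK_AGE_SECS=60   # assumption: anything older than this is "stuck"
HZ=100              # USER_HZ clock ticks per second, typical on these routers
UPTIME_SECS="$(cut -d'.' -f1 /proc/uptime)"

for statFile in /proc/[0-9]*/stat
do
    procPID="$(basename "$(dirname "$statFile")")"
    # Field 2 of /proc/PID/stat is the command name; field 22 is the start time in ticks.
    procName="$(awk '{gsub(/[()]/,"",$2); print $2}' "$statFile" 2>/dev/null)"
    startTicks="$(awk '{print $22}' "$statFile" 2>/dev/null)"
    [ -z "$procName" ] && continue
    [ -z "$startTicks" ] && continue

    case "$procName" in
        nvram|wl) ;;
        *) continue ;;
    esac

    ageSecs="$((UPTIME_SECS - startTicks / HZ))"
    [ "$ageSecs" -lt "$STUCK_AGE_SECS" ] && continue

    echo "Possibly stuck: PID=$procPID CMD=$procName AGE=${ageSecs}s"

    # Walk the PPid chain to capture the tree path to the root parent process.
    curPID="$procPID"
    while [ "$curPID" -gt 1 ]
    do
        parentPID="$(awk '/^PPid:/ {print $2}' "/proc/$curPID/status" 2>/dev/null)"
        [ -z "$parentPID" ] && break
        parentName="$(awk '{gsub(/[()]/,"",$2); print $2}' "/proc/$parentPID/stat" 2>/dev/null)"
        echo "   parent: PID=$parentPID CMD=$parentName"
        curPID="$parentPID"
    done
done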
Today, I added code to the script to kill the "stuck" processes when they are found on the 2nd round of the search, for when the script is set up to run as a cron job. I have it running every 5 minutes since I don't get a lot of occurrences (about 3 a day on average). But I'd imagine that folks running many 3rd-party add-ons which call the nvram and/or wl commands frequently may see the bug much more often. This script is not a solution at all, but at least it eliminates all those "stuck" processes that would otherwise stay around until the next reboot.
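If you want to schedule it the same way, a cru entry along these lines should work (the job tag "CheckStuckProcs" and the /jffs/scripts path are just examples, so adjust them to wherever you keep the script):

cru a CheckStuckProcs "*/5 * * * * /jffs/scripts/CheckStuckProcCmds.sh"

and cru d CheckStuckProcs removes the job again if you no longer want it.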
Here is the script if you want to try it. It was initially meant to be a diagnostic tool, so it's still a bit "raw" and it has not been polished with a round of refactoring.
Type
./CheckStuckProcCmds.sh -help
to get a usage description.