1. Alternative solution without timeout
@SomeWhereOverTheRainBow I'll test your code.
Would you please explain to me what this part actually does?
Bash:
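run_cmd () {
    # Usage: run_cmd SECONDS COMMAND [ARGS...]
    local to=$1; shift
    "$@" &                          # start the command in the background
    local child=$! start=0
    # kill -0 sends no signal; it only checks that the child still exists
    while kill -0 "$child" 2>/dev/null; do
        read -t 1                   # wait up to 1 second (used in place of sleep)
        start=$((start+1))
        if [ "$start" -ge "$to" ]; then
            kill -s 9 "$child" 2>/dev/null   # limit reached: SIGKILL the child
            break
        fi
    done
}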
2. The current wrapper script with timeout
This is still a work in progress. It does not yet guard against running when timeout is unavailable, and timeout is installed as an add-on on an external USB device. That is quite unsafe, given that the wrapper would be managing such a core operation as nvram get / set. Would it be possible to install timeout on the jffs partition to make this a bit safer?
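A guard along these lines might cover the missing protection. This is only a minimal sketch: the names run_guarded and TIMEOUT_BIN and the 5-second limit are my own placeholders, and it assumes the coreutils-style syntax "timeout DURATION COMMAND" (some older BusyBox builds expect a different argument form).

```shell
# Hypothetical guard: use timeout only when it can actually be found.
# TIMEOUT_BIN and the 5-second limit are illustrative assumptions.
TIMEOUT_BIN=${TIMEOUT_BIN:-timeout}

run_guarded () {
    if command -v "$TIMEOUT_BIN" >/dev/null 2>&1; then
        # timeout is available: run the command with a time limit
        "$TIMEOUT_BIN" 5 "$@"
    else
        # timeout is missing (e.g. USB drive not mounted): run unprotected
        "$@"
    fi
}
```

Falling back to the plain command when timeout is missing is a design choice; the wrapper could just as well refuse to run or log a warning instead.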
The $PATH has to be checked. I have a line exporting /opt/bin:/opt/sbin, but it needs refinement because these paths ended up being added more than once.
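One way to avoid the duplicates might be to add a directory only when it is not already in $PATH. A minimal sketch; path_prepend is a hypothetical helper name, not something that exists in the firmware:

```shell
# Prepend a directory to PATH only if it is not already listed.
path_prepend () {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present: leave PATH untouched
        *) PATH="$1:$PATH" ;;     # not present: put it at the front
    esac
}

# Prepending sbin first leaves the order as /opt/bin:/opt/sbin:...
path_prepend /opt/sbin
path_prepend /opt/bin
export PATH
```

Calling the helper repeatedly is harmless, so it should not matter how many times the wrapper is sourced or executed.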
I would also like a counter of how many times the override has been invoked, so I can send an error message with a number on it to the system log (for statistics). That way we get a more accurate measure of how often the nvram command is used and how often the error condition occurs.
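The counter could be as simple as a small file per statistic. A rough sketch, assuming a counter file under /tmp (so the counts reset on reboot); COUNT_FILE, bump_counter and the logger tag are hypothetical names of mine. It takes no lock, so two simultaneous invocations could occasionally lose a count.

```shell
# Hypothetical counter location; /tmp is assumed, so counts reset on reboot.
COUNT_FILE=${COUNT_FILE:-/tmp/nvram_override.count}

# Increment the named counter (e.g. "calls" or "errors") and print the new value.
bump_counter () {
    file="$COUNT_FILE.$1"
    n=0
    if [ -r "$file" ]; then
        read -r n < "$file" || n=0   # fall back to 0 if the file is empty
    fi
    n=$((n + 1))
    echo "$n" > "$file"
    echo "$n"
}

# Possible use inside the wrapper (the tag "nvram-override" is an assumption):
#   calls=$(bump_counter calls)                      # on every invocation
#   errors=$(bump_counter errors)                    # only when the timeout fires
#   logger -t nvram-override "timeout #$errors after $calls calls"
```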
3. AF_NETLINK suspicion
As for the suspicion that the netlink library sets nl_pid incorrectly: this doesn't match the empirical observations very well.
Why is it that >2,700 iterations in a row of the test loop (with 5 nvram get calls inside) work fine? Mind you, that is actually >13,500 successful nvram operations before 1 fails. Unless it's a buffer that runs out or something of that nature, I would expect the nvram reads to fail much more regularly if nl_pid were set wrongly. This could just as well be a faulty nvram controller or a bug in the CPU.
We also had a report of an AC86U that doesn't exhibit the faulty behavior. We hadn't independently verified it, but it deserved attention. Update: this report has since been retracted. At this point we have no reason to doubt that all AC86U routers have this fault.
4. The "ditch the AC86U and buy AX86U" solution
Well, I don't like it. You know the saying: trick me once, shame on you; trick me twice, shame on me. I don't feel confident enough to go and shovel more money into a company that has failed me before I see a genuine effort to fix the problem.
As I've mentioned, I had a couple of cheap TP-Link routers (sub 50 EUR) that worked flawlessly with custom firmware and a load of addons for around 5 years. I don't like the idea of having to ditch a 120 EUR device + shipping costs less than a month after unboxing it. Asking for a replacement of the same model? Difficult and likely useless. They are out of stock at the place where I ordered it, the shipping costs would be on me, and I have no guarantee that the replacement unit won't have the same defect. Quite the opposite: it will probably have it. On top of that, I'll have to deal with the heat dissipation problem all over again.
What about all the other users experiencing the faulty behavior? I don't want to hurt the Asus business but let's look for a better solution first before throwing away devices.
5. Reporting the problem
I believe this problem should be properly reported to Asus. I have to give them the benefit of the doubt that they were not aware of it and would make an effort to address it. I'm somewhat discouraged by the thermal design. I mean, how could they not have seen that? They clearly went for the thick thermal pads between the ICs and the heatsink, so they were completely aware of the big gaps. That's an indication of low standards in designing a premium device.
I would still try. I'd report both the poor heat dissipation and the nvram bug. If I contact regular customer service, I'll most probably be dismissed. So how should I contact them? Could I possibly write a notification letter and submit it via @RMerlin?
Edited the whole post, reason: refinement.