Tech9
Part of the Furniture
Has anyone phoned home to Asus with these findings?
Does it break something in stock Asuswrt?
Needs to be tested, but I don't know how. Does Optware support strace? If so, then I imagine one would only need to follow the earlier troubleshooting procedures from @dave14305 and others to find out.
Does it break something in stock Asuswrt?
hahaha. For me, that is still the same message because I don't have this issue.
If it doesn't:
"The number you are trying to call is unavailable at the moment. Please try again later."
Well, first let me be clear that I don't own the GT-AC5300 router. It belongs to one of my best friends who has it running at his home, and I would need his consent to run any type of testing. He has trusted me enough to give me full access to the router since the pandemic started, when I helped him reconfigure his home network & set up their PCs so that his whole family (wife + 3 kids) could work & attend school remotely, and have conference calls, remote meetings, etc. So while I have complete access to the router via an OpenVPN Server to do regular checks & maintenance (e.g. F/W updates, reconfiguration based on needs, etc.), I cannot do what I please without his prior knowledge & approval.
@Martinski, do you mind testing any of our wrapper scripts to see if it would do its job on the GT-AC5300 and prevent stuck commands?
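For anyone following along who hasn't seen the wrapper scripts: the general idea of guarding against a stuck command can be sketched as below. This is only an illustration of the timeout-and-retry approach, not the actual wrapper scripts from this thread; the function name, the timeout value and the number of attempts are all assumptions of mine.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=5.0, attempts=3):
    """Run cmd, giving up on any attempt that takes longer than timeout_s.

    Returns the command's stdout as text, or None if every attempt hung.
    (Hypothetical helper; the name and defaults are illustrative only.)
    """
    for _ in range(attempts):
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout_s
            )
        except subprocess.TimeoutExpired:
            # subprocess.run has already killed the stuck child; try again.
            continue
        return result.stdout
    return None


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere. On a router the list
    # would be something like ["nvram", "get", "some_key"] (hypothetical key).
    print(run_with_timeout([sys.executable, "-c", "print('ok')"]))
```

The point is simply that the caller never waits forever: a hung call is killed after the timeout and retried a bounded number of times.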
Based on what I've seen on the GT-AC5300
All of the ASUS router's extra services (AiProtection, Traffic Analyzer/Traffic Monitor, Parental Controls, QoS, AiCloud, AiDisk, etc.) are disabled on my own RT-AC86U as well as on the GT-AC5300, so that may be why I have not observed any of the problems you listed. It's possible that some of those services make "nvram" or "wl" calls, which upon failure to return would affect the behavior of the service.
I don't know if it's related, but I've seen the following stuck on AC86U running Asuswrt:
- Clients List (rare)
- AiProtection logging (rare)
- Web History (common)
- Traffic Analyzer (common)
- when testing 384 code sometimes it turns off on reboot, haven't seen it on 386 though
Mostly TrendMicro components. A reboot fixes it. Perhaps that explains why folks set up the reboot scheduler.
I have not observed any of the problems you listed
I realized that it was not consistently accurate or reliable
That was a looong way to say "No", but fair enough.
Well, first let me be clear that I don't own the GT-AC5300 router. It belongs to one of my best friends who ...
... at this point I doubt that I'll be able to run the scripts on my friend's GT-AC5300 router.
OK I've done a lot more tracing on my RT-AX86U and I think I understand more about what's happening, but I'm also slightly confused.
I’m just saying that I don’t understand why eapd with pid 24095 uses nl_pid 24084 and then 24084+32770. Unless the precompiled binaries are broken in that regard.
lsof inode information to try and identify the originating process (which appears to be the case) I then came up with these processes:
I was trying to look back in old Merlin repo or John’s fork for any older “less-closed source” versions of nvram. I didn’t find anything I thought was useful.
So what is it that's different about the RT-AC86U? I assume there's a standard Broadcom API to read the nvram. So is this an issue with the API or the user space program (nvram)? From the nvram and wl traces it looks like the error isn't being properly trapped, falls through to the next bit of code and attempts to read the nonexistent socket.
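To make the "error isn't being properly trapped" point concrete, here is a minimal sketch of what trapping it could look like. It assumes the intended behaviour is to try another netlink port id when bind reports EADDRINUSE; that is my assumption, since the real library is closed source. try_bind is a hypothetical stand-in for the bind() call on the netlink socket, and bind_with_retry and fake_bind are names I made up.

```python
import errno


def bind_with_retry(try_bind, first_id, max_attempts=16):
    """Return the first port id accepted by try_bind, starting at first_id.

    try_bind(port_id) is a hypothetical callable that raises OSError on
    failure, standing in for bind() on an AF_NETLINK socket.
    """
    for candidate in range(first_id, first_id + max_attempts):
        try:
            try_bind(candidate)
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                continue  # id already taken: try the next one
            raise  # any other error is not something a retry can fix
        return candidate
    # Never fall through to send/receive on an unbound socket.
    raise OSError(errno.EADDRINUSE, "no free netlink port id found")


def fake_bind(port_id, in_use=frozenset({1217})):
    """Simulated bind: ids in in_use fail the way a busy netlink id would."""
    if port_id in in_use:
        raise OSError(errno.EADDRINUSE, "Address already in use")


print(bind_with_retry(fake_bind, 1217))  # prints 1218
```

The important part is the last line of bind_with_retry: if no id can be bound, the caller gets an error instead of carrying on to read from a socket that was never set up.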
I prefer the (google) oxford definition:
Definition of BLOVIATE: to speak or write verbosely and windily… (www.merriam-webster.com)
I think we’re approaching Broadcom’s dead-end alley.
+1
Spot the difference?
So would this have been an issue created from the blobs upstream of @RMerlin? BTW, superb diagnostic work on your part @ColinTaylor.
OK, I managed to set up a situation on my router that creates the pid-in-use problem seen on the RT-AC86U.
Here's @dave14305's strace for nvram get that hangs:
Code:
0.000084 stat64("/jffs", 0xffb8e7a0) = 0
0.000055 stat64("/jffs/nvram_war", 0xffb8e7a0) = 0
0.000117 socket(AF_NETLINK, SOCK_RAW, 0x1f /* NETLINK_??? */) = 3
0.000078 bind(3, {sa_family=AF_NETLINK, nl_pid=1098, nl_groups=00000000}, 12) = -1 EADDRINUSE (Address already in use)
0.000073 brk(NULL) = 0x42e000
0.000049 brk(0x44f000) = 0x44f000
0.000052 sendmsg(3, {msg_name=0xcffb8e830, msg_namelen=-4659160, msg_iov=NULL, msg_iovlen=0, msg_control=0xf6ec7d74ffb8ee04, msg_controllen=4142697156, msg_flags=MSG_DONTROUTE|MSG_CTRUNC|MSG_PROBE|MSG_TRUNC|MSG_DONTWAIT|MSG_WAITALL|MSG_FIN|MSG_CONFIRM|MSG_ERRQUEUE|MSG_NOSIGNAL|MSG_MORE|MSG_NO_SHARED_FRAGS|MSG_ZEROCOPY|MSG_FASTOPEN|MSG_CMSG_CLOEXEC|MSG_CMSG_COMPAT|0x1bb00000}, 0) = 36
0.000178 recvmsg(3,
And here's my strace:
Code:
0.000094 stat64("/jffs", 0xffe693f0) = 0
0.000054 stat64("/jffs/nvram_war", 0xffe693f0) = 0
0.000103 getpid() = 1217
0.000046 socket(AF_NETLINK, SOCK_RAW, 0x1f /* NETLINK_??? */) = 3<socket:[504128]>
0.000087 bind(3<socket:[504128]>, {sa_family=AF_NETLINK, nl_pid=1217, nl_groups=00000000}, 12) = -1 EADDRINUSE (Address already in use)
0.000074 bind(3<socket:[504128]>, {sa_family=AF_NETLINK, nl_pid=1218, nl_groups=00000000}, 12) = 0
0.000074 openat(AT_FDCWD</jffs/scripts>, "/proc/sys/kernel/pid_max", O_RDONLY) = 4</proc/sys/kernel/pid_max>
Spot the difference?
I don't think this is a problem as such with individual user space programs like nvram or wl or how they're being used, but a coding bug in a common module. My guess is that it's in libnvram.so, which is supplied as a prebuilt module for each platform.
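If anyone wants to check their own strace logs for the same pattern, something like the following rough sketch might help. It flags a trace in which sendmsg or recvmsg is called while the most recent bind had failed. It assumes one socket per trace and the strace line format shown in this thread; the function and variable names are my own, and the sample lines are excerpts from the two traces posted here.

```python
import re

# Matches a bind(...) call and captures its return value.
BIND_RE = re.compile(r"\bbind\(.*\)\s*=\s*(-?\d+)")
# Matches the start of a sendmsg/recvmsg call (even an unfinished one).
IO_RE = re.compile(r"\b(sendmsg|recvmsg)\(")


def uses_socket_after_failed_bind(trace_lines):
    """Return True if a send/recv follows a bind whose last result was < 0."""
    last_bind_failed = False
    for line in trace_lines:
        match = BIND_RE.search(line)
        if match:
            last_bind_failed = int(match.group(1)) < 0
        elif IO_RE.search(line) and last_bind_failed:
            return True
    return False


# Excerpt of the hanging trace (RT-AC86U).
HANGING = [
    '0.000078 bind(3, {sa_family=AF_NETLINK, nl_pid=1098, nl_groups=00000000}, 12) = -1 EADDRINUSE (Address already in use)',
    '0.000073 brk(NULL) = 0x42e000',
    '0.000178 recvmsg(3,',
]

# Excerpt of the trace where the second bind succeeds.
RETRIED = [
    '0.000087 bind(3<socket:[504128]>, {sa_family=AF_NETLINK, nl_pid=1217, nl_groups=00000000}, 12) = -1 EADDRINUSE (Address already in use)',
    '0.000074 bind(3<socket:[504128]>, {sa_family=AF_NETLINK, nl_pid=1218, nl_groups=00000000}, 12) = 0',
    '0.000074 openat(AT_FDCWD</jffs/scripts>, "/proc/sys/kernel/pid_max", O_RDONLY) = 4</proc/sys/kernel/pid_max>',
]

print(uses_socket_after_failed_bind(HANGING))  # True
print(uses_socket_after_failed_bind(RETRIED))  # False
```

It doesn't track file descriptors, so treat a True result as a hint to look closer at that trace rather than as proof.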