Since upgrading to firmware 386.3_2, my Internet will not stay connected for even a day.

I do see a problem in the syslog now. However, I'm having trouble posting it to the forum. Let me try a short version:

Code:
Dec 22 18:01:38 dnsmasq[1784]: failed to allocate 256 bytes

Dec 22 18:01:39 wlceventd: wlceventd_proc_event(508): eth5: Disassoc 00:11:BB:22:44:55, status: 0, reason: Disassociated because sending station is leaving (or has left) BSS (8), rssi:0

Dec 22 18:01:39 wlceventd: wlceventd_proc_event(527): eth5: Auth 00:11:BB:22:44:55, status: Successful (0), rssi:0

Dec 22 18:01:39 wlceventd: wlceventd_proc_event(556): eth5: Assoc 00:11:BB:22:44:55, status: Successful (0), rssi:0

Dec 22 18:01:39 dnsmasq-dhcp[1784]: DHCPDISCOVER(br0) 00:22:CC:22:33:ff

Dec 22 18:01:39 dnsmasq-dhcp[1784]: DHCPOFFER(br0) 192.168.100.80 00:22:CC:22:33:ff

Dec 22 18:01:39 dnsmasq-dhcp[1784]: failed to write /var/lib/misc/dnsmasq.leases: No space left on device (retry in 60 s)

Dec 22 18:01:39 dnsmasq-dhcp[1784]: DHCPREQUEST(br0) 192.168.100.80 00:22:CC:22:33:ff

Dec 22 18:01:39 dnsmasq-dhcp[1784]: DHCPACK(br0) 192.168.100.80 00:22:CC:22:33:ff

Dec 22 18:01:39 dnsmasq-dhcp[1784]: failed to write /var/lib/misc/dnsmasq.leases: No space left on device (retry in 60 s)

Dec 22 18:01:41 dnsmasq[1784]: failed to allocate 328 bytes
 
And there it is.

Code:
Chain POSTROUTING (policy ACCEPT 2073 packets, 147K bytes)
pkts bytes target     prot opt in     out     source               destination
1483  103K MASQUERADE  all  --  *      tun11   0.0.0.0/0            0.0.0.0/0
…

I suppose what you could do is install the NAT rule yourself from the nat-start script.

Code:
iptables -t nat -I POSTROUTING -o tun1+ -j MASQUERADE

I made it generic using a wildcard so that nothing is likely to wipe it out and it applies to any active OpenVPN client.

Of course, this does nothing to either identify or fix the underlying cause. But it would presumably keep things running.
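For anyone following along, here is a minimal sketch of what that nat-start entry could look like. It assumes the script lives at /jffs/scripts/nat-start with custom scripts enabled, and that the router's iptables build supports the -C (check) option. The function name ensure_masquerade and the IPT variable are only illustrative.

```shell
#!/bin/sh
# Sketch of a nat-start helper (assumed path: /jffs/scripts/nat-start).
# IPT and ensure_masquerade are illustrative names, not firmware features.
IPT="${IPT:-iptables}"

ensure_masquerade() {
    # $1 = outbound interface pattern, e.g. 'tun1+' for any OpenVPN client.
    # Insert the rule only if an identical one is not already present,
    # so repeated runs do not stack up duplicates.
    if ! $IPT -t nat -C POSTROUTING -o "$1" -j MASQUERADE 2>/dev/null; then
        $IPT -t nat -I POSTROUTING -o "$1" -j MASQUERADE
    fi
}
```

The script would then end with a call such as `ensure_masquerade 'tun1+'`. Checking before inserting is a design choice so that repeated firewall restarts do not pile up copies of the same rule.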
 
The error "failed to write /var/lib/misc/dnsmasq.leases: No space left on device" was not present in PRIOR syslogs I have saved. The last one I saved was from 4 days ago. The search terms "fail" and "error" are not present in the prior logs anywhere.

I don't know the command to list all the block devices on the router. But maybe this will help:


Code:
# df -h /

Filesystem                Size      Used Available Use% Mounted on

ubi:rootfs_ubifs         77.2M     64.6M     12.6M  84% /


# df -h /var

Filesystem                Size      Used Available Use% Mounted on

tmpfs                   215.0M    260.0K    214.7M   0% /var


# df -h /jffs

Filesystem                Size      Used Available Use% Mounted on

/dev/mtdblock9           47.0M      3.6M     43.4M   8% /jffs


This file is empty:

Code:
# ls -la /var/lib/misc/dnsmasq.leases

-rw-r--r--    1 admin root             0 Dec 22 21:15 /var/lib/misc/dnsmasq.leases
 
Code:
# free -m
             total       used       free     shared    buffers     cached
Mem:        440324     408200      32124       1612          0      19172
-/+ buffers/cache:     389028      51296
Swap:            0          0          0
 
This site will not allow me to post the section from the syslog that shows the vpnclient1 restart, even when wrapped in code tags.

The error is "This website is using a security service to protect itself from online attacks. The action you just performed triggered the security solution. There are several actions that could trigger this block including submitting a certain word or phrase, a SQL command or malformed data."
 
You should probably be posting anything that large to PasteBin anyway.
 
maybe relevant:

failed to write /var/lib/misc/dnsmasq.leases: No space left on device (retry in 60s) | SmallNetBuilder Forums
 
You should probably be posting anything that large to PasteBin anyway.
Do you need to see it? I was just pasting a small section about 20 lines long showing a few lines before and after restarting the vpnclient. It shows basically this:

Code:
Dec 22 19:14:16 dnsmasq[1784]: failed to allocate 256 bytes
restart vpnclient
Dec 22 19:14:17 dnsmasq-dhcp[1784]: failed to write /var/lib/misc/dnsmasq.leases: No space left on device (retry in 60 s)

Dec 22 19:14:20 dnsmasq[1784]: failed to allocate 328 bytes
Dec 22 19:14:20 dnsmasq[1784]: failed to allocate 256 bytes
 
Apparently restarting the vpnclient was not enough. While it did restore client Internet access, when I accessed the GUI to inspect a setting, the GUI became inaccessible via http and I had to reboot it. I rebooted via SSH ("reboot" command), but that was not sufficient. I ended up powering off and on, and now it is working normally.

I was checking the GUI looking for any enabled settings for AI Cloud or similar. Turns out that I don't have any of that enabled.

I did not see any other clues in the thread about "failed to write /var/lib/misc/dnsmasq.leases: No space left on device (retry in 60 s)"
 
Since I don't know what I'm looking for, or what might be relevant, I don't know if I need the syslog. I'm just looking to gather as much information as possible. As I said before about the limitations of debugging this kind of problem in these forums, we're often dependent on the OP to decide if something is or isn't relevant. But I might see something that's suspicious that might be overlooked by you.
 
I sent you the link via PM
 
Apparently restarting the vpnclient was not enough. While it did restore client Internet access, when I accessed the GUI to inspect a setting, the GUI became inaccessible via http and I had to reboot it. I rebooted via SSH ("reboot" command), but that was not sufficient. I ended up powering off and on, and now it is working normally.

I was checking the GUI looking for any enabled settings for AI Cloud or similar. Turns out that I don't have any of that enabled.

I did not see any other clues in the thread about "failed to write /var/lib/misc/dnsmasq.leases: No space left on device (retry in 60 s)"
UPDATE: apparently the failure of GUI access is a separate issue. It happened again and I fixed it with "service restart_httpd" and a reboot was not required.
 
Thanks for the syslog.

I systematically filtered out what I thought was irrelevant (dnsmasq events, wireless events, etc.) and with what was left, I could NOT find anything suspicious. I can see you logging in @ 19:08:29, which is presumably when you noticed the problem. Then you attempted the first restart @ 19:14:17 and it failed, but then the second restart succeeded. But I can't see anything specifically prior to 19:08:29 that triggered the need to restart.

Next time it fails, instead of a restart, you might try adding the NAT rule back, if only to verify that is the problem.

Code:
iptables -t nat -I POSTROUTING -o tun11 -j MASQUERADE
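Before adding the rule back, it may be worth confirming that it really is missing. A rough sketch, assuming the listing has the same column layout as the POSTROUTING output posted earlier in the thread (target in column 3, out interface in column 7). The helper name has_masquerade is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `iptables -t nat -L POSTROUTING -v -n` output
# on stdin and succeeds if a MASQUERADE rule exists for the given interface.
has_masquerade() {
    # $1 = exact out-interface name as shown in the listing, e.g. tun11
    awk -v ifc="$1" '
        $3 == "MASQUERADE" && $7 == ifc { found = 1 }
        END { exit !found }
    '
}
```

Usage would be something like `iptables -t nat -L POSTROUTING -v -n | has_masquerade tun11 || echo "tun11 MASQUERADE rule is missing"`.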

Only other thing I can suggest at the moment is perhaps putting together a script to monitor the NAT table for when this actually happens. Perhaps take a snapshot of the system every minute (ifconfig, iptables, the process table, etc.), timestamp it, and dump it to a file. Maybe we'll find more clues. Maybe there's more that's changing in those data structures than just the NAT table. Maybe there's some process starting up that's NOT leaving any footprints in the syslog. Problem is, we could still miss that process if it's short-lived. Doesn't help either that it takes so long to develop. That itself might be a clue. Something that happens only every 5-6 days or so.
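A bare-bones sketch of such a snapshot script is below. The LOGDIR location, the file naming, and the function name take_snapshot are all assumptions. Note that /tmp is RAM-backed on the router, so old snapshots would need pruning, and they will not survive a power cycle.

```shell
#!/bin/sh
# Hypothetical snapshot helper; LOGDIR and the file naming are assumptions.
LOGDIR="${LOGDIR:-/tmp/snapshots}"

take_snapshot() {
    mkdir -p "$LOGDIR" || return 1
    out="$LOGDIR/snap-$(date +%Y%m%d-%H%M%S).txt"
    {
        echo "=== snapshot $(date) ==="
        # Each command is allowed to fail so one missing tool
        # does not abort the whole snapshot.
        echo "--- ifconfig ---"
        ifconfig 2>&1 || echo "(ifconfig failed)"
        echo "--- nat table ---"
        iptables -t nat -L -v -n 2>&1 || echo "(iptables failed)"
        echo "--- process table ---"
        ps 2>&1 || echo "(ps failed)"
    } > "$out"
    # Print the file name so the caller can log or inspect it.
    echo "$out"
}
```

Saved as a script that ends with a `take_snapshot` call, it could be scheduled once a minute from cron (on Merlin that is usually done with the `cru` wrapper, but check the exact syntax on your build). Comparing the last snapshot before a failure with the first one after it should show whether anything besides the NAT table changed.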
 
Next time it fails, instead of a restart, you might try adding the NAT rule back, if only to verify that is the problem.

Code:
iptables -t nat -I POSTROUTING -o tun11 -j MASQUERADE

Thank you. Yes, I will try that. Easy enough to do.

Only other thing I can suggest at the moment is perhaps putting together a script to monitor the NAT table for when this actually happens. Perhaps take a snapshot of the system every minute (ifconfig, iptables, the process table, etc.), timestamp it, and dump it to a file.

Do you have a script available for that? I'll be glad to do it.

Doesn't help either that it takes so long to develop. That itself might be a clue. Something that happens only every 5-6 days or so.

The only thing I have seen that develops over time is this:

Code:
Dec 22 19:14:17 dnsmasq-dhcp[1784]: failed to write /var/lib/misc/dnsmasq.leases: No space left on device (retry in 60 s)

But that issue seems perplexing as well because my device is not out of space (according to all the commands I know to run such as "df", "free", etc.).
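One possible explanation, offered only as a guess: /var is tmpfs, which is backed by RAM, so a write there can fail when memory is short even though df shows plenty of room. That would also fit the "failed to allocate" lines from dnsmasq. A small sketch for tracking memory over time follows. The helper name meminfo_kb is made up, and it assumes the standard /proc/meminfo layout.

```shell
#!/bin/sh
# Hypothetical helper: print one field (in kB) from a meminfo-style file.
meminfo_kb() {
    # $1 = field name, e.g. MemFree
    # $2 = file to read (defaults to /proc/meminfo)
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}
```

Something like `echo "$(date) MemFree=$(meminfo_kb MemFree) kB"` appended to a file every few minutes would show whether free memory steadily shrinks in the days before a failure.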
 
I dismissed this as being relevant since (afaik) this is the first time that problem has popped up. You were having your current VPN problems long before this started happening, right?
 
It could have been there previously. I noticed it this time because we did not do a reboot and you asked for the syslog -- and I had GUI access at that moment.

I've been learning about router troubleshooting steps throughout this process, but one thing I still don't know how to do is get the logs without GUI access. (However, now I know how to restart the web interface without rebooting.)

As of now, the "No space left on device" error is not present in the logs after the reboot. But I will not be surprised if it appears simultaneously with the VPN issue.
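On getting the logs without the GUI: over SSH the syslog can usually be read straight from a file. The path /tmp/syslog.log used below is an assumption about this firmware, so verify it on the router first. The helper name dnsmasq_failures is only illustrative.

```shell
#!/bin/sh
# Hypothetical helper: show dnsmasq allocation and lease-write failures.
# The default path /tmp/syslog.log is an assumption; pass another file as $1.
dnsmasq_failures() {
    grep -E 'dnsmasq(-dhcp)?\[[0-9]+\]: failed to (allocate|write)' "${1:-/tmp/syslog.log}"
}
```

Running `dnsmasq_failures | tail -n 20` from an SSH session would list the most recent failures, and the whole file could be copied off the router with scp if that is available.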
 
Still waiting for the issue to recur...

What's it been, about two weeks now? That's quite a bit longer than previous failures. Did you make any changes other than just restart it? Any more of the dnsmasq error messages?
 
Yes, it's coming up on 14 days. I have not made any configuration changes since our prior discussion, but there have been fewer guests using the network.

Initially, the problem occurred about every 1-3 days (from memory). Then it seems like it extended to 5 days.

Around that time (probably a month ago), I made one config change to increase the DHCP server lease duration to 604800.

The last issue we discussed occurred after 7 or 8 days, which was a new record at the time for days without an issue (on this firmware).

Now it has gone 14 days without an issue or a restart. It seems like a clear pattern of longer periods between problems.

Any more of the dnsmasq error messages?

No. No errors. But I am expecting that when they do occur, it will be simultaneous with clients not being able to access the Internet.

I have heard that it takes routers a while to "settle in" after flashing new firmware. Does that seem related to what I'm experiencing here in any way?
 
Around that time (probably a month ago), I made one config change to increase the DHCP server lease duration to 604800.

I assume you increased it believing that would result in fewer writes. But hanging on to leases well beyond when guests leave can lead to exhaustion of the DHCP pool, at least w/ heavy usage, and depending on the size of the pool. If anything, I would have probably decreased the lease duration so they could be removed more often and make space available (the dnsmasq.leases error was about the lack of space).

I guess it just depends on the root cause of the problem w/ those leases. Increasing the lease duration makes sense if you believe the mere fact of writing to the file causes the problem. Whereas decreasing the lease duration makes sense (at least to me) if you take the error message at its word, and the storage is exhausted.
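To put numbers on this, a lease time of 604800 seconds is 7 days, and the number of leases currently held can be counted from the lease file. The sketch below assumes the usual dnsmasq lease format, where each IPv4 lease line starts with a numeric expiry time. The function names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking DHCP lease settings.

lease_days() {
    # $1 = lease time in seconds; prints whole days (86400 seconds per day)
    echo $(( $1 / 86400 ))
}

lease_count() {
    # $1 = lease file; counts lines whose first field is a numeric expiry time
    awk '$1 ~ /^[0-9]+$/ { n++ } END { print n + 0 }' "${1:-/var/lib/misc/dnsmasq.leases}"
}
```

Comparing `lease_count` against the size of the DHCP pool would show how close the pool is to exhaustion. An empty lease file, as in the earlier `ls -la` output, gives 0 because the write itself failed, so the count is only meaningful while writes are succeeding.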
 