connmon - Internet connection monitoring

Status
Not open for further replies.
Something weird is going on with connmon. I have diversion, skynet, ntpmerlin, scribe, uidivstats, and uiscribe installed. I came home from work, looked at my crontab, and for some mysterious reason these were the only entries left:

Code:
*/5 * * * * /jffs/scripts/connmon generate daily #connmon_daily#
2 * * * * /jffs/scripts/connmon generate weekly #connmon_weekly#
3 */3 * * * /jffs/scripts/connmon generate monthly #connmon_monthly#

None of the cron jobs from the other programs were in the crontab anymore. I think connmon might be doing something funky.

I tested and the issue goes away when connmon isn't installed.
 
Some odd behavior I noticed with connmon: its entries don't stay in one spot. They start at the top of the crontab and eventually wind up at the bottom, so they must be continuously rewritten when certain functions of the connmon script run. The entries from the other scripts, like ntpmerlin and uidivstats, stay consistently in the same spot in the list.

I think the issue stems from the check for existing cron jobs in the crontab. The check isn't actually finding them, so the script rewrites them any time auto cron create comes up. The same goes for uninstalling: it doesn't find the cron jobs, so it leaves them in the crontab instead of removing them. Constantly rewriting the cron jobs is presumably what causes the problem, given that the crontab is held in volatile memory.
 
No issues here, and I have about 30 cron jobs.
That being said, I have corrected the cron detection, which will stop the entries being added repeatedly. I'm not sure why it would cause your other jobs to disappear.
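
As a rough illustration of the kind of check being discussed (not the actual connmon fix, which isn't shown in this thread), a cron entry can be added only when its marker tag is missing. This is a minimal sketch: ensure_cron_entry, remove_cron_entry and the CRON_FILE path are hypothetical names, and it works on a plain text file rather than the router's real crontab.

```shell
#!/bin/sh
# Hypothetical helpers, not taken from connmon.
# CRON_FILE is an assumed plain-text stand-in for the crontab.
CRON_FILE="${CRON_FILE:-/tmp/example_crontab}"

ensure_cron_entry() {
    tag="$1"      # unique marker, e.g. "#connmon_daily#"
    entry="$2"    # full cron line, ending with the marker
    touch "$CRON_FILE"
    # -F matches the tag literally, -q only sets the exit status
    if grep -qF "$tag" "$CRON_FILE"; then
        return 0  # already present, leave the file untouched
    fi
    printf '%s\n' "$entry" >> "$CRON_FILE"
}

remove_cron_entry() {
    tag="$1"
    [ -f "$CRON_FILE" ] || return 0
    # keep every line that does not carry the tag
    # (grep exits non-zero when nothing is left, which is fine here)
    grep -vF "$tag" "$CRON_FILE" > "$CRON_FILE.tmp" || true
    mv "$CRON_FILE.tmp" "$CRON_FILE"
}
```

Calling ensure_cron_entry twice with the same tag leaves a single line in the file, and remove_cron_entry matches on the same tag, so install and uninstall agree on what counts as "already there".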
 
The only thing I can think of is that auto cron is invoked any time the ping test runs. I didn't look that deep, but I do know it was being invoked every 5 minutes. From what I can tell, the other cron jobs were still listed for about 3 hours of that. Then poof, they were gone, as if erased like cached memory.

I will test your fix later. As far as I can tell, the crontab is staying persistent under normal conditions.
 
Thanks for the awesome script!
I was wondering if it's possible to implement a custom frequency and duration in the settings? For example, my ISP has small drops, which wouldn't be caught by the 5-minute checks, so I wanted to check for 10 seconds every minute. I changed the cron job to run every minute and updated the connmon script to ping for 10 seconds, but the graph plotting seems off/crammed afterwards.
 
The first graph will be plotting every minute - what did you expect it to look like?
 
After about 2 weeks of testing this script, I think it draws some nice graphs but actually fails to monitor the connection. Here is why, with examples:

- a large file download takes all of the ISP bandwidth, the connection is actually very healthy, but the Quality graph drops to 80% or below. This happens every time when ping and jitter creep up, a normal condition when the network is very busy. Kind of wrong logic for connection quality evaluation, in my opinion. In other words, when I had the opportunity to use 100% of the capacity of my connection it was actually at 80% Quality?

- WAN cable is unplugged, connection Quality should be 0%. The graph just skips some dots and continues with 100% after the WAN is up and running again. Same with Ping and Jitter. Did I really have 12ms ping to Google with WAN cable unplugged? No data for the period the WAN was down, just a straight line connecting the last readings. In other words, by looking at the graphs it's really hard to tell if the connection was down or not.

And one thing that has to be fixed, because it doesn't do what it says it does:

- the script fails to remove the files it creates. Answering Yes to the question "Do you want to delete connmon config and stats? (y/n)" doesn't remove the stats files. I uninstalled the script and installed it again a few days later, and I can still see the graphs from the previous installation. But I'm happy to see I had 100% connection Quality during the period the script wasn't working. :)

Tested on RT-AC86U, Asuswrt-Merlin 384.12
 
The first graph will be plotting every minute - what did you expect it to look like?
Here's what the plotting looks like when I hover over one of the spikes: three different ping results plotted for the same time, and nothing under jitter.

[Screenshot: q93cWIk.png]

And one thing that has to be fixed, because it doesn't do what it says it does:

- the script fails to remove the files it creates. Answering Yes to the question "Do you want to delete connmon config and stats? (y/n)" doesn't remove the stats files. I uninstalled the script and installed it again a few days later, and I can still see the graphs from the previous installation. But I'm happy to see I had 100% connection Quality during the period the script wasn't working. :)

Tested on RT-AC86U, Asuswrt-Merlin 384.12

I also noticed this, and found that the sqlite database is stored at /jffs/scripts/connmon.d/connstats.db, so you can manually clear the entries from it using:
# sqlite3 /jffs/scripts/connmon.d/connstats.db
SQLite version 3.27.2 2019-02-25 16:06:06
Enter ".help" for usage hints.
sqlite> delete from connstats;
 
You're saturating your connection, so some packets are lost during the ping. This means there is a degradation in the quality of your connection for other devices while the download is running. Quality isn't a direct equivalent to line speed.

If the test fails, it doesn't plot the values. I can change this to plot 0s instead. Missing chunks of graph should serve to show you something isn't right though.

That's probably something I missed when I overhauled it from rrd to sqlite. An easy fix which I'll get done this weekend.
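
To make the point about lost packets concrete, here is a minimal sketch that derives a quality percentage from the summary line ping prints. The formula (quality = 100 minus packet loss) and the helper name quality_from_ping_summary are assumptions for illustration, not taken from the connmon source. It also shows the "plot 0 on failure" alternative mentioned above.

```shell
#!/bin/sh
# Hypothetical helper, not taken from connmon.
# Assumption: quality = 100 - packet loss percentage.
quality_from_ping_summary() {
    # $1: ping output containing a summary such as
    #     "10 packets transmitted, 8 packets received, 20% packet loss"
    # Capture the integer part of the loss figure (a decimal part is ignored).
    loss=$(printf '%s\n' "$1" |
        sed -n 's/.* \([0-9][0-9]*\)[.0-9]*% packet loss.*/\1/p')
    if [ -z "$loss" ]; then
        # No summary line at all, i.e. the test failed outright.
        # This sketch reports 0 instead of skipping the data point.
        echo 0
        return 0
    fi
    echo $((100 - loss))
}
```

With 2 of 10 pings lost this prints 80, which matches the kind of dip described during a saturating download even though the line itself is up.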
 
You'll need to zoom in. At the default zoom level your mouse pointer is overlaying multiple data points.
 
If the test fails, it doesn't plot the values. I can change this to plot 0s instead. Missing chunks of graph should serve to show you something isn't right though.
Plotting 0s, or perhaps even a small log below the graphs showing down/up events, would be awesome!

You'll need to zoom in, at the default zoom level your mouse pointer is overlaying multiple data points
That seems to work fine. Any idea why nothing is plotted for jitter, though?
 
If the test fails, it doesn't plot the values.

And this is the main issue. It is very hard to spot short periods of WAN down in a 24h period.
If the Ping and Jitter show 500 and the Quality 0, for example, the graphs will be much more usable.

So you can manually clear the entries from it

I can, but this is a bug in uninstall procedure.
 
And this is the main issue. It is very hard to spot short periods of WAN down in a 24h period.
If the Ping and Jitter show 500 and the Quality 0, for example, the graphs will be much more usable.

I completely agree with you here. Especially after changing the cron to run every minute and having to zoom out so the plots no longer overlap. That's why I thought a small dropdown/textbox below the graphs with downtime events would have been handy.
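
A downtime list like that could be derived from the per-test results. The following is only a sketch of the idea: list_state_changes is a hypothetical helper, and the "timestamp status" input format is an assumption, not how connmon stores its data.

```shell
#!/bin/sh
# Hypothetical helper, not part of connmon.
# Reads "timestamp status" lines on stdin (status is "up" or "down")
# and prints one line for each change of state.
list_state_changes() {
    awk '
        NR == 1 { prev = $2; next }   # first sample only sets the baseline
        $2 != prev {                  # state differs from the previous sample
            print $1, prev "->" $2
            prev = $2
        }
    '
}
```

Fed one line per test run, it prints only the moments the connection went down or came back, which stays readable however far the graph is zoomed out.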
 
And this is the main issue. It is very hard to spot short periods of WAN down in a 24h period.
If the Ping and Jitter show 500 and the Quality 0, for example, the graphs will be much more usable.
The issue with plotting 0 (or whatever) with the current setup is that the longer-period graphs (e.g. 30 days) get a skewed average.
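
One way around the skew, sketched here under assumptions, is to record a failed test as its own marker rather than as a 0, then average only the successful tests and count the failures separately. The helper name summarise_quality, the "fail" marker and the input format are all hypothetical.

```shell
#!/bin/sh
# Hypothetical helper, not part of connmon.
# Reads "timestamp value" lines on stdin, where value is either a
# quality percentage or the literal word "fail" for a failed test.
summarise_quality() {
    awk '
        $2 == "fail" { failed++; next }   # count outages, keep them out of the mean
        { sum += $2; ok++ }
        END {
            if (ok > 0) printf "avg=%.1f ", sum / ok
            else        printf "avg=n/a "
            printf "ok=%d failed=%d\n", ok, failed
        }
    '
}
```

The long-period average then reflects only the tests that ran, while the failure count still shows that outages happened.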
 
The issue with plotting 0 (or whatever) with the current setup is that it means the longer period graphs (e.g. 30 days) get a skewed average

There is no point in seeing an average of, let's say, 98% over a 30-day period if I can't see when the WAN was down and for how long. Currently we have a connection monitoring script that doesn't actually monitor service disruptions, and that is the most important piece of information.
 