A Guide About Installing ZeroTier on ASUS AC68U Router

MissingTwins · Jan 23, 2023

itsming said:
Can I ask you where the 10.0.0.4 comes from given that 192.168.9.0 is the router ip address in LAN? Is it the router ip address in Zerotier network?

10.x.x.x/24 is my ZeroTier network range. For example, my router has the IP addresses 192.168.9.1 (eth0) and 10.9.8.4 (zt0).

itsming said:
Can I also ask you how to set this rule in the router? Is it the page shown in the attached Fig 1?

All IP routes are set with commands, such as `ip route add 192.168.7.0/24 via 10.9.8.3`.
I put them into scripts, which run either on wan-event or from crontab.
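
A minimal sketch of how such a script might be hooked up. It assumes the Asuswrt-Merlin user-script convention in which /jffs/scripts/wan-event receives the WAN unit as $1 and the event name as $2, and it reuses the /jffs/scripts/lan-route-table.sh path that appears later in this thread. The function name and the DRY_RUN switch are made up for illustration.

```shell
#!/bin/sh
# Sketch of a /jffs/scripts/wan-event handler (hypothetical layout).
# Assumption: Merlin calls wan-event with $1 = WAN unit, $2 = event name.

ROUTE_SCRIPT=/jffs/scripts/lan-route-table.sh
DRY_RUN=${DRY_RUN:-1}    # 1 = only print what would run

# Re-apply the ZeroTier routes whenever the WAN reports "connected".
on_wan_event() {
    unit=$1
    event=$2
    if [ "$event" = "connected" ]; then
        if [ "$DRY_RUN" = "1" ]; then
            echo "would run $ROUTE_SCRIPT (wan$unit $event)"
        else
            "$ROUTE_SCRIPT"
        fi
    else
        echo "ignoring wan$unit $event"
    fi
}

on_wan_event "${1:-0}" "${2:-connected}"
```

For the crontab variant, Asuswrt's `cru a <id> "<schedule> <command>"` helper should be able to register the same script on a timer; treat the exact schedule as something to tune for your own setup.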

itsming said:
Aim: devices connected to my router that is connected to my Zerotier network can use the ip address of my VPS that is also inside the Zerotier network without installing the ZeroTier One app (to save battery and labour).

Yes, this is also what this tutorial does, and it is my goal too.

itsming said:
Progress so far: I have managed to ping from any device in my Zerotier network to my router, and vice versa. I have also managed to ping from any device connected to my router to any device in my Zerotier network, but not vice versa.

If you want other ZeroTier devices to be able to ping the devices connected to your router, you need to add an IP route on those ZeroTier devices; otherwise they have no idea where to send the packets.

itsming said:
Ping outputs some logs like "Redirect Host (New addr: 192.168.196.90)", which is my router's IP address in the ZeroTier network. I guess it may be because devices connected to my router are not shown in the ZeroTier web console. Meanwhile, the device that has the ZeroTier One app installed (shown in the ZeroTier web console) reaches the VPS IP address without any issues.

For example, assume your router's IP address in the ZeroTier network is 192.168.196.90 and its local IP address is 192.168.9.1.
Then you need to add this IP route on every remote ZeroTier node you have:
`ip route add 192.168.9.0/24 via 192.168.196.90`
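
As a small sketch of the same idea, the snippet below prints the route command for a remote node so it can be reviewed before being run. The addresses are the ones from the example above; the function name is hypothetical, and `replace` is used instead of `add` only because it does not fail when the route already exists.

```shell
#!/bin/sh
# Values from the example above; replace with your own.
ROUTER_ZT_IP=192.168.196.90   # router's address inside the ZeroTier network
ROUTER_LAN=192.168.9.0/24     # LAN behind the router

# Print the command instead of running it, so it can be reviewed first.
lan_route_cmd() {
    printf 'ip route replace %s via %s\n' "$1" "$2"
}

lan_route_cmd "$ROUTER_LAN" "$ROUTER_ZT_IP"

# After applying it on a remote node, the chosen path can be checked with:
#   ip route get 192.168.9.1
```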

itsming said:
Encountered problem: the ip address shown in the device connected to my router is not the ip address shown in my VPS. The Managed Routes in Zerotier Central page is shown in Fig 2. Can you kindly assist?

I don't use the Managed Routes in the ZeroTier web UI, because they are pushed to every ZeroTier node and can easily interfere with local traffic, causing performance problems. So I only add IP routes on the nodes that need them.

Sorry for the late response, I didn't get the mail notification.

MissingTwins · Jan 23, 2023

damageboy said:
Hi @MissingTwins,
I was having issues with your guide until I read the following comment, which did indeed "solve" the issues I was having:

Would you mind quoting the issues you had? I may have a clue.

damageboy said:
I have also noticed that you're using an ac68u. This router can only run ZeroTier up to 1.4.6; if it is running any version higher than 1.4.6, you have to downgrade.

The scripts also work with later versions of ZeroTier, including the latest. For the ac68u, though, there is no better version than 1.4.6.

damageboy said:
Is there any workaround for this? I have some reservations about using such an old version, and would rather solve the issue if possible, do you have any additional information about what the root cause is?

But for the ac86u and ax86u, the latest version of ZeroTier is recommended.

Here is the updated version of the scripts:

/opt/etc/init.d/S90zerotier-one.sh

Bash:

#! /bin/sh

source /jffs/scripts/mylib.sh

# -----------------------------------------
# Example : baseRoute
# Argu    : none
# Input   : None
# Return  : None
function baseRoute() {

    # Allow LAN via zerotier
    /jffs/scripts/lan-route-table.sh
}

# -----------------------------------------
# Example : initCheck
# Argu    : none
# Input   : None
# Return  : None
function initCheck() {

    ZT_ONLINE=$(zerotier-cli info| grep -i "online")
    if [ -z "$ZT_ONLINE" ];then
        sysLOG "ZT OFFLINE! restarting" warning ;
        /opt/etc/init.d/S91zerotier-one restart
        return 0;
    fi
    
    ZT_INTERFACE=$(ip -o link show | grep -oP '\d{1,2}:\s\Kzt[\w]+' | head -n1);
    # fallback
    if [ -z "$ZT_INTERFACE" ];then
        echo "get zt interface empty, try another way";
        ZT_INTERFACE=$(ip -o link show | awk -F': ' '{print $2}'|grep "^zt");
    fi
    echo "Is zt empty: $ZT_INTERFACE";
    
    # sometimes the zt0 device disappears until zerotier is restarted
    if [ -z "$ZT_INTERFACE" ];then
        sysLOG "zt+ dev disappeared! Restarting" warning ;
        /opt/etc/init.d/S91zerotier-one restart
    fi
    
    # The ZeroTier HTTP service does not work right after boot, so one restart is needed
    if [ -f "/tmp/first-start.flag" ];then
        echo "found zerotier first launch"
        tmpup=$(uptime | cut -c 14- )
        tmpup=${tmpup%% load*}
        is_long_enough=""
        # "days" or an "h:mm" field means the router has been up far longer than 2 minutes
        echo "$tmpup" | grep -qi "days" && is_long_enough=1
        echo "$tmpup" | grep -q ":" && is_long_enough=1
        if echo "$tmpup" | grep -qi "min,"; then
            # keep only the digits in front of "min"
            tmpup=$(echo "${tmpup%%min*}" | tr -cd '0-9')
            # greater than 2 minutes
            [ "${tmpup:-0}" -gt 2 ] && is_long_enough=1
        fi
        # test for non-empty: "-gt" on an empty string is an error
        if [ -n "$is_long_enough" ];then
            sysLOG "due to first launch, restart zerotier" info ;
            /opt/etc/init.d/S91zerotier-one restart
            rm /tmp/first-start.flag
        fi
    fi
    
    # MTU is causing lots of problems
    if [ ! -z "$ZT_INTERFACE" ];then ifconfig "$ZT_INTERFACE" mtu 1388; fi

    # add base route tables
    if [ ! -z "$ZT_INTERFACE" ];then baseRoute; fi
}

# -----------------------------------------

case "$1" in
  start)
    # -------
    if lsmod | grep -q tun ;
    then echo "mod tun ready" ;
    else
        modprobe tun;
        sysLOG "starting modprobe tun, zerotier should start in one minute" notice ;
        exit 0;
    fi
    # -------
    if ( pidof zerotier-one )
    then
        echo "ZeroTier-One is already running.";
        initCheck ;
    else
        echo "Starting ZeroTier-One" ;
        /opt/bin/zerotier-one -d ;
        sysLOG "Zerotier Started" notice ;
        initCheck ;
    fi
    ;;
  stop)
    # -------
    if ( pidof zerotier-one )
    then
        echo "Stopping ZeroTier-One";
        killall zerotier-one
        sysLOG "Zerotier Stopped" notice ;
    else
        echo "ZeroTier-One was not running" ;
    fi
    ;;
  status)
    # -------
    if ( pidof zerotier-one )
    then echo "ZeroTier-One is running."
    else echo "ZeroTier-One is NOT running"
    fi
    ;;
  *)
    echo "Usage: $0 {start|stop|status}"
    exit 1
    ;;
esac

# -----------------------------------------
exit 0
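
The uptime check in initCheck parses the human-readable output of `uptime`, whose layout changes with how long the router has been up. As an alternative sketch (not the author's method), the same "up for more than two minutes" test can read whole seconds from /proc/uptime, which is assumed to exist as on other Linux systems. UPTIME_FILE and both function names are hypothetical.

```shell
#!/bin/sh
UPTIME_FILE=${UPTIME_FILE:-/proc/uptime}

# Print whole seconds since boot (first field, fraction dropped).
uptime_seconds() {
    cut -d. -f1 "$UPTIME_FILE"
}

# Succeed when the system has been up longer than $1 seconds.
up_longer_than() {
    [ "$(uptime_seconds)" -gt "$1" ]
}

if up_longer_than 120; then
    echo "up for more than 2 minutes"
else
    echo "still within the first 2 minutes"
fi
```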

/jffs/scripts/mylib.sh

Bash:

#!/bin/sh

MY_BASE=$(echo $0)
#Avoid sourcing scripts multiple times echo $0 = -sh
[[ "${_NAME_OF_THIS_LIBSCRIPT:-""}" == "yes" ]] && return 0
[[ "${MY_BASE:0:1}" != "-" ]] && _NAME_OF_THIS_LIBSCRIPT=yes

# Log Tag
#[[ "${LOG_TAG:-""}" ]] && return 0
LOG_TAG=`basename "$0"`
LOG_TAG="DarthTwins $LOG_TAG"
printf "\n\n$(date '+%m/%d %T') LOG_TAG=$LOG_TAG\n"

$(bash -c ": > /dev/tty" )
[ $? != 0 ] || { IS_CONSOLE=1; echo "This is a console"; }
 
# -----------------------------------------
# Example : sysLOG "We have a problem!" error
# Argu    : $1 logs
#           $2 error/notice/warning
# Input   : $IS_CONSOLE
# Return  : None
function sysLOG() {
    if [ -n "$IS_CONSOLE" ]; then
        echo "$1" | tee /dev/tty | tr -d '\n' | logger -t $LOG_TAG -p user.$2;
    else
        echo "$1" | tr -d '\n' | logger -t $LOG_TAG -p user.$2;
    fi
}

# -----------------------------------------
# Example : iptablesINS "INPUT -i ppp+ -p tcp --dport 11111 -j ACCEPT"
# Argu    : $1 iptables-rules
# Input   : None
# Return  : None
function iptablesINS() {
    
    local reRAW RECODE CMD_STR
    CMD_STR="iptables -I $1"
    reRAW=$( iptables -C $1 2>&1 )
    RECODE=$?
    reRAW=$( echo -n "$reRAW" | head -n 1  )
    if [ $RECODE -ge 2 ]; then
        sysLOG "iptables -I $1 Err=$reRAW" error
    elif [ $RECODE -eq 1 ]; then
        reRAW=$( eval $CMD_STR 2>&1 )
        [ $? != 0 ] && sysLOG "Error $CMD_STR" error || sysLOG "Success $CMD_STR" notice
    else
        echo "Existed $CMD_STR"
    fi
}

# -----------------------------------------
# Example : ip6tablesINS "INPUT -i ppp+ -p tcp --dport 11111 -j ACCEPT"
# Argu    : $1 ip6tables-rules
# Input   : None
# Return  : None
function ip6tablesINS() {
    
    local reRAW RECODE CMD_STR
    CMD_STR="ip6tables -I $1"
    reRAW=$( ip6tables -C $1 2>&1 )
    RECODE=$?
    reRAW=$( echo -n "$reRAW" | head -n 1  )
    # echo "=$RECODE |$1 | RE="$reRAW
    if [ $RECODE -ge 2 ]; then
        sysLOG "ip6tables -I $1 Err=$reRAW" error
    elif [ $RECODE -eq 1 ]; then
        reRAW=$( eval $CMD_STR 2>&1 )
        [ $? != 0 ] && sysLOG "Error $CMD_STR" error || sysLOG "Success $CMD_STR" notice
    else
        echo "Existed $CMD_STR"
    fi
}

# -----------------------------------------
# Example : addRoute "192.168.8.0/24" "10.9.8.5"
# Argu    : $1 net
# Argu    : $2 gate
# Input   : None
# Return  : None
function addRoute() {
    local TEST_ARGS msg CMD_STR
    CMD_STR="ip route replace $1 via $2 "
    TEST_ARGS=$(ip route show "$1" | wc -l)
    if [ $TEST_ARGS -eq 0 ]; then
        msg=$( eval $CMD_STR 2>&1 )
        [ $? != 0 ] && sysLOG "Failed $CMD_STR err=$msg" error || sysLOG "Success $CMD_STR" notice
    fi
}

# -----------------------------------------
# Example : baseZTRoute "10.9.8.0/24"
# Argu    : $1 ZeroTier network in CIDR form
# Input   : None
# Return  : None
function baseZTRoute() {
    local ZT_NETWORK
    ZT_NETWORK=$1
    iptables -C INPUT -i zt+ -j ACCEPT
    if [ $? != 0 ]; then
        iptables -I INPUT 1 -i zt+ -j ACCEPT
        iptables -t nat -I PREROUTING -i zt+ -d $ZT_NETWORK -p tcp -m multiport --dport 21,22,80 -j DNAT --to-destination `nvram get lan_ipaddr`
        iptables -t nat -A POSTROUTING -o br0 -s $ZT_NETWORK -j SNAT --to-source `nvram get lan_ipaddr`
        iptables -I FORWARD -i zt+ -d `nvram get lan_ipaddr`/24 -j ACCEPT
        iptables -I FORWARD -i br0 -d $ZT_NETWORK -j ACCEPT
        sysLOG "zt+ rules added $ZT_NETWORK" notice
    else
        echo "Existed baseZTRoute $ZT_NETWORK"
    fi
}

/jffs/scripts/lan-route-table.sh

Bash:

#!/bin/sh

source /jffs/scripts/mylib.sh

# -----------------------------------------
## Check route for LANs
#echo "zt lan route:"
#addRoute "192.168.9.0/24"   "10.9.8.5"
#addRoute "192.168.10.0/24"  "10.9.8.2"

# -----------------------------------------
#echo "zt iptables:"
baseZTRoute "10.9.8.0/24"
    
# -----------------------------------------
#echo "9399 iptables:"
iptablesINS "INPUT -i eth0 -p tcp --destination-port 9399 -j ACCEPT"
iptablesINS "INPUT -i eth0 -p udp --destination-port 9399 -j ACCEPT"

ip6tablesINS "INPUT -i eth0 -p tcp --destination-port 9399 -j ACCEPT"
ip6tablesINS "INPUT -i eth0 -p udp --destination-port 9399 -j ACCEPT"

damageboy · Jan 23, 2023

@MissingTwins

I am indeed running an AC-68U:

Linux RT-AC68U-C9B8 2.6.36.4brcmarm #1 SMP PREEMPT Fri Jan 6 15:04:31 EST 2023 armv7l ASUSWRT-Merlin

My specific issue was that although the packet counters for the added SNAT/INPUT rules looked as if packets were flowing properly, I couldn't get ICMP / TCP packets to actually go through.

I attempted to inspect traffic by installing tshark to see if anything was coming "down the pipe" with:

tshark -i ztyxa4t4hd -f "icmp"

But nothing was, until I downgraded, that is.

porktomatoes · Aug 27, 2023

Thanks for the guide. In my case, using ASUS GT-AX6000 with Asuswrt-Merlin 3004.388.4, I was able to expose my home LAN, as well as Internet access via my home ISP, to other ZT clients using only the following rules:

Bash:

# Internet access / "full tunnel mode":
iptables -I FORWARD -i zt+ -o eth0 -s 10.242.0.0/16 -j ACCEPT
iptables -I FORWARD -i zt+ -o eth0 -s 10.242.0.0/16 -d 10.0.0.0/8 -j DROP # optional
iptables -I FORWARD -i zt+ -o eth0 -s 10.242.0.0/16 -d 172.16.0.0/12 -j DROP # optional
iptables -I FORWARD -i zt+ -o eth0 -s 10.242.0.0/16 -d 192.168.0.0/16 -j DROP # optional
# LAN:
iptables -I FORWARD -i zt+ -o br0 -s 10.242.0.0/16 -d `nvram get lan_ipaddr`/24 -j ACCEPT
iptables -I INPUT -i zt+ -s 10.242.0.0/16 -d `nvram get lan_ipaddr` -j ACCEPT # optional; allows access to the ASUS router itself (e.g. ping, ssh, web iface, etc.)

10.242.0.0/16 is my ZT virtual LAN, and 10.242.148.100 is the router's IP within it.
On the ZeroTier side, I've set up these managed routes:

Code:

0.0.0.0/0       via 10.242.148.100  # Internet access
192.168.50.0/24 via 10.242.148.100  # LAN access

NB.: The "DROP" rules are not strictly necessary, but I added them just to be on the safe side (these rules will e.g. deny ZT clients access to your ISP's modem). Also, I found that using NAT was not needed; somehow it just works like this.

With this setup, using an Android phone with the ZT app over my mobile carrier, I can access my home LAN and also browse the Internet with my home IP address ("Route all traffic through ZeroTier" needs to be enabled in the Android app).
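
One detail worth spelling out about the rules above: `iptables -I` without a rule number inserts at the top of the chain, so the rules end up in the reverse of the order in which they were entered, and the DROP rules are evaluated before the broad ACCEPT. The sketch below only simulates that ordering with a shell variable; no real iptables call is made, and the helper name is made up.

```shell
#!/bin/sh
CHAIN=""

# Simulate "iptables -I FORWARD <rule>": the new rule goes to the top.
insert_top() {
    if [ -z "$CHAIN" ]; then
        CHAIN=$1
    else
        CHAIN="$1
$CHAIN"
    fi
}

insert_top "-s 10.242.0.0/16 -j ACCEPT"
insert_top "-s 10.242.0.0/16 -d 10.0.0.0/8 -j DROP"
insert_top "-s 10.242.0.0/16 -d 172.16.0.0/12 -j DROP"
insert_top "-s 10.242.0.0/16 -d 192.168.0.0/16 -j DROP"

# Resulting evaluation order; the first matching rule wins.
printf '%s\n' "$CHAIN" | cat -n
```

If the ACCEPT were inserted last instead, it would sit on top and the DROP rules would never match.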

chetstone · Sep 29, 2023

@MissingTwins

Thanks so much for this guide and for your patience in answering all these questions for years.

I have a question about something you told a previous questioner:

Nesting ZeroTier devices is not recommended; it will definitely slow down the performance of the ZeroTier devices inside your LAN. Nested ZeroTier devices also sometimes cause strange drop-outs or unreachable nodes.

What does nesting ZT devices mean? If you have devices with ZT installed on a LAN, then add ZT to the router for the LAN, does this create nested devices?

I have a number of devices with ZT installed and have been using this setup without issue for years. Now I am interested in installing ZT on the router in order to be able to manage it remotely via WEB UI and ssh. And also perhaps to access one or two other devices from outside the LAN, like a printer. What is the best way to do this without compromising performance? I really would rather not change the configuration of my existing ZT devices.

Thanks

MissingTwins · Oct 1, 2023

chetstone said:
What does nesting ZT devices mean? If you have devices with ZT installed on a LAN, then add ZT to the router for the LAN, does this create nested devices?

For example, suppose you have two routers: the first is the main router with direct access to the internet, and the second is connected to the main router's LAN port and operates on its own IP subnet. It is recommended to install ZT on the main router only. Any device in the LAN with ZT installed is referred to as 'nesting ZT', and this can slow down the specific device with ZT installed.

If you install ZT on both routers, the second router's network performance and stability might be compromised.

However, the issue of nesting ZeroTier was causing problems in versions 1.4.x and 1.6.x. It is not that apparent in version 1.10.x.

chetstone · Oct 2, 2023

MissingTwins said:
For example, suppose you have two routers: the first is the main router with direct access to the internet, and the second is connected to the main router's LAN port and operates on its own IP subnet. It is recommended to install ZT on the main router only. Any device in the LAN with ZT installed is referred to as 'nesting ZT', and this can slow down the specific device with ZT installed.

If you install ZT on both routers, the second router's network performance and stability might be compromised.

Thanks so much for the reply. I don't have nested routers, but I do have computers with ZT installed on the LAN. I may be overthinking this, but it seems to me that it's unlikely there would be a problem if you simply install ZT on the router to be able to manage it, as you have done in Post #1 of this thread. The problem would come IMO when you expose ZT routing to LAN members as you specify in Post #5 (and is specified again many times in different variations in the thread.) Then a computer with ZT enabled would have two ways to get to it over zerotier, and that could cause problems.

So based on that theory, I decided to try advertising routes to only the few (non-ZT capable) devices explicitly, rather than the whole subnet, and it seems to be working OK in my limited testing. Here are my Managed Routes:

10.147.17.0/24 Zerotier net
192.168.108.0/24 LAN 1
172.24.0.0/24 LAN 2
10.147.17.64 LAN 1 Router
10.147.17.128 LAN 2 Router

Here is my firewall-start script:


#!/bin/sh
logger -t "custom iptables" "Enter" -p user.notice
iptables -C INPUT -i zt+ -j ACCEPT
if [ $? != 0 ]; then
    iptables -I INPUT -i zt+ -j ACCEPT
    iptables -t nat -I PREROUTING -i zt+ -d 10.147.17.0/24 -p tcp -m multiport --dport 21,22,80 -j DNAT --to-destination `nvram get lan_ipaddr`
    iptables -I INPUT -p udp --dport 9993 -j ACCEPT
    iptables -I FORWARD -i zt+ -j ACCEPT
    logger -t "custom iptables" "rules added" -p user.notice
else
    logger -t "custom iptables" "rules existed skip" -p user.notice
fi

I don't remember where I came up with those iptables rules, but it seems to work OK. I notice that in your suggestion you used a POSTROUTING Nat rule:
iptables -t nat -A POSTROUTING -o br0 -s 10.9.8.0/24 -j SNAT --to-source `nvram get lan_ipaddr`
What's the difference vs. the PREROUTING rule I use above?

MissingTwins said:
However, the issue of nesting ZeroTier was causing problems in versions 1.4.x and 1.6.x. It is not that apparent in version 1.10.x.

I'm on RT-AC68W so I'm using 1.4.6. I wanted to try the latest version, 1.12.2, to see if that bug might have been fixed, but the latest offered by entware is 1.10.4.

So do you think what I'm doing is risky?

By the way, the updated version of the scripts you posted above is incomplete. In /opt/etc/init.d/S90zerotier-one.sh you reference /opt/etc/init.d/S91zerotier-one.sh, and that script is not provided. I tried to guess what it did but it didn't work. Also you reference a /tmp/first-start file but don't mention when or where it gets created. So I abandoned those scripts and went back to the ones you provided in your first post. Those work great.

Finally, what tool do you use to create your cool network map graphic?

MissingTwins · Oct 3, 2023

chetstone said:
The problem would come IMO when you expose ZT routing to LAN members as you specify in Post #5 (and is specified again many times in different variations in the thread.) Then a computer with ZT enabled would have two ways to get to it over zerotier, and that could cause problems.

Exactly!

chetstone said:
So based on that theory, I decided to try advertising routes to only the few (non-ZT capable) devices explicitly, rather than the whole subnet, and it seems to be working OK in my limited testing. Here are my Managed Routes:

I don't use Managed Routes because it broadcasts to all nodes. I want to hide some subnets from being exported to irrelevant nodes and prevent a death loop.

chetstone said:
Here is my firewall-start script:
I don't remember where I came up with those iptables rules, but it seems to work OK.

It's my style—definitely a piece from one of my scripts. LOL.

chetstone said:
I notice that in your suggestion you used a POSTROUTING Nat rule:
iptables -t nat -A POSTROUTING -o br0 -s 10.9.8.0/24 -j SNAT --to-source `nvram get lan_ipaddr`
What's the difference vs. the PREROUTING rule I use above?

POSTROUTING is applied after the routing decision has been made, just before the packet is sent out through the network interface. It is primarily used for source NAT (SNAT), which changes the source IP address of an outgoing packet so that it appears to come from the router.
PREROUTING is used for DNAT. It changes the destination IP address of an incoming packet so that it reaches the correct device behind your router.
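
To make the difference concrete, here is a toy sketch that rewrites only the addresses of an imaginary packet, the way the two rules in this thread would. It runs no iptables commands. 10.9.8.3, 10.9.8.4 and 192.168.9.1 come from earlier examples in the thread; 192.168.9.50 is a made-up LAN device, and the function names are hypothetical.

```shell
#!/bin/sh
LAN_IP=192.168.9.1       # stands in for `nvram get lan_ipaddr`

# PREROUTING / DNAT: rewrite the destination, keep the source.
dnat() {  # $1 = src, $2 = dst
    echo "$1 -> $LAN_IP"
}

# POSTROUTING / SNAT: rewrite the source, keep the destination.
snat() {  # $1 = src, $2 = dst
    echo "$LAN_IP -> $2"
}

echo "DNAT: 10.9.8.3 -> 10.9.8.4 becomes $(dnat 10.9.8.3 10.9.8.4)"
echo "SNAT: 10.9.8.3 -> 192.168.9.50 becomes $(snat 10.9.8.3 192.168.9.50)"
```

With SNAT in place, LAN devices see the traffic as coming from the router, so they can reply without having a route back to the ZeroTier range.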

chetstone said:
I'm on RT-AC68W so I'm using 1.4.6. I wanted to try the latest version, 1.12.2, to see if that bug might have been fixed, but the latest offered by entware is 1.10.4.

1.4.6 was a very stable version for the ac68u, except that the MTU had to be set after every reboot.
In later versions, I experienced various hangs, crashes, slow reconnections, and strange behaviors, like being able to ping but encountering HTTP malfunctioning.

chetstone said:
So do you think what I'm doing is risky?

This carries no risk at all; no data will be lost.
It will be a bit troublesome if you are far from the router and it disconnects due to a misused testing script.

chetstone said:
By the way, the updated version of the scripts you posted above is incomplete. In /opt/etc/init.d/S90zerotier-one.sh you reference /opt/etc/init.d/S91zerotier-one.sh, and that script is not provided. I tried to guess what it did but it didn't work. Also you reference a /tmp/first-start file but don't mention when or where it gets created. So I abandoned those scripts and went back to the ones you provided in your first post. Those work great.

Thank you for pointing that out; the missing files have been uploaded to GitHub.

https://github.com/MissingTwins/merlin_zerotier

chetstone said:
Finally, what tool do you use to create your cool network map graphic?

That's Microsoft Visio.

chetstone · Oct 8, 2023

@MissingTwins

Thanks for the reply, and thanks for posting your scripts on github. They're pretty complicated, but it looks like you're trying to work around a couple of bugs. I still don't know how /tmp/firstrun flag gets set. Does the zerotier daemon itself create it? If so, wouldn't it get reset on restart so you'd have an infinite loop of restarting?

In any case, I wonder if you could help me get my current implementation working. I have a fairly simple test setup. I will be moving Router 2 to my girlfriend's house when I get it working, but for testing, it looks like this:

The idea is to be able to use the printer in the other LAN. Currently, I can:

Connect from either computer to either Router using SSH and Web.
Ping and connect to either printer's web page from either computer.
Print from Computer 1 to either printer.

But I cannot print from Computer 2 to Printer 1. The printer queue first shows "Connected to Printer", then "Unable to get printer status" and it hangs there. The printer IP was added as an IPP printer.

UPDATE: Everything works with a more modern MacOS. I had the bright idea to move the Ventura computer to Router 2 and it was able to print on Printer 1. So the problem was the Mojave.

As recommended, I am not using Managed Routes, and I have disconnected both computers from the zerotier network (although it works the same when they are connected).

My firewall-start script (same for both routers):

Bash:

#!/bin/sh
logger -t "custom iptables" "Enter" -p user.notice
ZT_NETWORK=10.147.17.0/24
iptables -C INPUT -i zt+ -j ACCEPT
if [ $? != 0 ]; then
    iptables -I INPUT -i zt+ -j ACCEPT
    iptables -t nat -I PREROUTING -i zt+ -d $ZT_NETWORK -p tcp -m multiport --dport 21,22,80 -j DNAT --to-destination `nvram get lan_ipaddr`
    iptables -t nat -A POSTROUTING -o br0 -s $ZT_NETWORK -j SNAT --to-source `nvram get lan_ipaddr`
    iptables -I FORWARD -i zt+ -d `nvram get lan_ipaddr`/24 -j ACCEPT
    iptables -I FORWARD -i br0 -d $ZT_NETWORK -j ACCEPT
    iptables -I INPUT -p udp --dport 9993 -j ACCEPT
    iptables -I INPUT -p tcp --dport 9993 -j ACCEPT
    iptables -I INPUT -s `nvram get lan_ipaddr`/24 -j ACCEPT
    logger -t "custom iptables" "rules added" -p user.notice
else
    logger -t "custom iptables" "rules existed skip" -p user.notice
fi

Route table for Router 1:

Bash:

#! /bin/sh
#lan-route-table.sh

TEST_ARGS=$(ip route show 172.24.0.0/24 | wc -l)
if [ $TEST_ARGS -eq 0 ]; then
    ip route add 172.24.0.0/24 via 10.147.17.128
    logger -t "lan-route-table.sh" -c "zerotier LAN route added" -p user.notice
fi

Route table for Router 2:

Bash:

#! /bin/sh
#lan-route-table.sh

TEST_ARGS=$(ip route show 192.168.108.0/24 | wc -l)
if [ $TEST_ARGS -eq 0 ]; then
    ip route add 192.168.108.0/24 via 10.147.17.64
    logger -t "lan-route-table.sh" -c "zerotier LAN route added" -p user.notice
fi

So the only difference I see in the two lans is the age of the components. Both routers have the latest Asus-wrt Merlin release, so they should be equivalent. Router 2 has an older version of Zerotier. MacOS Mojave is of course quite old (although I have installed the latest available version of the Canon CUPS driver.) The printer itself is actually identical. I just move it back and forth between LANs when testing. Any ideas?

(By the way, why do some of your scripts mention port 9399? Is that a typo? The ZT documentation mentions only 9993.)

Thanks!

MissingTwins · Oct 9, 2023

chetstone said:
@MissingTwins

Thanks for the reply, and thanks for posting your scripts on github. They're pretty complicated, but it looks like you're trying to work around a couple of bugs. I still don't know how /tmp/firstrun flag gets set. Does the zerotier daemon itself create it? If so, wouldn't it get reset on restart so you'd have an infinite loop of restarting?

Sorry, I forgot to mention that the firstrun_flag was set in /jffs/scripts/init-start. I have updated GitHub.

chetstone said:
But I cannot print from Computer 2 to Printer 1. The printer queue first shows "Connected to Printer", then "Unable to get printer status" and it hangs there. The printer IP was added as an IPP printer.

Let's Conduct Some Diagnostics

1. Ping Test: Ping Computer 1 from Computer 2.
2. Webserver Check: If you are running webservers (such as Emby or Jellyfin) on Computer 1, try to browse them from Computer 2.
3. File Sharing Test: Attempt to access the SMB share on Computer 1 from Computer 2 and try copying some files.
4. Printer Access Test: Attempt to access Printer 1 using a Windows or Linux machine.
You might try using CUPS directly on your router, but I haven't tested this as I do not have a printer.

If all the above tests succeed without error, with both computers disconnected from ZeroTier locally, then the problem may not lie with ZeroTier.

By the way, version 1.4.6 had an MTU problem.
Have you set the MTU to 1388 or smaller?
If the MTU is the problem, tests 2, 3, and 4 may not function properly.
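
A quick way to test the MTU theory is to ping with fragmentation forbidden and a payload that exactly fills the MTU. The sketch below only computes that payload size and prints a suggested command. It assumes iputils `ping` (desktop Linux), where `-M do` sets the don't-fragment behaviour; BusyBox `ping` on the router may not support it. The target address is just an example, and the function name is made up.

```shell
#!/bin/sh
# ICMP echo payload that exactly fills a given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
ping_payload_for_mtu() {
    echo $(( $1 - 28 ))
}

MTU=1388
SIZE=$(ping_payload_for_mtu "$MTU")
echo "payload for MTU $MTU: $SIZE bytes"

# Assumption: iputils ping; 192.168.9.1 is an example target across ZeroTier.
echo "try: ping -c 3 -M do -s $SIZE 192.168.9.1"
```

If that size gets through but a payload one byte larger fails, the path MTU matches the configured value.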

chetstone said:
As recommended, I am not using Managed Routes, and I have disconnected both computers from the zerotier network (although it works the same when they are connected).

1.4.6 struggled to manage Routes and Nested ZeroTier effectively, but the newest version generally performs well, although it occasionally hangs.

chetstone said:
So the only difference I see in the two lans is the age of the components. Both routers have the latest Asus-wrt Merlin release, so they should be equivalent. Router 2 has an older version of Zerotier. MacOS Mojave is of course quite old (although I have installed the latest available version of the Canon CUPS driver.) The printer itself is actually identical. I just move it back and forth between LANs when testing. Any ideas?

Suggest trying from a Windows machine.
And again check the MTU.

chetstone said:
(By the way, why do some of your scripts mention port 9399? Is that a typo? The ZT documentation mentions only 9993.)

9399 is my other service. I believe I have removed it from the GitHub scripts to avoid confusion.

chetstone said:
Thanks!

You're welcome!

chetstone · Oct 10, 2023

Thanks so much for the quick reply. Well, just after I posted my message I realized how to troubleshoot the problem -- I replaced Computer 2 with the Mac with the current OS and everything worked. So the problem has to do with the outdated Mojave OS. This is not something I want to fix. That computer is going to the recycling bin soon LOL!

I've now switched to using your github scripts and they "just work" with no modifications other than filling in my ZT Network ID and routing info. Wish I'd been able to start with those-- would have saved a lot of time.

I think it would be good to add some minimal SEO so people can find those. First off, could you maybe rename your repository? You have misspelled "Merlin" as "merilin". And then, if you can still edit your original post in this thread, maybe you can add a link to your github. And it would be good to add more info to the README to explain what this does and how to use it. I could help with that, if you would accept a PR. I'd be happy to write up a concise explanation consolidating the most important information revealed in this thread.

I spent the better part of a week trying to get my project working with the original scripts. I read and reread all the posts in the thread many times. There were so many different suggestions for what to put in iptables. I kept trying different things (not realizing that my main problem was my outdated macOS that just wouldn't print). My methodology was also flawed in that, having seen this answer, I was assuming that running service restart_firewall would reset iptables to the default, then run the firewall-start script. But I've now done some testing that indicates it does NOT reset iptables to default. So I kept uploading new versions of firewall-start, then running service restart_firewall, thinking that I was loading new iptables commands. In fact, the iptables were not getting reset, new rules were not being loaded, and I was spinning my wheels and not getting anywhere. (It was still kind of fun, though.)
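
One way to sidestep this kind of stale-rule problem, sketched under assumptions: keep the custom rules in a dedicated chain and flush only that chain at the start of every run, so re-running the script always leaves exactly the current rules. This is a common iptables pattern, not something taken from the scripts in this thread; the chain name ZT_INPUT and the function name are hypothetical. The sketch prints the commands rather than executing them.

```shell
#!/bin/sh
CHAIN=ZT_INPUT   # hypothetical chain name

# Print the commands; pipe the output to "sh" on the router to apply them.
emit_rules() {
    cat <<EOF
iptables -N $CHAIN 2>/dev/null
iptables -F $CHAIN
iptables -A $CHAIN -i zt+ -j ACCEPT
iptables -C INPUT -j $CHAIN 2>/dev/null || iptables -I INPUT -j $CHAIN
EOF
}

emit_rules
```

The first line creates the chain and ignores the "already exists" error, the second clears whatever an earlier version of the script left behind, and the last adds the jump from INPUT only if it is not there yet.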

So to help people avoid this maze, I'd like to help with your github scripts, which have iptables that just work, at least for my project.

By the way, IS there a way to reset iptables to default without restarting the router?

Also, is there a way to enable mDNS advertisements to appear on the opposite LAN?

Thanks again for your kind assistance.

MissingTwins · Oct 10, 2023

chetstone said:
Thanks so much for the quick reply. Well, just after I posted my message I realized how to troubleshoot the problem -- I replaced Computer 2 with the Mac with the current OS and everything worked. So the problem has to do with the outdated Mojave OS. This is not something I want to fix. That computer is going to the recycling bin soon LOL!

I've now switched to using your github scripts and they "just work" with no modifications other than filling in my ZT Network ID and routing info. Wish I'd been able to start with those-- would have saved a lot of time.

Congratulations! Good to hear it.

chetstone said:
could you maybe rename your repository? You have misspelled "Merlin" as "merilin".

Sorry for the typo, done. Also I have updated the Github link.

chetstone said:
And then, if you can still edit your original post in this thread, maybe you can add a link to your github.

Actually, I had done this before you mentioned it. At the bottom of the first post, you can find update logs.

chetstone said:
And it would be good to add more info to the README to explain what this does and how to use it. I could help with that, if you would accept a PR. I'd be happy to write up a concise explanation consolidating the most important information revealed in this thread.

Thank you. I indeed need your help, please.

chetstone said:
I was assuming that running service restart_firewall would reset iptables to the default, then run the firewall-start script. But I've now done some testing that indicates it does NOT reset iptables to default. So I kept uploading new versions of firewall-start, then running service restart_firewall, thinking that I was loading new iptable commands. In fact, the iptables were not getting reset, new tables were not being loaded, and I was spinning my wheels and not getting anywhere. (It was still kind of fun, though.)

I have been there. I had a lot of struggle in debugging iptables, but I have overcome it, so I made this guide, LOL.

chetstone said:
So to help people avoid this maze, I'd like to help with your github scripts, which have iptables that just work, at least for my project.

Thank you, I appreciate your help, and PRs are welcome.

chetstone said:
By the way, IS there a way to reset iptables to default without restarting the router?

Sure, it's very straightforward.

Bash:

# Flush every chain in the filter table (the default table). This is strongly discouraged:
# you may accidentally destroy rules that were created by the router (VPN), Docker, or other services.
iptables -F
ip6tables -F

# Flush the INPUT chain only
iptables -F INPUT
ip6tables -F INPUT

# Flush PREROUTING in the nat table
iptables -t nat -F PREROUTING
ip6tables -t nat -F PREROUTING

But, I would rather do this instead, as it's safer.

Bash:

# See all rules
iptables -nL --line-numbers
ip6tables -nL  --line-numbers

# Remove a specific rule in the INPUT chain by line number
iptables -D INPUT 2
ip6tables -D INPUT 2

# Remove a specific rule in the nat table chain PREROUTING by line number
iptables -t nat -D PREROUTING 3
ip6tables -t nat -D PREROUTING 3
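
Line numbers shift every time a rule is added or removed, so another option is to delete a rule by its specification instead. Below is a minimal sketch, assuming your iptables build supports `-C` (check). `del_rule` and the `IPT` variable are hypothetical names made up for illustration; they are not part of the scripts in this guide.

```shell
#!/bin/sh
# IPT lets you swap in ip6tables (or a stub for a dry run); hypothetical helper variable.
IPT="${IPT:-iptables}"

# del_rule TABLE CHAIN RULE-SPEC...
# Delete a rule only if it currently exists, matching it by specification
# instead of by line number. Does nothing when the rule is absent.
del_rule() {
    dr_table="$1"
    shift
    if $IPT -t "$dr_table" -C "$@" 2>/dev/null; then
        $IPT -t "$dr_table" -D "$@"
    fi
}
```

For example, `del_rule filter INPUT -i zt+ -j ACCEPT` would remove that (example) rule if it is present and leave everything else alone.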

chetstone said:
Also, is there a way to enable mDNS advertisements to appear on the opposite LAN?

I haven't dug into it, and I may be wrong, but since mDNS relies on UDP, it should work within the ZT network.

chetstone said:
Thanks again for your kind assistance.

You are most welcome.

chetstone · Oct 16, 2023

I submitted the PR with my new README and also unfortunately found a serious bug.
My internet connection kept going up and down this afternoon, and when it went down, the iptable entries for the NAT table disappeared. They can only be restored manually or by rebooting the router.

Here is the output of iptables -v -L PREROUTING -n --line-numbers -t nat \; iptables -v -L POSTROUTING -n --line-numbers -t nat before losing the WAN connection:

Code:

Chain PREROUTING (policy ACCEPT 1 packets, 64 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 DNAT       tcp  --  zt+    *       0.0.0.0/0            10.147.17.0/24       multiport dports 21,22,80 to:192.168.108.1
2        0     0 GAME_VSERVER  all  --  *      *       0.0.0.0/0            192.168.0.201
3        0     0 VSERVER    all  --  *      *       0.0.0.0/0            192.168.0.201
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 PUPNP      all  --  *      eth0    0.0.0.0/0            0.0.0.0/0
2        0     0 MASQUERADE  all  --  *      eth0   !192.168.0.201        0.0.0.0/0
3        0     0 MASQUERADE  all  --  *      br0     192.168.108.0/24     192.168.108.0/24
4        0     0 SNAT       all  --  *      br0     10.147.17.0/24       0.0.0.0/0            to:192.168.108.1

And here it is after losing WAN for about 30 seconds:

Code:

Chain PREROUTING (policy ACCEPT 1 packets, 64 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 GAME_VSERVER  all  --  *      *       0.0.0.0/0            192.168.0.201
2        0     0 VSERVER    all  --  *      *       0.0.0.0/0            192.168.0.201
Chain POSTROUTING (policy ACCEPT 0 packets, 0 bytes)
num   pkts bytes target     prot opt in     out     source               destination
1        0     0 PUPNP      all  --  *      eth0    0.0.0.0/0            0.0.0.0/0
2        0     0 MASQUERADE  all  --  *      eth0   !192.168.0.201        0.0.0.0/0
3        0     0 MASQUERADE  all  --  *      br0     192.168.108.0/24     192.168.108.0/24

Here is the syslog for that time period:

Code:

Oct 15 22:32:20 WAN_Connection: WAN was restored.
Oct 15 22:32:20 dnsmasq[15543]: read /etc/hosts - 24 names
Oct 15 22:32:20 dnsmasq[15543]: read /jffs/addons/YazDHCP.d/.hostnames - 20 names
Oct 15 22:32:20 dnsmasq-dhcp[15543]: read /jffs/addons/YazDHCP.d/.staticlist
Oct 15 22:32:20 dnsmasq-dhcp[15543]: read /jffs/addons/YazDHCP.d/.optionslist
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.1.1.1#53
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.0.0.1#53
Oct 15 22:32:20 dnsmasq[15543]: using only locally-known addresses for home.dewachen.org
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.1.1.1#53
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.0.0.1#53
Oct 15 22:32:20 dnsmasq[15543]: using only locally-known addresses for home.dewachen.org
Oct 15 22:32:35 dropbear[21074]: Child connection from 192.168.108.138:55453

I am seeing this happen on both my RT-AX86U Pro and my RT-AC68W.

And unfortunately that is not the only problem that happens when the WAN connection is dropped momentarily:
dnsmasq returns bogus IP addresses for hosts on the local LAN:

Code:

# Normally
dig marantz.
# (or dig marantz.home.dewachen.org)
# should return:
;; ANSWER SECTION:
marantz.        0    IN    A    192.168.108.208
# but when the internet is down, it returns
;; ANSWER SECTION:
marantz.        0    IN    A    10.0.0.1

This is causing some serious problems on my LAN. It is also happening with both routers.

In this case, dnsmasq recovers and starts working normally when the WAN is restored.
I cannot imagine how these two problems are related, but since they happen because of the same trigger, perhaps they are.
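
For anyone who wants to catch the moment the answer changes, here is a minimal sketch, assuming `dig` is available on the machine you test from (as in the example above). `check_name` is a hypothetical helper name, not something from the repository.

```shell
#!/bin/sh
# check_name NAME EXPECTED_IP [SERVER]
# Resolve NAME with dig and report whether the first answer matches EXPECTED_IP.
# SERVER is optional; when given, dig queries that resolver directly.
check_name() {
    cn_name="$1"
    cn_expected="$2"
    cn_server="${3:-}"
    if [ -n "$cn_server" ]; then
        cn_got="$(dig +short "$cn_name" "@$cn_server" | head -n 1)"
    else
        cn_got="$(dig +short "$cn_name" | head -n 1)"
    fi
    if [ "$cn_got" = "$cn_expected" ]; then
        echo "OK $cn_name -> $cn_got"
    else
        echo "MISMATCH $cn_name -> ${cn_got:-<no answer>} (expected $cn_expected)"
        return 1
    fi
}
```

Running something like `while true; do check_name marantz. 192.168.108.208; sleep 5; done` while unplugging the WAN cable should show when the answer flips.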

MissingTwins · Oct 16, 2023

chetstone said:
I submitted the PR with my new README

I truly appreciate your help.
If you're interested, there are a few additional scripts available for:
Matrix bot for online notifications
Cloudflare DDNS
PXE server for RPI3/4
socat poor man's http server for WOL

chetstone said:
and also unfortunately found a serious bug.
My internet connection kept going up and down this afternoon,

Did any technical difficulties occur with the ISP?
Is the internet connection stable now? How did you fix it?

chetstone said:
and when it went down, the iptable entries for the NAT table disappeared. They can only be restored manually or by rebooting the router.

I believe this is not a bug. The Asus router behaves this way when the WAN is disconnected. If I remember correctly, it even interrupts LAN communication if the WAN goes up and down.

Regarding this issue, there is a 1-minute crontab for zerotier in `myservices.sh`. However, you mentioned that your WAN gets disconnected every 30 seconds, so the crontab never has a chance to run.

Additionally, there is an extra script, `/jffs/scripts/firewall-start`, that goes into the scripts folder.
Don't forget to make it executable: `chmod +x /jffs/scripts/firewall-start`

Bash:

#!/bin/sh
# $1 is the first argument passed to this script; quote it so an empty value is safe.
logger -t "firewall-start" -c -p user.notice "WAN:$1"

## Initialize routes for LANs
source /jffs/scripts/lan-route-table.sh

logger -t "firewall-start" -p user.notice "Leaving"

chetstone said:

Here is the syslog for that time period:

Code:

Oct 15 22:32:20 WAN_Connection: WAN was restored.
Oct 15 22:32:20 dnsmasq[15543]: read /etc/hosts - 24 names
Oct 15 22:32:20 dnsmasq[15543]: read /jffs/addons/YazDHCP.d/.hostnames - 20 names
Oct 15 22:32:20 dnsmasq-dhcp[15543]: read /jffs/addons/YazDHCP.d/.staticlist
Oct 15 22:32:20 dnsmasq-dhcp[15543]: read /jffs/addons/YazDHCP.d/.optionslist
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.1.1.1#53
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.0.0.1#53
Oct 15 22:32:20 dnsmasq[15543]: using only locally-known addresses for home.dewachen.org
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.1.1.1#53
Oct 15 22:32:20 dnsmasq[15543]: using nameserver 1.0.0.1#53
Oct 15 22:32:20 dnsmasq[15543]: using only locally-known addresses for home.dewachen.org
Oct 15 22:32:35 dropbear[21074]: Child connection from 192.168.108.138:55453

I don't see wan-event being called; there should be a "wan-event Detect connection" entry in the log.
Does wan-event have execute permission?
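
A quick way to check both points is sketched below. `check_hook` is a hypothetical helper, and the default log path `/tmp/syslog.log` is an assumption that may differ on your router; the `custom_script: Running ...` text is the line the firmware writes when it runs a user script.

```shell
#!/bin/sh
# check_hook SCRIPT [LOGFILE]
# Report whether a user script is executable and how many times the
# "custom_script: Running <script>" line appears in the log.
# The default log path /tmp/syslog.log is an assumption; adjust it for your router.
check_hook() {
    ch_script="$1"
    ch_log="${2:-/tmp/syslog.log}"
    if [ -x "$ch_script" ]; then
        echo "executable: yes"
    else
        echo "executable: no"
    fi
    ch_count="$(grep -cF "custom_script: Running $ch_script" "$ch_log" 2>/dev/null)"
    echo "log entries: ${ch_count:-0}"
}
```

Usage: `check_hook /jffs/scripts/wan-event`. Zero log entries after a WAN drop would suggest the script was never invoked.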

chetstone said:
I am seeing this happen on both my RT-AX86U Pro and my RT-AC68W.

Are both of your routers connecting to the same ISP?

chetstone said:
And unfortunately that is not the only problem that happens when the WAN connection is dropped momentarily:
dnsmasq returns bogus IP addresses for hosts on the local LAN:

Code:

# Normally
dig marantz.
# (or dig marantz.home.dewachen.org)
# should return:
;; ANSWER SECTION:
marantz.        0    IN    A    192.168.108.208
# but when the internet is down, it returns
;; ANSWER SECTION:
marantz.        0    IN    A    10.0.0.1

This is causing some serious problems on my LAN. It is also happening with both routers.

In this case, dnsmasq recovers and starts working normally when the WAN is restored.
I cannot imagine how these two problems are related, but since they happen because of the same trigger, perhaps they are.

I think the second IP subnet originates from your dnsmasq settings.
I have encountered similar issues before, but I can't recall the specifics.
Additionally, the Asus router's DHCP also seems to assign 192.168.56.0/24 in specific situations.

Check `/etc/dnsmasq.conf` to confirm how dnsmasq is actually configured.
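
To look for the bogus address in more than one place at once, something like the sketch below may help. `find_in_configs` is a hypothetical helper name. It does a plain fixed-string search, so a hit only tells you where to look, not why the address is served.

```shell
#!/bin/sh
# find_in_configs PATTERN FILE_OR_DIR...
# Search the given config files or directories for a fixed string (for example
# a suspicious IP) and print file:line:text for each hit. Missing paths are skipped.
find_in_configs() {
    fc_pattern="$1"
    shift
    for fc_path in "$@"; do
        [ -e "$fc_path" ] || continue
        # /dev/null forces grep to print the file name even for a single file.
        grep -rnF -- "$fc_pattern" "$fc_path" /dev/null
    done
    return 0
}
```

Usage: `find_in_configs 10.0.0.1 /etc/dnsmasq.conf /etc/hosts /jffs/configs/dnsmasq.conf.add /jffs/addons/YazDHCP.d`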

chetstone · Oct 16, 2023

MissingTwins said:
I truly appreciate your help.
If you're interested, there are a few additional scripts available for:
Matrix bot for online notifications
Cloudflare DDNS
PXE server for RPI3/4
socat poor man's http server for WOL

You may want to talk about these scripts in the README

MissingTwins said:
Did any technical difficulties occur with the ISP?
Is the internet connection stable now? How did you fix it?

I've been having occasional outages of 1 or 2 minutes, maybe up to 10 minutes for the last six months since my WISP replaced my antenna with Tarana equipment. For the last couple of months, I've been running pingplotter continuously and reporting outages to the WISP's engineers so they can correlate with what's happening at their tower. They identified RF interference at the tower a couple of weeks ago and it has been better, but yesterday afternoon was my worst outage yet--- about 4 hours of pings looking like this. So this morning they came and replaced my antenna. Hopefully that will fix things.

MissingTwins said:
I believe this is not a bug. The Asus router behaves this way when the WAN is disconnected. If I remember correctly, it even interrupts LAN communication if the WAN goes up and down.

This cannot be true. If the LAN has problems like dnsmasq handing out bogus IPs just because the WAN goes down, that would be unacceptable and nobody would buy Asus equipment. I've been using my RT-AC68U with this WISP for almost ten years and have never seen anything like this until I installed merlin_zerotier.
I believe something in your wan_event script is crashing and causing the router to be unstable. The first time this happened I saw this in my syslog:

Code:

Oct 14 16:37:55 WAN_Connection: WAN was restored.
Oct 14 16:37:55 dnsmasq[3330]: read /etc/hosts - 24 names
Oct 14 16:37:55 dnsmasq[3330]: read /jffs/addons/YazDHCP.d/.hostnames - 20 names
Oct 14 16:37:55 kernel: potentially unexpected fatal signal 6.
Oct 14 16:37:55 kernel: CPU: 0 PID: 3330 Comm: dnsmasq Tainted: P           O      4.19.183 #1
Oct 14 16:37:55 kernel: Hardware name: RTAX86U_PRO (DT)
Oct 14 16:37:55 kernel: pstate: 00070010 (nzcv q A32 LE aif)
Oct 14 16:37:55 kernel: pc : 00000000f78c03a4
Oct 14 16:37:55 kernel: lr : 00000000ffeeab50
Oct 14 16:37:55 kernel: sp : 00000000ffeeab50
Oct 14 16:37:55 kernel: x12: 0000000000000000
Oct 14 16:37:55 kernel: x11: 00000000ffeeade0 x10: 00000000f79b6de0
Oct 14 16:37:55 kernel: x9 : 0000000000000002 x8 : 0000000000000001
Oct 14 16:37:55 kernel: x7 : 00000000000000af x6 : 00000000ffeead90
Oct 14 16:37:55 kernel: x5 : 00000000ffeead90 x4 : 0000000000000006
Oct 14 16:37:55 kernel: x3 : 0000000000000008 x2 : 0000000000000000
Oct 14 16:37:55 kernel: x1 : 00000000ffeeab50 x0 : 0000000000000000
Oct 14 16:37:59 rc_service: watchdog 2372:notify_rc start_dnsmasq
Oct 14 16:37:59 custom_script: Running /jffs/scripts/service-event (args: start dnsmasq)
Oct 14 16:37:59 custom_config: Appending content of /jffs/configs/dnsmasq.conf.add.
Oct 14 16:37:59 dnsmasq[12940]: started, version 2.89 cachesize 1500
Oct 14 16:37:59 dnsmasq[12940]: compile time options: IPv6 GNU-getopt no-R

MissingTwins said:
Regarding this issue, there is a 1-minute crontab for zerotier in `myservices.sh`. However, you mentioned that your WAN gets disconnected every 30 seconds, so the crontab never has a chance to run.

What I meant was that whenever the WAN is down for 20-30 seconds or more dnsmasq starts handing out bogus IPs.
With this kind of outage, the WAN is never *completely* down. The ping times just get longer and longer until various services start to fail. However, since the outage was over, I've been testing by simply unplugging my WAN ethernet cable, and the same problem happens.

MissingTwins said:
Additionally, there is an extra script, `/jffs/scripts/firewall-start`, that goes into the scripts folder.
Don't forget to make it executable: `chmod +x /jffs/scripts/firewall-start`

Bash:

#!/bin/sh
logger -t "firewall-start" -c "WAN:"$1" " -p user.notice

## Initialize routes for LANs
source /jffs/scripts/lan-route-table.sh

logger -t "firewall-start" "Leaving" -p user.notice

Should this script be added to the repository?

On further testing, I've found that I was wrong about the NAT iptables entry never getting restored. They do come back a few minutes after the WAN comes back up.

MissingTwins said:
I don't see wan-event get called; there should be a wan-event Detect connection.

For some reason, most of your sysLOG messages don't show up in my syslog. And when I try to execute your sysLOG function from the console I get an error:

Code:

admin@RT-AC68W-71F0:/tmp/home/root# source /jffs/scripts/mylib.sh


10/16 15:10:19 LOG_TAG=DarthTwins -sh
This is a console
admin@RT-AC68W-71F0:/tmp/home/root# sysLOG "this is a message" notice
logger: invalid option -- h
BusyBox v1.25.1 (2023-09-04 11:47:30 EDT) multi-call binary.

Usage: logger [OPTIONS] [MESSAGE]

Write MESSAGE (or stdin) to syslog

    -s    Log to stderr as well as the system log
    -c    Log to console as well as the system log
    -t TAG    Log using the specified tag (defaults to user name)
    -p PRIO    Priority (numeric or facility.level pair)
this is a message

MissingTwins said:
Does the wan-event have execution privileges

Yes. Notice that I changed the scripts to executable in GitHub, and in the README I suggested `scp -p` to copy the scripts to the router to help people remember that.

MissingTwins said:
Are both of your routers connecting to the same ISP?

Yes. See my diagram in the post above.

MissingTwins said:
I think the second IP subnet originates from your dnsmasq settings.
I have encountered similar issues before, but I can't recall the specifics.
Additionally, the Asus router's DHCP also seems to assign 192.168.56.0/24 in specific situations.

Check `/etc/dnsmasq.conf` to confirm how dnsmasq is actually configured.

Code:

#/etc/dnsmasq.conf

pid-file=/var/run/dnsmasq.pid
user=nobody
bind-dynamic
interface=br0
interface=pptp*
no-dhcp-interface=pptp*
no-resolv
servers-file=/tmp/resolv.dnsmasq
no-poll
no-negcache
cache-size=1500
min-port=4096
domain=home.dewachen.org
expand-hosts
bogus-priv
domain-needed
local=/home.dewachen.org/
dhcp-range=lan,192.168.108.3,192.168.108.254,255.255.255.0,86400s
dhcp-option=lan,3,192.168.108.1
dhcp-option=lan,15,home.dewachen.org
dhcp-option=lan,44,192.168.108.1
dhcp-option=lan,252,"\n"
dhcp-authoritative
interface=br1
dhcp-range=br1,192.168.101.2,192.168.101.254,255.255.255.0,86400s
dhcp-option=br1,3,192.168.101.1
interface=br2
dhcp-range=br2,192.168.102.2,192.168.102.254,255.255.255.0,86400s
dhcp-option=br2,3,192.168.102.1
dhcp-name-match=set:wpad-ignore,wpad
dhcp-ignore-names=tag:wpad-ignore
dhcp-script=/sbin/dhcpc_lease
script-arp
edns-packet-max=1232
address=/aircam/aircam.dewachen.org/192.168.25.10
address=/wisenet/wisenet.dewachen.org/192.168.25.20
addn-hosts=/jffs/addons/YazDHCP.d/.hostnames # YazDHCP_hostnames
dhcp-hostsfile=/jffs/addons/YazDHCP.d/.staticlist # YazDHCP_staticlist
dhcp-optsfile=/jffs/addons/YazDHCP.d/.optionslist # YazDHCP_optionslist

chetstone · Oct 16, 2023

Hmmm. I guess you're right. I tried removing my zerotier installation entirely and I'm still having the problem of dnsmasq handing out bogus addresses when the WAN is offline. So perhaps this is "normal" but IMHO it's definitely a bug, and a major one at that. And why haven't I noticed it before? As I mentioned, my internet goes down fairly often, usually for a short time. But yesterday was the first time I got notifications that my stereo receiver couldn't be reached from homebridge because it couldn't contact 10.0.0.1.

Maybe it's some other addon or setting I recently changed. DoT? No, turned that off and it's still happening. YazDHCP? No, I've been using that to assign ips and names for a couple of years.

Update: remove emotional outburst LOL

MissingTwins · Oct 17, 2023

chetstone said:
You may want to talk about these scripts in the README

I've been having occasional outages of 1 or 2 minutes, maybe up to 10 minutes for the last six months since my WISP replaced my antenna with Tarana equipment. For the last couple of months, I've been running pingplotter continuously and reporting outages to the WISP's engineers so they can correlate with what's happening at their tower. They identified RF interference at the tower a couple of weeks ago and it has been better, but yesterday afternoon was my worst outage yet--- about 4 hours of pings looking like this. So this morning they came and replaced my antenna. Hopefully that will fix things.

I see, thank you for the explanation. It seems you are using a wireless link.
Then there should be a modem handing out DHCP upstream. I'm curious: what is the modem's local IP address?
May I know your upstream network configuration? Is it PPPoE or LAN?

chetstone said:
This cannot be true. If the LAN has problems like dnsmasq handing out bogus IPs just because the WAN goes down, that would be unacceptable and nobody would buy Asus equipment. I've been using my RT-AC68U with this WISP for almost ten years and have never seen anything like this until I installed merlin_zerotier.
I believe something in your wan_event script is crashing and causing the router to be unstable. The first time this happened I saw this in my syslog:

That's strange; I've had three AC68U, two AC86U, and two AX86U routers running this script without it ever crashing them.

Every minute, this script merely checks the routing table and iptables, adds the desired routes and rules if they are missing, and starts zerotier if it is not running.
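
The idea can be sketched roughly as follows. This is not the code from the repository; `ensure_rule`, `ensure_route`, and the `IPT` variable are hypothetical names, and the sketch assumes your iptables supports `-C` (check).

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the real logic lives in the repository's scripts.
IPT="${IPT:-iptables}"

# ensure_rule TABLE CHAIN RULE-SPEC...
# Append the rule only when an identical one is not already present.
ensure_rule() {
    er_table="$1"
    shift
    $IPT -t "$er_table" -C "$@" 2>/dev/null || $IPT -t "$er_table" -A "$@"
}

# ensure_route SUBNET GATEWAY
# Add "SUBNET via GATEWAY" only when no route for exactly SUBNET exists yet.
ensure_route() {
    if ip route show | awk -v p="$1" '$1 == p { found = 1 } END { exit !found }'; then
        return 0
    fi
    ip route add "$1" via "$2"
}
```

Because both helpers are safe to call repeatedly, running something like `ensure_route 192.168.7.0/24 10.9.8.3` from a cron job every minute would only change anything when the route has gone missing.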

For debugging purposes, you might want to:
1. Comment out cru inside `myservices.sh` and use `cru l` to see what is inside the crontab, then remove `ZeroTierDaemon, cruGuard1, and cruGuard2` with `cru d ZeroTierDaemon; cru d cruGuard1; cru d cruGuard2`.

2. Comment out the `lan-route-table.sh` line in `firewall-start` and remove the related IP routes and iptables rules manually, or just restart the router if that is too tedious.

3. Change `ENABLED=yes` to `ENABLED=no` in `init.d/S91zerotier-one`, then use `/opt/etc/init.d/S91zerotier-one stop` to terminate the zerotier service, and see whether dnsmasq still crashes.
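
The three steps can be wrapped up roughly like this. It is only a sketch: `pause_zerotier` is a hypothetical name, the cron job names and paths are the ones quoted in this thread, and it does not remove routes or iptables rules that are already loaded.

```shell
#!/bin/sh
# pause_zerotier [INIT_SCRIPT] [FIREWALL_START]
# Hypothetical wrapper around the debugging steps; defaults are the paths from this thread.
pause_zerotier() {
    pz_init="${1:-/opt/etc/init.d/S91zerotier-one}"
    pz_fw="${2:-/jffs/scripts/firewall-start}"

    # Step 1: remove the watchdog cron jobs, then list what is left.
    for pz_job in ZeroTierDaemon cruGuard1 cruGuard2; do
        cru d "$pz_job"
    done
    cru l

    # Step 2: comment out the lan-route-table.sh line in firewall-start.
    if [ -f "$pz_fw" ]; then
        sed -i 's|^source /jffs/scripts/lan-route-table.sh|#&|' "$pz_fw"
    fi

    # Step 3: disable the service so it is not started again, then stop it.
    if [ -f "$pz_init" ]; then
        sed -i 's/^ENABLED=yes/ENABLED=no/' "$pz_init"
        "$pz_init" stop
    fi
}
```

Note that the cron jobs will be added again unless the `cru` lines in `myservices.sh` are commented out first, as step 1 says.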

I noticed this thread; perhaps your dnsmasq crashes are more closely related to the YazDHCP you're using.

Understanding the Log

I'm guessing, but my kernel does not seem to be normal. Sep 14 03:58:39 kernel: eth0 (Int switch port: 6) (Logical Port: 6) (phyId: 13) Link DOWN. Sep 14 03:58:42 kernel: ^[[0;33;41mFCACHE ERROR: fc_config_mcast_group: Mcast: client has already JOINed the mcast group (duplicate JOIN)^[[0m Sep...

www.snbforums.com

This is all hypothetically assuming the user is running 388.2 firmware with no "reliable" DNS upstream configured on their WAN page?

This really does look like your case.
May I know which Merlin and dnsmasq versions you are running?

chetstone said:

Code:

Oct 14 16:37:55 WAN_Connection: WAN was restored.
Oct 14 16:37:55 dnsmasq[3330]: read /etc/hosts - 24 names
Oct 14 16:37:55 dnsmasq[3330]: read /jffs/addons/YazDHCP.d/.hostnames - 20 names
Oct 14 16:37:55 kernel: potentially unexpected fatal signal 6.
Oct 14 16:37:55 kernel: CPU: 0 PID: 3330 Comm: dnsmasq Tainted: P           O      4.19.183 #1
Oct 14 16:37:55 kernel: Hardware name: RTAX86U_PRO (DT)
Oct 14 16:37:55 kernel: pstate: 00070010 (nzcv q A32 LE aif)
Oct 14 16:37:55 kernel: pc : 00000000f78c03a4
Oct 14 16:37:55 kernel: lr : 00000000ffeeab50
Oct 14 16:37:55 kernel: sp : 00000000ffeeab50
Oct 14 16:37:55 kernel: x12: 0000000000000000
Oct 14 16:37:55 kernel: x11: 00000000ffeeade0 x10: 00000000f79b6de0
Oct 14 16:37:55 kernel: x9 : 0000000000000002 x8 : 0000000000000001
Oct 14 16:37:55 kernel: x7 : 00000000000000af x6 : 00000000ffeead90
Oct 14 16:37:55 kernel: x5 : 00000000ffeead90 x4 : 0000000000000006
Oct 14 16:37:55 kernel: x3 : 0000000000000008 x2 : 0000000000000000
Oct 14 16:37:55 kernel: x1 : 00000000ffeeab50 x0 : 0000000000000000
Oct 14 16:37:59 rc_service: watchdog 2372:notify_rc start_dnsmasq
Oct 14 16:37:59 custom_script: Running /jffs/scripts/service-event (args: start dnsmasq)
Oct 14 16:37:59 custom_config: Appending content of /jffs/configs/dnsmasq.conf.add.
Oct 14 16:37:59 dnsmasq[12940]: started, version 2.89 cachesize 1500
Oct 14 16:37:59 dnsmasq[12940]: compile time options: IPv6 GNU-getopt no-R

What I meant was that whenever the WAN is down for 20-30 seconds or more dnsmasq starts handing out bogus IPs.

From the log above, it seems that dnsmasq has crashed. I have noticed that your dnsmasq runs some addons.
Could it be possible that something inside YazDHCP.d is generating bogus IPs?

Reasons for the 10.0.0.1 Return Address

Fallback Configuration: The DNS resolver (possibly dnsmasq) may have a fallback configuration to respond with a predefined IP address (10.0.0.1 in this case) when the internet is down, or it can't reach the upstream DNS server for some reason.
Cache Behavior: If dnsmasq or another local DNS resolver has a stale or incorrect entry in its cache, it may return the wrong IP address during an internet outage.
Local Hosts File: There might be entries in the local hosts file that map the hostname marantz to the IP 10.0.0.1. During an internet outage, if the DNS resolver cannot query external DNS servers, it might rely more on local records, resulting in this IP being returned.
Network Segmentation: If there are different network segments or VLANs, the DNS resolver might be configured to return different IPs based on the network's status or the querying device's network location.

If zerotier blocks the connection, we can kill and disable it until the WAN is fully up.
Another possibility: is there more than one dnsmasq running?

custom_script: Running /jffs/scripts/service-event

Despite the extra log entry in my scripts, a similar log should also appear when the router runs the script. So, I believe the `wan-event` might not have even been called.

chetstone said:
With this kind of outage, the WAN is never *completely* down. The ping times just get longer and longer until various services start to fail. However, since the outage was over, I've been testing by simply unplugging my WAN ethernet cable, and the same problem happens.

Should this script be added to the repository?

I have added firewall-start. This will implement iptables immediately after the network status changes. However, if the zt network hasn’t started up yet, the IP route might fail to be added, so crontab will attempt to add it every minute.

chetstone said:
On further testing, I've found that I was wrong about the NAT iptables entry never getting restored. They do come back a few minutes after the WAN comes back up.

I believe that crontab was functioning correctly. As long as it invokes `S90zerotier-one.sh`, the `lan-route-table.sh` will also be called.

chetstone said:
For some reason, most of your sysLOG messages don't show up in my syslog. And when I try to execute your sysLOG function from the console I get an error:

Apologies, this issue has been resolved; I have updated GitHub.
However, this error only occurs when you execute scripts from sh.
Since wan-event and firewall-start don’t use sysLOG(), they are not affected by this bug.

chetstone · Oct 18, 2023

OK I managed to fix this.

On the RT-AC68W, after removing zerotier, I removed YazDHCP. That fixed the issue. Then I reinstalled YazDHCP and the zerotier, and the issue did not come back.

On the RT-AX86U Pro, it was not so easy. I had to do a full Nuclear Reset.
I then reinstalled entware and zerotier and everything still works. (I decided I didn't really need YazDHCP.)

Action	AC-68W: bogus IPs when WAN offline?	AX-86U Pro: bogus IPs when WAN offline?
uninstall zerotier	yes	yes
uninstall YazDHCP	no	yes
reinstall YazDHCP	no	-
reinstall zerotier	no	-
install zerotier scripts	no	-
remove entware and reset amtm (removes all scripts)	-	yes
WPS NVRAM erase	-	yes
Nuclear reset	-	no
Install entware	-	no
Install zerotier and scripts	-	no

Thanks again for your help.

Cheers

MissingTwins · Oct 18, 2023

chetstone said:
OK I managed to fix this.

I'm glad that you have succeeded.

chetstone said:
On the RT-AX86U Pro, it was not so easy. I had to do a full Nuclear Reset.
I think reinstalled entware and zerotier and everything still works. (I decided I didn't really need YazDHCP.)

As I mentioned in my previous comment, maybe the bogus IP comes from your upstream device, the modem. That's why I wanted to know how your WAN was configured.

chetstone · Oct 18, 2023

MissingTwins said:
I'm glad that you have succeeded.

As I mentioned in my previous comment, maybe the bogus IP comes from your upstream device, the modem. That's why I wanted to know how your WAN was configured.

Sorry, I missed that question. Upstream of both my routers is the ISP’s router. It gives me addresses on the 192.168.0.0/24 lan by DHCP. Its WAN port is connected to the radio’s POE, configured with a static public IP.

So I don’t think the bogus address came from there. Anyway, it couldn’t have come from there, because the problem happened after I unplugged the cable between my router and theirs.

When an address like 10.0.0.1 comes out of the blue it makes me think some variable reverted to a default value.

I guess all my hacking on these routers changed some setting in the NVRAM that wouldn’t go away without extraordinary measures.
