Ok, I'm pretty sure I've figured out the problem with guest network 1 killing the WAN, and it's dumb. Really dumb.
Code:
admin@RT-AC66U_B1-0:/tmp/home/root# robocfg show
Switch: enabled
...
VLANs: BCM5301x enabled mac_check mac_hash
1: vlan1: 1 2 3 4 5t
2: vlan2: 0 5
502: vlan502: 0t 1t 2t 3t 4t 5t
admin@RT-AC66U_B1-0:/tmp/home/root# brctl show
bridge name     bridge id               STP enabled     interfaces
br0             8000.38d547dbe940       no              vlan1
                                                        eth1
                                                        eth2
                                                        tap22
br2             8000.38d547dbe945       yes             wl1.1
                                                        eth0.502
                                                        eth1.502
                                                        eth2.502
So what we have here is br2 with GN1 (wl1.1) and VLAN 502 across eth0/eth1/eth2. I don't know what that VLAN is used for, but it doesn't show up unless Guest Network 1 is enabled. Likewise, the 2.4G Guest Network 1 creates br1 and VLAN 501. Maybe something to do with AiMesh?
eth1 and eth2 are the 2.4/5 GHz radios, and eth0 goes to the switch. On the switch, port 0 is WAN, ports 1-4 are LAN, and port 5 is the CPU. VLAN 1 traffic is tagged on port 5, so it goes to the vlan1 interface and the LAN bridge. WAN traffic (VLAN 2) is untagged on port 5, so it goes straight to the eth0 WAN interface.
The problem: enabling Guest Network 1 adds VLAN 501/502 to the switch and puts the WAN port (port 0) in those VLANs. This means Guest Network 1 broadcasts go out the WAN port, including DHCP queries. When a GN1 device requests an IP, the DHCP server on the WAN side may respond first and kill the router's DHCP lease!
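If you want to check whether your router is affected before changing anything, here is a rough sketch. The function name guest_vlans_on_wan is made up for this post, and it assumes your robocfg show output has the same "ID: vlanID: ports" layout as the listing above and that port 0 is the WAN port on your model (it is on mine, but that may differ on yours).

```sh
# Print the IDs of the guest VLANs (501/502) whose member list includes
# switch port 0, which is the WAN port on the RT-AC66U_B1.
# Reads "robocfg show" output on stdin. Hypothetical helper, not part of the firmware.
guest_vlans_on_wan() {
    awk '
        # VLAN lines look like: 502: vlan502: 0t 1t 2t 3t 4t 5t
        $1 == "501:" || $1 == "502:" {
            for (i = 3; i <= NF; i++) {
                # Port 0 may be listed untagged (0) or tagged (0t).
                if ($i == "0" || $i == "0t") {
                    sub(":", "", $1)
                    print $1
                    break
                }
            }
        }
    '
}
```

Run it as: robocfg show | guest_vlans_on_wan. If it prints 501 and/or 502, the guest VLAN is bridged onto the WAN port. No output means port 0 is not a member of either guest VLAN.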
The fix is to add the following to your firewall-start script to remove those VLANs from the bridges and the switch:
Code:
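# Clear the port membership of the guest VLANs on the switch
robocfg vlan 501 ports ""
robocfg vlan 502 ports ""
# Remove the VLAN sub-interfaces from the guest bridges
brctl delif br1 eth0.501
brctl delif br1 eth1.501
brctl delif br1 eth2.501
brctl delif br2 eth0.502
brctl delif br2 eth1.502
brctl delif br2 eth2.502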
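If you'd rather preview the commands before letting them loose on the switch, the same cleanup can be wrapped in a function with a dry-run switch. This is only a sketch: DRY_RUN, run, and remove_guest_vlans are names I made up, and the bridge/VLAN pairs (br1:501, br2:502) are the ones from my router, so adjust them if yours differ.

```sh
# Run a command, or just print it when DRY_RUN is set (hypothetical switch).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Same commands as the list above, in the same order, written as loops.
remove_guest_vlans() {
    # Clear the port membership of each guest VLAN on the switch.
    for vid in 501 502; do
        run robocfg vlan "$vid" ports ""
    done
    # Remove the VLAN sub-interfaces from each guest bridge.
    for pair in br1:501 br2:502; do
        br=${pair%%:*}
        vid=${pair##*:}
        for ifc in eth0 eth1 eth2; do
            run brctl delif "$br" "${ifc}.${vid}"
        done
    done
}
```

Calling it with DRY_RUN=1 set only prints the eight commands. In firewall-start you would paste the functions and then call remove_guest_vlans with DRY_RUN unset.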
I can confirm that without removing VLAN 501/502, a DHCP query on the guest network may steal the WAN IP. After removing those VLANs, guest network 1 functions properly. Yes, the leaked traffic on the WAN port is tagged because of this bug, but my Fios ONT doesn't care, and cable modems probably don't either. They will respond to a DHCP broadcast.