heysoundude
Part of the Furniture
Well, segregating validator traffic is what I really hoped to convey - wallets looking for validators or staking rewards have to get pointed to you somehow from somewhere; figure that out and you'll see how to solve the problem. But if there's that MUCH traffic looking for validation, what @Morris says is the way: your big ol' bully of a validator needs its own fat pipe, and all your other traffic trying to ride along gets knocked out of its way. HMMMM - you've a fibre AND another connection incoming? Pretty clear the validator gets the fibre and netflix/whatever/etc goes on what you're using now... The validator should be making you enough to pay for itself and still leave enough profit to afford a 2nd internet connection.
Great stuff.
First to be clear, the problem really is not DNS itself here, but the ability to make ANY outbound connections. I think that's more or less a given at this point.
This generates a side question: If I only specify a SINGLE DNS over TLS server... do you think Stubby will keep that connection alive? If so, I may be able to keep DNS working at least.
The only connections that work seem to be connections that pre-exist the Validator coming up. As an example, I run ngrok here and ngrok continues to allow incoming connections.
I have an unused fiber connection coming in (we don't know if it's live yet and I am having trouble figuring out how to buy an ONT) but we are hooked to ethernet.
The building is probably using CGNAT. I have great connectivity, yet I can't ping other routers in the building from my WAN interface, which, given how CGNAT is structured, would be a good indicator.
Greg's suggestions:
Using WireGuard to contain Validator traffic: We thought about doing a VPN to handle some of the traffic. However, it pushes the problem "out" to another system (DNS receiver). Getting something high-bandwidth in the cloud or colocated would cost more, and that bandwidth may actually not be as good as mine.
Tricking some Solana traffic into using VPN: I think Solana is probably making thousands of DNS requests. So it's not going to be possible to contain its outbound traffic via DNS A and AAAA (quad-A) records (IIUC).
DNS Caching: I think DNS issues may be a red herring. Stubby and other tools are just unable to connect outbound after a short while. I think NAT is failing.
So I have to figure out what kernel parameters to tweak to increase NAT capacity, or figure out how to decrease the UDP / TCP timeouts so that I don't have too many NAT connections open.
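Before tweaking anything, it may help to see how full the NAT table actually is. A minimal sketch, assuming the router exposes the usual Linux netfilter counters `nf_conntrack_count` and `nf_conntrack_max` under `/proc/sys/net/netfilter` and that Python is installed on it (for example through Entware; both are assumptions, and older firmware kernels may keep these counters somewhere else):

```python
from pathlib import Path

# Standard location on mainline Linux; assumed (not verified) for this router.
NF_DIR = Path("/proc/sys/net/netfilter")


def read_int(path):
    """Return the integer stored in a /proc entry, or None if it is unreadable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def conntrack_usage(base=NF_DIR):
    """Return (count, maximum, fraction_used); all None if a counter is missing."""
    base = Path(base)
    count = read_int(base / "nf_conntrack_count")
    maximum = read_int(base / "nf_conntrack_max")
    if count is None or not maximum:
        return None, None, None
    return count, maximum, count / maximum


if __name__ == "__main__":
    count, maximum, fraction = conntrack_usage()
    if fraction is None:
        print("conntrack counters not found on this system")
    else:
        print(f"{count} of {maximum} conntrack entries in use ({fraction:.0%})")
```

If the count sits near the maximum when outbound connections start failing, the table size is the likely limit; if it stays well below, the bottleneck is probably elsewhere (for instance upstream, at the building's NAT).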
Think about this: TCP and UDP each have only 65,535 usable port numbers per IP address. Each outbound NAT connection takes up one of those for an ephemeral source port. I could easily see my router running out of those in this situation. The connection tracker was seeing "15,000". I'm waiting for the validator to come up again - thanks @ColinTaylor for the tip on the Tools/Network status page! I have been using Tomato for years but am not 100% up to speed on Merlin yet. The validator is coming up again; I'll post the total number of connections.
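A back-of-the-envelope way to reason about the timeout idea: by Little's law, the average number of tracked entries is roughly the rate of new flows times how long each entry lives. The sketch below uses the 15,000 figure from the connection tracker; the 30-second timeout and the candidate values are purely hypothetical, and it simplifies by treating every entry as living for exactly one timeout period.

```python
def steady_state_entries(new_flows_per_second, timeout_seconds):
    """Little's law: average table size = arrival rate * lifetime of each entry."""
    return new_flows_per_second * timeout_seconds


def implied_flow_rate(observed_entries, timeout_seconds):
    """Invert the same relation: new flows per second implied by a table size."""
    return observed_entries / timeout_seconds


if __name__ == "__main__":
    observed = 15_000        # figure reported by the connection tracker
    assumed_timeout = 30     # seconds; hypothetical, check the router's real value
    rate = implied_flow_rate(observed, assumed_timeout)
    for timeout in (30, 20, 10):  # hypothetical candidate timeouts
        entries = steady_state_entries(rate, timeout)
        print(f"timeout {timeout:>2}s -> about {entries:,.0f} tracked flows")
```

Under those assumptions, cutting the timeout shrinks the table in direct proportion, which is why shortening UDP timeouts is worth trying before hunting for a bigger table.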
(where are you that you can get fibre run to your home without an ONT?)
Stubby/DNS caching is another convo entirely... that's for the websurfing connection. The crypto connection will follow the Solana network protocol defined by the validator code.
(I wish I had been paying more attention when my local crypto friends had their Cosmos validator up... then I might have better ideas for you)