Access to Kiwi from remote addresses lost after power outage
At KPH I run 7 Kiwis at http://kphsdr.com:8072...8078 (Kiwi72...78). Earlier this week they were all abruptly powered down when the neighborhood AC power lines went down and the backup battery supply ran out. All the Kiwis are running v1.464 and sit behind a firewall which is configured for port forwarding.
When power was restored all of the Kiwis came back up and can be accessed from the LAN, but only two of them, Kiwi72 and Kiwi77, would respond to remote listeners at public IP addresses. After installing tcpdump on Kiwi73 and Kiwi74 I could see the packets forwarded by the firewall arriving from my remote public IP address, but tcpdump showed no response packets. After the "Beagle Reboot" button had no effect on Kiwi74, I rebooted it from the Debian command line, after which Kiwi74 did respond to remote users.
However, rebooting Kiwi73 did not restore remote connectivity. I have disabled all IP blacklists (confirmed with 'iptables -L'), removed all time limits and rebooted once again, and Kiwi73 still doesn't respond to public IP addresses. I can 'ping google.com' from Kiwi73's command line, so it has DNS access and a working default route.
As a final experiment I upgraded Kiwi75 to v1.470, and it too still did not respond to users at public IP addresses.
So I can see no networking problems at the Debian level, and all the evidence suggests that 'kiwid' is blacklisting public IP addresses.
I can open a public IP address:port for remote access to these failing Kiwis if that would help debug the problem.
I see this was posted to the Installation section, but I don't see how to move it to another forum group.
From here (Germany) I can reach 72, 73 (redirects to 74), 74, 77, 78. Only 75 and 76 seem dead.
Yes, the DHCP leases are correct, and the WD Pi on the same LAN can access all of the Kiwis 72...78 at their corresponding LAN addresses.
To restore KPH to some semblance of public access, I am forwarding public port 8073 to Kiwi74. As you suggested, I have just removed the forwarding rule to Kiwi75 from Kiwi74.
I have verified that the router's forwarding rules remain correct by running tcpdump on the 'dead' Kiwis 73, 75 and 76. It shows my public client's TCP packets arriving at the correct Kiwi, but there are no response packets.
So LAN-sourced packets get a response from those three Kiwis, but packets coming from outside the LAN generate no response.
My first thought was that Kiwis 73, 75 and 76 were not getting the default gateway address from their DHCP transactions at startup. But all of them can access the WAN through the router's gateway address and get their DNS from the router. So the lack of response seems to be inside 'kiwid', not at the BB TCP/IP layer.
It could be the network switch. I have one brand that just does not like power glitches.
Everything after a power glitch is "how could that be!? makes no sense" territory.
On the "dead" ones, can you ping both ways between each and all devices? I wonder if the switch ports are stuck at 10 Mbps simplex and never switching (just a wild guess). What does "ethtool eth0" show?
The 'dead' Kiwis can ping each other and ping 'google.com', so the problem is not at the IP HW or IP routing level. All symptoms suggest that 'kiwid' is ignoring open requests from IP addresses outside its LAN:
root@kiwisdr:~# ping google.com
PING google.com (220.127.116.11) 56(84) bytes of data.
64 bytes from sfo07s13-in-f14.1e100.net (18.104.22.168): icmp_seq=1 ttl=116 time=25.6 ms
64 bytes from sfo07s13-in-f14.1e100.net (22.214.171.124): icmp_seq=2 ttl=116 time=12.9 ms
64 bytes from sfo07s13-in-f14.1e100.net (126.96.36.199): icmp_seq=3 ttl=116 time=11.6 ms
64 bytes from sfo07s13-in-f14.1e100.net (188.8.131.52): icmp_seq=4 ttl=116 time=13.2 ms
64 bytes from sfo07s13-in-f14.1e100.net (184.108.40.206): icmp_seq=5 ttl=116 time=11.9 ms
--- google.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 11.605/15.073/25.617/5.308 ms
Obviously you have rebooted these devices?
What does the shell command ku show?
@rrobinet First step: check access to the Kiwi's web server from the local network. For example, from any Linux host on the same LAN, run curl -Is http://lanip:port. If it works normally you should receive "HTTP/1.1 200 OK". If every Kiwi answers correctly, you need to check your Ubiquiti router (NAT, ARP, etc., and test the incoming connection status). If something gives an error, that is what you need to fix.
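A minimal sketch of that LAN check, assuming a POSIX shell with curl and awk available. The helper name http_status is made up for illustration, and lanip:port is a placeholder for each Kiwi's LAN address and port.

```shell
#!/bin/sh
# Read `curl -Is` output on stdin and print the numeric HTTP status code
# from the first line (e.g. "HTTP/1.1 200 OK" prints 200).
# Prints nothing if the first line is not an HTTP status line.
http_status() {
    awk 'NR == 1 && $1 ~ /^HTTP\// { print $2 }'
}

# Usage from a Linux host on the same LAN (lanip:port is a placeholder):
#   curl -Is --max-time 5 http://lanip:port | http_status
```

An empty result means the request got no HTTP answer within the timeout, which would point at the Kiwi or the LAN rather than the router.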
I have power cycled the Kiwis several times.
systemctl --full --lines=100 stop kiwid.service || true
systemctl --full --lines=100 start kiwid.service || true
This is not a router problem. As I explained before, the 'dead' Kiwis can ping the internet and perform a software update from v1.464 to v1.470.
The only problem is that these three Kiwis won't respond to listener requests from public IP addresses, only from LAN addresses.
In the end, power cycling the Kiwis restored access. That was difficult to do because the Kiwis are 100 km away in a locked building, but after a day of SW development I was able to remotely toggle the DC feeds to the Kiwis, and that seemed to fix them.
Such a strange error condition that required more than a Debian reboot to fix.
Thanks for the advice!
I spoke too soon. One Kiwi still gets an immediate 'refused to connect' message from my browser. An upgrade to v1.470 and a power cycle didn't change the behavior.
Hi @rrobinet, I have seen similarly mysterious problems when the channel MTU was different from the one known to the router: small ping packets are routed fine but big ones (1470) are not, and some web sites work while others do not. The OS uses path MTU discovery to handle this situation, but it doesn't help in every case. I suggested the curl test on the LAN because on a simple LAN the MTU is fixed at 1500 and doesn't change between LAN ports.
Of course, if the curl test fails on the LAN too, it's 100% not a router or channel problem...
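One way to test for an MTU mismatch is to ping with fragmentation prohibited and a payload sized to exactly fill the MTU. The sketch below assumes the iputils ping found on Debian (its -M do option forbids fragmentation). The function names are hypothetical.

```shell
#!/bin/sh
# Largest ICMP echo payload that fits in one unfragmented IPv4 packet:
# MTU minus 20 bytes of IPv4 header (no options) minus 8 bytes of ICMP header.
icmp_payload_for_mtu() {
    echo $(( $1 - 20 - 8 ))
}

# Send 3 pings with the "don't fragment" bit set.
# $1 = target host, $2 = MTU to test (defaults to 1500).
probe_mtu() {
    ping -M do -c 3 -s "$(icmp_payload_for_mtu "${2:-1500}")" "$1"
}

# Usage: probe_mtu google.com 1500
```

If small pings succeed but the full-size probe fails, some hop on the path has a smaller MTU than the sender assumes.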
Historically I did have issues with some PSUs and antenna grounding. I'm not sure of the exact sequence, but some antennas grounded at the far end, combined with some mains-powered PSUs, would seem to hold the Beaglebone in an unresponsive network condition until either power was cycled OR the Beaglebone was rebooted without the antenna connected.
At the time another user had a network chip issue caused by lightning, so I assumed I was seeing that too. Later, once I worked out that I should disconnect both DC wires, or power up without the antenna connected, it was happy. With so many Kiwis and big antennas I bet you have some decent telluric currents which could (guessing) latch some DC protection (in the NIC?).
A nice test would be to put a current meter (or even small bulb) between the station ground and antenna ground.
OK, I finally found and fixed the root cause.
The problem was a bad IP route after power cycle caused by the wifi dongles I have attached to the Kiwis.
I thought those dongles were disabled, but after a power cycle they were active. Someone (not me) moved the wifi router they were attempting to reach to another room and onto another LAN segment, and depending upon the wifi signal strength some of the Kiwi wifi dongles would sometimes still attach to wifi rather than use the wired LAN connection.
So those Kiwis which got a wifi connection would use the wifi router as the default gateway, and their responses to packets from public IP addresses would travel to the wifi router, where they were dropped. The wifi-connected Kiwis could ping the Internet, which deceived me into thinking that all was well at the IP routing level.
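This also fits the LAN symptom: replies to LAN clients follow the directly connected route and never touch the default gateway, so only off-LAN clients were affected. A successful outbound ping does not say which interface carries the default route, so it is worth checking that directly. Here is a minimal sketch, assuming the iproute2 `ip` command and awk are present. The helper name default_route_dev is hypothetical, and eth0/wlan0 are assumed interface names.

```shell
#!/bin/sh
# Print the interface that carries the preferred (lowest-metric) default route.
# Input: the output of `ip route show default` on stdin, with lines such as
#   default via 192.168.1.1 dev eth0 metric 100
# A route with no explicit metric is treated as metric 0.
default_route_dev() {
    awk '
        $1 == "default" {
            dev = ""; metric = 0
            for (i = 2; i < NF; i++) {
                if ($i == "dev")    dev = $(i + 1)
                if ($i == "metric") metric = $(i + 1)
            }
            if (dev != "" && (best == "" || metric + 0 < bestmetric)) {
                best = dev; bestmetric = metric + 0
            }
        }
        END { if (best != "") print best }
    '
}

# Usage on a Kiwi:
#   ip route show default | default_route_dev
# Anything other than the wired interface (assumed eth0) is a warning sign.
```

Running `ip route get` with a public address as its argument is a complementary check, since it reports the route the kernel would actually pick for that destination.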
Now that I can remotely power cycle the Kiwis, I have disabled the wifi dongles and verified that the Kiwis return to full operation after a power cycle. We are sure to have more outages this winter at the remote KPH site.
Thanks for all the advice and encouragement!
OK, good one. I wasn't expecting WiFi dongles to get a look in.
Glad it is sorted anyway.
In otherwise quiet installations, the wired LAN connection is an RF ground loop path which can introduce RFI. In a now abandoned attempt to break that LAN loop, I had added a USB wifi dongle to each of the Kiwis, but I found that even with a nearby wifi AP the connections through that dongle were too error-prone. I thought I had disabled the driver for that dongle, but apparently not.
We have found that attaching a small $20 GLNet wifi router to the USB ports of the Kiwi is a better, if more mechanically cumbersome, solution to the wired LAN problem.
The ground loop problem is not due to the LAN speed: 10 Mbps mode doesn't change the reactance of the LAN port. A better solution would be to use a Pi with built-in wifi, but the Chinese clones that do that are a disaster in many other ways.
The BBAI has built-in WiFi, so it is another potential solution.
@rrobinet too many unknowns 🤔 but I'm glad you could fix this problem! 😀