Restoring GPS tracking [fixed in v1.335]

edited October 2019 in Problems Now Fixed
Clint KA7OEI first reported the problem that Kiwis running wsprdaemon (WD) would eventually fall out of GPS tracking, and that restoring tracking requires all WD listening (and other) sessions to be terminated for a few seconds or minutes.

I am grateful that I can now detect that fault condition from the KIWIIP:8073/status page.
And I am even more grateful that I can get a list of active listener sessions from KIWIIP:8073/users.
WD can kill all of its listening sessions, but if there are non-WD listeners I don't know how to terminate them.
It is easy and clean to disable user connections on the Control page, kill the active user connections, then wait for GPS tracking. But I don't know how to do that from my WD script.
Any suggestions?
Thanks,

Comments

  • jks
    edited October 2019
    On the GPS tab of the admin page just check the box "Acquire if Kiwi busy?" (it defaults unchecked). That way acquisition of (new) GPS sats won't be suppressed just because there are user connections. To verify, check the "acq" field at the bottom left of the page. It should no longer show "paused" when there are user connections.
  • I just checked the 7 KPH Kiwis and confirmed that all are running V1.334 for 9 days and all have "Acquire if Kiwi busy?" checked.
    In spite of that, one of the two Kiwis with 6 WD clients had lost GPS tracking and required my manual intervention. My WD 2.6 beta code now checks and logs loss of GPS, so as I work on other bugs and features of 2.6 my log will report if loss of GPS recurs. Before restarting, is there something I should look for in /var/log/messages?
  • Well maybe there's a bug someplace. Since you're logging anyway it would be interesting to see the trend of the "gps_good" count from /status as the sats disappear. And also how long after restart before sat loss ("uptime" value). That will give me a clue as to what might be going wrong. It might be a slow memory leak or something similar that takes a while to start doing damage.
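A minimal sketch of the logging suggested here, assuming /status returns plain key=value lines (the gps_good=12 output elsewhere in this thread has that form) and that uptime appears the same way. The function names, the log line format, and the 60-second default interval are illustrative only and are not part of wsprdaemon.

```shell
# Print the value of one key from key=value status text read on stdin.
# Prints nothing if the key is absent.
status_field() {
    local key="$1"
    tr -d '\r' | sed -n "s/^${key}=//p" | head -n 1
}

# Build one log line: UTC timestamp, Kiwi address, gps_good and uptime.
# $1 = host:port label, $2 = full text of the /status page.
format_gps_log_line() {
    local kiwi="$1" status_text="$2" good uptime
    good=$(printf '%s\n' "$status_text" | status_field gps_good)
    uptime=$(printf '%s\n' "$status_text" | status_field uptime)
    printf '%s %s gps_good=%s uptime=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$kiwi" \
        "${good:-unknown}" "${uptime:-unknown}"
}

# Append one line per interval to a log file until interrupted.
# $1 = host:port, $2 = log file, $3 = seconds between polls (default 60).
log_gps_trend() {
    local kiwi="$1" logfile="$2" interval="${3:-60}" status_text
    while true; do
        status_text=$(curl -s --connect-timeout 5 "http://${kiwi}/status")
        format_gps_log_line "$kiwi" "$status_text" >> "$logfile"
        sleep "$interval"
    done
}
```

Something like `log_gps_trend kphsdr.com:8077 gps_trend.log 60 &` would then leave a record of how gps_good declines against uptime. A poll that fails or returns no gps_good line is logged as "unknown" rather than as 0, so an unreachable Kiwi is not mistaken for one that lost lock.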
  • Where's the 7th one?
    bash-3.2$ for i in `seq 8070 8079`; do echo -n "kphsdr.com:$i: "; curl -s --connect-timeout 5 kphsdr.com:$i/status | grep -i gps_good || echo; done
    kphsdr.com:8070: 
    kphsdr.com:8071: 
    kphsdr.com:8072: gps_good=12
    kphsdr.com:8073: gps_good=12
    kphsdr.com:8074: gps_good=12
    kphsdr.com:8075: gps_good=12
    kphsdr.com:8076: gps_good=12
    kphsdr.com:8077: 
    kphsdr.com:8078: gps_good=12
    kphsdr.com:8079: 
    
  • edited October 2019
    I had also observed this problem once. When users were connected, the Kiwi stopped acquiring satellites (also with "Acquire if Kiwi busy?" enabled).
    In my case it was related to the kernel 4.4.155-ti-r155. I suspected a problem with missing kernel modules, but I didn't have time to investigate further.

    PS: What is the disadvantage of having GPS to always acquire satellites?
  • The port forwarding rule for 8077 was missing. 8070/71 have moved to Maui.
    rob@Robs-MBP:/Users/rob/wsprdaemon> for i in `seq 8072 8078`; do echo -n "kphsdr.com:$i: "; curl -s --connect-timeout 5 kphsdr.com:$i/status | grep -i gps_good || echo; done
    kphsdr.com:8072: gps_good=11
    kphsdr.com:8073: gps_good=10
    kphsdr.com:8074: gps_good=11
    kphsdr.com:8075: gps_good=10
    kphsdr.com:8076: gps_good=11
    kphsdr.com:8077: gps_good=11
    kphsdr.com:8078: gps_good=11
    rob@Robs-MBP:/Users/rob/wsprdaemon>
  • What is the disadvantage of having GPS to always acquire satellites?
    GPS acquisition uses a large FFT that runs in an uninterruptible fashion since it is a library routine (i.e. the Kiwi task scheduling can't influence its execution). Early on there was a fear this might cause realtime issues (audio drops) under heavy loading (many connections, WSPR extension use).

    Originally it seemed use of an individual Kiwi was pretty sparse and so it wouldn't hurt to pause the GPS acquisition process when there was more than one user connection. After all, if acquisition is off it takes quite a long time for all the currently tracked sats to move out-of-range such that there are no sats left. So this seemed an acceptable compromise. But many things have changed since then. Christoph figured out how to make the acquisition FFT a power-of-two which sped it up considerably. We now have applications that make full-time connections to all of the channels (wsprdaemon, kiwirecorder, WSPR extension autorun).

    These days there don't seem to be any significant problems with letting the GPS always acquire.
  • After 12 hours with 6 WD client connections, KPH77 has lost all GPS signals while the other 6 Kiwis all are tracking 12 satellites.

    rob@Robs-MBP:/Users/rob/wsprdaemon> for i in `seq 8072 8078`; do echo -n "kphsdr.com:$i: "; curl -s --connect-timeout 5 kphsdr.com:$i/status | grep -i gps_good || echo; done
    kphsdr.com:8072: gps_good=12
    kphsdr.com:8073: gps_good=12
    kphsdr.com:8074: gps_good=12
    kphsdr.com:8075: gps_good=12
    kphsdr.com:8076: gps_good=12
    kphsdr.com:8077: gps_good=0
    kphsdr.com:8078: gps_good=12
    rob@Robs-MBP:/Users/rob/wsprdaemon>

    All GPS RF comes from a single high gain antenna through an 8-way splitter and Kiwi76 has 7 active WD listeners, so I see no HW or operational differences which can be associated with the problem on Kiwi77. Kiwi77 lost lock sometime between 3:53AM PDT (10:53 UTC) and 4:05AM and I see nothing in kiwi77:/var/log/messages in the last 10 hours except a HUP:

    .....
    Oct 10 02:32:45 kiwisdr kiwid: 10d:00:22:57.828 ..234567 6 0.00 kHz am z0 "wsprdaemon_v2.5b" 10.14.70.86 (ARRIVED)
    Oct 10 02:32:45 kiwisdr kiwid: 10d:00:22:57.855 ..234567 7 0.00 kHz am z0 "wsprdaemon_v2.5b" 10.14.70.86 (ARRIVED)
    Oct 10 05:16:04 kiwisdr kiwid: 10d:03:06:16.790 ..234567 UPDATE: check scheduled
    Oct 10 06:25:11 kiwisdr rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="475" x-info="http://www.rsyslog.com"] rsyslogd was HUPed


    I have left Kiwi77 in this unlocked state and will PM you instructions for ssh access.

    Until we can figure out how to fix this in the Kiwi, can you suggest how my script could restore the Kiwi's GPS locks?
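One way a script could at least flag this condition across several Kiwis, sketched under the assumption that /status reports the count as a line of the form gps_good=N, as in the output above. The function names and the state names (TRACKING, LOST, NO_DATA) are made up for illustration. The sketch only detects the loss; it does not attempt to restore lock.

```shell
# Classify a Kiwi's GPS state from the text of its /status page.
# Prints TRACKING if gps_good > 0, LOST if gps_good is 0, and
# NO_DATA if no gps_good line was found (e.g. the Kiwi was unreachable).
classify_gps_state() {
    local status_text="$1" good
    good=$(printf '%s\n' "$status_text" | tr -d '\r' |
           sed -n 's/^gps_good=\([0-9][0-9]*\)$/\1/p' | head -n 1)
    if [ -z "$good" ]; then
        echo "NO_DATA"
    elif [ "$good" -eq 0 ]; then
        echo "LOST"
    else
        echo "TRACKING"
    fi
}

# Print one "host:port: STATE" line per port in the given range.
# $1 = host, $2 = first port, $3 = last port.
report_gps_states() {
    local host="$1" first_port="$2" last_port="$3" port status_text
    for port in $(seq "$first_port" "$last_port"); do
        status_text=$(curl -s --connect-timeout 5 "http://${host}:${port}/status")
        printf '%s:%s: %s\n' "$host" "$port" "$(classify_gps_state "$status_text")"
    done
}
```

With the seven Kiwis listed above, `report_gps_states kphsdr.com 8072 8078` would print one state per port. Separating NO_DATA from LOST matters because a missing port forwarding rule also produces an empty result from the grep one-liner, and that should not be treated as a loss of GPS lock.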
  • Okay, just woke up. Coffee in hand, checking now..
  • v1.335 has a fix, hopefully.
  • After 4 days in service at KPH, all Kiwis remain in GPS lock. Previously lock was being lost in a few hours, so it appears you have fixed that problem. I will continue to monitor the 7 KPH Kiwis and report if the problem reappears.

    In addition, I no longer see another problem: previously, kiwirecorder listeners would be assigned to RX0/1 when there were open RX2..7 channels. Not only are my sessions now much more stable, but the one session which restarted after 84 hours was assigned to RX5 as it should be. I am going to lower the priority of adding WD tests and recovery for this condition if, as it appears, it now occurs less often or not at all.

    Thanks for this fix.
  • The fix was to adjust the heuristic for deciding when to schedule a run of the GPS acquisition task. In my previous testing I had failed to disable compression when making the 8 kiwirecorder connections (which is how wsprdaemon operates). The subsequent increase in data transfer alters the timing enough that the acquisition task was getting locked out.
  • Thanks for that explanation. It doesn't sound like that would affect the selection of RX channels, but it seems possible that uncompressed audio would allow WD to extract more signals from the wav files.