use of kiwirecorder, kiwiwspr etc


Comments

  • Could we not use a flag, like "no_wf", that says whether or not it is OK to drop this session for a reboot?
    I.e. average off-site users by default can't fire up a session that will stop it rebooting; local users can (and local addresses do, by default).
    If the owner (or a trusted user) is off-site via a public IP and wants to force a non-dying session, they append (for example) &no_rbt_[pswd]
    That should work with scripts.
  • Session and restart enforcement sounds more like a server policy than a user decision. If a trusted user connects via public IP then they can log in with their credentials to get a privileged session status, so an extra parameter isn't really needed for that, is it?
  • Some additional facts we're forgetting:

    1) Checking if an update is available doesn't "kill" any sessions. You can try it manually in the update tab as proof.
    2) Right now if an update is available, and there are user connections, the update will not begin until the last user disconnects.
    3) "User connections" doesn't currently count "internal connections" which right now is only WSPR auto-run. Kiwirecorder et al are considered normal user connections. So only WSPR auto-run will be interrupted by an update (and then restarted).
    4) Checking for, and if necessary building, an update occurs on each Kiwi server restart.

    I just now spent some time re-testing all this. I changed the code slightly to shift the 0200Z check to 5 minutes in the future. Then tried connecting with kiwirecorder and then WSPR auto-run. It all worked fine. Kiwirecorder connections were not closed by an update check or pending update build. So I'm not sure what you guys are talking about.
  • And if you have both a WSPR auto-run and a kiwirecorder connection active, and there is a pending update, the correct thing happens too: as soon as kiwirecorder disconnects, WSPR gets booted and the update proceeds.
  • It must be something in Rob's kiwiwspr.sh that's biting me then.
  • I have implemented version 0.5c of kiwiwspr.sh, which includes a watchdog daemon that checks the status of the kiwirecorder.py sessions every odd 2 minutes and restarts them or performs scheduled configuration changes on them as needed. The watchdog log shows that one or more (and sometimes all) of them die at random times. I will need to modify kiwiwspr.sh to save the error output of kiwirecorder.py, but looking at the restart times strongly suggests those restarts are happening at semi-random times. So they are unlikely to be generated by your update check. I'm sorry if I misled you into an unneeded diagnostic task.
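
The watchdog idea described above can be sketched roughly as follows. This is not the code in kiwiwspr.sh; the function names, the PID file and the log file are assumptions made up for illustration. It only shows the two pieces mentioned in the post: checking whether a recorder is still running, and keeping its error output so a death can be diagnosed afterwards.

```shell
# Hypothetical sketch only, not taken from kiwiwspr.sh.

# Succeed if the PID recorded in pid_file belongs to an existing process.
job_is_alive() {
    local pid_file="$1" pid
    [ -f "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Start the given command in the background if its recorded PID is gone.
# stdout and stderr are appended to log_file so the reason for a death
# can be read later.
watch_job() {
    local pid_file="$1" log_file="$2"
    shift 2
    if ! job_is_alive "$pid_file"; then
        echo "$(date -u '+%Y-%m-%d %H:%M:%S') UTC: (re)starting: $*"
        "$@" >>"$log_file" 2>&1 &
        echo $! > "$pid_file"
    fi
}
```

A caller would run something like 'watch_job /tmp/rx0.pid /tmp/rx0.log <recorder command and arguments>' from a loop or timer. Note that 'kill -0' only tests that the PID exists; it says nothing about whether the recorder is still receiving data.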
  • edited August 2018
    Not random here... always at about 0620-0624 UTC. That is when all the kiwirecorder tasks die.
  • I have kiwiwspr.sh running on Raspberry Pis at two sites: KPH and AI6VN/KH6.
    At AI6VN/KH6 kiwirecorder.py never fails, while at KPH it fails frequently, sometimes several times per hour.
    I just noticed that at KPH the Raspberry Pi server running kiwiwspr.sh had 100% CPU activity due to a rogue zombie process I created last week while debugging Internet connection issues.
    After killing that process I seem to have reduced or eliminated the dying kiwirecorder.py's, although it will take another day to be sure of the level of improvement.
    So I would theorize that your server runs a program at 0602 that consumes CPU or other resources on your server. If you don't know of a crontab job at that time, you could schedule 'top -b > top.log' to run from 0558 to 0606 and look at what was running then.
    In my systems, the first 8-14 lines of top output are COMMAND python until each even 2 minutes when one or more 'wsprd' commands are at the head of the list for about 45 seconds
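
One way to capture 'top -b' output around the time the recorders die is sketched below. The function name, the output path and the snapshot counts are assumptions for illustration; only 'top -b' itself comes from the suggestion above. The '-d' (delay) and '-n' (number of iterations) options are standard in the procps version of top.

```shell
# Hypothetical helper: append COUNT batch-mode snapshots of top,
# DELAY seconds apart, to OUT, preceded by a UTC timestamp header.
log_top_window() {
    local out="$1" count="$2" delay="$3"
    echo "=== $(date -u '+%Y-%m-%d %H:%M:%S') UTC ===" >> "$out"
    top -b -d "$delay" -n "$count" >> "$out"
}

# Example: one snapshot every 5 seconds for 10 minutes (120 snapshots).
#   log_top_window "$HOME/top.log" 120 5
```

Started from cron a few minutes before the expected failure, this leaves a timestamped record of which commands were at the head of the list when the recorders died. Cron schedules use the server's local time, so a UTC time from the Kiwi log has to be converted first.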
  • FWIW all the kiwirecorder tasks also die around 0623 UTC on my system. I do not use my system 24/7 but when I do I have experienced the failure each night around the same time.

    Being able to run 8 simultaneous instances of WSPR on one Kiwi is a fantastic feature. Thanks so much to all who made it possible.

    Gene W3PM
  • I have just finished version 0.5d which adds support for a watchdog and scheduled band configuration changes.
    It also incorporates many command line changes which I hope make it easier to use.
    However, it requires changes to the CAPTURE_JOBS[] array in the kiwiwspr.conf file, and you must add at least one WSPR_SCHEDULE[] entry.
    It is running well at 3 sites, but as always save your current kiwiwspr.sh and kiwiwspr.conf files in case this new version doesn't work for you.
    It is a very lightweight program and you can easily spot all 14 LF/MF/HF WSPR bands on a Raspberry Pi.
    I welcome comments and bug reports.

    Attachments:
    https://forum.kiwisdr.com/uploads/Uploader/83/1583fff0adb66258650841176e1931.sh
  • edited August 2018
    Installing now, but I may have an issue. I ran kiwiwspr.sh and let it make a conf file, then without an edit ran kiwiwspr.sh -j a,all, but it bombed. I had intended to stop it, edit the conf for myself and start again. See attached file.
  • edited August 2018
    Errors I get when I run it... I edited only the calls and IP#; the KIWI name is unchanged. Well, I tried to upload errors.txt but the board is thwarting me. Sorry about the dupes... [fixed -jks]
  • I'm at my desk and will try to duplicate your problem
  • Thanks!!
  • Thanks for the testing
    I found and fixed a bug in the auto-created kiwiwspr.conf.
    Delete the kiwiwspr.conf and run the attached 0.5e to auto-create a kiwiwspr.conf which doesn't crash kiwiwspr.sh
    There are no other changes or fixes in 0.5e

    Attachments:
    https://forum.kiwisdr.com/uploads/Uploader/d3/ad166df9de32af23c1e4ded068c997.sh
  • all running good now, only one schedule at the moment.

    Any thoughts on the 06:22 phenomenon?
  • As I said in the post above, I agree with John that the problem is not in the Kiwi but almost certainly in the server running kiwiwspr.sh. Have you looked at your /etc/crontab for a task scheduled at that time?
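
A quick way to look for such a task is to filter the crontab files by their hour field. The sketch below is a hypothetical helper, not part of kiwiwspr.sh. It assumes the standard crontab layout in which the hour is the second field, and it only matches a plain hour or '*'; lists and ranges such as '6,18' or '5-7', and shortcuts such as '@daily', are not handled.

```shell
# Hypothetical helper: print crontab lines whose hour field is HOUR or "*".
# Usage: cron_jobs_at_hour HOUR FILE...
cron_jobs_at_hour() {
    local hour="$1"
    shift
    awk -v h="$hour" '
        /^[ \t]*(#|$)/      { next }   # skip comments and blank lines
        $1 ~ /^[A-Za-z_]+=/ { next }   # skip VAR=value settings
        $2 == h || $2 == "*" { print FILENAME ": " $0 }
    ' "$@"
}

# Example:
#   cron_jobs_at_hour 6 /etc/crontab /etc/cron.d/*
```

Keep in mind that cron schedules are written in the server's local time, while the times quoted in this thread are UTC.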
  • what I find curious is that Gene (above) has the same time. I run an Odroid XU4 BTW. I reviewed cron skeds, nothing jumps out at me
  • Do these Kiwis happen to have the "daily restart" option on the admin control tab page set to "yes"? If so there might be a bug with that. The restart is supposed to wait for all users (including kiwirecorder) to disconnect.
  • edited August 2018
    I've just checked all 5 of those Kiwis and none were set to daily restart.
    I saw no restarts on my Maui Kiwi in the last 24 hours, only on the KPH Kiwi which was being used by the overloaded Pi.
    So I feel certain there is no problem with the Kiwi SW.
    On one of my TV products an unpublished BIOS upgrade by the motherboard manufacturer introduced a CPU hang of several seconds at what seemed like random times.
    I suspect an analogous source of these events in the server, not the Kiwi.
  • Kiwiwspr0.5d.sh running on a RPi3+ has worked very well for me during the past 24 hours. At one point last night it was decoding 630 WSPR spots per minute.

    At 06:27 UTC the watchdog log indicated that six stale capture jobs were restarted with six new PIDs. This eliminated my ~ 06:23 WSPR failure problem.

    For a short experiment I ran seven instances of WSPR with 0.5d from 40 – 10 m and one instance of 20m WSPR on the Kiwi’s one remaining channel for comparison. During a one hour period 0.5d decoded 46 spots from 18 unique stations. The Kiwi decoded 41 spots from 16 unique stations. I suppose this shows that the wsprd algorithm used with 0.5d is more efficient than the older one used with the Kiwi. I realize that running more instances on the Kiwi would overburden the limited processing power of the Beagle resulting in even fewer spots.

    Kiwiwspr0.5d.sh is a great asset. Once again, my thanks to Rob and everyone else who made it possible.
  • "At one point last night it was decoding 630 WSPR spots per minute."

    Thanks for the experiment. You mean 630 spots per hour, I suppose, or about 10 per minute. 10 per second would be amazing.
  • Yes, of course, 630 spots per hour. Thanks for spotting my error.

    W3PM
  • I see the 0627 stale capture.... restart thing too
  • Hurricane Lane found a bug in the startup code of kiwiwspr last night when my Maui Kiwi lost power for about an hour and the kiwiwspr daemon failed to start after power was restored.
    I have fixed that bug and several others in the attached file, but existing installations will need to be patched.
    Unfortunately fixing the startup configuration of an existing installation requires that you make a minor edit to the systemctl configuration file:

    1) Verify that the kiwiwspr.service is installed on your machine:
    pi@PiKPH:~/ham/kiwiwspr $ sudo systemctl status kiwiwspr
    ● kiwiwspr.service - KiwiSDR WSPR daemon
    Loaded: loaded (/lib/systemd/system/kiwiwspr.service; enabled; vendor preset: enabled)
    Active: active (running) since Sat 2018-08-25 07:04:47 PDT; 6min ago
    Process: 22353 ExecStart=/home/pi/ham/kiwiwspr/kiwiwspr.sh -w a (code=exited, status=0/SUCCESS)
    Main PID: 22358 (kiwiwspr.sh)
    CGroup: /system.slice/kiwiwspr.service
    ├─22358 /bin/bash /home/pi/ham/kiwiwspr/kiwiwspr.sh -w a
    ├─26156 /bin/bash /home/pi/ham/kiwiwspr/kiwiwspr.sh -j a,all
    ├─26487 /bin/bash /home/pi/ham/kiwiwspr/kiwiwspr.sh -d a,KPH_HF_3,30
    └─26510 ps 28485

    Aug 25 07:04:47 PiKPH systemd[1]: kiwiwspr.service: User lookup succeeded: uid=1000 gid=1000
    Aug 25 07:04:47 PiKPH systemd[22353]: kiwiwspr.service: Executing: /home/pi/ham/kiwiwspr/kiwiwspr.sh -w a
    Aug 25 07:04:47 PiKPH kiwiwspr.sh[22353]: Sat Aug 25 07:04:47 PDT 2018: INFO, this server already has a /lib/systemd/system/kiwiwspr.service file. So leaving it alone.
    Aug 25 07:04:47 PiKPH systemd[1]: kiwiwspr.service: Child 22353 belongs to kiwiwspr.service
    Aug 25 07:04:47 PiKPH systemd[1]: kiwiwspr.service: Control process exited, code=exited status=0
    Aug 25 07:04:47 PiKPH systemd[1]: kiwiwspr.service: Got final SIGCHLD for state start.
    Aug 25 07:04:47 PiKPH systemd[1]: kiwiwspr.service: Main PID guessed: 22358
    Aug 25 07:04:47 PiKPH systemd[1]: kiwiwspr.service: Changed start -> running
    Aug 25 07:04:47 PiKPH systemd[1]: kiwiwspr.service: Job kiwiwspr.service/start finished, result=done
    Aug 25 07:04:47 PiKPH systemd[1]: Started KiwiSDR WSPR daemon.
    pi@PiKPH:~/ham/kiwiwspr $

    If you don't see a printout like that which indicates the service is running, then you need to:
    2) Stop your watchdog daemon: ./kiwiwspr.sh -w z
    3) Edit your service file: sudo vi /lib/systemd/system/kiwiwspr.service
    add 'a' to the end of the line:
    ExecStart=${KIWIWSPR_ROOT_DIR}/kiwiwspr.sh -w
    to make it:
    ExecStart=${KIWIWSPR_ROOT_DIR}/kiwiwspr.sh -w a
    4) Load the edited file and start the service:
    sudo systemctl daemon-reload
    sudo systemctl start kiwiwspr
    sudo systemctl status kiwiwspr

    You can also confirm that the daemon is running with './kiwiwspr.sh -w s'
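
For anyone who would rather not edit the service file by hand, the change in step 3 can be sketched with sed. The function name below is made up for illustration, and the sketch assumes the ExecStart line ends in 'kiwiwspr.sh -w' exactly as shown above. It prints the patched file to stdout and does not modify the original, so the result can be checked before it is installed.

```shell
# Hypothetical helper: print FILE with " a" appended to an ExecStart line
# that ends in "kiwiwspr.sh -w". A line that already ends in "-w a" and
# all other lines are printed unchanged.
patch_execstart() {
    sed 's|^\(ExecStart=.*/kiwiwspr\.sh -w\)[[:space:]]*$|\1 a|' "$1"
}

# Example:
#   patch_execstart /lib/systemd/system/kiwiwspr.service > /tmp/kiwiwspr.service
#   diff /lib/systemd/system/kiwiwspr.service /tmp/kiwiwspr.service
#   sudo cp /tmp/kiwiwspr.service /lib/systemd/system/kiwiwspr.service
```

After copying the file into place, continue with step 4 above.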
  • edited August 2018
    The forum reports an error when I try to upload kiwiwspr0.5g.sh. So email me if you need a copy.
  • I have just posted this discovery in another thread, but it is also highly relevant to users of my kiwiwspr.sh. Later today I hope to share V0.6a of that script with what I hope to be the final set of features and bug fixes.

    I recently discovered that Raspbian Stretch has a bug which very, very infrequently causes ethernet packets to be dropped. By running 'sudo rpi-update' on my Pi 3b, which upgraded the kernel from 4.7 to 4.14, I have completely eliminated the 2-5 per day kiwirecorder.py restarts from which I was previously suffering. Running 4.14, my WSPR decoding script 'kiwiwspr.sh' can simultaneously record 17 uncompressed audio streams (3.5 Mbps = 750 receive packets per second ) from 3 Kiwis on one Pi without a single restart in 24 hours.
  • However, even with that Pi OS upgrade I have encountered one restart which occurred at the same time on two 8 channel Kiwis being run from my Pi. The /var/log/messages files on both Kiwis record the same set of restart events. Strangely, those syslog entries are not recorded in ascending time order, so it isn't clear what event initiated the LEAVING messages:

    .........
    Sep 5 13:44:19 kiwisdr kiwid: 5d:13:18:01.816 01234567 ca_pause 16560 ca_pause_old 192 ca_shift=28 code_creep=-220 code_period_ms=1 code_period_samples=16368
    Sep 5 23:05:44 kiwisdr kiwid: 5d:22:39:26.317 01234567 ca_pause 65527 ca_pause_old 55 ca_shift=4756 code_creep=-4811 code_period_ms=4 code_period_samples=65472
    Sep 6 03:21:27 kiwisdr kiwid: 6d:02:55:09.367 01234567 ca_pause 16372 ca_pause_old 4 ca_shift=16 code_creep=-20 code_period_ms=1 code_period_samples=16368
    Sep 6 03:28:04 kiwisdr kiwid: 6d:03:01:46.068 01234567 UPDATE: check scheduled
    Sep 6 06:25:11 kiwisdr rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="430" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.800 .1234567 0 21096.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 23:58:02)
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.808 .123456. 7 18106.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 23:58:03)
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.813 .1234.6. 5 10140.15 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 23:58:05)
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.816 ..234.6. 1 3594.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 23:56:06)
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.821 ..23..6. 4 7040.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 23:58:06)
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.830 ..23.... 6 14097.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 23:58:04)
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.840 ..2..... 3 3570.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 23:58:08)
    Sep 6 06:25:11 kiwisdr kiwid: 6d:05:58:53.843 ..2..... UPDATE: checking for updates
    Sep 6 06:25:02 kiwisdr kiwid: 6d:05:58:54.877 ..2..... UPDATE: version 1.219 is current
    Sep 6 06:27:04 kiwisdr kiwid: 6d:06:00:46.261 ..23.... 3 3570.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 6 06:27:05 kiwisdr kiwid: 6d:06:00:47.250 ..234... 4 3594.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 6 06:27:06 kiwisdr kiwid: 6d:06:00:48.192 ..2345.. 5 7040.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 6 06:27:07 kiwisdr kiwid: 6d:06:00:49.144 ..23456. 6 10140.15 kHz usb z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 6 06:27:08 kiwisdr kiwid: 6d:06:00:50.085 ..234567 7 14097.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 6 06:27:10 kiwisdr kiwid: 6d:06:00:51.913 0.234567 0 21096.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 6 06:27:24 kiwisdr kiwid: 6d:06:01:06.048 0.234567 0 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Sep 6 06:27:24 kiwisdr kiwid: 6d:06:01:06.052 0.234567 3 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Sep 6 06:27:24 kiwisdr kiwid: 6d:06:01:06.052 0.234567 4 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Sep 6 06:27:24 kiwisdr kiwid: 6d:06:01:06.053 0.234567 5 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Sep 6 06:27:24 kiwisdr kiwid: 6d:06:01:06.053 0.234567 6 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Sep 6 06:27:24 kiwisdr kiwid: 6d:06:01:06.053 0.234567 7 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Sep 6 06:27:25 kiwisdr kiwid: 6d:06:01:07.831 0.234567 7 GEOLOC: extreme-ip-lookup.com/json/198.40.45.23
    Sep 6 06:27:25 kiwisdr kiwid: 6d:06:01:07.834 0.234567 6 GEOLOC: geoloc_task:P2:T36((50.000 msec) TaskSleep)
    Sep 6 06:27:25 kiwisdr kiwid: 6d:06:01:07.847 0.234567 5 GEOLOC: geoloc_task:P2:T35((50.000 msec) TaskSleep)
    Sep 6 06:27:26 kiwisdr kiwid: 6d:06:01:07.854 0.234567 4 GEOLOC: geoloc_task:P2:T34((50.000 msec) TaskSleep)
    Sep 6 06:27:26 kiwisdr kiwid: 6d:06:01:08.039 0.234567 3 GEOLOC: geoloc_task:P2:T33((50.000 msec) TaskSleep)
    Sep 6 06:27:26 kiwisdr kiwid: 6d:06:01:08.259 0.234567 0 GEOLOC: geoloc_task:P2:T32((50.000 msec) TaskSleep)
    Sep 6 06:29:10 kiwisdr kiwid: 6d:06:02:52.047 01234567 1 18106.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 6 06:29:24 kiwisdr kiwid: 6d:06:03:06.048 01234567 1 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Sep 6 06:29:24 kiwisdr kiwid: 6d:06:03:06.821 01234567 1 GEOLOC: extreme-ip-lookup.com/json/198.40.45.23
    Sep 6 08:40:26 kiwisdr kiwid: 6d:08:14:07.981 01234567 ca_pause 18040 ca_pause_old 1672 ca_shift=20 code_creep=-1692 code_period_ms=1 code_period_samples=16368
    Sep 6 10:40:14 kiwisdr kiwid: 6d:10:13:55.890 01234567 ca_pause 18648 ca_pause_old 2280 ca_shift=1964 code_creep=-4244 code_pe
    ........
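
To find where the syslog stamps step backwards in a longer log, a small awk filter is enough. This is a sketch with a made-up function name. It assumes the classic syslog prefix of month, day and HH:MM:SS as in the excerpt above, and it ignores the month, so it is only meaningful within a single month. In the excerpt it would flag the "version 1.219 is current" line, whose syslog stamp goes from 06:25:11 back to 06:25:02 even though the kiwid uptime column on the same two lines keeps increasing (05:58:53.843 to 05:58:54.877).

```shell
# Hypothetical helper: report syslog lines whose timestamp is earlier than
# the timestamp of the previous timestamped line.
# Assumes a "Mon D HH:MM:SS" prefix; the month is ignored.
find_out_of_order() {
    awk '
        $3 ~ /^[0-9][0-9]:[0-9][0-9]:[0-9][0-9]$/ {
            split($3, t, ":")
            secs = $2 * 86400 + t[1] * 3600 + t[2] * 60 + t[3]
            if (seen && secs < prev)
                print "line " NR " is earlier than the line before it: " $0
            prev = secs
            seen = 1
        }
    ' "$@"
}

# Example:
#   find_out_of_order /var/log/messages
```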
  • I get the 0627 restart on my Odroid XU4 running Ubuntu
  • Thinking my restarts were due to some HW limit of the Pi, I bought an Odroid but haven't set it up.
    Now that the Pi is fixed, I don't see that I need to learn a whole new environment.