kiwirecorder.py crashes

In kiwiwspr.sh I utilize kiwirecorder.py to capture the WSPR band audio to a wav file and then process that wav file using 'wsprd'
As I increase the number of kiwirecorder.py, more and more often my kiwirecorder.py will crash and need to be restarted by my watchdog daemon.
These failures occur with the same frequency at two different QTHs and on the more powerful odroid as often as on the Pi.
I have monitored the ethernet packet loss rate which is very, very low and there seems to be no correlation between crashes and the few lost packets reported by a simultaneous iperf udp stream.
As an example, I execute:

python -u /home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwirecorder.py -q --ncomp -s 10.14.70.75 -p 8073 -f 28124.6 -m usb -L 1200 -H 1700 --pw=foobar -T -101 --dt-sec 120 >> capture.log 2>&1 &

And 5 minutes later that spawned capture terminates with this diagnostic printout:

....
Started a new file: 20180927T162400Z_28124600_usb.wav
Traceback (most recent call last):
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiworker.py", line 39, in run
self._recorder.run()
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 359, in run
received = self._stream.receive_message()
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 797, in receive_message
self._process_ping_message(message)
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 688, in _process_ping_message
self._send_pong(message)
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 888, in _send_pong
self._options.outgoing_frame_filters)
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 368, in create_pong_frame
return _create_control_frame(common.OPCODE_PONG, body, mask, frame_filters)
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 360, in _create_control_frame
return _build_frame(header, frame.payload, mask)
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 139, in _build_frame
return header + masking_nonce + masker.mask(body)
File "/home/odroid/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/util.py", line 201, in _mask_using_array
result.fromstring(s)
TypeError: fromstring() argument 1 must be string or read-only buffer, not bytearray
GNSS position: lat,lon=[+38.10, -122.95]

Started a new file: 20180927T163912Z_28124600_usb.wav
....

This is by no means a crisis problem, but I suspect that other kiwirecorder.py users like the TDoA service may be encountering similar problems.
«13

Comments

  • Hi @rrobinet

    I have a fix for these crashes, waiting for @jks to merge them into his fork of kiwiclient.

    These crashes were caused by changes for python2/python3 compatibility.
    WA2ZKD
  • This works better although I would like to root cause the problem. It doesn't crash, but it is reporting an error in capturing a wave file and then restarts on its own at an odd minute, then returns to even two minute captures:

    Started a new file: 20180928T231200Z_14095600_usb.wav

    Started a new file: 20180928T231400Z_14095600_usb.wav
    Traceback (most recent call last):
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiworker.py", line 40, in run
    self._recorder.run()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 361, in run
    received = self._stream.receive_message()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 757, in receive_message
    frame = self._receive_frame_as_frame_object()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 462, in _receive_frame_as_frame_object
    opcode, unmasked_bytes, fin, rsv1, rsv2, rsv3 = self._receive_frame()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 459, in _receive_frame
    unmask_receive=self._options.unmask_receive)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 194, in parse_frame
    received = receive_bytes(2)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 454, in _receive_bytes
    return self.receive_bytes(length)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_base.py", line 159, in receive_bytes
    new_read_bytes = self._read(length)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_base.py", line 119, in _read
    (length, (self._request.connection.remote_addr,)))
    ConnectionTerminatedException: Receiving 2 byte failed. Peer ((('10.14.70.76', 8073),)) closed connection
    GNSS position: lat,lon=[+38.09, -122.95]
    GNSS position: lat,lon=[+38.09, -122.95]

    Started a new file: 20180928T231512Z_14095600_usb.wav

    Started a new file: 20180928T231600Z_14095600_usb.wav
  • While there are no longer exception errors, the python program terminates itself and needs to be rerun about as often as before, perhaps once every 12 hours.

    Here is the python output from the 20M client session which failed at 0624:
    ...
    Started a new file: 20180929T062400Z_14095600_usb.wav
    Traceback (most recent call last):
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiworker.py", line 40, in run
    self._recorder.run()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 361, in run
    received = self._stream.receive_message()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 757, in receive_message
    frame = self._receive_frame_as_frame_object()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 462, in _receive_frame_as_frame_object
    opcode, unmasked_bytes, fin, rsv1, rsv2, rsv3 = self._receive_frame()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 459, in _receive_frame
    unmask_receive=self._options.unmask_receive)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 194, in parse_frame
    received = receive_bytes(2)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 454, in _receive_bytes
    return self.receive_bytes(length)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_base.py", line 159, in receive_bytes
    new_read_bytes = self._read(length)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_base.py", line 128, in _read
    (length, e))
    ConnectionTerminatedException: Receiving 2 byte failed. socket.error (timed out) occurred
    GNSS position: lat,lon=[+38.09, -122.95]

    Started a new file: 20180929T062710Z_14095600_usb.wav
    ...


    Here is the /var/log/messages from the Kiwi which reports all of the 8 client sessions which were restarted at that time:

    ....
    Sep 29 06:25:11 kiwisdr rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="427" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.635 012.4567 3 7040.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:23)
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.638 0.2.4567 1 28126.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:00)
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.646 0.2.456. 7 3594.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:06)
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.664 0.2.45.. 6 21096.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:20)
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.667 0...45.. 2 3570.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:24)
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.671 0....5.. 4 10140.15 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:22)
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.671 0....... 5 18106.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:20)
    Sep 29 06:25:12 kiwisdr kiwid: 6d:07:39:56.675 ........ 0 14097.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 7:10:03)
    Sep 29 06:27:04 kiwisdr kiwid: 6d:07:41:48.551 ..2..... 2 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 29 06:27:04 kiwisdr kiwid: 6d:07:41:49.378 ..23.... 3 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 29 06:27:05 kiwisdr kiwid: 6d:07:41:50.165 ..234... 4 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 29 06:27:06 kiwisdr kiwid: 6d:07:41:50.961 ..2345.. 5 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 29 06:27:07 kiwisdr kiwid: 6d:07:41:51.702 ..23456. 6 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 29 06:27:08 kiwisdr kiwid: 6d:07:41:53.357 ..234567 7 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Sep 29 06:27:09 kiwisdr kiwid: 6d:07:41:54.122 0.234567 0 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    ....
  • the 0627 question again
  • Hi Rob,

    I can reproduce the backtraces you are seeing by kicking out a connection from the admin tab. So it looks like the kiwisdr kicks out one of the connections for whatever reason. I guess you are running one instance of kiwirecorder.py recording 8 data streams, so if any one of them throws an exception, all of them terminate.

    You see this backtrace because the kiwirecorder does not expect that the server closes the connection. In addition the server does not send a close request to the websocket client. I have fixes for both of these issues and I guess that John will let you know when they are available.

    Do you have anything interesting in /var/log/messages before the 1st client leaves?
  • No, the /var/log/messages file is silent for 3 hours before the failure. The inverted log times and duplicate UPDATE messages are suspicious:

    ....
    Sep 24 16:05:17 kiwisdr kiwid: 1d:17:20:01.635 01234567 ca_pause 16377 ca_pause_old 9 ca_shift=56 code_creep=-65 code_period_ms=1 code_period_samples=16368
    Sep 25 03:28:01 kiwisdr kiwid: 2d:04:42:45.909 01234567 UPDATE: check scheduled
    Sep 25 06:25:10 kiwisdr rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="427" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
    Sep 26 03:28:01 kiwisdr kiwid: 3d:04:42:45.898 01234567 UPDATE: check scheduled
    Sep 26 06:25:10 kiwisdr rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="427" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
    Sep 26 06:25:11 kiwisdr kiwid: 3d:07:39:55.976 0123456. 7 10140.15 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 40:47:54)
    Sep 26 06:25:11 kiwisdr kiwid: 3d:07:39:55.984 01234.6. 5 18106.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 47:58:02)
    ...

    WA2ZKD points out that 0627 seems to be a common time for these events to occur.
    However I can increase the frequency of these failures by increasing the number of simultaneous kiwirecorder.py clients. When I run 20 clients the CPU is only 30% busy and there is only 4 Mbps of ethernet traffic, but failures will occur several times per hour.
    I can give you ssh access to such a system if it would help your debugging.
  • The log times are not inverted -- they are for different days (Sep 24, 25, 26).

    How can I duplicate the more frequent crashing scenario so I might see what's happening on the Kiwi side? In particular what does "run 20 clients" mean exactly? That you're using multiple Kiwi servers and instances of kiwiclient such that there are 20 simultaneous open network connections? And, as Christoph asked, does is each kiwiclient setup to make 8 connections or only one? (i.e. does seeing 8 connections disappear from the Kiwi log reflect that one instance of kiwiclient, with 8 connections, has exited or that the Kiwi has kicked all connections causing 8 instances of kiwiclient to exit?)
  • jksjks
    edited September 2018
    I have committed Christoph's changes to KiwiSDR and kiwiclient. There is not a formal Kiwi v1.235 release yet given that there may be more fixes coming today. So to run the committed changes you need to use the "force software build" button in the update tab of the admin page.
  • I run multiple kiwirecorder.py client sessions on one server. At KPH there are two servers, one a Raspberry PI 3b, while the second server is an odroid. Both servers experience the crashes/restarts at about the same irregular rate. Currently both clients are configured to run 11 client sessions on each server. On the Pi, 8 sessions go to Kiwi76 and 3 LF sessions to Kiwi72. On the odroid 8 sessions go to Kiwi75 and 3 sessions to Kiwi72. The crashes occur on both servers and were occurring when only the Pi clients were running. I have experimented with configuring the Pi to run 20 clients, 8 to Kiwi76, 8 to Kiwi75 and 3 to Kiwi72 and the crash frequency went up significantly to several times per hour. I have two beta sites running my code, one on an odroid and a second on a x86 Debian, and both those sites experience crashes, but at lower frequency due to, I believe, both running 8 or fewer clients.
    WA2ZKD
  • Thanks Rob that's helpful.

    I've found a bunch of little problems with kiwirecorder, but nothing that explains why the Kiwi might be exhibiting the disconnection behavior. I just checked in these changes along with your v1.0 kiwiwspr script in the tools/ subdirectory. I'll keep look for the problem..
  • v1.235 is out which will push Christoph's web socket close fixes to all Kiwis.

    So now I can make 8 connections to an 8-channel Kiwi and then proceed to use the "kick" button (admin page status tab) on any combination of channels and the retry code in kiwirecorder will soon reconnect those channels. This occurs whether I use 8 instances of kiwirecorder each making a single connection or a single instance that makes 8 connections. See the Makefile targets "crash" and "crash2". And also the new targets "ps, kp, kill" which makes it a bit easier to manage multiple kiwirecorder processes that run detached.
  • edited September 2018
    Hi Rob,

    You mentioned that the CPU load is 30% with 20 clients but I guess that this is for recording only because every two minutes 20 wspr decoding processes will be running in parallel.

    So you could try to give the wspr decoding processes less priority and/or run them sequentially instead of in parallel.

    In summary I think that the CPU load of the wspr decoders slows down the recorders to a point where timeouts (either on client or on server side) are exceeded. Note that this is also consistent with your observation that the frequency of crashes depends on the number of recorders (+wspr decoders) running.
  • I'm seeing crashes (well, mass disconnects) every 3-4 hours now and adding printfs() to the Kiwi code. So far it appears the connection "kick" flag is getting set for some reason. This should not be happening so maybe it's a memory smash bug.
  • I have noticed my WSPR running the old script (yes I'm too lazy to update it right now) is seeing different connection times, seems something is kicking the individual recorder sessions?
    When it happened before (to best recollection) it would have been an all-or-nothing when I restarted something in the network link.
    WA2ZKD
  • I have observed a scheduling bug in V1.0a and am working to fix it this morning. However none of the start/stop of the kiwirecorder.py jobs has changed for many versions.

    I 'nice' the wsprd jobs, and the crash rates seems to be the same on the Pi and the odroid, so I don't think its an out of CPU or ethernet IO problem. I left an i5 Mac Air out at KPH and can compare the crash rate on it against the Pi when I fix my scheduling bug, but I'll be upping CPU performance and changing OS at the same time. I don't own an x86 Linux server, but one of my beta testers runs on one but he has only one Kiwi.

    Thanks
  • I should have a second soon so could open a x86 VM to you if you have a fixed IP address.
    The Kiwi is currently connecting over a wireless bridge (two Microtik devices) so I suppose there is some possibility there could be a momentary glitch as my neighbours overly powerful AP's hog channels (mine are running 3dbm with a link of about four feet).
  • All of my connections are wired
  • Well, the Heisenberg-printf principle is in full effect. My printf-instrumented code ran all night without any problems of course. I'll try again today.
  • Hi Rob,

    The message "rsyslogd was HUPed” means that the system logs were rotated.

    I suspect that you will find similar log entries or log entries related to other software than kiwid before each kiwirecorder crash. Can you check?
  • Hi Christoph,

    I have finished fixing my kiwiwspr.sh version 1.1 script and have been running 19 WSPR client jobs with it for an hour on what is now a very busy (and warm) Pi3B with no crashes or restarts. Previously I would see crashes every few minutes under these conditions. So you and John have made dramatic improvements in its stability in the last week.

    With those crashes gone my log files are completely clean, so any errors stimulated by logrotate or other kiwi or pi system activity should be easily visible.
    With my and your fixes, I think kiwiwspr.sh is ready for production use. I'll be posting it in another discussion thread. Thanks
  • The recorder is far, far more reliable but not yet perfect ;=)
    In 19 hours of testing on 19 simultaneous recording sessions I have observed 2 crashes.

    ========= The first crash occurred on Kiwi KPH_HF_2 at 08:23 PDT == 15:23 UDT after 19 hours of operation when this message was logged by kiwirecorder: =================

    Traceback (most recent call last):
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiworker.py", line 42, in run
    self._recorder.run()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 372, in run
    received = self._stream.receive_message()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 757, in receive_message
    frame = self._receive_frame_as_frame_object()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 462, in _receive_frame_as_frame_object
    opcode, unmasked_bytes, fin, rsv1, rsv2, rsv3 = self._receive_frame()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/mod_pywebsocket/_stream_hybi.py", line 459, in _receive_frame
    unmask_receive=self._options.unmask_receive)
    AttributeError: 'Stream' object has no attribute '_options'

    ===== Kiwi KPH_HF_2's /var/log/messages has log entries at that time, but nothing else in the system logged at that time =========

    Oct 2 06:25:38 kiwisdr kiwid: 6d:15:46:58.091 01234567 0 GEOLOC: geoloc_task:P2:T30((50.000 msec) TaskSleep)
    Oct 2 06:27:17 kiwisdr rsyslogd0: action 'action 17' resumed (module 'builtin:ompipe') [try http://www.rsyslog.com/e/0 ]
    Oct 2 06:27:17 kiwisdr rsyslogd-2359: action 'action 17' resumed (module 'builtin:ompipe') [try http://www.rsyslog.com/e/2359 ]
    Oct 2 15:22:32 kiwisdr kiwid: 7d:00:43:52.287 01234.67 5 10140.15 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 8:57:16)
    Oct 2 15:23:03 kiwisdr kiwid: 7d:00:44:23.373 01234567 5 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Oct 2 15:23:15 kiwisdr kiwid: 7d:00:44:35.089 01234567 5 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Oct 2 15:23:15 kiwisdr kiwid: 7d:00:44:35.726 01234567 5 GEOLOC: ipapi.co/198.40.45.23/json
    Oct 2 20:14:29 kiwisdr kiwid: 7d:05:35:49.149 01234567 ca_pause 16375 ca_pause_old 7 ca_shift=16 code_creep=-23 code_period_ms=1 code_period_samples=16368



    ========= The second crash occurred on Kiwi KPH_HF_3 at 08:51 PDT == 15:51 UDT when this message was logged by kiwirecorder: =================
    Traceback (most recent call last):
    File "/usr/lib/python2.7/logging/__init__.py", line 861, in emit
    msg = self.format(record)
    File "/usr/lib/python2.7/logging/__init__.py", line 734, in format
    return fmt.format(record)
    File "/usr/lib/python2.7/logging/__init__.py", line 465, in format
    record.message = record.getMessage()
    File "/usr/lib/python2.7/logging/__init__.py", line 325, in getMessage
    msg = str(self.msg)
    TypeError:
    Logged from file kiwiclient.py, line 272
    Traceback (most recent call last):
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiworker.py", line 42, in run
    self._recorder.run()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 379, in run
    self._process_ws_message(received)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 164, in _process_ws_message
    self._process_message(tag, body)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 275, in _process_message
    self._set_keepalive()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 159, in _set_keepalive
    self._send_message('SET keepalive')
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 144, in _send_message
    self._stream.send_message(str(msg))
    TypeError:
    2018-10-02 08:50:41,761 pid 23176 websocket close: "Required argument 'source' (pos 1) not found"
    pi@PiKPH:/tmp/kiwi-captures $

    ===== Kiwi KPH_HF_3's /var/log/messages has log entries at that time, but nothing in the system logged at that time =========
    Oct 2 06:25:11 kiwisdr rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="427" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
    Oct 2 15:50:41 kiwisdr kiwid: 9d:17:05:26.262 012.4567 3 3594.05 kHz usb z0 "kiwirecorder.py" 10.14.70.84 Point Reyes Station, California, USA (LEAVING after 17:28:50)
    Oct 2 15:51:03 kiwisdr kiwid: 9d:17:05:48.293 01234567 3 0.00 kHz am z0 "kiwirecorder.py" 10.14.70.84 (ARRIVED)
    Oct 2 15:51:21 kiwisdr kiwid: 9d:17:06:05.903 01234567 3 GEOLOC: 10.14.70.84 sent no geoloc info, trying from here
    Oct 2 15:51:22 kiwisdr kiwid: 9d:17:06:06.568 01234567 3 GEOLOC: ipapi.co/198.40.45.23/json
  • After 48 hours I have encountered a third crash, this one generated this one line cryptic error message:

    Fatal Python error: GC object already tracked
  • Hi Rob,

    there is an update to the kiwirecorder code which you may try.

    1st crash: "AttributeError: 'Stream' object has no attribute '_options'"
    * There may be a fix for this in the update

    2nd crash: "self._stream.send_message(str(msg)) TypeError:"
    * I do not understand how str(msg) can throw an exception. Anyway calling str in this case is not needed and it is removed in the update

    3rd crash: "Fatal Python error: GC object already tracked"
    * for the update I started to look at the state of the garbage collector and found and removed some cyclic dependencies (one caused by OptionParser and another one due to some import statements)

    Thanks for these crash reports as they help to improve the software!
  • I've downloaded the most recent kiwiclient and am running it at KPH. If you don't hear from me, I have observed no crashes or diagnostic output.
  • The new kiwiclient has been running on four systems for over six hours. There have been no crashes on the <=8 clients systems, but two crashes on the 19 client system.
    The first crash occurred about 90 minutes after startup:

    pi@PiKPH:/tmp/kiwi-captures $ cat KPH_HF_3/30/capture.log
    Traceback (most recent call last):
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiworker.py", line 35, in run
    self._recorder.run()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 377, in run
    self._process_ws_message(received)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 159, in _process_ws_message
    tag = message[0:3].decode('utf-8')
    File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
    TypeError: an integer is required
    pi@PiKPH:/tmp/kiwi-captures $

    The second crash occurred several minutes ago after 6 hours of operation, but it generated no printout information. The watchdog daemon checked the pid every two minutes and it found it dead. The log file for that pid is empty:

    pi@PiKPH:/tmp/kiwi-captures $ cat KPH_HF_2/80/capture.log
    pi@PiKPH:/tmp/kiwi-captures $


    At least the first crash appears to be in a new area of your code. I'll continue testing...
  • I experienced four crashes this morning on the 19 client system: 3 produced no printouts and occured within the same 2 minute window.
    The fourth occured about 2 hours later and produced the same output as I reported above:

    pi@PiKPH:/tmp/kiwi-captures $ cat KPH_HF_3/17/capture.log
    Traceback (most recent call last):
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiworker.py", line 35, in run
    self._recorder.run()
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 377, in run
    self._process_ws_message(received)
    File "/home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwiclient.py", line 159, in _process_ws_message
    tag = message[0:3].decode('utf-8')
    File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
    TypeError: an integer is required
    pi@PiKPH:/tmp/kiwi-captures $
  • I cannot reproduce these crashes so it is very hard to fix them. Anyway there is a new version which you may give a try. It avoids utf-8 decoding which is only needed when using python3 but not for python2.
  • ...by the way today I got several NDBs from China and from the far east of Russia before local sunrise on kph:8072, including an oil rig off the coast of Sakhalin island (482 BT Berkut, RSE, +-400 Hz)
  • The Oct 7 kiwirecorder I downloaded this morning crashes immediately. Restoring the Oct 4 and the same command line runs fine. Without version numbers, I can't identify the versions more clearly.

    pi@PiKPH:~/ham/kiwiwspr $ ls -ld kiwiclient-jks-v0.1*
    drwxr-xr-x 4 pi pi 4096 Oct 4 07:30 kiwiclient-jks-v0.1
    drwxr-xr-x 4 pi pi 4096 Oct 7 07:24 kiwiclient-jks-v0.1.broken
    pi@PiKPH:~/ham/kiwiwspr $ /home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1/kiwirecorder.py -q --ncomp -s 10.14.70.76 -p 8073 -f 28124.6 -m usb -L 1200 -H 1700 --pw=littledavey -T -101 --dt-sec 120
    ^CKeyboardInterrupt: threads successfully closed
    pi@PiKPH:~/ham/kiwiwspr $ /home/pi/ham/kiwiwspr/kiwiclient-jks-v0.1.broken/kiwirecorder.py -q --ncomp -s 10.14.70.76 -p 8073 -f 28124.6 -m usb -L 1200 -H 1700 --pw=littledavey -T -101 --dt-sec 120
    2018-10-07 10:55:21,557 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,559 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,579 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,623 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,666 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,711 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,756 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,800 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,844 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,892 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,914 pid 13857 integer argument expected, got float
    2018-10-07 10:55:21,958 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,003 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,047 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,091 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,135 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,180 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,224 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,268 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,312 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,357 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,401 pid 13857 integer argument expected, got float
    2018-10-07 10:55:22,445 pid 13857 integer argument expected, got float
    ^C2018-10-07 10:55:22,489 pid 13857 integer argument expected, got float
    KeyboardInterrupt: threads successfully closed
    pi@PiKPH:~/ham/kiwiwspr $
Sign In or Register to comment.