Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Telemetry Health Calibration returning wrong states #164

Open
douglaswsilva opened this issue Dec 17, 2019 · 13 comments
Open

Telemetry Health Calibration returning wrong states #164

douglaswsilva opened this issue Dec 17, 2019 · 13 comments

Comments

@douglaswsilva
Copy link
Collaborator

douglaswsilva commented Dec 17, 2019

Hey guys,

As I spoke with Jonas, sometimes we've been getting "wrong" calibration to some of the values when subscribing to telemetry.health. I finally was able to reproduce it connected to the debugger and this is what I see int he logs coming from the SDK:

[01:04:11|Info ] Server set to listen on 0.0.0.0:50051 (grpc_server.cpp:48)
[01:04:11|Info ] Server started (grpc_server.cpp:32)
[01:04:11|Info ] Waiting to discover system... (connection_initiator.h:582019-12-17 13:04:11.543473-0800 SiteScan[4552:1715194] TIC TCP Conn Failed [6:0x280ca8480]: 1:50 Err(50)
1.567022-0800 SiteScan[4552:1715194] Task <6D86C013-F1A5-4AE0-863B-60407A548E8A>.<1> HTTP load failed (error code: -1009 [1:50])
2019-12-17 13:04:11.580058-0800 SiteScan[4552:1715235] Task <6D86C013-F1A5-4AE0-863B-60407A548E8A>.<1> finished with error - code: -1009
|Info ] New system on: 127.0.0.1:60985 (udp_connection.cpp:263)
[01:04:11|Debug] Component Unsupported component (240) added. (system_impl.cpp:401)
[01:04:11|Debug] Discovered 1 component(s) (UUID: 1) (system_impl.cpp:561)
[01:04:11|Info ] System discovered [UUID: 1] (connection_initiator.h:62)
2019-12-17 13:04:11.860006-0800 SiteScan[4552:1715074] connectionState started
2019-12-17 13:04:11.860102-0800 SiteScan[4552:1715074] Firmware version is (null)
[01:04:12|Info ] New system on: 192.168.42.1:54780 (udp_connection.cpp:263)
[01:04:12|Debug] Component Camera 1 (100) added. (system_impl.cpp:401)
[01:04:12|Warn ] sending again, retries to do: 3  (511). (mavlink_commands.cpp:222)
[01:04:12|Debug] Component Autopilot (1) added. (system_impl.cpp:401)
[01:04:12|Debug] Component Gimbal (154) added. (system_impl.cpp:401)
[01:04:12|Error] Error: UUID changed (system_impl.cpp:257)
[01:04:12|Warn ] Command ack 520 not matching our current command: 530 (mavlink_commands.cpp:143)
[01:04:12|Debug] request camera info (camera_impl.cpp:140)
[01:04:12|Info ] Using cached file for Yuneec E90. (camera_impl.cpp:907)
[01:04:12|Debug] step range missing for CAM_CUSTOMWB (camera_definition.cpp:371)
[01:04:12|Error] min range missing for CAM_SPOTAREA (camera_definition.cpp:345)
[01:04:12|Warn ] Not found: CAM_SPOTAREA (camera_definition.cpp:220)
[01:04:12|Error] min range missing for CAM_ASPECTRATIO (camera_definition.cpp:345)
[01:04:12|Warn ] Not found: CAM_ASPECTRATIO (camera_definition.cpp:220)
[01:04:12|Debug] Ignoring string params. (camera_definition.cpp:121)
[01:04:12|Warn ] Default missing for CAM_SYSTEMTIME (camera_definition.cpp:196)
[01:04:12|Debug] Ignoring custom params. (camera_definition.cpp:126)
[01:04:12|Debug] Ignoring custom params. (camera_definition.cpp:126)
[01:04:12|Warn ] Default missing for CAM_ZOOM (camera_definition.cpp:196)
[01:04:12|Error] Error: get param busy timeout: CAL_GYRO0_ID (mavlink_parameters.cpp:456)
[01:04:12|Error] Error: Param for gyro cal failed. (telemetry_impl.cpp:872)
[01:04:12|Debug] MAVLink: info: data link #0 lost (system_impl.cpp:308)
[01:04:12|Debug] MAVLink: info: data link #0 regained (system_impl.cpp:308)
[01:04:12|Warn ] Comparison type mismatch between uint8_t and uint16_t (mavlink_parameters.h:407)
[01:04:12|Error] Chosen value smaller than minimum (camera_definition.cpp:550)

As you can see, there was a timeout in

[01:04:12|Error] Error: get param busy timeout: CAL_GYRO0_ID (mavlink_parameters.cpp:456)

Which made it be false and the other calibration values to be true. To reproduce this I just ran our app on iPad and while connected to drone via wifi.

If we are subscribing to too many things at the same time, therefore causing the timeout, please let us know. And if you have any suggestion for the best way to subscribe to things, we'd appreciate.

Once again, thanks for looking into this!

@douglaswsilva douglaswsilva changed the title Telemetry Health Calibration returning wrong values Telemetry Health Calibration returning wrong states Dec 17, 2019
@JonasVautherin
Copy link
Collaborator

Which made it be false and the other calibration values to be true.

Oh, so do you always get a timeout on the sensor when it fails? And for the different sensors?

@julianoes: do you have insights on what it means to have a timeout on one of those parameters? MAVSDK will try to get the parameter multiple times before it gives up, right? Could that mean that on the firmware side, something is wrong with this param?

@julianoes
Copy link
Contributor

My thought is that this might be a link congestion issue on startup.

@julianoes
Copy link
Contributor

We could try it with a longer timeout and more retransmissions.

@douglaswsilva
Copy link
Collaborator Author

@JonasVautherin The "wrong" calibration values don't happen every time. I was only able to reproduce this once when connected to the debugger. But I imagine that every time it happens, it's probably caused by this timeout in one of the calibration values.

@julianoes That'd be cool to try. Let me know how I can help.

@JonasVautherin
Copy link
Collaborator

@julianoes is there an easy and ugly way to add a super long timeout, just to confirm that this is the root cause? From discussions with @douglaswsilva, it seems like they can reproduce it reasonably often (like 1/5th of the time) in their setup.

@douglaswsilva
Copy link
Collaborator Author

douglaswsilva commented Jan 6, 2020

@julianoes If there's a place I can increase the timeout in the mavsdk, let me know. I can do it locally, build, and test it to see if it fixes the issue.

@douglaswsilva
Copy link
Collaborator Author

@julianoes Thanks, I'll take a look into this.

@douglaswsilva
Copy link
Collaborator Author

douglaswsilva commented Jan 30, 2020

@julianoes @JonasVautherin Our test engineering is currently testing the fix for this: https://github.com/mavlink/MAVSDK/pull/972/files. Thanks Julian!

On that note, I noticed another issue that I'm double testing and believe it shouldn't be hard to fix, but I'd need your guys help.
I think if you subscribe to drone.telemetry.health before the drone is connected and then connect the drone, it will not update the values.

As I said, I'm going to double check this, but let me know if you spot anything weird.

@julianoes
Copy link
Contributor

@JonasVautherin
Copy link
Collaborator

@julianoes I had a quick look at it, and I don't understand your comment. Shouldn't health be advertised on each heartbeat, as per process_heartbeat here?

@julianoes
Copy link
Contributor

Some of the health information is based on parameters that are queried. That is only a "workaround for now" but that's why the data can be stale.

@JonasVautherin
Copy link
Collaborator

And disable() will be called when the drone is discovered? 🤔

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants