msadowski (u/msadowski)

msadowski commented on Ask HN: How do robotics teams manage data and debugging today? · Posted by u/Lazaruscv

Lazaruscv · 5 months ago

Our current thinking is to focus heavily on automating triage across syslogs and bag/mcap files, since that’s where the hours really get burned, even for experienced folks. For interpretation, we see it more as an assistive layer (e.g., surfacing “likely causes” or linking to past incidents), rather than trying to replace domain expertise.

Do you think there are specific triage workflows where even a small automation (say, correlating error timestamps across syslog and bag files) would save meaningful time?

msadowski · 5 months ago

One thing that comes to mind is checking the timestamps across sensors and other topics. Two cases come to mind:

* I was setting up Ouster lidar to use gos time, don’t remember the details now but it was reporting the time ~32 seconds in the past (probably some leap seconds setting?)

* I had a ROS node misbehaving in some weird ways - it turned out there was a service call to insert something into db and for some reason the db started taking 5+ minutes to complete which wasn’t really appropriate for a blocking call

I think the timing is one thing that needs to be consistently done right on every platform. The other issues I came across were very application specific.

msadowski commented on Ask HN: How do robotics teams manage data and debugging today? · Posted by u/Lazaruscv

Lazaruscv · 5 months ago

This is super insightful, thank you for laying it out so clearly. Your point about the error surfacing way after it first occurred is exactly the sort of issue we’re interested in tackling. Foxglove is doing a great job with visualization and aggregation; what we’re thinking is more of a complementary diagnostic layer that:

• Correlates syslogs with mcap/bag file anomalies automatically

• Flags when a hardware failure might have begun (not just when it manifests)

• Surfaces probable root causes instead of leaving teams to manually chase timestamps

From your experience across 50+ clients, which do you think is the bigger timesink: data triage across multiple logs/files or interpreting what the signals actually mean once you’ve found them?

msadowski · 5 months ago

In my case, it’s definitely the data triage. Once I see the signal, I usually have ideas on what’s happening but I’ve been doing this for 11 years.

Maybe there could be value in signal interpretation for purely software engineers but I reckon it would be hard for such team to build robots.

msadowski commented on Ask HN: How do robotics teams manage data and debugging today? · Posted by u/Lazaruscv

msadowski · 5 months ago

Full disclosure, I work at Foxglove right now. Before joining, I spent over seven years consulting and had more than 50 clients during that period. Here are some thoughts:

* Combing through the syslogs to find issues is an absolute nightmare, even more so if you are told that the machine broke at some point last night

* Even if you find the error, it's not necessarily when something broke; it could have happened way before, but you just discovered it because the system hit a state that required it

* If combing through syslog is hard, try rummaging through multiple mcap files by hand to see where a fault happened

* The hardware failing silently is a big PITA - this is especially true for things that read analog signals (think PLCs)

Many of the above issues can be solved with the right architecture or tooling, but often the teams I joined didn't have it, and lacked the capacity to develop it.

At Foxglove, we make it easy to aggregate and visualize the data and have some helper features (e.g., events, data loaders) that can speed up workflows. However, I would say that having good architecture, procedures, and an aligned team goes a long way in smoothing out troubleshooting, regardless of the tools.

msadowski commented on Ask HN: Who wants to be hired? (May 2025) · Posted by u/whoishiring

msadowski · 10 months ago

  Location: Prague, Czech Republic
  Remote: Yes
  Willing to relocate: No
  Technologies: ROS, Python, C++, Robotics, Px4, Ardupilot, Pixhawk 
  Résumé/CV: https://www.linkedin.com/in/mateuszsadowski/
  Email: mat[at]msadowski.ch

I have 11 years of experience making software for robots. For the past 7 years, I've been working as a consultant on various projects. I have experience building software for hardware platforms, from autonomous mobile robots through drones to heavy-metal industrial robots. These days, I have a slight preference for consulting projects but can consider a full-time position if there is a good match.

msadowski commented on OpenAI's new reasoning AI models hallucinate more techcrunch.com/2025/04/18... · Posted by u/almog

msadowski · a year ago

Anyone has any stories on companies overusing AI? I’ve had some very frustrating encounters already when non-technical people were trying to help by sending AI solution to the issue which totally didn’t make any sense. I liked how the researchers in this work [1] prose calling LLM output “Frankfurtian BS”. I think it’s very fitting.

[1] https://ntrs.nasa.gov/citations/20250001849