Moondream 2 has been very useful for me: I've been using it to automatically label object detection datasets for novel classes and then distill a CNN that's orders of magnitude smaller but similarly accurate.
One oddity is that I haven't seen the claimed improvements beyond the 2025-01-09 tag - subsequent releases improve recall but degrade precision pretty significantly. It'd be amazing if object detection VLMs like this reported class confidences to better address this issue. That said, having a dedicated object detection API is very nice and absent from other models/wrappers AFAIK.
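For anyone who wants to check a precision/recall tradeoff like this between tags, here's a minimal sketch of how it could be measured against a small hand-labeled set. It isn't tied to Moondream's API: the box format `(x_min, y_min, x_max, y_max)`, the 0.5 IoU threshold, the greedy matching, and the function names are all my own assumptions for illustration.

```python
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in any consistent unit.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(
    preds: List[Box], truths: List[Box], iou_thresh: float = 0.5
) -> Tuple[float, float]:
    """Greedily match each prediction to its best unmatched ground-truth box.

    A prediction is a true positive if its best match has IoU >= iou_thresh.
    With no confidence scores available, predictions are visited in list order.
    """
    matched = set()
    true_pos = 0
    for p in preds:
        best_j, best_iou = -1, iou_thresh
        for j, t in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(p, t)
            if overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched.add(best_j)
            true_pos += 1
    # Convention chosen here: an empty denominator counts as a perfect score.
    precision = true_pos / len(preds) if preds else 1.0
    recall = true_pos / len(truths) if truths else 1.0
    return precision, recall


if __name__ == "__main__":
    truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0, 0, 10, 10), (50, 50, 60, 60), (21, 21, 30, 30)]
    print(precision_recall(preds, truths))
```

Running this on the output of two model tags over the same images would show whether a newer tag is trading precision for recall. Without per-box confidences there's no threshold to sweep, so you get a single operating point per tag instead of a full PR curve.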
Looking forward to Moondream 3 post-inference optimizations. Congrats to the team. The founder Vik is a great follow on X if that's your thing.
Thanks! If you could shoot me a note at vik@m87.ai with any examples of the precision/recall issues you saw I'd appreciate it a ton.
Re: chart understanding, there are a lot of different types of charts out there, but it does fairly well! We posted ChartQA benchmarks in the blog: it's on par with GPT5* and slightly better than Gemini 2.5 Flash.
* To be fair to GPT5, it's going to work well on many more types of charts/graphs than Moondream. To be fair to Moondream, GPT5 isn't really well suited to deploy in a lot of vision AI applications due to cost/latency.