I'm rather puzzled by how bad the COCO ground truth is. This is the benchmark dataset for object detection? Wow. I would say Gemini's output is better than the ground truth in most of the example images.
Are there a few big things, many small things...? I'm curious what low-hanging fruit is left for fast SIMD matrix multiplication.
BTW, is it possible to have 10 of these connected to a single board/CPU?
Does anyone have a comparison between Hailo and, say, a mid or high-end GPU or a TPU?