An interesting follow-up is to use explainable AI (XAI) techniques to investigate which features in an image the classifier relies on to make its decisions. Saliency maps work well for images. When I was playing around with this, the binary classifier I trained from scratch to distinguish cats from dogs ended up basically looking only at the eyes. Enough images in the dataset showed cats with visible, open eyes, and the vertical slit pupil is an excellent predictor. It was an interesting lesson in how much the training data matters.
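For anyone who wants to try this, here is a minimal sketch of one simple way to get a saliency-style heatmap: occlusion sensitivity, where you blank out one tile of the image at a time and record how much the classifier's score drops. This is not necessarily the method I used above, and everything named here is an assumption for illustration: `occlusion_saliency` is a hypothetical helper, and `toy_predict` is a made-up stand-in for a trained model that deliberately reacts to only one small region, mimicking the "only looks at the eyes" behaviour.

```python
import numpy as np


def occlusion_saliency(predict, image, patch=4, fill=0.0):
    """Return a heatmap of score drops when each patch x patch tile is occluded.

    `predict` is any callable mapping a 2D image array to a scalar score.
    """
    height, width = image.shape
    base_score = predict(image)
    heat = np.zeros((height, width))
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            occluded = image.copy()
            occluded[top:top + patch, left:left + patch] = fill
            # A large drop means the model depended on this tile.
            drop = base_score - predict(occluded)
            heat[top:top + patch, left:left + patch] = drop
    return heat


def toy_predict(image):
    """Hypothetical stand-in for a classifier: it only 'sees' one 4x4 region."""
    return float(image[4:8, 8:12].mean())


rng = np.random.default_rng(0)
image = rng.uniform(0.5, 1.0, size=(16, 16))  # fake grayscale image
heat = occlusion_saliency(toy_predict, image)

# Only the region the toy model depends on shows a non-zero score drop.
rows, cols = np.nonzero(heat)
print("salient rows:", rows.min(), "to", rows.max())
print("salient cols:", cols.min(), "to", cols.max())
```

With a real model you would pass its prediction function as `predict` and overlay `heat` on the input image. Gradient-based saliency is usually faster, but occlusion needs nothing beyond the ability to call the model.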
Relevant: https://distill.pub/2017/feature-visualization/