TBH, a browser has a million other potential leaks, exploits, and denial-of-service vectors to care about. Unlike a markup language with carefully crafted rules about what to expand in which context, a browser providing a Turing-complete language trivially runs into infinite loops or recursion, and can at best terminate JS code heuristically, using laborious best-effort methods such as execution traces and watchdog timers. Moreover, JS syntax itself, and how it's represented in markup, unnecessarily gives rise to obfuscation and escaping attacks. On top of that, the browser plays fast and loose with an ever-expanding, potentially Turing-complete styling language with super complicated syntax and constraint-based layout models, with a complete lack of formal semantics or any other formal reasoning. CSS itself can contain HTML literals in several places, such as data URIs and the content property, with questionable escaping, but I guess this is par for the course when CSS is basically just tunneling yet another syntax through HTML attributes and elements. Content Security Policy is just bolted on to limit this pointless, careless, and gratuitous syntax proliferation, and of course it must invent its own little item-value microsyntax to be tunneled through HTTP headers and HTML header tags.
Considering these glaring and obvious deficits weren't discussed at the same time, I assume those who brought up XXE and suggested SGML/XML could leak documents or smuggle executable instructions did so in bad faith, or were repeating hearsay to sound clever or something.
What it did do is create a perception that XML is insecure. This hastened its demise.
> It's trivial to mitigate security risk arising out the use of entities.
Obviously. However, in practice, most implementations historically did not mitigate it. At least not by default.
The XML spec bears some responsibility for this, since it was not explicit about recommending secure defaults.
Regardless, JSON won partially because it didn't have the attack surface, so people didn't have to worry. XML being theoretically easy to secure means nothing when, in practice, implementations made it difficult.
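To illustrate what a "secure default" could look like, here is a minimal sketch in Python that simply refuses any document carrying a DOCTYPE, which rules out entity declarations (and therefore XXE and entity-expansion tricks) before they are ever processed. The names `DoctypeForbidden` and `parse_without_dtd` are made up for this example, and rejecting DTDs outright is just one possible policy, not something the XML spec prescribes.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document contains a DOCTYPE declaration (hypothetical name)."""


def parse_without_dtd(data: bytes) -> list:
    """Return element names in document order; refuse any DOCTYPE.

    Illustrative helper only: a real application would build a tree
    or dispatch events instead of collecting names.
    """
    names = []
    parser = xml.parsers.expat.ParserCreate()

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities can only be declared inside a DTD, so bailing out here
        # means no entity declaration is ever seen by the parser.
        raise DoctypeForbidden("DOCTYPE declarations are not allowed")

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(data, True)  # True marks this as the final chunk
    return names
```

Plain documents go through unchanged, while anything that tries to declare an entity fails up front, which is roughly the "don't have to worry" property people got for free with JSON.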
All the various metadata formats are kind of weird. IIM (less popular now but still sometimes seen in JPEG files; it was originally for news organizations) is even weirder than Exif. My favourite part is how you specify that it's UTF-8 by adding the ISO 2022 escape code in a field. Like wut.
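For anyone curious what that looks like on the wire, here is a rough sketch in Python. It assumes you already have the raw IIM datasets in hand (getting them out of a JPEG is not shown), and it only handles standard-length datasets. The function names are invented for the example. The field in question is dataset 1:90 (Coded Character Set), and the UTF-8 marker is the three bytes ESC % G.

```python
import struct

# ISO 2022 escape sequence ESC % G, which selects UTF-8.
UTF8_ESCAPE = b"\x1b%G"


def iter_datasets(iim: bytes):
    """Yield (record, dataset, value) from raw IIM bytes.

    Sketch only: each dataset is 0x1C, record number, dataset number,
    a 2-byte big-endian length, then the value. Extended-length
    datasets (high bit of the length set) are not handled here.
    """
    pos = 0
    while pos + 5 <= len(iim) and iim[pos] == 0x1C:
        record, dataset, length = struct.unpack_from(">BBH", iim, pos + 1)
        if length & 0x8000:
            break  # extended length: out of scope for this sketch
        start = pos + 5
        yield record, dataset, iim[start:start + length]
        pos = start + length


def declares_utf8(iim: bytes) -> bool:
    """True if dataset 1:90 carries the UTF-8 escape sequence."""
    return any(
        record == 1 and dataset == 90 and value == UTF8_ESCAPE
        for record, dataset, value in iter_datasets(iim)
    )
```

So a block starting with the bytes `1C 01 5A 00 03 1B 25 47` is how a file says "the strings that follow are UTF-8".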