Considering what data? All queries sent to Gemini? Real users? A select few? Test queries from Google?
Does it include AI summaries of Google searches? Because if the data includes queries as simple as "How tall is Lee Pace?", that will obviously pull the median down, even if the top of the distribution uses many times more energy per query.
But even so, the median is not useful by itself. It tells us only that 50% of the measured queries were under 0.24 Wh. Omitting the mean obviously obscures policy-relevant information, but without more detail on the data it also obscures what I can do individually. Where do my queries fall relative to this median?
It makes the most sense to provide the entire distribution and examples of data points.
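To illustrate why the median alone misleads here: with a heavily right-skewed distribution, the median can look tiny while a small tail of expensive queries dominates total energy. The numbers below are entirely made up for illustration, not real Gemini data.

```python
import statistics

# Synthetic, hypothetical per-query energy figures in watt-hours:
# 90 cheap queries, 9 moderate ones, and 1 very expensive outlier.
queries_wh = [0.1] * 90 + [5.0] * 9 + [100.0]

median = statistics.median(queries_wh)  # 0.1 Wh: the "typical" query looks tiny
mean = statistics.mean(queries_wh)      # 1.54 Wh: the tail dominates
total = sum(queries_wh)                 # 154.0 Wh

# Share of total energy consumed by the top 10% of queries.
tail_share = sum(q for q in queries_wh if q >= 5.0) / total

print(f"median = {median} Wh, mean = {mean:.2f} Wh")
print(f"top 10% of queries account for {tail_share:.0%} of total energy")
```

In this toy case the median is 15x smaller than the mean, and the top 10% of queries account for roughly 94% of the total energy, which is exactly the kind of structure a lone median headline would hide.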
This is an example of my least favorite style of feigned insight: redefining a term into meaninglessness just so you can say something that sounds different while not actually saying anything new.
Yes, if you redefine "hallucination" from "produce output containing detailed information despite that information not being grounded in external reality, in a manner distantly analogous to a human reporting sense data produced by a literal hallucination rather than the external inputs that are presumed normally to ground sense data" to simply "produce output", it's true that all LLMs do is "hallucinate", and that "hallucinating" is not an undesirable behavior.
But you haven't said anything new about the thing that was called "hallucination" by everyone else, or about the thing (LLM output in general) that you have called "hallucination". Everyone already knew that producing output wasn't undesirable. You've just taken the label conventionally attached to a bad behavior, attached it to a broader category that includes all behavior, and used the power of equivocation to make something that sounds novel without saying anything new.