How do you figure? Rendering a single TrueType letter takes dozens of conditional branches, uses complex floating-point math, and touches tens of kilobytes of cache. And there are many thousands of them in the GUI. Even if you have a cache of rendered glyphs, there is still a lot of memory pressure from copying data back and forth.
Compared to that, retained mode GUIs should be much more efficient. Even 50 checks are nothing compared to the ability to avoid rendering one line.
Note this is all predicated on the screen being mostly unchanging. For example, as I am typing this comment, everything is stationary except for the single line where I am typing.
> The screen is already going to be re-rendered.
This is the key! As you said, if you have to re-render the whole screen anyway, then retained mode GUIs will have to maintain draw state and re-render every element every time anyway -- strictly worse performance.
But most regular apps, like editors and chat clients and web browsers, do not re-render the whole screen every time. They only do one thing at a time.
For empirical evidence, look at the WinAPI design: it is a retained mode GUI, with lots of effort dedicated to figuring out invalidated rectangles and which controls need to be re-rendered. This was the only way to make a GUI that was performant enough.
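To make the invalidated-rectangle idea concrete, here is a minimal sketch in Python. It is a toy model, not the WinAPI: `Rect`, `Window`, `invalidate`, and `paint` are hypothetical names invented for illustration, and the "screen" is assumed to be 50 text lines, each 20 pixels tall.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int
    y: int
    w: int
    h: int

    def intersects(self, other):
        # Two axis-aligned rectangles overlap unless one lies entirely
        # beside, above, or below the other.
        return (self.x < other.x + other.w and other.x < self.x + self.w
                and self.y < other.y + other.h and other.y < self.y + self.h)


class Window:
    """Toy retained-mode window (hypothetical): widgets persist between frames."""

    def __init__(self):
        self.widgets = {}   # widget name -> Rect
        self.dirty = []     # regions of the screen that are stale

    def add(self, name, rect):
        self.widgets[name] = rect
        self.invalidate(rect)

    def invalidate(self, rect):
        # Record that this region needs repainting.
        self.dirty.append(rect)

    def paint(self):
        # Repaint only widgets that overlap a dirty region, then reset.
        repainted = [name for name, r in self.widgets.items()
                     if any(r.intersects(d) for d in self.dirty)]
        self.dirty.clear()
        return repainted


win = Window()
for i in range(50):
    win.add(f"line{i}", Rect(0, i * 20, 800, 20))
win.paint()                             # first frame paints everything once
win.invalidate(Rect(0, 200, 800, 20))   # the user typed on line 10
repainted = win.paint()                 # 50 cheap checks, one line repainted
```

The second `paint()` still runs an overlap check against all 50 widgets, but only the one line that changed gets the expensive repaint, which is the trade-off described above.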
There's a best case for retained mode GUIs and a best case for ImGUIs. ImGUIs excel when lots is changing. In a scrolling mobile app the entire page is being re-rendered constantly as the user moves the page up and down. GUI widgets get created and deleted, data gets marshalled in and out, and various algorithms are applied to try to minimize re-creating stuff. None of that code needs to be executed in ImGUI-style code.
There are tons of cases where ImGUIs win in perf. In fact, the reason they are so popular for game dev is exactly because they vastly outperform retained mode GUIs. There's a reason there are 100s of books and 1000s of blog posts and articles trying to get retained mode GUIs to run smoothly. They almost always fail without lots and lots of specialized client-side code to try to minimize GUI widget object creation/deletion, reuse objects, minimize state changes, etc. All that code disappears in an ImGUI, and they run at 60fps whereas the retained mode GUIs struggle to get even 20fps consistently as they hiccup and sputter.
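For anyone who hasn't seen the immediate-mode style, here is a rough sketch in Python of why the widget bookkeeping disappears. It is not any real ImGUI library's API: `Frame`, `button`, and `draw_ui` are made-up names, and the layout is assumed to be a simple top-to-bottom stack of 20-pixel rows.

```python
class Frame:
    """Per-frame context (hypothetical): input in, draw commands out."""

    def __init__(self, mouse_x, mouse_y, clicked):
        self.mouse_x, self.mouse_y, self.clicked = mouse_x, mouse_y, clicked
        self.draw_commands = []
        self.cursor_y = 0   # simple top-to-bottom layout

    def button(self, label, width=100, height=20):
        # Lay out, hit-test, and emit a draw command in a single call.
        y = self.cursor_y
        self.cursor_y += height
        hovered = (0 <= self.mouse_x < width
                   and y <= self.mouse_y < y + height)
        self.draw_commands.append(("button", label, y, hovered))
        return hovered and self.clicked


def draw_ui(frame, items):
    # The UI is just a function of the application data, run every frame.
    removed = None
    for item in items:
        if frame.button(f"remove {item}"):
            removed = item
    return removed


items = ["a", "b", "c"]
frame = Frame(mouse_x=10, mouse_y=25, clicked=True)   # click lands on row 2
removed = draw_ui(frame, items)
if removed is not None:
    items.remove(removed)

# Next frame: the list shrank, and no widget object had to be destroyed.
next_frame = Frame(mouse_x=10, mouse_y=25, clicked=False)
draw_ui(next_frame, items)
```

There is no widget tree to keep in sync with `items`: when an entry is removed, the next frame simply never asks for its button. The cost is that every button is laid out and emitted again on every frame, which is the trade-off this thread is arguing about.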
Note that Uber/Lyft do something the taxi companies cannot: they scale to meet demand. If 1000 drivers are needed between 6pm and 8pm then they'll likely get 1000 drivers. Taxi companies can't do this, as they'd lose money on drivers and cars when not all of them are in use.