First, MFEs solve organizational issues by cheaply offering release independence. One example is teams whose working hours don't overlap: triaging and resolving binary release blockers across time zones is hard to do correctly and onerous on oncall work-life balance. Another is when new products want to move quickly without triggering global binary rollbacks or forcing too fast a release cadence on mature products that have a lower tolerance for outages or older test pyramids.
Second, MFEs are a pragmatic choice because they can proceed independently of the mono/microrepo decision and of any modularization investment, both of which are several times more costly in the cases I've seen. Most infra teams are not ivory towers, and MFEs are high bang-for-buck.
Finally, MFEs are a tool for solving fundamental scaling issues with continuous development. At a certain commit volume, races between in-flight changes cause bugs, build breakages, or test failures, and flaky tests make it impossible to (cheaply) certify a last known good commit at cut time. You can push both of these scaling limits out considerably with good feature flag/health-mediated releases and mature CI, but having an additional tool in the kit lets you pick which to invest in based on ROI.
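To make "feature flag/health-mediated releases" concrete, here's a minimal sketch, not any particular vendor's API; fetchFlagConfig, currentErrorRate, and the 1% error budget are hypothetical stand-ins:

    // Hypothetical health-mediated flag check (TypeScript).
    interface FlagConfig {
      rolloutPercent: number; // 0..100
      killSwitch: boolean;    // flipped by automated health checks or an oncall human
    }

    // Assumed helpers -- illustrative names, not a real library.
    declare function fetchFlagConfig(flag: string): Promise<FlagConfig>;
    declare function currentErrorRate(flag: string): Promise<number>;

    // Stable per-user bucketing so a user stays in or out of the rollout (FNV-1a).
    function bucket(userId: string): number {
      let h = 2166136261;
      for (let i = 0; i < userId.length; i++) {
        h ^= userId.charCodeAt(i);
        h = Math.imul(h, 16777619);
      }
      return Math.abs(h) % 100;
    }

    async function isEnabled(flag: string, userId: string): Promise<boolean> {
      const cfg = await fetchFlagConfig(flag);
      if (cfg.killSwitch) return false;                         // health check tripped
      if ((await currentErrorRate(flag)) > 0.01) return false;  // assumed 1% error budget
      return bucket(userId) < cfg.rolloutPercent;
    }

The point is that a bad change gets turned off via config rather than by rolling back the whole binary, which is part of what lets a single deployable absorb more commit volume.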
Advocating for modularity is nice, but I've never met an MFE advocate who didn't also want a more tree-shakeable, modular codebase. We should not jump to the conclusion that MFEs are bad or a "last resort" just because another solution better solves a partially overlapping set of problems, especially if that other solution doesn't solve many of the problems MFEs do, or requires significantly more work to solve them.
[1] Runtime JS isolation (e.g. of globals set by third-party libraries) is hard, and existing methods are either leaky abstractions like module federation or require significant infra work like iframing with DOM shims. CSS encapsulation is very hard in complex systems, and workarounds like shadow DOM have a11y and library/tooling interop issues. Runtime state sharing (so that not every MFE makes its own fetches/subscriptions for common data) is hard and prone to binary-skew bugs. Runtime dynamic linking of shared common code is hard to reason about, and static linking of common code can make the same lazy-loaded module go from 10kB to 1MB+ over the wire.
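For a flavor of the shared-code problem, here's a minimal sketch of a webpack 5 Module Federation remote config (names like checkoutMfe and ./src/Checkout are hypothetical); the shared block is exactly where the leaky abstraction and skew risk live:

    // webpack.config.ts -- hypothetical remote ("checkoutMfe") exposing one module
    import webpack, { type Configuration } from 'webpack';

    const config: Configuration = {
      plugins: [
        new webpack.container.ModuleFederationPlugin({
          name: 'checkoutMfe',
          filename: 'remoteEntry.js',
          exposes: { './Checkout': './src/Checkout' },
          // Shared deps are linked at runtime across MFEs. If the host and remotes
          // disagree on requiredVersion, you get either duplicate copies over the
          // wire or, with singleton: true, a version-mismatch failure at load time.
          shared: {
            react: { singleton: true, requiredVersion: '^18.2.0' },
            'react-dom': { singleton: true, requiredVersion: '^18.2.0' },
          },
        }),
      ],
    };

    export default config;

Anything not listed under shared is statically bundled into each MFE, which is where the 10kB-to-1MB+ inflation of a lazy-loaded module comes from.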
This is one of the biggest areas where Bazel falls short of non-Bazel tooling for us in the web development/Node.js world. The Node ecosystem has characteristics that strain Bazel's scaling (node_modules is 100k+ files and 1GB+), and Bazel's insistence on correct/reproducible builds is in a way its own enemy: Bazel needs to set up hundreds of thousands of file watchers to be correct, whereas the Node.js ecosystem's "let's just assume node_modules hasn't changed" is good enough most of the time. For us, many non-Bazel inner-devloop steps inflate from <5s to >60s once Bazel overhead is added, even after significant infra tuning.