gwbas1c · a year ago
> In practice microservices can be just as tough to wrangle as monoliths.

What's worse: Premature scalability.

I joined one project that failed because the developers spent so much time on scalability without realizing that some basic optimization of their ORM queries would have been enough for a single instance to handle any predictable load.
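To illustrate the kind of basic ORM optimization that makes this difference, here's a hedged sketch with hypothetical Django-style models (not that project's actual code or ORM): the classic N+1 query fix via eager loading.

    from django.db import models

    class Customer(models.Model):
        name = models.CharField(max_length=100)

    class Order(models.Model):
        customer = models.ForeignKey(Customer, on_delete=models.CASCADE)

    # The classic N+1: one query for the orders, plus one more per order
    # when order.customer is accessed.
    def slow_report():
        for order in Order.objects.all():
            print(order.customer.name)

    # Eager loading turns the same loop into a single JOINed query.
    def fast_report():
        for order in Order.objects.select_related("customer"):
            print(order.customer.name)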

Now I'm wrangling a product that has premature scalability. It was designed with a lot of loosely coupled services and a high degree of flexibility, but it's impossible to understand and maintain with a small team. A lot of "cleanup" often results in merging modules or cutting out abstraction.

princevegeta89 · a year ago
The company I'm at is a well-funded startup that doesn't receive humongous traffic at all. Yet the so-called engineers in the early days ended up deciding to split every little piece of functionality into a microservice.

Now we have 20+ microservices that are set up together in a fucked up way. Today, every engineer on our 150+ person engineering team struggles to implement and get trivial stuff over the finish line.

Many tasks require making code changes in multiple codebases, and there are way too many moving parts. The knowledge overhead required to set up and test shit locally is too high as well. The documentation goes obsolete so quickly that people spend an obscene amount of time reaching out to other teams and running in circles to get unblocked. Our productivity would literally 5x if we just had 3 or 4 services overall. Even 1 giant service with clear abstractions between teams would have worked well, actually.

Yet, for the flashiness and to keep sounding cool, the folks at our company keep living with the pain. As an IC, I just fucking do my work 5 hours a day and keep reminding myself to ignore whatever horrors I see. Seems to be going well.

gwbas1c · a year ago
IMO: Either advocate to merge services, or find a new job.

As professionals, part of our job is to advocate for the right solution. Often there is a leader who is emotional about their chosen framework / pattern / architecture / library. It's merely a matter of speaking truth to the right power, and then getting enough consensus to push on the leader to see the error in their ways (or to get the other leaders to push the bad leader out).

In your job, what you really need to do is point out to non-technical leaders that development of new features is too slow, given the current team size. Don't get too technical with them, but try to find high-level analogies. You can also work on your more direct leadership, but that requires feeling out their motivations.

notjoemama · a year ago
I’ve started taking 5 layers out of a Rails app, back to MVC. It’s so much faster now I actually feel bad, and I’m not the one that built the app in the first place. The premise during its construction was that it would scale to millions of active users. It…is not doing that in the wild…
mattnewton · a year ago
As an ex-FAANG engineer myself, I have never advocated for more services; I've usually pushed for unifying repos and multiple build targets on the same codebase. I am forever chasing the zen of google3 the way I remember it.

If anything my sin has been forgetting how much engineering went into supporting the monorepo at Google and duo-repo at Facebook when advocating for it.

paperplatter · a year ago
Do FAANG engineers normally advocate for more services instead of fewer? I haven't gotten that impression.
aleksiy123 · a year ago
Smaller services, but not necessarily more binaries.

The current direction, I think, is to build composable services that can be run together or separately, where a "service" is a logical grouping behind an RPC interface.

Here is some public work in this direction from Google:

https://serviceweaver.dev/
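Roughly the idea, as I understand it (this is not Service Weaver's actual API, which is Go; just a hypothetical Python-flavoured sketch of one logical service that can be deployed together or split apart later):

    from abc import ABC, abstractmethod

    # A component is defined by its interface; callers never know whether a
    # call stays in-process or crosses the network.
    class Quoter(ABC):
        @abstractmethod
        def quote(self, sku: str) -> int: ...

    class LocalQuoter(Quoter):
        def quote(self, sku: str) -> int:
            return 42  # runs inside the same binary

    class RemoteQuoter(Quoter):
        def __init__(self, base_url: str) -> None:
            self.base_url = base_url

        def quote(self, sku: str) -> int:
            # in a split deployment this would be an RPC to another binary
            raise NotImplementedError

    # Business logic targets the interface only; whoever wires up the
    # deployment decides monolith vs. separate processes.
    def checkout(quoter: Quoter, sku: str) -> int:
        return quoter.quote(sku)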

paulddraper · a year ago
Normally more services and fewer repos.

From my experience.

ecshafer · a year ago
There seems to be more interest in building monorepo support now: some tools, startups, etc. I would bet GitHub is working on increasing support for large repos as well. So I think Google was ahead of the curve there.
recursivecaveat · a year ago
What were the two facebook repos? I can't find any reference to them.
simscitizen · a year ago
The main ones were www which contained most of the PHP code and fbcode which contained most of the other backend services. There were actually separate repos for the mobile apps also.
andy_ppp · a year ago
Elixir + Phoenix is so great at this, with contexts and eventually umbrella apps. It's so easy to turn things into apps that receive messages, and into services with a structure. I'm amazed it isn't more popular, really, given it's great at everything from runtime analysis to RPC/message passing to things like sockets/channels/presence and LiveView.
frompdx · a year ago
I've been picking up Elixir and the Phoenix framework and I'm impressed so far. Phoenix is a very productive framework. Under the hood Elixir is very Lisp-like, but the syntax is more palatable for developers who are put off by Lisp's.

Why isn't it more popular? It's always an uphill battle to introduce a new programming language or framework if BIGNAME doesn't use it.

sethammons · a year ago
At the Elixir shop I was at, folks just REPL'd into prod to do work. Batshit insanity to me. Is that the Elixir way? Are you able to easily lock down all writes and all side effects and be purely read-only? If so, they never embraced that.
andrewmutz · a year ago
> repl'd into prod to do work

Like for debugging production problems and fixing customer data? Or for normal development?

If it's the former, that's a great use of the technology; if it's the latter, it sounds insane.

ryoshu · a year ago
It has hot-swappable code. For skilled practitioners it's fine.
andy_ppp · a year ago
It depends on the situation: if something is broken in a live system and you can log in and introspect the real thing, this is awesome. Obviously there are trade-offs, and you might potentially break things further!
mushufasa · a year ago
Would Django's concept of an 'app' fit your definition of modular monoliths?

https://docs.djangoproject.com/en/5.1/ref/applications/

In a nutshell, each Django project is composed of 'apps', and you can 'install' multiple apps together. They can come with their own database tables + migrations, but they all live under the same gunicorn, on the same infra, within the same codebase. Many Django plugins are set up as an 'app'.
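For instance, a minimal settings.py wiring several apps into one deployable project (the first-party app names here are placeholders, not a real project):

    # settings.py (excerpt): several apps installed into one Django project,
    # all served by the same process.
    INSTALLED_APPS = [
        "django.contrib.admin",
        "django.contrib.auth",
        "django.contrib.contenttypes",
        # first-party apps, each with its own models, migrations, and views
        "billing",
        "catalog",
        "accounts",
        # a third-party plugin packaged as an app
        "rest_framework",
    ]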

halfcat · a year ago
Django apps can be the modular part of a modular monolith, but it requires some discipline. Django apps do not have strong boundaries within a Django project.

Often there will be foreign keys crossing app boundaries, which makes one wonder why there are multiple apps at all.

In fact some people opt for putting everything into a single app [0]. Others opt for no app [1].

Django apps are good for installing Django packages into Django projects. But there's no firm mechanism that enforces any real separation. It's just other Python modules in a different folder (that you can just import into your other app).

The rule would be something like: if you can't pip install your Django app into a project, it's probably too weak a boundary (that might be a bit too extreme, but if it is, it's not too far off).

[0] https://careers.doordash.com/blog/tips-for-building-high-qua...

[1] https://noumenal.es/notes/django/single-folder-layout/
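To make the weak-boundary point concrete, here's a hedged sketch (hypothetical billing and catalog apps) of the cross-app foreign key described above; nothing in Django stops it:

    # billing/models.py -- the import below reaches straight into another app,
    # so the "boundary" between billing and catalog is purely conventional.
    from django.db import models

    from catalog.models import Product

    class Invoice(models.Model):
        product = models.ForeignKey(Product, on_delete=models.PROTECT)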

ljm · a year ago
I can't help but feel like the author has taken some fairly specific experiences with microservice architecture and drawn a set of conclusions that still results in microservices, but in a monorepo. There's nothing about microservices that suggests you have to go to the trouble of setting up K8s, service meshes, individual databases per service, RPC frameworks, and so on. It's all cargo culting and all this...infra... simply lines the pockets of your cloud provider of choice.

The end result in the context of a monolith reads more like domain driven design with a service-oriented approach and for most people working in a monolithic service, the amount of abstraction you have to layer in to make that make sense is liable to cause more trouble than it's worth. For a small, pizza-sized team it's probably going to be overkill where more time is spent managing the abstraction instead of shipping functionality that is easy to remove.

If you're going to pull in something like Bazel or even an epic Makefile, and the end result is that you are publishing multiple build artifacts as part of your deploy, it's not really a monolith any more, it's just a monorepo. Nothing wrong with that either; certainly a lot easier to work with compared to bouncing around multiple separate repos.

Fundamentally I think that you're just choosing if you want a wide codebase or a deep one. If somehow you end up with both at the same time then you end up with experiences similar to OP.

paperplatter · a year ago
I think the assumption here is that "microservices" means each team is dealing with lots of services. Sometimes it's like that. But if you go by the "one service <=> one database" rule of thumb, there will probably be 1-3 services per team. And when you want to use other teams' stuff, you'll be thankful it's across an RPC. The most basic reason: you might not agree with that other team on what language to write in.

It'd really help to see a concrete example of a modular monolith compared to the microservice equivalent.

bluGill · a year ago
The thing microservices give you is an enforced API boundary. OOP classes tried to do that with public/private but fail, because something that's public for the rest of this module should be private outside of it. I've written many classes thinking they were for my module only, and then someone discovered and abused them elsewhere. Now their code is tightly coupled to mine in a place I never intended to be coupled.

I don't know the answer to this; it's just a problem I'm fighting.

overlordalex · a year ago
What you'd want is Architecture Unit Tests; you can define in code the metastructures and relationships, and then cause the build to fail if the relationship is violated.

The classic is ArchUnit in Java:

    @ArchTest
    static final ArchRule  layer_dependencies_are_respected = layeredArchitecture().consideringAllDependencies()

            .layer("Controllers").definedBy("com.tngtech.archunit.example.layers.controller..")
            .layer("Services").definedBy("com.tngtech.archunit.example.layers.service..")
            .layer("Persistence").definedBy("com.tngtech.archunit.example.layers.persistence..")

            .whereLayer("Controllers").mayNotBeAccessedByAnyLayer()
            .whereLayer("Services").mayOnlyBeAccessedByLayers("Controllers")
            .whereLayer("Persistence").mayOnlyBeAccessedByLayers("Services");
The problem I've had with these tests is that the run time is abysmal, meaning they only really get run as part of CI, and devs complain that failures come too late in the process.

Also, I'm on mobile so apologies for the badly formatted code - original can be found here: https://github.com/TNG/ArchUnit-Examples/blob/main/example-j...

Twisol · a year ago
Different languages handle this in different ways, but the most common seems to be adding access controls to the class itself, rather than just its members.

For instance, Java lets you say "public class" for a class visible outside its package, and just "class" otherwise. And if you're using Java modules, introduced in Java 9 (nobody is though :( ), you can choose which packages are exported to consumers of your module.

In a similar vein, Rust has a `pub` access control that can be applied to modules, types, functions, and so on. A `pub` symbol is accessible outside the current crate; non-pub symbols are only accessible within one crate.

Of course, lots of languages don't have anything like this. The biggest offender is probably C++, although once its own version of modules is widely supported, we'll be able to control access somewhat like Java modules and Rust crates, with "partitions" serving the role of a (flattened) internal package hierarchy. Right now, if you do shared libraries, you can tightly control what the linker exports as a global symbol, and therefore control what users of your shared library can depend on -- `-fvisibility=hidden` will be your best friend!

steveklabnik · a year ago
> A `pub` symbol is accessible outside the current crate;

This is not universally true; it's more that pub makes it accessible to the enclosing scope. Wrapping in an extra "mod" so that this works in one file:

    mod foo {
        mod bar {
            pub fn baz() {
                
            }
        }
        
        pub fn foo() {
          bar::baz();  
        }
    }
    
    fn main() {
        // this is okay, because foo can call baz
        foo::foo();

        // this is not okay, because bar is private, and so even though baz is marked pub, its parent module isn't
        foo::bar::baz();

    }

jillesvangurp · a year ago
Modules are almost as old as compiler technology. A good module structure is a time-proven way to deal with growing code bases. If you know your SOLID principles, they apply to most module systems at any granularity. It doesn't matter whether they are C header files, functions, Java classes or packages, libraries, Python modules, microservices, etc.

I like to think of this in terms of cohesiveness and coupling rather than the SOLID principles. Much easier to reason about and it boils down to the same kind of outcomes.

You don't want a lot of dependencies on other modules (tight coupling) and you don't want to have one module do too many things (lack of cohesiveness). And circular dependencies between modules are generally a bad idea (and sadly quite common in a lot of code bases).

You can trivially break dependency cycles by introducing new modules. This is both good and bad. As soon as you have two modules, you will soon find reasons to have three, four, etc. This seems to be true with any kind of module technology. Modules lead to more modules.
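A toy sketch of that (hypothetical module names): if orders and customers import each other, pulling the shared piece into a third module removes the cycle.

    # Before: orders.py imports customers.py for a pricing helper, and
    # customers.py imports orders.py for the same reason -- a cycle.

    # pricing.py -- the extracted shared piece; it depends on neither module.
    def discount(total: float, loyalty_years: int) -> float:
        return min(0.2, 0.02 * loyalty_years) * total

    # orders.py -- now depends only on pricing
    from pricing import discount

    # customers.py -- now depends only on pricing
    from pricing import discount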

That's good when modules are cheap and easy. E.g. most compilers can deal with inlining, so things like functions don't have a high cost. Small functions, classes, etc. are easy to test and easy to reason about. Being able to isolate modules from everything else is a nice property. If you stick to the SOLID principles, you get to have that.

But lots of modules are a problem with microservices, because a microservice is an expensive kind of module relative to the alternatives. Having a lot of them isn't necessarily a great idea. You get overhead in the form of build scripts, separate deployments, network traffic, etc. That means increased cost, performance issues, increased complexity, long build times, etc.

Add circular dependencies to the mix and you get extra headaches from those as well (which one do you deploy first?). Things like GraphQL (a.k.a. doing database joins outside the database) make this worse (coupling).

And of course many companies confuse their org chart with their internal architecture and run into all sorts of issues when those no longer align; that's Conway's law. If you have one team per service, that's probably going to be an issue. If you have more services than teams, you are over-engineering. If you struggle to have teams collaborate on a large code base, you definitely have modularization issues, and microservices aren't the solution.

paperplatter · a year ago
"To get similar characteristics from a monolith, developers need:

    Incremental build systems

    Incremental testing frameworks

    Branch management tooling

    Code isolation enforcement

    Database isolation enforcement"
This sounds a lot like microservices, most of all the last point. Is the only difference that you don't use RPCs?

nine_k · a year ago
> the only difference that you don't use RPCs

But it's a huge difference. No RPC overhead. No lost / duplicate RPC messages. All logs can literally go to the same file (via e.g. simple syslog).

Local deployment is dead simple, and you can't forget to start any service. Prod deployment never needs to handle a mix of versions among deployed services.

Besides that, the build step is much simpler. Common libraries' versions can never diverge, because there's one copy for the whole binary (which can be a disadvantage sometimes, too). You can attach a debugger and follow the entire chain, even when it crosses module boundaries.

With that, you can make self-contained modules as small as it makes logical sense. You can pretty cheaply move pieces of functionality from one module to another if that makes better sense. It's trivially easy to factor out common parts into another self-contained module.

Still you have all the advantages of fast incremental / partial builds, contained dependencies, and some of the advantages of isolated / parallel testing. But most importantly, it preserves your sanity by limiting the scope of most changes to a single module.

paperplatter · a year ago
There would be a mix of versions, managed via branches.

The part about debuggability sounded appealing at first, but if the multiple services you want to run are truly that hard to spin up locally, it won't be any easier as a monorepo. The first thing you'll do is pass in 30 flags for the different databases to use. If these were RPCs, you could use some common prod or staging instance for things you don't want to bother running locally.

pjmlp · a year ago
Which tends to be available in all major compiled languages.

Don't use scripting, monkey-patching-friendly languages for applications.

paperplatter · a year ago
If by scripting / monkey-patching friendly you mean JavaScript or Python, nah, I'm going to use that.