Posted by u/niros_valtos 4 years ago
Ask HN: Mono-Repo or Multi-Repo?
I know that there is a debate about storing all source code in a mono-repo vs multiple repos. I am thinking about it from a security perspective: separating code into multiple repos reduces the risk of source code exposure/leakage and enables more granular access. However, maybe this isn't as high a risk as an insider threat or an account takeover that may inject malicious code, so setting up codeowners would do the job even in a mono-repo. What are your thoughts?
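
To make the codeowners idea concrete, here is a minimal sketch of path-based ownership in a mono-repo. The rule list, team names, and paths are all hypothetical, and the matching uses Python's `fnmatch` as a simplification; it does not reproduce the exact pattern semantics of any hosting provider's CODEOWNERS file. The only behavior borrowed from that convention is that a later rule overrides an earlier one.

```python
from fnmatch import fnmatch

# Hypothetical ownership rules: (path pattern, owning teams).
# As in a CODEOWNERS file, later rules take precedence over earlier ones.
RULES = [
    ("*", ["@org/platform"]),
    ("services/payments/*", ["@org/payments", "@org/security"]),
    ("infra/*", ["@org/sre"]),
]


def owners_for(path, rules=RULES):
    """Return the owners of the last rule whose pattern matches path."""
    matched = []
    for pattern, owners in rules:
        if fnmatch(path, pattern):
            matched = owners
    return matched


def required_reviewers(changed_paths, rules=RULES):
    """Collect every team that must approve a change touching these paths."""
    required = set()
    for path in changed_paths:
        required.update(owners_for(path, rules))
    return required
```

A change touching both `infra/main.tf` and `README.md` would need approval from both the SRE and platform teams under these assumed rules. This controls who can approve changes, not who can read the code, which is the part of the question that only separate repos (or per-path read permissions) address.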
aespinoza · 4 years ago
IMHO mono-repo vs multi-repo should be decided based on the sources of change for each component in a product. For example, the cloud components of a product usually change at the same pace and for the same reasons, so it makes sense to have them in a mono-repo, even in a microservices approach. But even in the cloud, certain components can change at a different pace and for different reasons, for example if you have a gRPC API that talks to your mobile app and a web API exposed to your customers.

I believe that components that move at a different pace and change for different reasons should not be in the same repo. It is difficult to set up CI/CD for different ways of deployment, especially if they are not changing at the same time.

Now, regarding security, it is important to keep different components of a product in different repos. This will give you the flexibility to manage a more restricted set of credentials and reduce the number of people that have access to each repo.

In the end it involves 3 things: 1) Sources of Change, 2) CI/CD Processes and 3) Security. You can definitely mix and match.
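
The CI/CD difficulty mentioned above usually comes down to deciding which pipelines a given change should trigger. A rough sketch of that decision, assuming each component lives under its own directory (the component names and paths here are made up for illustration):

```python
# Hypothetical layout: each component lives under its own directory.
COMPONENT_ROOTS = {
    "grpc-api": "services/grpc-api/",
    "web-api": "services/web-api/",
    "mobile": "apps/mobile/",
}


def affected_components(changed_paths, roots=COMPONENT_ROOTS):
    """Return the components whose directories contain a changed file."""
    affected = set()
    for path in changed_paths:
        for name, root in roots.items():
            if path.startswith(root):
                affected.add(name)
    return affected
```

If most commits come back with a single component, those components probably have separate sources of change; if they routinely come back together, that is an argument for keeping them in one repo.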

glacials · 4 years ago
It also depends on your deploy patterns. A microservice architecture embedded in a monorepo gives you false peace of mind that a breaking API change is okay because you’re changing both ends of the contract in the same commit.

But when that commit goes to be deployed and you don’t have atomic/transactional deploys across services, you get downtime between the first service’s deploy finishing and the second’s.
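
One common way around this, regardless of repo layout, is to make the consumer tolerate both versions of the contract until every producer has been deployed. A minimal sketch, assuming a hypothetical change that renames the field `name` to `full_name`:

```python
def parse_user(payload):
    """Read a user record that may come from the old or the new producer.

    Hypothetical change: the field "name" is being renamed to "full_name".
    """
    if "full_name" in payload:
        name = payload["full_name"]  # new contract
    elif "name" in payload:
        name = payload["name"]  # old contract, still live mid-deploy
    else:
        raise ValueError("payload has neither 'full_name' nor 'name'")
    return {"id": payload["id"], "full_name": name}
```

The old branch can be deleted in a later commit once no deployed service sends `name` anymore, so the breaking change is spread over several deploys instead of relying on one atomic commit.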

duped · 4 years ago
To address the security concerns:

- All code should require manual review from at least one other person (some orgs require 2+). It should be impossible to introduce a code change to any code base with a single compromised developer account. This is true of mono or multi repo.

- The only really dangerous shit (like private keys, admin credentials) should not be accessible or committed to repos at all, ever.

- I've never worked anywhere where a source code leak was a legitimate threat (note: not that it wasn't a concern, just that it doesn't normally have drastic consequences, your code isn't as special as you think)

- I have worked at places with silo'd repos with granular access and I won't do that again. I ask about it in technical interviews, and if your company is doing this it's not really a positive: making it harder for engineers to get work done for bad reasons is a surefire way to get a toxic engineering culture. Meanwhile, there are plenty of places that give us engineers access to everything.

So to me I don't see the utility of multi repo from a security perspective - I'd argue the infrastructure problem it solves is the same as a package manager (having portions of systems moving at different rates without breaking dependent systems). If you have this problem in your code base then a multi repo org makes a lot of sense.
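
The review rule from the first bullet can be stated as a small check. This is only a sketch of the policy, with a hypothetical input shape (a list of approving usernames); in practice it would be enforced by the hosting platform's branch protection rather than by hand-written code:

```python
def is_mergeable(author, approvals, required=1):
    """Return True if enough people other than the author approved.

    approvals is an iterable of reviewer usernames (hypothetical input shape).
    """
    # A set ignores duplicate approvals; the author's own approval never counts.
    independent = {reviewer for reviewer in approvals if reviewer != author}
    return len(independent) >= required
```

With this rule a single compromised account cannot land a change on its own, since self-approval is excluded and repeated approvals from one reviewer count once.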

typedef_struct · 4 years ago
If it releases/deploys/versions together, it shares a repo. Otherwise, meh.
Chyzwar · 4 years ago
Monorepos are usually better for security.

  - you can update dependencies of multiple components/modules in one commit/pr. In general, it would be easier to keep up with third party dependencies.

  - it is harder to monitor and audit multiple repositories than a single monorepo. An attacker can inject a malicious commit in one of the less active repositories and nobody will notice.

In most cases, leakage is the same. If one developer's PC gets compromised, and you do not use 2FA, the attacker can still get access to most of the code. In many cases, data is a more interesting target than your source code.

From a security perspective, in order: 1. Educate your developers on security. 2. Develop a threat model. 3. Monitor for anomalies. 4. Use 2FA. 5. Secure your CI/CD. 6. Keep dependencies up to date. 7. Run security scanning.
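
On the dependency point: with everything in one checkout, spotting components that have fallen behind is a single pass over the tree. A sketch of that check, assuming the pins have already been parsed out of each component's lockfile into a dictionary (the input format and names are hypothetical):

```python
from collections import defaultdict


def version_drift(pins):
    """Find dependencies pinned to more than one version.

    pins maps a component name to its {dependency: version} pins
    (a hypothetical, already-parsed form of each component's lockfile).
    """
    versions = defaultdict(set)
    for deps in pins.values():
        for dep, version in deps.items():
            versions[dep].add(version)
    # Versions are sorted as plain strings, not by semantic-version order.
    return {dep: sorted(v) for dep, v in versions.items() if len(v) > 1}
```

The same report is possible across many repos, but it needs access to and a checkout of each one, which is where less active repositories tend to get missed.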

guenthert · 4 years ago
Security? What aspect of it? If your company is working on multiple projects for which different levels of security clearance are required, then there is little choice but to use multiple repositories.

Similarly, if you work on projects for which the customers require following incompatible standardized (codified) procedures, then multiple repositories are again the obvious choice.

If the company you work for is in danger of being split up (for legal, commercial or other reasons) and different projects might go different paths, then you'd be happy to have chosen multiple repositories as well.

I'd think, one would have to have a very good reason to put multiple projects in a single repository.

gitgud · 4 years ago
Mono-repo is probably best: it is always easier to keep one repo secure and track its changes than a group of repos...

I originally loved the elegance of the multi-repo approach, but in practice it's just more of a pain...

softwaredoug · 4 years ago
I like a monorepo because as a component evolves, it tests its assumptions against the current state of the other components. In a way it forces a dialog between the owners of those components to understand each other's requirements.

It's handy in cases where you tightly integrate and depend on each other's code. Like a more peer to peer relationship. MOST of the time, working at a company, this has been my experience.

The other approach, multiple repos, I feel is best when there's a one-to-many relationship. Like a service with many customers. Or an open source project...

gnur · 4 years ago
Duo repo seems to be the sweet spot for me.

One for actual application code, one for infra as code.

The first can automatically create a PR to the second when any component artefact changes. This also makes CI/CD separation trivial.
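
The automated PR is typically just a version bump in the infra repo. A rough sketch of the edit such a job might make, assuming the infra repo references artefacts as `image:tag` strings in a text manifest (the image name and manifest format are assumptions, and opening the PR itself is left to whatever tooling the hosting platform provides):

```python
import re


def bump_image_tag(manifest_text, image, new_tag):
    """Point every reference to image at new_tag; return the updated text."""
    # Match "<image>:<old tag>", where a tag is letters, digits, "_", "." or "-".
    pattern = re.compile(r"(" + re.escape(image) + r"):[\w.\-]+")
    return pattern.sub(lambda m: m.group(1) + ":" + new_tag, manifest_text)
```

The application repo's pipeline would run this against a checkout of the infra repo, commit the result on a branch, and open the PR, so the infra repo's history records every deployed version.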

icedchai · 4 years ago
^ I second this; it's what I've used at 2 previous companies.