I know that there is a debate about storing all source code in a mono-repo vs multiple repos.
I am thinking about it from a security perspective: splitting into multiple repos reduces the risk of source code exposure/leakage and enables more granular access. However, maybe that isn't as high a risk as an insider threat or an account takeover that injects malicious code, in which case setting up codeowners would do the job even in a mono-repo.
What are your thoughts?
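To make the codeowners idea concrete, here's a rough sketch of path-based ownership in a mono-repo. The team names, the paths and the "last matching rule wins" behaviour are all assumptions for illustration, and `fnmatch` is only a stand-in for whatever pattern syntax your hosting platform actually uses.

```python
from fnmatch import fnmatch

# Hypothetical ownership rules: (path pattern, owning teams).
# In this sketch, later rules take precedence over earlier ones.
RULES = [
    ("*", ["@org/platform"]),
    ("services/payments/*", ["@org/payments"]),
    ("infra/*", ["@org/sre", "@org/security"]),
]


def owners_for(path, rules=RULES):
    """Return the owners from the last rule whose pattern matches the path."""
    matched = []
    for pattern, owners in rules:
        if fnmatch(path, pattern):
            matched = owners
    return matched


def required_reviewers(changed_paths, rules=RULES):
    """Collect every team that must approve a change touching these paths."""
    teams = set()
    for path in changed_paths:
        teams.update(owners_for(path, rules))
    return sorted(teams)
```

With something like this, a change touching both `infra/` and a docs file would need sign-off from the infra owners as well as the default owners, without splitting the code into separate repos.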
I believe that components that move at a different pace and change for different reasons should not be in the same repo. It is difficult to set up CI/CD for different ways of deployment, especially if the components are not changing at the same time.
Now, regarding security, it is important to keep different components of a product in different repos. This gives you the flexibility to manage a more restricted set of credentials and reduce the number of people who have access to each one.
In the end it involves 3 things: 1) Sources of Change, 2) CI/CD Processes and 3) Security. You can definitely mix and match.
But when that commit goes to be deployed and you don’t have atomic/transactional deploys across services, you get downtime between the first service’s deploy finishing and the second’s.
- All code should require manual review from at least one other person (some orgs require 2+). It should be impossible to introduce a code change to any code base with a single compromised developer account. This is true of mono or multi repo.
- The only really dangerous shit (like private keys, admin credentials) should not be accessible or committed to repos at all, ever.
- I've never worked anywhere where a source code leak was a legitimate threat (note: not that it wasn't a concern, just that it doesn't normally have drastic consequences, your code isn't as special as you think)
- I have worked places with silo'd repos with granular access and I won't do that again. I ask about it in technical interviews, and if your company is doing this it's not really a positive - making it harder for engineers to get work done for bad reasons is a surefire way to get a toxic engineering culture. Meanwhile, there are plenty of places that give us engineers access to everything.
So I don't see the utility of multi repo from a security perspective - I'd argue the infrastructure problem it solves is the same one a package manager solves (letting portions of a system move at different rates without breaking dependent systems). If you have this problem in your code base then a multi repo org makes a lot of sense.
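The review rule from the first bullet is simple enough to sketch. This is a minimal illustration, not any platform's actual branch-protection logic; `can_merge` and its parameters are made-up names.

```python
def can_merge(author, approvals, required=1):
    """True when enough people other than the author have approved.

    `approvals` is a list of reviewer names; self-approvals and
    duplicate approvals from the same reviewer are not counted.
    """
    independent = {reviewer for reviewer in approvals if reviewer != author}
    return len(independent) >= required
```

The point is that a single compromised account can't satisfy the check on its own, whether the code lives in one repo or many.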
From a security perspective, in order: 1. Educate your developers on security. 2. Develop a threat model. 3. Monitor for anomalies. 4. Use 2FA. 5. Secure your CI/CD. 6. Keep dependencies up to date. 7. Run security scanning.
Similarly, if you work on projects whose customers require following incompatible standardized (codified) procedures, then multiple repositories are again the obvious choice.
If the company you work for is in danger of being split up (for legal, commercial or other reasons) and different projects might go different paths, then you'd be happy to have chosen multiple repositories as well.
I'd think one would have to have a very good reason to put multiple projects in a single repository.
I originally loved the elegance of the multi-repo approach, but in practice it's just more of a pain...
It's handy in cases where teams tightly integrate with and depend on each other's code, like a more peer-to-peer relationship. MOST of the time, working at a company, this has been my experience.
The other approach, multiple repos, I feel is best when there's a one-to-many relationship. Like a service with many customers. Or an open source project...
1 for actual application code, 1 for infra as code.
The first can automatically create a PR to the second when any component artefact changes. It also makes CI/CD separation trivial.
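Roughly, the "create a PR when an artefact changes" step comes down to diffing versions. Here's a minimal sketch, assuming each repo can produce a plain mapping of component name to artefact version; the function names and the PR title format are hypothetical, and actually opening the PR is left to whatever API your Git host provides.

```python
def changed_artifacts(deployed, built):
    """Return {component: new_version} for components whose version differs.

    `deployed` is what the infra repo currently pins; `built` is what the
    application repo's pipeline just produced.
    """
    return {
        name: version
        for name, version in built.items()
        if deployed.get(name) != version
    }


def pr_description(changes):
    """Build a PR title and body for the infra repo; None if nothing changed."""
    if not changes:
        return None
    lines = [f"- {name}: {version}" for name, version in sorted(changes.items())]
    return {
        "title": f"Bump {len(changes)} component artefact(s)",
        "body": "\n".join(lines),
    }
```

The application pipeline would call this after a build and only open a PR against the infra repo when `pr_description` returns something.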