Git: Malicious repositories can execute remote code while cloning

The commit that fixes this issue:

https://github.com/gitster/git/commit/684dd4c2b414bcf648505e...

(Surprise, the root cause is a cache)

Another cachelty

sethammons · 5 years ago

well played. I think that just got added to my standard vocabulary. Caching has caused more errors and bugs that I've had to deal with than I can recall. My favorite was an off-by-one error where we returned nicely cached info -- just for the previous user who came through our system! :facepalm: That was a bad one.

AbraKdabra · 5 years ago

Oh man...

brundolf · 5 years ago

It's amazing how often exploits come down to optimizations. The general form being "the domain logic is X, and is secure, but we faked around it in this one case to make it faster, and it turns out we made a bad assumption while doing so". Meltdown fits this description too.

saagarjha · 5 years ago

Optimizations come from making assumptions, and bugs come from mistaken assumptions.

gfiorav · 5 years ago

Same with bugs. Hate to be the mantra guy but:

1. Make it work

2. Make it right

3. Make it fast (make sure you need to) a.k.a. optimize

4. Make it scale

Agentlien · 5 years ago

I am really fascinated by the responses to this comment. So many people exclaiming how many issues are caused by caches. In ten years as a full-time programmer the only cache issues I've seen are cache misses. It probably has to do with one's field. I'm a game developer mainly dealing with graphics programming.

ironmagma · 5 years ago

The key problem (as I understand it) is that updating a cache properly requires knowing the exact graph of relations that an entry in the cache has to other entries. So that when that entry changes, you can propagate that change throughout the cache to other concerned entries which need to be recomputed. But knowing that exact graph is too complex a task to be trivial, it seems in this case. Basically it sounds like the non-visual version of rerendering UI when a state changes, which is hard enough even with visual feedback.
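A minimal sketch of the idea in the comment above: a cache that records which entries were computed from which others, so that invalidating one entry also drops everything derived from it. The `DependencyCache` class, its method names, and the keys are all hypothetical, made up for illustration; this is not how Git's cache works.

```python
from collections import defaultdict


class DependencyCache:
    """Toy cache that tracks which entries were derived from which others."""

    def __init__(self):
        self._values = {}
        # key -> set of keys whose cached value was computed from it
        self._dependents = defaultdict(set)

    def put(self, key, value, depends_on=()):
        """Store a value and record the entries it was computed from."""
        self._values[key] = value
        for parent in depends_on:
            self._dependents[parent].add(key)

    def get(self, key):
        return self._values.get(key)

    def invalidate(self, key):
        """Drop key and, transitively, every entry computed from it."""
        stack = [key]
        seen = set()
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            self._values.pop(current, None)
            stack.extend(self._dependents.pop(current, ()))
        return seen


cache = DependencyCache()
cache.put("user:1", {"name": "alice"})
cache.put("profile_html:1", "<h1>alice</h1>", depends_on=["user:1"])
cache.put("homepage", "<ul>...</ul>", depends_on=["profile_html:1"])
cache.put("user:2", {"name": "bob"})

# Changing user:1 must also drop the two entries built on top of it.
dropped = cache.invalidate("user:1")
```

The hard part the comment points at is the `depends_on` argument: the sketch only works if every caller declares the full set of dependencies, and a single forgotten edge leaves a stale entry behind.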

josefx · 5 years ago

A lot of threading issues are also cache related. Forget to properly mark access to shared variables and suddenly every thread/CPU core ends up with its own locally cached version of it.

pabs3 · 5 years ago

I wish folks who announce security issues would link to the patches for the issues they are announcing. This should become standard practice.

segfaultbuserr · 5 years ago

Difficult problems in programming:

(1) cache invalidation

(2) off-by-one errors

mattiasfestin · 5 years ago

The classical three problems.

xarope · 5 years ago

I thought the two hardest problems were:

1) naming

2) cache invalidation

...

3) off-by-one errors

bouncycastle · 5 years ago

shouldn't your list start from 0?

eru · 5 years ago

Fortunately, many off-by-one errors can be caught with more ergonomic tooling.

For the simplest example: compare the old C-style for loop vs a Python-style for-each loop.
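To make the comparison concrete, here is a small sketch in Python. The first function imitates a C-style index loop with the classic `<=` slip; the second uses a for-each loop, which has no index to get wrong. Both function names are made up for this illustration.

```python
items = ["a", "b", "c"]


def collect_indexed_buggy(seq):
    """Index-based loop in the style of C's for (i = 0; i <= n; i++)."""
    out = []
    i = 0
    while i <= len(seq):  # off by one: the condition should be i < len(seq)
        out.append(seq[i])  # reads one element past the end on the last pass
        i += 1
    return out


def collect_foreach(seq):
    """For-each loop: the iteration bounds are handled by the language."""
    out = []
    for item in seq:
        out.append(item)
    return out
```

In Python the buggy version fails loudly with an `IndexError`; in C the equivalent out-of-bounds read is undefined behavior and may go unnoticed.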

user-the-name · 5 years ago

(3) remembering the joke

echelon · 5 years ago

Per your downvotes - I used to hate jokes on Hacker News and downvote them when I saw them, but I've become more ambivalent. They're a way of amicably sharing culture and experiences with other engineers that transcend any differences in age, gender, race, background, etc.

The formulation of this joke I tend to see is,

The two hardest problems in programming:

(1) cache invalidation

(2) appropriately naming things

(3) off-by-one errors

slavik81 · 5 years ago

They had this bug before in other code unrelated to caching. To me, that suggests a deeper root.

de6u99er · 5 years ago

Strange. The guy who fixed the issue works at Microsoft, but uses his gmx email for GitHub.

enneff · 5 years ago

And the guy who announced the new Git release works for Google, but uses his pobox.com email for Git development.

eru · 5 years ago

I think you just got a glimpse of a vast sea of internal policy and compliance issues.

skrebbel · 5 years ago

Wait, there are people who use real, in-use email addresses in publicly hosted git repos? I mean, it's likely his spamcatcher address, no?

lixtra · 5 years ago

> (Surprise, the root cause is a cache)

Couldn’t it just as well be attributed to improper file path normalization? If we had only lower case ASCII file systems it would not have caused a problem.
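As a rough illustration of the normalization point, the sketch below finds paths that are distinct byte-for-byte but collapse to the same name once case is ignored. The helper name and the sample paths are hypothetical, and `str.casefold()` is only an approximation of what a real case-insensitive file system does; this is not Git's actual check.

```python
from collections import defaultdict


def find_case_collisions(paths):
    """Group paths that become identical once case is ignored."""
    groups = defaultdict(list)
    for path in paths:
        # casefold() stands in for the file system's own case-folding rules
        groups[path.casefold()].append(path)
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


tree = ["Makefile", "makefile", "src/main.c", "README.md", "Src/Main.c"]
collisions = find_case_collisions(tree)
# collisions == [["Makefile", "makefile"], ["Src/Main.c", "src/main.c"]]
```

On a case-sensitive file system all five paths can coexist; on a case-insensitive one, each colliding pair ends up referring to the same on-disk entry.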

yxhuvud · 5 years ago

Things can have more than one cause that act together.

That could be any Git repository.

Have you seen the mayhem that some of mine cause when you clone them and then type ./configure && make, like you have been socially engineered into doing?

pontifier · 5 years ago

It doesn't even have to be there... The main reasons to clone a repo are because you're about to compile and run the code there, or you already have and need to fix something.

I don't personally audit all the code I run, but I hope someone is doing it. That being said, source code being public is much better than the alternative of just downloading binaries from who knows where.

I don't trust anything absolutely, and I don't see a way past it.

minitech · 5 years ago

There is a huge difference between “clone a repo” and “clone a repo and run code from it”.

kazinator · 5 years ago

In spite of my tongue-in-cheek statement, I get it.

It's huge in the context of non-programming uses of Git. If some people are just sharing some text documents with Git, then it's a big deal.

This is likely on the rise.

E.g. if you look at a site like GitHub, there is a lot of non-code content in it. Some people stash that content, and other people believe that content to just be harmless files that will never perpetrate an exploit just from being cloned.

jtsiskin · 5 years ago

Technically yes, but I can’t think of the last time I cloned a repo without then running code from it...

remram · 5 years ago

For a while I tried to only run untrusted builds in Docker containers, like doing `docker run -v $PWD:/src node npm install`, but IDEs are not really configured to deal with this. Even my Vim has ALE and would just run node_modules/.bin/tsserver on my machine, which could be anything. Why aren't our tools concerned with this at all?

hctaw · 5 years ago

Because at a certain point you shrug and blame the user for downloading sketchy code and executing it.

bvendor · 5 years ago

I get that you are not completely serious, but before the cmake/meson/... people jump on this:

If ./configure is checked in as part of the official repository of a moderately well known project, I doubt any committer would be stupid enough to insert a backdoor into ./configure or the Makefiles.

What can happen if an apostate project is not on GitHub: Some (usually several) faithful persons decide to correct the situation and put multiple unofficial mirrors on GitHub, and other faithful people clone from a random one of these.

In that case however, they get what they deserve.