inframouse (u/inframouse)

inframouse commented on GitLab Database Incident – Live Report docs.google.com/document/... · Posted by u/sbuttgereit

DanielDent · 9 years ago

I'm a huge Gitlab fan. But I long ago lost faith in their ability to run a production service at scale.

Nothing important of mine is allowed to live exclusively on Gitlab.com.

It seems like they are just growing too fast for their level of investment in their production environment.

One of the only reasons I was comfortable using Gitlab.com in the first place was because I knew I could migrate off it without too much disruption if I needed to (yay open source!). Which I ended up forced to do on short notice when their CI system became unusable for people who use their own runners (overloaded system + an architecture which uses a database as a queue. ouch.).

Which put an end to what seemed like constant performance issues. It was overdue, and made me sleep well about things like backups :).

A while back one of their database clusters went into split brain mode, which I could tell as an outsider pretty quickly... but for those on the inside, it took them a while before they figured it out. My tweet on the subject ended up helping document when the problem had started.

If they are going to continue offering Gitlab.com I think they need to seriously invest in their talent. Even with highly skilled folks doing things efficiently, at some point you just need more people to keep up with all the things that need to be done. I know it's a hard skillset to recruit for - us devopish types are both quite costly and quite rare - but I think operating the service as they do today seriously tarnishes the Gitlab brand.

I don't like writing things like this because I know it can be hard to hear/demoralizing. But it's genuine feedback that, taken in the kind spirit in which it is intended, will hopefully be helpful to the Gitlab team.

inframouse · 9 years ago

I think they are running to catch up on the GitLab system itself, let alone running it as a production service. The bugs in the last few months have been epic: backups not working, merge requests broken, Chrome users seeing bugs, chaotic support. Basically their QA and release processes are not remotely enterprise ready.

inframouse commented on Google Cloud is 50% cheaper than AWS thehftguy.wordpress.com/2... · Posted by u/yarapavan

vgt · 9 years ago

You bring up a good point. Amazon does give you a better "vertical scaling" story. I'll still challenge you on the "breadth" when it comes to EC2 - the philosophy is just very different. Why do you need an "IO optimized instance" if you just want fast disk? That notion just seems very foreign and arbitrarily constrained on Google Cloud.

You bring up Local SSD. Google's Local SSD is just badass by comparison:

- 680,000 Read and 360,000 Write IOPS included in the cost [0]

- $0.218 per GB per month. Instance cost is separate.

- Again, you can attach these to any instance type (hence the point on fragmentation of instances on EC2)

- AWS goes up to 365,000 Read and 315,000 "First Write" IOPS. Only if you buy an i2.8xlarge [2]

- An i2.8xlarge is $6.82 per hour.

You do the math :)
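For anyone who wants to actually do the math, here is a rough sketch using only the two prices quoted above. The 730 hours per month and the 3000 GB Local SSD capacity are assumptions for illustration; the comment does not say how much capacity is needed to reach the quoted IOPS, and the GCE instance cost is separate and not included.

```python
# Rough monthly cost sketch using only the prices quoted above.
HOURS_PER_MONTH = 730  # assumption: average number of hours in a month

I2_8XLARGE_USD_PER_HOUR = 6.82      # quoted above
LOCAL_SSD_USD_PER_GB_MONTH = 0.218  # quoted above; instance cost is separate

# Hypothetical capacity: the comment does not state how much Local SSD
# is needed to reach the quoted IOPS, so this is only a placeholder.
ASSUMED_LOCAL_SSD_GB = 3000


def monthly_instance_cost(usd_per_hour, hours=HOURS_PER_MONTH):
    """Monthly cost of an instance billed by the hour."""
    return usd_per_hour * hours


def monthly_local_ssd_cost(capacity_gb, usd_per_gb_month=LOCAL_SSD_USD_PER_GB_MONTH):
    """Monthly cost of Local SSD capacity, excluding the instance it attaches to."""
    return capacity_gb * usd_per_gb_month


i2_monthly = monthly_instance_cost(I2_8XLARGE_USD_PER_HOUR)
ssd_monthly = monthly_local_ssd_cost(ASSUMED_LOCAL_SSD_GB)

print(f"i2.8xlarge: ${i2_monthly:,.2f} per month")
print(f"Local SSD ({ASSUMED_LOCAL_SSD_GB} GB): ${ssd_monthly:,.2f} per month, plus the instance")
```

Swap in the real capacity and the price of whatever instance the disks attach to before drawing conclusions from the totals.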

And someone else did more comparisons here [1]

[0] https://cloud.google.com/compute/docs/disks/performance

[1] https://medium.com/google-cloud/new-google-cloud-ssds-have-a...

[2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/i2-instan...

inframouse · 9 years ago

The real problem I have is the low network performance. Yes, yes, before everyone jumps all over me and points to Jupiter etc., I understand the problems in Pb/s bisection bandwidth for the large datacenters. That doesn't change the fact that I don't need an entire datacenter's worth of stuff, but I do need an Amdahl-balanced cluster. So big machines with wimpy (20Gb non-RDMA) networks prevent me from doing my HPCish workloads on GCE.

Followed by waiting on GPUs and other user accessible accelerators of course.

inframouse commented on The Technical Interview Rift blog.techmasters.chat/the... · Posted by u/DonPellegrino

luhn · 9 years ago

The argument I usually hear is that many people don't do well under pressure or with others looking on. So by doing coding tests or whiteboard tests or whatever, you're selecting for people who do well in high-pressure situations, rather than people who are talented coders.

But I agree with you. I've interviewed many people who have an impressive resume and can talk the talk, yet can't even do Fizzbuzz.

inframouse · 9 years ago

On the one hand, there are a significant number of people that can't fizzbuzz applying for the jobs.

On the other hand, there are a significant number of people that can only do the interviews and are not good engineers in general once hired.

This is made even worse by 1) the cottage industry teaching people to crack/break the interview process, and 2) the number of people hyping their personal "brand" via conf talks, standards nonsense, etc. as a way to sidestep more engineering vetting. (Not everyone does this of course -- but there's a significant number of people that do it just for career upside).

Personal anecdote as a hiring manager: I've found that some of the best interviewees who turn out to be mediocre mid-level engineers are those with a history of low- to mid-level jobs at the largest companies.

inframouse commented on Ex-Mozilla team behind smart home hub Sense refunds backers, focuses on software techcrunch.com/2016/06/08... · Posted by u/rdoherty

inframouse · 9 years ago

Right now this seems mostly to consist of JavaScript wrappers around existing platform libraries? E.g. to let me read a GPIO pin or blink an LED? Not that it's a bad thing, but it seems a little skinny as a thing right now, and the pitch seems to be that more and more platform stuff will arrive. Smells more like a replay of Firefox OS and less like innovating on the platform? Any insights?

inframouse commented on Upthere, a cloud storage service, wants to make file syncing a thing of the past theverge.com/2015/10/29/9... · Posted by u/dannylandau

jozzas · 10 years ago

Yeah, I must be missing the point as well.

"Upthere says it plans to offer an API in hopes that developers will make it the default storage solution for their own apps"

So... how is that in any way different to using the AWS API to upload stuff to S3? Being a "storage solution provider" (and reselling other companies' cloud offerings at that) has some razor thin margins and a whole lot of competition. I don't understand what makes these guys different.

inframouse · 10 years ago

They talk about running their own hardware, so I assume something like Open Compute or Backblaze, but without the size or reliability of S3. Can't imagine they'll be able to do it at lower cost than Amazon either. Like you, I have no idea why one wouldn't just use S3.

inframouse commented on Upthere, a cloud storage service, wants to make file syncing a thing of the past theverge.com/2015/10/29/9... · Posted by u/dannylandau

inframouse · 10 years ago

Reading through the spin.. it's just an object store and pubsub with some demo apps? Like S3 and SNS on Amazon or Google Datastore and Cloud pubsub? That's it? Why would anyone use this instead of AWS or Google?

inframouse commented on An Inside Look at Upthere, the Company Aiming to Be Your Personal Cloud techcrunch.com/2015/10/29... · Posted by u/prostoalex

inframouse · 10 years ago

The article makes it sound like apps that upload to an object store backend, plus some pub/sub system they installed to push notifications. That takes 4 years? Seems like spin.

I interviewed there and the Glassdoor review is accurate for backend. No SRE, and my SWE interviewers had obviously never built production-scale services. Some managers had run backups. They made a big deal of custom this and custom that, but you sign an NDA on details. There are hints in public, though, like engineers who work there submitting Ceph patches from personal Gmail accounts. My impression was that many engineers were leaving, and those staying were taking over without understanding the why as well as the how.