puzzle (u/puzzle) - Readit News

puzzle commented on Is Conference Room Air Making Us Dumber? nytimes.com/2019/05/06/he... · Posted by u/bookofjoe

mettamage · 6 years ago

I just want to plug a client that I worked for that solves this problem: Healthy Workers.

Healthy Workers is an Amsterdam-based startup that measures things such as air quality, CO2 and consented employee data (e.g. their sleep and focus), then analyzes which parts of the building have a work environment that is not conducive to productivity and how this can be improved.

Conference rooms with bad air are the first problem they look at.

They are hiring for a head of sales and a product designer: https://healthyworkers.recruitee.com/

puzzle · 6 years ago

Google has tracked office air quality for years, e.g. through the Aclima partnership.

That's basically because Larry Page really, really cares about it. He's kinda like your friend with a Kubrick obsession who can't stop bringing up facts:

https://twitter.com/elonmusk/status/727189428142235648

He was on to something! Jokes aside, I think he just has a heightened sense of smell and that's why he had air filters stronger than law requirements installed everywhere, at least in Mountain View.

puzzle commented on Practices for writing high-performance Go github.com/dgryski/go-per... · Posted by u/ingve

shereadsthenews · 6 years ago

You are thinking of dl.google.com. It was a Go program that replaced a very old single-threaded C++ server (using the ancient SelectServer C++ core, deprecated at that time). The thing you have to realize about Google infrastructure is that it does not require vertical scalability of its service backends. It is very typical to write a program and deploy it with 100 replicas having one CPU each on 100 different machines. A variety of load balancers (all written in C++, naturally) paper over the complexity. Nobody at Google expects a Go program to occupy an entire machine, because Borg packs hundreds of services onto a single machine. The question for _you_ is: do _you_ have Borg or another workload coordinator that allows you to do this? Or do you have "the database machine" and "the server" where you expect individual processes to scale up to many cores?

BTW the reason the dl.google.com rewrite was faster was not that it was in Go; it was that the C++ server was serving off its local disk and the rewrite was serving off a cluster file system with ~infinite I/O capabilities. Apples and oranges.

puzzle · 6 years ago

It was even crazier: when the original download server was written, local disk was faster, mainly because the network wasn't too fast (rack locality was a concern, way back when), but also because GFS chunk servers weren't, either. At the time of the rewrite, Firehose and co. were being deployed everywhere, D did a better job at serving bytes and, later, local disk use was placed in a lower QOS level. Unless you were one of the few teams that had a good rationale for dedicated machines, if you fought for I/O time on a given disk against D, invariably you lost.

puzzle commented on Cost of serving billions of images per month medium.com/p/f499620a14d0... · Posted by u/ghoshbishakh

m0zg · 6 years ago

Why would it depend on request rate rather than upload rate? Just pre-scale several variants on upload. Then run an MR job every now and then and delete the variants that haven't been touched for more than e.g. a month. Scale those on demand, store recents and frequently accessed pre-scaled. This is literally a few days of work.
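A minimal sketch of the cleanup step described above, in Python for brevity. It assumes a hypothetical in-memory map from variant key to last-access time; a real job would read access logs or object metadata instead, and the key names and 30-day threshold are illustrative only.

```python
from datetime import datetime, timedelta


def stale_variants(last_access, now, max_idle=timedelta(days=30)):
    """Return the variant keys that have not been accessed within max_idle.

    last_access: dict mapping a (hypothetical) variant key to the datetime
    of its most recent read. The result is sorted so output is stable.
    """
    cutoff = now - max_idle
    return sorted(key for key, ts in last_access.items() if ts < cutoff)


if __name__ == "__main__":
    now = datetime(2019, 5, 1)
    last_access = {
        "photo1_200w": datetime(2019, 4, 25),  # read last week: keep
        "photo1_800w": datetime(2019, 3, 1),   # idle for two months: delete
    }
    for key in stale_variants(last_access, now):
        print("delete", key)
```

Deleted variants would then be regenerated on demand the next time they are requested.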

puzzle · 6 years ago

I bet there's a very long tail, i.e. the majority of images are rarely accessed, if ever.

puzzle commented on Cost of serving billions of images per month medium.com/p/f499620a14d0... · Posted by u/ghoshbishakh

no1youknowz · 6 years ago

I like Unsplash. They and Pixabay are what I use for apps I have developed in the past.

But simply, this floors me. I looked at their costs and with some developer muscle, you could find savings such as:

- Move Fastly to Cloudflare. They don't expressly say what the cost is for that. But moving to CF would eliminate it.

- Move Heroku to Digital Ocean. It's not difficult to create a fully redundant solution.

- Move from Imgix to having Golang micro services which handle the resizing of images and use something like Belugacdn for the CDN. Beluga is $5k a month for a PB. (Or some other cheaper CDN if you don't like Beluga, but damn... imgix pricing)

I'm pretty sure that a savvy CTO could save at least $50k a month with a well designed project that does this over many months and achieves the same result and keeps the redundancy concerns for the small team.

I do realise, however, why the team have done this. In the same position (with very few resources) I would probably have done the same. But damn, when the cost of a service gets up to a year's salary ($120k) for a good developer, it's time to seek alternatives.

puzzle · 6 years ago

Resizing images in Golang is not as trivial as you might imagine at first. You need to be able to handle all sorts of formats and color spaces. You'd be surprised by the kind of weird garbage your users will upload. Then you need to use SIMD, not pure Golang solutions, or performance will suffer. So you end up adopting a wrapper around libvips or similar, at which point you will start to ponder if you should have stuck with C/C++ in the first place. (It all depends on how much of Go's features you use or if it's just a nicer, safer C for you.)
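One small piece of the "weird garbage" problem can be sketched as follows (in Python for brevity, not Go): checking the leading bytes of an upload against well-known file signatures before handing it to a resizer. The function name is hypothetical, and this covers only four formats; it says nothing about color spaces, corrupt payloads behind a valid header, or performance.

```python
def sniff_image_format(data: bytes):
    """Guess the image format from its leading magic bytes.

    Returns "png", "jpeg", "gif" or "webp", or None when the signature
    is not recognized (in which case the upload should be rejected or
    routed to a more thorough decoder).
    """
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpeg"
    if data[:6] in (b"GIF87a", b"GIF89a"):
        return "gif"
    # WebP is a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


if __name__ == "__main__":
    print(sniff_image_format(b"GIF89a" + b"\x00" * 10))
    print(sniff_image_format(b"<html>not an image</html>"))
```

A file extension or a client-supplied content type is not a substitute for this kind of check, since both are under the uploader's control.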

puzzle commented on DuckDuckGo Proposes the “Do-Not-Track Act of 2019” searchengineland.com/duck... · Posted by u/chdaniel

rvnx · 6 years ago

yegg (CEO @ DDG): "We've actually been using Yahoo technology along with our own and others since the very beginning of DuckDuckGo. Over the past year though we've been working on a stronger partnership with Yahoo so we can get access to more features like date filters that everyone has been asking for (that one in particular is our most requested feature by far).

With regards to the ads, nothing is changing in terms of ad privacy/tracking or our privacy policy in general. Ads should just become more relevant."

Have you noticed "improving.duckduckgo.com", the analytics service that logs all requests?

"To be clear, this means we cannot ever tell what individual people are doing since everyone is anonymous"

Oh, except the IP address, right. Fun fact (experiment now removed): https://web.archive.org/web/20180910042004im_/http://image.b...

puzzle · 6 years ago

But Yahoo search itself has been, for the past four years, a mishmash of Bing, Google and Yahoo results. I wonder which mix of the three Yahoo will pass along when using its API, versus using its search page.

puzzle commented on Google Will Soon Let Users Automatically Scrub Location and Web History buzzfeednews.com/article/... · Posted by u/siberianbear

ben_jones · 6 years ago

I value your contribution to this thread, can I ask some follow-ups?

Does Google have multiple "deletion policies", such that deleting data from e.g. your GCP bucket follows one policy, and the "scrubbing" described in this article follows an entirely different policy? If so, do different deletion policies have different processes and different audit trails, such that the end "deleted" state is subjective and controlled by the engineering and managerial oversight of the engineering/leadership team of that given product(s)?

In my (naive) opinion, it must be really, really hard to, for example, retrain every ML model that a now-deleted datapoint ever touched. It's also hard to believe that, at some high level in Alphabet's org, there is no motivation to have the positive PR of feature(s) like this, but still in essence not delete the parts of the data trail that significantly drive Google's revenue. Do these datapoints significantly impact Google's revenue?

puzzle · 6 years ago

Google retrains models all the time. They gave a presentation about ML and production last year:

https://www.usenix.org/conference/srecon18asia/presentation/...

You can see there's a section on privacy and deleted data as well.

Each team has its own policies, because each product is different: at a bare minimum they might be using different storage systems, but it's very likely that their data pipelines are quite different, too. In any case, each team's targets are at least as strict as any published ones, of course.

puzzle commented on Google Will Soon Let Users Automatically Scrub Location and Web History buzzfeednews.com/article/... · Posted by u/siberianbear

loudtieblahblah · 6 years ago

I simply do not believe this.

Unless they're audited by a source that can be trusted and have the findings made public, I will not believe it either.

puzzle · 6 years ago

I'm an ex-employee. There's actually a whole team whose sole job is making sure that all other teams have policies, measurements and alerting for deleting data. They'll chase you if any of the above doesn't hold. It's non-trivial work and slows down your development, if you believe in releasing early and fast. For everybody else, it makes total sense.

I bet it's not free to run, but it's cheaper and easier than elsewhere, because Google's infrastructure is built in-house and mostly integrated. I don't envy other companies that want to do the same.

puzzle commented on Google Will Soon Let Users Automatically Scrub Location and Web History buzzfeednews.com/article/... · Posted by u/siberianbear

o10449366 · 6 years ago

I wouldn't be surprised if that's a side effect of Google's plan to "replace" Play Music with YouTube Music, even though they're two completely different services.

puzzle · 6 years ago

Maybe they're the same service behind the scenes. Would you store the same music twice, if you had to run both?

puzzle commented on Google Will Soon Let Users Automatically Scrub Location and Web History buzzfeednews.com/article/... · Posted by u/siberianbear

chris_mc · 6 years ago

Yep, Google Maps still shows my home and work locations on the Commute tab from before I disabled almost everything in my Google account, but won't let me change the work address since we moved locations. So I guess when I change houses (the only one I really care about, since I use it to send an ETA to my wife) I can just turn everything on, change the address, then turn it all off again. It's illogical dark patterns like this that made me start detaching from Google, and will probably drive me to buy a non-Android phone next time, although I detest Apple UI even if their quality is usually great.

puzzle · 6 years ago

Is this on desktop or mobile? It could match the theory in the grandparent post that they preferred sticking to one backend. That also allows handling conflicts in one place, with one protocol. E.g. what would the behaviour be if you edited the home address with a ZIP code on your phone, while offline? What if you try to make the same change from your laptop and e.g. you set a ZIP+4 code? And then what happens when your phone is online again?
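To make those questions concrete, here is a minimal sketch of one possible policy for handling such conflicts in a single backend: optimistic concurrency with a version check. This is purely an illustration, not a description of how Google Maps works; the `Record` type, the `apply_edit` function and the ZIP values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """A single stored field (e.g. a home ZIP code) with a version counter."""
    value: str
    version: int = 0


def apply_edit(record: Record, new_value: str, base_version: int) -> bool:
    """Apply an edit only if the client saw the current version.

    Returns True when the edit is applied, False when the client's copy
    was stale and the edit is rejected as a conflict.
    """
    if base_version != record.version:
        return False
    record.value = new_value
    record.version += 1
    return True


if __name__ == "__main__":
    home = Record("94043")
    # The phone edits while offline, starting from version 0 (not yet sent).
    # Meanwhile the laptop, also at version 0, sets a ZIP+4 and is accepted.
    print(apply_edit(home, "94043-1351", base_version=0))
    # The phone comes back online and submits its stale edit: rejected.
    print(apply_edit(home, "94043", base_version=0))
    print(home)
```

Under this policy the first edit to reach the server wins and the later, stale one has to be retried or merged by the client. Last-writer-wins by timestamp would be another option, with different trade-offs.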