Readit News
mirker commented on Dynamic Branch Prediction with Perceptrons (2000) [pdf]   cs.cmu.edu/afs/cs/academi... · Posted by u/luu
bjourne · 3 years ago
Modern branch predictors are based on relatively simple state machines and already achieve >95% accuracy. So even if machine learning-based predictors can sometimes beat them, it is not clear they beat them by enough to justify the much more complicated circuitry they need.
mirker · 3 years ago
One thing to point out is that the threshold of predictor complexity is dependent on the execution pipeline. A very speculative and deep architecture has a bigger need for better predictors, since it has a massive penalty when there is a misprediction.
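For reference, the perceptron predictor from the linked paper works roughly as sketched below: one small weight vector per branch, dotted with the recent outcome history, trained only on mispredictions or low-confidence predictions. Table size, history length, and names here are illustrative, not the paper's hardware parameters.

```python
# Rough sketch of a perceptron branch predictor (after the linked paper).
# Sizes and names are illustrative placeholders.

HISTORY_LEN = 8
THETA = int(1.93 * HISTORY_LEN + 14)  # training threshold used in the paper

class PerceptronPredictor:
    def __init__(self, table_size=64):
        # One weight vector (bias + one weight per history bit) per entry,
        # indexed by a hash of the branch PC.
        self.weights = [[0] * (HISTORY_LEN + 1) for _ in range(table_size)]
        self.history = [1] * HISTORY_LEN  # +1 = taken, -1 = not taken
        self.table_size = table_size

    def _output(self, pc):
        w = self.weights[pc % self.table_size]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc):
        return self._output(pc) >= 0  # True = predict taken

    def update(self, pc, taken):
        y = self._output(pc)
        t = 1 if taken else -1
        w = self.weights[pc % self.table_size]
        # Train on a misprediction, or when confidence is below theta.
        if (y >= 0) != taken or abs(y) <= THETA:
            w[0] += t
            for i in range(HISTORY_LEN):
                w[i + 1] += t * self.history[i]
        self.history = self.history[1:] + [t]
```

The point in the comment above is visible in the hardware cost: each prediction needs a multi-term dot product and each update touches every weight, versus a single saturating-counter update in a classic two-bit scheme.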
mirker commented on Releasing 3B and 7B RedPajama   together.xyz/blog/redpaja... · Posted by u/antimatter15
mirker · 3 years ago
Does anyone have experience using these open source models in production?
mirker commented on MaMMUT: A simple vision-encoder text-decoder architecture for multimodal tasks   ai.googleblog.com/2023/05... · Posted by u/mfiguiere
light_hue_1 · 3 years ago
That's Google.

I don't bother to read most Google papers unless someone tells me that they're doing something astounding. Just because I know I don't have access to their models, their code or their data. So what's the point?

As a community we need to stop accepting and stop citing papers like these.

There is no science without replicability, and it is literally impossible to replicate this work. It's not worth the paper it's printed on.

It's fine if Google wants to play with its toys at home. But we should stop pretending this is research of any value.

mirker · 3 years ago
There is a ton of value. OpenAI having proprietary LLMs single-handedly pivoted the entire field to LLMs. A random GitHub repository doesn’t come close in impact.
mirker commented on MaMMUT: A simple vision-encoder text-decoder architecture for multimodal tasks   ai.googleblog.com/2023/05... · Posted by u/mfiguiere
anentropic · 3 years ago
Given that fact, why don't the paper authors just release the artefacts then?

If it's supposed to stay secret, what's the point of "here's instructions for how to reproduce our big secret"?

Presumably the societal purpose of papers is to share knowledge, and the individual purpose is to take credit and win prestige.

It seems like the first purpose would be better served by also publishing code etc, and the second purpose wouldn't be harmed by it?

mirker · 3 years ago
Because the authors don’t get a large reward for open-sourcing the work, and they stand to lose future value by lowering the barrier to competition. You may want the code, but Google will not care (or it might actively dislike releasing it).

Look at GPT-3+, OpenAI gets fame and fortune while people struggle to reproduce their last-gen models.

mirker commented on IBM to pause hiring in plan to replace 7,800 jobs with AI   reuters.com/technology/ib... · Posted by u/thesecretceo
Ancalagon · 3 years ago
I’m curious how IBM has the metrics to know the productivity multiplier of AI well enough, this far in advance, to accurately replace exactly 7,800 jobs. Is the plan also to provide every surviving HR manager an OpenAI subscription? It seems too early to me, but I also don’t have the time or resources to know or even guess the multiplier effects here.

Then again, maybe they’re just using AI as an excuse to get rid of these jobs? Kind of like when they did layoffs only to then outsource many of their departments to other countries for cost savings?

mirker · 3 years ago
They asked Watson, of course.
mirker commented on Geoffrey Hinton leaves Google and warns of danger ahead   nytimes.com/2023/05/01/te... · Posted by u/ramraj07
nradov · 3 years ago
I read that pre-print Microsoft paper. Despite the title, it doesn't actually show any real "sparks" of AGI (in the sense of something that could eventually pass a rigorous Turing test). What the paper actually shows is that even intelligent people have a bias towards perceiving patterns in randomness; our brains seem to be wired that way and this is likely the source of most superstition.

https://arxiv.org/abs/2303.12712

While there is no scientific evidence that LLMs can reach AGI, they will still be practically useful for many other tasks. A human mind paired with an LLM is a powerful combination.

mirker · 3 years ago
Agreed.

Here’s the thing: the authors of that paper got early access to GPT-4 and ran a bunch of tests on it. The important bit is that MSR does not see into OpenAI’s sausage making.

Now imagine a peasant from 1000 AD who was given a car or a TV to examine. Could they really be confident they understood how it worked just by running experiments on it as a black box? If you give a non-programmer the Linux kernel, will they think it’s magical?

Things look like magic especially when you can’t look under the hood. The story of the Mechanical Turk is one example of that.

mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
oofbey · 3 years ago
TF1 was pretty rough to use, but beat the pants off Theano for usability, which was really the best thing going before it. Sure it was slow as dirt ("tensorslow") even though the awkward design was justified on being able to make it fast. But it was by far the best thing going for a long time.

Google really killed TF with the transition to TF2. Backwards incompatible everything? This only makes sense if you live in a giant monorepo with tools that rewrite everybody's code whenever you change an interface. (e.g. inside google). On the outside it took TF's biggest asset and turned it into a liability. Every library, blog post, stackoverflow post, etc talking about TF was now wrong. So anybody trying to figure out how to get started or build something was forced into confusion. Not sure about this, but I suspect it's Chollet's fault.

mirker · 3 years ago
The APIs were messed up early on, which is a reason TF2 happened. Every team started making their own random implementations of stuff. You had the TF Slim API, you had Keras, etc. The API just got fatter and fatter and then libraries would make cross dependencies to bake in the API mistakes.
mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
bitL · 3 years ago
I think the main problem was debugging tensors on the fly, impossible with TF/Keras, but completely natural to PyTorch. Most researchers needed to sequentially observe what is going on in tensors (histograms etc.) and even doing backprop for their newly constructed layers by hand and that was difficult with TF.
mirker · 3 years ago
Nah, TF has had dynamic execution since TF2 and it’s still losing users, it seems. The execution model and API are simply more complicated: what’s a session, a placeholder, a constant, a tensor, …? PyTorch was sold as numpy with GPU support, and it is pretty close to that. JAX is an attempt to approach that simplicity and purity.
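The session/placeholder vocabulary above can be made concrete with a toy deferred-execution engine (this is not real TF code, just an illustration of the two styles using made-up class names):

```python
# Toy contrast between graph-mode ("session/placeholder") execution and
# eager ("numpy with GPU support") execution. Class names mimic the TF1
# vocabulary but this is an illustrative sketch, not the TF API.

import numpy as np

class Placeholder:
    """A node with no value until fed at run time."""

class Add:
    """A graph node representing a deferred addition."""
    def __init__(self, a, b):
        self.a, self.b = a, b

class Session:
    """Walks the graph and evaluates it, given values for placeholders."""
    def run(self, node, feed_dict):
        def ev(n):
            if isinstance(n, Placeholder):
                return feed_dict[n]
            if isinstance(n, Add):
                return ev(n.a) + ev(n.b)
            return n  # a constant
        return ev(node)

# Graph style: build first, compute later.
x = Placeholder()
y = Add(x, 1.0)                          # nothing is computed here
out = Session().run(y, {x: np.array([1.0, 2.0])})

# Eager style: compute immediately, like numpy.
out_eager = np.array([1.0, 2.0]) + 1.0

assert np.allclose(out, out_eager)
```

In the graph style you cannot just print `y` to see its value mid-computation, which is the debugging complaint in the parent comment; in the eager style every intermediate is an ordinary array.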
mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
disgruntledphd2 · 3 years ago
Nope, Pytorch was inspired by the Lua version of Torch which well predates Tensor flow. To be fair, basically every other DL framework made the same mistake though.

Also, tensorflow was a total nightmare to install while Pytorch was pretty straightforward, which definitely shouldn't be discounted.

mirker · 3 years ago
PyTorch examples were also cleaner. torchvision had ResNet training batteries included, while TF had you roll your own or clone some weird Keras repository.
mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
qumpis · 3 years ago
I've yet to see an ML PhD be required to learn chemistry to the same extent that chemists need to learn ML (especially at the research level).
mirker · 3 years ago
I don’t understand what you mean. Here’s how many applied ML papers work: create a new dataset for a novel problem, download a PyTorch model, point model at dataset directory. Is it novel? By construction. Is the ML technique novel? No.
