Readit News
mirker commented on Dynamic Branch Prediction with Perceptrons (2000) [pdf]   cs.cmu.edu/afs/cs/academi... · Posted by u/luu
bjourne · 3 years ago
Modern branch predictors are based on relatively simple state machines and already achieve >95% accuracy. So even if machine learning-based predictors can sometimes beat them, it is not clear they beat them by enough to justify the much more complicated circuitry they need.
mirker · 3 years ago
One thing to point out is that the threshold of predictor complexity is dependent on the execution pipeline. A very speculative and deep architecture has a bigger need for better predictors, since it has a massive penalty when there is a misprediction.
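For reference, the perceptron predictor from the linked paper works roughly as sketched below: one small weight vector per branch, dotted with the recent outcome history, trained only on mispredictions or low-confidence predictions. Table size, history length, and names here are illustrative, not the paper's hardware parameters.

```python
# Rough sketch of a perceptron branch predictor (after the linked paper).
# Sizes and names are illustrative placeholders.

HISTORY_LEN = 8
THETA = int(1.93 * HISTORY_LEN + 14)  # training threshold used in the paper

class PerceptronPredictor:
    def __init__(self, table_size=64):
        # One weight vector (bias + one weight per history bit) per entry,
        # indexed by a hash of the branch PC.
        self.weights = [[0] * (HISTORY_LEN + 1) for _ in range(table_size)]
        self.history = [1] * HISTORY_LEN  # +1 = taken, -1 = not taken
        self.table_size = table_size

    def _output(self, pc):
        w = self.weights[pc % self.table_size]
        return w[0] + sum(wi * hi for wi, hi in zip(w[1:], self.history))

    def predict(self, pc):
        return self._output(pc) >= 0  # True = predict taken

    def update(self, pc, taken):
        y = self._output(pc)
        t = 1 if taken else -1
        w = self.weights[pc % self.table_size]
        # Train on a misprediction, or when confidence is below theta.
        if (y >= 0) != taken or abs(y) <= THETA:
            w[0] += t
            for i in range(HISTORY_LEN):
                w[i + 1] += t * self.history[i]
        self.history = self.history[1:] + [t]
```

The point in the comment above is visible in the hardware cost: each prediction needs a multi-term dot product and each update touches every weight, versus a single saturating-counter update in a classic two-bit scheme.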
mirker commented on Releasing 3B and 7B RedPajama   together.xyz/blog/redpaja... · Posted by u/antimatter15
mirker · 3 years ago
Does anyone have experience using these open source models in production?
mirker commented on MaMMUT: A simple vision-encoder text-decoder architecture for multimodal tasks   ai.googleblog.com/2023/05... · Posted by u/mfiguiere
light_hue_1 · 3 years ago
That's Google.

I don't bother to read most Google papers unless someone tells me that they're doing something astounding. Just because I know I don't have access to their models, their code or their data. So what's the point?

As a community we need to stop accepting and stop citing papers like these.

There is no science without replicability, and it is literally impossible to replicate this work. It's not worth the paper it's printed on.

It's fine if Google wants to play with its toys at home. But we should stop pretending this is research of any value.

mirker · 3 years ago
There is a ton of value. OpenAI having proprietary LLMs single-handedly pivoted the entire field to LLMs. A random GitHub repository doesn’t come close in impact.
mirker commented on MaMMUT: A simple vision-encoder text-decoder architecture for multimodal tasks   ai.googleblog.com/2023/05... · Posted by u/mfiguiere
anentropic · 3 years ago
Given that fact, why don't the paper authors just release the artefacts then?

If it's supposed to stay secret, what's the point of "here's instructions for how to reproduce our big secret"?

Presumably the societal purpose of papers is to share knowledge, and the individual purpose is to take credit and win prestige.

It seems like the first purpose would be better served by also publishing code etc, and the second purpose wouldn't be harmed by it?

mirker · 3 years ago
Because the authors don’t get a large reward for open-sourcing the work, and they stand to lose future value by lowering the barrier to competition. You may want the code, but Google will not care (or it might actively dislike releasing it).

Look at GPT-3+, OpenAI gets fame and fortune while people struggle to reproduce their last-gen models.

mirker commented on IBM to pause hiring in plan to replace 7,800 jobs with AI   reuters.com/technology/ib... · Posted by u/thesecretceo
Ancalagon · 3 years ago
I’m curious how IBM has the metrics to know the productivity multiplier of AI well enough, this far in advance, to accurately replace exactly 7,800 jobs. Is the plan also to provide every surviving HR manager an OpenAI subscription? It seems too early to me, but I also don’t have the time or resources to know or even guess the multiplier effects here.

Then again, maybe they’re just using AI as an excuse to get rid of these jobs? Kind of like when they did layoffs only to then outsource many of their departments to other countries for cost savings?

mirker · 3 years ago
They asked Watson, of course.
mirker commented on Geoffrey Hinton leaves Google and warns of danger ahead   nytimes.com/2023/05/01/te... · Posted by u/ramraj07
nradov · 3 years ago
I read that pre-print Microsoft paper. Despite the title, it doesn't actually show any real "sparks" of AGI (in the sense of something that could eventually pass a rigorous Turing test). What the paper actually shows is that even intelligent people have a bias towards perceiving patterns in randomness; our brains seem to be wired that way and this is likely the source of most superstition.

https://arxiv.org/abs/2303.12712

While there is no scientific evidence that LLMs can reach AGI, they will still be practically useful for many other tasks. A human mind paired with an LLM is a powerful combination.

mirker · 3 years ago
Agreed.

Here’s the thing: the authors of that paper got early access to GPT-4 and ran a bunch of tests on it. The important bit is that MSR does not see into OpenAI’s sausage making.

Now imagine a peasant from 1000 AD who was given a car or a TV to examine. Could they really be confident they understood how it worked just by running experiments on it as a black box? If you give a non-programmer the Linux kernel, will they think it’s magical?

Things look like magic especially when you can’t look under the hood. The story of the Mechanical Turk is one example of that.

mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
oofbey · 3 years ago
TF1 was pretty rough to use, but beat the pants off Theano for usability, which was really the best thing going before it. Sure it was slow as dirt ("tensorslow") even though the awkward design was justified on being able to make it fast. But it was by far the best thing going for a long time.

Google really killed TF with the transition to TF2. Backwards incompatible everything? This only makes sense if you live in a giant monorepo with tools that rewrite everybody's code whenever you change an interface. (e.g. inside google). On the outside it took TF's biggest asset and turned it into a liability. Every library, blog post, stackoverflow post, etc talking about TF was now wrong. So anybody trying to figure out how to get started or build something was forced into confusion. Not sure about this, but I suspect it's Chollet's fault.

mirker · 3 years ago
The APIs were messed up early on, which is a reason TF2 happened. Every team started making their own random implementations of stuff. You had the TF Slim API, you had Keras, etc. The API just got fatter and fatter and then libraries would make cross dependencies to bake in the API mistakes.
mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
bitL · 3 years ago
I think the main problem was debugging tensors on the fly, impossible with TF/Keras, but completely natural to PyTorch. Most researchers needed to sequentially observe what is going on in tensors (histograms etc.) and even doing backprop for their newly constructed layers by hand and that was difficult with TF.
mirker · 3 years ago
Nah, TF has had dynamic execution since TF2 and it’s still losing users, it seems. The execution model and API are simply more complicated: what’s a session, a placeholder, a constant, a tensor, …? PyTorch was sold as numpy with GPU support, and it is pretty close to that. JAX is an attempt to approach that simplicity and purity.
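The session/placeholder vocabulary above can be made concrete with a toy deferred-execution engine (this is not real TF code, just an illustration of the two styles using made-up class names):

```python
# Toy contrast between graph-mode ("session/placeholder") execution and
# eager ("numpy with GPU support") execution. Class names mimic the TF1
# vocabulary but this is an illustrative sketch, not the TF API.

import numpy as np

class Placeholder:
    """A node with no value until fed at run time."""

class Add:
    """A graph node representing a deferred addition."""
    def __init__(self, a, b):
        self.a, self.b = a, b

class Session:
    """Walks the graph and evaluates it, given values for placeholders."""
    def run(self, node, feed_dict):
        def ev(n):
            if isinstance(n, Placeholder):
                return feed_dict[n]
            if isinstance(n, Add):
                return ev(n.a) + ev(n.b)
            return n  # a constant
        return ev(node)

# Graph style: build first, compute later.
x = Placeholder()
y = Add(x, 1.0)                          # nothing is computed here
out = Session().run(y, {x: np.array([1.0, 2.0])})

# Eager style: compute immediately, like numpy.
out_eager = np.array([1.0, 2.0]) + 1.0

assert np.allclose(out, out_eager)
```

In the graph style you cannot just print `y` to see its value mid-computation, which is the debugging complaint in the parent comment; in the eager style every intermediate is an ordinary array.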
mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
disgruntledphd2 · 3 years ago
Nope, Pytorch was inspired by the Lua version of Torch which well predates Tensor flow. To be fair, basically every other DL framework made the same mistake though.

Also, tensorflow was a total nightmare to install while Pytorch was pretty straightforward, which definitely shouldn't be discounted.

mirker · 3 years ago
PyTorch examples were also cleaner. torchvision had ResNet training batteries included, while TF had you roll your own or clone some weird Keras repository.
mirker commented on Why did Google Brain exist?   moderndescartes.com/essay... · Posted by u/brilee
qumpis · 3 years ago
I've yet to see an ML PhD be required to learn chemistry to the same extent that chemists need to learn ML (especially at the research level).
mirker · 3 years ago
I don’t understand what you mean. Here’s how many applied ML papers work: create a new dataset for a novel problem, download a PyTorch model, point model at dataset directory. Is it novel? By construction. Is the ML technique novel? No.
