ericlewis commented on Grok 4 Launch [video]   twitter.com/xai/status/19... · Posted by u/meetpateltech
z7 · 7 months ago
How do you explain Grok 4 achieving new SOTA on ARC-AGI-2, nearly doubling the previous commercial SOTA?

https://x.com/arcprize/status/1943168950763950555

ericlewis · 7 months ago
I still don't understand why people point to this chart as meaning anything. Cost per task is a fairly arbitrary X axis and in no way represents any sort of time scale. I would love to be told how they didn't underprice their model and give it an arbitrary amount of time to work.
ericlewis commented on Man suffers chemical burn that lasted months after squeezing limes   arstechnica.com/health/20... · Posted by u/lisper
crazygringo · a year ago
Fun fact: this is common knowledge in Brazil, since the "national drink" is a caipirinha made with fresh lime juice. If you drink it on the beach, people will warn you to be very careful -- if you accidentally spill a little on your leg, don't rinse it off, and stay in the sun, you'll get a dark spot that will last for months. Happened to me once.

Ever since, I've been baffled as to why this seems to be common knowledge there, yet, as far as I can tell, nobody anywhere else has the slightest idea that this is a thing.

I've always wondered if anybody ever tried it intentionally to "tan". For me it looked just like a big spot that was super tan, significantly darker than actual tanning can get me. It didn't look unhealthy in any way, just darker. I have to assume it has negative effects though...

ericlewis · a year ago
US, GA here. My mom was big on tanning and warned us of this (lemons are also bad). I believe she said something about it being used on purpose for tanning, but that you had to be careful or you would get badly burned. She probably did that around the late '80s or early '90s.
ericlewis commented on Private Cloud Compute Security Guide   security.apple.com/docume... · Posted by u/djoldman
slashdave · a year ago
That's a strange point of view. Clearly one shouldn't use private information for testing in any production environment.
ericlewis · a year ago
As a person who works on this kind of stuff, I know what they mean. It's very hard to debug things totally blind.
ericlewis commented on Addition is all you need for energy-efficient language models   arxiv.org/abs/2410.00907... · Posted by u/InvisibleUp
bobsyourbuncle · a year ago
I’m new to neural nets, when should one use fp8 vs fp16 vs fp32?
ericlewis · a year ago
The higher the precision, the better. Use what works within your memory constraints.
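
To make that concrete, here's a rough back-of-the-envelope sketch (my own illustration, not from the comment) of why memory is usually the deciding factor between fp32, fp16, and fp8:

```python
# Hypothetical example: weight memory scales linearly with precision.
params = 7e9  # assume a 7B-parameter model
for name, bytes_per_param in [("fp32", 4), ("fp16", 2), ("fp8", 1)]:
    gb = params * bytes_per_param / 1e9
    print(f"{name}: ~{gb:.0f} GB just for the weights")
# fp32: ~28 GB, fp16: ~14 GB, fp8: ~7 GB -- lower precision trades some
# numerical accuracy for fitting (and often running faster) on a given GPU.
```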
ericlewis commented on OpenAI Threatening to Ban Users for Asking Strawberry About Its Reasoning   futurism.com/the-byte/ope... · Posted by u/EgoIncarnate
AustinDev · a year ago
This seems like a fun attack vector. Find a service that uses o1 under the hood and then provide prompts that would violate this ToS to get their API key banned and take down the service.
ericlewis · a year ago
If you are using user attribution with OpenAI (as you should), then they will block that user's ID and the rest of your app will be fine.
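
For reference, a minimal sketch of what that attribution looks like, assuming the OpenAI Python SDK (the model name and user ID below are placeholders); passing a per-end-user ID lets OpenAI act on the abusive user rather than the whole API key:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Hello"}],
    user="end-user-1234",  # stable, anonymized ID for the end user making this request
)
print(response.choices[0].message.content)
```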
ericlewis commented on OpenAI threatens to revoke o1 access for asking it about its chain of thought   twitter.com/SmokeAwayyy/s... · Posted by u/jsheard
darby_nine · a year ago
I don't understand how that would lead to anything but a slightly different response. How can token prediction have this capability without explicitly enabling some heretofore unenabled mechanism? People have been asking this for years.
ericlewis · a year ago
The theory is that you fill the context with tokens more relevant to the problem at hand, as well as to its solution, which makes the model more likely to predict the correct answer.
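
A minimal sketch of the idea (my own illustration, not from the comment): the same question asked directly versus with room for intermediate reasoning tokens.

```python
question = "A train travels 60 km in 45 minutes. What is its speed in km/h?"

direct_prompt = f"Q: {question}\nA:"
cot_prompt = f"Q: {question}\nA: Let's think step by step."

# The second form tends to elicit intermediate tokens (45 min = 0.75 h,
# 60 / 0.75 = 80 km/h), and those extra tokens condition the final answer.
print(direct_prompt)
print(cot_prompt)
```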
ericlewis commented on Hardware Acceleration of LLMs: A comprehensive survey and comparison   arxiv.org/abs/2409.03384... · Posted by u/matt_d
mikewarot · a year ago
I've always been partial to systolic arrays. I iterated through a bunch of options over the past few decades, and settled upon what I think is the optimal solution, a cartesian grid of cells.

Each cell would have 4 input bits, 1 each from the neighbors, and 4 output bits, again, one to each neighbor. In the middle would be 64 bits of shift register from a long scan chain, the output of which goes to 4 16:1 multiplexers, and 4 bits of latch.

Through the magic of graph coloring, a checkerboard pattern would be used to clock all of the cells to allow data to flow in any direction without preference, and without race conditions. All of the inputs to any given cell would be stable.

This allows the flexibility of an FPGA, without the need to worry about timing issues or race conditions, glitches, etc. This also keeps all the lines short, so everything is local and fast/low power.

What it doesn't do is use gates efficiently, or give the fastest path for a given piece of logic. Every single operation happens effectively in parallel. All computation is pipelined.

I've had this idea since about 1982... I really wish someone would pick it up and run with it. I call it the BitGrid.
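
(Not the author's implementation -- just a rough simulation sketch of the idea as described, assuming a small toroidal grid, random lookup tables, and an N/E/S/W bit ordering:)

```python
import random

N = 4  # grid side length (toy size)

# Per-cell config: four 16-entry truth tables, i.e. the 64 bits of shift register.
config = [[[[random.randint(0, 1) for _ in range(16)] for _ in range(4)]
           for _ in range(N)] for _ in range(N)]

# Per-cell output latch: one bit toward each neighbor, ordered [N, E, S, W].
state = [[[0, 0, 0, 0] for _ in range(N)] for _ in range(N)]

def neighbor_inputs(r, c):
    """Gather the 4 input bits: the bit each neighbor drives toward cell (r, c)."""
    up    = state[(r - 1) % N][c][2]  # neighbor above drives its south-facing bit
    right = state[r][(c + 1) % N][3]  # neighbor to the right drives its west-facing bit
    down  = state[(r + 1) % N][c][0]  # neighbor below drives its north-facing bit
    left  = state[r][(c - 1) % N][1]  # neighbor to the left drives its east-facing bit
    return (up << 3) | (right << 2) | (down << 1) | left

def step(phase):
    """Update only one checkerboard color, so every updating cell's inputs are stable."""
    for r in range(N):
        for c in range(N):
            if (r + c) % 2 == phase:
                addr = neighbor_inputs(r, c)
                state[r][c] = [config[r][c][k][addr] for k in range(4)]

for t in range(8):
    step(t % 2)  # alternate the two checkerboard colors each tick
print(state)
```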

ericlewis · a year ago
This reminds me of a TPU.
ericlewis commented on Comma.ai: Refactoring for Growth   blog.comma.ai/refactoring... · Posted by u/ppsreejith
azinman2 · 2 years ago
“FSD (Supervised) v12 upgrades the city-streets driving stack to a single end-to-end neural network trained on millions of video clips, replacing over 300K lines of explicit C++ code”

This was the changelog message lately.

https://www.teslaoracle.com/2024/07/13/tesla-fsd-v12-5-highw....

ericlewis · 2 years ago
I have one of these. It’s not there.
ericlewis commented on ChatGPT unexpectedly began speaking in a user's cloned voice during testing   arstechnica.com/informati... · Posted by u/Brajeshwar
martin-t · 2 years ago
I don't understand how anybody can still claim LLMs show "complex reasoning".

It's been shown time and time again that they'll produce a correct chain of reasoning when given a problem (e.g. wolf, goat, cabbage crossing a river; 3 guards and a door; etc.) that is roughly similar to what's in the training data but will fail when given a sufficiently novel modification _while still producing output that is confidently incorrect_.

My own recent experience was asking ChatGPT 3.5 to encode an x86 instruction into binary. It produced the correct result and a page of reasoning which was mostly correct, except 2 errors which if made by a human would be described as canceling each other out.

But GPT didn't make 2 errors, that's anthropomorphizing it. A human would start from the input and use other information plus logical steps to produce the output. An LLM produces a stream of text that is statistically similar to what a human would produce. In this particular case, its statistics just weren't able to cover the middle of the text well enough but happened to cover the end. There was no "complex reasoning" linking the statements of the text to each other through logical inferences; there was simply text that was statistically likely to be arranged in that way.

ericlewis · 2 years ago
Perhaps it’s because I know human beings who have the exact same operation and failure mode as the LLM here, and I’m probably not the only one. Failing at something you’ve never seen and faking your way through it is a very human endeavor.
ericlewis commented on Tesla's FSD – A Useless Technology Demo   tomverbeure.github.io/202... · Posted by u/nxten
drcode · 2 years ago
Maybe I'm just a bad driver, but it feels much safer to me having two entities paying attention, instead of just one

It's much more comfortable not to have to micromanage the exact position of the car at every moment, as a manual driver does

I think your "automatic driving" argument is wishful thinking

ericlewis · 2 years ago
My wife is capable of “automatic driving” and I am not. She describes it as a flow state.

u/ericlewis

Karma: 498 · Cake day: May 30, 2012
About
[ my public key: https://keybase.io/ericlewis; my proof: https://keybase.io/ericlewis/sigs/XCTPq2tHD_F6eEQRWHPFiliVwNrX6bv4CDPfBnACmP0 ]