So this seems like a nonsensical test to me.
All 7 books come to ~1.75M tokens, so they don't quite fit yet. (At this rate of progress, mid-April should do it.) For now you can fit the first 4 books (~733K tokens).
Results: Opus 4.6 found 49 out of 50 officially documented spells across those 4 books. The only miss was "Slugulus Eructo" (a vomiting spell).
Freaking impressive!
https://www.wizardemporium.com/blog/complete-list-of-harry-p...
Why is this impressive?
Do you think it's actually ingesting the books and only using those as a reference? Is that how LLMs work at all? It seems more likely it's predicting these spell names from all the other references it has found on the internet, including lists of spells.
I did not ask about random companies that went under for whatever reason; I asked a specific question about users and revenue.
Uber, which you cite as a success, is only just starting to make any money, and any original investors are very unlikely to see a return given the huge amounts ploughed in.
MicroStrategy has transformed itself, same company, same founder, similar scam 20 years later, only this time they're peddling bitcoin as the bright new future. I'm surprised they didn't move on to GAI.
Qualcomm is now selling itself as an AI-first company. Is it one, or is it just trying to ride the next bubble?
Even if GAI becomes a roaring success, the prominent companies now are unlikely to be those with lasting success.
This LLM did it in (checks notes):
> Over nearly 2,000 Claude Code sessions and $20,000 in API costs
It may build, but does it boot? (That was also a significant and distinct next milestone. Also, will it blend?) Looks like yes!
> The 100,000-line compiler can build a bootable Linux 6.9 on x86, ARM, and RISC-V.
The next milestone is: is the generated code correct? The jury is still out on that one even for production compilers. And then there's the performance of the generated code.
> The generated code is not very efficient. Even with all optimizations enabled, it outputs less efficient code than GCC with all optimizations disabled.
Still a really cool project!
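On the correctness question: one standard technique is differential testing (the Csmith approach), where the same randomly generated program is built with a trusted compiler and with the compiler under test, and the runtime results are compared. A toy sketch of the idea in Python, with two function implementations standing in for the two compilers' binaries:

```python
import random

def reference(xs):
    """Stands in for a binary built by the trusted compiler (e.g. GCC)."""
    return sorted(xs)

def under_test(xs):
    """Stands in for a binary built by the compiler under test."""
    return sorted(xs, key=lambda x: x)  # same behaviour, different code path

def differential_test(trials=1000, seed=42):
    """Feed both 'binaries' random inputs and flag any divergence."""
    rng = random.Random(seed)
    for _ in range(trials):
        xs = [rng.randint(-100, 100) for _ in range(rng.randint(0, 20))]
        if reference(list(xs)) != under_test(list(xs)):
            return f"mismatch on input {xs}"
    return f"{trials} trials, no divergence"

print(differential_test())  # → 1000 trials, no divergence
```

This only shows the harness shape; a real campaign would generate C programs and diff the output of the compiled binaries, which catches wrong-code bugs without needing a formal proof of correctness.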
I'm curious about your take on the references the GAI might have used to create such a project, and whether that matters.
How long did it take for horses to be super-seeded by cars?
How long did power tools take to become the norm for tradesmen?
This has gone unbelievably fast.
AI will get there and replace humans at many tasks; machine learning already has. I'm not completely sure that generative AI will be the route we take. It is certainly superficially convincing, but those three years have not in fact seen huge progress IMO - huge amounts of churn and marketing versions, yes, but not huge amounts of concrete progress or upheaval. Lots of money has been spent, for sure! It is telling, for me, that many of the real founders at OpenAI stepped away - and I don't think that's just Altman; they're skeptical of the current approach.
PS Superseded.
In my experience this isn't true. People just assume their code is wrong and mess with it until they inadvertently do something that works around the bug. I've personally reported 17 bugs in GCC over the last 2 years and there are currently 1241 open wrong-code bugs.
Here's an example of a simple to understand bug (not mine) in the C frontend that has existed since GCC 4.7: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105180
LLMs, on the other hand, are non-deterministic, unpredictable, and fuzzy by design. That makes them less than ideal for producing output which is provably correct - sure, you can generate output and then laboriously check it; some people find that useful, some have yet to.
It's a little like using Bitcoin to replace currencies - sure you can do that, but it includes design flaws which make it fundamentally unsuited to doing so. 10 years ago we had rabid defenders of these currencies telling us they would soon take over the global monetary system and replace it, nowadays, not so much.
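The "generate, then laboriously check the output" loop mentioned above can at least be mechanized when the check itself is deterministic. A minimal sketch, where `flaky_generate` is a hypothetical stand-in for an LLM API call and the deterministic check is simply "does it parse as JSON":

```python
import json
import random

random.seed(0)  # seeded only so this demo is reproducible

def flaky_generate(prompt: str) -> str:
    """Stand-in for an LLM call: non-deterministic, sometimes malformed."""
    if random.random() < 0.3:
        return '{"answer": 42'  # truncated JSON
    return '{"answer": 42}'

def generate_checked(prompt: str, retries: int = 5) -> dict:
    """Sample until the output passes a deterministic validator."""
    for _ in range(retries):
        out = flaky_generate(prompt)
        try:
            return json.loads(out)
        except json.JSONDecodeError:
            continue
    raise RuntimeError("no valid output after retries")

print(generate_checked("compute the answer")["answer"])  # → 42
```

Note the asymmetry this illustrates: the checker only catches the failures you thought to validate for, so it makes the fuzziness tolerable rather than making the output provably correct.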
Everyone sees the downsides, but the upside is the thing everyone is in denial about. It's like: yeah, there are downsides, but why is literally everyone using it?
At least Bitcoin transactions are deterministic.
Not many would want to use an AI currency (mostly works; always says "Oh, you are 100% right" after losing one's money).
In a similar way LLMs seem to me to be solving the wrong problem - an elegant and interesting solution, but a solution to the wrong problem (how can I fool humans into thinking the bot is generally intelligent), rather than the right problem (how can I create a general intelligence with knowledge of the world). It's not clear to me we can jump from the first to the second.