Readit News
jl2718 commented on Stargate Project: SoftBank, OpenAI, Oracle, MGX to build data centers   apnews.com/article/trump-... · Posted by u/tedsanders
jl2718 · a year ago
1. At this scale, we’re not just talking about buying GPUs. It requires semiconductor fabs, assembly factories, power plants, batteries/lithium, cooling, water, hazardous waste disposal. These data centers are going to have to be massively geo-engineered arcologies.

2. What are they doing? AGI/ASI is a neat trick, but then what? I’m not asking because I don’t think there is an answer; I’m asking because I want the REAL answer. Larry Ellison was talking about RNA cancer vaccines. Well, I was the one who made the neural network model for the company with the US patent on this technique, and that pitch makes little sense. As the problem is understood today, the computational problems are 99% solved with laptop-class hardware. The remaining problems are solved not by neural networks but by molecular dynamics, which is done in FP64. Even if FP8 neural structure approximation speeds it up 100x, FP64 will be 99% of the computation (see the sketch at the end of this comment). So what we today call “AI infrastructure” is not appropriate for the task they talk about.

What is it appropriate for? Well, I know that Sam is a bit uncreative, so I assume he’s just going to keep following the “Her” timeline and build a massive playground for LLMs to talk to each other and leave humanity behind. I don’t think that is necessarily unworthy of our Apollo-scale commitment, but there are serious questions about the honesty of the project, and about what we should demand in transparency.

We’re obviously headed toward a symbiotic merger where LLMs and GenAI are completely in control of our understanding of the world. There is a difference between watching a high-production movie for two hours and then going back to reality, versus a never-ending stream of false sensory information engineered individually to control your behavior. The only question is whether we will be able to see behind the curtain of the great Oz. That’s what I mean by transparency: not financial or organizational, but actual code, data, model, and prompt transparency. Is this a fundamental right worth fighting for?
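The Amdahl’s-law sketch mentioned above (the 50/50 starting split is my illustrative assumption, not a measured figure):

    # Speed up only the NN fraction of a mixed NN + molecular-dynamics job.
    nn_frac, md_frac = 0.5, 0.5   # assumed starting shares of compute
    nn_speedup = 100              # FP8 structure approximation: 100x on the NN part
    nn_after = nn_frac / nn_speedup
    print(f"FP64 MD share afterward: {md_frac / (md_frac + nn_after):.0%}")  # -> 99%

The FP64 part stays fixed, so the job as a whole barely speeds up; GPU-heavy “AI infrastructure” buys almost nothing here.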

jl2718 commented on Nobody cares   grantslatton.com/nobody-c... · Posted by u/fzliu
azeirah · a year ago
> The McDonald's touch-screen self-order kiosk takes 27 clicks to get a meal. They try to up-sell you 3 times. Just let me pay for my fucking burger, Jesus Christ. The product manager, the programmer, the executives. None of these people care.

I was working in this space! And I got fired for refusing to work on more upsell features for clients like Coca Cola and such.

I don't want to work on adding fucking ADS into checkout. That is fucked up.

jl2718 · a year ago
I have an interesting anecdote about that. I was consulting for a very large tech company on their advertising product. They essentially wanted an upsell product to sell to advertisers, like a premium offering to increase their reach. My first step is always to establish a baseline by backtesting their algorithm against simple zeroth- and first-order estimators. Measuring this is a little complicated, but their targeting appeared to be worse than naive Bayes by a large factor, especially with respect to customer conversion. I was a pretty good data scientist, but this company paid their DS people an awful lot of money, so I couldn’t have been the first to discover this. The short story is that they didn’t want a better algorithm. They wanted an upsell feature.

I started getting a lot of work in advertising, and it took a number of clients before I saw the general trend: the advertising business is not interested in delivering ads to the people who want the product. Its real interest is in creating a stratification of product offerings that are all roughly as valuable to the advertiser as the price paid for them. It has to find ways to split up the tranches of conversion probability and sell them all separately, without revealing that this is only possible by selling ad placements that are intentionally not as good as they could be. Note that this is not insider knowledge of actual policy, just common observations from analyzing data at different places.
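A minimal sketch of that kind of baseline backtest, in Python (the data file, column names, and production score field are all hypothetical):

    import pandas as pd
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import BernoulliNB

    df = pd.read_csv("impressions.csv")                    # hypothetical impression log
    X = df[["clicked_before", "on_mobile", "in_segment"]]  # binary features
    y = df["converted"]                                    # 1 if the impression converted
    s = df["production_score"]                             # production targeter's score

    X_tr, X_te, y_tr, y_te, _, s_te = train_test_split(
        X, y, s, test_size=0.3, random_state=0)
    nb = BernoulliNB().fit(X_tr, y_tr)
    print("naive Bayes AUC:", roc_auc_score(y_te, nb.predict_proba(X_te)[:, 1]))
    print("production AUC: ", roc_auc_score(y_te, s_te))

If the first number beats the second by a wide margin, the targeting is worse than the naive baseline, which is what I kept finding.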
jl2718 commented on CodeMic: A new way to talk about code   codemic.io/... · Posted by u/seansh
jl2718 · a year ago
> code is not literature

One thing I’ve thought about is how AI assistants are actually turning code into literature, and literature into code.

In old-fashioned programming, you can roughly observe a correlation between programmer skill and linear composition of their programs, as in, writing it all out at once from top to bottom without breaks. There was then this pre-modern era where that practice was criticized in favor of things like TDD and doc-first and interfaces, but it still probably holds on the subtasks of those methods. Now there are LLM agents that basically operate the same way. A stronger model will write all at once, while a weaker model will have to be guided through many stages of refinement. Also, it turns the programmer into a literary agent, giving prose descriptions piece by piece to match the capabilities of the model, but still in linear fashion.

And I can’t help but think that this points to an inadequacy of the language. There should be a programming language that enables arbitrary complexity through deterministic linear code, something humans seem innately comfortable with. One question I have about this is why postfix notation is so unpopular versus infix or prefix, when complex expressions in postfix read more like literature, with details building up to greater concepts. Is it just because of school? Could postfix fix the STEM/humanities gap?
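A toy postfix evaluator makes that reading order concrete: details are pushed first, then each operator combines what is already on the stack, strictly left to right (a sketch, not a language proposal):

    def eval_postfix(expr: str) -> float:
        ops = {"+": lambda a, b: a + b, "-": lambda a, b: a - b,
               "*": lambda a, b: a * b, "/": lambda a, b: a / b}
        stack = []
        for tok in expr.split():
            if tok in ops:
                b, a = stack.pop(), stack.pop()   # operands accumulated first
                stack.append(ops[tok](a, b))      # then the concept combines them
            else:
                stack.append(float(tok))
        return stack.pop()

    print(eval_postfix("3 4 + 5 *"))  # 35.0: (3 + 4) * 5, no parentheses needed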

I see LLMs as translators, which is not new, because that’s what they were built for, but in this case between two very different structures of language. That is why they must grow in parameters with the size of the task, rather than processing linearly along a task with limited memory as in the original spoken-language-to-spoken-language task. If mathematics and programming were more like spoken language, it seems the task would be massively simpler. So maybe the problem for us, too, is the language and not the intelligence.

jl2718 commented on Google support third-party tools in Gemini Code Assist   techcrunch.com/2024/12/17... · Posted by u/nancy_le
xnx · a year ago
Official announcement 3 days ago: https://cloud.google.com/blog/products/application-developme...

I'm curious what the Google Docs integration is. IDEs should be a lot more like Google Docs: cloud-first, continuous save, multiplayer, commenting, total revision history, etc. I would love to write working script code in Google Docs and instantly have access to it via a URL.

jl2718 · a year ago
This is an awesome development, and I don’t want to take anything away from the credit due to the product. But I really dislike these bloviated corporate press releases; this one reads like a full article generated from a one-sentence LLM prompt. Perhaps the Internet UX from here will be a competition between AI-based content generation and AI-based summarization, essentially DECCO instead of CODEC. Kind of like how spam grew to consume 99% of email, so everybody has to run spam filters to get what they want. Technology and the abuse of it move together.
jl2718 commented on Ask HN: What AI tools changed your work/life?    · Posted by u/divan
bootstrpppin · a year ago
ChatGPT but not for the good

It's made me very lazy with my thinking and writing.

jl2718 · a year ago
This is the bigger reality. It’s turned almost all business and academic writing into long-winded meaningless trash. Well, more than it already was, I guess. It seems the way people use it is to expand a few bits of information into many bits of content to convince others that work was done. It’s like a Turing test for laziness. The other issue is that it tends toward agreement on anything it wasn’t trained to specifically disagree about. I can see a smarter and more disagreeable bot doing much worse on LMSys than the sycophant models. Nothing new there, I guess. But it’s spilling over into human norms as well: previously normal human deviation from chat-model-style interaction now reads as anomalous, so everybody has to use the AI, and therefore nobody is providing any more value than the LLM, so everybody is getting laid off, except the disagreeable guy, and he gets fired first. It’s hacking us through our positive-reinforcement vulnerabilities, which get worse the more they’re exploited, and it has none of the human resource constraints that previously kept them in check.
jl2718 commented on Why Chinese spies are sending a chill through Silicon Valley   telegraph.co.uk/business/... · Posted by u/RickJWagner
jl2718 · a year ago
This is a ridiculous take on espionage. “Technology” would be the lowest priority for collection at Google.
jl2718 commented on How Intel Missed the iPhone: The XScale Era   thechipletter.substack.co... · Posted by u/chmaynard
jl2718 · a year ago
Andy Grove flew in Clayton Christensen, let him talk for about 15 seconds, and then decided that Intel would disrupt itself by taking huge losses on Celeron. But Celeron did not save Intel; ASCI Red and multicore saved Intel. If he had actually read Clayton’s book, he would have understood that. Otellini got disruption theory correct and stayed out of mobile. But was that right? Maybe not in the current monetary environment, where investment flows dwarf operating flows. A big mobile market could attract more investment than the losses it would generate. So disruption theory now works in reverse, and I’m not sure how far that implication goes.
jl2718 commented on Eric Schmidt deleted Stanford interview   youtube.com/watch?v=3f6XM... · Posted by u/zniturah
mikewarot · a year ago
At about 39 minutes in, he's asked if efforts analogous to SETI@home can be used to get around the scaling problems with training the next round of big models.

He gives a strongly Nvidia-oriented answer that I happen to think is dead wrong. Pushing more and more GPU/memory bandwidth into ever more expensive packages that are obsolete after a year or two isn't the approach that I think will win in the end.

I think systems which eliminate the memory/compute distinction completely, like FPGAs but optimized for throughput instead of latency, are the way to go.

Imagine if you had a network of machines that could each handle one layer of an LLM with no memory transfers; your bottleneck would be just getting the data between layers. GPT-4, for example, is likely 8 separate columns of 120 layers of 1024^2-parameter matrix multiplies. Assuming infinitely fast compute, you still have to transfer at least 2KB of activations between layers for every token. Assuming PCI Express 7, at about 200 gigabytes/second, that's about 100,000,000 tokens/second across all of the computing fabric.

Flowing 13 trillion tokens through that would take 36 hours/epoch.

Doing all of that in one place is impressive. But if you can farm it out and have a bunch of CPUs and network connections, you're transferring 4KB each way for each token from each workstation. It wouldn't be unreasonable to aggregate all of those flows across the internet without the need for anything super fancy. Even if it took a month/epoch, it could keep going for a very long time.
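The napkin math above, spelled out (using the figures as stated):

    layer_io = 2e3              # ~2KB of activations per token per layer hop
    link_bw = 200e9             # ~200 GB/s, PCIe 7-class link
    tokens_per_sec = link_bw / layer_io
    print(f"{tokens_per_sec:.0e} tokens/s")                # ~1e+08 tokens/second
    epoch = 13e12               # ~13T training tokens
    print(f"{epoch / tokens_per_sec / 3600:.0f} h/epoch")  # ~36 hours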

jl2718 · a year ago
I think you need higher arithmetic intensity. Gradient descent is best suited to monolithic GPUs. There could be other possibilities for layer-distributed training.
jl2718 commented on Napkin math suggests Bitcoin will perish unless its mining incentives change   keydiscussions.com/2024/0... · Posted by u/spenvo
_heimdall · 2 years ago
How is it different exactly? If I had 51% of the hashing power on the bitcoin network, couldn't I change block history and have a majority of the network agree on that new chain?
jl2718 · a year ago
No. If you had 51%, you could revert one block of history for every 49 blocks of attack time: while the honest 49% extends the public chain by 49 blocks, your 51% mines about 51, just enough to overtake a one-block deficit. In addition, you have no ability to create transactions that were not already signed by the owners, nor to create more bitcoins than the block reward. This is because of the UTXO model rather than the state-machine model: in Bitcoin, every transaction is verified against history, while the EVM chains only verify transactions against state. So if you control EVM state, you can bootstrap every new node to any state you wish, but UTXO verification requires rewriting the entire history.
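A toy contrast between the two verification models (illustrative structures only, nothing like the real wire formats):

    # UTXO-style: a transaction is valid only against the full history,
    # replayed to derive the current set of unspent outputs.
    def utxo_valid(tx, history):
        utxos = set()
        for old in history:
            utxos -= set(old["spends"])
            utxos |= set(old["creates"])
        return all(inp in utxos for inp in tx["spends"])

    # Account/EVM-style: a transaction is valid against a state snapshot,
    # so whoever controls the state controls what validates.
    def account_valid(tx, balances):
        return balances.get(tx["from"], 0) >= tx["amount"]

    history = [{"spends": [], "creates": ["coinbase:0"]}]
    print(utxo_valid({"spends": ["coinbase:0"], "creates": ["a:0"]}, history))  # True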
jl2718 commented on Napkin math suggests Bitcoin will perish unless its mining incentives change   keydiscussions.com/2024/0... · Posted by u/spenvo
jl2718 · 2 years ago
“Bitcoin security” is a different notion than almost all other popular chains. A prolonged 51% attack on bitcoin implies the ability to double-spend, but not at all the ability to affect prior balances. A 51% attack on most smart contract chains implies the ability to change any and all state arbitrarily.

The simplest solution is to wait until the cost of hashing exceeds the value of your transaction by some reasonable factor. I expect that better solutions will come along by soft fork without adverse effect on supply or decentralization.
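A napkin version of that waiting rule (every number here is an illustrative assumption):

    block_reward_usd = 3.125 * 60_000   # subsidy x assumed BTC price ~ miner cost/block
    tx_value_usd = 5_000_000            # the transfer you want settled
    safety_factor = 3                   # hashing cost should exceed value by this much
    confs = safety_factor * tx_value_usd / block_reward_usd
    print(f"wait ~{confs:.0f} confirmations (~{confs * 10 / 60:.0f} hours)")  # ~80, ~13h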

u/jl2718

Karma: 3462 · Cake day: February 17, 2018