But to say that they don't write any code at all is really a stretch. Maybe I'm not good enough at AI-assisted and vibe coding, but code quality always seems to drop off hard the moment one steps a bit outside the common patterns.
“The question of whether a computer can think is no more interesting than the question of whether a submarine can swim.”
I am not interested in computers that have their own intelligence, but I do want computers that increase my own intelligence. If I had an AGI that could design me a safe, small, and cheap fusion reactor, of course I would be interested in that.
My intelligence is intrinsically limited by my biology. The only way to really scale it up is to wire stuff into my brain, and I'd prefer an AGI over that any day.
Edit: answered the question
Deep Learning-based methods will absolutely have a place in this in the future, but today's machines usually rely on classic methods. The advantages are that the hardware is much cheaper and requires less electrical and thermal management. This is changing now with cheaper NPUs, but with machine lifetimes measured in decades, it will take a while.
I believe in FOSS and can make an argument that lots of people on HN will accept, but many outside this context will not understand it or care.
This is like complaining about long division not behaving nicely when dividing by 0. The algorithm isn't the problem, and blaming the wrong part does not help understanding.
It distracts from what actually helps, which is using loss functions with better-behaved gradients, e.g., the Huber loss instead of the quadratic loss.
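A minimal sketch of what "better-behaved gradients" means here, in plain NumPy (the delta value and the toy residuals are made up for illustration): the quadratic loss's gradient grows linearly with the residual, while the Huber loss's gradient is capped at delta, so outliers cannot blow up the update.

```python
import numpy as np

def quadratic_grad(r):
    # d/dr of 0.5 * r^2 grows linearly with the residual,
    # so large outliers produce large gradients.
    return r

def huber_grad(r, delta=1.0):
    # d/dr of the Huber loss is clipped to [-delta, delta],
    # so outliers cannot dominate the update.
    return np.clip(r, -delta, delta)

for r in [0.1, 1.0, 10.0, 100.0]:
    print(f"r={r:6.1f}  quadratic grad={quadratic_grad(r):7.1f}  "
          f"huber grad={huber_grad(r):5.1f}")
```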
Fully agree. It's not the "fault" of backprop. It does what you tell it to do: find the direction in which your loss is reduced the most. If the first layers get no signal because the gradient vanishes, then the reason is your network layout: very small modifications in the initial layers would lead to very large modifications in the final layers (essentially an unstable computation), so gradient descent simply cannot move that fast.
Instead, it's a vital signal for debugging your network. Inspecting things like gradient magnitudes per layer shows whether you have vanishing or exploding gradients. And that has led to great inventions for dealing with them, such as residual networks and a whole class of normalization methods (such as batch normalization).
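Here is a minimal sketch of that kind of per-layer inspection, assuming PyTorch; the toy model and data are made up. A steady drop in the gradient norms toward the early layers is the classic signature of vanishing gradients.

```python
import torch
import torch.nn as nn

# Toy sigmoid MLP, deliberately prone to vanishing gradients.
model = nn.Sequential(
    nn.Linear(32, 32), nn.Sigmoid(),
    nn.Linear(32, 32), nn.Sigmoid(),
    nn.Linear(32, 32), nn.Sigmoid(),
    nn.Linear(32, 1),
)

x = torch.randn(64, 32)
y = torch.randn(64, 1)

loss = nn.functional.mse_loss(model(x), y)
loss.backward()

# Print the gradient norm of each layer's weights.
for name, param in model.named_parameters():
    if param.grad is not None and name.endswith("weight"):
        print(f"{name:12s} grad norm = {param.grad.norm().item():.2e}")
```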
I've dreamed of a NeRF-powered backrooms walking simulator for quite a while now. This approach is "worse" because the mesh seems explicit, rather than the world simply becoming whatever you look at, but that's arguably better for real-world use cases, of course.
True, it sounds (and looks) a lot like https://scp-wiki.wikidot.com/scp-3008
Then I realized there are things that don't actually fail catastrophically even though they are extremely complex and not always built by efficient organizations: the internet, the power grid (at least in most developed countries). What's the retort to this argument?
(I think this was published in one of Llang's papers but in a rather obscure language.)