It’s strange how AI style is so easy to spot. If LLMs just follow the style that they encountered most frequently during training, wouldn’t that mean that their style would be especially hard to spot?
This paper, for example, uses the 'dual N-back test' as part of its evaluation. In humans, performance on this task reflects variation in working-memory capacity, which in turn correlates with 'g'; but it seems pretty meaningless when applied to transformers -- the task itself has nothing intrinsically to do with intelligence, and 'dual N-back' should be trivially easy for a transformer, which has complete recall over its large context window.
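To see why the task is trivial with perfect recall: N-back just asks "does the current item match the one N steps back?", which collapses to a single index lookup once the whole sequence is visible at once (as it is in a transformer's context window). A minimal sketch, with a hypothetical helper name:

```python
def n_back_answers(sequence, n):
    """For each position from n onward, report whether sequence[i]
    equals sequence[i - n] -- the entire (single) N-back task, reduced
    to a lookup when the full sequence is available simultaneously."""
    return [sequence[i] == sequence[i - n] for i in range(n, len(sequence))]

print(n_back_answers(["A", "B", "A", "C", "A"], 2))  # [True, False, True]
```

The "dual" variant simply runs two such streams (e.g. letters and positions) in parallel; the reduction is the same. What makes the task hard for humans -- holding the last N items in working memory under time pressure -- is exactly the part a transformer gets for free.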
Human intelligence tests are designed to measure variation in human intelligence -- it's silly to take those same isolated benchmarks and pretend they mean the same thing when applied to machines. Obviously a machine doing well on an IQ test doesn't mean that it will be able to do what a high IQ person could do in the messy real world; it's a benchmark, and it's only a meaningful benchmark because in humans IQ measures are designed to correlate with long-term outcomes and abilities.
That is, in humans, performance on these isolated benchmarks is correlated with our ability to function in the messy real world, but for AI, that correlation doesn't exist -- because the tests weren't designed to measure 'intelligence' per se, but human intelligence in the context of human lives.
Surely you can appreciate that if the next stop on the journey of technology can take over the process of improvement itself, that would make it an awfully notable stop? Maybe not the "destination", but maybe worth the "endless conversation"?
Does it actually work? Hasn't AI training so far simply ignored all license and copyright restrictions?
Also, choosing to close schools during COVID was as catastrophic as many predicted. Our kid was in 7th grade during COVID, and each year teachers report that the effects are still being felt across many students. Of course, the naturally great students recovered quickly and the innately weak students stayed weak, but the biggest loss was in the large middle of B/C students.
https://www.goodmorningamerica.com/family/story/author-sugge...
> HTTP cookies were never intended for session management
Seems odd. IIRC that's exactly what they were meant for: state management for HTTP, which is stateless. Am I missing some history here?
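For what it's worth, the classic pattern is exactly this: the server hands the browser an opaque session ID in a Set-Cookie header, the browser echoes it back on every request, and the server uses it to recover per-user state across otherwise stateless HTTP exchanges. A minimal sketch using Python's stdlib cookie parser (the `SESSIONS` store and helper names are hypothetical; a real server would add expiry, Secure, etc.):

```python
from http.cookies import SimpleCookie
import secrets

SESSIONS = {}  # session_id -> per-user state held server-side

def start_session():
    """Create a session; return (Set-Cookie header value, state dict)."""
    session_id = secrets.token_hex(16)
    SESSIONS[session_id] = {}
    cookie = SimpleCookie()
    cookie["sid"] = session_id
    cookie["sid"]["httponly"] = True  # keep the ID away from page scripts
    return cookie["sid"].OutputString(), SESSIONS[session_id]

def load_session(cookie_header):
    """Recover state from a request's Cookie header, if the session exists."""
    cookie = SimpleCookie(cookie_header)
    if "sid" in cookie:
        return SESSIONS.get(cookie["sid"].value)
    return None

# First response sets the cookie and stashes some state:
header, state = start_session()   # header like "sid=...; HttpOnly"
state["user"] = "alice"

# On the next request the browser sends "sid=..." back:
sid_pair = header.split(";")[0]
assert load_session(sid_pair)["user"] == "alice"
```

Whether that was the *original* intent of the Netscape cookie spec is the historical question here, but session management is certainly what cookies were standardized around in RFC 6265.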
Adopt whom? There are almost no children available for adoption, only severely handicapped children who need an auxiliary family.
Might be easier with a donor egg, but where are you going to get one? Egg donation is highly regulated, and many would find it hard to find a donor -- yet this solution requires one, so you'd already need to have it available.
This is not true, at least in the United States. For one thing, there are many children in foster care who want to be adopted. It is also possible, though difficult and expensive, to adopt infants from mothers giving their children up for adoption. I am not saying it's an easy option or that everyone should pursue it, but it is an option.
- Is the work easier to do? I feel like the work is harder.
- Is the work faster? It sounds like it’s not faster.
- Is the resulting code more reliable? This seems plausible given the extensive testing, but it’s unclear if that testing is actually making the code more reliable than human-written code, or simply ruling out bugs an LLM makes but a human would never make.
This does not look like a viable path forward to me. I’m not saying LLMs can’t be used for coding, but I suspect that either they will get better, to the point that this extensive harness is unnecessary, or they will not be commonly used in this way.
The author didn't discuss the speed of the work very much. It is certainly true that LLMs can write code faster than humans, and sometimes that works well. What would be nice is an analysis of the productivity gains from LLM-assisted coding in terms of how long it took to do an entire project, start to finish.