Readit News
blake929 commented on Retroformer: Retrospective Large Language Agents   arxiv.org/abs/2308.02151... · Posted by u/blake929
blake929 · 2 years ago
Abstract: Recent months have seen the emergence of a powerful new trend in which large language models (LLMs) are augmented to become autonomous language agents capable of performing objective oriented multi-step tasks on their own, rather than merely responding to queries from human users. Most existing language agents, however, are not optimized using environment-specific rewards. Although some agents enable iterative refinement through verbal feedback, they do not reason and plan in ways that are compatible with gradient-based learning from rewards. This paper introduces a principled framework for reinforcing large language agents by learning a retrospective model, which automatically tunes the language agent prompts from environment feedback through policy gradient. Specifically, our proposed agent architecture learns from rewards across multiple environments and tasks, for fine-tuning a pre-trained language model which refines the language agent prompt by summarizing the root cause of prior failed attempts and proposing action plans. Experimental results on various tasks demonstrate that the language agents improve over time and that our approach considerably outperforms baselines that do not properly leverage gradients from the environment. This demonstrates that using policy gradient optimization to improve language agents, for which we believe our work is one of the first, seems promising and can be applied to optimize other models in the agent architecture to enhance agent performances over time.
blake929 commented on Attention Is Off By One   evanmiller.org/attention-... · Posted by u/elbasti
blake929 · 2 years ago
Some very interesting discussion of outlier features and quantization: https://timdettmers.com/2022/08/17/llm-int8-and-emergent-fea...

* Outlier values are used to prune values.
* Transformers seem to undergo a "phase shift" in how outlier features are treated around 6.7B parameters. This could complicate research on removing them.

Maybe you and Tim Dettmers would have a lot to talk about :)

blake929 commented on Oregon decriminalized hard drugs – early results aren’t encouraging   theatlantic.com/politics/... · Posted by u/slapshot
tlogan · 2 years ago
I used to strongly support making drugs legal. I thought: this is a free country, you should be able to do what you want.

But what I've seen in San Francisco has made me think differently. Most people who use drugs eventually end up not being able to live like normal adults. And no one willingly goes to get help or treatment.

The problem will stick around because politicians care more about how things look. They'll say the numbers are wrong, or focus on wedge issues like transgender, guns, but they're not going to do anything on hard issues like this one.

Does anyone have ideas on what we should do? Should we make drugs illegal again and force people into rehab? Should we require drug tests for homeless people to receive government help like SF CAAP payments?

blake929 · 2 years ago
I'm not sure SF is a good example. It's not a healthy city, but its problems go way beyond drug use and it doesn't have the same policies that Oregon adopted.
blake929 commented on Tesla created secret team to suppress thousands of driving range complaints   reuters.com/investigates/... · Posted by u/mfiguiere
blake929 · 2 years ago
A lot of comments are discussing the difficulty of estimating range accurately or how all EPA estimates are inflated. But the article claims Tesla knowingly uses an algorithm with inflated numbers and swaps the rosy estimate out for a more accurate estimate at 50% charge. That's not a good-faith attempt at estimating range; it's a dark pattern.
blake929 commented on Tesla created secret team to suppress thousands of driving range complaints   reuters.com/investigates/... · Posted by u/mfiguiere
nicpottier · 2 years ago
I think it's basically some PM taking a stand at Tesla with the fact that there's just no good estimate possible without knowing the route, so they will always show EPA estimate.

Any sane Tesla driver changes that to show percent battery instead of range.

As others have said, the trip planner is excellent, within a percent or two even in winter conditions, driving over passes, etc.

Note that range changes a lot more due to elevation, speed, cold etc, compared to gas cars because electrics are so much more efficient.

blake929 · 2 years ago
The article says it was a mandate from Elon Musk. Not sure I believe the claim, but I'd also be surprised if he wasn't aware just how optimistic the EPA estimate is.
blake929 commented on SpaceX Starship rocket explodes minutes after launch from Texas   apnews.com/article/spacex... · Posted by u/fnordpiglet
blake929 · 2 years ago
I get that there are multiple endpoints being tested here and some of them may have been satisfied, but I feel like I'm living in a bizarro world on Hacker News where people are arguing that multiple failed engines, a failed stage detachment, and an exploding rocket worth millions of dollars and carrying a large number of limited-supply Raptor engines is a "massive success". Can we just call it what it is? Mixed results maybe? Is that not a fair assessment?

u/blake929

Karma: 101 · Cake day: July 13, 2020