LLM evaluations are highly sensitive to the details of prompt structure. This post shows how structured generation reduces variance in results and shifts in model rankings.
My concern with grammar-based sampling is that it makes the model dumber: after all, you are forcing it to say something other than what it thought would be best.
Intuitively, regex or JSON grammars have a much lower "semantic dimension" than what today's LLMs allow. Maybe the observed performance gains result from that lower dimensionality.
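For context on the mechanism being debated: grammar-constrained sampling typically works by masking out, at each decoding step, every token that would make the output invalid under the grammar, then picking among the survivors. Here is a toy sketch of that idea; the vocabulary, the pattern, and the `model_scores` stand-in are all made up for illustration and bear no relation to any real tokenizer or library:

```python
import re

# Toy vocabulary and target grammar (illustrative only): the model must
# emit exactly {"answer": "yes"} or {"answer": "no"}.
VOCAB = ['{', '}', '"answer"', ':', '"yes"', '"no"', ' ']
PATTERN = re.compile(r'\{"answer": "(yes|no)"\}')
TARGETS = ['{"answer": "yes"}', '{"answer": "no"}']


def model_scores(prefix):
    """Stand-in for LLM logits: this toy 'model' mildly prefers "yes"."""
    return {tok: (2.0 if tok == '"yes"' else 1.0) for tok in VOCAB}


def allowed(prefix, tok):
    """A token is allowed if prefix+tok can still extend to a valid string.

    Real implementations track this incrementally with a regex/grammar
    automaton; enumerating full targets here keeps the sketch simple.
    """
    candidate = prefix + tok
    return any(t.startswith(candidate) for t in TARGETS)


def constrained_decode():
    """Greedy decoding with grammar masking applied before each pick."""
    out = ''
    while not PATTERN.fullmatch(out):
        scores = model_scores(out)
        valid = [t for t in VOCAB if allowed(out, t)]
        out += max(valid, key=lambda t: scores[t])
    return out


print(constrained_decode())  # → {"answer": "yes"}
```

The point of the sketch is that the mask never changes the model's scores; it only removes grammar-breaking continuations, which is why the "forcing it to say something else" worry applies only when the highest-scoring token would have violated the grammar.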
That whole structured generation line of work looks promising. I hope someone else takes this and runs evaluations on other benchmarks. Curious to see if the results translate!