remoquete (u/remoquete)

remoquete commented on Can LLMs do randomness? rnikhil.com/2025/04/26/ll... · Posted by u/whoami_nr

One thing to consider: we don’t know if these LLMs are wrapped with server-side logic that injects randomness (e.g. using actual code or external RNG). The outputs might not come purely from the model's token probabilities, but from some opaque post-processing layer. That’s a major blind spot in this kind of testing.

remoquete · 9 months ago

Agreed. These tests should be performed on local models.

remoquete commented on Sycophancy in GPT-4o openai.com/index/sycophan... · Posted by u/dsr12

thethethethe · 9 months ago

I'm not sure how this problem can be solved. How do you test a system with emergent properties of this degree that whose behavior is dependent on existing memory of customer chats in production?

remoquete · 9 months ago

Using prompts know to be problematic? Some sort of... Voight-Kampff test for LLMs?

theletterf commented on Sycophancy in GPT-4o openai.com/index/sycophan... · Posted by u/dsr12

theletterf · 9 months ago

Don't they test the models before rolling out changes like this? All it takes is a team of interaction designers and writers. Google has one.

Posted by u/theletterf 9 months ago

Get Weird and Disappear ludic.mataroa.blog/blog/g...

Posted by u/remoquete 10 months ago

Build your own tech writing tools using LLMs passo.uno/build-tech-writ...

Posted by u/remoquete 10 months ago

Show HN: A tool that lets tech writers test custom LLM prompts with context

remoquete commented on Show HN: I built an AI that turns GitHub codebases into easy tutorials github.com/The-Pocket/Tut... · Posted by u/zh2408

amelius · 10 months ago

I've said this a few times on HN: why don't we use LLMs to generate documentation? But then came the naysayers ...

remoquete · 10 months ago

Because you can't. See my previous comment. https://news.ycombinator.com/item?id=43748908

remoquete commented on Show HN: I built an AI that turns GitHub codebases into easy tutorials github.com/The-Pocket/Tut... · Posted by u/zh2408

kaycebasques · 10 months ago

My bet is that the combination of humans and language models is stronger than humans alone or models alone. In other words there's a virtuous cycle developing where the codebases that embrace machine documentation tools end up getting higher quality docs in the long run. For example, last week I tried out a codebase summary tool. It had some inaccuracies and I knew exactly where it was pulling the incorrect data from. I fixed that data, re-ran the summarization tool, and was satisfied to see a more accurate summary. But yes, it's probably key to keep human technical writers (like myself!) in the loop.

remoquete · 10 months ago

Indeed. Augmentation is the way forward.

remoquete commented on Show HN: I built an AI that turns GitHub codebases into easy tutorials github.com/The-Pocket/Tut... · Posted by u/zh2408

remoquete · 10 months ago

This is nice and fun for getting some fast indications on an unknown codebase, but, as others said here and elsewhere, it doesn't replace human-made documentation.

https://passo.uno/whats-wrong-ai-generated-docs/

remoquete commented on Claude Code: Best practices for agentic coding anthropic.com/engineering... · Posted by u/sqs

remoquete · 10 months ago

What's the Gemini equivalent of Claude Code and OpenAI's Codex? I've found projects like reugn/gemini-cli, but Gemini Code Assist seems limited to VS Code?

u/remoquete

KarmaCake day1189October 2, 2019

About

I'm a technical writer and aspiring Rust developer based in Barcelona, Spain

My site: https://passo.uno

View Original