For any task, whether code or a legal document, immediately asking "What can be done to make it better?" or "Are there any problems with this?" typically leads to improvement.
Microsoft Office: Wasn't close to the first office editing suite
Google: Wasn't close to the first search engine
Facebook: Wasn't close to the first social media website
Apple: ~~First "smart phone"~~ Not the first personal computer, and (as comments reminded me) not the first smartphone either
Netflix: Wasn't close to the first video rental service.
Amazon: Wasn't close to the first web store
None of the big five were first in their dominant categories. They were first to offer some gimmick (e.g., Google was fast; Netflix was by mail, with no late fees), but not first categorically.
Though they certainly benefited from the lessons of those that came before them.
"key differentiator" and not necessarily easy to pull off or pay for
I assume the goal of Alexa was never to be the top conversational system on the planet; it was to sell more stuff on Amazon. Apple's approach of making a friendly and helpful chat assistant helps keep people inside its ecosystem, but it's not clear how any skill beyond "Alexa, buy more soap" was going to contribute meaningfully to Alexa's success as a product from Amazon's perspective. I saw the part about them having a "how good at conversation is it" metric, but that cannot be the metric leadership actually cared about; it was always going to be "how much stuff did we sell through Alexa." In other words, Amazon never appeared to be in the race to make the best voice assistant, and I'm not sure why it would want to be.
After years of raising three kids, you would think that if I asked to add diapers to the cart, it would know something. But no, it would just go with whatever was top recommended, or first in a search, or something like that: nothing using the brand or the most recent sizes we purchased.
There was no serious attempt to drive real commerce. Instead, Alexa became full of recommendation slots that PMs would battle over. "I set that timer for you. Do you want to try the Yoga skill?"
On the other hand, they have taken on messy problems and solved them well, just not with technology, and for no real financial gain. For example, if you ask for the score of the Tigers game, Alexa has to reconcile which "Tigers" you mean among every team that might have had a game of interest, at every level from worldwide to local, across every sport, factoring in your own geography. People worked behind the scenes to manage this manually, tracking teams of interest and filling intent slots daily.
How are you struggling with this, let alone finding it a significant barrier? Between improved model performance and the various grammar-based constraint systems, JSON adherence with a well-thought-out schema hasn't been a worry in a while.
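Even without a constraint library, a stdlib-only conformance check makes the point; the field names below are invented for illustration, not from anyone's actual schema:

```python
import json

# Minimal hand-rolled schema check as a stand-in for a real constraint
# system: required keys mapped to expected types.
SCHEMA = {"ticket_id": str, "severity": str, "summary": str}

def conforms(raw, schema):
    """Return the parsed object if it matches the schema, else None."""
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    for key, typ in schema.items():
        if key not in obj or not isinstance(obj[key], typ):
            return None
    return obj

good = '{"ticket_id": "T-1", "severity": "high", "summary": "login broken"}'
bad = 'Sure! Here is your JSON: {"ticket_id": 7}'
assert conforms(good, SCHEMA) is not None
assert conforms(bad, SCHEMA) is None  # intro text and wrong type both fail
```

With grammar-constrained decoding you get this guarantee at generation time instead of after the fact, but a cheap post-hoc check like this is often enough.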
> Another is preventing LLMs from adding intro or conclusion text.
Also trivial to work around with pre-filling and stop tokens, or just extremely basic text parsing.
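The "basic text parsing" route is a few lines; the chatty example string here is made up:

```python
def extract_json(text):
    """Basic text parsing: keep only the outermost {...} span."""
    return text[text.index("{"): text.rindex("}") + 1]

# The pre-fill variant: seed the assistant turn with "{" so the model starts
# mid-object, and set a stop sequence like "```" or "\n\n" so the trailing
# pleasantries never get generated at all.

chatty = 'Sure, here you go:\n{"ok": true}\nHope that helps!'
assert extract_json(chatty) == '{"ok": true}'
```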
Also would recommend writing out Stream-Triggered Augmented Generation since the term is so barely used it might as well be made up from the POV of someone trying to understand the comment
You work around it with post-processing and retries. But it’s still a bit brittle given how much stuff happens downstream without supervision.
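A sketch of that post-processing-plus-retries wrapper; `call_llm` is a hypothetical stand-in for whatever completion API is in play, and the simulated replies are invented:

```python
import json

def ask_json(call_llm, prompt, max_retries=3):
    """Retry wrapper: parse out JSON, re-prompt on failure."""
    last_err = None
    for attempt in range(max_retries):
        raw = call_llm(prompt if attempt == 0 else
                       prompt + "\nYour last reply was not valid JSON. Reply with JSON only.")
        try:
            # Post-processing: slice out the outermost {...} before parsing.
            return json.loads(raw[raw.index("{"): raw.rindex("}") + 1])
        except ValueError as err:  # covers both a missing brace and bad JSON
            last_err = err
    raise RuntimeError(f"no valid JSON after {max_retries} attempts: {last_err}")

# Simulated model that fails once, then complies:
replies = iter(["Sorry, rambling prose.", 'Here you go: {"region": "EU", "spike": 3}'])
result = ask_json(lambda p: next(replies), "summarize the anomaly as JSON")
assert result == {"region": "EU", "spike": 3}
```

The brittleness is real, though: a retry loop caps the failure rate but doesn't eliminate it, which matters when every downstream step runs unsupervised.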
- Generate targeted LLM micro-summaries of every record (ticket, call, etc.) continually
- Use layers of regex, semantic embeddings, and scoring enrichments to identify report rows (pivots on aggregates) worth attention, running on a schedule
- Proactively explain each report row by identifying what's unusual about it and LLM-summarizing a subset of the micro-summaries
- Push the result to a webhook
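A toy version of the middle "layers" step above; every name, pattern, and threshold here is invented for illustration, not the commenter's actual system:

```python
import re

def regex_layer(row):
    """Cheap first pass: flag rows whose summary mentions failure-ish terms."""
    return bool(re.search(r"\b(outage|refund|escalat\w*)\b", row["summary"], re.I))

def score_layer(row):
    """Stand-in for semantic-embedding scoring; here, a toy volume ratio."""
    return (row["count"] - row["baseline"]) / max(row["baseline"], 1)

def rows_worth_attention(rows, threshold=2.0):
    """Only rows passing every layer go on to the LLM explain step."""
    return [r for r in rows if regex_layer(r) and score_layer(r) >= threshold]

rows = [
    {"summary": "routine password resets", "count": 40, "baseline": 38},
    {"summary": "refund requests spiking", "count": 90, "baseline": 20},
]
flagged = rows_worth_attention(rows)
assert len(flagged) == 1  # only the refund row survives both layers
```

The point of the layering is cost control: regex and numeric scoring are nearly free, so the expensive LLM summarization only ever sees the handful of rows that survive.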
Lack of JSON schema restriction is a significant barrier to entry when hooking LLMs up to a multi-step process.
Another is preventing LLMs from adding intro or conclusion text.