1. Balance / schedule incoming requests to the right backend
2. Model server replicas that can run on multiple hardware topologies
3. Prefix caching hierarchy with well-tested variants for different use cases
So it's a 3-tier architecture. The biggest difference from Dynamo is that llm-d uses the inference gateway extension - https://github.com/kubernetes-sigs/gateway-api-inference-ext... - which brings Kubernetes-owned APIs for managing model routing, request priority and flow control, LoRA support, etc.
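To make tier 1 concrete, here's a minimal sketch of prefix-cache-aware scheduling: route each request to the replica whose KV cache shares the longest token prefix with it. This is illustrative only - the function names and the dict-of-prefixes cache model are my own assumptions, not llm-d's or the gateway extension's actual API.

```python
# Hypothetical sketch of prefix-cache-aware routing. Each replica
# advertises the token prefixes it has cached; we score replicas by
# the longest shared prefix with the incoming request.

def shared_prefix_len(a, b):
    """Length of the common prefix of two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def pick_replica(request_tokens, replica_caches):
    """replica_caches: dict mapping replica name -> list of cached token prefixes.
    Returns the replica with the best cache overlap (ties go to the first seen)."""
    best, best_score = None, -1
    for replica, prefixes in replica_caches.items():
        score = max(
            (shared_prefix_len(request_tokens, p) for p in prefixes),
            default=0,  # replica with an empty cache scores 0
        )
        if score > best_score:
            best, best_score = replica, score
    return best
```

A real scheduler would also weigh load, queue depth, and request priority; this only shows the cache-affinity signal that makes a prefix-caching tier pay off.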
* The "stack-centric" approach such as vLLM production stack, AIBrix, etc. These set up an entire inference stack for you including KV cache, routing, etc.
* The "pipeline-centric" approach such as NVIDIA Dynamo, Ray, BentoML. These give you more of an SDK so you can define inference pipelines that you can then deploy on your specific hardware.
It seems like llm-d is the former. Is that right? What prompted you to go down that direction instead of Dynamo's?
- They used QwQ to generate training data (with some cleanup using GPT-4o-mini)
- The training data was then used to fine-tune Qwen2.5-32B-Instruct (a non-reasoning model)
- The result: Sky-T1 performs slightly worse than QwQ but much better than the base Qwen2.5 on reasoning tasks
There are a few dismissive comments here, but I actually think this is pretty interesting: it shows how you can fine-tune a foundation model to get better at reasoning.
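The distillation data step above can be sketched in a few lines: collect teacher (QwQ) reasoning traces, filter/clean them (Sky-T1 used GPT-4o-mini for the cleanup; a trivial length check stands in here), and emit chat-format examples for supervised fine-tuning of the student. All function names and thresholds are my own illustrations, not the Sky-T1 code.

```python
# Hedged sketch of building an SFT dataset from teacher reasoning traces.
# A "trace" is assumed to be {"problem": str, "response": str}.

def is_clean(trace, max_chars=8192):
    """Stand-in cleanup filter: drop empty or overlong responses.
    (Sky-T1's actual cleanup used GPT-4o-mini to rewrite/reject traces.)"""
    resp = trace.get("response", "")
    return 0 < len(resp) <= max_chars

def to_sft_example(trace):
    """Convert one teacher trace into a chat-format training example."""
    return {"messages": [
        {"role": "user", "content": trace["problem"]},
        {"role": "assistant", "content": trace["response"]},
    ]}

def build_sft_dataset(traces):
    """Filter teacher traces and format them for fine-tuning the student."""
    return [to_sft_example(t) for t in traces if is_clean(t)]
```

The resulting list of `messages` dicts is the common chat format most SFT trainers accept; the student (Qwen2.5-32B-Instruct here) is then fine-tuned on it as ordinary supervised data.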
We're a well-funded, pre-seed cybersecurity startup focused on data security. I'm looking for a founding AI lead with experience in fine-tuning LLMs (expertise in RL and reasoning models is a big plus). This person would own the full AI stack, from data to training to eval to test-time compute.
Who's a good fit:
* You've always thought about starting a company but, for whatever reason (funding, life, idea), haven't. This is a great opportunity to be part of the founding team. We're 2 people right now.
* You enjoy understanding customer problems and their use cases, and then figuring out the best solution (sometimes technical, sometimes not) to their problems.
* You want to help figure out what a company looks like in this AI era.
* You enjoy teaching and sharing knowledge.
Questions or interest? Just email jobs@polarsky.ai.