rdli commented on Ask HN: Who is hiring? (September 2025)    · Posted by u/whoishiring
rdli · 5 hours ago
Polar Sky | Founding AI Lead | Bay Area/Seattle | Hybrid/Onsite | Full-time

We're a well-funded, pre-seed cybersecurity startup focused on data security. I'm looking for a founding AI lead with experience in fine-tuning LLMs (expertise around RL + reasoning models a big plus). This person would own the full AI stack from data to training to eval to test-time compute.

Who's a good fit:

* If you've always thought about starting a company but haven't for whatever reason (funding, life, the idea), this is a great opportunity to be part of the founding team. We're 2 people right now.

* You enjoy understanding customer problems and use cases, and then figuring out the best solution (sometimes technical, sometimes not).

* You want to help figure out what a company looks like in this AI era.

* You enjoy teaching and sharing knowledge.

Questions or interest? Just email jobs@polarsky.ai.

rdli commented on OpenAI to buy AI startup from Jony Ive   bloomberg.com/news/articl... · Posted by u/minimaxir
rdli · 3 months ago
Seems that OpenAI is acquiring Io for $6.4B in an all-equity deal.
rdli commented on LLM-D: Kubernetes-Native Distributed Inference   llm-d.ai/blog/llm-d-annou... · Posted by u/smarterclayton
smarterclayton · 3 months ago
llm-d is intended to be three clean layers:

1. Balance / schedule incoming requests to the right backend

2. Model server replicas that can run on multiple hardware topologies

3. Prefix caching hierarchy with well-tested variants for different use cases

So it's a 3-tier architecture. The biggest difference with Dynamo is that llm-d is using the inference gateway extension - https://github.com/kubernetes-sigs/gateway-api-inference-ext... - which brings Kubernetes-owned APIs for managing model routing, request priority and flow control, LoRA support, etc.
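To make layer 1 concrete, here is a minimal sketch of what "balance/schedule incoming requests to the right backend" can mean when the scheduler is prefix-cache aware. This is an illustration only, not llm-d's actual routing code: the backend dicts, field names, and scoring rule are all assumptions.

```python
# Hypothetical sketch of prefix-cache-aware scheduling (layer 1 above):
# score each model-server replica by how much of the prompt's token
# prefix it already holds in KV cache, then break ties by queue depth.

def longest_cached_prefix(prompt_tokens, cached_prefixes):
    """Length of the longest cached token prefix matching this prompt."""
    best = 0
    for prefix in cached_prefixes:
        n = len(prefix)
        if prompt_tokens[:n] == prefix:
            best = max(best, n)
    return best

def pick_backend(prompt_tokens, backends):
    """backends: list of dicts with 'name', 'cached_prefixes', 'queue_depth'."""
    def score(b):
        hit = longest_cached_prefix(prompt_tokens, b["cached_prefixes"])
        # Prefer the biggest cache hit; among equal hits, the idlest replica.
        return (hit, -b["queue_depth"])
    return max(backends, key=score)["name"]

backends = [
    {"name": "replica-a", "cached_prefixes": [[1, 2, 3]], "queue_depth": 1},
    {"name": "replica-b", "cached_prefixes": [[1, 2, 3, 4, 5]], "queue_depth": 4},
]
print(pick_backend([1, 2, 3, 4, 5, 6], backends))  # replica-b
```

The point of the sketch: routing on KV-cache locality (layer 3 feeding layer 1) is what distinguishes an inference gateway from a generic L7 load balancer.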

rdli · 3 months ago
I would think that the NVIDIA Dynamo SDK (pipelines) is a big difference as well (https://github.com/ai-dynamo/dynamo/tree/main/deploy/sdk/doc...), or am I missing something?
rdli commented on LLM-D: Kubernetes-Native Distributed Inference   llm-d.ai/blog/llm-d-annou... · Posted by u/smarterclayton
qntty · 3 months ago
It sounds like you might be confusing different parts of the stack. NVIDIA Dynamo, for example, supports vLLM as the inference engine. I think you should think of something like vLLM as more akin to Gunicorn, llm-d as an application load balancer, and something like NVIDIA Dynamo as Django.
rdli · 3 months ago
In this analogy, Dynamo is most definitely not like Django. It includes inference-aware routing, KV caching, etc. -- all the stuff you would need to run a modern SOTA inference stack.
rdli commented on LLM-D: Kubernetes-Native Distributed Inference   llm-d.ai/blog/llm-d-annou... · Posted by u/smarterclayton
rdli · 3 months ago
This is really interesting. For SOTA inference systems, I've seen two general approaches:

* The "stack-centric" approach, such as the vLLM production stack, AIBrix, etc. These set up an entire inference stack for you, including KV cache, routing, etc.

* The "pipeline-centric" approach, such as NVIDIA Dynamo, Ray, and BentoML. These give you more of an SDK so you can define inference pipelines that you can then deploy on your specific hardware.

It seems like llm-d is the former. Is that right? What prompted you to go in that direction, instead of the direction of Dynamo?
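The "pipeline-centric" style described above can be sketched as an SDK where you compose named stages and the framework decides where each stage runs. This is a toy illustration of the shape of such an SDK; the `Pipeline` class and decorator are hypothetical and not the actual Dynamo, Ray, or BentoML APIs.

```python
# Toy sketch of a pipeline-centric inference SDK: stages are registered
# declaratively and executed in order. Real frameworks would schedule
# each stage onto specific hardware (e.g. prefill vs decode replicas).

class Pipeline:
    def __init__(self):
        self.stages = []

    def stage(self, fn):
        """Register a stage; the decorator keeps definitions declarative."""
        self.stages.append(fn)
        return fn

    def run(self, request):
        for fn in self.stages:
            request = fn(request)
        return request

pipe = Pipeline()

@pipe.stage
def tokenize(req):
    req["tokens"] = req["prompt"].split()
    return req

@pipe.stage
def generate(req):
    # Stand-in for the actual model call.
    req["output"] = " ".join(reversed(req["tokens"]))
    return req

print(pipe.run({"prompt": "hello world"})["output"])  # world hello
```

The contrast with the stack-centric approach: here the user owns the pipeline topology, whereas a stack-centric system ships the topology (routing, KV cache, replicas) pre-wired.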

rdli commented on Train Your Own O1 Preview Model Within $450   sky.cs.berkeley.edu/proje... · Posted by u/9woc
rdli · 6 months ago
The blog post was a little unclear, so here's my summary:

- They used QwQ to generate training data (with some cleanup using GPT-4o-mini)

- The training data was then used to FT Qwen2.5-32B-Instruct (non-reasoning model)

- Result was that Sky-T1 performs slightly worse than QwQ but much better than Qwen2.5 on reasoning tasks

There are a few dismissive comments here, but I actually think this is pretty interesting, as it shows how you can fine-tune a foundation model to do better at reasoning.

u/rdli

Karma: 1120 · Cake day: May 21, 2015
About
@rdli@mastodon.social @rdli.bsky.social

https://www.thelis.org
