Readit News
abdullin commented on Ask HN: What Are You Working On? (Nov 2025)    · Posted by u/david927
hattmall · a month ago
Ooooh, neat, I had a similar idea: an AI olympics that could be live-streamed, where they have to do several multi-step tasks.
abdullin · a month ago
Yep, exactly the same concept. Except not live-streaming, but giving out a lot of multi-step tasks that require reasoning and adaptation.

Here is a screenshot of a test task: https://www.linkedin.com/posts/abdullin_ddd-ai-sgr-here-is-h...

Although… since I record all interactions, I could replay them all as if they were streamed.

abdullin commented on Ask HN: What Are You Working On? (Nov 2025)    · Posted by u/david927
abdullin · a month ago
I’m working on a platform to run a friendly competition in “who builds the best reasoning AI Agent”.

Each participating team (got 300 signups so far) will get a set of text tasks and a set of simulated APIs to solve them.

For instance, the task (a typical chatbot task) could say something like: “Schedule a 30m knowledge exchange next week between the most experienced Python expert in the company and 3-5 people who are most interested in learning it.”

The AI agent will have to solve this by using a set of simulated APIs and playing a bit of calendar Tetris (in this case: Calendar API, Email API, SkillWill API).

Since API instances are simulated and isolated (per team per task), it becomes fairly easy to automatically check correctness of each solution and rank different agents in a global leaderboard.
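A rough sketch of how that automatic check could work. All names here (SimulatedCalendar, check_meeting_task) are made up for illustration; this is not the platform's actual code:

```python
# Hypothetical sketch: grading an agent run against an isolated simulated API.
# The task seeds a fresh per-team simulated state, the agent mutates it through
# "API" calls, and a deterministic checker asserts on the final state.

class SimulatedCalendar:
    """A per-team, per-task calendar instance the agent mutates via API calls."""

    def __init__(self):
        self.events = []

    def create_event(self, title, attendees, duration_min):
        self.events.append(
            {"title": title, "attendees": set(attendees), "duration_min": duration_min}
        )

def check_meeting_task(calendar, expert, learners):
    """Did the agent schedule a 30m session that includes the expert
    and 3-5 of the interested learners?"""
    for ev in calendar.events:
        invited = ev["attendees"] & set(learners)
        if (ev["duration_min"] == 30
                and expert in ev["attendees"]
                and 3 <= len(invited) <= 5):
            return True
    return False

# Simulated agent run:
cal = SimulatedCalendar()
cal.create_event("Python knowledge exchange", ["alice", "bob", "carol", "dave"], 30)
print(check_meeting_task(cal, "alice", ["bob", "carol", "dave", "erin"]))  # True
```

Because each simulated instance starts from a known seed, the same checker can rank every team's agent on identical ground.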

Code of agents stays external, but participants fill and submit brief questionnaires about their architectures.

By benchmarking different agentic implementations on the same tasks - we get to see patterns in performance, accuracy and costs of various architectures.

The platform's codebase is written mostly in golang (to support thousands of concurrent simulations). I’m using coding agents (Claude Code and Codex) for exploration and easy coding tasks, but the core still has to be handcrafted.

abdullin commented on Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?    · Posted by u/superasn
KaiserPro · 4 months ago
Same explanation but with less mysticism:

Inference is (mostly) stateless. So unlike training where you need to have memory coherence over something like 100k machines and somehow avoid the certainty of machine failure, you just need to route mostly small amounts of data to a bunch of big machines.

I don't know the specs of their inference machines, but where I worked, the machines research used were all 8-GPU monsters. So long as your model fitted in (combined) VRAM, the job was a goodun.

To scale, the secret ingredient was industrial amounts of cash. Sure we had DGXs (fun fact: Nvidia sent literal gold-plated DGX machines), but they weren't dense, and were very expensive.

Most large companies have robust RPC and orchestration, which means the hard part isn't routing the message, it's making the model fit in the boxes you have. (That's not my area of expertise though.)

abdullin · 4 months ago
> Inference is (mostly) stateless

Quite the opposite. Context caching requires state (the K/V cache) kept close to the GPU's VRAM. Streaming requires state. Constrained decoding (known as Structured Outputs) also requires state.

abdullin commented on Show HN: Conductor, a Mac app that lets you run a bunch of Claude Codes at once   conductor.build/... · Posted by u/Charlieholtz
abdullin · 5 months ago
Is it similar to what OpenAI Codex does with isolated environments per agent run?
abdullin commented on Ask HN: Any active COBOL devs here? What are you working on?    · Posted by u/_false
Cthulhu_ · 5 months ago
For sure; I'll believe that an AI can read and "understand" code, extract meaning and requirements from it, but it won't be the same as a human who knows the requirements.

Then again, a human won't know all requirements either; over time, requirements are literally encoded.

abdullin · 5 months ago
In systems like that you can record human interactions with the old version, replay against the new one and compare outcomes.

Is there a delta? Debug and add a unit test to capture the bug. Then fix and move to the next delta.
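A minimal sketch of that record/replay loop. The two functions and the recorded inputs are hypothetical, just to show how a replayed delta surfaces a boundary-condition bug:

```python
# Record/replay harness sketch: replay captured inputs against the old and new
# implementations and collect every delta; each delta becomes a unit test.

def legacy_discount(order_total):
    # Old system's behavior: 10% off strictly above 100.
    return order_total * 0.9 if order_total > 100 else order_total

def new_discount(order_total):
    # Migrated implementation: accidentally uses >= at the boundary.
    return round(order_total * 0.9, 2) if order_total >= 100 else order_total

recorded_inputs = [50, 100, 100.5, 250]  # captured from real traffic

deltas = [
    (x, legacy_discount(x), new_discount(x))
    for x in recorded_inputs
    if legacy_discount(x) != new_discount(x)
]
for inp, old, new in deltas:
    print(f"input={inp}: legacy={old} new={new}")
```

Here only the input 100 diverges, pinpointing the `>` vs `>=` mistake; capture it as a test, fix it, and move to the next delta.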

abdullin commented on Ask HN: Any active COBOL devs here? What are you working on?    · Posted by u/_false
mtmail · 5 months ago
Met one close to retirement who worked on an ERP system in the food-processing industry. Nightly batch jobs would trigger orders from their suppliers; customer service would enter new orders. Two SAP migrations had already failed, costing the company millions. All company process knowledge was in code, database fields had been repurposed (but not renamed, too much work), and feature development stopped a long time ago. In parallel a new system was being built in-house (no longer trusting external consultants), and his job was explaining what the system does. Probably well paid, but he didn't seem to care; he just wanted to work less and retire on good terms.
abdullin · 5 months ago
I grew to like migration projects like that.

Currently working on the migration of a 30-year-old ERP, written in Progress and without tests, to Kotlin+PostgreSQL.

AI agents don’t care which code they read or convert into tests. They just need an automated feedback loop and some human oversight.

abdullin commented on Andrej Karpathy: Software in the era of AI [video]   youtube.com/watch?v=LCEmi... · Posted by u/sandslash
bayindirh · 6 months ago
What you are describing is another application. My comment was squarely aimed at "vibe coding".

Protecting and preserving dying languages and culture is a great application for natural language processing.

For the record, I'm neither against LLMs nor AI. What I'm primarily against is how LLMs are trained and how they use the internet via their agents, without giving any citations, stripping this information left and right and crying "fair use!" in the process.

Also, Go and Python are nice languages (which I use), but there are other nice ways to build agents that also allow them to migrate, communicate and work in other cooperative or competitive ways.

So, AI is nice, LLMs are cool, but hyping something to earn money, deskill people, and point to something ethically questionable and technically inferior as the only silver bullet is not.

IOW: we should handle this thing way more carefully and stop ripping off people's work in the name of "fair use" without consent. This is nuts.

Disclosure: I'm a HPC sysadmin sitting on top of a datacenter which runs some AI workloads, too.

abdullin · 6 months ago
I think there are two different layers that get frequently mixed.

(1) LLMs as models - just the weights and an inference engine. These are just tools, like hammers. There is a wide variety of models, from transparent-but-useless IBM Granite models, to open-weights Llama/Qwen, to proprietary ones.

(2) AI products that are built on top of LLMs (agents, RAG, search, reasoning etc). This is how people decide to use LLMs.

How these products display results - with or without citations, with or without attribution - is determined by the product design.

It takes more effort to design a system that properly attributes all bits of information to the sources, but it is doable. As long as product teams are willing to invest that effort.
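A tiny sketch of what attribution-by-design can mean in practice: keep source metadata attached to every retrieved chunk so the final answer can cite it. The structure and names here are illustrative, not any specific product's code:

```python
# Sketch of a citation-preserving retrieval pipeline: source URLs travel with
# the text from retrieval all the way into the rendered answer.

from dataclasses import dataclass

@dataclass
class Chunk:
    text: str
    source_url: str

def retrieve(query, corpus):
    # Stand-in retriever: naive keyword match instead of real vector search.
    return [c for c in corpus if query.lower() in c.text.lower()]

def answer_with_citations(query, corpus):
    chunks = retrieve(query, corpus)
    answer = " ".join(c.text for c in chunks)
    citations = sorted({c.source_url for c in chunks})
    return f"{answer}\n\nSources: {', '.join(citations)}"

corpus = [
    Chunk("LLMs can cite sources when metadata is preserved.", "https://example.com/a"),
    Chunk("Attribution is a product-design choice.", "https://example.com/b"),
]
print(answer_with_citations("attribution", corpus))
```

The point is that dropping the `source_url` field anywhere in the pipeline is a design decision, not a limitation of the model.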

abdullin commented on Andrej Karpathy: Software in the era of AI [video]   youtube.com/watch?v=LCEmi... · Posted by u/sandslash
diggan · 6 months ago
> distilled from experiences reported by multiple companies

Distilled from my experience, I'd still say that the UX is lacking, as sequential chat just isn't the right format. I agree with Karpathy that we haven't found the right way of interacting with these OSes yet.

Even with what you say, variations were implemented in a rush. Once you've iterated on one variation, you cannot iterate on another variant at the same time, for example.

abdullin · 6 months ago
Yes. I believe the experience will get better. Plus more AI vendors will catch up with OpenAI and offer similar experiences in their products.

It will just take a few months.

abdullin commented on Andrej Karpathy: Software in the era of AI [video]   youtube.com/watch?v=LCEmi... · Posted by u/sandslash
bayindirh · 6 months ago
So you can create buggier code remixed from bits scraped from the internet, which you don't understand but which somehow works, rather than creating higher-quality, tighter code that takes the same amount of time to type? All the while offloading the work to something else so your skills can atrophy at the same time?

Sounds like progress to me.

abdullin · 6 months ago
Here is another way to look at the problem.

There is a team of 5 people that are passionate about their indigenous language and want to preserve it from disappearing. They are using AI+Coding tools to:

(1) Process and prepare a ton of datasets for training custom text-to-speech, speech-to-text and wake-word models (because foundational models don't know this language), along with the pipelines and tooling for the contributors.

(2) Design and develop an embedded device (running an ESP32-S3) to act as a smart speaker on the edge.

(3) Design and develop a backend in golang to orchestrate hundreds of these speakers.

(4) Build a whole bunch of Python agents (essentially glorified RAGs over folklore and stories).

(5) Build a set of websites for teachers to create course content and exercises, making them available to these edge devices.

All that, just so that kids in a few hundred kindergartens and schools can practice their own native language, listen to fairy tales and songs, or ask questions.

This project was acknowledged by the UN (AI for Good programme). They are now extending their help to more disappearing languages.

None of that was possible before. This sounds like good progress to me.

Edit: added newlines.

u/abdullin

Karma: 144 · Cake day: November 11, 2013
About
Technical advisor. I help teams ship ML-driven products faster.

Blog: https://abdullin.com LinkedIn: https://linkedin.com/in/abdullin/
