markab21 commented on Orchestrate teams of Claude Code sessions   code.claude.com/docs/en/a... · Posted by u/davidbarker
IhateAI · 3 days ago
Any self-respecting engineer should recognize that these tools and models only serve to lower the value of your labor. They aren't there to empower you, and they aren't going to enable you to join the ruling class with some vibe-rolled slop SaaS.

Using these things will fry your brain's ability to think through hard solutions. It will give you a disease we haven't even named yet. Your brain will atrophy. Do you want your competency to be correlated 1:1 to the quality and quantity of tokens you can afford (or be loaned!!)?

Their main purpose is to convince C-suite suits that they don't need you, or that they're justified in paying you less. This will of course backfire on them, but in the meantime, why give them the training data, why give them the revenue??

I'd bet anything these new models / agentic tools are designed to optimize for token consumption. They need the revenue BADLY. These companies are valued at 200x revenue; Google IPO'd at 10-11x, lmfao. Wtf are we even doing? Can't wait to watch it crash and burn :) Soon!

markab21 · 3 days ago
Shaking fist at clouds!!
markab21 commented on Qwen3-Coder-Next   qwen.ai/blog?id=qwen3-cod... · Posted by u/danielhanchen
vessenes · 5 days ago
3B active parameters, and slightly worse than GLM 4.7 on benchmarks. That's pretty amazing! With better orchestration tools being deployed, I've been wondering if faster, dumber coding agents paired with wise orchestrators might be overall faster than using, say, Opus 4.5 at the bottom for coding. At the least, we might want to hand simple tasks off to these guys.
markab21 · 5 days ago
It's getting a lot easier to do this using sub-agents with tools in Claude. I have a fleet of Mastra agents (TypeScript). I use those agents inside my project as CLI tools to do repetitive, token-gobbling tasks such as scanning code, web search, library search, and even Sourcegraph traversal.

Overall, it's allowed me to maintain more consistent workflows as I'm less dependent on Opus. Now that Mastra has introduced the concept of Workspaces, which allow for more agentic development, this approach has become even more powerful.
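
Roughly what one of those CLI tools looks like, for the curious (a stripped-down sketch rather than my actual code; the Mastra Agent/generate shapes are from memory of their docs, and the scan task and file names are illustrative):

    // scan-code.ts -- sketch of a small agent wrapped as a CLI tool that a
    // top-level coding agent (Claude, Codex, etc.) can shell out to.
    // Assumes an ESM project run with something like `npx tsx scan-code.ts <file>`.
    import { readFile } from "node:fs/promises";
    import { Agent } from "@mastra/core/agent";
    import { openai } from "@ai-sdk/openai";

    // A cheap model does the token-heavy reading so the orchestrator doesn't have to.
    const scanner = new Agent({
      name: "code-scanner",
      instructions:
        "Summarize a source file: exported symbols, side effects, open TODOs.",
      model: openai("gpt-4o-mini"),
    });

    const path = process.argv[2];
    if (!path) {
      console.error("usage: tsx scan-code.ts <file>");
      process.exit(1);
    }

    const source = await readFile(path, "utf8");
    const result = await scanner.generate(
      `Scan this file and return a terse summary:\n\n${source}`,
    );

    // The orchestrating session only ever reads this short summary, not the file.
    console.log(result.text);

The orchestrating session shells out to tools like this instead of reading everything itself, which is where the token savings come from.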

markab21 commented on TimeCapsuleLLM: LLM trained only on data from 1800-1875   github.com/haykgrigo3/Tim... · Posted by u/admp
ben_w · a month ago
> Could this be an experiment to show how likely LLMs are to lead to AGI, or at least intelligence well beyond our current level?

You'd have to be specific about what you mean by AGI: all three letters mean different things to different people, and sometimes the use of the whole means something not present in the letters.

> If you could only give it texts and info and concepts up to Year X, well before Discovery Y, could we then see if it could prompt its way to that discovery?

To a limited degree.

Some developments can come from combining existing ideas and seeing what they imply.

Other things, like everything to do with relativity and quantum mechanics, would have required experiments. I don't think any of the relevant experiments had been done prior to this cut-off date, but I'm not absolutely sure of that.

You might be able to get such an LLM to develop all the maths and geometry for general relativity, and yet find the AI still tells you that the perihelion shift of Mercury is a sign of the planet Vulcan rather than of a curved spacetime: https://en.wikipedia.org/wiki/Vulcan_(hypothetical_planet)

markab21 · a month ago
Basically looking for emergent behavior.
markab21 commented on Show HN: Mysti – Claude, Codex, and Gemini debate your code, then synthesize   github.com/DeepMyst/Mysti... · Posted by u/bahaAbunojaim
bahaAbunojaim · a month ago
Executing multiple agents on the same model also works.

I find it helpful to even change the persona of the same agent (the prompt) or the model the agent is using. These variations always help, but I've found that having multiple different agents with different LLMs in the backend works better.

markab21 · a month ago
I love where you're going with this. In my experience it's not about a different persona so much as about what context the model is considering: different context triggers different activations, which yields a different outcome. You can achieve the same thing, of course, by switching to an agent with a separate persona, but you can also get it simply by injecting new context or forcing the agent to consider something new. I feel like this concept gets cargo-culted a little bit.

I personally have moved to a pattern where I use Mastra agents in my project to achieve this. I've slowly shifted the bulk of the code research and web research to my internal tools (built with small TypeScript agents). I can now easily bounce between different tools such as Claude, Codex, and OpenCode, and my coding tools spend more time orchestrating work than doing the work themselves.
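
A bare-bones version of the context-injection-plus-synthesis idea, to make it concrete (illustrative only; the model, prompts, and file name are placeholders, not anyone's actual setup):

    // debate.ts -- same model, different injected context, then a synthesis pass.
    import { generateText } from "ai";
    import { openai } from "@ai-sdk/openai";

    const model = openai("gpt-4o-mini");
    const question = "Should this module cache results in memory or in Redis?";

    // Two passes over the same model, each forced to consider different context.
    const angles = [
      "Focus on failure modes and operational complexity.",
      "Focus on latency and the read/write ratio of the workload.",
    ];

    const drafts = await Promise.all(
      angles.map((angle) =>
        generateText({
          model,
          system: angle, // the injected context, rather than a whole new "persona"
          prompt: question,
        }),
      ),
    );

    // A final pass synthesizes the drafts instead of just picking a winner.
    const { text } = await generateText({
      model,
      prompt:
        "Synthesize these answers into one recommendation:\n\n" +
        drafts.map((d, i) => `--- Draft ${i + 1} ---\n${d.text}`).join("\n\n"),
    });

    console.log(text);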

markab21 commented on Apps SDK   developers.openai.com/app... · Posted by u/alvis
markab21 · 4 months ago
The skepticism is understandable given the trajectory of GPTs and custom instructions, but there's a meaningful technical difference here: the Apps SDK is built on the Model Context Protocol (MCP), which is an open specification rather than a proprietary format.

MCP standardizes how LLM clients connect to external tools—defining wire formats, authentication flows, and metadata schemas. This means apps you build aren't inherently ChatGPT-specific; they're MCP servers that could work with any MCP-compatible client. The protocol is transport-agnostic and self-describing, with official Python and TypeScript SDKs already available.
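
To make that concrete, a hello-world MCP server in TypeScript is only a few lines (a sketch based on the official SDK's published examples; exact method names may have drifted between versions):

    // A minimal MCP server exposing one tool over stdio.
    import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
    import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
    import { z } from "zod";

    const server = new McpServer({ name: "demo-server", version: "1.0.0" });

    // Tool discovery and input schemas travel over the protocol itself,
    // so any MCP-compatible client can find and call this, not just ChatGPT.
    server.tool(
      "add",
      { a: z.number(), b: z.number() },
      async ({ a, b }) => ({
        content: [{ type: "text", text: String(a + b) }],
      }),
    );

    // Transport-agnostic: stdio here, but HTTP-based transports exist as well.
    await server.connect(new StdioServerTransport());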

That said, the "build our platform" criticism isn't entirely off base. While the protocol is open, practical adoption still depends heavily on ChatGPT's distribution and whether other LLM providers actually implement MCP clients. The real test will be whether this becomes a genuine cross-platform standard or just another way to contribute to OpenAI's ecosystem.

The technical primitives (tool discovery, structured content return, embedded UI resources) are solid and address real integration problems. Whether it succeeds likely depends more on ecosystem dynamics than technical merit.

markab21 commented on Mistral NeMo   mistral.ai/news/mistral-n... · Posted by u/bcatanzaro
pantulis · 2 years ago
Does it have any relation to Nvidia's Nemo? Otherwise, it's unfortunate naming
markab21 · 2 years ago
It looks like it was built jointly with Nvidia: https://huggingface.co/nvidia/Mistral-NeMo-12B-Instruct
markab21 commented on How Mandelbrot set images are affected by floating point precision   github.com/ProfJski/Float... · Posted by u/todsacerdoti
captaincrowbar · 2 years ago
Some people, when they have a problem, think, “I know, I’ll use floating point.” Now they have 1.9999998 problems.
markab21 · 2 years ago
I bet you have been waiting years to pull that one out of your pocket.

Well played sir! Nice shot man! :D

markab21 commented on Memory and new controls for ChatGPT   openai.com/blog/memory-an... · Posted by u/Josely
markab21 · 2 years ago
I've found myself more and more using local models rather than ChatGPT; it was pretty trivial to set up Ollama+Ollama-WebUI, which is shockingly good.

I'm so tired of arguing with ChatGPT (or what was Bard) to even get simple things done. SOLAR-10B or Mistral works just fine for my use cases, and I've wired up a direct connection to Fireworks/OpenRouter/Together for the occasions when I need anything more than what will run on my local hardware (Mixtral MoE, 70B code/chat models).
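
Wiring code to the local server is similarly trivial; Ollama exposes a small HTTP API on localhost (sketch below; the model name is whatever you've pulled, and it assumes the default port and Node 18+ for global fetch):

    // ask-local.ts -- one call against the local Ollama HTTP API.
    const res = await fetch("http://localhost:11434/api/chat", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "mistral",   // or solar, a Mixtral quant, a 70B, etc.
        stream: false,      // one JSON response instead of a token stream
        messages: [{ role: "user", content: "Explain RAII in two sentences." }],
      }),
    });

    const data = await res.json();
    console.log(data.message.content);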

markab21 commented on Mistral CEO confirms 'leak' of new open source AI model nearing GPT4 performance   venturebeat.com/ai/mistra... · Posted by u/pg_1234
Nick87633 · 2 years ago
Where do you like to keep up to date on these? Arxiv preprints, or some other place?
markab21 · 2 years ago
For Llama-based progress, Reddit's /r/LocalLlama has been my top source of info, although it's been getting a little noisier lately.

I also hang out on a few Discord servers:
- Nous Research
- TogetherAI / Fireworks / OpenRouter
- LangChain
- TheBloke AI
- Mistral AI

These, along with a couple of newsletters, basically keep a pulse on things.
