elyase (u/elyase) - Readit News

elyase commented on Coding with LLMs in the summer of 2025 – an update antirez.com/news/154... · Posted by u/antirez

vl · 7 months ago

It's ultrathink, one word, not ultra-think (see below).

I use Claude Code with Opus and had the same experience: I was pushing it hard to implement a complex test, and it gave me an empty test function with the test plan inside a comment (lol).

I do want to try Gemini 2.5 Pro, but I don't know of a tool that would make the experience comparable to Claude Code. Would it make sense to use it with Cursor? Do they try to limit context?

  ~/.nvm/versions/node/v22.16.0/lib/node_modules/@anthropic-ai/claude-code $ npx prettier cli.js | ack ultrathink -C 20
  var jw1 = { HIGHEST: 31999, MIDDLE: 1e4, BASIC: 4000, NONE: 0 },
  Yk6 = {
    english: {
      HIGHEST: [
        { pattern: "think harder", needsWordBoundary: !0 },
        { pattern: "think intensely", needsWordBoundary: !0 },
        { pattern: "think longer", needsWordBoundary: !0 },
        { pattern: "think really hard", needsWordBoundary: !0 },
        { pattern: "think super hard", needsWordBoundary: !0 },
        { pattern: "think very hard", needsWordBoundary: !0 },
        { pattern: "ultrathink", needsWordBoundary: !0 },
      ],
      MIDDLE: [
        { pattern: "think about it", needsWordBoundary: !0 },
        { pattern: "think a lot", needsWordBoundary: !0 },
        { pattern: "think deeply", needsWordBoundary: !0 },
        { pattern: "think hard", needsWordBoundary: !0 },
        { pattern: "think more", needsWordBoundary: !0 },
        { pattern: "megathink", needsWordBoundary: !0 },
      ],
      BASIC: [{ pattern: "think", needsWordBoundary: !0 }],
      NONE: [],
    },
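
The excerpt shows only the lookup tables, not the code that consumes them. Below is a minimal sketch of how such a table might be turned into a thinking budget. It assumes the levels are checked from strongest to weakest and that matching is case-insensitive; neither is confirmed by the excerpt. The helpers `escapeRegExp`, `matches`, and `thinkingBudget` are hypothetical names, and only a subset of the patterns is copied.

```javascript
// Token budgets and a subset of the patterns, copied from the excerpt above.
const BUDGETS = { HIGHEST: 31999, MIDDLE: 1e4, BASIC: 4000, NONE: 0 };
const PATTERNS = {
  HIGHEST: [
    { pattern: "think harder", needsWordBoundary: true },
    { pattern: "ultrathink", needsWordBoundary: true },
  ],
  MIDDLE: [
    { pattern: "think hard", needsWordBoundary: true },
    { pattern: "megathink", needsWordBoundary: true },
  ],
  BASIC: [{ pattern: "think", needsWordBoundary: true }],
};

// Hypothetical helper: escape regex metacharacters in a literal pattern.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: does the prompt contain the pattern?
// Case-insensitive matching is an assumption, not something the excerpt shows.
function matches(prompt, { pattern, needsWordBoundary }) {
  const body = escapeRegExp(pattern);
  const source = needsWordBoundary ? `\\b${body}\\b` : body;
  return new RegExp(source, "i").test(prompt);
}

// Hypothetical lookup: check the strongest level first (assumed ordering),
// so "think harder" is not downgraded by the shorter "think hard" or "think".
function thinkingBudget(prompt) {
  for (const level of ["HIGHEST", "MIDDLE", "BASIC"]) {
    if (PATTERNS[level].some((entry) => matches(prompt, entry))) {
      return BUDGETS[level];
    }
  }
  return BUDGETS.NONE;
}

console.log(thinkingBudget("ultrathink about this test"));
console.log(thinkingBudget("ultra-think about this test"));
```

Under these assumptions, the hyphenated "ultra-think" no longer matches the `ultrathink` pattern and falls through to the plain `think` entry, which is why the spelling matters.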

elyase · 7 months ago

https://github.com/sst/opencode

elyase commented on Tools: Code Is All You Need lucumr.pocoo.org/2025/7/3... · Posted by u/Bogdanp

elyase · 7 months ago

This is similar to the tool call (fixed code & dynamic params) vs code generation (dynamic code & dynamic params) discussion: tools offer constraints and save tokens, code gives you flexibility. Some papers [1][2][3] suggest that generating code is often superior, and this will likely become even more true as language models improve.

[1] https://huggingface.co/papers/2402.01030

[2] https://huggingface.co/papers/2401.00812

[3] https://huggingface.co/papers/2411.01747
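
The tool call versus code generation distinction above can be sketched as follows. This is an illustration only: the tool registry, `runToolCall`, `runGeneratedCode`, and the two stand-in model outputs are all hypothetical, and `new Function` is used purely for brevity. It is not a sandbox and should not be used on untrusted output.

```javascript
// Tool call: the code is fixed ahead of time; the model only supplies
// the tool name and its parameters.
const TOOLS = new Map([
  ["add", ({ a, b }) => a + b],
  ["multiply", ({ a, b }) => a * b],
]);

function runToolCall(call) {
  const tool = TOOLS.get(call.name);
  if (!tool) throw new Error(`unknown tool: ${call.name}`);
  return tool(call.args);
}

// Code generation: the model supplies the code itself, so it can compose
// operations that no predefined tool covers.
// NOTE: new Function is NOT a sandbox; it is used here only for illustration.
function runGeneratedCode(source, inputs) {
  const fn = new Function("inputs", source);
  return fn(inputs);
}

// Hypothetical stand-ins for what a model might emit in each mode.
const toolCall = { name: "add", args: { a: 2, b: 3 } };
const generatedSource =
  "return inputs.values.filter((v) => v % 2 === 0).reduce((s, v) => s + v * v, 0);";

console.log(runToolCall(toolCall));
console.log(runGeneratedCode(generatedSource, { values: [1, 2, 3, 4] }));
```

The tool call is short and constrained to what the registry allows, while the generated snippet expresses a filter-and-sum that would otherwise need several tool round trips.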

I am working on a model that goes a step further and makes even the distinction between thinking and code execution unnecessary (it is all computation in the end). Unfortunately, there is no link to share yet.

elyase commented on Is gravity just entropy rising? Long-shot idea gets another look quantamagazine.org/is-gra... · Posted by u/pseudolus

pif · 8 months ago

As an experimental physicist, I refuse to get excited about a new theory until the proponent gets to an observable phenomenon that can settle the question.

elyase · 8 months ago

Between two models, the one with the shorter Minimum Description Length (MDL) is more likely to generalize better.
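
As a toy illustration of that comparison (not taken from the thread), the sketch below scores two models of a coin-flip sequence with a two-part code: bits to describe the model plus bits to describe the data given the model. The parameter cost of 0.5 * log2(n) bits is the usual asymptotic approximation and is an assumption here; all function names are hypothetical.

```javascript
// x * log2(x), using the convention 0 * log2(0) = 0.
function xlog2x(x) {
  return x === 0 ? 0 : x * Math.log2(x);
}

// Model A: a fair coin. No parameters to encode; each flip costs 1 bit.
function fairCoinBits(n) {
  return n;
}

// Model B: a Bernoulli coin with a fitted bias p = heads / n.
// Data cost is n * H(p) bits; the fitted parameter is charged
// 0.5 * log2(n) bits (assumed asymptotic approximation).
function bernoulliBits(n, heads) {
  const p = heads / n;
  const dataBits = -n * (xlog2x(p) + xlog2x(1 - p));
  const parameterBits = 0.5 * Math.log2(n);
  return dataBits + parameterBits;
}

// Prefer whichever model gives the shorter total description.
function preferredModel(n, heads) {
  return bernoulliBits(n, heads) < fairCoinBits(n) ? "bernoulli" : "fair coin";
}

console.log(preferredModel(100, 90));
console.log(preferredModel(100, 52));
```

With 90 heads in 100 flips the extra parameter pays for itself; with 52 heads it does not, so the simpler fair-coin model gives the shorter description.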

elyase commented on A Research Preview of Codex openai.com/index/introduc... · Posted by u/meetpateltech

prhn · 9 months ago

Is anyone using any of these tools to write non-boilerplate code?

I'm very interested.

In my experience ChatGPT and Gemini are absolutely terrible at these types of things. They are constantly wrong. I know I'm not saying anything new, but I'm waiting to personally experience an LLM that does something useful with any of the code I give it.

These tools aren't useless. They're great as search engines and at pointing me in the right direction. They write dumb bash scripts that save me time here and there. That's it.

And it's hilarious to me how these people present these tools. It generates a bunch of code, and then you spend all your time auditing and fixing what is expected to be wrong.

That's not the type of code I'm putting in my company's code base, and I could probably write the damn code more correctly in less time than it takes to review for expected errors.

What am I missing?

elyase · 9 months ago

What you're missing is how to use the tools properly. With solid documentation, good project management practices, a well-organized code structure and tests, any junior engineer should be able to read up on your codebase, write linted code following your codebase style, verify it via tests and write you a report of what was done, challenges faced etc. State of the art coding agents will do that at superhuman speeds.

If you haven't set things up properly (important info lives only in people’s heads / meetings, tasks don't have clear acceptance criteria, ...) then you aren't ready for Junior Developers yet. You need to wait until your Coding Agents are at Senior level.

elyase commented on Launch HN: Browser Use (YC W25) – open-source web agents github.com/browser-use/br... · Posted by u/MagMueller

gregpr07 · a year ago

What’s your take - how can we expose Browser Use to as many use cases as possible? Is there an easier way than an OpenAPI config?

elyase · a year ago

I want to use browser-use in Cursor, but I am using another option because browser-use doesn't support MCP integration, which is the common language Cursor supports for external tools.

elyase commented on GPT-4o with scheduled tasks (jawbone) is available in beta chatgpt.com/?model=gpt-4o... · Posted by u/TheJCDenton

elyase · a year ago

There is more information in these twitter threads:

https://x.com/karinanguyen_/status/1879270529066262733

https://x.com/OpenAI/status/1879267276291203329

elyase commented on Show HN: Tonic Validate Metrics – an open-source RAG evaluation metrics package github.com/TonicAI/tvalme... · Posted by u/Ephil012

elyase · 2 years ago

How does it compare to https://github.com/explodinggradients/ragas?