meander_water (u/meander_water)

meander_water commented on SpaCy: Industrial-Strength Natural Language Processing (NLP) in Python github.com/explosion/spaC... · Posted by u/marklit

coder68 · 3 days ago

I have been working on text classification tasks at work, and I have found that for my particular use-case, LLMs are not performing well at all. I have spent a few thousand dollars trying, and I have tried everything from few-shot to asking simple binary yes/no questions, and I have had mixed success.

I have stopped trying to use LLMs for this project and switched to discriminative models (Logistic Regression with TFIDF or Embeddings), which are both more computationally efficient and more debuggable. I'm not entirely sure why, but for anything with many possible answers, or to which there is some subjectivity, I have not had success with LLMs simply due to inconsistency of responses.

For VERY obvious tasks like: "is this store a restaurant or not?" I have definitely had success, so YMMV.

meander_water · 2 days ago

Are your categories fixed? If so you could constrain the output using enums in structured outputs.

re: inconsistencies in output, OpenAI provide a seed and system_fingerprint options to (mostly) produce deterministic output.

meander_water commented on We put a coding agent in a while loop github.com/repomirrorhq/r... · Posted by u/sfarshid

VincentEvans · 5 days ago

There will be a a new kind of job for software engineers, sort of like a cross between working with legacy code and toxic site cleanup.

Like back in the day being brought in to “just fix” a amalgam of FoxPro-, Excel-, and Access-based ERP that “mostly works” and only “occasionally corrupts all our data” that ambitious sales people put together over last 5 years.

But worse - because “ambitious sales people” will no longer be constrained by sandboxes of Excel or Access - they will ship multi-cloud edge-deployed kubernetes micro-services wired with Kafka, and it will be harder to find someone to talk to understand what they were trying to do at the time.

meander_water · 4 days ago

I think we're already there [0].

[0] https://x.com/PovilasKorop/status/1959590015018652141

Im really curious about what other jobs will pop up. As long as there is an element of probability associated with AI, there will need to be manual supervision for certain tasks/jobs.

meander_water commented on ThinkMesh: A Python lib for parallel thinking in LLMs github.com/martianlantern... · Posted by u/martianlantern

meander_water · 5 days ago

Nice, I'd love to see this added to the llm-reasoners project [0]. They've got a nice set of reasoning techniques implemented from papers.

[0] https://github.com/maitrix-org/llm-reasoners

meander_water commented on How to build a coding agent ghuntley.com/agent/... · Posted by u/ghuntley

ofirpress · 5 days ago

We (the Princeton SWE-bench team) built an agent in ~100 lines of code that does pretty well on SWE-bench, you might enjoy it too: https://github.com/SWE-agent/mini-swe-agent

meander_water · 5 days ago

> 1. Analyze the codebase by finding and reading relevant files 2. Create a script to reproduce the issue 3. Edit the source code to resolve the issue 4. Verify your fix works by running your script again 5. Test edge cases to ensure your fix is robust

This prompt snippet from your instance template is quite useful. I use something like this for getting out of debug loops:

> Analyse the codebase and brainstorm a list of potential root causes for the issue, and rank them from most likely to least likely.

Then create scripts or add debug logging to confirm whether your hypothesis is correct. Rule out root causes from most likely to least by executing your scripts and observing the output in order of likelihood.

meander_water commented on Developer's block underlap.org/developers-b... · Posted by u/todsacerdoti

meander_water · 6 days ago

Great advice all round.

> Take time with learning

But this one in particular stands out. We are being constantly pushed to ship code at faster and faster rates. AI has only hastened the process.

If you want to learn anything new you have to slow it down, push back against all the forces urging you to do more, ship more, make more money.

If you're using AI tools, do the opposite of what everyone else is doing. For every piece of generated code you accept, scrutinize every line, ask clarifying questions, ask for alternate implementations, ask what the tradeoffs are.

Just be curious.

This will be slow, but that's the point.

meander_water commented on Show HN: I replaced vector databases with Git for AI memory (PoC) github.com/Growth-Kinetic... · Posted by u/alexmrv

meander_water · 8 days ago

You could use BM25S [0] instead of rank-bm25 for a nice speedup.

Also, there are tradeoffs associated with using BM25 instead of embedding similarity. You're essentially trading semantic understanding for computational speed and keyword matching.

[0] https://github.com/xhluca/bm25s

meander_water commented on The End of Handwriting wired.com/story/the-end-o... · Posted by u/beardyw

meander_water · 9 days ago

I've tried for years to keep a regular journal. But everytime I stare at a blank screen I can't summon up enough activation energy to write anything.

On a whim, I tried writing in a physical journal, and to my surprise I found it a lot easier to be consistent and write down my thoughts before they disappear. It also improved my handwriting over time, and also your hands hurt less the more you write.

One theory I have is that writing is just slow enough for me to buffer my thoughts in memory. Typing is too fast, and by the time I've written a sentence I've lost track of my train of thought.

meander_water commented on AGENTS.md – Open format for guiding coding agents agents.md/... · Posted by u/ghuntley

setopt · 9 days ago

The point is that .agents is a hidden file while AGENTS.md is in your face like a README intended for humans.

Having an in-your-face file that links to a hidden file serves no purpose.

meander_water · 9 days ago

I don't see the point of having it hidden though. Having it "in your face" means you can actively tune it yourself, or using the LLM itself.

meander_water commented on AGENTS.md – Open format for guiding coding agents agents.md/... · Posted by u/ghuntley

CharlesW · 10 days ago

This should've been an .agents¹ with an index.md.

For tiny, throwaway projects, a monolithic .md file is fine. A folder allows more complex projects to use "just enough hierarchy" to provide structure, with index.md as the entry point. Along with top-level universal guidance, it can include an organization guide (easily maintained with the help of LLMs).

  index.md
  ├── auth.md
  ├── performance.md
  ├── code_quality
  ├── data_layer
  ├── testing
  └── etc

In my experience, this works loads better than the "one giant file" method. It lets LLMs/agents add relevant context without wasting tokens on unrelated context, reduces noise/improves response accuracy, and is easier to maintain for both humans and LLMs alike.

¹ Ideally with a better name than ".agents", like ".codebots" or ".context".

meander_water · 9 days ago

There shouldn't be anything stopping you from doing that.

You can just use the AGENTS.md file as an index pointing to other doc files.

This example does that -

https://github.com/apache/airflow/blob/main/AGENTS.md