Readit News logoReadit News
noodletheworld commented on DeepSeek-v3.1   api-docs.deepseek.com/new... · Posted by u/wertyk
dmos62 · 2 days ago
Sounds a bit presumptious to me. Sure, they have your needle, but they also need a cost-efficient way to find it in their hay stack.
noodletheworld · 2 days ago
Security through obscurity is not security.

Your api key is linked to your credit card, which is linked to your identity.

…but hey, youre right.

Lets just trust them not to be cheating. Cool.

noodletheworld commented on DeepSeek-v3.1   api-docs.deepseek.com/new... · Posted by u/wertyk
amelius · 2 days ago
Aren't good benchmarks supposed to be secret?
noodletheworld · 2 days ago
How can a benchmark be secret if you post it to an API to test a model on it?

"We totally promise that when we run your benchmark against our API we won't take the data from it and use to be better at your benchmark next time"

:P

If you want to do it properly you have to avoid any 3rd party hosted model when you test your benchmark, which means you can't have GPT5, claude, etc. on it; and none of the benchmarks want to be 'that guy' who doesn't have all the best models on it.

So no.

They're not secret.

noodletheworld commented on Weaponizing image scaling against production AI systems   blog.trailofbits.com/2025... · Posted by u/tatersolid
Martin_Silenus · 3 days ago
I did not ask about what AI can do.
noodletheworld · 3 days ago
> Is it part of the multi-modal system without it being able to differenciate that text from the prompt?

Yes.

The point the parent is making is that if your model is trained to understand the content of an image, then that's what it does.

> And even if they can't, they should at least improve the pipeline so that any OCR feature should not automatically inject its result in the prompt, and tell user about it to ask for confirmation.

That's not what is happening.

The model is taking <image binary> as an input. There is no OCR. It is understanding the image, decoding the text in it and acting on it in a single step.

There is no place in the 1-step pipeline to prevent this.

...and sure, you can try to avoid it procedural way (eg. try to OCR an image and reject it before it hits the model if it has text in it), but then you're playing the prompt injection game... put the words in a QR code. Put them in french. Make it a sign. Dial the contrast up or down. Put it on a t-shirt.

It's very difficult to solve this.

> It's hard to believe they can't prevent this.

Believe it.

noodletheworld commented on How to stop feeling lost in tech: the wafflehouse method   yacinemahdid.com/p/how-to... · Posted by u/research_pie
noodletheworld · 3 days ago
This isn't realistic.

Wanting things to be true does not make them true.

“Get a promotion this year, be a manager next year, manage the division in three years” is not a plan you can execute.

This is just the old self affirmation stuff you hear all the time: you won't succeed if you want it a bit. You wont succeed if want it and do nothing. You will succeed if you go all in, 100%.

It is BS.

You wont succeed if you go all in, statistically.

You might get a different outcome, but you wont hit your goal.

It is provably false that everyone who goes all succeeds; Not everyone gets to be an astronaut, no matter how hard they work.

The reality is that some people will put a little effort in and succeed, and some people will put a lot in and succeed. Other people will fail.

Your goals are not indicators of future success.

Only actual things that have actually happened are strong signals for future events.

The advice of having goals is helpful, but the much much more important thing to do is measure what actually happens and realistically create goals based on actual reality.

Try things. Measure things. Adopt things that work. Consciously record what you do, how it goes, how long it takes and use that to estimate achievable goals, instead of guessing randomly.

noodletheworld commented on Show HN: Project management system for Claude Code   github.com/automazeio/ccp... · Posted by u/aroussi
the_mitsuhiko · 4 days ago
> You can't, at least for production code.

You can. People do. It's not perfect at it yet, but there are success stories of this.

noodletheworld · 4 days ago
Are you talking about the same thing as the OP?

I mean, the parent even pointed out that it works for vibe coding and stuff you don't care about; ...but the 'You can't' refers to this question by the OP:

> I really need to approve every single edit and keep an eye on it at ALL TIMES, otherwise it goes haywire very very fast! How are people using auto-edits and these kind of higher-level abstraction?

No one I've spoken to is just sitting back writing tickets while agents do all the work. If it was that easy to be that successful, everyone would be doing it. Everyone would be talking about it.

To be absolutely clear, I'm not saying that you can't use agents to modify existing code. You can. I do; lots of people do. ...but that's using it like you see in all the demos and videos; at a code level, in an editor, while editing and working on the code yourself.

I'm specifically addressing the OPs question:

Can you use unsupervised agents, where you don't interact at a 'code' level, only at a high level abstraction level?

...and, I don't think you can. I don't believe anyone is doing this. I don't believe I've seen any real stories of people doing this successfully.

noodletheworld commented on Claudia – Desktop companion for Claude code   claudiacode.com/... · Posted by u/zerealshadowban
msikora · 6 days ago
The idea here is an IDE for Claude Code specifically. is most likely the strongest coding agent right now, but not everyone loves the command line only interface. So I totally get it.
noodletheworld · 6 days ago
Is a whole IDE really the solution though?

There are already a plugins to use claude code in other IDEs.

This “Ill write a whole IDE because you get the best UX” seems like its a bit of a fallacy.

There are lots of ways you could do that.

A standalone application is just convenient for your business/startup/cross sell/whatever.

noodletheworld commented on Show HN: Yet another memory system for LLMs   github.com/trvon/yams... · Posted by u/blackmanta
ActorNightly · 10 days ago
>MCP server (requires Boost)

I see stuff like this, and I really have to wonder if people just write software with bloat for the sake of using a particular library.

noodletheworld · 10 days ago
? Are you complaining about MCP or boost?

It’s an optional component.

What do you want the OP to do?

MCP may not be strictly necessary but it’s straight in line with the intent of the library.

Are you going to take shots at llama.cpp for having an http server and a template library next?

Come on. This uses conan, it has a decent cmake file. The code is ok.

This is pretty good work. Dont be a dick. (Yeah, ill eat the down votes, it deserves to be said)

noodletheworld commented on Ask HN: Do you struggle with flow state when using AI assisted coding tools?    · Posted by u/rasca
noodletheworld · 18 days ago
When you stop and wait for an LLM it breaks flow.

If you stop and review what an agent did, it breaks flow.

Personally, my experience has been the best with doing this:

- Start coding

- On a second laptop, run an agent in yolo and tell it to periodically pull changes and if there are any `AGENT-TODO:` items in the code, do them.

- As you code, if you find something irritating or boring, `AGENT-TODO: ...` it.

- Periodically push your changes.

- Periodically pull any changes your agent has pushed down.

- Keep working; don't stop and check on the agent. Don't confirm every action. Just yolo.

If that's too scary, have it put up PRs instead of pushing to the live branch. /shrug

...but, tldr: if you're sitting there watching an agent do things, you're not in flow. If you're kicking multiple agents off and sitting waiting for them, you're more productive, but that is absolutely not flow state.

Anyone who thinks they are, doesn't know what flow state is.

The key to maintaining flow is having a second clone, or a second machine or something where you can keep doing work after you kick the agent off to do something.

(yeah, you don't need a second laptop, but it's nice; agents will often run things to check they work or steal ports or run tests that can screw with you if you're on the same machine)

noodletheworld commented on GitHub CEO Warns Developers: "Either Embrace AI or Get Out of This Career"   finalroundai.com/blog/git... · Posted by u/pjmlp
noodletheworld · 19 days ago
> You know what else we noticed in the interviews? Developers rarely mentioned “time saved” as the core benefit of working in this new way with agents.

> They were all about increasing ambition. We believe that means that we should update how we talk about (and measure) success when using these tools

I’m struggling to understand what this means.

noodletheworld commented on Deep Agents   blog.langchain.com/deep-a... · Posted by u/saikatsg
noodletheworld · 22 days ago
This matches my expectations.

Now that its increasingly clear that writing MCP servers isn't a winning strategy, people need a new way to jump on the band wagon as easily as possible.

Writing your own agent like geminin and claude code is the new hotness right now.

- low barrier to entry (tick)

- does something reasonably useful (tick)

- doesnt require any deep ai knowledge or skill (tick)

- easy to hype (tick)

Its like “cursor but for X” but easier to ship.

Were going to see a tonne of coding agents built this way, but my intuition is, and what Ive seen so far, is theyre not actually introducing anything novel.

Maybe having a quick start like this is good, because it drops the value of an unambitious direct claude code clone to zero.

I like it.

u/noodletheworld

KarmaCake day234December 14, 2024View Original