And they usually try to sell you their courses/mentorships.
I don't know if I'm suspicious because they seem fake, or because they came out of nowhere and are earning in a month what I make in a year.
Step 1: Vibe-code 5 generic trash apps (e.g. AI interior designers, GPT wrappers)
Step 2: Launch, spending $15k on Google/Meta ads
Step 3: Make back ~$5k in revenue
Step 4: Spam Twitter and LinkedIn with clickbaity "here's how I reached 60K ARR in 3 milliseconds" posts to attract ex-crypto bros and hustlers
Step 5: Use your newfound following for sponsored content, courses, and hopefully some actual organic growth for your aforementioned trash apps
1) The funny thing about determinism is that deciding when to break it should itself be deterministic; it's kind of a recursive problem. Agents are inherently very tough to guardrail on an action space as big as CUA's. The guys from Browser Use realized this as well and built workflow-use. You could also try RL or finetuning per task, but that isn't viable (economically or technically) currently.
2) As you know, it's a very client-facing, heavily customized solution space. Tough to scale as a fresh startup unless you really niche down on some specific workflows. You might find this interesting; it reflects my thoughts on the space as well: https://x.com/erikdunteman/status/1923140514549043413 (he is also building in the deterministic-agent space now, funnily enough)
3) It actually gets annoyingly expensive with Claude if you break caching, which you have to at some point if you feed in every screenshot. You mentioned you use multiple models (I'm guessing UI-TARS / OmniParser?), but in the comments you said Claude?
4) Ultimately the big bet in the RPA space, as again you know, is that the TAM won't shrink much as more SAPs, ERPs, etc. implement APIs. Of course the big money will always be in the ancient apps that won't, but in that space UiPath and the others have a chokehold (and their agentic tech was actually surprisingly good when I had a look 3 months ago).
Good luck in any case! It feels like one of those spaces where we're definitely still a touch too early, but it's such a big market that there's plenty of room for a lot of people.
I find it interesting how marketers are trying to frame minimal prompting as a good thing, a direction to optimize for. Even when I talk to a senior engineer, I try to be as specific as possible to avoid ambiguities. Pushing models to just do whatever they think is best is a weird direction: there are so many subtle assumptions and understandings of the architecture that exist only in my head or a colleague's head. Meanwhile, I've found that a very good workflow is asking Claude Code to come back with clarifying questions and then a plan before it starts executing.
I also see this in a lot of the undergrads I work with. The top 10% are even better with LLMs: they know much more and they're more productive. But the rest have just resorted to turning in obvious slop with no care. I still haven't read a good solution for how to correctly incentivize/restrict the use of LLMs, in academia or at work. I suspect it's just the old reality that quality work isn't desirable to the vast majority, and LLMs are magnifying that.
I coded up the demo myself and didn't anticipate how disruptive the intermittent warning messages about waiting users would become. The demo is quite resource-intensive: each session currently requires its own H100 GPU, and I'm already using a dispatcher-worker setup with 8 parallel workers. Unfortunately, demand exceeded my setup, causing significant lag, and I had to limit sessions to 60 more seconds when others are waiting. Additionally, the underlying diffusion model itself is slow to run, resulting in a frame rate typically below 2 fps, further compounded by network bottlenecks.
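For anyone curious, a dispatcher-worker setup like the one described can be sketched with an asyncio queue. All names here (`serve_frames`, `dispatch`, etc.) are hypothetical stand-ins, not the demo's actual code; the real workers each drive an H100 running the diffusion model:

```python
import asyncio

NUM_WORKERS = 8        # the demo uses 8 parallel workers, one GPU each
GRACE_SECONDS = 60     # remaining time a session gets once others are waiting

async def serve_frames(session_id: int) -> None:
    """Hypothetical stand-in for the model's frame-generation loop."""
    await asyncio.sleep(0.01)  # real inference runs below 2 fps

async def run_session(session_id: int, queue: asyncio.Queue) -> None:
    if queue.empty():
        # No one waiting: let the session run freely.
        await serve_frames(session_id)
    else:
        # Others queued: cap the session, then evict on timeout
        # so the next user gets a GPU.
        try:
            await asyncio.wait_for(serve_frames(session_id), GRACE_SECONDS)
        except asyncio.TimeoutError:
            pass

async def worker(queue: asyncio.Queue, finished: list) -> None:
    while True:
        session_id = await queue.get()
        await run_session(session_id, queue)
        finished.append(session_id)
        queue.task_done()

async def dispatch(n_sessions: int) -> list:
    queue: asyncio.Queue = asyncio.Queue()
    finished: list = []
    workers = [asyncio.create_task(worker(queue, finished))
               for _ in range(NUM_WORKERS)]
    for sid in range(n_sessions):
        queue.put_nowait(sid)
    await queue.join()          # block until every queued session is served
    for w in workers:
        w.cancel()
    return finished
```

Usage would be something like `asyncio.run(dispatch(20))`; the grace-period eviction is what produces the "60 more seconds" warnings mentioned above.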
As for model capabilities, NeuralOS is indeed quite limited at this point (as acknowledged in my paper abstract). That's why the demo interactions shown in my tweet were minimal (opening Firefox, typing a URL).
Overall, this is meant as a proof-of-concept demonstrating the potential of generative, neural-network-powered GUIs. It's fully open-source, and I hope others can help improve it going forward!
Thanks again for the honest feedback.
What they did was:
1) Prompt an LLM for a generic description of potential buffer overflows in strcpy() and generic demonstration code for a buffer overflow (with no connection to curl or even OpenSSL at all).
2) Present some stack traces and grep results that show usage of strcpy() in curl and OpenSSL.
3) Simply claim that the strcpy() usages from 2) somehow indicate a buffer overflow, with no additional evidence.
4) When called out, just pretend that the demonstration code from 1) was the evidence, even though it's obviously just a textbook example and doesn't call any code from curl.
It's not that they found some potentially dangerous code in curl and just didn't go all the way to proving an overflow; that would have had at least some value.
The entire thing is just bullshit made to look like a vulnerability report. There is nothing behind it at all.
Edit: Oh, cherry on top: the demonstrator doesn't even use strcpy(), nor any other kind of buffer overflow. It tries to construct some shellcode in a buffer, then gives up and literally calls execve("/bin/sh")...
The worst part is that once the poor maintainers ask for clarification, these people go on the offensive and become aggressive. Imagine the nerve: using LLMs to try to gaslight an actual expert into believing they made a mistake, and then acting annoyed or angry when the expert asks normal questions.