Readit News
oceansweep commented on OSS ChatGPT WebUI – 530 Models, MCP, Tools, Gemini RAG, Image/Audio Gen   llmspy.org/docs/v3... · Posted by u/mythz
freedomben · 2 months ago
You can't downvote a post, so that's not a factor.

Also it's not as powerful as you think. In the past I have spent a lot of time looking at /new, and upvoting stories that I think should be surfaced. The vast majority of them still never hit near the front page.

It's a real shame, because some of the best and most relevant submissions don't seem to make it.

oceansweep · 2 months ago
You can absolutely downvote posts. You have to have a certain amount of karma before the option becomes available.
oceansweep commented on Over fifty new hallucinations in ICLR 2026 submissions   gptzero.me/news/iclr-2026... · Posted by u/puttycat
agentultra · 3 months ago
If I gave you a gun without a safety could you be the one to blame when it goes off because you weren’t careful enough?

The problem with this analogy is that it makes no sense.

LLMs aren’t guns.

The problem with using them is that humans have to review the content for accuracy. And that gets tiresome because the whole point is that the LLM saves you time and effort doing it yourself. So naturally people will tend to stop checking and assume the output is correct, “because the LLM is so good.”

Then you get false citations and bogus claims everywhere.

oceansweep · 3 months ago
Yes, that is absolutely the case. One of the most popular handguns, the Glock series, has no safety switch that must be toggled before firing.

If someone performs a negligent discharge, they are responsible, not Glock. The gun does have other safety mechanisms that prevent accidental discharges not resulting from a trigger pull.

oceansweep commented on Qwen3-VL can scan two-hour videos and pinpoint nearly every detail   the-decoder.com/qwen3-vl-... · Posted by u/thm
bigmadshoe · 3 months ago
Yeah the needle in a haystack tests are so stupid. It seems clear with LLMs that performance degrades massively with context size, yet those tests claim the model performs perfectly.
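For context, a needle-in-a-haystack test is mechanically very simple, which is part of why passing it says so little. A minimal sketch of how such a prompt is constructed (the filler text, needle, and token-per-word approximation are all illustrative assumptions):

```python
def build_niah_prompt(needle: str, filler: str, context_tokens: int, depth: float) -> str:
    """Construct a needle-in-a-haystack prompt: bury one fact (the needle)
    at a given relative depth inside repeated filler text, then ask the
    model to retrieve it. Roughly approximates one token per word."""
    copies = context_tokens // len(filler.split()) + 1
    haystack = ((filler + " ") * copies).split()[:context_tokens]
    haystack.insert(int(len(haystack) * depth), needle)
    return " ".join(haystack) + "\n\nQuestion: what is the magic number?"

prompt = build_niah_prompt(
    needle="The magic number is 7481.",
    filler="The grass is green and the sky is blue.",
    context_tokens=2000,
    depth=0.5,  # place the needle in the middle of the context
)
print("7481" in prompt)  # exact-string retrieval is all the test checks
```

The model only has to surface one verbatim string; nothing about multi-fact reasoning over the full context is exercised, which is the gap the comment is pointing at.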
oceansweep commented on How good engineers write bad code at big companies   seangoedecke.com/bad-code... · Posted by u/gfysfm
nosianu · 3 months ago
Then how do you work with this: https://news.ycombinator.com/item?id=18442941

I did that job, just after university, but that is not my comment. I bookmarked it though because that person said it so well.

You will write bad code, because what you already find there - and that one company is not alone! - is already so bad, there is no way to do a good job on top of literally millions of escalating hacks.

And don't think that you could clean this up - not even ten years would be enough. You can only rewrite from scratch. Trying to rewrite even a tiny part is like picking up one strand of spaghetti and always ending up with the whole bowl on your fork.

oceansweep · 4 months ago
Just wanted to say that I remember seeing that comment; it left such an impression that I still remember it 7 years later. Thanks for the reminder, I'm going to bookmark it this time.
oceansweep commented on Kimi K2 Thinking, a SOTA open-source trillion-parameter reasoning model   moonshotai.github.io/Kimi... · Posted by u/nekofneko
aliljet · 4 months ago
How does one effectively use something like this locally with consumer-grade hardware?
oceansweep · 4 months ago
Epyc Genoa CPU/mobo + 700 GB of DDR5 RAM. The model is a MoE, so you don't need to fit it all into VRAM: a single 3090/5090 can hold the activated weights while the remaining weights stay in DDR5 RAM. See their deployment guide for reference here: https://github.com/kvcache-ai/ktransformers/blob/main/doc/en...
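A back-of-envelope sketch of why this split works. The figures here are assumptions (roughly 1T total parameters with ~32B active per token for Kimi K2, at 4-bit quantization); real deployments need extra headroom for KV cache, activations, and runtime overhead:

```python
def moe_memory_gb(total_params_b: float, active_params_b: float,
                  bits_per_weight: float) -> tuple[float, float]:
    """Rough memory split for a MoE model: the per-token activated
    weights go to VRAM, the remaining expert weights sit in system RAM."""
    bytes_per_param = bits_per_weight / 8
    vram_gb = active_params_b * bytes_per_param          # billions of params -> GB
    ram_gb = (total_params_b - active_params_b) * bytes_per_param
    return vram_gb, ram_gb

# Assumed figures: ~1T total / ~32B active parameters, 4-bit quantization.
vram_gb, ram_gb = moe_memory_gb(1000, 32, 4)
print(f"VRAM for active weights: ~{vram_gb:.0f} GB, system RAM: ~{ram_gb:.0f} GB")
```

Under those assumptions the active weights land around 16 GB, which is why a single 24-32 GB consumer GPU suffices, while the hundreds of GB of inactive experts live in (slower but cheap) DDR5.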
oceansweep commented on AWS multiple services outage in us-east-1   health.aws.amazon.com/hea... · Posted by u/kondro
cutler · 5 months ago
MTBF?
oceansweep · 5 months ago
Mean time between failures
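The metric itself is just total operating time divided by the number of failures. A tiny illustration with hypothetical numbers:

```python
def mtbf_hours(total_operating_hours: float, failure_count: int) -> float:
    """MTBF = total operating time / number of failures."""
    return total_operating_hours / failure_count

# Hypothetical example: a service that ran 8760 hours (one year)
# and suffered 4 outages.
print(mtbf_hours(8760, 4))  # 2190.0 hours between failures on average
```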
oceansweep commented on The force-feeding of AI features on an unwilling public   honest-broker.com/p/the-f... · Posted by u/imartin2k
kgeist · 8 months ago
Model: Qwen3 32b

GPU: RTX 5090 (no rops missing), 32 GB VRAM

Quants: Unsloth Dynamic 2.0, it's 4-6 bits depending on the layer.

RAM is 96 GB. More RAM makes a difference even if the model fits entirely in the GPU: the filesystem pages holding the model on disk get cached in RAM, so when you switch models (we use other models as well) the unload/load overhead is only 3-5 seconds.

The key-value cache is also quantized to 8 bits (anything less degrades quality considerably).

This gives you 1 generation with 64k context, or 2 concurrent generations with 32k each. Everything takes 30 GB VRAM, which also leaves some space for a Whisper speech-to-text model (turbo & quantized) running in parallel as well.
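The KV-cache budget behind those context figures can be sanity-checked. The architecture numbers below (64 layers, 8 KV heads under GQA, head dim 128 for Qwen3-32B) are assumptions taken from public model configs, not from the comment:

```python
def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context: int, bytes_per_elem: int) -> float:
    """KV cache size: 2 (K and V) * layers * KV heads * head dim
    * context length * bytes per element, reported in GiB."""
    return 2 * layers * kv_heads * head_dim * context * bytes_per_elem / 2**30

# Assumed Qwen3-32B shape: 64 layers, 8 KV heads (GQA), head dim 128.
# 8-bit KV quantization = 1 byte per element.
print(f"{kv_cache_gb(64, 8, 128, 64 * 1024, 1):.1f} GiB for one 64k-context generation")
```

Under those assumptions one 64k context costs about 8 GiB of KV cache on top of the ~20 GB of quantized weights, which is consistent with the ~30 GB VRAM total described above; halving the context to 32k halves the cache, allowing two concurrent generations.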

oceansweep · 8 months ago
Are you doing this with vLLM? If you're currently using llama.cpp/Ollama, switching could likely get you some pretty massive improvements.
oceansweep commented on Reading RSS content is a skilled activity   doliver.org/articles/rss-... · Posted by u/d0liver
karpathy · a year ago
I also find myself wanting to go back to RSS for the exact same reasons of 1st paragraph. You own your content and host it. Unfortunately all the RSS readers are too raw and I think one of them has to port over Twitter features, things like: more ephemeral feed instead of an inbox, reply, quote tweet, retweet, like, follow, and new: LLM-driven customizable algorithmic feed.
oceansweep · a year ago
Could you expand on that? How do you imagine an ideal workflow going? Would there be a sea of tags that you could wade through, or just an '/all' view? Only items you specifically subscribe to plus your connections' subscriptions, ranked by some algorithm? Would items 'fall off' or out of view after X time, or after X amount of browsing a differently weighted topic?

I ask because, to be honest, you're a big inspiration for me and, inadvertently, the reason I'm adding an LLM-curated RSS feed reader as a planned feature to a project I'm working on. (I saw https://github.com/karpathy/LLM101n when I was getting interested in LLMs, and then got inspired by your project to attempt to build something like the Primer from The Diamond Age.)

Where that leads: I see an RSS feed reader plus curation via self-described or identified interests as a 'core' piece of information gathering for the 'future' individual, and I've had it on the to-do list as a feature for my project.
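One way the "ephemeral feed" idea above could be sketched: score each item by interest and decay that score exponentially with age, so items fall out of view instead of piling up like an inbox. Everything here is a hypothetical stand-in; in a real system the interest score would come from an LLM judging the item against the user's stated interests:

```python
from datetime import datetime, timedelta, timezone

def feed_score(interest: float, published: datetime,
               now: datetime, half_life_hours: float = 24.0) -> float:
    """Ephemeral-feed ranking: an interest score (0..1, e.g. produced by
    an LLM rating the item against the user's interests) decayed
    exponentially by age, so old items sink regardless of interest."""
    age_h = (now - published).total_seconds() / 3600
    return interest * 0.5 ** (age_h / half_life_hours)

now = datetime(2024, 1, 2, tzinfo=timezone.utc)
fresh = feed_score(0.9, now - timedelta(hours=2), now)
stale = feed_score(0.9, now - timedelta(hours=48), now)
print(fresh > stale)  # equally interesting, but the older item decays out of view
```

The half-life parameter is the "X time" knob: topics the user is currently browsing heavily could get a shorter half-life so the feed rotates faster.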

oceansweep commented on Show HN: Morphik – Open-source RAG that understands PDF images, runs locally   github.com/morphik-org/mo... · Posted by u/Adityav369
thot_experiment · a year ago
I'd love to have something like this but calling a cloud is a no-go for me. I have a half baked tool that a friend of mine and I applied to the Mozilla Builders Grant with (didn't get in), it's janky and I don't have time to work on it right now but it does the thing. I also find myself using OpenWebUI's context RAG stuff sometimes but I'd really like to have a way to dump all of my private documents into a DB and have search/RAG work against them locally, preferably in a way that's agnostic of the LLM backend.

Does such a project exist?

oceansweep · a year ago
Hey yes, I’m building exactly that.

https://github.com/rmusser01/tldw

I first built a POC in Gradio and am now rebuilding it as a FastAPI app. The media-processing endpoints work, but I'm still tweaking media ingestion to allow syncing to clients (the idea is a client-first design). The GitHub repo doesn't show any of the recent changes, but if you check back in 2-3 weeks, I think I'll have the API version pushed to the main branch.

oceansweep commented on TL;DW: Too Long; Didn't Watch Distill YouTube Videos to the Relevant Information   tldw.tube/... · Posted by u/pkaeding
ryanmcbride · a year ago
I thought maybe I'd finally be able to get through a Wendigoon video with this, since I'm generally interested in the things he talks about but can't stand some of his linguistic tics. Unfortunately, it looks like most of his videos are too long for this, which I guess is ironic considering the name.

TL;DP

oceansweep · a year ago
You could try my app: https://github.com/rmusser01/tldw

It supports arbitrary-length videos and lets you choose which LLM API to use.

u/oceansweep

Karma: 17 · Cake day: May 24, 2024