Readit News
tgtweak commented on Elevated errors across many models   status.claude.com/inciden... · Posted by u/pablo24602
tgtweak · 2 days ago
I trust companies that immediately and regularly update their status/issues page and follow up any outages with proper, comprehensive post-mortems. Sadly, that's becoming the exception these days rather than the norm.
tgtweak commented on Elevated errors across many models   status.claude.com/inciden... · Posted by u/pablo24602
tgtweak · 2 days ago
There really should be an HTTP header dedicated to "outage status" with a link to the service's outage details page... clients (for example, in this case, your code IDE) could intercept it and notify users.

503 is cool, and yes, there's the "well, if it's down, how are they going to put that up?" objection, but in reality most downtime you see is on the backend, not on the reverse proxies/gateways/CDNs, where it would be pretty trivial to add an issues/status header with a link to the service status page and a note.
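A minimal sketch of what client-side handling could look like - the header names below are hypothetical, since no such standard exists today:

    import requests

    # Hypothetical header names -- illustrative only, not any standard.
    STATUS_HEADER = "X-Service-Status"          # e.g. "degraded" or "partial-outage"
    STATUS_URL_HEADER = "X-Service-Status-Url"  # link to the status page

    resp = requests.get("https://api.example.com/v1/messages")

    # A client (IDE plugin, CLI, etc.) could surface this instead of a bare 503.
    status = resp.headers.get(STATUS_HEADER)
    if status:
        url = resp.headers.get(STATUS_URL_HEADER, "no status page advertised")
        print(f"Provider reports '{status}' -- see {url}")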

tgtweak commented on GPT-5.2   openai.com/index/introduc... · Posted by u/atgctg
patates · 5 days ago
I don't have that experience with Gemini. Up to 90% full, it's just fine.
tgtweak · 2 days ago
If models are designed around long context, rather than resorting to compression to reach higher input token counts, they don't 'fall off' as they near the context window limit. When working with large codebases, exhausting or compressing the context actually causes more issues, since the agent forgets what was in the other libraries and files. Google realized this internally and was among the first to reach a 2M-token context length (internally at first, then released publicly).
tgtweak commented on The tiniest yet real telescope I've built   lucassifoni.info/blog/min... · Posted by u/chantepierre
tgtweak · 5 days ago
So what are these tiny portable ones? I always assumed they were digitally augmented, or even virtual - is there a minimum size for it to be a "real" telescope?
tgtweak commented on GPT-5.2   openai.com/index/introduc... · Posted by u/atgctg
mmaunder · 5 days ago
Weirdly, the blog announcement completely omits the actual new context window size, which is 400,000: https://platform.openai.com/docs/models/gpt-5.2

Can I just say !!!!!!!! Hell yeah! Blog post indicates it's also much better at using the full context.

Congrats OpenAI team. Huge day for you folks!!

Started on Claude Code and, like many of you, had that omg CC moment. Then got greedy.

Switched over to Codex when 5.1 came out. WOW. Really nice acceleration in my Rust/CUDA project which is a gnarly one.

Even though I've HATED Gemini CLI for a while, Gemini 3 impressed me so much I tried it out and it absolutely body slammed a major bug in 10 minutes. Started using it to consult on commits. Was so impressed it became my daily driver. Huge mistake. I almost lost my mind after a week of fighting it. Insane bias towards action. Ignoring user instructions. Garbage characters in output. Absolutely no observability into its thought process. And on and on.

Switched back to Codex just in time for 5.1 codex max xhigh, which I've been using for a week, and it was like a breath of fresh air. A sane agent that does a great job coding, but also a great job working hard on the planning docs for hours before we start. Listens to user feedback. Observability into the chain of thought. Moves reasonably quickly. And it also makes it easy to pay them more when I need more capacity.

And then today GPT-5.2 with an xhigh mode. I feel like Xmas has come early. Right as I'm doing a huge Rust/CUDA/math-heavy refactor. THANK YOU!!

tgtweak · 5 days ago
Have been on the 1M context window with Claude since 4.0 - it gets pretty expensive when you run 1M context on a long-running project (mostly using it in Cline for coding). I think they've realized more context length = more $ for most agentic coding workflows on the API.
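Rough illustration of the math, ignoring prompt caching - the per-token rate below is a placeholder, not actual Anthropic pricing:

    # Illustrative arithmetic only; the rate is an assumed placeholder.
    input_rate_per_mtok = 6.00   # assumed $/1M input tokens at a long-context tier
    context_tokens = 1_000_000   # a full 1M window resent each turn
    turns = 50                   # a modest agentic coding session

    cost = (context_tokens / 1_000_000) * input_rate_per_mtok * turns
    print(f"~${cost:.0f} for {turns} turns at full context")  # ~$300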
tgtweak commented on Auto-grading decade-old Hacker News discussions with hindsight   karpathy.bearblog.dev/aut... · Posted by u/__rito__
tgtweak · 6 days ago
Cool - now make it analyze all of those and come up with the 10 commandments of commenting factually and insightfully on HN posts...
tgtweak commented on Getting a Gemini API key is an exercise in frustration   ankursethi.com/blog/gemin... · Posted by u/speckx
tgtweak · 6 days ago
Have you looked at getting an OpenAI API key for GPT-5? You have to do selfie ID verification...
tgtweak commented on Qwen3-Omni-Flash-2025-12-01:a next-generation native multimodal large model   qwen.ai/blog?id=qwen3-omn... · Posted by u/pretext
terhechte · 7 days ago
Is there a way to run these Omni models on a MacBook, quantized via GGUF or MLX? I know I can run them in LM Studio or llama.cpp, but those don't have streaming microphone or streaming webcam support.

Qwen usually provides example code in Python that requires CUDA and a non-quantized model. I wonder if there is by now a good open source project to support this use case?

tgtweak · 7 days ago
You can probably follow the vLLM instructions for Omni here, then use the included voice demo HTML to interface with it:

https://github.com/QwenLM/Qwen3-Omni#vllm-usage

https://github.com/QwenLM/Qwen3-Omni?tab=readme-ov-file#laun...
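If you just want to smoke-test the model before wiring up the demo, here's a minimal offline-inference sketch - assuming a vLLM build with Qwen3-Omni support (the links above pin the exact setup) and that the model id below matches the variant you want:

    from vllm import LLM, SamplingParams

    llm = LLM(
        model="Qwen/Qwen3-Omni-30B-A3B-Instruct",  # assumed HF model id
        tensor_parallel_size=2,                    # split across 2 GPUs
        max_model_len=32768,
    )
    out = llm.generate(
        ["Say hello."],                 # text-only smoke test; the demo adds audio
        SamplingParams(max_tokens=128),
    )
    print(out[0].outputs[0].text)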

tgtweak commented on Mistral releases Devstral2 and Mistral Vibe CLI   mistral.ai/news/devstral-... · Posted by u/pember
lostmsu · 7 days ago
V100 is outdated (no bf16, dropped in CUDA 13) and power hungry (8 cards running continuously for 3 years is about $12k of electricity).
tgtweak · 7 days ago
Depends where you're plugging them in - but yes, they're older gen. Despite that, 8xV100 will outperform most of what you can buy for that price, simply by way of memory and NVLink bandwidth. If you want to practically run a local model that takes 200GB of memory (Devstral-2-123B-Instruct-2512, for example, or GPT-OSS-120B with a long context window) without resorting to aggressive GGUF quantization or memory swapping, you don't have many cheaper options. You can also parallelize several models on one node to get additional throughput for bulk jobs.
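Back-of-envelope on why the memory works out - this assumes the 32 GB V100 variant (the 16 GB cards halve the total):

    # Rough VRAM budget for the ~200 GB model example above.
    num_gpus = 8
    vram_per_gpu_gb = 32
    total_vram_gb = num_gpus * vram_per_gpu_gb      # 256 GB pooled over NVLink

    model_weights_gb = 200                          # e.g. the Devstral-2-123B-class footprint
    headroom_gb = total_vram_gb - model_weights_gb  # ~56 GB for KV cache/activations

    print(f"{total_vram_gb} GB total, {headroom_gb} GB left for KV cache")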
tgtweak commented on Mistral releases Devstral2 and Mistral Vibe CLI   mistral.ai/news/devstral-... · Posted by u/pember
tgtweak · 7 days ago
PSA: 10X cheaper isn't actually faster when you have to prompt it 10 times to get the correct solution.

u/tgtweak · Karma: 3976 · Cake day: January 18, 2016