Readit News
atlacatl_sv commented on XBai o4, where o=open, and o4 represents our fourth gen open-source LLM tech   github.com/MetaStone-AI/X... · Posted by u/atlacatl_sv
atlacatl_sv · a month ago
XBai o4 excels in complex reasoning capabilities and has now completely surpassed OpenAI-o3-mini in Medium mode.
atlacatl_sv commented on Best explanation of Q, K, V? (Attention)    · Posted by u/profsummergig
atlacatl_sv · 7 months ago
Please have a look at this video, hope it helps: https://youtu.be/KJtZARuO3JY
atlacatl_sv commented on DeepSeek-R1   github.com/deepseek-ai/De... · Posted by u/meetpateltech
ein0p · 8 months ago
Downloaded the 14B, 32B, and 70B variants to my Ollama instance. All three are very impressive, subjectively much more capable than QwQ. 70B especially, unsurprisingly. Gave it some coding problems, even 14B did a pretty good job. I wish I could collapse the "thinking" section in Open-WebUI. Also, the chat title is currently generated incorrectly: by default the same model is used for titles as for generation, so the title begins with "<thinking>". Be that as it may, I think these will be the first "locally usable" reasoning models for me. URL for the checkpoints: https://ollama.com/library/deepseek-r1
atlacatl_sv · 7 months ago
Thanks for sharing your experience with the 14B, 32B, and 70B variants! I'm curious, what hardware setup are you using to run these models on your Ollama instance?
atlacatl_sv commented on Are there HN posts with more than 2k upvotes?    · Posted by u/dkpk
atlacatl_sv · 8 months ago
atlacatl_sv commented on Glowing Orb on Camera in the New Jersey Sky   youtube.com/watch?v=eMitm... · Posted by u/atlacatl_sv
atlacatl_sv · 9 months ago
Here is another video from the Pentagon from five years ago where they developed a plasma technology that looks very similar to the glowing orb. On another note, I'm wondering why Hacker News seems to ignore this topic. I don't see any drones or orbs on the front page.

https://youtu.be/UYr3zPP5rCw

atlacatl_sv commented on Mamba Explained: The State Space Model Taking On Transformers   kolaayonrinde.com/blog/20... · Posted by u/koayon
thecolorgreen · 2 years ago
Why doesn't Equation 1b use the h' defined in Equation 1a?
atlacatl_sv · 2 years ago
I believe h' describes how the state changes (its time derivative), which is what produces the next state. y(t) is used to predict the next word, so it reads from the current hidden state h(t).
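
To make the distinction concrete, here is a minimal sketch of the discretized recurrence, assuming Equations 1a and 1b are the usual state-space form h'(t) = A h(t) + B x(t) and y(t) = C h(t). The function name `ssm_scan` and the toy matrices are hypothetical, chosen only for illustration; this is not the Mamba implementation, and it omits the input-dependent (selective) parameters.

```python
import numpy as np

def ssm_scan(A_bar, B_bar, C, xs):
    """Run a toy discrete state-space model over a 1-D input sequence.

    A_bar, B_bar are assumed to be already-discretized parameters
    (hypothetical values here, not derived from a real model).
    """
    h = np.zeros(A_bar.shape[0])      # initial hidden state
    ys = []
    for x in xs:
        # State update: the discrete counterpart of Equation 1a.
        h = A_bar @ h + B_bar * x
        # Readout: Equation 1b uses the current state h, not h'.
        ys.append(float(C @ h))
    return ys

A_bar = 0.5 * np.eye(2)               # toy decay of the state
B_bar = np.array([1.0, 1.0])
C = np.array([1.0, 0.0])
outputs = ssm_scan(A_bar, B_bar, C, [1.0, 0.0, 0.0])
```

The update line only evolves the state, while the readout line is the only place the output is computed, which is why 1b is written in terms of h(t) rather than h'(t).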
atlacatl_sv commented on Salvadoran Government Conspired with Gang Leader to Recapture Crook   elfaro.net/en/202401/el_s... · Posted by u/atlacatl_sv
atlacatl_sv · 2 years ago
As usual, Bukele attributes everything to George Soros[1]. Every single time evidence emerges, he shifts the blame to Soros. Does anyone understand why dictators often seem to blame Soros?

[1] https://twitter.com/nayibbukele/status/1751365034477752782

u/atlacatl_sv

Karma: 397 · Cake day: September 3, 2019