Readit News
Overview
Stories
Comments
Posted by
u/veryluckyxyz
9 days ago
Deep Think with Confidence
jiaweizzhao.github.io/dee...
Posted by
u/veryluckyxyz
3 months ago
A Batch Size and Token Number Agnostic Learning Rate Scheduler
arxiv.org/abs/2408.13359...
Posted by
u/veryluckyxyz
3 months ago
Easily Understand RDMA Technology
naddod.com/blog/easily-un...
Posted by
u/veryluckyxyz
3 months ago
Model Merging in Pre-Training of Large Language Models
arxiv.org/abs/2505.12082...
Posted by
u/veryluckyxyz
4 months ago
Understanding Perception and Reasoning Through Model Merging
arxiv.org/abs/2505.05464...
Posted by
u/veryluckyxyz
4 months ago
Building and better understanding vision-language models (2024)
huggingface.co/papers/240...
Posted by
u/veryluckyxyz
4 months ago
HF smolagents computer-agent demo
huggingface.co/spaces/smo...
Posted by
u/veryluckyxyz
4 months ago
Do Reasoning Models Show Better Verbalized Calibration?
arxiv.org/abs/2504.06564...
Posted by
u/veryluckyxyz
5 months ago
Robustly identifying concepts introduced during chat fine-tuning with crosscoder
arxiv.org/abs/2504.02922...
Posted by
u/veryluckyxyz
5 months ago
Retrieval with Learned Similarities
arxiv.org/pdf/2407.15462v...
u/veryluckyxyz
Karma
Cake day
534
September 30, 2014