veryluckyxyz (u/veryluckyxyz) - Readit News

Posted by u/veryluckyxyz 9 days ago

Deep Think with Confidence jiaweizzhao.github.io/dee...

Posted by u/veryluckyxyz 3 months ago

A Batch Size and Token Number Agnostic Learning Rate Scheduler arxiv.org/abs/2408.13359...

Posted by u/veryluckyxyz 3 months ago

Easily Understand RDMA Technology naddod.com/blog/easily-un...

Posted by u/veryluckyxyz 3 months ago

Model Merging in Pre-Training of Large Language Models arxiv.org/abs/2505.12082...

Posted by u/veryluckyxyz 4 months ago

Understanding Perception and Reasoning Through Model Merging arxiv.org/abs/2505.05464...

Posted by u/veryluckyxyz 4 months ago

Building and better understanding vision-language models (2024) huggingface.co/papers/240...

Posted by u/veryluckyxyz 4 months ago

HF smolagents computer-agent demo huggingface.co/spaces/smo...

Posted by u/veryluckyxyz 4 months ago

Do Reasoning Models Show Better Verbalized Calibration? arxiv.org/abs/2504.06564...

Posted by u/veryluckyxyz 5 months ago

Robustly identifying concepts introduced during chat fine-tuning with crosscoder arxiv.org/abs/2504.02922...

Posted by u/veryluckyxyz 5 months ago

Retrieval with Learned Similarities arxiv.org/pdf/2407.15462v...

u/veryluckyxyz

Karma: 534
Cake day: September 30, 2014