Readit News
eachro commented on FunctionGemma 270M Model   blog.google/technology/de... · Posted by u/mariobm
eachro · 2 days ago
Do you think this would be appropriate for a command-line tool that hits various APIs as the function calls? E.g., "what's the weather in SF tomorrow?" or "daily price change of Apple, Tesla stock for past week"? (Let's assume I have documented the APIs thoroughly somewhere the model has access to, or fine-tuned it on this data.)
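For concreteness, here is a minimal sketch of the kind of CLI dispatch loop the question describes. It assumes the model emits a function call as a JSON object with `name` and `arguments` keys; that format, the tool names `get_weather` and `get_price_change`, and their parameters are all hypothetical placeholders, not FunctionGemma's documented output or any real API.

```python
import json
from typing import Any, Callable


# Hypothetical stand-ins for real API wrappers; they return canned data
# instead of making network requests.
def get_weather(city: str, day: str) -> dict[str, Any]:
    return {"city": city, "day": day, "forecast": "placeholder"}


def get_price_change(tickers: list[str], days: int) -> dict[str, Any]:
    return {"tickers": tickers, "days": days, "changes": "placeholder"}


# Registry mapping the names the model may emit to local callables.
TOOLS: dict[str, Callable[..., dict[str, Any]]] = {
    "get_weather": get_weather,
    "get_price_change": get_price_change,
}


def dispatch(model_output: str) -> dict[str, Any]:
    """Parse an (assumed) JSON function call and run the matching tool."""
    call = json.loads(model_output)
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


# Assumed model output for "what's the weather in SF tomorrow?"
example = '{"name": "get_weather", "arguments": {"city": "SF", "day": "tomorrow"}}'
print(dispatch(example))
```

The registry keeps the model from invoking anything outside a fixed allowlist, which matters more than usual when a small model's output is the only thing choosing the call.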
eachro commented on Integer Programming (1977) [pdf]   web.mit.edu/15.053/www/AM... · Posted by u/todsacerdoti
eachro · 3 months ago
Does anyone know what the state-of-the-art industry solvers do for these problems? I dabbled a bit in ML approaches to combinatorial optimization with great interest a few years back, but I don't think any of these RL-based methods ended up being used in production.
eachro commented on Amazon has mostly sat out the AI talent war   businessinsider.com/amazo... · Posted by u/ripe
eachro · 4 months ago
Didn't Amazon acquihire Adept Labs?
eachro commented on Gemma 3 270M re-implemented in pure PyTorch for local tinkering   github.com/rasbt/LLMs-fro... · Posted by u/ModelForge
eachro · 4 months ago
If you wanted to train it from scratch, how long would it take on a reasonable GPU setup?
eachro commented on Fairness is what the powerful 'can get away with' study shows   phys.org/news/2025-07-fai... · Posted by u/PaulHoule
eachro · 4 months ago
I'm reminded of the Nixon quote: "When the president does it, that means it's not illegal."
eachro commented on NYC's office-to-residential conversions could create 17,000 new homes   6sqft.com/nycs-first-wave... · Posted by u/geox
eachro · 5 months ago
What would it take to make NYC more like Tokyo, where you have consumer/retail-level things on floors above ground level?
eachro commented on Smollm3: Smol, multilingual, long-context reasoner LLM   huggingface.co/blog/smoll... · Posted by u/kashifr
eachro · 5 months ago
From what I've heard, the llama3 models are fairly easy to fine-tune (please correct me if I'm wrong or if there are more amenable models here). How easy is it to fine-tune smollm3? I know a lot of the MoE LLMs have been quite fickle in this regard.
eachro commented on Everything around LLMs is still magical and wishful thinking   dmitriid.com/everything-a... · Posted by u/troupo
eachro · 6 months ago
"And 50% of the time they work 50% of the time."

I think this is still an incredible outcome given how many dice rolls you can take in parallel with multiple Claude/o3/Gemini attempts at a problem with slightly different prompts. Granted, each rollout does not come for free given the babysitting you need to do, but the cost is much lower than going down the path yourself or having junior colleagues make the attempt.
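A back-of-the-envelope sketch of the dice-roll argument, assuming every attempt succeeds independently with the same probability. That is an optimistic assumption, since attempts from similar prompts are likely correlated, and the 25% figure is just a literal reading of the quote (50% of 50%), not a measured rate.

```python
def p_any_success(p: float, k: int) -> float:
    """Probability that at least one of k independent attempts succeeds."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    # All k attempts fail with probability (1 - p) ** k.
    return 1.0 - (1.0 - p) ** k


# Assumed per-attempt success rate: 0.5 * 0.5 = 0.25.
for k in (1, 2, 4, 8):
    print(k, round(p_any_success(0.25, k), 3))
```

Under these assumptions, four parallel attempts already succeed at least once about two times in three; the babysitting cost per rollout is what the formula leaves out.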

eachro commented on Use keyword-only arguments in Python dataclasses   chipx86.blog/2025/06/29/t... · Posted by u/Bogdanp
gjvc · 6 months ago
did you mean: "pydantic base models" ?
eachro · 6 months ago
Yeah haha I got autocorrected
eachro commented on Use keyword-only arguments in Python dataclasses   chipx86.blog/2025/06/29/t... · Posted by u/Bogdanp
eachro · 6 months ago
Is there a reason to use data classes over pedantic base models anymore?

u/eachro

Karma: 884 · Cake day: March 5, 2013