eachro (u/eachro) - Readit News

eachro commented on FunctionGemma 270M Model blog.google/technology/de... · Posted by u/mariobm

eachro · 2 days ago

Do you think this would be appropriate for a command-line tool that hits various APIs as the function calls? E.g., "what's the weather in SF tomorrow?" or "daily price change of Apple and Tesla stock for the past week"? (Let's assume I have documented the APIs thoroughly somewhere the model has access to, or fine-tuned it on this data.)

eachro commented on Integer Programming (1977) [pdf] web.mit.edu/15.053/www/AM... · Posted by u/todsacerdoti

eachro · 3 months ago

Does anyone know what the state-of-the-art industry solvers do for these problems? I dabbled a bit in ML approaches to combinatorial optimization with great interest a few years back, but I don't think any of these RL-based methods ended up being used in production.

eachro commented on Amazon has mostly sat out the AI talent war businessinsider.com/amazo... · Posted by u/ripe

eachro · 4 months ago

Didn't Amazon acquihire Adept Labs?

eachro commented on Gemma 3 270M re-implemented in pure PyTorch for local tinkering github.com/rasbt/LLMs-fro... · Posted by u/ModelForge

eachro · 4 months ago

If you wanted to train it from scratch, how long would it take on a reasonable GPU setup?

eachro commented on Fairness is what the powerful 'can get away with' study shows phys.org/news/2025-07-fai... · Posted by u/PaulHoule

eachro · 4 months ago

I'm reminded of the Nixon quote: "When the president does it, that means it's not illegal."

eachro commented on NYC's office-to-residential conversions could create 17,000 new homes 6sqft.com/nycs-first-wave... · Posted by u/geox

eachro · 5 months ago

What would it take to make NYC more like Tokyo, where you have consumer/retail-level things above the ground floor?

eachro commented on Smollm3: Smol, multilingual, long-context reasoner LLM huggingface.co/blog/smoll... · Posted by u/kashifr

eachro · 5 months ago

From what I've heard, the Llama 3 models are fairly easy to fine-tune (please correct me if I'm wrong or if there are more amenable models here). How easy is it to fine-tune SmolLM3? I know a lot of the MoE LLMs have been quite fickle in this regard.

eachro commented on Everything around LLMs is still magical and wishful thinking dmitriid.com/everything-a... · Posted by u/troupo

eachro · 6 months ago

"And 50% of the time they work 50% of the time."

I think this is still an incredible outcome given how many dice rolls you can take in parallel with multiple Claude/o3/Gemini attempts at a problem with slightly different prompts. Granted, each rollout does not come for free given the babysitting you need to do, but the cost is much lower than going down the path yourself or having junior colleagues make the attempt.
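
To put rough numbers on the dice-roll point, here is a minimal sketch. It assumes every attempt succeeds independently with the same probability p, which is an idealization: attempts with only slightly different prompts are likely correlated, so treat the result as an upper bound. The function name and the example values of p and n are illustrative, not measurements.

```python
def p_at_least_one_success(p: float, n: int) -> float:
    """Chance that at least one of n attempts succeeds.

    Assumption: attempts are independent and each succeeds with
    probability p. Correlated attempts will do worse than this.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    # All n attempts fail with probability (1 - p) ** n.
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Illustrative values only: a coin-flip success rate per attempt.
    for n in (1, 2, 3, 5):
        print(n, p_at_least_one_success(0.5, n))
```

Under these assumptions, three parallel attempts at p = 0.5 give a 0.875 chance that at least one works, before counting the babysitting cost of each rollout.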

eachro commented on Use keyword-only arguments in Python dataclasses chipx86.blog/2025/06/29/t... · Posted by u/Bogdanp

gjvc · 6 months ago

did you mean: "pydantic base models" ?

eachro · 6 months ago

Yeah haha I got autocorrected

eachro commented on Use keyword-only arguments in Python dataclasses chipx86.blog/2025/06/29/t... · Posted by u/Bogdanp

eachro · 6 months ago

Is there a reason to use data classes over pedantic base models anymore?

u/eachro

Karma: 884 · Cake day: March 5, 2013