kk58 commented on Build real-time knowledge graph for documents with LLM   cocoindex.io/blogs/knowle... · Posted by u/badmonster
gitroom · 3 months ago
[flagged]
kk58 · 3 months ago
What's the angle you're thinking?
kk58 commented on Does RL Incentivize Reasoning in LLMs Beyond the Base Model?   limit-of-rlvr.github.io/... · Posted by u/leodriesch
kk58 · 4 months ago
Reasoning models aren't really reasoners; it's basically neural style transfer, where you force a model's decoder to emit tokens in a style that looks like deductive reasoning.
kk58 commented on I built a large language model "from scratch"   brettgfitzgerald.com/post... · Posted by u/controversy187
sakesun · 6 months ago
I've found none of the explanations of how LLMs are built have been satisfying, especially considering how impressive the applications of them are.
kk58 · 6 months ago
Curious, what questions do you have that are still really unanswered?
kk58 commented on "Language Models Can Solve Engineering Optimization Problems"   sciencedirect.com/science... · Posted by u/kk58
kk58 · 7 months ago
Using LLMs as optimizers: a new population-based method called LEO leverages large language models for numerical optimization tasks such as nozzle shape and wind-farm layout design. It shows results comparable to traditional methods while highlighting the unique challenges of LLM-based optimization.
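The population-based loop described above can be sketched roughly as follows. This is a hedged illustration, not the paper's actual algorithm: `propose()` stands in for the step where an LLM is prompted with the current population and scores and asked to suggest new candidates (here it is replaced by random mutation of the best candidates), and `objective()` is a toy function rather than a nozzle or wind-farm simulation.

```python
import random

def objective(x):
    """Toy objective standing in for an engineering simulation: minimize it."""
    return (x - 3.0) ** 2

def propose(population):
    """Stand-in for the LLM proposal step: perturb the best few candidates."""
    best = sorted(population, key=objective)[:3]
    return [x + random.gauss(0, 0.5) for x in best for _ in range(2)]

random.seed(0)
pop = [random.uniform(-10, 10) for _ in range(6)]   # initial population
for _ in range(30):
    # add proposals, then keep the best six (selection step)
    pop = sorted(pop + propose(pop), key=objective)[:6]
print(round(min(objective(x) for x in pop), 3))
```

The interesting design question LEO-style methods raise is exactly the `propose()` step: an LLM can exploit problem descriptions in natural language, but unlike a gradient or a CMA-ES update it gives no convergence guarantees.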
kk58 commented on An Introduction to Neural Ordinary Differential Equations [pdf]   diposit.ub.edu/dspace/bit... · Posted by u/gballan
barrenko · 7 months ago
How would you describe what a neural ODE is in the simplest possible terms? Let's say I know what an NN and a DE are :).
kk58 · 7 months ago
A classic NN takes a vector of data through a stack of discrete layers to make a prediction. Backprop adjusts the network weights until the predictions are right: the weights form a vector, and training moves this vector until it reaches values that mean "trained network".

A neural ODE reframes the layer stack: instead of a fixed sequence of discrete layers, it treats the hidden state as evolving continuously. A small network defines the rate of change, dh/dt = f(h, t, θ), and an ODE solver integrates that from the input state to the output state, so "depth" becomes integration time. Training still adjusts θ with backprop (via the adjoint method); the solver then gives you the forward pass of the trained network.
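A minimal sketch of that continuous-depth view, using a fixed-step Euler integrator instead of the adaptive solvers real implementations use; the dynamics function and names here are illustrative, not from any particular library:

```python
import numpy as np

def f(h, t, W):
    """Dynamics function dh/dt: a tiny one-layer network."""
    return np.tanh(W @ h)

def odeint_euler(h0, W, t0=0.0, t1=1.0, steps=100):
    """Integrate dh/dt = f(h, t, W) from t0 to t1 with fixed-step Euler."""
    h, dt = h0, (t1 - t0) / steps
    for i in range(steps):
        h = h + dt * f(h, t0 + i * dt, W)
    return h

rng = np.random.default_rng(0)
W = rng.normal(scale=0.5, size=(4, 4))   # parameters θ of the dynamics
x = rng.normal(size=4)                   # input = initial state h(t0)
y = odeint_euler(x, W)                   # output = final state h(t1)
print(y.shape)  # (4,)
```

Swapping Euler for a higher-order adaptive solver changes accuracy and cost but not the idea: the network never stacks layers, it only defines the derivative.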

kk58 commented on Eighty Years of the Finite Element Method (2022)   link.springer.com/article... · Posted by u/sandwichsphinx
angry_moose · 10 months ago
Those are definitely up next in the flashy-new-thing pipeline and I'm not that up to speed on them yet.

Another group within my company is evaluating them right now and the early results seems to be "not very accurate, but directionally correct and very fast" so there may be some value in non-FEM experts using them to quickly tell if A or B is a better design; but will still need a more proper analysis in more accurate tools.

It's still early though and we're just starting to see the first non-research solvers hitting the market.

kk58 · 10 months ago
Very curious; we are getting good results with PINNs and neural operators. What's your domain?
kk58 commented on STORM: Get a Wikipedia-like report on your topic   storm.genie.stanford.edu/... · Posted by u/fragmede
kk58 · a year ago
Doesn't even work. 500 error
kk58 commented on The Tyranny of Possibilities in the Design of Task-Oriented LLM Systems   arxiv.org/abs/2312.17601... · Posted by u/dhruvdh
kk58 · 2 years ago
LLMs are fundamentally stateless. The exoskeleton of agents is essentially a workaround to maintain state that resides externally to the LLM, and some form of this is required to create an agent. Task-oriented agents require reasoning and planning, but its nature is wildly different from the kind of behaviour observed in the interactive simulacra paper. Your ideas have merit, but I feel they need some further qualification.
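The "exoskeleton" point can be made concrete with a sketch: because the model remembers nothing between calls, the wrapper must re-inject all state on every turn. This is a hedged illustration, not any specific framework's API; `call_llm` is a placeholder for a real chat-completion call.

```python
def call_llm(messages):
    """Placeholder for a real LLM API call; here it just echoes the input."""
    return f"ack: {messages[-1]['content']}"

class Agent:
    """Stateless model + external state = agent."""
    def __init__(self):
        self.history = []     # external state: the full transcript
        self.scratchpad = {}  # external state: plans, tool results, etc.

    def step(self, user_input):
        self.history.append({"role": "user", "content": user_input})
        reply = call_llm(self.history)   # entire state re-sent every call
        self.history.append({"role": "assistant", "content": reply})
        return reply

agent = Agent()
print(agent.step("plan the task"))  # prints "ack: plan the task"
print(len(agent.history))           # 2
```

Everything that looks like memory or planning lives in `history` and `scratchpad`, outside the model, which is exactly the workaround described above.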

u/kk58

Karma: 130 · Cake day: August 25, 2017
About
Machine learning research on graphs/complex networks, climate adaptation technology, and scientific machine learning