This interface needs a better relationship with streaming. There is always lag in the response, and a lot of people are going to want to stream it in non-blocking threads instead of hanging the process while waiting. It's possible this is just a documentation issue, but either way, streaming is a first-class citizen on anything that takes more than a couple of seconds to finish and uses IO.
Valid point. I'm actually already testing better streaming using async-http-faraday, which configures the default Faraday adapter to use async_http, paired with falcon and async-job instead of thread-based approaches like puma and SolidQueue. This should significantly improve resource efficiency for AI workloads in Ruby - something I'm not aware of in any other major Ruby LLM library. The current block-based approach is idiomatic Ruby, but the upcoming async support will make the library even better for production use cases. Stay tuned!
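Roughly, the adapter swap looks like this (a minimal sketch, assuming async-http-faraday's registered `:async_http` adapter):

```ruby
require "faraday"
require "async/http/faraday"

# Inside an Async reactor, requests on this adapter run on fibers,
# so a streaming call waits on the socket without pinning a thread.
Faraday.default_adapter = :async_http
```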
This will synchronously block until `chat.ask` returns, though. Be prepared to pay for the memory of your whole app - tens to low hundreds of MB held alive doing nothing (other than handling new chunks) until whatever streaming API this is using under the hood finishes.
Every language prioritizes something (or somethings) because every language was made by a person (or people) with a reason: Python and correctness; Java and splitting up work; Go and something like "simplicity" (not that these are the only priorities for each language). As another comment points out, Matz prioritized developer happiness.
My favorite example of this is the amazingly useful and amazingly whack Ruby array arithmetic: subtraction (`arr1 - arr2`) is element-wise removal, but addition (`arr1 + arr2`) is a simple append. These are almost always exactly what you want when you reach for them, but they're completely "incorrect" mathematically.
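Concretely, in plain Ruby:

```ruby
a = [1, 2, 2, 3]
b = [2, 3]

a - b        # => [1]                  removes EVERY occurrence of each element of b
a + b        # => [1, 2, 2, 3, 2, 3]   plain concatenation, duplicates kept

# Mathematically "incorrect": subtraction doesn't invert addition.
(a - b) + b  # => [1, 2, 3], not the original a
```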
I was an early contributor to Langchain and it was great at first - keep in mind, that's before chat models even existed, not to mention tools, JSON mode, etc.
Langchain really, I think, pushed the LLM makers forward toward adding those features, but unfortunately it got left in the dust and became something of a zombie. Simultaneously, the foundational LLM providers kept adding things that turned them into more of a walled garden, where you no longer needed to connect multiple things (like scraping websites with one tool, feeding that into the LLM, then storing the result in a vector datastore - now that's all built in).
I think Langchain has tried to pivot (more than once perhaps) but had they not taken investor $$ early on (and good for them) I suspect that it would have just dried up and the core team would have gone on to work at OpenAI, Anthropic, etc.
langchain and llamaindex are such garbage libraries: not only do they never document half of the features they have, but they keep breaking their APIs from one version to the next.
I was about to mention those. I decided a while ago to build everything myself instead of relying on these libraries. We could use a PythonLLM over here because it seems like nobody cares about developer experience in the Python space.
Thank you! This is what the Ruby community has always prioritized - developer experience. Making complex things simple and joyful to use isn't just aesthetic preference, it's practical engineering. When your interface matches how developers think about the problem domain, you get fewer bugs and more productivity.
Thanks for flagging this. The eval was only in the docs and meant only as an example, but we definitely don't want to promote dangerous patterns in the docs. I updated them.
I think it's the very nice-looking and clean high-level API that should be a pleasure to use (when it fits the job, of course).
I'm pretty sure these API semantics (an instance builder to configure, and then it's ask/paint/embed with a language-native way to handle streaming and declarative tools) would look beautiful and easy to use in many other languages; I can imagine a similar API - save, of course, for the Rails stuff - in Python, C# or Erlang. While this level of API may not be sufficient for all possible LLM use cases, it should certainly speed up development when it's all that's needed.
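For reference, the Ruby shape being praised (a sketch following the gem's README; the model string and chunk interface here are illustrative):

```ruby
require "ruby_llm"

# Builder-style configuration, then simple verbs.
chat = RubyLLM.chat(model: "claude-3-7-sonnet")  # model name illustrative

# One-shot ask: blocks until the full response is back.
response = chat.ask("What's the best way to learn Ruby?")
puts response.content

# Language-native streaming: the block is invoked once per chunk as it arrives.
chat.ask("Tell me a long story") do |chunk|
  print chunk.content
end
```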
It's the extra parens, semi-colons, keywords and type annotations. Ruby makes the tradeoff for legibility above all else. Yes, you can obviously read the TypeScript, but there's an argument to be made that it takes more effort to scan the syntax as well as to write the code.
Also:
```ts
const chat: Chat = LLM.chat;
```
...is not instantiating a class, where Ruby is doing so behind the scenes. You'd need yet another pair of parens to make a factory!
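In Ruby, the same call site can be a method that constructs the object behind the scenes (a toy sketch, not the gem's actual internals):

```ruby
module LLM
  class Chat; end

  # Reads like attribute access at the call site, but it's a factory.
  def self.chat
    Chat.new
  end
end

chat = LLM.chat  # a fresh Chat instance, no extra parens needed
```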
Aside from that the DSL is quite excellent.
Check out the async gem, including async-http, async-websockets, and the Falcon web server.
https://github.com/socketry/falcon
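A minimal sketch of streaming on that stack, using async-http's `Internet` client (the URL is a placeholder):

```ruby
require "async"
require "async/http/internet"

Async do
  internet = Async::HTTP::Internet.new

  # The request runs on a fiber; waiting on the socket doesn't block a thread.
  response = internet.get("https://example.com/stream")

  response.body.each do |chunk|  # chunks are yielded as they arrive
    print chunk
  end
ensure
  internet&.close
end
```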
It doesn't deal with any of the hard problems you'll routinely face during implementation.
If you've seen the TypeScript options, it's like giving yourself a waterboarding session of your own volition.
This is mainly a matter of syntactic style!
Ruby: late to the party, brought a keg.
Keep going! Happy to see the Ollama support PR in draft.