arunmu (u/arunmu) - Readit News

arunmu commented on The Case Against PGVector alex-jacobs.com/posts/the... · Posted by u/tacoooooooo

arunmu · 4 months ago

There is pgvectorscale from Timescale, which uses a DiskANN-based data structure and supports both pre- and post-filtering.

arunmu commented on BM25 Search in Postgres tigerdata.com/blog/introd... · Posted by u/arunmu

akulkarni · 5 months ago

Appreciate the feedback! Will chat with the team.

arunmu · 5 months ago

Thanks. We are already using the Timescale Postgres image for pgvectorscale, with some customizations on tsvector and GIN indexes. It would be nice to have BM25 as well. Is there any specific reason why this was not made open source from the get-go, or is it the usual phased approach by Timescale (now Tigerdata)? If not, it is a worrying signal, as the same could happen with pgvectorscale development as well.

Anyways really appreciate the free offerings by timescale. Really makes things easy.

arunmu commented on BM25 Search in Postgres tigerdata.com/blog/introd... · Posted by u/arunmu

arunmu · 5 months ago

I would very much like it if it were open sourced, like pgvectorscale and the Timescale extension itself.

arunmu commented on TimescaleDB helped us scale analytics and reporting blog.cloudflare.com/times... · Posted by u/arunmu

arunmu · 8 months ago

I am a big fan of cloudflare blogs. The tech blogs are usually highly detailed and there is so much to learn from those.

But this one, while interesting, does not describe a practical choice at all, from what I gather reading the blog. The reason given for not using Clickhouse, which they are already using for analytics, was vague and ambiguous. Clickhouse does support JSON, which can be rewritten into a more structured table using a materialized view (MV). Aggregation and other performance tuning steps are the bread and butter of using Clickhouse.

The decision to go with Postgres, learn its already known limitations the hard way, and then continue using it by bringing in a new technology (Timescale) does not sound good, assuming that Cloudflare at this point might already have lots of internal tools for monitoring Clickhouse clusters.

arunmu commented on Agentic patters from scratch using Groq github.com/neural-maze/ag... · Posted by u/mtrofficus

asabla · a year ago

Built a couple of things with Semantic Kernel: some private test projects, but also two customer-facing applications and one internal one.

It's heavily tilted towards OpenAI and its offerings (either through the OpenAI API or through Azure). However, it works decently enough with other alternatives as well, like huggingface or ollama. Compared to the others (CrewAI etc.), I kind of feel like Semantic Kernel hasn't really solved observability yet. Sure, you can connect whatever logging/metrics solution .Net supports, but it's not as seamless as with the others. Semantic Kernel is available in .Net, Java and Python, but it's quite obvious that .Net is a lot more polished than the others. Python usually gets new features faster, or at least PoCs or previews.

Some learnings from it all:

- It's quite easy to get started with

- I like the distinction between native plugins and text-based ones (whether a plugin should run code or not)

- There is a feeling of black magic in the background, in the sense of observability

- A bit more manual work to get things in order, compared to the alternatives

- Rapid development, it's quite clear the development team from Microsoft is doing a lot of work with this library

All in all, if you feel comfortable writing C#, then Semantic Kernel is totally a viable option. If you prefer Python over anything else, then I would say llamaindex or langchain is probably a better option (for now).

edit: updated some formatting

arunmu · a year ago

Thanks. I would have preferred to use Go instead of Python, but somehow the language is not picking up a lot in terms of new LLM frameworks.

As of now, I am using very lightweight abstractions over prompts in Python, and that gets the job done. But it is way too early, and I can see how pipelining multiple LLM calls would need a good library that is not too complex and involved. In the end it is just an API call and you hope for the best result :)
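
For illustration, a minimal sketch of what a lightweight abstraction over prompts with simple pipelining could look like. This is not the commenter's actual code: `PromptStep`, `run_pipeline` and `fake_llm` are hypothetical names, and the real LLM API call is replaced by a stub so the example runs without any provider.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical type: any function that takes a prompt and returns the model's text.
LLMCall = Callable[[str], str]


@dataclass
class PromptStep:
    """One pipeline step: a prompt template with an {input} placeholder."""
    name: str
    template: str

    def render(self, text: str) -> str:
        return self.template.format(input=text)


def run_pipeline(steps: List[PromptStep], llm: LLMCall, initial: str) -> str:
    """Feed the output of each LLM call into the next step's template."""
    text = initial
    for step in steps:
        text = llm(step.render(text))
    return text


def fake_llm(prompt: str) -> str:
    # Stand-in for a real API call (assumption): upper-cases the prompt.
    return prompt.upper()


steps = [
    PromptStep("summarize", "Summarize: {input}"),
    PromptStep("translate", "Translate to French: {input}"),
]
result = run_pipeline(steps, fake_llm, "hello")
print(result)
```

Swapping `fake_llm` for a function that calls an actual provider would be the only change needed; the pipeline itself stays a plain loop.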

arunmu commented on Agentic patters from scratch using Groq github.com/neural-maze/ag... · Posted by u/mtrofficus

arunmu · a year ago

> No LangChain, no LangGraph, no LlamaIndex, no CrewAI

Bless you. Using these overcomplicated abstractions (except CrewAI, which I haven't yet checked out) never made sense to me. I understand that an LLM is no magic wand and that there is a need to make things systematic rather than slapping prompts everywhere. But these frameworks are not the solution to it. The next thing I will be looking at is Microsoft's semantic-kernel. Does anybody have any good words for it?