> Are LLM tools better or worse than e.g. meilisearch or elasticsearch for searching with snippets over a set of document resources?
> How does search compare to generating things with citations?
pdfGPT: https://github.com/bhaskatripathi/pdfGPT :
> PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities.
GH "pdfgpt" topic: https://github.com/topics/pdfgpt
knowledge_gpt: https://github.com/mmz-001/knowledge_gpt
From https://news.ycombinator.com/item?id=39112014 : paperai
neuml/paperai: https://github.com/neuml/paperai :
> Semantic search and workflows for medical/scientific papers
RAG: https://news.ycombinator.com/item?id=38370452
Google Desktop (2004-2011): https://en.wikipedia.org/wiki/Google_Desktop :
> Google Desktop was a computer program with desktop search capabilities, created by Google for Linux, Apple Mac OS X, and Microsoft Windows systems. It allowed text searches of a user's email messages, computer files, music, photos, chats, Web pages viewed, and the ability to display "Google Gadgets" on the user's desktop in a Sidebar.
GNOME/tracker-miners: https://gitlab.gnome.org/GNOME/tracker-miners
src/miners/fs: https://gitlab.gnome.org/GNOME/tracker-miners/-/tree/master/...
SPARQL + SQLite: https://gitlab.gnome.org/GNOME/tracker-miners/-/blob/master/...
https://news.ycombinator.com/item?id=38355385 : LocalAI, braintrust-proxy; promptfoo, chainforge, mixtral
Absolutely worse; LLMs are not made for this at all.
Quickwit was built to handle extremely large data volumes: you can ingest and search terabytes or even petabytes of logs.
Meilisearch's indexing doesn't scale: it gets slower the more data you have. For example, I failed to ingest 7 GB of data.