qdequelen commented on Tantivy – full-text search engine library inspired by Apache Lucene   github.com/quickwit-oss/t... · Posted by u/kaathewise
PSeitz · a year ago
They serve quite different use cases.

Quickwit was built to handle extremely large data volumes; you can ingest and search terabytes or petabytes of logs.

Meilisearch's indexing doesn't scale, as it becomes slower the more data you have; e.g., I failed to ingest 7GB of data.

qdequelen · a year ago
Hey PSeitz, Meilisearch CEO here. Sorry to hear that you failed to index a low volume of data. When did you last try Meilisearch? We have made significant improvements in the indexing speed. We have a customer with hundreds of gigabytes of raw data on our cloud, and it scales amazingly well. https://x.com/Kerollmops/status/1772575242885484864
qdequelen commented on Nvidia's Chat with RTX is an AI chatbot that runs locally on your PC   theverge.com/2024/2/13/24... · Posted by u/nickthegreek
westurner · 2 years ago
From "Artificial intelligence is ineffective and potentially harmful for fact checking" (2023) https://news.ycombinator.com/item?id=37226233 : pdfgpt, knowledge_gpt, elasticsearch :

> Are LLM tools better or worse than e.g. meilisearch or elasticsearch for searching with snippets over a set of document resources?

> How does search compare to generating things with citations?

pdfGPT: https://github.com/bhaskatripathi/pdfGPT :

> PDF GPT allows you to chat with the contents of your PDF file by using GPT capabilities.

GH "pdfgpt" topic: https://github.com/topics/pdfgpt

knowledge_gpt: https://github.com/mmz-001/knowledge_gpt

From https://news.ycombinator.com/item?id=39112014 : paperai

neuml/paperai: https://github.com/neuml/paperai :

> Semantic search and workflows for medical/scientific papers

RAG: https://news.ycombinator.com/item?id=38370452

Google Desktop (2004-2011): https://en.wikipedia.org/wiki/Google_Desktop :

> Google Desktop was a computer program with desktop search capabilities, created by Google for Linux, Apple Mac OS X, and Microsoft Windows systems. It allowed text searches of a user's email messages, computer files, music, photos, chats, Web pages viewed, and the ability to display "Google Gadgets" on the user's desktop in a Sidebar

GNOME/tracker-miners: https://gitlab.gnome.org/GNOME/tracker-miners

src/miners/fs: https://gitlab.gnome.org/GNOME/tracker-miners/-/tree/master/...

SPARQL + SQLite: https://gitlab.gnome.org/GNOME/tracker-miners/-/blob/master/...

https://news.ycombinator.com/item?id=38355385 : LocalAI, braintrust-proxy; promptfoo, chainforge, mixtral

qdequelen · 2 years ago
> Are LLM tools better or worse than e.g. meilisearch or elasticsearch for searching with snippets over a set of document resources?

Absolutely worse; LLMs are not made for it at all.

qdequelen commented on Meilisearch v1.6   blog.meilisearch.com/meil... · Posted by u/Culonavirus
qdequelen · 2 years ago
Thanks for the highlight!
qdequelen commented on Fly Kubernetes   fly.io/blog/fks/... · Posted by u/ferriswil
qdequelen · 2 years ago
Do you handle high-throughput workloads? I would need this to test hosting a database service at scale.
qdequelen commented on Show HN: I scraped 25M Shopify products to build a search engine   searchagora.com/... · Posted by u/pencildiver
pencildiver · 2 years ago
The scraper is built in JavaScript with a Mongo database. It's probably not the most scalable way to do it, but I found that all Shopify stores have a public JSON file available at [Base URL]/products.json. So I found a list of stores, built a crawler to go store-by-store, and standardized the data on my end.

Here's an example: https://www.wildfox.com/products.json
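The store-by-store approach described above could be sketched roughly as follows. This is a minimal illustration in Python rather than the author's JavaScript and Mongo code. The helper names (`products_url`, `fetch_products`, `normalize`) are hypothetical, and the payload fields used here (`products`, `id`, `title`, `variants`, `price`) are assumptions about the shape of the public JSON file, not something the comment confirms.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen


def products_url(base_url: str) -> str:
    """Build the [Base URL]/products.json address for one store."""
    return urljoin(base_url.rstrip("/") + "/", "products.json")


def fetch_products(base_url: str) -> dict:
    """Download and parse one store's product file (performs a network call)."""
    with urlopen(products_url(base_url), timeout=10) as response:
        return json.load(response)


def normalize(store: str, payload: dict) -> list[dict]:
    """Flatten one store's payload into uniform rows.

    The field names are assumed for illustration only.
    """
    rows = []
    for product in payload.get("products", []):
        variants = product.get("variants") or []
        prices = [float(v["price"]) for v in variants if v.get("price") is not None]
        rows.append(
            {
                "store": store,
                "id": product.get("id"),
                "title": product.get("title"),
                "min_price": min(prices) if prices else None,
            }
        )
    return rows
```

A crawler would loop over a list of store URLs, call `fetch_products` for each one, and write the rows returned by `normalize` to its database.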

qdequelen · 2 years ago
Did you only get the schema.json?
qdequelen commented on Show HN: I scraped 25M Shopify products to build a search engine   searchagora.com/... · Posted by u/pencildiver
qdequelen · 2 years ago
Hey, I'm the CEO of Meilisearch. If your issue is performance, I would love for you to give Meilisearch a try. You'll be able to create an "as you type" experience with our engine, which responds in less than 50ms!
qdequelen commented on How to deliver the best search results: inside a full text search engine   blog.meilisearch.com/how-... · Posted by u/CaroFG
qdequelen · 2 years ago
Thanks @CaroFG, I'm sure that if anyone has questions, the engineering team would love to answer!
qdequelen commented on Algolia New Pricing   algolia.com/pricing/... · Posted by u/naiv
naiv · 2 years ago
I think you are correct.

My biggest issues with Algolia were always:

- search requests should be separated from number of documents

-- think of a geoname service where there are 10 mio. documents vs. 500k search requests -- seems to be solved now

- it is crazy to require a new index for each sort direction

-- this is still the case

imho they should introduce cpu cycles + storage. until then self hosted Typesense, Meilisearch, Elasticsearch or hosted Typesense, Elasticsearch are still superior. I am leaving out Meilisearch here as their entry level is also nuts at 1.2k/month for hosted.

qdequelen · 2 years ago
Hello, I'm the Meilisearch CEO. I think you're also correct, Jabo.

I just want to clarify: Meilisearch's pricing doesn't start at 1.2K/month, but at 0/month. We have usage-based pricing that is basically 0.25 per 1,000 documents and searches. And, funny thing, we are thinking about splitting the searches and documents, too, but we wanted to have more data to be sure to select the right unit price for each. :)
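As a rough worked example of that usage-based rate, the sketch below assumes documents and searches are counted together and billed at the quoted 0.25 per 1,000 units. The function name and the combined-count interpretation are assumptions for illustration, not Meilisearch's actual billing logic, and the currency is left unspecified as in the comment.

```python
def estimated_monthly_cost(
    documents: int, searches: int, price_per_thousand: float = 0.25
) -> float:
    """Estimate a monthly bill, assuming documents and searches share one rate."""
    return (documents + searches) / 1000 * price_per_thousand


# 100,000 documents plus 50,000 searches is 150 blocks of 1,000 units.
print(estimated_monthly_cost(100_000, 50_000))  # 37.5
```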

qdequelen commented on Rust Is the Future of JavaScript Infrastructure (2021)   leerob.io/blog/rust... · Posted by u/winter_blue
qdequelen · 3 years ago
Make the web faster with Rust.

u/qdequelen

Karma: 271 · Cake day: December 8, 2019
About
Co-Founder & CEO @meilisearch. Ex-student at @42born2code.