richardmeng (u/richardmeng)

richardmeng commented on Vectorless: open-source PDF chatbot without RAG · Posted by u/richardmeng

revskill · 6 months ago

Instead of a dockerfile now u got a vendor lockin thing.

richardmeng · 6 months ago

We'll dockerize it.

richardmeng commented on Vectorless: open-source PDF chatbot without RAG · Posted by u/richardmeng

revskill · 6 months ago

Lol, vercel again.

richardmeng · 6 months ago

what's wrong with vercel?

richardmeng commented on Show HN: Sumble – knowledge graph for GTM data – query tech stack, key projects sumble.com... · Posted by u/antgoldbloom

richardmeng · 7 months ago

Sumble has been a critical tool for me when researching organizational structure and responsibilities in large companies, as well as technology adoption, such as which organizations have adopted LLMs.

Congrats on the launch!

richardmeng commented on Launch HN: Roe AI (YC W24) – AI-powered data warehouse to query multimodal data · Posted by u/richardmeng

nextworddev · 2 years ago

Why not just focus on the UI part and make it integrate with different data sources?

richardmeng · 2 years ago

A lot of infrastructure work is needed to make the SQL experience work seamlessly for unstructured data. And for the most part, we do fork an open-core data warehouse and build on top of it.

richardmeng commented on Launch HN: Roe AI (YC W24) – AI-powered data warehouse to query multimodal data · Posted by u/richardmeng

Irishsteve · 2 years ago

When you say parse - do you mean for prior art or to generate ideas?

richardmeng · 2 years ago

I think "parse" here means something more like document understanding.

richardmeng commented on Launch HN: Roe AI (YC W24) – AI-powered data warehouse to query multimodal data · Posted by u/richardmeng

alpineidyll3 · 2 years ago

I am glad to see people focusing on this.

If this tool could parse drug patents and draw molecular structures with associated data, I know we would pay 200k/yr+ for that service, and there's a market for it.

In my own field, there's an incredibly important application to parse patents and scientific papers, but this would require specific image=>text models in order to get the required information out with high fidelity. Do you guys have plans to enable user supplied workflows where perhaps image patches can be sent to bespoke encoders, or finetunes?

richardmeng · 2 years ago

Today's large vision models like GPT-4o can parse content-heavy papers pretty well (and respect their structure).

Yeah, basically it lets you send PDFs as image patches to the GPT-4o model, so that workflow is easy to build.

Feel free to email me at richard@roe-ai.com, happy to evaluate your case and try to save you that 200K :p

richardmeng commented on Launch HN: Roe AI (YC W24) – AI-powered data warehouse to query multimodal data · Posted by u/richardmeng

namanyayg · 2 years ago

Congrats on the launch! What are you using to make the LLM understand a video file?

Are you doing transcription + sending frames to a vision model, or is there a third-party service for this?

richardmeng · 2 years ago

We use Gemini to analyze the video in its raw format.

richardmeng commented on Launch HN: Roe AI (YC W24) – AI-powered data warehouse to query multimodal data · Posted by u/richardmeng

fsndz · 2 years ago

Why this when I can just use PostgreSQL and pgvector? Like in this example I found recently: https://www.lycee.ai/courses/91b8b189-729a-471a-8ae1-717033c...

richardmeng · 2 years ago

To add to Jason's point -

There is a big UI part here, because for multimodal data analytics, we think it's crucial for people to see and hear data.

For RAG search, many DBs have built-in vector search, but chunking, indexing, and maintaining the index are largely left to you. This may not be a problem for technical people, but it's a hassle for data people who own hundreds of data products within a company. Therefore, we have a semantic search index builder that lets you build an auto-refreshing semantic search index with no code, without ever having to produce your own vectors.
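To make the "on your own" part concrete, here is a minimal sketch of what hand-rolled chunking and index maintenance can look like. It is not Roe AI's implementation: `chunk_text`, `RefreshingIndex`, and `toy_embed` are hypothetical names, and the embedding function is a stand-in for a real model.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    if not text:
        return []
    step = size - overlap
    # Stop early enough that the last window is not a pure repeat of the overlap.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]


def toy_embed(chunk: str) -> List[float]:
    """Placeholder embedding (assumption): a real system would call a model here."""
    return [float(len(chunk)), float(chunk.count(" "))]


class RefreshingIndex:
    """Keeps (chunk, vector) rows per document and re-embeds only what changed."""

    def __init__(self, embed: Callable[[str], List[float]]) -> None:
        self.embed = embed
        self.hashes: Dict[str, str] = {}
        self.rows: Dict[str, List[Tuple[str, List[float]]]] = {}

    def refresh(self, docs: Dict[str, str]) -> List[str]:
        """Sync the index with `docs`; return the ids that were (re)indexed."""
        updated = []
        for doc_id, text in docs.items():
            digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
            if self.hashes.get(doc_id) == digest:
                continue  # unchanged since the last refresh
            self.rows[doc_id] = [(c, self.embed(c)) for c in chunk_text(text)]
            self.hashes[doc_id] = digest
            updated.append(doc_id)
        # Drop documents that no longer exist in the source.
        for doc_id in list(self.hashes):
            if doc_id not in docs:
                del self.hashes[doc_id]
                del self.rows[doc_id]
        return updated
```

Calling `refresh` on a schedule with the current documents is the part that otherwise has to be written and operated per data product.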

In addition, data analysis often needs to interrogate the search results further. For example, say we have used pgvector to find all the photos related to the Golden Gate Bridge, and then want to ask which of those images show someone wearing a blue shirt. We have to apply another model, and that is outside a normal DB's responsibility.
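A rough sketch of that two-stage pattern, with everything in memory. The similarity search stands in for a pgvector query, and `wears_blue_shirt` is a hypothetical stub that reads precomputed labels where a real pipeline would call a vision model on each image. The photo data is invented for illustration.

```python
import math
from typing import Callable, Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_search(query: Sequence[float], rows: List[Dict], k: int) -> List[Dict]:
    """Stage 1: return the k rows whose embeddings are closest to the query."""
    ranked = sorted(rows, key=lambda r: cosine(query, r["embedding"]), reverse=True)
    return ranked[:k]


def refine(rows: List[Dict], predicate: Callable[[Dict], bool]) -> List[Dict]:
    """Stage 2: keep only the rows a second model says match the follow-up question."""
    return [r for r in rows if predicate(r)]


def wears_blue_shirt(row: Dict) -> bool:
    """Stub (assumption): a real version would run a vision model on the image."""
    return "blue shirt" in row["labels"]


# Illustrative data only: 2-d embeddings where the first axis means "bridge-like".
photos = [
    {"id": "p1", "embedding": [0.9, 0.1], "labels": {"bridge", "blue shirt"}},
    {"id": "p2", "embedding": [0.8, 0.2], "labels": {"bridge"}},
    {"id": "p3", "embedding": [0.1, 0.9], "labels": {"beach", "blue shirt"}},
]

bridge_query = [1.0, 0.0]
candidates = vector_search(bridge_query, photos, k=2)
matches = refine(candidates, wears_blue_shirt)
```

The second stage costs one model call per candidate, which is why it runs after the vector search has narrowed the set.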

richardmeng commented on Launch HN: Roe AI (YC W24) – AI-powered data warehouse to query multimodal data · Posted by u/richardmeng

datadrivenangel · 2 years ago

Is this more for data engineers or data analysts?

Seems like the type of thing that would be very useful in helping build data pipelines on semi-structured data.

richardmeng · 2 years ago

I guess to add to Jason's point, it depends on how the data engineer and data analyst roles are defined within the company. At some companies, we see a data analyst taking end-to-end responsibility from data engineering to BI. At others we see a clear separation, with data engineers doing data pipelining and data modeling while data analysts are, in fact, business analysts. Regardless, we think SQL is the common interface for both parties, and we're excited to see who will be the power users.