Congrats on the launch!
If this tool could parse drug patents and draw molecular structures with associated data, I know we would pay 200k/yr+ for that service, and there's a market for it.
In my own field, there's an incredibly important application in parsing patents and scientific papers, but it would require specific image-to-text models to get the required information out with high fidelity. Do you guys have plans to enable user-supplied workflows where perhaps image patches can be sent to bespoke encoders or fine-tunes?
Yeah, basically it allows you to send PDFs as image patches into a GPT-4o model; that workflow can be easily built.
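For a rough idea, here's a minimal sketch of that kind of PDF-to-GPT-4o workflow, assuming pdf2image (which needs poppler) and the OpenAI Python SDK; the file path, prompt, and DPI are just placeholders:

```python
# Sketch: render PDF pages to images and send them to GPT-4o as image inputs.
import base64
import io

from openai import OpenAI
from pdf2image import convert_from_path

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def pdf_pages_to_data_urls(pdf_path: str, dpi: int = 150) -> list[str]:
    """Render each PDF page to a PNG and return base64 data URLs."""
    urls = []
    for page in convert_from_path(pdf_path, dpi=dpi):
        buf = io.BytesIO()
        page.save(buf, format="PNG")
        b64 = base64.b64encode(buf.getvalue()).decode()
        urls.append(f"data:image/png;base64,{b64}")
    return urls


def ask_about_pdf(pdf_path: str, question: str) -> str:
    """Send the question plus every page image in one GPT-4o request."""
    content = [{"type": "text", "text": question}]
    for url in pdf_pages_to_data_urls(pdf_path):
        content.append({"type": "image_url", "image_url": {"url": url}})
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": content}],
    )
    return resp.choices[0].message.content


# Placeholder usage: pull structured info out of a patent PDF.
print(ask_about_pdf("patent.pdf", "List every compound name mentioned and its claimed use."))
```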
Feel free to send me an email richard@roe-ai.com, happy to evaluate your case and try to save that 200K :p
Are you doing transcription + sending frames to a vision model, or is there a third-party service for this?
There is a big UI part here, because for multimodal data analytics, we think it's crucial for people to see and hear data.
For the RAG search, many DBs have built-in vector search, but chunking, indexing, and maintaining the index are largely on your own. This may not be a problem for technical people, but it's a hassle for data people who own hundreds of data products within a company. Therefore, we have a semantic search index builder that lets you build an auto-refreshing semantic search index with no code, so you never have to produce your own vectors.
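To make that concrete, here's a sketch of the kind of work such a builder automates under the hood: chunk a document, embed the chunks, and refresh a pgvector table whenever the source changes. The table name, chunk size, and embedding model here are illustrative assumptions, not how our product actually does it:

```python
# Sketch of an auto-refreshing semantic index: chunk -> embed -> upsert into pgvector.
import psycopg
from openai import OpenAI

client = OpenAI()


def chunk(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Naive fixed-size character chunking with overlap."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def embed(texts: list[str]) -> list[list[float]]:
    """Embed a batch of chunks with an OpenAI embedding model (placeholder choice)."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in resp.data]


def refresh_index(doc_id: str, text: str, conn: psycopg.Connection) -> None:
    """Rebuild the chunks for one document; call this whenever the source changes."""
    chunks = chunk(text)
    vectors = embed(chunks)
    with conn.cursor() as cur:
        cur.execute("DELETE FROM doc_chunks WHERE doc_id = %s", (doc_id,))
        for body, vec in zip(chunks, vectors):
            vec_literal = "[" + ",".join(str(x) for x in vec) + "]"
            cur.execute(
                "INSERT INTO doc_chunks (doc_id, body, embedding) VALUES (%s, %s, %s::vector)",
                (doc_id, body, vec_literal),
            )
    conn.commit()
```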
In addition, data analysis often needs to interrogate the search results further. For example, let's say we have used pgvector to find all the photos related to the Golden Gate Bridge. But then we want to ask follow-up questions like which of these images have someone wearing a blue shirt. We have to apply another model, and that is outside a normal DB's responsibility.
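A sketch of that two-stage pattern, assuming each photo row stores a caption embedding made with the same embedding model, and using GPT-4o as the follow-up vision model; table and column names are made up for illustration:

```python
# Stage 1: vector search in pgvector. Stage 2: a vision-model pass over each hit
# to answer a question the DB alone can't ("who's wearing a blue shirt?").
import psycopg
from openai import OpenAI

client = OpenAI()


def search_photos(conn: psycopg.Connection, query: str, k: int = 20) -> list[tuple[int, str]]:
    """Stage 1: embed the text query and run a pgvector similarity search."""
    qvec = client.embeddings.create(
        model="text-embedding-3-small", input=[query]
    ).data[0].embedding
    lit = "[" + ",".join(str(x) for x in qvec) + "]"
    with conn.cursor() as cur:
        cur.execute(
            "SELECT id, url FROM photos ORDER BY embedding <=> %s::vector LIMIT %s",
            (lit, k),
        )
        return cur.fetchall()


def has_blue_shirt(image_url: str) -> bool:
    """Stage 2: ask a vision model a yes/no question about one search result."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Is anyone in this photo wearing a blue shirt? Answer yes or no."},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return resp.choices[0].message.content.strip().lower().startswith("yes")


with psycopg.connect("dbname=media") as conn:  # placeholder connection string
    hits = search_photos(conn, "Golden Gate Bridge")
    matches = [(pid, url) for pid, url in hits if has_blue_shirt(url)]
```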
Seems like the type of thing that would be very useful in helping build data pipelines on semi-structured data.