d7y (u/d7y) - Readit News

d7y commented on Ask HN: How do I train a custom LLM/ChatGPT on my own documents in Dec 2023? · Posted by u/divan

d7y · 2 years ago

Try https://github.com/SecureAI-Tools/SecureAI-Tools -- it's an open-source application layer for Retrieval-Augmented Generation (RAG). It allows you to use any LLM -- you can use OpenAI APIs, or run models locally with Ollama.
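
For readers new to RAG, here is a minimal sketch of the retrieve-then-prompt idea. It is not SecureAI Tools' actual implementation: the function names are hypothetical, and plain word overlap stands in for the embedding search a real system would more likely use.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, chunk: str) -> int:
    """Count how many query-token occurrences also appear in the chunk."""
    query_counts = Counter(tokenize(query))
    chunk_counts = Counter(tokenize(chunk))
    # Counter intersection keeps the minimum count for each shared token.
    return sum((query_counts & chunk_counts).values())


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return up to k chunks with a non-zero overlap score, best first."""
    ranked = sorted(chunks, key=lambda c: score(query, c), reverse=True)
    return [c for c in ranked[:k] if score(query, c) > 0]


def build_prompt(query: str, context: list[str]) -> str:
    """Put the retrieved chunks ahead of the question in a single prompt."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"
```

The string returned by `build_prompt` is what would be sent to whichever LLM you choose, hosted or local.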

d7y commented on Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools github.com/SecureAI-Tools... · Posted by u/d7y

teruakohatu · 2 years ago

Can a directory of PDFs be queried or does it only support a single document?

d7y · 2 years ago

Hi there!

We just added this in the latest release (v0.0.2). You can now create a document collection and upload as many PDFs into it as needed. The documents are processed in the background and once processing finishes, you can create as many chats with it as needed.

Quick demo: https://youtu.be/PwvfVx8VCoY

Installation instructions: https://github.com/SecureAI-Tools/SecureAI-Tools?tab=readme-...

Please try it out, and let me know how it goes. We're always looking to improve the tool so let us know if you have any feedback for us :)

d7y commented on Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools github.com/SecureAI-Tools... · Posted by u/d7y

vunderba · 2 years ago

Great job. This is a relatively crowded area, particularly RAG style chat systems. It might be nice for SecureAI to call out what makes their product different from other open source players in the same space, specifically Khoj and Danswer, both of which allow you to chat with your documents, offer network authentication, and allow you to plug in your own LLM.

Danswer

https://github.com/danswer-ai/danswer

Khoj

https://github.com/khoj-ai/khoj

d7y · 2 years ago

A great question.

We are trying to build a single platform for all AI tool needs. Chat-with-LLM and chat-with-documents are just the first couple of apps or experiences we have started with, but we have ambitious goals. In the future, we would love to provide an SDK that exposes common abstractions and lets everyone build apps/experiences for the long tail of use cases.

d7y commented on Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools github.com/SecureAI-Tools... · Posted by u/d7y

hifreq · 2 years ago

I couldn't find this info in the readme... does this tool anonymize ChatGPT requests? What does it mean that it's a private and secure tool in the context of using ChatGPT?

d7y · 2 years ago

It is secure because it allows you to fully customize where to process the data (i.e. LLM inference), where to store it, and data-retention policies, etc. You can choose to use a locally running LLM (like it does in my second video) or use a secure third-party service provider like Azure OpenAI.

For example, if you want GDPR compliance, then you can choose Azure OpenAI running in the EU region. For HIPAA compliance, you should choose a service provider that offers a Business Associate Agreement (BAA). You can even run it in air-gapped facilities (like GitLab's offline mode [1]). In all of these cases, you can always run an Ollama-like inference service on your own infra and point SecureAI Tools to it.

[1]: https://docs.gitlab.com/ee/topics/offline/
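
To illustrate what keeping inference on your own infra can look like, here is a hedged sketch of a client for a locally running Ollama server. It uses Ollama's `/api/generate` endpoint on its default port. The model name `llama2` is an assumption (use whichever model you have pulled), and this shows only the inference call, not how SecureAI Tools itself is configured.

```python
import json
import urllib.request

# Assumption: Ollama's default local address; change it to wherever your server runs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt: str, model: str = "llama2", url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = "llama2", url: str = OLLAMA_URL) -> str:
    """Send the request and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model, url)) as resp:
        return json.loads(resp.read())["response"]
```

Because the URL points at your own machine or network, the prompt and any document text in it never leave infrastructure you control.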

d7y commented on Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools github.com/SecureAI-Tools... · Posted by u/d7y

smeej · 2 years ago

If I've already run OCR on my PDFs and that's added now as an invisible layer, would it work then?

I've had a workflow digitizing my incoming paper documents, running OCR, and tagging them, all locally, and it would be great to have an easy front-end to talk to them.

d7y · 2 years ago

I haven't tried this myself, but I think it should work. It's at least worth trying, so I highly encourage you to play with it and file an issue if you run into any problems.

d7y commented on Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools github.com/SecureAI-Tools... · Posted by u/d7y

lysecret · 2 years ago

I see many of my friends building some kind of RAG system with chat interface.

I have been building some stuff on top of the OpenAI interface (to use their store) but find myself wanting to implement some simple UI elements (like a date selector or a simple dashboard).

So I feel like these types of apps have a few recurring elements:

1. A chat interface "frontend" (with threads and interfaces to popular APIs or local models), with a nice UI, ideally extensible with custom UI elements, authentication, etc.

2. API calls (e.g. like OpenAI actions). In the simplest case, just reading and writing to a DB (simple CRUD).

3. Local data + RAG, with custom retrieval/search logic that could be embeddings or simpler search methods.

Do you know of open source software for all three elements? Of course you can piece it together, and maybe this is the best approach. But maybe you could build something integrated.

d7y · 2 years ago

I love the idea of allowing anyone to build apps or experiences on top of some of these common elements.

We have briefly discussed an approach where we make some of these common elements available as abstractions and let people build "apps" on top of them. It would work much like Google's app store: the head use cases (email, photos, camera, etc.) are first-party apps, but anyone can build and publish a third-party app using the Android SDK.

d7y commented on Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools github.com/SecureAI-Tools... · Posted by u/d7y

teruakohatu · 2 years ago

Can a directory of PDFs be queried or does it only support a single document?

d7y · 2 years ago

Right now it supports selecting & uploading a _few_ PDFs on chat-creation. Those PDFs get indexed online -- i.e. while the user waits. So it doesn't scale well with the number of PDFs selected in a chat, because you have to wait for all of them to be indexed before the chat responds to your initial question/prompt.

We plan to make this indexing process offline, where you can create a document collection based on either a directory upload or an integrated data source like Google Drive, Notion, Confluence, etc. Then the system would start indexing that collection in the background and notify you once indexing is complete. Once a collection is indexed, users can select it when creating a new chat and query against it.
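
The proposed flow (submit a collection, index it in the background, mark it ready) can be sketched roughly as below. This is an illustration under assumptions, not the project's code: `CollectionIndexer` and its methods are hypothetical names, lowercasing stands in for real indexing work, and a status field stands in for the notification.

```python
import queue
import threading


class CollectionIndexer:
    """Indexes documents on a worker thread and tracks per-collection status."""

    def __init__(self) -> None:
        self._jobs: queue.Queue = queue.Queue()
        self._lock = threading.Lock()
        self.status: dict[str, str] = {}
        self.index: dict[str, list[str]] = {}
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, collection: str, documents: list[str]) -> None:
        """Queue a collection and return immediately; the caller never waits."""
        with self._lock:
            self.status[collection] = "pending"
        self._jobs.put((collection, documents))

    def _worker(self) -> None:
        while True:
            collection, documents = self._jobs.get()
            # Placeholder for real indexing (chunking, embedding, storing).
            indexed = [doc.lower() for doc in documents]
            with self._lock:
                self.index[collection] = indexed
                self.status[collection] = "ready"
            self._jobs.task_done()

    def wait_until_idle(self) -> None:
        """Block until every queued collection has been indexed."""
        self._jobs.join()
```

A chat would then only be allowed to select collections whose status is `"ready"`, so the first prompt no longer waits on indexing.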

Let us know if you have any thoughts on this proposed solution.

d7y commented on Show HN: Open source alternative to ChatGPT and ChatPDF-like AI tools github.com/SecureAI-Tools... · Posted by u/d7y

abnry · 2 years ago

Is there a good ML tool for renaming PDFs? There are some tools out there but they assume a journal format.

d7y · 2 years ago

> renaming PDFs

Sorry, I didn't understand. Why do you need an ML tool for renaming PDFs? Or did you mean rephrasing or rewriting them in a different format?