It is an open source, self-hosted search tool that allows you to ask questions and get answers across common workspace apps AND your personal documents (via file upload / web scraping)! Full demo here: https://www.youtube.com/watch?v=geNzY1nbCnU&t=2s.
The code (https://github.com/danswer-ai/danswer) is open source and permissively licensed (MIT). If you want to try it out, you can set it up locally with just a couple of commands (more details in our docs - https://docs.danswer.dev/introduction). We hope that someone out there finds this useful
We’d love to hear from you in our Slack (https://join.slack.com/t/danswer/shared_invite/zt-1u3h3ke3b-...) or Discord (https://discord.gg/TDJ59cGV2X). Let us know what other features would be useful for you!
I came across Danswer a few days ago as an option for this, so I spent a day building a connector [1]. I was pleasantly surprised how accurate the output was for something like this. I have a few pages detailing my servers and I could ask things like "Where is x server hosted"? and get a correct response accompanied with a link to the right source page.
Some things to be aware of specifically about Danswer: It only works with OpenAI right now, although the team said that open model support is important as a future focus. Additionally it felt fairly heavy to run and required a 30 minute docker build process but I think they've improved on this now with pre-built images, and I'm not familiar with the usual requirements/weight of this kind of tech. Otherwise, things were easy to start up and play around with, even for an AI noob like me. Both their web and text-upload source connectors worked without issue in my testing.
[1]: https://github.com/danswer-ai/danswer/pull/139
[0]: https://github.com/ggerganov/llama.cpp/issues/1602
Additionally, the indexing process is setup as a composable pipeline under the hood. It would be fairly trivial to plug in different chunkers for different sources as needed in the future.
Would you mind saying a few words on how Danswer approaches this?
For now I will stick to PrivateGPT and LocalGPT.
And, to share with you something: I saw somewhere a tool (maybe it was GPT4ALL itself) that had the ability to expose a OpenAI-compatible local API on localhost:8080... Ah, yes. Here it is. Actually, there are two. They are described as possible backends for Bavarder (that's a free access to multiple online models, API key is not required): https://bavarder.codeberg.page/help/local/
[^1]: https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-backen...
> Danswer provides Docker containers that you can run anywhere, so data never has to leave your PVC. The one exception is using GPT for inference but we are working on allowing for locally hosted generative models as well.
Look, you can plug this hole trivially for many companies, by adding support for Azure OpenAI API. It's almost identical to OpenAI API - the main difference is how you pass keys and specify the model to use. But that alone will make it possible to use Danswer with company data in places that signed a relevant contract with Microsoft.
Ex: If a connected gdrive document gets indexed, but then someone fixes the share settings in google docs for some item to be more restrictive.. How does Danswer avoid leaking that data? Dynamic check before returning any doc that the live federated auth settings safelist the requesting user reading that doc?
The immediate plan is to extend our current poll / push based connectors to also grab access information (+ add IdP integrations for cross-app identity). There will be some delay to grab access updates, which will be combatted by the dynamic check with the app / IdP itself at query time that you mentioned (still investigating exactly how this will work).
We are also considering adding support for group based access defined within Danswer itself for sources that don't provide APIs to get access information (default being all-public if not specified). Of course, for these, we will not be able to sync permissions.
Ticketing platforms like ServiceNow fall under a similar category, although a bit lower priority in my mind.