The problem: Traditional e-readers are passive. When you encounter something unclear, you have to context-switch to search for it. Your highlights and notes remain isolated, and you can't easily connect ideas across different books.
My solution: BookWith embeds an AI that maintains full context of what you're reading. It features:
- Context-aware AI chat: Ask questions about the current page/chapter and get instant answers
- AI podcast generation: Automatically converts book content into conversational podcasts using Google Cloud TTS
- Multi-layer memory system: Short-term (last 5 conversations), mid-term (summarized every 20), and long-term (vector search) memory that maintains continuity across reading sessions
- Smart annotations: 5-color highlighting system that AI can reference and analyze
Technical stack: Built as a fork of Flow (the epub reader), with added LLM integration and a vector database for semantic search. Supports multiple LLMs and languages (EN/JA/ZH).
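To make the memory design concrete, here is a rough TypeScript sketch of how the three layers could fit together. It's a simplified illustration only; the names, the placeholder embedding, and the prompt assembly are my own stand-ins, not code from the actual app.

```typescript
// Simplified three-tier memory sketch (illustrative, not the real implementation).
type Turn = { role: "user" | "assistant"; text: string };

interface MemoryRecord {
  text: string;
  vector: number[];
}

// Placeholder embedding: a real app would call an embedding model here.
async function embed(text: string): Promise<number[]> {
  const vec = new Array(64).fill(0);
  for (let i = 0; i < text.length; i++) vec[i % 64] += text.charCodeAt(i) / 255;
  return vec;
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na * nb) || 1);
}

class LayeredMemory {
  private shortTerm: Turn[] = [];         // last few raw turns
  private unsummarized: Turn[] = [];      // buffer awaiting summarization
  private midTerm: string[] = [];         // rolling summaries
  private longTerm: MemoryRecord[] = [];  // embedded records for semantic recall

  constructor(
    private summarize: (turns: Turn[]) => Promise<string>, // e.g. an LLM call
    private shortTermSize = 5,
    private summarizeEvery = 20,
  ) {}

  async addTurn(turn: Turn): Promise<void> {
    // Short-term: keep only the most recent turns verbatim.
    this.shortTerm.push(turn);
    if (this.shortTerm.length > this.shortTermSize) this.shortTerm.shift();

    // Mid-term: summarize the buffer every N turns.
    this.unsummarized.push(turn);
    if (this.unsummarized.length >= this.summarizeEvery) {
      this.midTerm.push(await this.summarize(this.unsummarized));
      this.unsummarized = [];
    }

    // Long-term: embed everything for later semantic search.
    this.longTerm.push({ text: turn.text, vector: await embed(turn.text) });
  }

  // Assemble the context block that gets prepended to the next LLM prompt.
  async buildContext(query: string, topK = 3): Promise<string> {
    const qVec = await embed(query);
    const recalled = [...this.longTerm]
      .sort((a, b) => cosine(qVec, b.vector) - cosine(qVec, a.vector))
      .slice(0, topK)
      .map(r => r.text);

    return [
      "Recent turns:", ...this.shortTerm.map(t => `${t.role}: ${t.text}`),
      "Summaries:", ...this.midTerm,
      "Recalled:", ...recalled,
    ].join("\n");
  }
}
```

In the real app, the embedding and summarization would of course be backed by whichever LLM/embedding provider the user has configured.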
Not to diss the work that was done on this, but the idea of actually wanting this for reading feels like a continuation of the shrinking attention spans that seem to be getting worse and worse. We already saw this with the oversimplification of television shows and movies, many of which lean toward slapping you in the face with something rather than handling it subtly.
I know way too many people who struggle to sit still for a half-hour episode of some show now (like my partner, frustratingly) and have to be doing something else at the same time.
If you are struggling to absorb the information you are reading, that is likely a sign you should put down the book and come back to it later; your mind obviously wants to be doing something else. If it is a continued issue, then practice reading something you know you would like. Personally, my "in" for my love of reading was video game books that expanded the lore, and it grew from there, but I was already invested in the story, so the books were easier to read.
Using this for a book feels more like a crutch than anything else. And that is before you even get into whether or not the LLM is actually going to tell you the truth.
There is, however, one possible use case I could get into: a refresher when it has been a long time between books coming out in a series. But that is something that could be solved by just finding a video or a summary online.
I actually sympathize with you very much.
As you say, there is a non-zero chance that this app will contribute to a lack of concentration, but I cannot dismiss the possibility that the opposite will happen.
In my case, I have often found myself wanting the crutch of an LLM due to a lack of prerequisite knowledge when reading technical or philosophical books.
Also, I am an Asian whose English is not that good, and there are times when I have to read a book in its original language because there is no translation in my native language.
This application was created on an experimental basis to remove these pains, and the chat with the LLM is only one of its functions. It should be used at the appropriate time depending on the user's use case.
Good job, OP. I wanted to build this myself.
Do you have plans for Android and iOS support and syncing across devices?
Some people have deep knowledge but don't have the skills to untangle context and lay out the right learning path for a reader. These people likely bell-curve around certain neurotypes, which perhaps hold certain sorts of knowledge more strongly.
Right now, those people shouldn't publish. But if LLMs could augment poorly structured content (not incorrect content, just poorly structured), that could perhaps open things up for more people to share their wisdom.
Anyhow, just thinking out loud here. I'm sure some massive downsides are coming to mind for people reading this :)
At some point I'll work on better integrating Emacs's nov.el EPUB reader with gptel to approximate something like this. Books are text, and I already have the ultimate text processing environment that I've invested quite a lot of time in.
That seems like maybe a wee bit of an overstatement of possibilities.
What I meant from a technical perspective is that the system uses a Retrieval-Augmented Generation (RAG) approach. It has the entire book's content available in a vector database, and when you ask a question, it performs a semantic search to pull the most relevant passages in real-time to use as context for the LLM's answer.
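Roughly, the retrieval step looks like the sketch below. This is a simplified TypeScript illustration; the function names, chunking assumptions, and prompt are placeholders rather than the actual implementation.

```typescript
// Simplified RAG retrieval: embed the question, rank pre-embedded book
// chunks by similarity, and pass the top matches to the LLM as context.
interface Chunk {
  text: string;
  vector: number[];
}

function similarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na * nb) || 1);
}

async function answerQuestion(
  question: string,
  chunks: Chunk[],                                // the book, chunked and embedded at import time
  embed: (t: string) => Promise<number[]>,        // embedding model
  complete: (prompt: string) => Promise<string>,  // LLM call
  topK = 5,
): Promise<string> {
  const qVec = await embed(question);

  // Pull the most relevant passages to use as grounding context.
  const context = [...chunks]
    .sort((a, b) => similarity(qVec, b.vector) - similarity(qVec, a.vector))
    .slice(0, topK)
    .map(c => c.text)
    .join("\n---\n");

  return complete(
    `Answer using only the passages below.\n\n${context}\n\nQuestion: ${question}`,
  );
}
```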
So, from a user's perspective, the experience is designed to feel like you're conversing with an expert who can instantly recall any part of the book. I should have used more precise language. Thanks for keeping me honest!
What I've found interesting when doing similar experiments (feeding things like books to an LLM and asking questions) is that the output is almost always more bland than one would hope for. I suspect this is both a result of LLMs being biased toward the material they've been trained on and a confirmation of something I've long suspected: that the majority of books are mostly filler and aren't making points that are particularly profound. Most books, when you distill them down, fundamentally communicate ideas that are rather obvious, but the language around those points makes them sound a lot more profound than they really are. It's a kind of hypnosis, I think. In a sense, LLMs may be able to reveal how bereft a piece of written material is.
I disagree with the OP's statement that traditional e-readers being passive is actually a "problem". It's kind of like saying that cars are a problem because they can't fly. Maybe I'm being pedantic, but being alone with a book and one's own thoughts is hardly a problem; if anything, the problem is that fewer and fewer people are comfortable without a constant barrage of thoughts other than their own.
The introduction video shows how easy it is to import an epub, and then "asks the ebook" to give them the Table of Contents, even though the ToC was already available... no real added value from the RAG there.