My father is T1 and has been using the Libre CGM system for a couple of years now. Libre users in the US and Europe get direct integration with their iOS devices, including continuous updates and, most importantly, alerts for dangerously high or low glucose levels. It is even possible to share live readings with close family members or caretakers.
But none of this is available to my dad, as he lives in Brazil. Even though the product is the same, he cannot download the iOS apps from the App Store, as they are region locked.
Patents are difficult because they can include anything from abstract diagrams and chemical formulas to mathematical equations, so it tends to be really tricky to prepare the data in a way that can later be used by an LLM.
The simplest approach I found was to “take a picture” of each page of the document and ask an LLM to generate a JSON object explaining the content (plus some other metadata such as page number, number of visual elements, and so on).
If a complicated image is present, simply ask the model to describe it. Once that is done, you have a JSON file that can be embedded into your vector store of choice.
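To make that concrete, here is a rough sketch of the page-to-JSON step. Everything named in it is my own placeholder: the prompt wording, the JSON keys, and the `describe_page` callable are all assumptions. It assumes the pages are already rendered to image bytes and that you wrap whatever vision-capable model you use in `describe_page`, so treat it as an outline, not a finished pipeline.

```python
import json
from typing import Callable, Dict, Iterable, List

# Hypothetical prompt; the wording and the JSON keys are assumptions.
PROMPT = (
    "Describe this patent page as a JSON object with the keys "
    '"content" (string), "visual_elements" (integer) and '
    '"image_descriptions" (list of strings). Return only JSON.'
)


def build_page_records(
    page_images: Iterable[bytes],
    describe_page: Callable[[bytes, str], str],
) -> List[Dict]:
    """Ask the model for one JSON record per page image.

    `describe_page` is a placeholder for your own wrapper around a
    vision-capable model: it takes image bytes and a prompt and
    returns the model's reply as a string.
    """
    records = []
    for page_number, image in enumerate(page_images, start=1):
        raw = describe_page(image, PROMPT)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            parsed = None
        if not isinstance(parsed, dict):
            # Keep the raw reply so the page is not silently lost.
            parsed = {"content": raw, "visual_elements": 0, "image_descriptions": []}
        # Set the page number locally instead of trusting the model for it.
        parsed["page_number"] = page_number
        records.append(parsed)
    return records


def record_to_text(record: Dict) -> str:
    """Flatten one record into the text that would be embedded."""
    parts = [f"Page {record['page_number']}", str(record.get("content", ""))]
    parts += [str(d) for d in record.get("image_descriptions", [])]
    return "\n".join(p for p in parts if p)
```

The output of `record_to_text` is what you would hand to your embedding model, while the full record can be stored next to the vector as metadata.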
I can’t speak to the price-to-performance ratio, but this approach seems easier and more efficient than what the author is proposing.