Readit News
thor-rodrigues commented on Don't bother parsing: Just use images for RAG   morphik.ai/blog/stop-pars... · Posted by u/Adityav369
thor-rodrigues · a month ago
I spent a good amount of time last year working on a system to analyse patent documents.

Patents are difficult because they can include anything from abstract diagrams and chemical formulas to mathematical equations, so it tends to be really tricky to prepare the data in a way that can later be used by an LLM.

The simplest approach I found was to “take a picture” of each page of the document and ask an LLM to generate a JSON explaining the content (plus some other metadata such as page number, number of visual elements, and so on).

If any complicated image is present, simply ask the model to describe it. Once that is done, you have a JSON file that can be embedded into your vector store of choice.

I can’t speak to the price-to-performance ratio, but this approach seems to be easier and more efficient than what the author is proposing.
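
The page-by-page approach described above can be sketched roughly as follows. This is only a minimal illustration, not the system from the comment: `describe` is a hypothetical stand-in for a call to a vision-capable LLM, and the field names (`summary`, `visual_elements`, `num_visual_elements`) are assumptions chosen for the example.

```python
import json
from typing import Callable, Dict, List


def build_page_record(page_number: int, image_bytes: bytes,
                      describe: Callable[[bytes], Dict]) -> Dict:
    """Turn one page image into a JSON-serialisable record.

    `describe` is a hypothetical stand-in for a vision LLM call; it is
    assumed to return a dict with "summary" and "visual_elements" keys.
    """
    description = describe(image_bytes)
    visual_elements = description.get("visual_elements", [])
    return {
        "page_number": page_number,
        "summary": description.get("summary", ""),
        "visual_elements": visual_elements,
        "num_visual_elements": len(visual_elements),
    }


def build_document_records(pages: List[bytes],
                           describe: Callable[[bytes], Dict]) -> List[Dict]:
    # Pages are numbered from 1, matching how a reader would cite them.
    return [build_page_record(i, img, describe)
            for i, img in enumerate(pages, start=1)]


def fake_describe(image_bytes: bytes) -> Dict:
    # Placeholder "model": a real system would send the image to an LLM here.
    return {
        "summary": f"page image of {len(image_bytes)} bytes",
        "visual_elements": ["figure: reaction scheme"],
    }


records = build_document_records([b"\x89PNG-page-1", b"\x89PNG-page-2"],
                                 fake_describe)
document_json = json.dumps(records, indent=2)
```

Each record in `document_json` could then be embedded as one chunk, with the page number kept as metadata so that a retrieved chunk can be traced back to its page.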

thor-rodrigues commented on Continuous Glucose Monitoring   imperialviolet.org/2025/0... · Posted by u/zdw
thor-rodrigues · 2 months ago
What I find absolutely infuriating is that Abbott (Freestyle Libre 1-3 devices) region-locks its monitoring app.

My father is T1 and has been using the Libre CGM system for a couple of years now. Libre users in the US and Europe can enjoy direct integration with their iOS devices, including constant updates and, most importantly, notification alerts for dangerously high or low glucose levels. It is even possible to share live readings with close family members or caretakers.

But none of this is available for my dad, as he lives in Brazil. Even though the product is the same, he cannot download the iOS apps from the App Store, as they are region locked.

thor-rodrigues commented on Valve takes another step toward making SteamOS a true Windows competitor   arstechnica.com/gaming/20... · Posted by u/austinallegro
constantcrying · 3 months ago
I know.

Every single person would be better served by just using Fedora. The additional layer is either so thin that the value added is zero or it creates a drastically worse user experience, as the layer around Fedora is poorly maintained by a few volunteers and very little testing is done. Bazzite is pointless and a total waste of time. Nobody should use it.

All of these "layer around actually popular distro" projects are pointless and make using Linux on the desktop a worse experience.

thor-rodrigues · 3 months ago
> Bazzite is pointless and a total waste of time. Nobody should use it.

Bazzite, as far as I am aware, is the closest thing you can get to a console-like experience on an HTPC*. Although it is possible to configure Bazzite to launch directly into desktop mode, the key idea behind it is to launch in Big Picture mode, so you can manage the UI using a controller. Once that is done, you are pretty much in a SteamOS/Steam Deck-like UI.

The appeal of the distro for me was always to get the convenience and console-like experience, while enjoying more powerful hardware and the benefits of the Steam platform.

thor-rodrigues commented on Valve takes another step toward making SteamOS a true Windows competitor   arstechnica.com/gaming/20... · Posted by u/austinallegro
thor-rodrigues · 3 months ago
If you're serious about using SteamOS for your gaming computer or home theater PC, I highly recommend Bazzite.

Bazzite is a Linux operating system, built on Fedora and shipping a SteamOS-style interface, that's designed to be easy to use with different hardware and controllers. It simplifies the installation process and works really well with other game launchers that aren't on Steam, and you can set it up to look like a game console with the SteamOS interface or a regular computer with a desktop.

The only real problem I had was with competitive multiplayer shooters that require kernel-level anti-cheat software, which doesn't work on Linux.

But if playing online multiplayer isn't your main thing and you’re sick of Windows being as intrusive as it is, Bazzite is an outstanding choice for a gaming or home theater computer.

https://bazzite.gg/

thor-rodrigues commented on Googler... ex-Googler   nerdy.dev/ex-googler... · Posted by u/namukang
ajb · 4 months ago
Must do. I wonder if that's because of the wide access to weapons? Although I thought I'd heard that there still wasn't much gun violence in Switzerland.
thor-rodrigues · 4 months ago
I have the impression (although I did not check the data beforehand to confirm my assumptions) that gun violence is very low in developed countries, with the USA being the outlier.

I believe the overall positive employer-employee relationship in Europe is much more a product of legislation and cultural norms than of the threat of violence.

thor-rodrigues commented on Docs – Open source alternative to Notion or Outline   github.com/suitenumerique... · Posted by u/maelito
thor-rodrigues · 5 months ago
I really like the idea of shifting the business model for office software. Instead of the current model—where companies develop a tool, lock users into their ecosystem, and profit by bundling software with hosting and storage—we could move to a model where different providers compete to offer the best deployment solutions. This would foster competition based on factors like pricing, encryption, customer support, server location, and integration flexibility, rather than simply forcing users into long-term subscriptions.

That’s why I’m glad to see governments supporting Open Source alternatives to proprietary office software. Paying recurring subscription fees for low-maintenance tools like MS Office feels out of touch—especially when Microsoft once offered a one-time purchase model before shifting to SaaS to maximize profits. This change has made it difficult for individuals and businesses to retain long-term ownership of their tools without being tied to costly and recurring fees. The same trend has played out across the software industry, from design tools like Adobe Creative Cloud (which replaced one-time purchases with a mandatory subscription model) to communication platforms like Slack and Zoom, which lock companies into ongoing costs while limiting interoperability with other solutions.

thor-rodrigues commented on Docs – Open source alternative to Notion or Outline   github.com/suitenumerique... · Posted by u/maelito
wim · 5 months ago
We're working on an "IDE for notes/tasks" [1] in the space of Notion and so on where you can easily self-host the sync backend with a single binary.

The idea is that you can choose between cloud or self-host (and "eject" at any time to switch between the two if you ever change your mind). We hope that might be a good balance between some companies or individuals wanting to self-host but still making it accessible when you don't know how any of that works, which indeed can get complicated fast.

[1] https://thymer.com/

thor-rodrigues · 5 months ago
That looks VERY AWESOME. Really looking forward to trying it :)

thor-rodrigues commented on Show HN: Documind – Open-source AI tool to turn documents into structured data   github.com/DocumindHQ/doc... · Posted by u/Tammilore
thor-rodrigues · 9 months ago
Very nice tool! Just last week, I was working on extracting information from PDFs for an automation flow I’m building. I used Unstructured (https://unstructured.io/), which supports multiple file types, not just PDFs.

However, my main issue is that I need to work with confidential client data that cannot be uploaded to a third party. Setting up the open-source, locally hosted version of Unstructured was quite cumbersome due to the numerous additional packages and installation steps required.

While I’m open to the idea of parsing content with an LLM that has vision capabilities, data safety and confidentiality are critical for many applications. I think your project would go from good to great if it were possible to connect to Ollama and run locally.

That said, this is an excellent application! I can definitely see myself using it in other projects that don’t demand such stringent data confidentiality.
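
As a rough sketch of what connecting to Ollama could look like (this is not Documind's actual API), a page image can be posted to Ollama's local `/api/generate` endpoint so the data never leaves the machine. The model name `llava`, the prompt, and the helper names `build_payload` and `extract_page` are assumptions made for illustration.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; nothing is sent to a third party.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(image_bytes: bytes, prompt: str, model: str = "llava") -> dict:
    # "llava" is an assumed model name; any locally pulled vision model would do.
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",  # ask Ollama to constrain the reply to valid JSON
        "stream": False,   # return one JSON object instead of a token stream
    }


def extract_page(image_bytes: bytes, prompt: str, url: str = OLLAMA_URL) -> dict:
    """Send one page image to a local Ollama server and parse its JSON reply."""
    body = json.dumps(build_payload(image_bytes, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body,
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(request) as response:
        reply = json.loads(response.read())
    # The generated text is in the "response" field of the reply.
    return json.loads(reply["response"])


# Building the payload needs no running server; extract_page() does.
payload = build_payload(b"fake-image-bytes",
                        "Extract the fields on this page as JSON.")
```

Calling `extract_page` requires an Ollama server running locally with a vision model already pulled; the payload itself can be inspected without one.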

thor-rodrigues commented on Update on Llama adoption   ai.meta.com/blog/llama-us... · Posted by u/meetpateltech
thor-rodrigues · a year ago
I think that focusing primarily on the discussion of what is or isn't open source software makes us miss an interesting point here: Llama enables users to get performance similar to frontier models on their own systems, without having to send data to third-party sources.

My company is building an application for a university client, regarding the examination of research data written in "human language" (mostly notes and docs).

Due to the high confidentiality of the subjects, as they often deal with non-patented information, we couldn't risk using frontier models, as doing so could break the novelty of the invention and therefore cost its patentability.

Now with Llama3.1, we can simply run these models locally, on systems that are not even connected to the internet. LLMs are quite good at examining massive amounts of research papers and information, at least for the application we are aiming at, saving thousands of hours of tiresome (and very boring) human labour.

I am not trying to endorse Meta or Zuckerberg or anything like that, but at least in this respect, I think Llama being "open-source" is a very good thing.

u/thor-rodrigues

Karma: 280 · Cake day: July 26, 2024