keepsweet (u/keepsweet)

keepsweet commented on Stop over-thinking AI subscriptions steipete.me/posts/2025/st... · Posted by u/hboon

dgan · 3 months ago

Tried Claude yesterday to help me extract rows from a financial statement PDF. Let's automate boring stuff!! After multiple failures, I did it myself.

keepsweet · 3 months ago

Most people don't realize that LLMs by design were not made for document processing, data extraction, etc. For that, you would have to use a dedicated tool like Klippa DocHorizon, which built its own AI OCR from scratch. It also provides an API that you can use to send your documents and receive formatted data. It's less popular than, say, Textract or Tesseract, but it's far more accurate, especially if you're dealing with sensitive data that you wouldn't want an LLM to hallucinate.

keepsweet commented on Show HN: OCR Workbench: AI OCR for hard documents github.com/viking2917/ocr... · Posted by u/viking2917

keepsweet · 4 months ago

Interesting concept. I tried it with a text written in Church Slavonic, didn't work. I guess the documents don't have to be THAT old. It would also be nice if you could upload images individually instead of selecting everything from a folder. Either way, nice work.

keepsweet commented on Ask HN: Is there an OCR that might be able to handle field datasheets? · Posted by u/clamlady

keepsweet · 4 months ago

I've also tried Tesseract in the past with handwritten notes, and it didn't give very accurate results. Then I started looking into some commercial solutions and stumbled upon many different tools, but the only one that could handle my handwriting was Klippa DocHorizon: https://www.klippa.com/en/ocr/ It uses machine learning and OCR instead of just plain OCR like Tesseract does, so it might be an option to look into. You could also test it out at https://www.klippa.com/en/ocr/tools/

I've been using it for a while and would highly recommend it. Hopefully it works out for your use case.

keepsweet commented on Ask HN: Ways to Automatically Scan and Extract Business Cards Information? · Posted by u/ksec

refferal · 5 months ago

Have a look at: https://www.klippa.com/en/dochorizon/. First, use a scanner with an automatic feeder to take pictures of the cards, and let Klippa read from a Google Drive folder, for example. Then, you can create your own workflow to determine what should happen with the results.

keepsweet · 5 months ago

Klippa user here. I've been using it for the past year or so and can also recommend OP use it for scanning and extracting data from business cards. The workflow builder is probably what OP is looking for: https://www.youtube.com/watch?v=1TZJxlaOiKo

Just out of curiosity, how did you find out about Klippa?

keepsweet commented on NASA to launch space observatory that will map 450M galaxies nbcnews.com/science/space... · Posted by u/gmays

arisAlexis · 6 months ago

I call it here, Musk will cut it before they do it

keepsweet · 6 months ago

I highly doubt it, but is that plausible?

keepsweet commented on Show HN: Searchable Vim Cheat Sheet with Favorites (Open-Source) nvim-cheatsheet.vercel.ap... · Posted by u/lil_csom

lil_csom · 6 months ago

Hey! I am sure it only takes time to become a master of it :) I think what makes people stick is exactly the fact that it's configurable, as the sheer process of messing with your setup is fun in itself, not just the flow it can possibly give you! I really do hope that others will indeed find it useful <3

keepsweet · 6 months ago

Absolutely! I'm very confident people will find it useful. Are you also perhaps going to include a light mode version of the website? It's currently hard to look at when the sun is shining on my screen. I think other people would like it as well!

keepsweet commented on Show HN: Searchable Vim Cheat Sheet with Favorites (Open-Source) nvim-cheatsheet.vercel.ap... · Posted by u/lil_csom

keepsweet · 6 months ago

Love seeing vim and neovim being used in more open source projects. Despite the huge learning curve, once vim becomes second nature, it's hard to go back to using the mouse. At least, that's how it was for me. Thanks for creating this cheat sheet, I'm sure lots of beginners will find it useful!

keepsweet commented on Mistral OCR mistral.ai/fr/news/mistra... · Posted by u/littlemerman

codetrotter · 6 months ago

One use case is digitising receipts from business-related travel: expenses that employees paid for out of their own pocket and for which they submit pictures to the business for reimbursement.

Bus travels, meals including dinners and snacks, etc. for which the employee has receipts on paper.

keepsweet · 6 months ago

Yeah, digitizing receipts is still a huge challenge for most companies, especially for expense reimbursements. Even though invoices are increasingly digital, employees still end up with physical receipts for work-related expenses. From what I've seen, there are some interesting contenders like Klippa that seem to solve exactly this problem [1].

Curious to know if anyone has heard of or used their OCR or a similar tool. Apparently it's not an LLM in disguise but an actual AI trained on gazillions of documents, so the risk of hallucination might be lower than with LLM OCR solutions like Mistral.

[1] https://www.klippa.com/en/ocr/ocr-api/

u/keepsweet

Karma: 3 · Cake day: March 10, 2025