octonaut (u/octonaut)

octonaut commented on Ingesting PDFs and why Gemini 2.0 changes everything sergey.fyi/articles/gemin... · Posted by u/serjester

lazypenguin · a year ago

I work in fintech and we replaced an OCR vendor with Gemini for ingesting some PDFs. After trial and error with different models, Gemini won because it was so darn easy to use and it worked with minimal effort. I think one shouldn't underestimate what a multi-modal model with a large context window gives you in terms of ease of use. Ironically, this vendor is the best known and most successful vendor for OCR'ing this specific type of PDF, but many of our requests failed over to their human-in-the-loop process. Despite it not being their specialization, switching to Gemini was a no-brainer after our testing. Processing time went from something like 12 minutes on average to 6s on average, accuracy was like 96% of that of the vendor, and the price was significantly cheaper. For the 4% inaccuracies, a lot of them are things like the handwritten text "LLC" getting OCR'd as "IIC", which I would say is somewhat "fair". We could probably improve our prompt to clean up this data even further. Our prompt is currently very simple: "OCR this PDF into this format as specified by this json schema", and it didn't require any fancy "prompt engineering" to contort out a result.

Gemini developer experience was stupidly easy. Easy to add a file "part" to a prompt. Easy to focus on the main problem with weirdly high context window. Multi-modal so it handles a lot of issues for you (PDF image vs. PDF with data), etc. I can recommend it for the use case presented in this blog (ignoring the bounding boxes part)!

octonaut · a year ago

I'm interested to hear what your experience has been with optional data. For example, if the input PDF has fields that are sometimes unpopulated or missing entirely, is Gemini smart enough to leave those fields blank in the output? Usually the LLM tries to please you and makes up values here.
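
To make the question concrete, here's a rough sketch of the shape I have in mind: optional fields are declared nullable, and a small post-processing step turns missing or placeholder values into None. The field names (vendor_name, tax_id, due_date) and the placeholder list are made up for illustration, and this only normalizes the model's output. It can't detect a value the model invented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Invoice:
    # Hypothetical fields, for illustration only.
    vendor_name: str                 # required
    tax_id: Optional[str] = None     # optional: may be absent from the PDF
    due_date: Optional[str] = None   # optional: may be absent from the PDF


# Strings treated as "not populated" (an assumption, tune for your data).
PLACEHOLDERS = {"", "n/a", "none", "unknown", "null"}


def clean(value):
    """Return None for missing or placeholder values, else the stripped text."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text.lower() in PLACEHOLDERS else text


def parse_invoice(raw: dict) -> Invoice:
    """Build an Invoice from the model's JSON output, keeping blanks as None."""
    vendor = clean(raw.get("vendor_name"))
    if vendor is None:
        raise ValueError("vendor_name is required")
    return Invoice(
        vendor_name=vendor,
        tax_id=clean(raw.get("tax_id")),
        due_date=clean(raw.get("due_date")),
    )
```

With this, an output like {"vendor_name": "Acme LLC", "tax_id": "N/A"} ends up with tax_id and due_date both set to None instead of carrying a filler string downstream.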

octonaut commented on The most breathtaking abandoned sites english.elpais.com/travel... · Posted by u/PaulHoule

octonaut · a year ago

I had the pleasure of visiting Gunkanjima a few years ago. If you're in Nagasaki, there are a few tour operators that ferry you to the island for a tour. Unfortunately you can't explore on your own for safety reasons, but still an amazing place to visit.

octonaut commented on Best Practices for Secure and Readable Code: Error Handling and Logging hjortberg.substack.com/p/... · Posted by u/logscope

octonaut · a year ago

I agree with everything said here, but I'd like to add a couple of extra points:

- Be aware of the volume of logs you generate. For example, if you're in a for loop iterating over a thousand elements and have a try/catch around the processing of each element, you're going to generate a thousand lines of logs if there's an upstream issue (database connection error, network error etc.). In such cases, always exit early when you hit an unrecoverable error. And I don't necessarily mean exit the whole program, just exit the loop early.

- If you're logging to file, make sure you have a log rotation policy in place with a maximum file size configured. At a previous workplace we had a system-wide outage because a core component hit too many connection errors, wrote them all to disk, filled up all the space and took the box down.
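
Both points can be sketched in a few lines of Python using the standard library's RotatingFileHandler. This is a minimal sketch, not a drop-in: UnrecoverableError, build_logger and process_all are hypothetical names, and the size limits are arbitrary defaults.

```python
import logging
from logging.handlers import RotatingFileHandler


class UnrecoverableError(Exception):
    """Hypothetical marker for upstream failures (database down, network error)."""


def build_logger(path, max_bytes=1_000_000, backup_count=3):
    """File logger that rotates at max_bytes and keeps backup_count old files."""
    logger = logging.getLogger("batch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers if called twice
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


def process_all(items, process, logger):
    """Process items one by one; return how many succeeded before stopping."""
    done = 0
    for item in items:
        try:
            process(item)
        except UnrecoverableError:
            # Upstream is broken: log once and leave the loop, not the program.
            logger.exception("upstream failure on %r, stopping loop", item)
            break
        except ValueError:
            # A problem with this one element only: log it and carry on.
            logger.warning("skipping bad item %r", item)
            continue
        done += 1
    return done
```

With the rotation settings above, disk usage for this log is capped at roughly max_bytes * (backup_count + 1), and a dead database produces one stack trace instead of a thousand.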

octonaut commented on Ask HN: What to do about underperforming team member · Posted by u/buynlarge

octonaut · a year ago

It's hard to provide specific advice without knowing the exact nature of the underperformance, so I'll keep it general.

- Sometimes underperformance can stem from a lack of engagement because of a disconnect between the work they would like to do, and the work they've been given. In this instance you could try giving them a wider variety of tasks and see if they prosper in anything else.

- You could also change the level of work given to them. If it's too easy or repetitive, this can often cause people to switch off and lose discipline. On the other hand, if it's too hard they get overwhelmed and don't know what to do. If this is the case, then mentoring is the right approach, but only if the level of work is just beyond their abilities. If you give them something miles out of their league, no amount of mentoring is going to get them over the line.

- Something I tried when I was in your position was to enforce tighter standards in the CI pipeline (test coverage, manual testing notes, linting etc.). This instills discipline before they push code and forces them to fix their own issues before seeking a review.

- Try getting them more involved in code reviews. Reviewing is a skill that should be taught just as much as writing code, and poking holes in other people's code might prompt them to do the same with their own. Reviews are also a great opportunity for senior devs to mentor asynchronously, for example by writing long-form explanations of why a design choice was made in a git diff and sharing them across the team.

- Finally, they could just be outright negligent, but I'm assuming this is not the case with you. If they're resistant to taking advice and improving, then this might already be a lost battle.

Good luck!

octonaut commented on Show HN: Looking for work is a full time job – So I created this tool · Posted by u/Dallas_B

octonaut · a year ago

Will this let me filter for jobs in a specific region? For example, Europe or Australia?

octonaut commented on OWASP Non-Human Identities Top 10 owasp.org/www-project-non... · Posted by u/raskelll

octonaut · a year ago

TIL that OWASP has a bunch of Top 10 projects other than application security. Some others I found:

- Top 10 for LLMs - https://owasp.org/www-project-top-10-for-large-language-mode...

- Top 10 for OT - https://ot.owasp.org/

- Top 10 for Smart Contracts - https://owasp.org/www-project-smart-contract-top-10/

- Top 10 for Open Source Software - https://owasp.org/www-project-open-source-software-top-10/

octonaut commented on Durable execution should be lightweight dbos.dev/blog/what-is-lig... · Posted by u/KraftyOne

KraftyOne · a year ago

Thanks! DBOS is simpler not because it ignores complexity, but because it uses Postgres to deal with complexity. And Postgres is a very powerful tool for building reliable systems!

octonaut · a year ago

Temporal has the option of using Postgres as the persistence backend. Presumably, the simplicity of DBOS comes from not having to spin up a web server and workflow engine to orchestrate the functions?