Readit News
hugodutka commented on Blockdiff: We built our own file format for VM disk snapshots   cognition.ai/blog/blockdi... · Posted by u/cyanf
hugodutka · 3 months ago
Have you considered https://github.com/containerd/overlaybd? It seems to offer very similar features to blockdiff.
hugodutka commented on Show HN: AgentAPI – HTTP API for Claude Code, Goose, Aider, and Codex   github.com/coder/agentapi... · Posted by u/hugodutka
andrewfromx · 8 months ago
hugodutka · 8 months ago
I haven't used claude-task-master before, but based on the README, it looks like it's an AI agent that integrates well with IDEs. In contrast, AgentAPI lets you control other agents - like Claude Code or OpenAI Codex - using HTTP calls instead of typing commands into the terminal. For example, you could use AgentAPI to control Claude Code from a custom frontend, such as a native desktop application.
hugodutka commented on Show HN: Zerox – Document OCR with GPT-mini   github.com/getomni-ai/zer... · Posted by u/themanmaran
nbbaier · a year ago
> I extracted the embedded text from the PDF

What did you use to extract the embedded text during this step, other than some other OCR tech?

hugodutka · a year ago
PyMuPDF, a PDF library for Python.
hugodutka commented on Show HN: Zerox – Document OCR with GPT-mini   github.com/getomni-ai/zer... · Posted by u/themanmaran
sidmitra · a year ago
> frequency of character triples

What are character triples? Are they trigrams?

hugodutka · a year ago
I think so. I'd normalize the text first: lowercase it and remove all non-alphanumeric characters. E.g., for the phrase "What now?" I'd create these trigrams: wha, hat, atn, tno, now.
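
A minimal Python sketch of that normalization, for illustration only. The function name `trigrams` is hypothetical, and treating "alphanumeric" as ASCII letters and digits is an assumption; text in other scripts would need a different filter.

```python
import re


def trigrams(text: str) -> list[str]:
    """Return the character trigrams of text after normalization."""
    # Lowercase, then drop everything except ASCII letters and digits
    # (assumption: ASCII-only matching is enough for the documents at hand).
    normalized = re.sub(r"[^a-z0-9]", "", text.lower())
    # Slide a 3-character window over the normalized string.
    # Strings shorter than 3 characters yield an empty list.
    return [normalized[i:i + 3] for i in range(len(normalized) - 2)]


print(trigrams("What now?"))  # ['wha', 'hat', 'atn', 'tno', 'now']
```

Because whitespace is removed too, trigrams span word boundaries (atn, tno), which keeps the comparison insensitive to line breaks and spacing differences.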
hugodutka commented on Show HN: Zerox – Document OCR with GPT-mini   github.com/getomni-ai/zer... · Posted by u/themanmaran
hugodutka · a year ago
I used this approach extensively over the past couple of months with GPT-4 and GPT-4o while building https://hotseatai.com. Two things that helped me:

1. Prompt with examples. I included an example image with an example transcription as part of the prompt. This made GPT make fewer mistakes and improved output accuracy.

2. Confidence score. I extracted the embedded text from the PDF and compared the frequency of character triples in the source text and GPT’s output. If there was a significant difference (less than 90% overlap), I would log a warning. This helped detect cases where GPT omitted entire paragraphs of text.
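
The confidence check in point 2 could be sketched roughly like this in Python. The comment does not say exactly how overlap was computed, so the metric here (the share of source trigram occurrences that also appear in the output) is an assumption, as are the names `trigram_counts`, `overlap`, and `check_transcription`. Only the 90% threshold comes from the comment.

```python
import logging
import re
from collections import Counter


def trigram_counts(text: str) -> Counter:
    """Count character trigrams after lowercasing and stripping non-alphanumerics."""
    normalized = re.sub(r"[^a-z0-9]", "", text.lower())
    return Counter(normalized[i:i + 3] for i in range(len(normalized) - 2))


def overlap(source_text: str, model_output: str) -> float:
    """Fraction of source trigram occurrences also present in the output.

    Assumed metric, not necessarily the one used originally.
    """
    source = trigram_counts(source_text)
    output = trigram_counts(model_output)
    total = sum(source.values())
    if total == 0:
        return 1.0  # nothing to compare against, so nothing can be missing
    # Counter & Counter is a multiset intersection (minimum of each count).
    shared = sum((source & output).values())
    return shared / total


def check_transcription(source_text: str, model_output: str,
                        threshold: float = 0.9) -> float:
    """Log a warning when the overlap falls below the threshold."""
    score = overlap(source_text, model_output)
    if score < threshold:
        logging.warning("Low trigram overlap: %.2f (threshold %.2f)",
                        score, threshold)
    return score
```

Measuring against the source counts makes the score drop when the output is missing text, which matches the omitted-paragraph case described above; it does not penalize extra text the model adds.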

u/hugodutka

Karma: 1186 · Cake day: January 5, 2019
About
https://hugodutka.com/contact/