Excited to share Nanonets-OCR-s, a powerful and lightweight (3B) VLM that converts documents into clean, structured Markdown. The model is trained to understand document structure and content context (tables, equations, images, plots, watermarks, checkboxes, etc.).
Key Features:
LaTeX Equation Recognition: Converts inline and block-level math into properly formatted LaTeX, distinguishing between $...$ and $$...$$.
Image Descriptions for LLMs: Describes embedded images using structured <img> tags. Handles logos, charts, plots, and so on.
Signature Detection & Isolation: Finds and tags signatures in scanned documents, outputting them in <signature> blocks.
Watermark Extraction: Extracts watermark text and stores it within a <watermark> tag for traceability.
Smart Checkbox & Radio Button Handling: Converts checkboxes to Unicode symbols like ☐, ☑, and ☒ for reliable parsing in downstream apps.
Complex Table Extraction: Handles multi-row/column tables, preserving structure and outputting both Markdown and HTML formats.
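To make the "reliable parsing in downstream apps" point concrete, here is a rough sketch of how the tagged output could be consumed. It does not call the model. SAMPLE is a made-up string, not real model output, and the helper names are hypothetical. It also assumes the tags come as plain open/close pairs with the text between them, and that ☐/☑/☒ mean unchecked/checked/crossed; neither is confirmed by the post.

```python
import re

# Hypothetical sample of the kind of Markdown the post describes.
# This is NOT real model output; the exact tag layout is an assumption.
SAMPLE = """\
# Invoice

<watermark>CONFIDENTIAL</watermark>

Inline math $E = mc^2$ and a block:

$$a^2 + b^2 = c^2$$

☑ Paid
☐ Overdue
☒ Disputed

<img>Company logo: a blue hexagon</img>

<signature>Jane Doe</signature>
"""

TAGS = ("img", "signature", "watermark")
# Assumed meaning of each symbol; adjust if the model uses them differently.
CHECKBOX_STATES = {"☐": "unchecked", "☑": "checked", "☒": "crossed"}


def extract_tags(markdown: str) -> dict[str, list[str]]:
    """Collect the text inside each <tag>...</tag> pair, keyed by tag name."""
    found = {}
    for tag in TAGS:
        pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.DOTALL)
        found[tag] = [m.strip() for m in pattern.findall(markdown)]
    return found


def extract_checkboxes(markdown: str) -> list[tuple[str, str]]:
    """Return (label, state) for every line that starts with a checkbox symbol."""
    items = []
    for line in markdown.splitlines():
        line = line.strip()
        if line and line[0] in CHECKBOX_STATES:
            items.append((line[1:].strip(), CHECKBOX_STATES[line[0]]))
    return items


def extract_math(markdown: str) -> dict[str, list[str]]:
    """Split math into block ($$...$$) and inline ($...$) expressions."""
    block = re.findall(r"\$\$(.+?)\$\$", markdown, re.DOTALL)
    # Remove block math first so its delimiters are not mistaken for inline math.
    without_block = re.sub(r"\$\$.+?\$\$", "", markdown, flags=re.DOTALL)
    inline = re.findall(r"\$(.+?)\$", without_block)
    return {"block": [b.strip() for b in block], "inline": [i.strip() for i in inline]}


if __name__ == "__main__":
    print(extract_tags(SAMPLE))
    print(extract_checkboxes(SAMPLE))
    print(extract_math(SAMPLE))
```

Regexes are enough here only because the tags in the sample are flat and unnested. If real output nests tags or puts the description in attributes, a proper HTML parser would be the safer choice.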
Huggingface / GitHub / Try it out: https://huggingface.co/nanonets/Nanonets-OCR-s
Try it with Docext in Colab: https://github.com/NanoNets/docext/blob/main/PDF2MD_README.m...
If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.
However, your "It's great that you're in medical school and very aware" is very patronizing and pointedly dismissive. It's a superficially polite acknowledgment that feels sarcastic rather than genuinely complimentary. I don't really mind, and I acknowledge the point you're trying to make. But if your goal is to curate a curious discussion and avoid snark, you should model it too.
He had a biological hypothesis that the scientific community disagreed with, and he tested it on himself as a case study to get data. That case study was successful and then became a clinical trial. That trial was replicated and shown to work. He then won a Nobel prize for that work and the risk he took. This is an evidence-based process. EBM doesn’t mean you disregard an N=1; it means you expand N=1 into N=10, then N=100, … before you apply something to the general population. This is loosely how phase 1, 2, 3, and 4 trials work in the US.
Dismissing EBM because of Marshall is like dismissing all of math because someone disproved a popular conjecture like the local-to-global conjecture. Sure, the community sentiment had it wrong, but the systematic logical approach of math got it right. In Marshall’s case the community sentiment had it wrong, but the EBM approach eventually got it right. Half this thread doesn’t even know what they are arguing against.
I should know better by now than to trust doctors to act based on research and not gut feeling, but I hope this doesn't mean the last year of taking it was a wash...
Doctors have wide discretion and often get things wrong. But in your case, that’s not what happened. If anything, your doctor actually got it right, either by chance or intuition.
> Cairo solved the so-called Mizohata-Takeuchi conjecture, a problem first proposed in the 1980s that the harmonic analysis community had been working on for decades. The conjecture was widely believed to be true — if so, it would have automatically validated several other important results in the field — but the community greeted the new development with both enthusiasm and surprise: the author was a 17-year-old who hadn’t yet finished high school.