samuell commented on Why DuckDB is my first choice for data processing   robinlinacre.com/recommen... · Posted by u/tosh
DangitBobby · a month ago
Being able to use SQL on CSV and json/jsonl files is pretty sweet. Of course it does much more than that, but that's what I do most often with it. Love duckdb.
samuell · a month ago
Indeed! I generally like awk a lot for simpler CSV/TSV processing, but when it comes to cases where you need things like combining/joining multiple CSV files or aggregating for certain columns, SQL really shines IME.
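To make the join-and-aggregate case concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in rather than DuckDB itself, so that it runs without extra installs; in DuckDB the CSV files could be referenced directly in the FROM clause instead of being loaded first. The sample data, table names, column names and the helpers `load_csv` and `reads_per_site` are all made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical sample data standing in for two CSV files on disk.
SAMPLES_CSV = """sample_id,site
s1,oral
s2,oral
s3,gut
"""
READS_CSV = """sample_id,reads
s1,100
s1,50
s2,70
s3,30
"""


def load_csv(conn, table, text):
    """Create a table from CSV text, using the header row as column names."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    cols = ", ".join(f'"{c}"' for c in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


def reads_per_site(samples_text, reads_text):
    """Join the two tables on sample_id and sum the reads per site."""
    conn = sqlite3.connect(":memory:")
    load_csv(conn, "samples", samples_text)
    load_csv(conn, "reads", reads_text)
    query = """
        SELECT s.site, SUM(CAST(r.reads AS INTEGER)) AS total_reads
        FROM samples AS s
        JOIN reads AS r ON r.sample_id = s.sample_id
        GROUP BY s.site
        ORDER BY s.site
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for site, total in reads_per_site(SAMPLES_CSV, READS_CSV):
        print(site, total)
```

Doing the same in awk would mean tracking the join keys by hand in associative arrays, which is roughly where SQL starts to pay off.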
samuell commented on Don't fall into the anti-AI hype   antirez.com/news/158... · Posted by u/todsacerdoti
20k · a month ago
I wish LLMs were good at search. I've tried many times to evaluate how well they answer research questions in astrophysics (specifically numerical relativity). If they were good at answering questions, I'd use them in a heartbeat.

Without exception, every answer I've gotten from an LLM to a technical question I already know the answer to has been substantially wrong in some fashion. This makes it just... absolutely useless for research. In some cases I've spotted it straight up plagiarising from the original sources, with random capitalisation giving it away.

The issue is that once you get even slightly into a niche, they fall apart because the training data just doesn't exist. But they don't say "sorry, there's insufficient training data to give you an answer", they just make shit up and state it confidently.

samuell · a month ago
Did you try https://elicit.org ?

I have been impressed by its results.

I think this stems more from its initial search phase than from its pure LLM processing power, but to me the approach seems to work really well.

samuell commented on Oral microbiome sequencing after taking probiotics   blog.booleanbiotech.com/o... · Posted by u/sethbannon
samuell · a month ago
Nice experiment and writeup!

On a tangent, nice to see Plasmidsaurus using Emu [1], which works great for 16S ribosomal RNA analysis on ONT data according to basically everyone I've heard from who has tried it. It uses an expectation-maximization algorithm to predict whether variants are due to ONT sequencing errors or are true variants, thus working around the somewhat limited accuracy of ONT reads. Pretty clever stuff.

And if you want to run your own analysis on the raw data using Emu, you might want to try out our Trana pipeline, built around Emu in Nextflow [2]. Apart from running Emu, it does some of the preprocessing, like filtering, and exports the results as Krona diagrams etc.

We're just putting it through validation at the clinical microbiology lab at Karolinska here in Stockholm right now.

The main caveat worth mentioning is that the choice of database seems to affect the results quite a lot in some cases.

[1] https://github.com/treangenlab/emu

[2] https://github.com/genomic-medicine-sweden/TRANA
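For anyone curious what the expectation-maximization idea mentioned above looks like in general, here is a toy sketch. It is not Emu's actual implementation: the function name `em_abundances` and the input format (a per-read list of likelihoods, one per candidate taxon) are assumptions made for illustration, and how such likelihoods would be derived from real alignments is left out entirely.

```python
def em_abundances(likelihoods, n_iter=100):
    """Estimate relative taxon abundances from per-read likelihoods.

    likelihoods[i][j] is an assumed probability of observing read i
    if it truly came from taxon j (hypothetical input format).
    """
    n_taxa = len(likelihoods[0])
    abundances = [1.0 / n_taxa] * n_taxa  # start from a uniform guess
    for _ in range(n_iter):
        # E-step: split each read across taxa in proportion to
        # (current abundance) x (likelihood of the read under that taxon).
        totals = [0.0] * n_taxa
        for row in likelihoods:
            weighted = [a * p for a, p in zip(abundances, row)]
            norm = sum(weighted)
            if norm == 0:
                continue  # read is incompatible with every taxon; skip it
            for j, w in enumerate(weighted):
                totals[j] += w / norm
        # M-step: new abundances are the normalized expected read counts.
        total = sum(totals)
        if total == 0:
            break  # no usable reads; keep the previous estimate
        abundances = [t / total for t in totals]
    return abundances


if __name__ == "__main__":
    # Made-up likelihoods: three reads fit taxon 0 well, one fits taxon 1,
    # and one is ambiguous between the two.
    toy = [[0.9, 0.01], [0.9, 0.01], [0.9, 0.01], [0.3, 0.3], [0.01, 0.9]]
    print(em_abundances(toy))
```

The point of the iteration is that ambiguous reads get pulled toward the taxa that are already well supported by the unambiguous ones, which is the general way an EM approach can absorb noisy reads.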

samuell commented on Flow5 released to open source   flow5.tech/docs/releaseno... · Posted by u/picture
samuell · a month ago
Curious that this hit the front page.

What kind of projects is this software used for?

samuell commented on Our new SAM audio model transforms audio editing   about.fb.com/news/2025/12... · Posted by u/ushakov
samuell · 2 months ago
I tried this to extract some speech from an audio track with heavy wind noise (filmed out on a windy sea shore without a mic windscreen), and the result was unfortunately less intelligible than the original.

I got much better results, though still not perfect, with the voice isolator in ElevenLabs.

samuell commented on Learn Prolog Now (2006)   lpn.swi-prolog.org/lpnpag... · Posted by u/rramadass
samuell · 3 months ago
I love Prolog, and have seen so many interesting use cases for it.

In the end though, it mostly feels like enough of a separate universe from the other languages and ecosystems I use for projects that there's a clear threshold for bringing it in.

If there were a really strong Prolog implementation with a great community and ecosystem around it, in say Python or Go, that would be killer. I know there are some implementations, but the ones I've looked into seem to either be incomplete in their Prolog support or have close to non-existent usage.

samuell commented on Reasoning models reason well, until they don't   arxiv.org/abs/2510.22371... · Posted by u/optimalsolver
WesolyKubeczek · 3 months ago
It’s because they generate a seeming of reasoning, and don’t actually reason!

(Slams the door angrily)

(stomps out angrily)

(touches the grass angrily)

samuell · 3 months ago
Yeah, a bit like a cheating student rote-memorizing and copying another student's technique for solving a type of problem, and failing hard as soon as there's too much variation from the original problem.
samuell commented on Affinity Studio now free   affinity.studio/get-affin... · Posted by u/dagmx
monkeywork · 3 months ago
Wish they would properly support Linux - the Affinity products are PAINFUL if not near impossible to get working in Wine.
samuell · 3 months ago
I was going to ask about Wine support. Has anyone tried it in Bottles (a Wine distribution)? I've had better luck with Bottles than with plain Wine for other software. Hoping to try soon.

u/samuell

Karma: 2361 · Cake day: May 13, 2013
About
Bioinformatician in Clinical Microbiology. Blogging at https://livesys.se

Before: Data Science & Engineering consultant @ Savantic https://savantic.se

PhD in bioinformatics from https://pharmb.io

Author of https://scipipe.org and https://github.com/pharmbio/sciluigi & some more.
