Readit News
Posted by u/sethbarrettAU 7 days ago
Show HN: Latex-wc – Word count and word frequency for LaTeX projects (github.com/sethbarrett50/...)
I was revising my proposal defense and kept feeling like I was repeating the same term. In a typical LaTeX project split across many .tex files, it’s awkward to get a quick, clean word-frequency view without gluing everything together or counting LaTeX commands/math as “words”.

So I built latex-wc, a small Python CLI that:

- extracts tokens from LaTeX while ignoring common LaTeX “noise” (commands, comments, math, refs/cites, etc.)

- can take a single .tex file or a directory and recursively scan all *.tex files

- prints a combined report once (total words, unique words, top-N frequencies)

Fastest way to try it is `uvx latex-wc [path]` (file or directory). Feedback welcome, especially on edge cases where you think the heuristic filters are too aggressive or not aggressive enough.
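
For anyone curious what this kind of filtering looks like, here is a rough Python sketch of the three steps listed above. It is not the actual latex-wc code: the regex patterns and the names `extract_words`, `iter_tex_files`, and `report` are hypothetical, chosen only for illustration, and the real tool's heuristics may differ considerably.

```python
import re
from collections import Counter
from pathlib import Path

# Heuristic filters, applied in order. These patterns are illustrative
# assumptions, not the rules latex-wc actually uses.
_FILTERS = [
    # Comments: an unescaped % up to the end of the line.
    re.compile(r"(?<!\\)%.*"),
    # A few display-math environments (far from exhaustive).
    re.compile(r"\\begin\{(equation\*?|align\*?)\}.*?\\end\{\1\}", re.S),
    # $$...$$, \[...\], and inline $...$ math.
    re.compile(r"\$\$.*?\$\$", re.S),
    re.compile(r"\\\[.*?\\\]", re.S),
    re.compile(r"(?<!\\)\$.*?(?<!\\)\$", re.S),
    # Commands whose braced argument is a key or name, not prose.
    re.compile(
        r"\\(?:begin|end|ref|eqref|cite\w*|label)\*?(?:\[[^\]]*\])*\{[^}]*\}"
    ),
    # Any remaining command name; its braced argument text is kept.
    re.compile(r"\\[A-Za-z]+\*?"),
]

# ASCII-only words, allowing internal apostrophes and hyphens.
# Accented and other non-ASCII letters are ignored in this sketch.
_WORD = re.compile(r"[a-z]+(?:['’-][a-z]+)*")


def extract_words(tex: str) -> list[str]:
    """Return lowercase word tokens from LaTeX source after filtering."""
    for pattern in _FILTERS:
        tex = pattern.sub(" ", tex)
    return _WORD.findall(tex.lower())


def iter_tex_files(path: Path):
    """Yield the file itself, or every *.tex file under a directory."""
    if path.is_dir():
        yield from sorted(path.rglob("*.tex"))
    else:
        yield path


def report(path, top_n: int = 10):
    """Return (total words, unique words, top-N list) for one combined count."""
    counts = Counter()
    for tex_file in iter_tex_files(Path(path)):
        text = tex_file.read_text(encoding="utf-8", errors="replace")
        counts.update(extract_words(text))
    return sum(counts.values()), len(counts), counts.most_common(top_n)
```

Calling `report("thesis/")` on a hypothetical project directory would return the totals and the most frequent words across every `.tex` file found under it. Regex filters like these are easy to fool (nested braces, custom macros, verbatim blocks), which is exactly where the "too aggressive or not aggressive enough" edge cases come from.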

gucci-on-fleek · 5 days ago
Are you aware of the "texcount" program [0] that's distributed with TeX Live by default?

[0]: https://ctan.org/pkg/texcount?lang=en

mci · 5 days ago

  detex "$@" | wc
  detex "$@" | tr -cs '[:alnum:]' '\n' | grep . | tr '[:upper:]' '[:lower:]' | sort | uniq -c | sort -rn

dang · 7 days ago
We need a link!

jxmesth · 7 days ago

dang · 6 days ago
Added above. Thanks!