kgeist commented on I tried every todo app and ended up with a .txt file   al3rez.com/todo-txt-journ... · Posted by u/al3rez
kgeist · 12 days ago
I've been using simple text files too, for 7 years now.

Except that I create a new file for each new day, for peace of mind (as opposed to having a single million-line-long file). Instead of Ctrl+F, I use grep. The format is Markdown.

My typical TODO file has 3 sections: TODO, Pending, and DONE. If something is done for the day, I move it to the DONE section. When I create a new file for a new day, I copy over everything except the DONE section. The Pending section is for something I can't act on immediately (say, waiting for a coworker's response). I look there less often.

Every morning I also re-prioritize the items in the TODO section.
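
To give an idea of the layout, a day's file might look roughly like this (the item wording is made up, just to show the three sections):

  ## TODO
  - review the pending PR
  - draft the release notes

  ## Pending
  - waiting for a coworker's response on the API change

  ## DONE
  - fixed the flaky integration test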

The only problem is that if I'm away from the work computer, I have to add items in a separate app on the phone (Notepad Free) and then manually copy them to the PC.

This is a system I arrived at naturally over those 7 years, through trial and error, and it works well for me. I tried other formats that didn't stick.

Just my two cents.

kgeist commented on GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2   magazine.sebastianraschka... · Posted by u/ModelForge
littlestymaar · 13 days ago
I've read many times that MoE models should be comparable to dense models with a parameter count equal to the geometric mean of the MoE's total and active parameter counts.

In the case of gpt-oss 120B, that would mean sqrt(5*120) ≈ 24B.
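
Spelled out with gpt-oss-120b's reported figures (roughly 117B total and about 5.1B active parameters):

  dense-equivalent ≈ sqrt(total × active) ≈ sqrt(117 × 5.1) ≈ 24B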

kgeist · 13 days ago
I've read that the formula is based on the early Mistral models and does not necessarily reflect what's going on nowadays.
kgeist commented on PHP compile time generics: yay or nay?   thephp.foundation/blog/20... · Posted by u/moebrowne
branko_d · 13 days ago
And as a consequence, C# can pack the value types directly in the generic data structure, instead of holding references to heap-allocated objects.

This is very important both for cache locality and for minimizing garbage collector pressure.

kgeist · 13 days ago
With reified generics, you can also do "new T[]" because the type is known at runtime. With type erasure, you can't do that.
kgeist commented on GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2   magazine.sebastianraschka... · Posted by u/ModelForge
cranberryturkey · 13 days ago
qwen3 is slow though. i used it. it worked, but it was slow and lacking features.
kgeist · 13 days ago
On my RTX 5090 with llama.cpp:

gpt-oss 120B - 37 tok/sec (with CPU offloading, doesn't fit in the GPU entirely)

Qwen3 32B - 65 tok/sec

Qwen3 30B-A3B - 150 tok/sec

(all at 4-bit)

kgeist commented on GPT-OSS vs. Qwen3 and a detailed look how things evolved since GPT-2   magazine.sebastianraschka... · Posted by u/ModelForge
Mars008 · 13 days ago
You call it fair? 32 / 5.1 > 6, so it takes more than 6 times the compute per token. Put another way, Qwen3 32B is 6 times slower than GPT OSS 120B.
kgeist · 13 days ago
>Qwen3 32B is 6 times slower than GPT OSS 120B.

Only if 120B fits entirely in the GPU. Otherwise, for me, with a consumer GPU that has only 32 GB of VRAM, gpt-oss 120B is actually almost 2 times slower than Qwen3 32B (37 tok/sec vs. 65 tok/sec).

kgeist commented on Who got arrested in the raid on the XSS crime forum?   krebsonsecurity.com/2025/... · Posted by u/todsacerdoti
anticodon · 14 days ago
Can you tell a Russian from a Ukrainian? Can you even tell a Ukrainian name from a Russian one?
kgeist · 14 days ago
There are a bunch of phonetic differences:

  rus. Volk ~ ukr. Vovk
  rus. Aleksey ~ ukr. Oleksiy
Different suffixes:

  rus. Tarasov ~ ukr. Tarasenko, Tarasiuk
Ukrainian surnames often have no suffix at all (more often than Russian ones):

  rus. Melnikov vs. ukr. Melnyk
  rus. Kovalev vs. ukr. Koval
Although none of this tells you a person's nationality for certain, because of migration.

kgeist commented on GPT-5: "How many times does the letter b appear in blueberry?"   bsky.app/profile/kjhealy.... · Posted by u/minimaxir
mdp2021 · 14 days ago
> they don't see individual letters

Yet they seem to, judging from many other tests (character corrections or manipulations in text, for example).

> The fact that reasoning models can count letters, even though they can't see individual letters

To a mind, every idea is a representation. But we want the processor to work reliably on those representations.

> If we don't allow a [mind] to base its reasoning on the training data it's seen, what should it base it on

On its reasoning and judgement over what it was told. You do not simply repeat what you heard; or, if you do, you state that it is what you heard (and provide sources).

> uses randomness

That is, in a way, a problem, a non-final fix: satisficing (Herb Simon) on random germs instead of constructing through a plan for full optimality.

By the expression «chancey guesses», though, I meant that guessing by chance when the right answer falls in a limited set ("how many letters in 'but'") is weaker corroboration than when the right answer falls in a richer set ("how many letters in this sentence").

kgeist · 14 days ago
Most people act on gut instinct first as well. Gut instinct = the first semi-random sample from experience (= training data). That's where all the logical fallacies come from. Take the bat-and-ball problem (a bat and a ball cost $1.10 together, and the bat costs $1.00 more than the ball): 95% of people give the intuitive but incorrect answer of 10 cents (the ball actually costs 5 cents), because most of the time people simply pattern-match too. It saves energy and works well 95% of the time. Just like reasoning LLMs, people can get to the correct answer if they increase their reasoning budget (but often they don't).

An LLM is a derivative of collective human knowledge, which is intrinsically unreliable itself. Most human concepts are ill-defined, fuzzy, very contextual. Human reasoning itself is flawed.

I'm not sure why people expect 100% reliability from a language model that is based on human representations which themselves cannot realistically be 100% reliable and perfectly well-defined.

If we want better reliability, we need a combination of tools: a "human mind model", which is intrinsically unreliable, plus a set of programmatic tools (say, like a human would use a calculator or a program to verify their results). I don't know if we can make something which works with human concepts and is 100% reliable in principle. Can a "lesser" mind create a "greater" mind, one free of human limitations? I think it's an open question.

kgeist commented on GPT-5: "How many times does the letter b appear in blueberry?"   bsky.app/profile/kjhealy.... · Posted by u/minimaxir
leptons · 14 days ago
Can it count all 3 letter words in a paragraph? Can it average number of words inside parentheses in a paragraph?
kgeist · 14 days ago
I asked Qwen3 32B (with reasoning) to count all the 3-letter words in your post. Its answer was:

  There are 3 three-letter words in the given text:

  "Can" (first instance)
  "all"
  "Can" (second instance)
Sounds about right.

Then I took this answer and asked it to compute the average number of words inside parentheses. The answer was:

  The average number of words in the parentheses from the given example is 2 words per parenthesis. 

  Here's the breakdown:
  1. "Can" (first instance): Parentheses contain "first instance" (2 words).
  2. "all": No parentheses, so excluded.
  3. "Can" (second instance): Parentheses contain "second instance" (2 words).

  Total words in parentheses: 2 + 2 = 4  
  Number of parentheses: 2  
  Average: 4 ÷ 2 = 2.
Correct again. Everything on the first try.
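
For what it's worth, a quick Python sanity check (the regexes and variable names are mine, not part of the model's output) gives the same numbers:

  import re

  post = ("Can it count all 3 letter words in a paragraph? "
          "Can it average number of words inside parentheses in a paragraph?")

  # Three-letter words, with punctuation stripped.
  words = re.findall(r"[A-Za-z]+", post)
  print([w for w in words if len(w) == 3])   # ['Can', 'all', 'Can'] -> 3 words

  answer = '"Can" (first instance)\n"all"\n"Can" (second instance)'

  # Average number of words per parenthesized group.
  groups = re.findall(r"\(([^)]*)\)", answer)
  print(sum(len(g.split()) for g in groups) / len(groups))   # 2.0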

kgeist commented on GPT-5: "How many times does the letter b appear in blueberry?"   bsky.app/profile/kjhealy.... · Posted by u/minimaxir
mdp2021 · 14 days ago
> Did you enable reasoning

Yep.

> gave me the correct answer

Try real-world tests that cannot be covered by training data or chancey guesses.

kgeist · 14 days ago
Counting letters is a known blind spot of LLMs because of how tokenization works in most of them: they don't see individual letters. I'm not sure it's a valid test for drawing far-reaching conclusions about their intelligence. It's like saying a blind person is an absolute dumbass just because they can't tell green from red.

The fact that reasoning models can count letters, even though they can't see individual letters, is actually pretty cool.
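
To illustrate what the model actually sees, here's a sketch using the tiktoken library (the exact split depends on the tokenizer, but a word like "blueberry" typically becomes a couple of multi-letter tokens rather than nine characters):

  import tiktoken  # pip install tiktoken

  enc = tiktoken.get_encoding("cl100k_base")
  tokens = enc.encode("blueberry")
  print(tokens)                               # a short list of token IDs
  print([enc.decode([t]) for t in tokens])    # e.g. ['blue', 'berry'], not letters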

>Try real-world tests that cannot be covered by training data

If we don't allow a model to base its reasoning on the training data it's seen, what should it base it on? Clairvoyance? :)

> chancey guesses

The default sampling in most LLMs uses randomness to feel less robotic and repetitive, so it’s no surprise it makes “chancey guesses.” That’s literally what the system is programmed to do by default.
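
As a toy sketch of what that default randomness looks like (the scores below are made up; real models sample the next token from a temperature-scaled distribution along these lines):

  import math, random

  logits = {"2": 2.0, "3": 1.4, "4": 0.3}   # made-up scores for candidate answers
  temperature = 0.8

  # Softmax with temperature: higher temperature flattens the distribution.
  weights = {t: math.exp(s / temperature) for t, s in logits.items()}
  total = sum(weights.values())
  probs = {t: w / total for t, w in weights.items()}

  # Greedy decoding always picks the top token; sampling sometimes doesn't.
  greedy = max(probs, key=probs.get)
  sampled = random.choices(list(probs), weights=list(probs.values()))[0]
  print(greedy, sampled, probs)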
