tmalsburg2 commented on <template>: The Content Template element   developer.mozilla.org/en-... · Posted by u/palmfacehn
tmalsburg2 · 4 hours ago
What problem is this trying to solve and does it actually succeed at solving it? I’m struggling to see the appeal given that the JS still needs to model the internal structure of the template in order to fill the slots.
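
For what it's worth, the coupling the question points at looks roughly like this (a minimal sketch; the template id, class name, and markup are invented for illustration):

```typescript
// Hypothetical markup assumed to exist in the page (invented for illustration):
// <template id="row-template"><li><span class="name"></span></li></template>

const tpl = document.getElementById("row-template") as HTMLTemplateElement;

function renderRow(name: string): DocumentFragment {
  // content.cloneNode(true) deep-copies the template's inert DOM
  const row = tpl.content.cloneNode(true) as DocumentFragment;
  // The calling code still has to know the template's internal structure
  // (here: that a ".name" element exists) in order to fill it in.
  const slot = row.querySelector(".name");
  if (slot) slot.textContent = name;
  return row;
}

document.querySelector("ul")?.appendChild(renderRow("Ada"));
```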


tmalsburg2 commented on Linda Yaccarino is leaving X   nytimes.com/2025/07/09/te... · Posted by u/donohoe
amelius · 2 months ago
Winner takes all.
tmalsburg2 · 2 months ago
Also inflation.
tmalsburg2 commented on Bamba: An open-source LLM that crosses a transformer with an SSM   research.ibm.com/blog/bam... · Posted by u/shallow-mind
og_kalu · 4 months ago
You are mentioning avenues that are largely for entertainment. Sure, you might not go back to re-attend for those. But if you will be tested or are doing research, are you really only looking at a large source once?
tmalsburg2 · 4 months ago
It’s so easy to come up with serious non-entertainment examples; I’m sure you don’t need my help finding them.
tmalsburg2 commented on Bamba: An open-source LLM that crosses a transformer with an SSM   research.ibm.com/blog/bam... · Posted by u/shallow-mind
og_kalu · 4 months ago
You can't always know what will be "relevant info" in the future. Even humans can't do this but whenever that's an issue, we just go back and re-read, re-watch etc.

None of these modern recurrent architectures have a way to do this.

tmalsburg2 · 4 months ago
How often do you go back and rewatch earlier parts of a movie? I hardly ever do this. In the cinema, the theater, or when listening to the radio, it’s simply impossible, and it still works.
tmalsburg2 commented on Bamba: An open-source LLM that crosses a transformer with an SSM   research.ibm.com/blog/bam... · Posted by u/shallow-mind
anentropic · 4 months ago
> they added another trillion tokens and shrank the model from 18 GB to 9 GB through quantization, reducing its bit width from Mamba2’s 16-bit floating-point precision to 8-bits.

This makes it sound like what they call "Bamba-9B" is actually an 18B model quantised to 8 bits.

I thought generally we were naming models "nB" by their number of params and treating quantisation as a separate concern. Are there any other models that instead treat the name as an indicative memory requirement?

Is this an attempt to hide that it fares poorly vs other ~18B parameter models?

EDIT: no, I just misunderstood

tmalsburg2 · 4 months ago
Yeah, that's confusing, but the HuggingFace page says it has 9.78B parameters.

https://huggingface.co/ibm-ai-platform/Bamba-9B-fp8
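
For anyone checking the arithmetic, the quoted sizes are consistent with roughly 9.78B parameters rather than 18B (a rough weights-only estimate, ignoring overhead):

```typescript
// Weights-only back-of-the-envelope estimate for a ~9.78B-parameter model.
const params = 9.78e9;
const gib = 2 ** 30; // bytes per GiB

const fp16Size = (params * 2) / gib; // 16-bit floats: 2 bytes per parameter
const int8Size = (params * 1) / gib; // 8-bit quantization: 1 byte per parameter

console.log(fp16Size.toFixed(1)); // ~18.2, matching the quoted 18 GB
console.log(int8Size.toFixed(1)); // ~9.1, matching the quoted 9 GB
```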

tmalsburg2 commented on Bamba: An open-source LLM that crosses a transformer with an SSM   research.ibm.com/blog/bam... · Posted by u/shallow-mind
quantadev · 4 months ago
Not to be contrarian, but if the next-word prediction happens to be someone's name, a place, or something discussed in multiple places in the book, then often, yes, knowledge of the full plot of the book is "required" just to predict the next word, especially as you get to the middle or end of the book.

For example you could never fill in the last chapter of any good book without having knowledge of every previous chapter. Not highly detailed knowledge, but still knowledge.

tmalsburg2 · 4 months ago
Isn't this exactly the point of this model? No need to memorize everything (which makes transformers expensive), just keep the relevant info. SSMs are essentially recurrent models.
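
A toy illustration of that trade-off (scalar state and a simple decay, nothing like the actual Bamba recurrence): a transformer-style cache grows with the input, while a recurrent/SSM-style update folds everything seen so far into a fixed-size state.

```typescript
// Toy sketch only; not the real Bamba/Mamba math.

// Transformer-style: the cache of past tokens grows with sequence length.
function cacheEverything(tokens: number[]): number[] {
  const cache: number[] = [];
  for (const t of tokens) cache.push(t); // O(n) memory, full history available
  return cache;
}

// SSM/recurrent-style: fold each token into a fixed-size state.
function recurrentSummary(tokens: number[], decay = 0.9): number {
  let state = 0; // O(1) memory, only a compressed summary of the history
  for (const t of tokens) {
    state = decay * state + t;
  }
  return state;
}
```
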
tmalsburg2 commented on What Is Entropy?   jasonfantl.com/posts/What... · Posted by u/jfantl
tshaddox · 5 months ago
According to my perhaps naive interpretation of that, the "degree of surprise" would depend on at least three things:

1. the laws of nature (i.e. how accurately do the laws of physics permit measuring the system and how determined are future states based on current states)

2. one's present understanding of the laws of nature

3. one's ability to measure the state of a system accurately and compute the predictions in practice

It strikes me as odd to include 2 and 3 in a definition of "entropy."

tmalsburg2 · 5 months ago
OP is talking about information entropy. Nature isn't relevant there.
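
For reference, a minimal sketch of Shannon entropy: the "surprise" is defined entirely by the probabilities a model assigns, not by the physical system being modeled.

```typescript
// Shannon entropy in bits: H(p) = -Σ p_i · log2(p_i).
// It is a property of the probability distribution alone.
function entropyBits(p: number[]): number {
  return -p.reduce((h, pi) => (pi > 0 ? h + pi * Math.log2(pi) : h), 0);
}

console.log(entropyBits([0.5, 0.5]));   // 1 bit: a fair coin, maximal surprise
console.log(entropyBits([0.99, 0.01])); // ~0.08 bits: almost no surprise
```
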
tmalsburg2 commented on 23andMe files for bankruptcy to sell itself   reuters.com/business/heal... · Posted by u/healsdata
shreezus · 5 months ago
Even if you haven't personally used their service, if any close relatives have, they already have a sizable amount of information on your genome. They maintain the equivalent of "shadow profiles" (https://en.wikipedia.org/wiki/Shadow_profile) as part of their data model for "ancestry" modeling purposes - for example inferring a paternal haplogroup based on data uploaded by genetic relatives.

I can only hope at the end of the day their data doesn't end up in the wrong hands. It is their most valuable asset, and this is a way bigger deal than it seems.

tmalsburg2 · 5 months ago
What would even be the right hands in this case? Seems like almost any hands would be the wrong hands.

u/tmalsburg2 · Karma: 2480 · Cake day: February 27, 2014