lukebechtel commented on Hierarchical Modeling (H-Nets)   cartesia.ai/blog/hierarch... · Posted by u/lukebechtel
cubefox · 8 months ago
As Mamba didn't make it, will H-Nets replace Transformers?
lukebechtel · 8 months ago
It's meant to replace the BPE tokenizer piece, so it isn't a full Language Model by itself.

In fact, Gu's blog post (linked in a comment below) mentions that they created a Mamba model that used this in place of the tokenizer.
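To make the "replaces the BPE tokenizer" point concrete, here is a rough illustration (not the actual H-Net API): a byte-level model consumes raw UTF-8 bytes directly, with no learned subword vocabulary.

```python
# Rough illustration (not the H-Net code): BPE pipelines map text to subword
# token IDs via a trained merge table; a byte-level pipeline just uses the
# 256 possible byte values as its "vocabulary".

text = "Hello, 世界"

byte_ids = list(text.encode("utf-8"))

print(byte_ids)
print(len(byte_ids))  # 13 bytes: 7 ASCII chars + two CJK chars at 3 bytes each
```

The model then has to learn where meaningful unit boundaries fall in that byte stream, which is exactly the chunking the paper makes end-to-end.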

marviel commented on Hierarchical Modeling (H-Nets)   cartesia.ai/blog/hierarch... · Posted by u/lukebechtel
miven · 8 months ago
As far as I understand the "chunking" of input bytes is learned completely end to end, so it's basically up to the model to figure out how to most efficiently delineate and aggregate the information from the inputs according to the patterns provided to it during training.

Since it's end to end, this lets them apply the process not only to raw byte encodings but to representations at basically any level, for example by stacking two stages of aggregation one after another.

So in principle they could either let the model do its thing on raw bytes of an image or alternatively maybe cut it up into tiny patches ViT-style and feed that to their H-Net.

I wonder how hard it would be to adapt chunking to work in 2D, and what that would even look like.

Some other notes on how multimodal inputs could be handled with this architecture are mentioned in Albert Gu's (one of the authors') blog, although only briefly; there's still much to figure out, it would seem: https://goombalab.github.io/blog/2025/hnet-future/#alternati...
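One hypothetical sketch of the idea above (not the paper's exact mechanism): learned chunk boundaries can be read off by comparing adjacent hidden states and starting a new chunk wherever consecutive representations disagree enough. The function name and threshold here are illustrative assumptions, not the authors' code.

```python
import numpy as np

def chunk_boundaries(hidden: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """hidden: (seq_len, dim) array of per-byte representations.
    Returns a boolean mask, True where a new chunk starts."""
    # Cosine similarity between each position and its predecessor.
    normed = hidden / np.linalg.norm(hidden, axis=-1, keepdims=True)
    sim = (normed[1:] * normed[:-1]).sum(axis=-1)          # (seq_len - 1,)
    # Low similarity to the previous position -> likely boundary.
    # Position 0 always starts a chunk.
    return np.concatenate([[True], sim < threshold])

rng = np.random.default_rng(0)
h = rng.normal(size=(10, 16))   # stand-in for learned byte embeddings
mask = chunk_boundaries(h)
print(mask.shape, mask[0])
```

Since nothing here is byte-specific, the same boundary test would apply to any level of representation, which is what makes the stacking (and maybe a 2D variant for images) plausible.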

marviel · 8 months ago
Thanks for sharing; this blog post is a great speculative deep-dive.
marviel commented on Hierarchical Modeling (H-Nets)   cartesia.ai/blog/hierarch... · Posted by u/lukebechtel
gdiamos · 8 months ago
How does it handle images?
marviel · 8 months ago
It mentions native multimodality somewhere in either the arXiv paper or the post -- seems like it might handle it well?
lukebechtel commented on Hierarchical Modeling (H-Nets)   cartesia.ai/blog/hierarch... · Posted by u/lukebechtel
cs702 · 8 months ago
I've only skimmed the paper, but it looks interesting and credible, so I've added it to my reading list.

Thank you for sharing on HN!

---

EDIT: The hierarchical composition and routing aspects of this work vaguely remind me of https://github.com/glassroom/heinsen_routing/ but it has been a while since I played with that. UPDATE: After spending a bit more time on the OP, it's different, but the ideas are related, like routing based on similarity.

lukebechtel · 8 months ago
No problem! I'm still parsing it myself, but it seems promising in theory, and the result curves are impressive.
lukebechtel commented on Hierarchical Modeling (H-Nets)   cartesia.ai/blog/hierarch... · Posted by u/lukebechtel
lukebechtel · 8 months ago
> H-Net demonstrates three important results on language modeling:

> 1. H-Nets scale better with data than state-of-the-art Transformers with BPE tokenization, while learning directly from raw bytes. This improved scaling is even more pronounced on domains without natural tokenization boundaries, like Chinese, code, and DNA.

> 2. H-Nets can be stacked together to learn from deeper hierarchies, which further improves performance.

> 3. H-Nets are significantly more robust to small perturbations in input data like casing, showing an avenue for creating models that are more robust and aligned with human reasoning.

marviel commented on Kiro: A new agentic IDE   kiro.dev/blog/introducing... · Posted by u/QuinnyPig
NathanKP · 8 months ago
Hello folks! I've been working on Kiro for nearly a year now. Happy to chat about some of the things that make it unique in the IDE space. We've added a few powerful things that I think make it a bit different from other similar AI editors.

Specifically, I'm really proud of "spec driven development", which is based on the internal processes that software development teams at Amazon use to build very large technical projects. Kiro can take your basic "vibe coding" prompt and expand it into deep technical requirements, a design document (with diagrams), and a task list that breaks large projects down into smaller, more realistic chunks of work.

I've had a ton of fun not just working on Kiro, but also coding with Kiro. I've also published a sample project I built while working on Kiro. It's a fairly extensive codebase for an infinite crafting game, almost 95% AI coded, thanks to the power of Kiro: https://github.com/kirodotdev/spirit-of-kiro

marviel · 8 months ago
nice! Can't agree more on Vibe Speccing.

I wrote more about Spec Driven AI development here: https://lukebechtel.com/blog/vibe-speccing

marviel commented on A non-anthropomorphized view of LLMs   addxorrol.blogspot.com/20... · Posted by u/zdw
grey-area · 8 months ago
On the contrary, anthropomorphism IMO is the main problem with narratives around LLMs: people are genuinely talking about them thinking and reasoning when they are doing nothing of the sort (actively encouraged by the companies selling them), and it is completely distorting discussions of their use and perceptions of their utility.
marviel · 8 months ago
how do you account for the success of reasoning models?

I agree these things don't think like we do, and that they have weird gaps, but to claim they can't reason at all doesn't feel grounded.

marviel commented on Usability barriers for liquid types [pdf]   catarinagamboa.github.io/... · Posted by u/azhenley
marviel · 8 months ago
I've often wanted something like "liquid types" but didn't know what it was called! Thanks for this.
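For readers who also hadn't seen the term: liquid (refinement) types attach logical predicates to types, e.g. {v:Int | v >= 0}, and an SMT solver checks them statically. Python can't do that, but a runtime-checked refinement conveys the flavor; the names below are made up for illustration only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NonNegInt:
    """Stands in for the refinement type {v:int | v >= 0}."""
    value: int

    def __post_init__(self):
        # With real liquid types this check is discharged at compile time
        # by a solver; here it runs at construction instead.
        if self.value < 0:
            raise ValueError(f"refinement violated: {self.value} < 0")

def isqrt(n: NonNegInt) -> NonNegInt:
    # A liquid type checker would prove n.value >= 0 statically,
    # so the sqrt precondition can never be violated.
    return NonNegInt(int(n.value ** 0.5))

print(isqrt(NonNegInt(9)).value)  # 3
```

The usability barriers in the paper are about exactly this gap: writing the predicates and interpreting solver failures, not the underlying idea.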

u/marviel

Karma: 947 · Cake day: November 3, 2015
About
lukebechtel.com

Say hi : luke (at) lukebechtel.com

Current:
- Founder @ Reasonote (https://reasonote.com)
- AI/ML Consulting @ Positive Sum Products (https://positivesumproducts.com)

Former:
- Principal AI/ML Eng. @ Regscale (https://regscale.com)
- Dir. of Eng. @ Revaly (now Cadchat) (Acq. in 2023)
- Cofounder @ Collider Inc. (Acq. in 2022)
