xamat · 3 years ago
Thanks for the feedback, everyone. Here are the sources in case anyone wants to contribute (or fork): https://github.com/xamat/TransformerCatalog
adamnemecek · 3 years ago
I have recently written a paper on understanding transformer learning through the lens of coinduction and Hopf algebras.

https://arxiv.org/abs/2302.01834v1

The learning mechanism of transformer models has been poorly understood; however, it turns out that a transformer is like a circuit with feedback.

I argue that autodiff can be replaced with what the paper calls Hopf coherence.

Furthermore, if we view transformers as Hopf algebras, we can bring convolutional models, diffusion models, and transformers under a single umbrella.

I'm working on a next gen Hopf algebra based machine learning framework.

Join my Discord if you want to discuss this further: https://discord.gg/mr9TAhpyBW

amkkma · 3 years ago
Is there any hope of understanding this with just calc and linalg knowledge?
adamnemecek · 3 years ago
I think so. The main idea is Hopf coherence: the transformer/Hopf algebra updates its internal state in order to enforce the Hopf coherence formula (you can find it in the paper).
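
If you just want something concrete with linear algebra in hand: a Hopf algebra is a vector space H carrying a multiplication m with unit \eta, a comultiplication \Delta with counit \varepsilon, and an antipode S, and the defining compatibility (antipode) axiom is

    m \circ (S \otimes \mathrm{id}) \circ \Delta \;=\; \eta \circ \varepsilon \;=\; m \circ (\mathrm{id} \otimes S) \circ \Delta

(That is just the standard textbook condition tying the algebra structure (m, \eta) to the coalgebra structure (\Delta, \varepsilon); the exact Hopf coherence formula is spelled out in the paper.)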

The idea of streams (as in infinite lists) is related to this via coalgebras.
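
To make the stream/coalgebra connection concrete, here is a minimal Haskell sketch of standard coinduction (nothing here is code from the paper or from the framework): a coalgebra for the functor F x = (a, x) is just a seed plus a step function, and unfold maps it into the final coalgebra of infinite streams.

    -- An infinite stream of a's: the final coalgebra of F x = (a, x).
    data Stream a = Cons a (Stream a)

    -- A coalgebra is a seed type s together with a step s -> (a, s).
    -- unfold is the unique map from that coalgebra into Stream a;
    -- this uniqueness is what coinduction gives you.
    unfold :: (s -> (a, s)) -> s -> Stream a
    unfold step seed =
      let (a, seed') = step seed
      in  Cons a (unfold step seed')

    -- Example: the naturals, generated from the coalgebra n -> (n, n + 1).
    nats :: Stream Integer
    nats = unfold (\n -> (n, n + 1)) 0

    -- You can only ever observe a finite prefix of a stream.
    takeS :: Int -> Stream a -> [a]
    takeS 0 _           = []
    takeS n (Cons a as) = a : takeS (n - 1) as

    main :: IO ()
    main = print (takeS 5 nats)  -- prints [0,1,2,3,4]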

erichocean · 3 years ago
> Furthermore, if we view transformers as Hopf algebras, one can bring convolutional models, diffusion models and transformers under a single umbrella.

Have you written any more about this?

adamnemecek · 3 years ago
Look into the connection between diffusion and Hopf algebras.
erichocean · 3 years ago
Hi Adam, can you update your Discord invite? It's now invalid.
amatecha · 3 years ago
I was so certain this was discussing Transformers like, the action figures, and have never been so confused looking at both a link and the comments section on HN before. Especially considering: https://github.com/xamat/TransformerCatalog/blob/main/02-01.... I'm just going to keep scrolling now :'D
visarga · 3 years ago
> 2.5.9 ChatGPT is also more than a model since it includes extensions for Memory Store and retrieval similar to BlenderBot3

I don't think this claim is accurate. Some people have experimented with this idea, but it is not part of ChatGPT.

sva_ · 3 years ago
> 2.5.5 BERT

> Extension:It can be seen as a generalization of BERT and GPT in that it combines ideas from both in the encoder and decoder

I believe this is an error? The text is from the BART entry, and a space is missing after "Extension:".

visarga · 3 years ago
I have a hunch they used an LLM to compile the list.
DerSaidin · 3 years ago
It is a shame that Figures 5, 6, 7, and 8 break up the content of section 2.5 (Catalog) just to fit the figures onto pages.

Are pages even needed anymore?

h_lezzaik · 3 years ago
Good timing; I've been trying to compile a list like this myself to keep track of everything being released.
theredlancer · 3 years ago
Where are Cliffjumper and Ironhide?
zndr · 3 years ago
I'm glad I'm not the only one looking for a taxonomy of refugees from the great Cybertron wars
sircastor · 3 years ago
When I was younger I would often encounter mentions of electrical transformers and be quite disappointed when they weren't related to the toys or the series. Even in my 40s I still feel a bit of disappointment about it...
robertlagrant · 3 years ago
Don't get me started on when we learned about the Terminator on the moon.
