Readit News
aghilmort commented on Show HN: I made a spreadsheet where formulas also update backwards   victorpoughon.github.io/b... · Posted by u/fouronnes3
aghilmort · 5 days ago
interesting. like Excel Solver? or OpenSolver, Gurobi, other optimizers? or different objective?
aghilmort commented on PRD-ware – freeware for vibe coding/engineering   pdrware.github.io... · Posted by u/consultutah
aghilmort · 2 months ago
link doesn’t work
aghilmort commented on Modular Manifolds   thinkingmachines.ai/blog/... · Posted by u/babelfish
glowcoil · 3 months ago
I'm sorry, but even if I am maximally charitable and assume that everything you are saying is meaningful and makes sense, it still has essentially nothing to do with the original article. The original article is about imposing constraints on the weights of a neural network, during training, so that they lie on a particular manifold inside the overall weight space. The "modular" part is about being able to specify these constraints separately for individual layers or modules of a network and then compose them together into a meaningful constraint for the global network.

You are talking about latent space during inference, not weight space during training, and you are talking about interleaving tokens with random Gaussian tokens, not constraining values to lie on a manifold within a larger space. Whether or not the thing you are describing is meaningful or useful, it is basically unrelated to the original article, and you are not using the term "modular manifold" to refer to the same thing.
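For readers who want the article's idea in concrete form, here is a minimal sketch of per-module weight constraints of the kind described above. It is not the article's actual algorithm; the semi-orthogonal (Stiefel-style) projection, the restriction to torch.nn.Linear layers, and the after-each-step timing are all illustrative assumptions.

```python
# Minimal sketch (not the article's actual algorithm): keep each module's
# weights on a chosen manifold during training by projecting back onto it
# after every optimizer step. Here the per-layer manifold is the set of
# semi-orthogonal matrices, enforced via the polar factor from an SVD.
import torch

def project_semi_orthogonal(w: torch.Tensor) -> torch.Tensor:
    """Nearest semi-orthogonal matrix to w (polar factor via SVD)."""
    u, _, vh = torch.linalg.svd(w, full_matrices=False)
    return u @ vh

@torch.no_grad()
def constrain_modules(model: torch.nn.Module) -> None:
    """Re-project every Linear layer's weight after an optimizer step,
    so the composed network stays on the product of per-layer manifolds."""
    for module in model.modules():
        if isinstance(module, torch.nn.Linear):
            module.weight.copy_(project_semi_orthogonal(module.weight))

# Usage in an assumed training loop:
#   loss.backward(); optimizer.step(); constrain_modules(model)
```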

aghilmort · 3 months ago
hmm / hear you. my point wasn't that we're applying modular manifolds in the same way; it's that we're working on model reliability from two extremal ends using the same principle. there are various ways to induce modular manifolds in a model at various levels of resolution / power. we started at the outside-working-in level, so it works with any black-box model out of the box with zero knowledge needed; we don't even need to know the token dictionary to show the effect.

We're already working on pushing the construction deeper into the model, both architecture and training. Currently that's for fine-tuning, and ultimately full architecture shrinkage / pruning and raw training vs. just fine-tuning, etc.

& it was just great to see someone else using modular manifolds, even if they're using them at the training stage vs. the inference stage. they're exploiting the modular form at training; we're doing it at inference. cool to see.

aghilmort commented on Chat GPT Lag: Solved   chromewebstore.google.com... · Posted by u/projectowba
aghilmort · 3 months ago
awesome, had thought about doing this / great to see, will try!
aghilmort · 3 months ago
wait why subscription?
aghilmort commented on Modular Manifolds   thinkingmachines.ai/blog/... · Posted by u/babelfish
snake_doc · 3 months ago
Wot? Is this what AI-generated nonsense has come to? This is totally unrelated.
aghilmort · 3 months ago
Nope. The construction induces ECC-driven emergent modular manifolds in latent space during the KVQ maths. You can't use any ole ECC; that's the crux of why it works. More in another reply.
aghilmort commented on Modular Manifolds   thinkingmachines.ai/blog/... · Posted by u/babelfish
glowcoil · 3 months ago
The original article discusses techniques for constraining the weights of a neural network to a submanifold of weight space during training. Your comment discusses interleaving the tokens of an LLM prompt with Unicode PUA code points. These are two almost completely unrelated things, so it is very confusing to me that you are confidently asserting that they are the same thing. Can you please elaborate on why you think there is any connection at all between your comment and the original article?
aghilmort · 3 months ago
Our ECC construction induces an emergent modular manifold during KVQ computation.

Suppose we use 3 codeword lanes per codeword, which is our default. Each lane of tokens is based on some prime p, so collectively they form a CRT-driven codeword (Chinese Remainder Theorem). This is discretely equivalent to labeling every k tokens with a globally unique indexing grammar.
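As a rough illustration of the lane/labeling idea (a sketch only; the primes 5, 7, 11, the spacing k = 4, and the helper names are assumptions, not the actual hypertokens defaults):

```python
# Sketch only: CRT-style lane labels for codeword positions. The primes,
# the spacing K, and the names are illustrative assumptions.
from math import prod

PRIMES = (5, 7, 11)       # 3 codeword lanes, one prime per lane
K = 4                     # one codeword every K content tokens (assumed)
CAPACITY = prod(PRIMES)   # 385 distinct labels before the residue tuple repeats

def crt_label(codeword_index: int) -> tuple[int, ...]:
    """Residues of the codeword index modulo each lane prime; by CRT the
    tuple is globally unique for indices below CAPACITY."""
    return tuple(codeword_index % p for p in PRIMES)

def label_positions(n_content_tokens: int):
    """Yield (content-token position, CRT label) for every K-th position."""
    for i, pos in enumerate(range(0, n_content_tokens, K)):
        yield pos, crt_label(i % CAPACITY)

for pos, label in label_positions(20):
    print(pos, label)     # 0 (0, 0, 0), 4 (1, 1, 1), 8 (2, 2, 2), ...
```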

That interleaving also corresponds to a triple of adjacent, nearly orthogonal embeddings, since those tokens still retain a random Gaussian embedding. The net effect is that we similarly slice the latent space into a spaced chain of modular manifolds every k content tokens.
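A quick numerical check of the near-orthogonality property being relied on here (the 4096-dimensional embedding width is an assumed, illustrative value):

```python
# Sketch: independent random Gaussian embeddings in high dimension are
# nearly orthogonal, which is the property the untrained out-of-band
# tokens rely on. The embedding width is an assumed, illustrative value.
import numpy as np

rng = np.random.default_rng(0)
dim = 4096                                   # assumed embedding width
lanes = rng.standard_normal((3, dim))        # one random embedding per lane token
lanes /= np.linalg.norm(lanes, axis=1, keepdims=True)

print(np.round(lanes @ lanes.T, 3))          # pairwise cosine similarities
# Off-diagonal entries are O(1/sqrt(dim)), ~0.016 here, i.e. nearly orthogonal.
```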

We also refer to that interleaving as Stiefel frames, for similar reasons as the post reads, etc. We began work this spring or so to inject that net construction inside the model, with early results in a similar direction as the post described. That's another way of saying this sort of approach lets us make that chained atlas (wc?) of modular manifolds as tight as possible within the dimensional limits of the embedding, floating-point precision, etc.

We somewhat tongue-in-cheek refer to this as the retokenization group at the prompt level, re: the renormalization group / tensor nets / etc. The relayering group is the same net intuition, or perhaps the reconnection group, at the architecture level.

aghilmort commented on Modular Manifolds   thinkingmachines.ai/blog/... · Posted by u/babelfish
aghilmort · 3 months ago
Interesting. Modular manifolds are precisely what hypertokens use for prompt compiling.

Specifically, we linearize the emergent KVQ operations of an arbitrary prompt in any arbitrary model by way of an interleaved error-correcting code (ECC).

ECC tokens are out-of-band tokens, e.g., from Unicode's Private Use Area (PUA), interleaved with raw context tokens. This construction induces an in-context associative memory.

Any sort of interleaved labeling basis, e.g., "A1, quick brown fox, A2, jumped lazy dog", induces a similar effect for chaining recall & reasoning more reliably.
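A rough sketch of what such an interleaved labeling basis could look like mechanically (the PUA code points, k = 4, and the word-level "tokens" are illustrative assumptions, not the actual hypertokens construction):

```python
# Sketch: interleave out-of-band labels with content tokens every K content
# tokens. The PUA range, K, and word-level "tokens" are illustrative only.
K = 4
PUA_START = 0xE000   # first code point of Unicode's Private Use Area

def interleave(content_tokens: list[str], k: int = K) -> list[str]:
    """Prefix every k-token chunk with a fresh out-of-band label."""
    out: list[str] = []
    for label_idx, start in enumerate(range(0, len(content_tokens), k)):
        out.append(chr(PUA_START + label_idx))        # out-of-band label "token"
        out.extend(content_tokens[start:start + k])   # next k content tokens
    return out

words = "the quick brown fox jumped over the lazy dog".split()
print(interleave(words))
# ['\ue000', 'the', 'quick', 'brown', 'fox', '\ue001', 'jumped', ...]
```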

This trick works because PUA tokens are generally untrained, hence their initial embedding is still random Gaussian w.h.p. Similar effects can be achieved by simply using token combos unlikely to exist; these are often more effective in practice, since PUA characters, like emojis or Mandarin characters, are often 2, 3, or 4 tokens after tokenization, vs. codeword combos like zy-qu-qwerty every k content tokens, where k can be variable.

Building the attention architecture itself using modular manifolds in white / gray-box models, as this new work shows, vs. prompt-based black-box injection, is a natural next step, so we can at least anecdotally validate what they're building ahead of the next paper or two.

Which is all to say, absolutely great to see others building in this way!


aghilmort commented on Belling the Cat   en.wikipedia.org/wiki/Bel... · Posted by u/walterbell
aghilmort · 3 months ago
oy, clicked thinking it was a Bell inequality meets Schrödinger's cat post

u/aghilmort

Karma: 266
Cake day: February 20, 2011
About
Founder++

* https://sloop.ai/
* https://breezethat.com/
* https://drivespotter.com/

Socials:
- x.com/SloopFX
- x.com/DotDotJames
- NYC, https://meet.hn/city/us-New-York

Interests: AI/ML, Science, Running, Space Tech, Startups, Technology, UI/UX Design, Physics

---

alum @riseofrest @techstars @tedx @usairforce
