Why would you then assume the reasoning tokens will "faithfully" include hints supplied in the prompt? The model may or may not mention the hints, depending on whether its activations treat those hints as necessary to arrive at the answer. In their experiments, they found that the models mentioned those hints between 20% and 40% of the time. Naively, that sounds unsurprising to me.
Even in the second experiment, where they trained the model to use hints, the optimization targeted the answer, not the reasoning tokens. I am not surprised the models did not mention the hints, because they were never trained to mention them.
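To make that point concrete, here is a toy sketch of my own, not the paper's actual setup: if the reward only scores the final answer, the objective gives identical credit whether or not the reasoning ever mentions the hint, so there is no training pressure to include it.

```python
# Toy illustration (not Anthropic's actual training setup): a reward
# that only inspects the final answer cannot reinforce "mention the
# hint in the reasoning tokens", because those tokens never affect it.
def reward(reasoning_tokens: str, final_answer: str, correct_answer: str) -> float:
    # reasoning_tokens is ignored entirely; only the answer is scored.
    return 1.0 if final_answer.strip() == correct_answer else 0.0

print(reward("I used the hint from the prompt.", "42", "42"))  # 1.0
print(reward("No mention of the hint at all.", "42", "42"))    # 1.0, same reward
```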
That said, even if I come across as a reader unsurprised by the result, it is a good experiment: now we have some experimental results to lean on.
Kudos to Anthropic for continuing to study these models.
Memorization is not a panacea. I never found memorizing l33t code problems to be edifying. I think it's because those kinds of tight, self-referential, clever programs are far removed from the activity of writing applications. Most working programmers run into a novel algorithm problem only once or twice in a career. Application programming has more the flavor of a human-mediated graph traversal: the programmer has access to a node's local state and improvises movement and mutation using only that local state plus a rapidly decaying stack. That is, there is no well-defined sequence for any given real-world problem, only heuristics.
Whether it's leet code or anything else.
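To give the analogy some shape, here is a minimal Python sketch; the graph, node states, and heuristic are all invented for illustration. Each step is chosen from the current node's local state plus a small, rapidly decaying stack of recent context, heuristics rather than a fixed sequence.

```python
from collections import deque

# Toy model of the "human-mediated graph traversal" analogy: every
# name and number here is made up for the sketch.
GRAPH = {
    "login":   {"bugs": 2, "edges": ["session", "ui"]},
    "session": {"bugs": 0, "edges": ["db"]},
    "ui":      {"bugs": 1, "edges": ["login", "db"]},
    "db":      {"bugs": 3, "edges": []},
}

def improvised_walk(start: str, max_steps: int = 6) -> list[str]:
    recent = deque(maxlen=3)        # the rapidly decaying stack
    node, path = start, [start]
    for _ in range(max_steps):
        recent.append(node)
        here = GRAPH[node]
        if here["bugs"] > 0:        # mutate local state as we pass through
            here["bugs"] -= 1
        # Heuristic move: visit the neighbor not seen recently with the
        # most outstanding bugs; stop when nothing local suggests a move.
        candidates = [n for n in here["edges"] if n not in recent]
        if not candidates:
            break
        node = max(candidates, key=lambda n: GRAPH[n]["bugs"])
        path.append(node)
    return path

print(improvised_walk("login"))     # e.g. ['login', 'ui', 'db']
```

The only point of the sketch is the shape of the loop: every decision comes from local state and a short memory, which is exactly why there is no single sequence worth memorizing.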