Despite determinism, we still do not understand LLMs.
A really simple abstraction in mathematics is that of a numeric base (e.g. base 10) for representing numbers. Being able to use the symbol 3 is much more useful than needing to write III. Of course, numbers themselves are an abstraction: perhaps you and I can reason about 3 and 7 and 10,000 in a vacuum, but young children, or people who have never been exposed to numbers without units, struggle to understand them. Seven… what? Dogs? Bottles? Days? Numbers are an abstraction, and Arabic digits are a particular abstraction on top of that.
Without that abstraction, we would have insufficient tools to do more complex things such as, say, subtracting 1 from 1,000,000,000. This is a problem that most 12-year-olds can solve, but the greatest mathematicians of the Roman Empire could not, because they did not have the right abstractions.
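To make the point concrete, here is a minimal sketch of the borrow procedure that positional notation makes mechanical. The function name `decrement` is hypothetical, and working on a string of digits is just an assumption chosen to keep each step visible; it illustrates the idea, not how anyone should do arithmetic in practice.

```python
def decrement(decimal: str) -> str:
    """Subtract 1 from a positive base-10 number given as a digit string."""
    if not decimal.isdigit() or set(decimal) <= {"0"}:
        raise ValueError("expected a positive integer written in base 10")

    digits = [int(ch) for ch in decimal]
    i = len(digits) - 1

    # Borrow: every trailing 0 becomes 9 until we reach a nonzero digit.
    while digits[i] == 0:
        digits[i] = 9
        i -= 1
    digits[i] -= 1

    # Drop a leading zero left behind by the borrow (e.g. "10" -> "09").
    return "".join(str(d) for d in digits).lstrip("0") or "0"


result = decrement("1000000000")
```

The whole procedure is a handful of local digit rules, which is exactly what the positional abstraction buys you.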
So if there are abstractions that enable us to solve problems that were formerly impossible, this means there is something more going on than “hiding information”. In fact, this is what Dijkstra (a mathematician by training) meant when he said:
> The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise.
When I use open(2), it’s because I’m operating at the semantic level of files. It’s not sensible to think of a “file” at a lower level: would it be on disk? In memory? What about socket files? But a “file” isn’t a real thing, it’s an abstraction created by the OS. We can operate on files, these made up things, and we can compose operations together in complex, useful ways. The idea of a file opens new possibilities for things we can do with computers.
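As a rough illustration of operating at the "file" level, the sketch below runs the same line-counting function over a file on disk, an in-memory buffer, and a pipe. The helper `count_lines` is hypothetical, and `io.BytesIO` is Python's in-memory stand-in for a file rather than something the OS provides, so treat this as an analogy, not a demonstration of `open(2)` itself.

```python
import io
import os
import tempfile


def count_lines(f) -> int:
    """Count lines in any binary file object, wherever its bytes live."""
    return sum(1 for _ in f)


data = b"one\ntwo\nthree\n"

# 1. A temporary file backed by the filesystem.
with tempfile.TemporaryFile() as disk_file:
    disk_file.write(data)
    disk_file.seek(0)
    disk_count = count_lines(disk_file)

# 2. A buffer that lives only in memory.
memory_count = count_lines(io.BytesIO(data))

# 3. A pipe: bytes written at one end are read from the other.
read_fd, write_fd = os.pipe()
os.write(write_fd, data)
os.close(write_fd)  # closing the write end lets the reader see end-of-file
with os.fdopen(read_fd, "rb") as pipe_file:
    pipe_count = count_lines(pipe_file)
```

`count_lines` never asks where the bytes come from; it only relies on the operations that the idea of a file guarantees.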
I hope that explanation helps!
To continue with the idea of numbers, let’s say you asked someone to add 3 and 5. Is that encapsulation? What information are you hiding? You are not asking them to add coins or meters or reindeer. 3 and 5 are values independent of any underlying information. The numbers aren’t encapsulating anything.
Encapsulation is different. When you operate a motor vehicle, you concern yourself with the controls presented. This allows you, as the operator, to only need a tiny amount of knowledge to interact with an incredibly complex machine. The details have been encapsulated. There may be particular abstractions present, such as the notions of steering, acceleration, and braking, but the way you interact with these will differ from vehicle to vehicle. Additionally, encapsulation is not concerned with the idea of steering; it is concerned with how to present steering in this specific case.
The two ideas are connected because using an abstraction in software often involves encapsulation. But they should not be conflated, or the likely result is bad abstractions and unwieldy encapsulation.
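One way to picture the distinction in code, as a sketch only: the names `Steerable`, `Car`, `Boat`, and `turn` are hypothetical, and the 35-degree steering lock is a made-up number. The protocol stands for the abstraction (the idea of steering), while each class encapsulates how one particular vehicle presents it.

```python
from typing import Protocol


class Steerable(Protocol):
    """The abstraction: the idea of steering, independent of any vehicle."""

    def steer(self, degrees: float) -> None: ...
    def heading(self) -> float: ...


class Car:
    """Encapsulation: the controls hide the steering lock and internal state."""

    _MAX_LOCK = 35.0  # made-up limit on how far one steering input can turn

    def __init__(self) -> None:
        self._heading = 0.0  # internal detail, not part of the controls

    def steer(self, degrees: float) -> None:
        clamped = max(-self._MAX_LOCK, min(self._MAX_LOCK, degrees))
        self._heading = (self._heading + clamped) % 360

    def heading(self) -> float:
        return self._heading


class Boat:
    """A different encapsulation of the same abstraction."""

    def __init__(self) -> None:
        self._rudder = 0.0  # internal detail
        self._heading = 0.0

    def steer(self, degrees: float) -> None:
        self._rudder = -degrees  # the rudder swings opposite to the turn
        self._heading = (self._heading + degrees) % 360

    def heading(self) -> float:
        return self._heading


def turn(vehicle: Steerable, degrees: float) -> float:
    """Written purely against the abstraction; knows nothing about internals."""
    vehicle.steer(degrees)
    return vehicle.heading()


car_heading = turn(Car(), 90.0)    # limited by the car's steering lock
boat_heading = turn(Boat(), 90.0)  # no such limit in this toy boat
```

`turn` works at the semantic level of steering. What each vehicle hides, and how it hides it, is a separate design decision.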
> You are thinking of assembly language which is a different thing. Initially there was no assembler, someone had to write one.
This is why I specifically mention opcodes. I've actually written assemblers! And...there's not much to them. It's mostly just translating the names the datasheet gives the opcodes back into the opcodes themselves, with a few human niceties. ;)
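For a sense of how little there is to it, here is a toy sketch of that name-to-opcode translation. The instruction set in `OPCODES` is entirely made up (it does not correspond to any real datasheet), and the only "human niceties" are `;` comments, case-insensitive mnemonics, and hex or decimal operands.

```python
# Hypothetical mnemonic -> opcode table, standing in for a real datasheet.
OPCODES = {"NOP": 0x00, "LDA": 0x01, "ADD": 0x02, "STA": 0x03, "HLT": 0xFF}


def assemble(source: str) -> bytes:
    """Translate mnemonics and one-byte operands into machine code bytes."""
    out = bytearray()
    for line in source.splitlines():
        line = line.split(";", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        mnemonic, *operands = line.split()
        out.append(OPCODES[mnemonic.upper()])
        # int(text, 0) accepts both "5" and "0x10"; operands must fit a byte.
        out.extend(int(operand, 0) for operand in operands)
    return bytes(out)


program = assemble(
    """
    lda 0x10   ; load from address 0x10
    add 5      ; add an immediate value
    sta 0x11   ; store the result
    hlt
    """
)
```

A real assembler adds labels, addressing modes, and error reporting, but the core remains a lookup table.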
> consider the same situation with 5 sensors, X of which have failed
Ohhhhhhhh, ok. I kind of see. Unfortunately, I don't see the difference between abstraction and encapsulation here. I see the abstraction, speed, as being the encapsulation of a set of sensors, ignoring irrelevant values.
I feel like I'm almost there. I may have edited my previous comment after you replied. My "no procrastination" setting kicked in, and I couldn't see.
I don't see how the two halves of "The former is about semantic levels, the latter about information hiding." are different. In my mind, semantic levels exist as compression and encapsulation of information. If you're saying encapsulation means "black box" then that could make sense to me, but "inaccessible" isn't part of the definition, just "containment".
For a one-off, this is fine. For anything maintainable, that needs to survive the realities of time, this is truly terrible.
Related, my friend works in a performance-critical space. He can't use abstractions, because the direct, bare metal, "exact fit" implementation will perform best. They can't really add features, because it'll throw the timing of other things off too much, so they usually have to re-architect. But that's the reality of their problem space.
That said, even then, there are a lot of business cases where you are not constrained by the time required to sort or traverse a custom data structure, because you spend more time waiting for an answer from a database (in which case you may want to tune the db or add a cache), or for a server or another user, a third party library, or a payment processing endpoint.
There are also use cases (think offline mobile apps) where the number of concurrent requests is basically 1, because each offline app serves a single user, so as long as you can process stuff before a user notices the app is sluggish (hundreds of milliseconds at least) you're good.
What do you do with those 4 thousand req/s? That's what makes the difference between "processing everything independently is fast enough for our purposes", "we need to optimize database or network latency", or "we need to optimize our data structures".
If a stretch of road was used by an average of 10 cars per minute over a 24-hour period, is it congested?
In both cases, you need more specific traffic data to size things properly.
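A quick sketch of why the average alone cannot answer the question: the two made-up traffic profiles below have the same daily average of 10 cars per minute but very different peaks. The rush-hour windows and the figure of 120 cars per minute are assumptions picked only so that the totals match.

```python
MINUTES_PER_DAY = 24 * 60

# Profile A: perfectly even traffic, 10 cars every minute.
uniform = [10] * MINUTES_PER_DAY

# Profile B: the same daily total squeezed into two one-hour rush periods.
bursty = [0] * MINUTES_PER_DAY
for start_hour in (8, 17):
    for minute in range(start_hour * 60, (start_hour + 1) * 60):
        bursty[minute] = 120


def average(per_minute: list[int]) -> float:
    return sum(per_minute) / len(per_minute)


uniform_avg, uniform_peak = average(uniform), max(uniform)
bursty_avg, bursty_peak = average(bursty), max(bursty)
```

Both profiles report "10 cars per minute", yet the second needs to be sized for twelve times that during rush hour. The same applies to a requests-per-second figure.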
You can take it one step further: imagine you live in a smallish country (10 million people).
If your market share is 10% of the population and each user makes 1 request per day, that is only about 10 requests per second on average.
10% is a large market share for everyday use. So you can assume a 1% market share and 10 requests per user per day, and it will still be only about 10 reqs/sec.
In fact, with a 1% market share of 10 million people, the number of requests each user makes per day is roughly the number of requests your server will get per second, on average.
There are a lot of businesses in small countries that never need to scale (or businesses in narrow sectors, e.g. a lot of B2B).
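The back-of-the-envelope arithmetic above can be checked with a few lines. The function name `average_rps` is hypothetical, and it assumes requests are spread evenly over the day, which real traffic is not.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_rps(population: int, market_share: float,
                requests_per_user_per_day: float) -> float:
    """Average requests per second, assuming an even spread over the day."""
    daily_requests = population * market_share * requests_per_user_per_day
    return daily_requests / SECONDS_PER_DAY


ten_percent_one_request = average_rps(10_000_000, 0.10, 1)   # about 11.6
one_percent_ten_requests = average_rps(10_000_000, 0.01, 10)  # about 11.6
one_percent_one_request = average_rps(10_000_000, 0.01, 1)    # about 1.16
```

With 100,000 users, each daily request per user adds a little over one request per second, which is where the rule of thumb comes from.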
But it's also worth noting that whenever you make an abstraction you run the risk that it's NOT going to turn out to increase clarity and precision, either due to human limitation or due to changes in the problem. The author's caution is warranted because in practice this happens a lot. I would rather work with code that has insufficient abstraction than inappropriate abstraction.
I think a lot of becoming a good programmer is about developing the instincts around when it’s worth it and in what direction. To add to the complexity, there is a meta dimension of how much time you should spend trying to figure it out vs. just implementing something and correcting it later.
As an aside, I’m really curious to see how much coding agents shift this balance.
Clarity is likely the most important aspect of making maintainable, extendable code. Of course, it’s easy to say that, it’s harder to explain what it looks like in practice.
I wrote a book that attempts to teach how to write clear code: https://elementsofcode.io
> 11. Abstractions don’t remove complexity. They move it to the day you’re on call.
This is true for bad abstractions.
> The purpose of abstraction is not to be vague, but to create a new semantic level in which one can be absolutely precise. (Dijkstra)
If you think about abstraction in those terms, the utility becomes apparent. We abstract CPU instructions into programming languages so we can think about our problems in more precise terms, such as data structures and functions.
It is obviously useful to build abstractions to create even higher levels of precision on top of the language itself.
The problem isn’t abstraction, it is clarity of purpose. Too often we create complex behavioral models before actually understanding the behavior we are trying to model. It’s like a civil engineer trying to build a bridge in a warehouse without examining the terrain where it must be placed. When it doesn’t fit correctly, we don’t blame the concept of bridges.
In other words, some people actually write like this.