I like this. It very much falls into "make bad states unrepresentable".
The issue I see with this approach is when developers stop at this first level of type implementation: everything is a type, nothing works well together, tons of types seem to be subtle permutations of each other, and things get hard to reason about.
In systems like that I would actually rather be writing a weakly typed dynamic language like JS, or a strongly typed dynamic language like Elixir. However, if the developers continue pushing logic into type-controlled flows, e.g. moving conditional logic into union types with pattern matching, leveraging delegation, etc., the experience becomes pleasant again. Just as an example (probably not the actual best solution), the "DewPoint" function could take either temperature type and just work.
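As a sketch of that idea (the type names and the Magnus-formula constants here are my own assumptions, not anything from the article), a dew-point function that accepts either unit might look like:

```typescript
// Hypothetical temperature types; a union lets one function take either.
type Celsius = { unit: "C"; value: number };
type Fahrenheit = { unit: "F"; value: number };
type Temperature = Celsius | Fahrenheit;

// Normalize once, at the boundary, so callers never convert by hand.
const toCelsius = (t: Temperature): Celsius =>
  t.unit === "C" ? t : { unit: "C", value: (t.value - 32) * 5 / 9 };

// Magnus-formula approximation of dew point; accepts either unit.
function dewPoint(temp: Temperature, relativeHumidity: number): Celsius {
  const tC = toCelsius(temp).value;
  const gamma =
    Math.log(relativeHumidity / 100) + (17.62 * tC) / (243.12 + tC);
  return { unit: "C", value: (243.12 * gamma) / (17.62 - gamma) };
}
```

At 100% relative humidity the dew point equals the air temperature, which makes a handy sanity check for the sketch.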
Yep. For this reason, I wish more languages supported bounded integers. E.g., rather than saying x: u32, I want to be able to use the type system to constrain x to the range [0, 10).
This would allow for some nice properties. It would also enable a bunch of small optimisations in our languages that we can't have today. E.g., I could make an integer that must fall within my array's bounds; then I don't need to do bounds checking when I index into my array. It would also allow a lot more peephole optimisations to be made with Option.
Weirdly, Rust already kinda supports this within a function thanks to LLVM magic. But it doesn't support it for values passed between functions.
I proposed a primitive for this in TypeScript a couple of years ago [1].
While I'm not entirely convinced myself whether it is worth the effort, it offers the ability to express "a number greater than 0". Using type narrowing and intersection types, open/closed intervals emerge naturally from that. Just check `if (a > 0 && a < 1)` and its type becomes `(>0)&(<1)`, so the interval (0, 1).
```nim
type
  Foo = range[1 .. 10]
  Bar = range[0.0 .. 1.0]  # float works too

var f: Foo = 42      # Error: cannot convert 42 to Foo = range 1..10(int)
var p = Positive 22  # Positive and Natural types are pre-defined
```
This can be done in TypeScript. It's not super well known because of TypeScript's association with frontend work and JavaScript, but TypeScript has one of the most powerful type systems of any mainstream language.
Among popular languages like Go, Rust, or Python, TypeScript has the most powerful type system.
How about a type with a number constrained between 0 and 10? You can already do this in TypeScript.
You can even programmatically define functions at the type level, so you can create a type-level function that outputs a type covering 0 up to (but not including) N.
```typescript
type Range<N extends number, A extends number[] = []> =
  A['length'] extends N ? A[number] : Range<N, [...A, A['length']]>;
```
The issue here is that it's a bit awkward. You want these types to compose, right? If I add two constrained numbers, say one with a max value of 3 and another with a max value of 2, the result should have a max value of 5. TypeScript doesn't support this out of the box with plain addition, but you can create a function that does it.
```typescript
// Build a tuple of length L
type BuildTuple<L extends number, T extends unknown[] = []> =
  T['length'] extends L ? T : BuildTuple<L, [...T, unknown]>;

// Add two numbers by concatenating their tuples
type Add<A extends number, B extends number> =
  [...BuildTuple<A>, ...BuildTuple<B>]['length'];

// Create a union: 0 | 1 | 2 | ... | N-1
type Range<N extends number, A extends number[] = []> =
  A['length'] extends N ? A[number] : Range<N, [...A, A['length']]>;

function addRanges<A extends number, B extends number>(
  a: Range<A>,
  b: Range<B>,
): Range<Add<A, B>> {
  // Casts are needed: TS can't see that the unresolved Range<A> is a number.
  return ((a as number) + (b as number)) as unknown as Range<Add<A, B>>;
}
```
The issue is that to create these functions you have to use tuples to do addition at the type level, and you need recursion as well. TypeScript recursion stops at around 100 levels, so there are limits.
Additionally, it's not intrinsic to the type system. You'd need Peano numbers built into the number system, and built in by default across the entire language, for this to work perfectly. That means the code inside the function body is not type checked; but if you assume that code is correct, then the function type-checks when composed with the other primitives of your program.
But wouldn't that also require code execution? For example, even though the compiler already knows the size of an array and could do a bounds check on direct assignment (arr[1] = 1), in some wild nested loop you could exceed the bounds in ways the compiler can't see.
Otherwise you could have type-level asserts more generally. Why stop at a range check when you could check a regex too? That makes the difficulty more clear.
For the simplest range case (pure assignment) you could just use an enum?
You can do this quite easily in Rust, but you have to overload operators to make your type make sense. That's also possible: you just need to define what type you get after dividing your type by a regular number (and vice versa, a regular number by your type), or what should happen when adding two of your types yields a sum higher than the maximum value. This is quite verbose, though it can be done with generics or macros.
ATS does this. Works quite well since multiplication by known factors and addition of type variables + inequalities is decidable (and in fact quadratic).
```
> 1 + "1"
(irb):1:in 'Integer#+': String can't be coerced into Integer (TypeError)
        from (irb):1:in '<main>'
        from <internal:kernel>:168:in 'Kernel#loop'
        from /Users/george/.rvm/rubies/ruby-3.4.2/lib/ruby/gems/3.4.0/gems/irb-1.14.3/exe/irb:9:in '<top (required)>'
        from /Users/george/.rvm/rubies/ruby-3.4.2/bin/irb:25:in 'Kernel#load'
        from /Users/george/.rvm/rubies/ruby-3.4.2/bin/irb:25:in '<main>'
```
The "stop at the first level of type implementation" is where I see codebases fail at this. The example of "I'll wrap this int as a struct and call it a UUID" is a really good start, and people pretty much always start there, but inevitably someone will circumvent the safety. They'll see a function that takes a UUID while they have an int, so they blindly wrap their int in UUID and move on. There's nothing stopping that UUID from failing to be actually universally unique, so suddenly code which relies on that assumption breaks.
This is where the concept of “Correct by construction” comes in. If any of your code has a precondition that a UUID is actually unique then it should be as hard as possible to make one that isn’t. Be it by constructors throwing exceptions, inits returning Err or whatever the idiom is in your language of choice, the only way someone should be able to get a UUID without that invariant being proven is if they really *really* know what they’re doing.
(Sub UUID and the uniqueness invariant for whatever type/invariants you want, it still holds)
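A minimal sketch of "correct by construction" in TypeScript (all names invented here): a branded type whose only public way in is a validating parser, so downstream code can't receive an unproven value.

```typescript
// The brand makes UUID a distinct type; plain strings are not assignable.
type UUID = string & { readonly __brand: "UUID" };

const UUID_RE =
  /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// The only exported constructor: validates, or refuses to produce a UUID.
function parseUUID(s: string): UUID | null {
  return UUID_RE.test(s) ? (s as UUID) : null;
}

// Downstream code can rely on the invariant having been proven up front.
function getAccount(id: UUID): string {
  return "account:" + id;
}

// getAccount("not-a-uuid");  // compile error: string is not UUID
const id = parseUUID("123e4567-e89b-12d3-a456-426614174000");
if (id) getAccount(id);       // ok: invariant checked exactly once
```

Here the check is only well-formedness, not actual uniqueness, but the shape is the same: whatever the invariant is, prove it at construction and let the type carry the proof.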
> This is where the concept of “Correct by construction” comes in.
This is one of the basic features of object-oriented programming that a lot of people tend to overlook these days in their repetitive rants about how horrible OOP is.
One of the key things OO gives you is constructors. You can't get an instance of a class without having gone through a constructor that the class itself defines. That gives you a way to bundle up some data and wrap it in a layer of validation that can't be circumvented. If you have an instance of Foo, you have a firm guarantee that the author of Foo was able to ensure the Foo you have is a meaningful one.
Of course, writing good constructors is hard because data validation is hard. And there are plenty of classes out there with shitty constructors that let you get your hands on broken objects.
But the language itself gives you direct mechanism to do a good job here if you care to take advantage of it.
Functional languages can do this too, of course, using some combination of abstract types, the module system, and factory functions as convention. But it's a pattern in those languages where it's a language feature in OO languages. (And as any functional programmer will happily tell you, a design pattern is just a sign of a missing language feature.)
I've recently been following red-green-refactor but instead of with a failing test, I tighten the screws on the type system to make a production-reported bug cause the type checker to fail before making it green by fixing the bug.
I still follow TDD-with-a-test for all new features, all edge cases and all bugs that I can't trigger failure by changing the type system for.
However, red-green-refactor-with-the-type-system is usually quick and can be used to provide hard guarantees against entire classes of bug.
I like this approach; there are often calls for increased testing on big systems, and what they really mean is increased rigor. Don't waste time testing what you can move into the compiler.
It is always great when something is so elegantly typed that I struggle to think of how to write a failing test.
What drives me nuts is when there are tests left around that basically test the compiler, tests that were never "red" and then "greened". It makes me wonder if there is some subtle edge case I am missing.
I found myself following a similar trajectory, without realizing that’s what I was doing. For a while it felt like I was bypassing the discipline of TDD that I’d previously found really valuable, until I realized that I was getting a lot of the test-first benefits before writing or running any code at all.
Now I just think of types as the test suite’s first line of defense. Other commenters who mention the power of types for documentation and refactoring aren’t wrong, but I think that’s because types are tests… and good tests, at almost any level, enable those same powers.
As you correctly mention: if you go for strong types in a library, you should go all the way.
And that means your strong type should provide clear functions for its conversion to certain other types, some of which you nearly always need, like conversion to a string or representation as a float/int.
The danger of that is of course that you provide a ladder over the wall you just built: instead of using the safe conversions, people go the shortcut route via the numeric representation and may forget the conversion factor. In that case I'd argue it is best to always represent temperature as one unit (Kelvin or Celsius, depending on the math you need to do with it) and then just add a .display(Unit::Fahrenheit) method that returns a string. If you really want to convert to TemperatureF for a calculation, you would have to use a dedicated method that converts from one type to the other.
One thing to consider as well is that you can mix up absolute values ("it is 28°C outside") and temperature deltas ("this is 2°C warmer than the last measurement"). If you're controlling high energy heaters mixing those up can ruin your day, which is why you could use different types for absolutes and deltas (or a flag within one type). Datetime libraries often do that as well (in python for example you have datetime for absolute and timedelta for relative time)
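A hedged sketch of that absolute-vs-delta split in TypeScript (type names made up for illustration): only the operations that make physical sense get a signature.

```typescript
// Absolute temperatures and temperature deltas as distinct types.
type Kelvin = { kind: "abs"; k: number };
type DeltaK = { kind: "delta"; k: number };

// absolute + delta -> absolute: the only addition that makes sense
const addDelta = (t: Kelvin, d: DeltaK): Kelvin =>
  ({ kind: "abs", k: t.k + d.k });

// absolute - absolute -> delta
const diff = (a: Kelvin, b: Kelvin): DeltaK =>
  ({ kind: "delta", k: a.k - b.k });

const setpoint: Kelvin = { kind: "abs", k: 293.15 };
const warmer = addDelta(setpoint, { kind: "delta", k: 2 });
// addDelta(setpoint, setpoint)  // compile error: an absolute is not a delta
```

There is intentionally no way to add two absolutes, which is exactly the heater-controlling mistake the comment describes.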
Union types!! If everything’s a type and nothing works together, start wrapping them in interfaces and define an über type that unions everything everywhere all at once.
Welcome to typescript. Where generics are at the heart of our generic generics that throw generics of some generic generic geriatric generic that Bob wrote 8 years ago.
Because they can’t reason with the architecture they built, they throw it at the type system to keep them in line. It works most of the time. Rust’s is beautiful at barking at you that you’re wrong. Ultimately it’s us failing to design flexibility amongst ever increasing complexity.
Remember when “Components” were “Controls” and you only had like a dozen of them?
Remember when a NN was only a few hundred thousand parameters?
As complexity increases with computing power, so must our understanding of it in our mental model.
However you need to keep that mental model in check, use it. If it’s typing, do it. If it’s rigorous testing, write your tests. If it’s simulation, run it my friend. Ultimately, we all want better quality software that doesn’t break in unexpected ways.
Union types are great, but alone they are not sufficient for many cases. For example, try to define a data structure that captures a classical evaluation tree.
You might go with:
```typescript
type Expression = Value | Plus | Minus | Multiply | Divide;

interface Value    { type: "value"; value: number; }
interface Plus     { type: "plus"; left: Expression; right: Expression; }
interface Minus    { type: "minus"; left: Expression; right: Expression; }
interface Multiply { type: "multiply"; left: Expression; right: Expression; }
interface Divide   { type: "divide"; left: Expression; right: Expression; }
```
And so on.
That looks nice, but when you try to pattern match on it and have your pattern matching return the types associated with the specific operation, it won't work. The reason is that TypeScript does not natively support GADTs. Libraries like ts-pattern use some tricks to at least get close.
And while this might not be very important for most application developers, it is very important for library authors, especially to make libraries interoperable with each other and extend them safely and typesafe.
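For what it's worth, plain evaluation over that union type-checks fine with ordinary narrowing (definitions repeated here so the snippet stands alone); the GADT limitation only bites when each arm must produce a different static type:

```typescript
type Expression = Value | Plus | Minus | Multiply | Divide;
interface Value    { type: "value"; value: number; }
interface Plus     { type: "plus"; left: Expression; right: Expression; }
interface Minus    { type: "minus"; left: Expression; right: Expression; }
interface Multiply { type: "multiply"; left: Expression; right: Expression; }
interface Divide   { type: "divide"; left: Expression; right: Expression; }

// A switch on the discriminant narrows each arm; the checker also
// verifies the switch is exhaustive over the union.
function evaluate(e: Expression): number {
  switch (e.type) {
    case "value":    return e.value;
    case "plus":     return evaluate(e.left) + evaluate(e.right);
    case "minus":    return evaluate(e.left) - evaluate(e.right);
    case "multiply": return evaluate(e.left) * evaluate(e.right);
    case "divide":   return evaluate(e.left) / evaluate(e.right);
  }
}

// (1 + 2) * 3
const expr: Expression = {
  type: "multiply",
  left: {
    type: "plus",
    left: { type: "value", value: 1 },
    right: { type: "value", value: 2 },
  },
  right: { type: "value", value: 3 },
};
// evaluate(expr) === 9
```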
An adjacent point is to use checked exceptions and to handle them appropriately to their type. I don't get why Java's checked exceptions were so maligned. They saved me so many headaches on a project where I enforced their use as tech lead.
Everyone hated me for a while, because it forced them to deal with more than just the happy path, but they loved it once they got into the rhythm of thinking about all the exceptional cases in the code flow. And the project was extremely robust, even though we were not particularly disciplined about unit testing.
I think most complaints about checked exceptions in Java ultimately boil down to how verbose handling exceptions in Java is. Every time the language forces you to handle an exception when you don't really need to, you hate it a bit more.
First, the library author cannot reasonably define what is and isn't a checked exception in their public API. That really is up to the decision of the client. This wouldn't be such a big deal if it weren't so verbose to handle exceptions though: if you could trivially convert an exception to another type, or even declare it as runtime, maybe at the module or application level, you wouldn't be forced to handle them in these ways.
Second, to signature brittleness, standard advice is to create domain specific exceptions anyways. Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose... see above.
Ultimately, I love checked exceptions. I just hate the ergonomics around exceptions in Java. I wish designers focused more on fixing that than throwing the baby out with the bathwater.
If only Java also provided an Either<L,R>-like type in the standard library...
Personally I use checked exceptions whenever I can't use Either<> and avoid unchecked ones like the plague.
Yeah, it's pretty sad the Java language designers just completely deserted exception handling. I don't think there's been any kind of improvement related to exceptions between Java 8 and 24.
> Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose
The problem just compounds, too. People start catching things that they can't handle from the functions they're calling. The callers upstream can't possibly handle an error from the code you're calling; they have no idea why it's being called.
I also hate IOException. It's so extremely unspecific; it's the worst way to do exceptions. Did the entire disk die, was the file just not found, or do I not have permission to write to it? IOException has no meaning.
Part of me secretly hopes Swift takes over because I really like its error handling.
I think checked exceptions were maligned because they were overused. I like that Java supports both checked and unchecked exceptions. But IMO checked exceptions should only be used for what Eric Lippert calls "exogenous" exceptions [1]; and even then most of them should probably be converted to an unchecked exception once they leave the library code that throws them. For example, it's always possible that your DB could go offline at any time, but you probably don't want "throws SQLException" polluting the type signature all the way up the call stack. You'd rather have code assuming all SQL statements are going to succeed, and if they don't your top-level catch-all can log it and return HTTP 500.
Put another way: errors tend to either be handled "close by" or "far away", but rarely "in the middle".
So Java's checked exceptions force you to write verbose and pointless code in all the wrong places (the "in the middle" code that can't handle and doesn't care about the exception).
It's fine to let exceptions percolate to the top of the call stack but even then you likely want to inform the user or at least log it in your backend why the request was unsuccessful. Checked exceptions force both the handling of exceptions and the type checking if they are used as intended. It's not a problem if somewhere along the call chain an SQLException gets converted to "user not permitted to insert this data" exception. This is how it was always meant to work. What I don't recommend is defaulting to RuntimeException and derivatives for those business level exceptions. They should still be checked and have their own types which at least encourages some discipline when handling and logging them up the call stack.
Sometimes I feel like I actually wouldn't mind having any function touching the database tagged as such. But checked exceptions are such a pita to deal with that I tend to not bother.
Setting aside the objections some have to exceptions generally: Checked exceptions, in contrast to unchecked, means that if a function/method deep in your call stack is changed to throw an exception, you may have to change many function (to at least denote that they will throw that exception or some exception) between the handler and the thrower. It's an objection to the ergonomics around modifying systems.
Think of the complaints around function coloring with async, how it's "contagious". Checked exceptions have the same function color problem. You either call the potential thrower from inside a try/catch or you declare that the caller will throw an exception.
And as with async, the issue is a) the lack of the ability to write generic code that can abstract over the async-ness or throw signature of a function and b) the ability to type erase asyncness (by wrapping with stackful coroutines) or throw signature (by converting to unchecked exceptions).
Incidentally, for exceptions, Java had (b), but for a long time didn't have (a) (although I think this changed?), leading to (b) being abused.
That's a valid point, but it sits somewhere on the "quick to write/change" vs "safe and validated" spectrum of the strictly vs loosely typed debate. Strictly typed systems are almost by definition more "brittle" when it comes to code editing, but that same strictness ensures that refactoring is usually less perilous than in loosely typed code.
> Checked exceptions, in contrast to unchecked, means that if a function/method deep in your call stack is changed to throw an exception, you may have to change many function (to at least denote that they will throw that exception or some exception) between the handler and the thrower.
That's the point! The whole reason for checked exceptions is to gain the benefit of knowing if a function starts throwing an exception that it didn't before, so you can decide how to handle it. It's a good thing, not a bad thing! It's no different from having a type system which can tell you if the arguments to a function change, or if its return type does.
> Setting aside the objections some have to exceptions generally: Checked exceptions, in contrast to unchecked, means that if a function/method deep in your call stack is changed to throw an exception, you may have to change many function (to at least denote that they will throw that exception or some exception) between the handler and the thrower. It's an objection to the ergonomics around modifying systems.
And if you change a function deep in the call stack to return a different type on the happy path? Same thing. Yet, people don't complain about that and give up on statically type checking return values.
I honestly think the main reason that some people will simultaneously enjoy using Result/Try/Either types in languages like Rust while also maligning checked exceptions is the mental model and semantics around the terminology. I.e., "checked exception" and "unchecked exception" are both "exceptions", so our brains lump those two concepts together; whereas returning a union type that has a success variant and a failure variant means our brains are more willing to lump the failure return and the successful return together.
To be fair, I do think it's a genuine design flaw to have checked and unchecked exceptions both named and syntactically handled similarly. The return type approach is a better semantic model for modelling expected business logic "failure" modes.
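A sketch of that return-type model in TypeScript (parsePort is an invented example, not from any library): the failure mode is just another variant of the return value, so the caller must inspect it before reaching the success value.

```typescript
// A minimal Result type as a discriminated union.
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E };

type ParseError = { kind: "parse"; input: string };

function parsePort(s: string): Result<number, ParseError> {
  const n = Number(s);
  return Number.isInteger(n) && n >= 0 && n <= 65535
    ? { ok: true, value: n }
    : { ok: false, error: { kind: "parse", input: s } };
}

// The checker forces the caller to look at the failure variant first,
// the same guarantee a checked exception gives, with lighter syntax.
const r = parsePort("8080");
const port = r.ok ? r.value : 80;
```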
Checked exceptions were a reasonable idea, but the Java library implementation & use of these was totally wrong.
Checked exceptions work well for occasional major "expectable" failures -- opening a file, making a network connection.
They work extremely poorly when required for ongoing access to or use of IO/network resources, since this forces failures which are rare and impossible to usefully recover from to be explicitly declared/caught/rethrown, with great verbosity and negative value added.
All non-trivial software is composition, so the idea of calling code "recovering" from a failure is at odds with encapsulation. What we end up with is business logic which can fail anywhere, can't recover anything, yet all middle layers -- not just the outer transaction boundary -- are forced to catch or declare these exceptions.
Requiring these "technical exceptions" to be pervasively handled is thus not just substantially invalid & pointless, but actually leads to serious rates of faulty error-handling. Measured experience in at least a couple of large codebases is that about 5-10% of catch clauses have fucked implementations either losing the cause or (worse) continue execution with null or erroneous results.
With Java, there are a lot of usability issues with checked exceptions. For example, streams for processing data really don't play nicely if your map or filter function throws a checked exception. Also, if you are calling a number of different services that each have their own checked exception, either you resort to just catching generic Exception or you end up with a comically large list of exceptions.
C# went with properly typed but unchecked exceptions. IMO it gives you clean error stacks without too much of an issue.
I also think it's a bit cleaner to have nicely pattern-matched handler blocks than bespoke handling at every level. That said, if unwrapped error results have a robust layout then it's probably pretty equivalent.
That is why I am happy that rich errors (https://xuanlocle.medium.com/kotlin-2-4-introduces-rich-erro...) are coming to Kotlin. This expresses the possible error states very well, while programming for the happy path and with some syntactic sugar for destucturing the errors.
I rarely have more than handful of try..catch blocks in any application. These either wrap around an operation that can be retried in the case of temporary failure or abort the current operation with a logged error message.
Checked exceptions feel like a bad mix of error returns and colored functions to me.
For anyone who dislikes checked exceptions due to how clunky they feel: modern Java allows you to construct custom Result-like types using sealed interfaces.
Hard not to agree with the general idea. But also hard to ignore all of the terrible experiences I've had with systems where everything was a unique type.
In general, I think this largely falls down when you have code that wants to just move bytes around intermixed with code that wants to do fairly domain-specific calculations. I don't have a better way of phrasing that at the moment. :(
There are cases where you have the data in hand but now you have to look for how to create or instantiate the types before you can do anything with it, and it can feel like a scavenger hunt in the docs unless there's a cookbook/cheatsheet section.
One example is where you might have to use createVector(x, y, z): Vector when you already have { x, y, z }. And only then can you createFace(vertices: Vector[]): Face even though Face is just { vertices }. And all that because Face has a method to flip the normal or something.
Another example is a library like Java's BouncyCastle where you have the byte arrays you need, but you have to instantiate like 8 different types and use their methods on each other just to create the type that lets you do what you wish was just `hash(data, "sha256")`.
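As a sketch of the alternative (Vec, flipNormal, and the winding-order detail are invented here, not from any real geometry library): TypeScript's structural typing lets an API accept the plain shape directly, so no factory scavenger hunt is needed.

```typescript
// Plain structural shapes instead of nominal factory-built classes.
interface Vec { x: number; y: number; z: number }

// Accepts anything shaped like { vertices: Vec[] }; reversing the
// winding order is a common way to flip a face's normal.
function flipNormal(face: { vertices: Vec[] }): { vertices: Vec[] } {
  return { vertices: [...face.vertices].reverse() };
}

// Callers can pass object literals they already have; no createVector,
// no createFace.
const flipped = flipNormal({ vertices: [
  { x: 0, y: 0, z: 0 }, { x: 1, y: 0, z: 0 }, { x: 0, y: 1, z: 0 },
]});
```

The trade-off cuts both ways, of course: structural acceptance is exactly what the nominal-typing advocates elsewhere in this thread are trying to prevent.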
This stuff gets unbearable very fast. We have custom types for geometries at my work. We also use a bunch of JS libraries for e.g. coordinate conversions; they output [number, number, number], whereas our internal types are number[].
"Phantom types" are useful for what you describe: that's where we add a parameter to a type (i.e. making it generic), but we don't actually use that parameter anywhere. I used this when dealing with cryptography in Scala, where everything is just an array of bytes, but phantom types prevented me getting them mixed up. https://news.ycombinator.com/item?id=28059019
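A rough TypeScript analogue of that phantom-type trick (names invented; real crypto elided): the Purpose parameter exists only at the type level, so two byte arrays earmarked for different purposes can't be swapped.

```typescript
// Purpose is a phantom parameter: it never exists at runtime.
type Bytes<Purpose extends string> = Uint8Array & { readonly __p: Purpose };

const asKey = (b: Uint8Array) => b as Bytes<"key">;
const asNonce = (b: Uint8Array) => b as Bytes<"nonce">;

// The signatures are the point; the body is a stand-in, not real crypto.
function encrypt(
  key: Bytes<"key">,
  nonce: Bytes<"nonce">,
  plaintext: Uint8Array,
): Uint8Array {
  return plaintext;
}

const k = asKey(new Uint8Array(32));
const n = asNonce(new Uint8Array(12));
encrypt(k, n, new Uint8Array(0));  // ok
// encrypt(n, k, new Uint8Array(0));  // compile error: purposes don't match
```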
The article is written in Go, in which (iirc) it's fairly easy and cheap to convert a named type back to its underlying type (e.g. an AccountID to an int).
Using the right architecture, you could make it so your core domain type and logic uses the strictly typed aliases, and so that a library that doesn't care about domain specific stuff converts them to their higher (lower?) type and works with that. Clean architecture style.
Unfortunately, that involves a lot of conversion code.
Ideally though, the compiler lowers all domain specific logic into simple byte-moving, just after having checked that types add up. Or maybe I misunderstood what you meant?
Type systems, like any other tool in the toolbox, have an 80/20 rule associated with them. It is quite easy to overdo types and make working with a library extremely burdensome for little to no to negative benefit.
I know what a UUID (or a String) is. I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Maybe an elaborate type system is worth it, but maybe not (especially if there are good tests).
> I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Presumably you need to know what an Account and a User are to use that software in the first place. I can't imagine a reasonable person easily understanding a getAccountById function which takes one argument of type UUID, but having trouble understanding a getAccountById function which takes one argument of type AccountId.
UserID and AccountID could just as well be integers.
What he means is that by introducing a layer of indirection via a new type you hide the physical reality of the implementation (int vs. string).
The physical type matters if you want to log it, save to a file etc.
So now for every such type you add a burden of having to undo that indirection.
At which point "is it worth it?" is a valid question.
You made some (but not all) mistakes impossible but you've also introduced that indirection that hides things and needs to be undone by the programmer.
> I know what a UUID (or a String) is. I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Yes, that’s exactly the point. If you don’t know how to acquire an AccountID you shouldn’t just be passing a random string or UUID into a function that accepts an AccountID hoping it’ll work, you should have acquired it from a source that gives out AccountIDs!
I now know that I never know whether "a UUID" is stored or represented as a GUIDv1 or a UUIDv4/UUIDv7.
I know it's supposed to be "just 128 bits", but somehow I had a bunch of issues running an old Java servlets + old Java persistence + old MS SQL stack that insisted, when "converting" between java.util.UUID and MS SQL Transact-SQL uniqueidentifier, every now and then, that it would be "smart" to flip the endianness of said UUID/GUID to "help me". It got to the point where the endpoints had to manually "fix" the endianness and insert/select/update/delete for both the "original" and the "fixed" versions of the identifiers to get the expected results back.
(My educated guess it's somewhat similar to those problems that happens when your persistence stack is "too smart" and tries to "fix timezones" of timestamps you're storing in a database for you, but does that wrong, some of the time.)
They are generated with different algorithms; if you find those distinctions semantically useful to operations, carry the distinction into the type.
I'd much rather deal with the 2nd version than the first. It's self-documenting and prevents errors like calling "foo(userId, accountId)" letting the compiler test for those cases. It also helps with more complex data structures without needing to create another type.
> I know what a UUID (or a String) is. I don't know what an AccountID, UserID, etc. is.
It's literally the opposite. A string is just a bag of bytes you know nothing about. An AccountID is probably... wait for it... an ID of an Account. If you have the need to actually know the underlying representation you are free to check the definition of the type, but you shouldn't need to know that in 99% of contexts you'll want to use an AccountID in.
> Now I need to know what those are (and how to make them, etc. as well) to use your software.
You need to know what all the types are no matter what. It's just easier when they're named something specific instead of "a bag of bytes".
Linking to that masterpiece is borderline insulting. Such a basic and easy to understand usage of the type system is precisely what the grug brain would advocate for.
I think the example is just not very useful, because it illustrates a domain separation instead of a computational one, which is almost always the wrong approach.
It is however useful to return a UUID type, instead of a [16]byte, or a HTMLNode instead of a string etc. These discriminate real, computational differences. For example the method that gives you a string representation of an UUID doesn't care about the surrounding domain it is used in.
Distinguishing a UUID from an AccountID, or UserID is contextual, so I rather communicate that in the aggregate. Same for Celsius and Fahrenheit. We also wouldn't use a specialized type for date times in every time zone.
There are a few languages where this is not too tedious (although other things tend to be a bit more tedious than needed in those)
The main problem with these is how do you actually get the verification needed when data comes in from outside the system. Check with the database every time you want to turn a string/uuid into an ID type? It can get prohibitively expensive.
My team recently did this to some C++ code that was using mixed numeric values. It started off as finding a bug. The bug was fixed, but the fixer wanted to add safer types to avoid future bugs. They added them and found 3 more bugs where the wrong values were being used unintentionally.
My friend Lukas has written about this before in more detail, and describes the general technique as "Safety Through Incompatibility". I use this approach in all of my golang codebases now and find it invaluable — it makes it really easy to do the right thing and really hard to accidentally pass the wrong kinds of IDs around.
readonly struct Id32<M> {
    public int Value { get; }
    public Id32(int value) => Value = value;
}
Then you can do:
public sealed class MFoo { }
public sealed class MBar { }
And:
Id32<MFoo> x;
Id32<MBar> y;
This gives you integer IDs that can’t be confused with each other. It can be extended to IdGuid and IdString and supports new unique use cases simply by creating new M-prefixed “marker” types, each of which takes a single line.
I’ve also done variations of this in TypeScript and Rust.
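One way the TypeScript variation can look (my sketch, not the commenter's actual code) is a branded number, where the marker exists only at the type level:

```typescript
// The phantom marker M never exists at runtime; it only blocks
// mixups between different kinds of IDs at compile time.
type Id32<M extends string> = number & { readonly __marker: M };

type AccountId = Id32<"Account">;
type UserId = Id32<"User">;

// One-line "constructors", mirroring the one-line marker types above.
const accountId = (n: number): AccountId => n as AccountId;
const userId = (n: number): UserId => n as UserId;

function getAccount(id: AccountId): string {
  return `account:${id}`;
}

const a = accountId(42);
getAccount(a);
// getAccount(userId(42)); // compile error: UserId is not assignable to AccountId
```

At runtime both are plain numbers; the brand is erased, so there is zero overhead.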
I've done something like that too. I also noticed that enums are even lower-friction (or were, back in 2014) if your IDs are integers, but I never put this pattern into real code because I figured it might be too confusing: https://softwareengineering.stackexchange.com/questions/3090...
Have you used this in production? It seems appealing but seems so antithetical to the common sorts of engineering cultures I've seen where this sort of rigorous thinking does not exactly abound.
I've actually seen this before and didn't realize this is exactly what the goal was. I just thought it was noise. In fact, just today I wrote a function that accepted three string arguments and was trying to decide if I should force the caller to parse them into some specific types, or do so in the function body and throw an error, or just live with it. This is exactly the solution I needed (because I actually don't NEED the parsed values.)
This is going to have the biggest impact on my coding style this year.
The issues I see with this approach is when developers stop at this first level of type implementation. Everything is a type and nothing works well together, tons of types seem to be subtle permutations of each other, things get hard to reason about etc.
In systems like that I would actually rather be writing a weakly typed dynamic language like JS or a strongly typed dynamic language like Elixir. However, if the developers continue pushing logic into type controlled flows, eg:move conditional logic into union types with pattern matching, leverage delegation etc. the experience becomes pleasant again. Just as an example (probably not the actual best solution) the "DewPoint" function could just take either type and just work.
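As a sketch of that "take either type and just work" idea (TypeScript here; the tagged Celsius/Fahrenheit types and the simplified dew-point approximation are my assumptions, not from the thread):

```typescript
// Hypothetical tagged wrapper types; the tag lets us discriminate the union.
type Celsius = { unit: "C"; value: number };
type Fahrenheit = { unit: "F"; value: number };
type Temperature = Celsius | Fahrenheit;

const celsius = (value: number): Celsius => ({ unit: "C", value });
const fahrenheit = (value: number): Fahrenheit => ({ unit: "F", value });

// Normalize once at the boundary; the core formula is written only once.
function toCelsius(t: Temperature): number {
  return t.unit === "C" ? t.value : ((t.value - 32) * 5) / 9;
}

// Crude dew-point approximation (only reasonable above ~50% RH),
// accepting either unit and "just working".
function dewPoint(t: Temperature, relativeHumidity: number): Celsius {
  return celsius(toCelsius(t) - (100 - relativeHumidity) / 5);
}
```

`dewPoint(fahrenheit(77), 60)` and `dewPoint(celsius(25), 60)` then agree, because the unit handling lives in one place instead of being each caller's problem.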
This would allow for some nice properties. It would also enable a bunch of small optimisations in our languages that we can't have today. Eg, I could make an integer that must fall within my array bounds. Then I don't need to do bounds checking when I index into my array. It would also allow a lot more peephole optimisations to be made with Option.
Weirdly, rust already kinda supports this within a function thanks to LLVM magic. But it doesn't support it for variables passed between functions.
While I'm not entirely convinced myself whether it is worth the effort, it offers the ability to express "a number greater than 0". Using type narrowing and intersection types, open/closed intervals emerge naturally from that. Just check `if (a > 0 && a < 1)` and its type becomes `(>0)&(<1)`, so the interval (0, 1).
I also built a simple playground that has a PoC implementation: https://nikeee.github.io/typescript-intervals/
[1]: https://github.com/microsoft/TypeScript/issues/43505
Among popular languages like Go, Rust, or Python, TypeScript has the most powerful type system.
How about a type with a number constrained between 0 and 10? You can already do this in TypeScript.
You can even programmatically define functions at the type level. So you can create a function that outputs a type between 0 and N. The issue here is that it’s a bit awkward: you want these types to compose, right? If I add two constrained numbers, say one with a max value of 3 and another with a max value of 2, the result should have a max value of 5. TypeScript doesn’t support this by default with ordinary addition, but you can create a type-level function that does it. The catch is that to build these functions you have to use tuples to do addition at the type level, and you need recursion as well; TypeScript's type recursion stops at around 100 steps, so there are limits. Additionally, it’s not intrinsic to the type system: you'd need something like Peano numbers built into the language's number system by default for this to work perfectly. That means the code inside such a function is not itself type checked, but if you assume it is correct, then the function type checks when composed with the other primitives of your program.
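A sketch of both tricks the comment describes, using the well-known tuple-length technique (the type names `Tuple`, `Below`, and `Add` are mine; these are community patterns, not built-ins):

```typescript
// Build a tuple of length N; its "length" property is the literal N.
type Tuple<N extends number, Acc extends unknown[] = []> =
  Acc["length"] extends N ? Acc : Tuple<N, [...Acc, unknown]>;

// Union 0 | 1 | ... | N-1: "a number constrained below N".
type Below<N extends number, Acc extends number[] = []> =
  Acc["length"] extends N ? Acc[number] : Below<N, [...Acc, Acc["length"]]>;

// Type-level addition via tuple concatenation, as described in the comment.
type Add<A extends number, B extends number> =
  Tuple<A> extends infer TA extends unknown[]
    ? Tuple<B> extends infer TB extends unknown[]
      ? [...TA, ...TB]["length"]
      : never
    : never;

type Digit = Below<10>;   // 0 | 1 | ... | 9
const ok: Digit = 7;
// const bad: Digit = 12; // compile error

type Five = Add<3, 2>;    // the literal type 5
const five: Five = 5;
```

The recursion ceiling the comment mentions applies to `Tuple` and `Below`; recent TypeScript versions raise it substantially for tail-recursive conditional types, but the limit is still real.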
Otherwise you could have type level asserts more generally. Why stop at a range check when you could check a regex too? This makes the difficulty more clear.
For the simplest range case (pure assignment) you could just use an enum?
https://news.ycombinator.com/item?id=42367644
A month before that:
https://news.ycombinator.com/item?id=41630705
I've given up since then.
This is where the concept of “Correct by construction” comes in. If any of your code has a precondition that a UUID is actually unique then it should be as hard as possible to make one that isn’t. Be it by constructors throwing exceptions, inits returning Err or whatever the idiom is in your language of choice, the only way someone should be able to get a UUID without that invariant being proven is if they really *really* know what they’re doing.
(Sub UUID and the uniqueness invariant for whatever type/invariants you want, it still holds)
This is one of the basic features of object-oriented programming that a lot of people tend to overlook these days in their repetitive rants about how horrible OOP is.
One of the key things OO gives you is constructors. You can't get an instance of a class without having gone through a constructor that the class itself defines. That gives you a way to bundle up some data and wrap it in a layer of validation that can't be circumvented. If you have an instance of Foo, you have a firm guarantee that the author of Foo was able to ensure the Foo you have is a meaningful one.
Of course, writing good constructors is hard because data validation is hard. And there are plenty of classes out there with shitty constructors that let you get your hands on broken objects.
But the language itself gives you direct mechanism to do a good job here if you care to take advantage of it.
Functional languages can do this too, of course, using some combination of abstract types, the module system, and factory functions as convention. But it's a pattern in those languages where it's a language feature in OO languages. (And as any functional programmer will happily tell you, a design pattern is just a sign of a missing language feature.)
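A minimal sketch of that constructor guarantee (TypeScript here; the `EmailAddress` type and its deliberately crude regex are illustrative assumptions, not a production validator):

```typescript
// A value that cannot exist without passing validation: the constructor
// is private, so the static `parse` factory is the only way in.
class EmailAddress {
  private constructor(public readonly value: string) {}

  static parse(raw: string): EmailAddress {
    // Crude shape check for illustration only.
    if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(raw)) {
      throw new Error(`not a valid email address: ${raw}`);
    }
    return new EmailAddress(raw);
  }
}

// Anywhere an EmailAddress appears, the invariant has already been proven.
function send(to: EmailAddress): string {
  return `sending to ${to.value}`;
}
```

`send` never has to re-validate: holding an `EmailAddress` is itself the proof that validation happened.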
I still follow TDD-with-a-test for all new features, all edge cases, and all bugs whose failure I can't trigger by changing the type system.
However, red-green-refactor-with-the-type-system is usually quick and can be used to provide hard guarantees against entire classes of bug.
It is always great when something is so elegantly typed that I struggle to think of how to write a failing test.
What drives me nuts is when there are tests left around basically testing the compiler, tests that were never “red” then “greened”. It makes me wonder if there is some subtle edge case I am missing.
Now I just think of types as the test suite’s first line of defense. Other commenters who mention the power of types for documentation and refactoring aren’t wrong, but I think that’s because types are tests… and good tests, at almost any level, enable those same powers.
The danger of that is of course that you provide a ladder over the wall you just built: instead of using the dedicated conversion, they now go the shortcut route via the numeric representation and may forget the conversion factor. In that case I'd argue it is best to always represent temperature as one unit (Kelvin or Celsius, depending on the math you need to do with it) and then just add a .display(Unit::Fahrenheit) method that returns a string. If you really want to convert to TemperatureF for a calculation you would have to use a dedicated method that converts from one type to another. The unit thing is of course just an example; for this, finished libraries like Python's pint (https://pint.readthedocs.io/en/stable/) exist.
One thing to consider as well is that you can mix up absolute values ("it is 28°C outside") and temperature deltas ("this is 2°C warmer than the last measurement"). If you're controlling high-energy heaters, mixing those up can ruin your day, which is why you could use different types for absolutes and deltas (or a flag within one type). Datetime libraries often do that as well (in Python, for example, you have datetime for absolute and timedelta for relative time).
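A sketch of that absolute-vs-delta split (TypeScript; the type and function names are mine):

```typescript
// Distinct types for absolute temperatures and deltas: the operations
// spell out which combinations are meaningful.
type CelsiusAbs = { kind: "abs"; value: number };
type CelsiusDelta = { kind: "delta"; value: number };

const abs = (value: number): CelsiusAbs => ({ kind: "abs", value });
const delta = (value: number): CelsiusDelta => ({ kind: "delta", value });

// Absolute + delta is fine...
function warmerBy(t: CelsiusAbs, d: CelsiusDelta): CelsiusAbs {
  return abs(t.value + d.value);
}

// ...and absolute - absolute yields a delta.
function difference(a: CelsiusAbs, b: CelsiusAbs): CelsiusDelta {
  return delta(a.value - b.value);
}

// warmerBy(abs(28), abs(2)); // compile error: an absolute is not a delta
```

This mirrors the datetime/timedelta design the comment mentions: adding two absolutes is meaningless, so the type system simply doesn't offer it.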
You can always enforce nominal types if you really need it.
Welcome to typescript. Where generics are at the heart of our generic generics that throw generics of some generic generic geriatric generic that Bob wrote 8 years ago.
Because they can’t reason about the architecture they built, they throw it at the type system to keep them in line. It works most of the time. Rust’s type system is beautiful at barking at you when you’re wrong. Ultimately it’s us failing to design for flexibility amid ever-increasing complexity.
Remember when “Components” were “Controls” and you only had like a dozen of them?
Remember when a NN was only a few hundred thousand parameters?
As complexity increases with computing power, so must our understanding of it in our mental model.
However you need to keep that mental model in check, use it. If it’s typing, do it. If it’s rigorous testing, write your tests. If it’s simulation, run it my friend. Ultimately, we all want better quality software that doesn’t break in unexpected ways.
You might go with:
And so on. That looks nice, but when you try to pattern match on it and have your pattern matching return the types that are associated with the specific operation, it won't work. The reason is that TypeScript does not natively support GADTs. Libraries like ts-pattern use some tricks to get at least close-ish.
And while this might not be very important for most application developers, it is very important for library authors, especially to make libraries interoperable with each other and extend them safely and typesafe.
First, the library author cannot reasonably define what is and isn't a checked exception in their public API. That really is up to the decision of the client. This wouldn't be such a big deal if it weren't so verbose to handle exceptions though: if you could trivially convert an exception to another type, or even declare it as runtime, maybe at the module or application level, you wouldn't be forced to handle them in these ways.
Second, as to signature brittleness: standard advice is to create domain-specific exceptions anyway. Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose... see above.
Ultimately, I love checked exceptions. I just hate the ergonomics around exceptions in Java. I wish designers focused more on fixing that than throwing the baby out with the bathwater.
Personally I use checked exceptions whenever I can't use Either<> and avoid unchecked like a plague.
Yeah, it's pretty sad the Java language designers just completely deserted exception handling. I don't think there's been any kind of improvement related to exceptions between Java 8 and 24.
https://news.ycombinator.com/item?id=44551088
https://news.ycombinator.com/item?id=44432640
> Your code probably shouldn't be throwing IOExceptions. But Java makes converting exceptions unnecessarily verbose
The problem just compounds, too. People start catching things that they can’t handle from the functions they’re calling. The callers upstream can’t possibly handle an error from the code you’re calling; they have no idea why it’s being called.
I also hate IOException. It’s so extremely unspecific. It’s the worst way to do exceptions. Did the entire disk die, or was the file just not found, or do I not have permissions to write to it? IOException has no meaning.
Part of me secretly hopes Swift takes over because I really like its error handling.
[1] https://ericlippert.com/2008/09/10/vexing-exceptions/
So Java's checked exceptions force you to write verbose and pointless code in all the wrong places (the "in the middle" code that can't handle and doesn't care about the exception).
A problem easily solved by writing business logic in pure java code without any IO and handling the exceptions gracefully at the boundary.
Think of the complaints around function coloring with async, how it's "contagious". Checked exceptions have the same function color problem. You either call the potential thrower from inside a try/catch or you declare that the caller will throw an exception.
Incidentally, for exceptions, Java had (b), but for a long time didn't have (a) (although I think this changed?), leading to (b) being abused.
That's the point! The whole reason for checked exceptions is to gain the benefit of knowing if a function starts throwing an exception that it didn't before, so you can decide how to handle it. It's a good thing, not a bad thing! It's no different from having a type system which can tell you if the arguments to a function change, or if its return type does.
And if you change a function deep in the call stack to return a different type on the happy path? Same thing. Yet, people don't complain about that and give up on statically type checking return values.
I honestly think the main reason that some people will simultaneously enjoy using Result/Try/Either types in languages like Rust while also maligning checked exceptions is the mental model and semantics around the terminology. I.e., "checked exception" and "unchecked exception" are both "exceptions", so our brains lump those two concepts together; whereas returning a union type that has a success variant and a failure variant means our brains are more willing to lump the failure return and the successful return together.
To be fair, I do think it's a genuine design flaw to have checked and unchecked exceptions both named and syntactically handled similarly. The return type approach is a better semantic model for modelling expected business logic "failure" modes.
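The return-type framing can be sketched like this (TypeScript; the `Result` shape is the common community pattern and `parsePort` is a made-up example):

```typescript
// The same "checked" information, carried in the return type instead of a
// throws clause: callers must inspect the discriminant before touching the value.
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E };

type ParseError = { reason: string };

function parsePort(raw: string): Result<number, ParseError> {
  const n = Number(raw);
  if (!Number.isInteger(n) || n < 0 || n > 65535) {
    return { ok: false, error: { reason: `invalid port: ${raw}` } };
  }
  return { ok: true, value: n };
}

const r = parsePort("8080");
// Narrowing forces the failure case to be acknowledged explicitly.
const port = r.ok ? r.value : 80;
```

Semantically this carries exactly the information a checked `throws` clause does, but because the failure rides in the return type, it reads as part of the happy-path contract rather than as a separate "exception" concept.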
In fact, at each layer, if you want to propagate an error, you have to convert it to one specific to that layer.
Checked exceptions work well for occasional major "expectable" failures -- opening a file, making a network connection.
They work extremely poorly when required for ongoing access or use of IO/ network resources, since this forces failures which are rare & impossible to usefully recover from to be explicitly declared/ caught/ rethrown with great verbosity and negative value added.
All non-trivial software is composition, so the idea of calling code "recovering" from a failure is at odds with encapsulation. What we end up with is business logic which can fail anywhere, can't recover anything, yet all middle layers -- not just the outer transaction boundary -- are forced to catch or declare these exceptions.
Requiring these "technical exceptions" to be pervasively handled is thus not just substantially invalid & pointless, but actually leads to serious rates of faulty error-handling. Measured experience in at least a couple of large codebases is that about 5-10% of catch clauses have fucked implementations either losing the cause or (worse) continue execution with null or erroneous results.
https://literatejava.com/exceptions/checked-exceptions-javas...
I also think it's a bit cleaner to have nicely pattern-matched handler blocks than bespoke handling at every level. That said, if unwrapped error results have a robust layout then it's probably pretty equivalent.
Checked exceptions feel like a bad mix of error returns and colored functions to me.
But for one, Java checked exceptions don't work with generics.
In general, I think this largely falls down when you have code that wants to just move bytes around intermixed with code that wants to do some fairly domain-specific calculations. I don't have a better way of phrasing that, at the moment. :(
There are cases where you have the data in hand but now you have to look for how to create or instantiate the types before you can do anything with it, and it can feel like a scavenger hunt in the docs unless there's a cookbook/cheatsheet section.
One example is where you might have to use createVector(x, y, z): Vector when you already have { x, y, z }. And only then can you createFace(vertices: Vector[]): Face even though Face is just { vertices }. And all that because Face has a method to flip the normal or something.
Another example is a library like Java's BouncyCastle where you have the byte arrays you need, but you have to instantiate like 8 different types and use their methods on each other just to create the type that lets you do what you wish was just `hash(data, "sha256")`.
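The Vector/Face complaint can be sketched like this (TypeScript; the library API is hypothetical, reconstructed from the comment):

```typescript
// The shapes the caller already has in hand.
type Vector = { x: number; y: number; z: number };
type Face = { vertices: Vector[] };

// Ceremony version: the caller must rebuild data it already has.
const createVector = (x: number, y: number, z: number): Vector => ({ x, y, z });
const createFace = (vertices: Vector[]): Face => ({ vertices });

// Friendlier version: accept structural data directly, and keep the
// behaviour (the reason Face exists at all) as a plain function.
function flipNormal(face: Face): Face {
  return { vertices: [...face.vertices].reverse() };
}

const viaCeremony = createFace([createVector(0, 0, 0), createVector(1, 0, 0)]);
const direct = flipNormal({ vertices: [{ x: 0, y: 0, z: 0 }, { x: 1, y: 0, z: 0 }] });
```

Structural typing lets both styles coexist; the scavenger hunt only appears when a library insists on nominal wrappers for data the caller already holds.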
Using the right architecture, you could make it so your core domain type and logic uses the strictly typed aliases, and so that a library that doesn't care about domain specific stuff converts them to their higher (lower?) type and works with that. Clean architecture style.
Unfortunately, that involves a lot of conversion code.
I know what a UUID (or a String) is. I don't know what an AccountID, UserID, etc. is. Now I need to know what those are (and how to make them, etc. as well) to use your software.
Maybe an elaborate type system is worth it, but maybe not (especially if there are good tests).
https://grugbrain.dev/#grug-on-type-systems
Presumably you need to know what an Account and a User are to use that software in the first place. I can't imagine a reasonable person easily understanding a getAccountById function which takes one argument of type UUID, but having trouble understanding a getAccountById function which takes one argument of type AccountId.
What he means is that by introducing a layer of indirection via a new type you hide the physical reality of the implementation (int vs. string).
The physical type matters if you want to log it, save to a file etc.
So now for every such type you add a burden of having to undo that indirection.
At which point "is it worth it?" is a valid question.
You made some (but not all) mistakes impossible but you've also introduced that indirection that hides things and needs to be undone by the programmer.
Yes, that’s exactly the point. If you don’t know how to acquire an AccountID you shouldn’t just be passing a random string or UUID into a function that accepts an AccountID hoping it’ll work, you should have acquired it from a source that gives out AccountIDs!
I now know I never know whether "a UUID" is stored or represented as a GUIDv1 or a UUIDv4/UUIDv7.
I know it's supposed to be "just 128 bits", but somehow I had a bunch of issues running an old Java servlets + old Java persistence + old MS SQL stack that decided, every now and then, when "converting" between java.util.UUID and the T-SQL uniqueidentifier, that it would be "smart" to flip the endianness of said UUID/GUID to "help me". It got to the point where the endpoints had to manually "fix" the endianness and insert/select/update/delete for both the "original" and the "fixed" versions of the identifiers to get the expected results back.
(My educated guess it's somewhat similar to those problems that happens when your persistence stack is "too smart" and tries to "fix timezones" of timestamps you're storing in a database for you, but does that wrong, some of the time.)
They are generated with different algorithms; if you find these distinctions semantically useful to operations, carry that distinction into the type.
Seems like 98% of the time it wouldn’t matter.
I'd much rather deal with the 2nd version than the first. It's self-documenting and prevents errors like calling "foo(userId, accountId)" letting the compiler test for those cases. It also helps with more complex data structures without needing to create another type.
It's literally the opposite. A string is just a bag of bytes you know nothing about. An AccountID is probably... wait for it... an ID of an Account. If you have the need to actually know the underlying representation you are free to check the definition of the type, but you shouldn't need to know that in 99% of contexts you'll want to use an AccountID in.
https://lukasschwab.me/blog/gen/deriving-safe-id-types-in-go...
https://lukasschwab.me/blog/gen/safe-incompatibility.html
[1] enum class from C++11, classic enums have too many implicit conversions to be of any use.
The name means "Value Object Generator", as it uses source generation to generate the "value object" types.
That readme has links to similar libraries and further reading.