svat · 6 years ago
Six notable things I took away from this post:

- Structural typing, i.e. instead of "you eagerly write a schema for the whole universe", just limit yourself to what you need (basically, encode only the same kinds of assumptions you would make in a dynamically-typed language).

- It’s easy to discover the assumptions of the Haskell program [...] In the dynamically-typed program, we’d have to audit every code path — Left implicit in many debates about these topics are the "weights" one attaches to these things: the importance and frequency of attempting such activities. Surely these vary, depending on everything from the application (is it "code once and throw it away", or does it need to be maintained after the original programmers have left?) down to the individual programmer's approach to life (what is the cost of an error: how bad would it be to have a bug?).

- "This is a case of improper data modeling, but the static type system is not at fault—it has simply been misused." — To me this shows that bugs can exist in either the data modeling or the code: in a dynamically-typed language the two tend to coincide (with much less of the former) and in a statically-typed language they tend to be separate. (Is this good or bad? On the one hand you can think about them separately, on the other hand you have to look for bugs in two places / switch between two modes of thought, but then again maybe you need to do that anyway?)

- Structural versus nominal typing, where the latter involves giving a name to each new type ("If you wish to take a subselection of a struct’s fields, you must define an entirely new struct; doing this often creates an explosion of awkward boilerplate").

- "consider Python classes, which are quite nominal despite being dynamic, and TypeScript interfaces, which are structural despite being static." — This is highly illuminating, and just this bit (and the next) elaborated with some examples would make for a useful blog post on its own.

- "If you are interested in exploring static type systems with strong support for structural typing, I would recommend taking a look at any of TypeScript, Flow, PureScript, Elm, OCaml, or Reason [...] What I would not recommend for this purpose is Haskell, which [is] aggressively nominal"

For what it's worth, my opinion is that posts like this, on hotly debated topics, would do well to start with concrete examples and be written in the mode of conveying interesting information/ideas (of which there are a lot here) rather than being phrased as an argument for some position, which seems to elicit different sorts of responses — already most of the HN comments here are about static versus dynamic type systems in general, rather than about any specific ideas advanced by this post.

svat · 6 years ago
After giving it enough time and rereading, I feel I have a better understanding. The crux IMO is here:

> The above JavaScript code makes all the same assumptions our Haskell code does: it assumes event payloads are JSON objects with an event_type field, and it assumes signup payloads include data.user.name and data.user.email fields.

What the post points out is that these exact same assumptions, and the same behaviour of what to do with an unknown event type, can be encoded into the Haskell types and code. Similarly, in the case of “val = pickle.load(f)” in Python, the moment we try to do anything with “val” we'll inevitably make some assumptions about it, and those assumptions are its type. (For example, if all we assume is that `val` has a `.foo()` which returns a string, then that type can be expressed in a static type system, even in Java.) And for proxying unknown data, just don't specify more in your type than you actually need.
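
To make this concrete, here's a rough sketch I put together with Haskell's aeson library (my own illustration, not code from the post, though the field names follow its example):

  {-# LANGUAGE OverloadedStrings #-}
  import Data.Aeson (Value (..), (.:))
  import Data.Aeson.Types (parseMaybe)
  import Data.Text (Text)

  -- Encodes only the assumptions the JavaScript version makes: the
  -- payload is an object with an event_type, and signup payloads carry
  -- data.user.name and data.user.email. Keys we never mention are
  -- simply ignored, so unknown fields pass through untouched.
  data Event = Signup Text Text  -- name, email
             | Unknown           -- any other event_type

  parseEvent :: Value -> Maybe Event
  parseEvent = parseMaybe $ \val -> case val of
    Object o -> do
      eventType <- o .: "event_type"
      case (eventType :: Text) of
        "signup" -> do
          d <- o .: "data"
          u <- d .: "user"
          Signup <$> u .: "name" <*> u .: "email"
        _ -> pure Unknown
    _ -> fail "event payload is not a JSON object"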

So the post repeatedly emphasizes that static typing does not mean “classifying the world” or “pinning down the structure of every value in a system” (and conversely, even with dynamic typing we can never process truly unknown data: whatever we assume, that's the type). Overall, the key thesis (“static type systems are not fundamentally worse than dynamic type systems at processing data with an open or partially-known structure”) seems (to me) convincingly demonstrated, and it seems everyone who understands the argument can get on board with her conclusion:

> There are many patterns in dynamically-typed languages that are genuinely difficult to translate into a statically-typed context, and I think discussions of those patterns can be productive. The purpose of this blog post is to clarify why one particular discussion is not productive

----

However, I think the word “inherently” or “fundamentally” is doing a lot of work here, to the extent that if the words “are inherently” were to be replaced with “tend to be”, then a good argument could be made afresh. The post admits to much the same thing, around “These two styles facilitate very different flavors of programming”:

> many dynamically typed languages idiomatically reuse simple data structures like hashmaps to represent what in statically-typed languages are often represented by bespoke datatypes (usually defined as classes or structs). [...] A JavaScript or Clojure program may represent a record as a hashmap from string or symbol keys to values, written using object or hash literals and manipulated using ordinary functions from the standard library that manipulate keys and values in a generic way.

and points out that this has many advantages (“the practical, pragmatic advantage of a more structural approach to data modeling”) over the way things are done in Haskell, and even more so over “all mainstream, statically-typed OOP languages”. These advantages are coming to “modern statically-typed languages” (see list).

So does this defeat the point of the whole post, if ultimately the approach usually taken in dynamic languages has advantages over the approach usually taken in ("typical") statically-typed languages? No, it's a matter of clarity: it's worth being clear when the issue is structural-versus-nominal typing, not static-versus-dynamic typing. So all this:

> may give programmers from a dynamically-typed background who have historically found statically-typed languages much more frustrating to work with a better understanding of the real reason they feel that way

svat · 6 years ago
Now for the flip side. IMO the post also contains "enough rope" to hang static typing with. The picture that emerges is that programming in a statically typed language, as the author conceives it, consists of separate phases of "data modelling" (making assumptions about the external world, and at each module/function level about the inputs and outputs), and writing code. Often the former is the hard part to get clear, and this explains the phenomenon familiar to programmers in a statically typed language like Haskell, that once you get the types exactly right, the code basically writes itself: the types guide you towards the right code, like gravity.

But unanswered are questions like:

1. Should they be separate? In a dynamically typed language the two are blended, you write and run code (with representative inputs) to discover and refine your assumptions. You don't have to switch between two modes of thought. In a statically typed language, bugs in your assumptions will be pointed out by the compiler, and bugs in your logic will be pointed out by the program's execution.

2. Is it always worth being explicit about your assumptions? One can often get useful work done even with unclear, undefined, or even inconsistent assumptions.

3. Fine, the same set of assumptions can be encoded in both kinds of languages, but how different is the experience of arriving at the right assumptions?

Look at these quotes from the post, and think about the words "only", "just", "simply":

> static type systems only make already-present assumptions explicit

> We just have to be explicit about ignoring it.

> A static type system doesn’t require you eagerly write a schema for the whole universe, it simply requires you to be up front about the things you need.

Being explicit or "up front" in this way surely has a cost. At the same time, not being explicit also has a cost (of bugs). What's the trade-off? (One does not always operate in environments where bugs have a high cost. For example, even code that does the right thing for 90% of users, but something catastrophic for 10% of them--like losing all their data--may be ok, depending on the importance of the "data" (previous scores in your game?), what expectations you've made clear, etc.)

In fact, too-enthusiastic data modeling can also lead to its own kinds of bugs:

> Suppose we’re consuming an API that returns user IDs, and suppose those IDs happen to be UUIDs. A straightforward interpretation of “parse, don’t validate” might suggest we represent user IDs in our Haskell API client using a UUID type [..] this representation is overstepping our bounds. [...] This is a case of improper data modeling, but the static type system is not at fault—it has simply been misused.
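
The fix the post has in mind is tiny; as a sketch (my wording, not the post's):

  import Data.Text (Text)

  -- Deliberately opaque: we record only that the API handed us some
  -- string-shaped identifier, not that it happens to be a UUID today.
  -- Keep the constructor unexported so callers can't depend on the
  -- representation.
  newtype UserId = UserId Text
    deriving (Eq, Ord, Show)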

The claimed benefits are also all similar, and may not often be wanted:

> easy to discover the assumptions of the Haskell program just by looking at the type definitions [...] In the dynamically-typed program, we’d have to audit every code path

> If we want to ensure it [UserId field inside SignupPayload type] isn’t actually needed [..] we need only delete that record field; if the code typechecks, we can be confident...

> ensure application logic doesn’t accidentally assume too much

> the type system helped us here: it caught the fact that we’re assuming the payload is a JSON object, not some other JSON value, and it made us handle the non-object cases explicitly.

> The runtime representation of a UserId is really just a string, but the type system does not allow you to accidentally use it like it’s a string

> the parsing style of programming has helped us out, since if we didn’t “parse” the JSON value into an object by matching on the Object case explicitly, our code would not compile, and if we left off the fallthrough case, we’d get a warning about inexhaustive patterns.

Whether this is really a help is probably the big question. Incidentally, while I was rereading and thinking about this, I came across a related paragraph in an unrelated article:

> requires covering 100% of the conditions, not just 99%. Edge cases must have applicable code, even when the programmer doesn’t yet know what the happy path should do. During early development, these edge cases can often be addressed by causing the program to crash, and then rigorous error handling can be added at a later point. This is a different workflow than in languages such as Ruby, where developers often try out code in a REPL and then move that to a prototype without considering error cases at all.

(From https://stackoverflow.blog/2020/01/20/what-is-rust-and-why-i... )

Ultimately, there's selection bias:

> programmers in statically-typed languages are perfectly happy to supply their assumptions up front

-- of course: programmers who are happy to supply their assumptions up front tend to be happier with statically-typed languages.

sethev · 6 years ago
I thought the article was good and addressed a real point of confusion, as evidenced by the two included comments (from Reddit and HN). You can consume arbitrary data using a program written in a statically typed or dynamically typed language. Whether it's decoupled from changes in the data depends on how the code is written and the data model, which have nothing to do with static vs dynamic typing.

To me the stronger argument is that the boundary between programs is dynamically typed (interpreted and checked at runtime). This is true in the statically typed example as well - the JSON is interpreted and checked at runtime, not at compile time. There's nothing your compiler can prove in advance about what's in the JSON that you'll receive at runtime.
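
Even in the Haskell example that boundary is explicit in the types; a sketch using aeson's stock API:

  import Data.Aeson (Value, eitherDecode)
  import qualified Data.ByteString.Lazy as BL

  -- Nothing about these bytes is proved at compile time; the "type
  -- check" of wire data happens here, at runtime, and failure is an
  -- ordinary value to handle.
  readPayload :: BL.ByteString -> Either String Value
  readPayload = eitherDecode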

If systems that extend beyond a single program require dynamic typing, doesn't it make sense to invest more in ways to do dynamic typing better?

nickbauman · 6 years ago
I think it's a question of efficacy and productivity. People routinely make mistakes using type systems. And static typing tends to be really good at solving problems it itself introduces (change a type and, wow! My IDE knows where that type is everywhere and can change it for me! Never mind that only the producer and final receiver should care; it's everywhere now...).

There hasn't been a lot of study on this topic*, but what little there is suggests that only about 3% of the errors found could have been mitigated by a type system, and that where type systems are absent, fixing these classes of errors takes less time than it would have taken to use the type system.

* https://www.infoq.com/presentations/dynamic-static-typing/

mokus · 6 years ago
> If systems that extend beyond a single program require dynamic typing, doesn't it make sense to invest more in ways to do dynamic typing better?

Isn’t dynamic typing more or less a default state of not knowing anything at compile time about the values your data will take?

One could equally well ask “given that the boundaries of our systems are necessarily characterized by unpredictability, doesn’t it make sense to invest in more ways to isolate that unpredictability better (e.g. by use of static analysis)?”

erik_seaberg · 6 years ago
> the boundary between programs is dynamically typed

Many interfaces have declared, enforced static types. I don't have to write any code to handle

  SELECT birthdate FROM employees WHERE id = ?
returning "fish" because the database would never let it happen.

kazinator · 6 years ago
> You can consume arbitrary data using a program written in a statically typed or dynamically typed language.

Really? Okay, below is something that meets the definition of "arbitrary data". Can some static program process it and evoke its full meaning without the programmer having to develop an ad-hoc dynamic typing system?

  (defun defset-expander (env macform name params newval setform)
    (with-gensyms (getter setter args gpf-pairs gpr-pairs ext-pairs
                   pgens rgens egens all-pairs agens nvsym)
      (let* ((ap (analyze-params params))
             (exp-params (car ap))
             (total-syms (cadr ap))
             (fp (new fun-param-parser form macform syntax exp-params))
             (fixpars (append fp.req fp.(opt-syms)))
             (restpar (if (symbol-package fp.rest) fp.rest))
             (extsyms [keep-if symbol-package
                               (diff total-syms (cons restpar fixpars))])
             (xsetform ^^(alet ((,',nvsym ,,newval))
                           ,,(expand ^(symacrolet ((,newval ',nvsym))
                                        ,setform)
                                     env))))
        ^(defplace (,name . ,args) body
           (,getter ,setter
             (tree-bind ,params ,args
               (let* ((,gpf-pairs (mapcar (op (fun list) (gensym)) (list ,*fixpars)))
                      (,gpr-pairs (if ',restpar
                                    (if (consp ,restpar)
                                      (mapcar (op (fun list) (gensym)) ,restpar)
                                      (list (list (gensym) ,restpar)))))
                      (,ext-pairs (mapcar (op (fun list) (gensym)) (list ,*extsyms)))
                      (,pgens (mapcar (fun car) ,gpf-pairs))
                      (,rgens (mapcar (fun car) ,gpr-pairs))
                      (,egens (mapcar (fun car) ,ext-pairs))
                      (,all-pairs (append ,gpf-pairs ,gpr-pairs ,ext-pairs))
                      (,agens (collect-each ((a ,args))
                               (let ((p (pos a ,all-pairs (fun eq) (fun cadr))))
                                 (if p
                                   (car (del [,all-pairs p]))
                                   a)))))
                 ^(alet (,*,gpf-pairs ,*,gpr-pairs ,*,ext-pairs)
                    ,(expand ^(symacrolet (,*(zip ',fixpars
                                                  (mapcar (ret ^',@1) ,pgens))
                                           ,*(zip ',extsyms
                                                  (mapcar (ret ^',@1) ,egens))
                                           ,*(if ,gpr-pairs
                                               (if (consp ,restpar)
                                                 ^((,',restpar ',,rgens))
                                                 ^((,',restpar ',(car ,rgens))))))
                                (macrolet ((,,getter () ^(,',',name ,',*,agens))
                                           (,,setter (,',newval)
                                              ,',xsetform))
                                  ,body))
                             ,env)))))))))

UglyToad · 6 years ago
I think there's a perhaps irreconcilable disconnect between 2 camps here, but I also think that's ok and different people are allowed to like different things.

My experience, having gone from dynamic typing to static typing and now pining for more expressive type features in my chosen language is that static typing changes where you have to spend the cognitive complexity budget.

To my mind, if I return to a piece of dynamically typed code written any time other than in the same session, I have to load a whole mental model of the entire code: what types all the variables are, what expectations are attached to them, what mutation, if any, each method call performs. I suppose the trade-off is that advocates say you can prototype things more easily when requirements are not known.

With static typing I only need to load 2 things, the high level goal, what I need to achieve, and the local context where I need to make the change. The rest is taken care of by static typing, I can encode all that cognitive complexity in the types and I never need to think about it again. But as I say, that's how my brain works, yours might work differently and that's fine.

pron · 6 years ago
Not all untyped languages are the same. E.g. Clojure now encourages specifying assumptions with spec annotations, some of which are stronger than those that can be expressed by Haskell's type system: i.e. they can succinctly express more about the "entire code" than Haskell's types. Types are not the only way to write formal assertions and assumptions about code. The difference between the two is in the level of sound inference that can be made (stronger with types) and the strength and expressivity of the assertions (stronger with contracts).

Of course, one could say we should have both, and some languages -- like Java with JML, C with ACSL, and SPARK, as well as some research languages like Whiley and Dafny -- do. But this could add complexity, and the interesting question is, if you could just have one, what are the relative merits of either? My guess is that types would be better for enforcing code organization and for tooling (more efficient AOT code generation, safe automatic refactoring, code completion), because automatic transformations require automatic and sound inference, while contracts are better for correctness because they can more easily express stronger properties of interest, and require less effort to verify them (types require formal deductive proofs, while contracts can use those or a range of other techniques that vary in their cost and their soundness).

In short, types are not the only form of formal specification, but the different forms do have different practical characteristics. What matters is not just what we want to know about the rest of the code, but how we want to use that information and how certain we need to be. Unfortunately, theoretical results tell us that we cannot have a perfect system that is always correct, cheap to use, and can allow us to express rich properties.

chrisdone · 6 years ago
Techniques like Ghost of Departed Proofs are the most exciting thing to me as a casual correctness enthusiast. It acts as a means of combining static types with contracts.

Types are sound mostly because they are a stupid mini language, so easily enforced, like you said. Contracts are practical and convenient because you tend to write them in the language or a subset thereof that you’re checking. That’s hard to ignore.

GODP has you write a runtime check in regular code, and return as a result a proof value. Put that code in a module boundary. Then you have upstream functions expect a proof argument along with the value the proof is about. They’re tied together with a unique type variable (rank n or existential are the mechanisms of delivery for Haskell, but may differ in other languages).

  head :: NonEmpty n -> Named n (List a) -> a

In this way you started from a dynamic, runtime piece of code, and ended up using a static type system just to ensure everything is passed around correctly, which is trivial.
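
A minimal sketch of the shape (my names and simplifications, not the gdp library's API):

  {-# LANGUAGE RankNTypes #-}
  -- Imagine Named's constructor hidden behind a module boundary so
  -- callers can't forge values or proofs.

  newtype Named n a = Named a  -- a value tagged with phantom name n
  data NonEmpty n = NonEmpty   -- a proof about whatever bears name n

  -- Give a value a fresh name; the rank-2 type keeps the name (and
  -- any proofs minted for it) from leaking onto other values.
  name :: a -> (forall n. Named n a -> r) -> r
  name x k = k (Named x)

  -- The single runtime check, done once at the boundary.
  nonEmpty :: Named n [a] -> Maybe (NonEmpty n)
  nonEmpty (Named xs) = if null xs then Nothing else Just NonEmpty

  -- Downstream code demands the proof instead of re-checking.
  safeHead :: NonEmpty n -> Named n [a] -> a
  safeHead _ (Named (x:_)) = x
  safeHead _ _             = error "unreachable: proof rules this out"

  -- A caller checks once, then passes the proof around freely.
  firstOrZero :: [Int] -> Int
  firstOrZero xs = name xs $ \xs' ->
    maybe 0 (\proof -> safeHead proof xs') (nonEmpty xs')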

A single proof in this technique is nominal in the type (e.g. not null or positive or sorted ascending/descending), but combining them is done at the value level so there’s flexibility.

The bang for buck potential is large. I’m more interested in stuff like this than the dependent types direction that people are pushing for in GHC. If I wanted dependent types I have Agda, Coq, Isabelle, Idris, etc. to play with.

Silhouette · 6 years ago
> My guess is that types would be better for enforcing code organization and for tooling ... while contracts are better for correctness

I see what you're getting at, but there is one other crucial difference: the types we're talking about here are usually checked at compile time, whereas the contracts we're talking about here are usually checked at run time. Although the latter may be more expressive if other things are equal, other things are very much not equal in this respect.

How much that matters depends on the nature of the software and how it is to be deployed and used. If you're writing a program for your own use, it might not matter much at all. If you're writing a program that is going to control a satellite and software maintenance after launch is extremely expensive if not impossible, it might matter a great deal.

capableweb · 6 years ago
Also, the following part isn't necessarily because of dynamic typing vs static

> what types all the variables are, what expectations are attached to them, what mutation, if any, each method call performs

But more about what the language offers. Clojure, for example, goes beyond types with its abstractions (like seq, which can be applied to strings, lists, maps and so on) and normally stays away from mutation, but when mutation is needed, makes it really explicit and easy to find.

nhumrich · 6 years ago
Your comment is spot on. But the opposite can also be true. I totally have the same issues you do at times when trying to understand types (which I feel type hinting, editors, and good variable names go a long way to help with). But I also find myself using a lot of cognitive load trying to get a correct type definition, etc. (for the non-simple tasks): whether it's trying to figure out how to get a certain object, because of the interfaces, factories, etc., or trying to create a new object and map things correctly. What if I just want to have a hash map full of varying object structures? (I am not saying it's not possible, I am saying it's a use of cognitive load.)
UglyToad · 6 years ago
Agreed, perhaps it's something like the difference between top-down vs bottom-up learning in teaching. People don't all use the same mental models and allowances need to exist for both groups. But I suspect this is true of most contentious topics in programming.

For what it's worth I definitely agree there can be a lot of overhead added by typed code especially in Java or C#, but I suspect it might have less to do with the type system and more with the enterprise-ness or otherwise of those code-bases. Designing a good typed API for other people maintaining the same code-base is difficult.

amelius · 6 years ago
Yes. Perhaps this is an indication that we actually need automated type annotation. E.g. you initially write/prototype your program in a dynamically typed form, then you click a "magic" button, and a tool converts your program into statically typed style.
chongli · 6 years ago
This already exists in Haskell. Firstly, Haskell has type inference so you can write entire programs without any type annotations. Second of all, since type annotations actually help to document your code, it’s recommended as good Haskell style to annotate all top-level definitions.

To aid you in the latter task, you can press a key in your editor to fetch an automatically-inferred type for the name under the cursor and insert it into your code, then refine it as necessary if you find it to be too general (inference always gives the most general type possible).
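
For example (plain GHC inference, nothing editor-specific):

  -- Written without an annotation, GHC infers the most general type,
  --   pluck :: Num b => [b] -> [b]
  -- which the editor action inserts verbatim; you might then narrow
  -- it by hand to, say, [Int] -> [Int].
  pluck xs = map (+ 1) xs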

zitterbewegung · 6 years ago
Using type inference[1], a language could figure out types in most situations.

[1] https://en.wikipedia.org/wiki/Type_inference

pron · 6 years ago
Unless your type system is very weak -- and in effect, an untyped language can be considered to already be doing what you said, only with a type system that has a single type -- this is generally impossible (halting problem and its extensions). So you have the choice of something that's not very useful or something that's impossible -- take your pick.

Type inference, as others have suggested, does not do what you want. If your program is well-typed according to a particular type system, which you must have in mind while writing the program, inference will save you from writing explicit type annotations, but it cannot come up with the "correct" type for a correct program that's not written for that particular type system.

IggleSniggle · 6 years ago
TypeScript can come close to this, especially with the latest release with Assertion support. You make assertions about your code, and the downstream types are inferred based on your assertions. Thus, the runtime behavior of your code gives you progressively enhanced type-checking. If you do these assertions at the borders, your "correctness" is dependent on the strength of your runtime assertions.
ken · 6 years ago
You mean a compiler?
zmmmmm · 6 years ago
> To my mind if I return to a piece of dynamically typed code written any time other than in the same session I have to load a whole mental model of the entire code, what types all the variables are

This is one reason my favorite model is the "egg model" - hard outer shell (statically typed signature) of a function with a soft gooey center - dynamic code internally. It lets you use the convenience of dynamic typing but confines the scope of the mental budget to understanding just a few lines of dynamically typed code. Languages like Groovy are pretty good for this type of code.

kazinator · 6 years ago
If you don't have to think about what mutation each method performs, if any, when switching to static typing, you're not just switching to static typing.
ema · 6 years ago
I've come to the conclusion that the benefit dynamic typing brings to the table is to allow more technical debt. Now of course technical debt should be repaid at an appropriate moment, but that appropriate moment isn't always "as soon as possible". Let me illustrate: say you're adding a new feature and create lots of bugs in the process. Static typing will force you to fix some of these bugs before you can test out the feature. Then while testing out the feature you decide that it was a bad idea after all, or that the feature should be implemented completely differently. So you scrap the implementation. In this case fixing those bugs was a waste of time. Dynamic typing allows you to postpone fixing those bugs until after you're more certain that the feature and its implementation will stay.
chongli · 6 years ago
It isn’t even a benefit, really. Statically typed languages don’t force you to model things with types; you’re completely free to use strings and doubles everywhere and take on all the technical debt you want.

People who are only used to dynamic languages may feel like a static language’s type checker is an overbearing teacher (from the 1950s) standing over your shoulder, just waiting to jump on you for every silly mistake! While that feeling is common and valid as you’re learning the language, when you become fluent you see the compiler as more of an ally. You can call on it with a keystroke and it will find many problems that would otherwise only happen at runtime, possibly months or even years in the future!

Moreover, in more advanced languages (such as dependent typed languages), the compiler can actually infer the body of your functions which leads to a totally different style of programming that feels like a snippet plugin on steroids.

arminiusreturns · 6 years ago
What about JIT things like Lisps? Maybe the benefits of dynamic typing are more geared towards things with a REPL in particular?
pchiusano · 6 years ago
It’s true that with static typing, you are usually forced to propagate your changes to “your whole codebase” to get it compiling before you can run any of it. That stinks. However, it turns out you can lift this restriction in static languages, and this is what Unison does:

https://www.unisonweb.org/docs/refactoring

In Unison you can implement a change and propagate it just far enough to run one little experiment. There’s no need to upgrade all your code at once.

simias · 6 years ago
I haven't given Unison a try so I won't dismiss it outright but to me having to explicitly propagate your change to your entire codebase before you can run anything doesn't suck at all, it's actually the killer feature of statically typed languages.

I've written big applications in python, every time I made a significant architectural change (which might not even be huge code-wise, just far-reaching) I feel super uncomfortable. I know that I have to follow up with an intense testing session to make sure that I go through all code paths and everything still works correctly. Even then sometimes the testing is not thorough enough and you end up with a regression in production.

As a result I tend to avoid these types of changes as much as possible, even if I believe that the application would benefit from them.

Meanwhile in Rust it's a breeze. Yesterday I decided to make a small but far-reaching change in an emulator I'm writing: instead of signaling interrupts by returning booleans, I'd return an "IrqState" enum with Triggered and Idle variants. It's more explicit that way, and I can mark the enum "must_use" so that the compiler generates a warning if some code forgets to check for an interrupt.

It's like 4 lines of code but it means making small changes all over the codebase since interrupts can be triggered in many situations. In Python that'd be a pain, in Rust I can just make the change in one place and then follow the breadcrumbs of the compiler's error messages. Once the code builds without warnings I'm fairly certain that my code is sound and I can move on after only very superficial testing.

NicoJuicy · 6 years ago
I actually rely on static typing to make architectural code changes.

Why I would want to go to dynamic typing is beyond me. My performance/efficiency would greatly decrease.

The next change I will implement will likely be to not allow null in a property, which can be done in C# 8.

yakshaving_jgt · 6 years ago
It stinks how? You can see those type errors at compile time, or your users can see those type errors at runtime.

One of those is worse.

adrusi · 6 years ago
Thanks for pointing me to Unison, I have yet to dig into the details, but its core idea is something I've been thinking about lately and it's fantastic to see that it exists!
hopia · 6 years ago
To be fair most of the time that type change propagation is just manual work guided by the compiler. Some smart IDE would even be able to make the changes for you.
dan00 · 6 years ago
Even this doesn't have to be a feature only of dynamically typed languages, as Haskell's deferred type errors show. (https://downloads.haskell.org/~ghc/latest/docs/html/users_gu...)
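
A sketch of how that looks with GHC's -fdefer-type-errors flag:

  {-# OPTIONS_GHC -fdefer-type-errors #-}

  -- With the flag, ill-typed code compiles with a warning and only
  -- crashes if the offending expression is actually evaluated.
  main :: IO ()
  main = do
    putStrLn "the well-typed path runs fine"
    if False
      then print (True && 'x')  -- type error, deferred to runtime
      else pure ()
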
hhas01 · 6 years ago
Yep. Being able to work fast and dirty is a huge boon when testing ideas and figuring out the initial design, and it is deeply annoying that so many typed languages refuse to permit this. The faster you can write code, the faster you can throw it away again; and arriving at good code almost invariably requires† writing lots of bad code first. If writing bad code is made expensive (another example of Premature Optimization), authors may be reluctant to try different ideas or throw anything away.

Ideally, you want a language that starts out completely untyped, efficient for that initial “sketching in code” phase. Then, as your program takes shape, it can start running type inference with warnings only, to flag obvious bug/coverage issues without being overly nitpicky. Then finally, as you refine your implementation and type-annotate your public APIs (which does double-duty in documenting them too), it can be switched to reporting type issues as errors.

Unfortunately, a lot of typed languages behave like martinets, and absolutely refuse to permit any such flexibility, dragging the entire process of development down to their level of pedantry and generally making early development an almighty ballache. Conversely, untyped languages provide sod all assistance when tightening up implementation correctness and sharpening performance during late-stage development; it’s pure pants-down programming right to the bitter end.

Mind you, I think this says a lot more about programmers (particularly ones who make programming languages) than it does about languages themselves.

--

† (Unless, like, you’re Donald Knuth.)

IggleSniggle · 6 years ago
This is why TypeScript is so popular. JavaScript is very loose, but you can progressively enhance it using TypeScript with stricter and stricter types, many of which can be inferred, and since 3.7 can be derived by runtime assertions, and the strictness of which can be controlled with compiler flags.

Additionally, it is fairly typical in js to use functional programming styles, making type inference a breeze / provable / not introducing much boilerplate by adding types.

neilparikh · 6 years ago
Statically typed languages can, and do, permit “fast and dirty” code.

For example, in Haskell: https://downloads.haskell.org/~ghc/latest/docs/html/users_gu...

brabel · 6 years ago
If you think it helps to quickly write some code without types (I don't really agree with that in general, though maybe in some specific cases), but you want to add the types later, once you're happy with the design, there are many languages that support that!

Off the top of my head:

* Dart (just don't give a type and the variable is dynamic)

* Groovy (add @CompileStatic or @TypeChecked when you're ready)

* Racket (start using typed-racket)

thrwaway69 · 6 years ago
One point based on my observation: programmers coming from languages with a good type system seem to have a low probability of being a glueman. While a good type system doesn't need to have static typing, in general such systems tend to have a way to explicitly declare things when you need to.

The current dynamic language ground is riddled with linters and additional tooling that static languages have solved already. This costs developer time (god, the amount of time people spend on writing an eslint config and selling it.)

chrisrhoden · 6 years ago
What's a glueman?
radicalbyte · 6 years ago
It's not so much allowing debt, it's that the barrier to entry is extremely low. You can hack around and get a feature "working" without having a great understanding of what you're actually doing.

Statically typed languages will confront you with a lot of obscure errors and just generally get in the way.

Eventually the good developers will learn the language and get burnt by float32 / SQL injection / XSS / dates / not validating input, and they'll learn enough to become extremely productive.

Static languages have evolved a lot; now the standard error messages are far better and the tooling is amazing. At the same time a lot of dynamic features - usually reflection based - have come along and provided their own breed of arcane and nutty rules and errors which seem designed to catch out only the strongest developers (the ones which break SOLID are the best!).

nearbuy · 6 years ago
The trade-off is that the bugs flagged by static type checking are usually quick to fix (eg: function expected an int but was passed a float). With dynamic typing, the subtle bug that arises from the function unexpectedly truncating your float could take an hour to track down and outweigh any time saved from dynamic typing. This applies even when trying out new experimental features.
hhas01 · 6 years ago
Depends on the language. I do agree that lossy coercions are a Bad Idea; a [mis]feature added with the best of intentions and inadequate forethought (e.g. AppleScript, JavaScript, PHP).

OTOH, a good [weak] untyped language should accept a float where an int is expected as long as it has no fractional part (otherwise it should throw a runtime type error, not implicitly round). Depending on how weak the language is, it could also accept a string value, as long as it can be reliably parsed into an int (this assumes a canonical representation, e.g. "1.0"; obviously localized representations demand explicit parsing, e.g. "12,000,000.00").

A good rule of thumb, IMO, is: “If a value looks right, it generally† is.” This is particularly true in end-user languages where conceptual overheads need to be kept to a bare minimum. (Non-programmers have enough trouble with the “value” vs “variable” distinction without piling type concepts on top.)

--

† Ideally this qualifier wouldn’t be needed, but there may be awkward corner cases where a simpler data type can be promoted to a more complex one without affecting its literal representation. That raises the problem of how to treat the latter when passed to an API that expects the former: throw a type error or break the “no lossy conversion” rule? There’s no easy answer to this. (Personally, I treat it as a type error, taking care in the error message to state the value’s expected vs actual type in addition to displaying the value itself.)

dboreham · 6 years ago
You're supposed to have written a test that catches that bug. Which of course also takes more time and effort than simply using the type checker in the compiler of a typed language.
pepper_sauce · 6 years ago
Static types enforced by a compiler catch, at build time, the bugs in logic that can be encoded in the type system. How costly are these bugs? It probably depends on context. For a business app/SaaS, is the compiler going to prevent your broken business rules from allowing -100 items to be added to a basket, crediting the "buyer"? I would say a developer who knows how to interpret requirements is more important here. On the other hand, a compiler is probably an amazing place for static types, but I don't write compilers and I'd wager most jobbing devs don't either.

Predicate assertions instrumented to run during development catch an equivalent amount and more, since they can evaluate data at run-time.

Dynamic types combined with argument destructuring allow for very loosely coupled modules. I can see it being similar to row polymorphism, but then you have to ask whether it's worth the extra time. In many business apps a significant portion of LoC is mapping JSON/XML to DTOs to Entities to SQL and back. If everything starts and ends with JSON coming from an unverified source, forcing it into a "safe space" statically typed business program is almost missing the forest for the trees, possibly even giving a false sense of security. It's (over) optimising one segment of the system; it's not necessarily a waste but it's probably time which can be better spent elsewhere.

rightbyte · 6 years ago
You could create a temporary inherited class and later delete it and merge the change into the parent class, in a static OOP language.
danmaz74 · 6 years ago
To reinforce your point: static typing doesn't just force you to fix some bugs earlier, but it also forces you to spend more time doing design upfront, often when you still don't have clear specs. That can be a big drag when you need to do a lot of experimentation/iteration.
realharo · 6 years ago
But you don't need to have a final design straight from the beginning. You can start with only what you need right now and evolve the types/schema over time.

In fact, proponents of static typing would argue that the types make your code easier to refactor later, because you will be aware of all the usages, and able to move things around with confidence.

The drawback is that you need to make the entire codebase correct (or commented out) every time, not just some isolated piece you're experimenting on.

david_draco · 6 years ago
Interesting, I would have said static typing allows more technical debt. Illustrating example: let's say you pass a double variable from some part of your code through 13 layers of APIs until it is actually looked at and acted upon. Now you realise that you need not only a double but also a boolean. In dynamic typing, you can make a tuple containing both and modify only the beginning and end points. In static typing you have to alter/refactor the type everywhere.
codeflo · 6 years ago
It’s often considered bad practice to pass around raw values for that very reason. Introduce a (minimal) abstraction, that is, give the thing you’re passing through those layers a name. Then you can change the endpoints at will and still get static checking (as a plus, you can’t accidentally pass the wrong bool, or mix up the tuple order).
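
For example (a sketch with made-up names):

  -- Name the value being threaded through the layers once; the 13
  -- intermediate layers only ever mention Reading, so adding the
  -- boolean later touches this definition and the two endpoints.
  data Reading = Reading
    { value   :: Double
    , isValid :: Bool  -- the field added later
    }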

I agree with the GP’s point that static typing forces you to do that kind of design work earlier.

(Edit: You raise a good point, though. I think a lot of people run into this kind of problem with static typing.)

sethammons · 6 years ago
You'd love Perl. Just pass @_ through your callstack. Want a new value available 20 functions deep that is available at the top? Just toss it in @_ at the top and you are done.

The problem? Every one of your 20 levels of functions/subroutines has an unnamed grab bag of variables. You get to keep that in your head. If you want to know if you have a value in a given branch of code, your best bet, aside from reading the code of the entire callstack, is to dump @_ in a print statement and run the entire program and get it to call the function you are in. Oh, and if one of those values contains a few screens worth of data, you will need to filter that out manually. Even "documentation" in the form of comments or tests will be unreliable due to comment-rot or mocked test assumptions.

Even in Python, I'll often have to go up the callstack to know what a named parameter actually is. And if similar shenanigans are going on, I again have to pull out a debugger or print statements to know what I can do with an argument.

With a static type, I see plain as the text on my screen what type I have as a parameter and I immediately know what I can do with it. As weak as Go's type system is, it is worlds better than Perl and Python for maintaining and creating large, non-trivial codebases. The price is passing it around at the time of writing.

toastal · 6 years ago
And on layer 10 you missed that you were using and treating the argument as if it were the double. Now you have a bug that the type system would have prevented. Or you can alias that input early if you know with good certainty that it may change, and then you just update the alias and everything gets checked all the way down without a refactor.
caseymarquis · 6 years ago
I think at that point you probably need a major refactor, regardless of the language being used. While I don't want to be in that situation, I'll typically use a helper class if I'm forced to pass something through 13 layers (a class containing all the arguments).

However, that's pretty much the textbook example for designing with dependency injection in mind. Static typing won't fix a terrible architecture.

_pmf_ · 6 years ago
> Lets say you pass a double variable from some part of your code through 13 layers of APIs until it is actually looked at and acted upon. Now you realise that you not only need a double, but also a boolean. In dynamic typing, you can make a tuple containing both and only modifying the beginning and end points. In static typing you have to alter/refactor the type everywhere.

Adding a double and boolean field to a Context object, I don't have to touch anything at the intermediary API layers. Just as with your tuple/dict.

sweeneyrod · 6 years ago
You could also only modify the endpoints in a statically typed language with type inference.
marcosdumay · 6 years ago
Yet, dynamic types make it much more costly to refactor your software. So that technical debt costs way more to fix.
The_rationalist · 6 years ago
According to this, statically typed languages offering a dynamic type ("any" in TypeScript) allow the best of both worlds. And using the right types is a matter of refactoring.
zozbot234 · 6 years ago
The hard part is not offering a "dynamic" type (which is generally just a variant type with a bunch of RuntimeType cases) but making the "static" and "dynamic" portions of the language truly, tightly interoperable. Generally, this implies that the compiler and runtime system should be able to correctly assign "blame" to some dynamic part of the program for a runtime type error that impacts "static" code, and this can be quite non-trivial in some cases.
Vinnl · 6 years ago
I agree, and I think that's one of the reasons TypeScript works so well, and why it's a shame that some people dismiss it beforehand because they generalise from other statically typed languages to TypeScript. Not only are you able to consciously take up some technical debt somewhere, but it's also clearly marked by the type system.
roenxi · 6 years ago
Which is entirely plausible. If all the decisions about what data should be flowing where in a program have been made there is no particular reason not to have static typing. It won't make things worse, will enforce discipline and will probably catch bugs.

As far as static types are feasible they are great to have. Clojure's spec is the gold standard I work to; if anyone has a better system it needs a lot more publicity.

youerbt · 6 years ago
This "be liberal in what you accept" idea, applied to modern programming, always struck me as strange.

Yes, taking an unknown structure in your program is the easy part. Programming against an unknown structure is where the problem lies.

I'd love to hear more examples of programming against such input that are beneficial compared to the "parse, don't validate" idea.

bitwize · 6 years ago
It's called Postel's law, and it's one of the burdensome idiocies Unix programmers have saddled us with, along with text-file formats and protocols, null-terminated strings, fork(2), and the assumption that I/O is synchronous by default.

Of course, once you adopt a "follow the spec or GTFO" stance, you reap other benefits as well; for example you are free to adopt a sensible binary format :)

_8ljf · 6 years ago
The problem right there is in the definition of “liberal”.

A cautious, forward-thinking designer-developer would interpret it as “allow for unknowns”. Whereas sloppy-ass cowboys think it means “accept any old invalid crap” and pass their garbage accordingly.

One of these philosophies gave us HTTP; the other HTML. And no prizes for knowing which one is an utter horror to consume (an arrangement, incidentally, that works swimmingly well for its entrenched vendors and absolutely no-one else).

fanf2 · 6 years ago
Postel was not a Unix programmer when he formulated his law.
nickik · 6 years ago
In Clojure you generally accept any kind of map, do some operation and return a 'copy' of that map, or pass it on.

A simple example is a Ring web stack, where requests flow through different functions that transform the HTTP request.

Each function just assumes the keys it needs are there; or, to do input validation, you just validate that the incoming map has the keys that you require without caring what else is in there. Clojure Spec will also validate the value at each key, even if it is an arbitrarily complex map.

What this gets you is that you can pass around generic structures that can be used with all standard Clojure functions and most libraries; validation is applied as needed, on just the keys you require.

youerbt · 6 years ago
But I believe this is also in the spirit of the article. Your functions still assume something about the input and are polymorphic over everything else in it.

AFAIK this can also be done in a type safe manner in languages which support extensible records/row polymorphism.

yakshaving_jgt · 6 years ago
There is no reason why the same would not be possible in a statically-typed language.
MaxBarraclough · 6 years ago
> This "be liberal in what you accept" idea, applied to modern programming, always struck me as strange.

Agreed. There's a reason that serious correctness-oriented languages, like Ada, do not use this approach.

paganel · 6 years ago
Almost nobody uses Ada, though, while one of the most read programming language-related websites (lambda-the-ultimate) is written in a dynamic language. Also, in the day-to-day interactions of the real world we almost never carry out “validation”; we stop at “parsing” most of the time, otherwise almost nothing would ever get done.

In other words “walks like a duck/quacks like a duck” is enough for most of the open world operations, no need for Platonic-like ideals or Aristotle-like categorizations. In the real world we can still use any piece of wood or stone as a table as long as food doesn’t fall off it, because said piece of wood/stone quacks and walks like a table, while Plato or Aristotle or most of the static type proponents would argue that we shouldn’t do that as that piece of stone or wood doesn’t have 4 legs and as such is not really a table/doesn’t correspond to the idea/type of a real table.

yen223 · 6 years ago
In fairness, there's also a reason why correctness-oriented languages don't see a lot of use in the industry.
slifin · 6 years ago
Most Clojure teams I speak to spec their input using Clojure spec, then they get custom recursive error reporting, validation, coercion and generators

Clojure specifications can also be converted into other forms of specification like graphql schema, database schema, json schema etc

The best I've seen a type system do on user input is blow up at runtime, and that's not good enough for me.

yakshaving_jgt · 6 years ago
> The best I've seen a type system do on user input is blow up at runtime and that's not good enough for me

Then you haven’t been looking closely enough. The article addresses exactly this wrong argument.

cobbzilla · 6 years ago
Who’s applying Postel’s Law to programming? Citation please.

It was intended for network protocols/distributed systems communications — situations with extremely loose coupling between components. It’s useful in this very specific paradigm, not for arbitrarily programming anything.

iamflimflam1 · 6 years ago
I have developed and maintained several large systems in both dynamic and statically typed languages.

From a maintenance point of view, the statically typed ones definitely win out.

Trying to reason about bits of code with no idea what was supposed to be passed in. Working out which bits of code are actually dead and can be safely removed. Refactoring bits of code. All incredibly difficult in the dynamically typed systems.

Given the choice, I would not embark on a large complex system without the benefit of a strongly typed language.

sparkie · 6 years ago
There are two aspects to maintenance. One is in maintaining a codebase, which is what you're referring to. This greatly benefits from static typing as you can have good guarantees about your code before you deploy it.

The other aspect of maintenance is in keeping a running system up without any downtime. There are plenty of use cases where recompiling code and relaunching an application is not a viable solution. You need to be able to patch a running system. Dynamic typing is beneficial here because you need the old running code to be able to call into the new code, which it knew nothing about when it was originally compiled.

pizlonator · 6 years ago
Dynamic type systems are inherently more open.

This article really bends over in strange ways to say otherwise.

Fact is: dynamic typing is all about making fewer claims in your code about what you expect from the world around you. With dynamic types, to load a property you might just have to say the property name and receiver. With static types, you usually also have to say the type of all other properties of the receiver (usually by calling out the receiver’s nominal type). Hence, as systems evolve, the probability that the dynamic property lookup will fail due to something changing is lower than the probability of the static lookup failing.

The heading “you can’t process what you don’t know” is particularly confused. You totally can, and dynamic typing is all about doing that. In the dynamic typing case, you can process “o.f” without knowing the type of “o.g”, or even the full type of “o.f” (you only have to know enough about the type of o.f to know what you can do with it). Static type systems will typically require you to state the full type of o. So, this is an example of dynamic types being a tool that enables you to process what you don’t know.

ookdatnog · 6 years ago
> Static type systems will typically require you to state the full type of o.

This is not the case for languages with support for structural typing (as the article mentions in the appendix), and most modern statically typed languages have some degree of support for abstract interfaces, which also allow writing functions with only partially known information about the type.

I think one of the core insights in the article is that your code will always make at least some assumptions about what it is processing (unless you're implementing the identity function, or a function which does not inspect its argument at all), and these assumptions are your type. So if you expect "o" to have a field called "f", and this field should return something on which a "+" operator is defined, then these properties form your type (in structural typing this is easy and boilerplate-free to express, with interfaces there's some boilerplate but it's definitely expressible).
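
In Haskell, for instance, one might spell out exactly those assumptions as a hand-rolled class (a sketch; HasF and FOf are made-up names, not a standard API):

  {-# LANGUAGE FlexibleContexts #-}
  {-# LANGUAGE TypeFamilies #-}

  -- The complete set of assumptions: o has an f, and whatever f
  -- returns supports (+). Nothing else about o is stated anywhere.
  class HasF o where
    type FOf o
    getF :: o -> FOf o

  twiceF :: (HasF o, Num (FOf o)) => o -> FOf o
  twiceF o = getF o + getF o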

In that light, the difference between static and dynamic typing isn't how many assumptions you make in your code, but rather how explicit you are about them, and to what extent you make your assumptions amenable to automated reasoning.

sparkie · 6 years ago
The difference between static and dynamic typing isn't about how explicit you are about the assumptions in your code. You can be equivalently explicit in dynamic and static languages. The fundamental difference lies in when you want the types to be verified. In static typing, that is before the program is run. In dynamic typing, that is before the code is executed if you are explicit, or when the code is executed if you are not explicit enough.

An example of dynamic typing being used pervasively in what we commonly call statically typed languages is the downcast. (SubType)superTypeObject is a dynamic typing construct. It is saying "defer unification of these types until runtime," because there is insufficient information at compilation time to determine the real type of superTypeObject.

Of course, such downcasts are discouraged in statically typed languages without explicitly checking the type of superTypeObject before performing the downcast, but not all type systems are capable of asserting these checks are all in place at compile time. Some statically typed languages don't even have a downcast.

You can be completely explicit about checking types before using them in a dynamically typed language, and have the proper error handling in place in the case that the type unification you're expecting doesn't happen.

pizlonator · 6 years ago
Structural typing doesn’t save you unless you add other stuff to it.

Example:

    var o = ...; // Value comes from somewhere, doesn’t matter where so long as it’s opaque
    if (p)
        ... = o.f
    else
        ... = o.g
In dynamic typing, we only assert that o has f on the then branch, and only assert that o has g on the else branch.

Structural typing or abstract base classes or whatever would only save you here if you were very careful about giving o some top type and then casting. But that’s more work - so the tendency is that the static types version of this program will assert that o has both f and g at the point of assignment. And that represents a tighter coupling to whatever produces o.

scarmig · 6 years ago
Structural static typing lets you do exactly what you claim static type systems are incapable of letting you do. It's fair to say those systems are not common--TypeScript is the only one I know of--and I wonder why that is, but it's not an inherent failure of static type systems.
mrkeen · 6 years ago
I think the author is offering "you can’t process what you don’t know" as an example of something that dynamic proponents claim that statically-typed languages can't do, followed by a refutation of it.
yakshaving_jgt · 6 years ago
Wrong. There are statically-typed languages with polymorphism.
phoe-krk · 6 years ago
Disclosure: dynamic typer here.

The way I understand it, a part of the issue is about possibly deferring the time at which the type of data is known to the time at which the data is operated upon, since calling a numeric addition function on not-numbers, e.g. strings, is a type violation no matter which programming language you are writing in. The question is where you want your parsing-or-validating to occur, since this decision makes pieces of the software either more tightly coupled or more heterogeneous in which kinds of data they accept. The author makes and proves a claim that this decision, which is naturally postponable in JS due to its highly dynamic and schemaless-by-default object nature, is similarly postponable in Haskell.

The other part, about ignoring unknown keywords, is very simple and understandable to me: you can indeed allow a statically typed program to ignore unknown schema keywords, just as you can allow a dynamically typed one to error on them.

zozbot234 · 6 years ago
The author actually argues against Haskell as an example of their philosophy, because Haskell has poor support for open static types. They generally have to be faked in unintuitive ways by relying on the typeclasses feature, and something like OCaml's support for structural, extensible records and variants seems to be entirely off limits out-of-the-box.
yakshaving_jgt · 6 years ago
I don’t think that’s correct. The author’s point is that structural typing is possible in statically-typed languages, but isn’t a thing in Haskell.

To learn more, Google for “Haskell row polymorphism”.

The author correctly points out that structural vs nominal is a separate argument from static vs dynamic.

StandardFuture · 6 years ago
I actually really enjoy your views on dynamic typing.

But, if we agree on this definition, then is it not an admission that "dynamic type systems" don't actually exist, and that instead we are simply talking about evaluation/validation being programmatically specified by the programmer instead of embedded evaluation/validation by the compiler and/or runtime? And since it is being specified by the programmer, we run into the possibility of that validation being incorrectly implemented or possibly forgotten entirely.

Any type system that had better verification of self-programmed type validation code from the programmer would solve this issue, no?

phoe-krk · 6 years ago
> "dynamic type systems" don't actually exist

One definition of a "dynamic type system" that I am aware of is that there is runtime-existent information about the types of particular pieces of data - or, somewhat equivalently, that types are information associated with values, not variables. By that definition, one could argue that e.g. Java has a dynamic type system.

The way I understand the question you're asking is: if a language has an "eval" function that accepts arbitrary code as input, what prevents it from having static-type-system-class parsing and type safety guarantees as a part of its functioning? It essentially means that the compiler for a language must both be permanently present in memory (like for Javascript, or Lisp, or Smalltalk) and it must implement the static type checks for all code that it parses (like for Haskell).

I think that these two requirements are orthogonal to each other and therefore they don't really clash with each other. Either there will be languages which implement both of these paradigms at the same time, or there already are such languages, and I am not yet aware of their existence.

andybak · 6 years ago
Having moved from Python to C# I'm still coming to terms with how I feel.

It's really hard to express but I do have a nagging feeling that something has been lost that other commentators haven't quite put into words either. You just write different code in dynamic languages - even ignoring type declarations. And in many cases it feels like better, more humane code.

One piece of evidence for this elusive difference is my observation that APIs in Python tend to be much nicer and much less verbose (again - ignoring the obvious differences based directly on type declarations). APIs tend to feel like they were designed for usability rather than mapping directly onto library code, user be damned.

Is this a cultural difference or does something about static typing lead to different patterns of thought?

jacobsenscott · 6 years ago
Unfortunately C# does not have a very nice type system compared to more modern typed languages - Elm, Haskell, Ruby's Sorbet. That is probably part of the reason you are feeling that way.
hhas01 · 6 years ago
“my observation that API's in Python tend to much nicer and much less verbose”

I trust you don’t mean Python’s stdlib APIs, which are an inconsistent mess. (Python 3 did very little to improve this, unfortunately.)

There’s also a lot to be said for only having to annotate interfaces once and once only, and have that information reliably applied at compile/run time (depending on language) and across all documentation and tooling too. Docstring schemes are the pits (though most typed languages are little/no better than untyped ones here).

yawaramin · 6 years ago
Have you used OCaml? It's a quite different experience than C#. See http://roscidus.com/blog/blog/2014/02/13/ocaml-what-you-gain...