Readit News
LegionMammal978 · a month ago
> How many times did you leave a comment on some branch of code stating "this CANNOT happen" and thrown an exception? Did you ever find yourself surprised when eventually it did happen? I know I did, since then I at least add some logs even if I think I'm sure that it really cannot happen.

I'm not sure what the author expects the program to do when there's an internal logic error that has no known cause and no definite recovery path. Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.

At some level, the simplest thing to do is to give up and crash if things are no longer sane. After all, there's no guarantee that 'unreachable' recovery paths won't introduce further bugs or vulnerabilities. Logging can typically be done just fine within a top-level exception handler or panic handler in many languages.

thatoneengineer · a month ago
Ideally, if you can convince yourself something cannot happen, you can also convince the compiler, and get rid of the branch entirely by expressing the predicate as part of the type (or a function on the type, etc.)

Language support for that varies. Rust is great, but not perfect. Typescript is surprisingly good in many cases. Enums and algebraic type systems are your friend. It'll never be 100% but it sure helps fill a lot of holes in the swiss cheese.

Because there's no such thing as a purely internal error in a well-constructed program. Every "logic error" has to bottom out in data from outside the code eventually-- otherwise it could be refactored to be static. Client input is wrong? Error the request! Config doesn't parse? Better specify defaults! Network call fails? Yeah, you should have a plan for that.
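To make the idea concrete, here's a minimal Rust sketch (the `Connection` type is invented for illustration): instead of a struct with an optional session id plus a "this CANNOT happen" branch, an enum makes the invalid combination unrepresentable, so the compiler checks the predicate for you.

```rust
// Hypothetical example: encode the valid states directly as an enum
// rather than as a struct with optional fields. There is no leftover
// branch for "connected but no session id" -- that state cannot be built.
enum Connection {
    Disconnected,
    Connected { session_id: u64 },
}

fn describe(conn: &Connection) -> String {
    // Every match arm is a state that really can occur.
    match conn {
        Connection::Disconnected => "not connected".to_string(),
        Connection::Connected { session_id } => format!("session {session_id}"),
    }
}

fn main() {
    assert_eq!(describe(&Connection::Disconnected), "not connected");
    assert_eq!(describe(&Connection::Connected { session_id: 7 }), "session 7");
}
```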

dmurray · a month ago
Not every piece of logic lends itself to being expressed in the type system.

Let's say you're implementing a sorting algorithm. After step X you can be certain that the values at locations A, B, and C are sorted such that A <= B <= C. You can be certain of that because you read the algorithm in a prestigious journal, or better, you read it in Knuth and you know someone else would have caught the bug if it was there. You're a diligent reader and you've convinced yourself of its correctness, working through it with pencil and paper. Still, even Knuth has bugs and perhaps you made a mistake in your implementation. It's nice to add an assertion that at the very least reminds readers of the invariant.

Perhaps some Haskeller will pipe up and tell me that any type system worth using can comfortably describe this PartiallySortedList<A, B, C>. But most people have to use systems where encoding that in the type system would, at best, make the code significantly less expressive.

josephg · a month ago
Yes, this has been my experience too! Another tool in the toolbox is property / fuzz testing. Especially for data structures, and anything that looks like a state machine. My typical setup is this:

1. Make a list of invariants. (Eg if Foo is set, bar + zot must be less than 10)

2. Make a check() function which validates all the invariants you can think of. It’s ok if this function is slow.

3. Make a function which takes in a random seed. It initializes your object and then, in a loop, calls random mutation functions (using a seeded RNG) and then calls check(). 100 iterations is usually a good number.

4. Call this in an outer loop, trying lots of seeds.

5. If anything fails, print out the failing seed number and crash. This provides a reproducible test so you can go in and figure out what went wrong.

If I had a penny for every bug I’ve found doing this, I’d be a rich man. It’s a wildly effective technique.
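The recipe above can be sketched in Rust with a toy data structure (the `SortedVec` type and the hand-rolled LCG are illustrative stand-ins; in practice you'd likely reach for the `rand` crate or a framework like proptest):

```rust
// A tiny seeded RNG so the sketch needs no external crates.
struct Lcg(u64);
impl Lcg {
    fn next(&mut self) -> u64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        self.0 >> 33
    }
}

// Toy structure with one invariant: elements are non-decreasing.
struct SortedVec(Vec<u64>);
impl SortedVec {
    fn insert(&mut self, x: u64) {
        let pos = self.0.partition_point(|&y| y <= x);
        self.0.insert(pos, x);
    }
    // Step 2: a check() that validates every invariant; fine if it's slow.
    fn check(&self) {
        assert!(self.0.windows(2).all(|w| w[0] <= w[1]), "invariant violated");
    }
}

// Step 3: one run driven by a seed -- init, then mutate + check in a loop.
fn run_seed(seed: u64) {
    let mut rng = Lcg(seed);
    let mut v = SortedVec(Vec::new());
    for _ in 0..100 {
        v.insert(rng.next() % 1000);
        v.check();
    }
}

fn main() {
    // Steps 4-5: outer loop over many seeds; a panic names the failing
    // seed, giving a reproducible test case.
    for seed in 0..50 {
        run_seed(seed);
    }
    println!("all seeds passed");
}
```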

TruePath · a month ago
There is no inherent benefit in going and expressing that fact in a type. There are two potential concerns:

1) You think this state is impossible but you've made a mistake. In this case you want to make the problem as simple to reason about as possible. Sometimes types can help but other times it adds complexity when you need to force it to fit with the type system.

People get too enamored with the fact that immutable objects or certain kinds of types are easier to reason about, other things being equal, and miss the fact that the same logic can be expressed in any Turing-complete language, so these tools only yield a net reduction in complexity if they are a good conceptual match to the problem domain.

2) You are genuinely worried about the compiler or CPU not honoring its theoretical guarantees -- in this case rewriting it only helps if you trust the code compiling those cases more for some reason.

wakawaka28 · a month ago
Sometimes the "error" is more like, "this is a case that logically could happen, but I'm not going to handle it, nor refactor the whole program to stop it from being expressible."
cozzyd · a month ago
Until you have a bit flip or a silicon error. Or someone changed the floating point rounding mode.
skydhash · a month ago
A comment "this CANNOT happen" has no value in itself. Unless you've formally verified the code (including its dependencies) and have the proof linked, such comments may as well be wishes and prayers.

Yes, sometimes the compiler or the hardware has bugs that violate the premises you're operating on, but that's rare. And most non-pure algorithms (side effects and external systems) have documented failure cases.

JohnFen · a month ago
> A comment "this CANNOT happen" has no value in itself.

I think it does have some value: it makes clear an assumption the programmer made. I always appreciate it when I encounter comments that clarify assumptions made.

threethirtytwo · a month ago
False, it has value. It's actually even better to log it or throw an exception: print(“this cannot happen.”)

If you see it, you immediately know the class of error: it's purely a logic error, a programming mistake the programmer made. Logging it makes it explicit that your program has a logic bug.

What if you didn’t log it? Then at runtime you will have to deduce the error from symptoms. The log tells you explicitly what the error is.

AnimalMuppet · a month ago
Worse: you may have created the proof. You may have linked to the proof. But if anyone has touched any of the code involved since then, it still has no value unless someone has re-done the proof and linked that. (Worse, it has negative value, because it can mislead.)
CupricTea · a month ago
>Further down the article, the author suggests bubbling up the error with a result type, but you can only bubble it up so far before you have to get rid of it one way or another. Unless you bubble everything all the way to the top, but then you've just reinvented unchecked exceptions.

Not necessarily. Result types are explicit and require the function signature to be changed for them.

I would much prefer to see a call to foo()?; where it's explicit that it may bubble up from here, instead of a call to foo(); that may or may not throw an exception my way with no way of knowing.

Rust is absolutely not perfect with this though since any downstream function may panic!() without any indication from its function signature that it could do so.
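A small sketch of that contrast in Rust (the functions and the config format are hypothetical): fallibility is part of the signature, and every propagation point is marked with `?` at the call site.

```rust
// The error type is visible to every caller -- nothing can "throw"
// past this signature unannounced.
fn parse_port(s: &str) -> Result<u16, std::num::ParseIntError> {
    s.trim().parse::<u16>()
}

fn load_config(line: &str) -> Result<u16, std::num::ParseIntError> {
    // The `?` makes it explicit that an error may bubble up from here.
    let port = parse_port(line)?;
    Ok(port)
}

fn main() {
    assert_eq!(load_config("8080"), Ok(8080));
    assert!(load_config("not a port").is_err());
}
```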

svantana · a month ago
> At some level, the simplest thing to do is to give up and crash if things are no longer sane.

The problem with this attitude (that many of my co-workers espouse) is that it can have serious consequences for both the user and your business.

- The user may have unsaved data

- Your software may gain a reputation of being crash-prone

If a valid alternative is to halt normal operations and present an alert box to the user saying "internal error 573 occurred. please restart the app", then that is much preferred IMO.

Calavar · a month ago
> If a valid alternative is to halt normal operations and present an alert box to the user saying "internal error 573 occurred. please restart the app", then that is much preferred IMO.

You can do this in your panic or terminate handler. It's functionally the same error handling strategy, just with a different veneer painted over the top.

lmm · a month ago
Crashing is bad, but silently continuing in a corrupt state is much worse. Better to lose the last few hours of the user's work than corrupt their save permanently, for example.
Krssst · a month ago
> Your software may gain a reputation of being crash-prone

Hopefully, crashing on unexpected state rather than silently running on invalid state leads to more bugs being found and fixed during development and testing, and to less crash-prone software.

saagarjha · a month ago
So you don't get a crash log? No, thanks.
SAI_Peregrinus · a month ago
> The user may have unsaved data

That should not need to be a consideration. Crashing should restore the state from just before the crash. This isn't the '90s, users shouldn't have to press "save" constantly to avoid losing data.

the__alchemist · a month ago
This is what Rust's `unreachable!()` is for... and I feel hubris whenever I use it.
tialaramex · a month ago
You should prefer to write unreachable!("because ...") to explain to some future maintenance engineer (maybe yourself) why you believed this would never be reached. Since they know it was reached they can compare what you believed against their observed facts and likely make better decisions.

But at least telling people that the programmer believed this could never happen short-circuits their investigation considerably.
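For example, a hedged sketch of the suggestion (the function and its documented precondition are invented for illustration):

```rust
// The message records *why* the branch was believed dead, so a future
// maintainer who actually hits it can compare the stated belief against
// the observed facts.
fn day_kind(day: u8) -> &'static str {
    match day {
        1..=5 => "weekday",
        6 | 7 => "weekend",
        // Hypothetical precondition: callers pass 1..=7, validated at
        // the API boundary in a (fictional) parse_day() function.
        _ => unreachable!("day {day} out of 1..=7; input is validated in parse_day()"),
    }
}

fn main() {
    assert_eq!(day_kind(3), "weekday");
    assert_eq!(day_kind(6), "weekend");
}
```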

GabrielBRAA · a month ago
Heh, recently I had to fix a bug in some code that had one of these comments. Feels like a sign of bad code or laziness. Why make a path that should not happen? I can get it when it's in some while loop that should find something to return, but in an if/else sequence it feels really wrong.
kccqzy · a month ago
Strong disagree about laziness. If the dev is lazy they will not make a path for it. When they are not lazy they actually make a path and write a comment explaining why they think this is unreachable. Taking the time to write a comment is not a sign of laziness. It’s the complete opposite. You can debate whether the comment is detailed enough to convey why the dev thinks it’s unreachable, but it’s infinitely better than no comment and leaving the unreachability in their head.
t-writescode · a month ago
Before sealed classes and ultra-robust type checking, sometimes private functions would have, say, 3 states that should be possible, but 3 years later, a new state is added but wasn’t checked because the compiler didn’t stop it because the language didn’t support it at that time.
bccdee · a month ago
It's much better to have a `panic!("this should never happen")` statement than to let your program get into an inconsistent state and then keep going. Ideally, you can use your type system to make inconsistent states impossible, but type systems can only express so much. Even Haskell can't enforce typeclass laws at the compiler level.

A program that never asserts its invariants is much more likely to break those invariants than one that does.

supermdguy · a month ago
> A common pattern would be to separate pure business logic from data fetching/writing. So instead of intertwining database calls with computation, you split into three separate phases: fetch, compute, store (a tiny ETL). First fetch all the data you need from a database, then you pass it to a (pure) function that produces some output, then pass the output of the pure function to a store procedure.

Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design? I've heard a lot about it, contrived examples make it seem like something I'd want, but I often find it's much more difficult in real-world cases.

Random example from my codebase: I have a function that periodically sends out reminders for usage-based billing customers. It pulls customer metadata, checks the customer type, and then based on that it computes their latest usage charges, and then based on that it may trigger automatic balance top-ups or subscription overage emails (again, depending on the customer type). The code feels very messy and procedural, with business logic mixed with side effects, but I'm not sure where a natural separation point would be -- there's no way to "fetch all the data" up front.
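One possible split, sketched in Rust with invented types that roughly mirror the scenario: the pure core maps already-fetched data to a list of intended actions, and the imperative shell fetches inputs and executes those actions.

```rust
// Deferred-intent actions: the pure core *describes* side effects
// instead of performing them. All names here are hypothetical.
#[derive(Debug, PartialEq)]
enum Action {
    TopUp { customer_id: u32, amount: u64 },
    SendOverageEmail { customer_id: u32 },
}

struct Customer {
    id: u32,
    prepaid: bool,
    usage: u64,
    limit: u64,
}

// Functional core: no I/O, trivially unit-testable.
fn decide(c: &Customer) -> Vec<Action> {
    let mut actions = Vec::new();
    if c.usage > c.limit {
        if c.prepaid {
            actions.push(Action::TopUp { customer_id: c.id, amount: c.usage - c.limit });
        } else {
            actions.push(Action::SendOverageEmail { customer_id: c.id });
        }
    }
    actions
}

fn main() {
    // Imperative shell (sketched): fetch customers, call decide(), then
    // execute each Action against the real payment/email systems.
    let c = Customer { id: 1, prepaid: true, usage: 120, limit: 100 };
    assert_eq!(decide(&c), vec![Action::TopUp { customer_id: 1, amount: 20 }]);
    let ok = Customer { id: 2, prepaid: false, usage: 50, limit: 100 };
    assert!(decide(&ok).is_empty());
}
```

The conditional "pull more data" steps stay in the shell; the core only ever sees plain values.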

bambax · a month ago
What I'm currently doing could be called compute-fetch-store: the compute part is done entirely in the database with SQL views stacked one on top of the other. Then the program just fetches the result of the last view and stores it where it needs to be stored.

Stacked views are sometimes considered an anti-pattern, but I really like them because they're purely functional, have no side-effects whatsoever and cannot break (they either work or they don't, but they can't start breaking in the future). And they're also stateless: they present a holistic view of the data that avoids iterations and changes how you think about it. (Data is never really 'transformed', it's simply 'viewed' from a different perspective.)

Not saying that's the only way, or the best way, or even a good way! But it works for me.

I think it would apply well to the example: you could have a view, or a series of views, that compute balance top-ups based on a series of criteria; then the program would read that view and send email without doing any new calculation.

mkleczek · a month ago
This.

In-RDBMS computation specified in declarative language with generic, protocol/technology specific adapters handling communication with external systems.

Treating RDBMS as a computing platform (and not merely as dumb data storage) makes systems simple and robust. Model your input as base relations (normalized to 5NF) and output as views.

Incremental computing engines such as https://github.com/feldera/feldera go even further with base relations not being persistent/stored.

lmm · a month ago
Sometimes you really can't separate the business logic from the imperative operations; in that case you use monads and at least make it a bit more testable and refactorable (e.g. https://michaelxavier.net/posts/2014-04-27-Cool-Idea-Free-Mo...).

That said:

> It pulls customer metadata, checks the customer type, and then based on that it computes their latest usage charges, and then based on that it may trigger automatic balance top-ups or subscription overage emails (again, depending on the customer type).

So compute those things, and store them somewhere (if only an in-memory queue to start with)? Like, I can already see a separation between an ETL stage that computes usage charges, which are probably worth recording in a datastore, and then another ETL stage that computes which top-ups and emails should be sent based on that, which again is probably worth recording for tracing purposes, and then two more stages to actually send emails and execute payment pulls, which it's actually quite nice to have separated from the figuring out which emails to send part (if only so you can retry/debug the latter without sending out actual emails)

sltr · a month ago
> Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design?

I can recommend Grokking Simplicity by Eric Normand. https://www.manning.com/books/grokking-simplicity

AdieuToLogic · a month ago
> Does anyone have any good resources on how to get better at doing "functional core imperative shell" style design?

Hexagonal architecture[0] is a good place to start. The domain model core can be defined with functional concepts while also defining abstract contracts ( abstractly "ports", concretely interface/trait types) implemented in "adapters" (usually technology specific, such as HTTP and/or SMTP in your example).

0 - https://en.wikipedia.org/wiki/Hexagonal_architecture_(softwa...

grayhatter · a month ago
> there's no way to "fetch all the data" up front.

this is incorrect

I assume there's more nuance and complexity as to why it feels like there's no way. Probably involving larger design decisions that feel difficult to unwind. But data collection, decisions, and actions can all be separated without much difficulty with some intent to do so.

I would suggest caution before implementing this directly: but imagine a subroutine that all it did was lock some database table, read the current list of pending top-up charges required, issue the charge, update the row, and unlock the table. An entirely different subroutine wouldn't need to concern itself with anything other than data collection and calculating deltas; it has no idea if a customer will be charged, all it does is calculate a reasonable amount. Something smart wouldn't run for deactivated/expiring accounts, but why does this need to be smart? It's not going to charge anything, it's just updating the price that hypothetically might be used later, based on data/logic that's irrelevant to the price calculation.

Once any complexity got involved, this is closer to how I would want to implement it, because this also gives you a clear transcript of which actions happened and why. I would want to be able to inspect the metadata around each decision to make a charge.

supermdguy · a month ago
That's a good point, thinking about it some more, I think the business logic feels so trivial that it would make the code harder to reason about if it were separated from the effects. Currently, I have one giant function that pulls data, filters it, conditionally pulls more data, and then maybe has one line of effectful code.

I could have one function that pulls the wallet balance for all users, and then passes it to a pure function that returns an object with flags for each user indicating what action to take. Then another function would execute the effects based on the returned flags (kind of like the example you gave of processing a pending charges table).

The value of that level of abstraction is less clear though. Maybe better testability? But it's hard to justify what would essentially be tripling the lines of code (one function to pull the data, one pure function to compute actions, one function to execute actions).

Additionally, there's a performance cost to pulling all relevant data, instead of being able to progressively filter the data in different ways depending on partial results (example: computing charges for all users at once and then passing it to a pure function that only bills customers whose billing date is today).

Would be great to see some more complex examples of "functional core imperative shell" to see what it looks like in real-world applications, since I'm guessing the refactoring I have in my head is a naive way to do it.

t-writescode · a month ago
They can until they can’t.

Sometimes you might need to operate on a result from an external function, or roll back a whole transaction because the last step failed, or the DB could go down midway through.

The theory is good, but stuff happens and it goes out the window sometimes.

movpasd · a month ago
If your required logic separates nicely into steps (like "fetch, compute, store"), then a procedural interface makes sense, because sequential and hierarchical control flow work well with procedural programming.

But some requirements, like yours, require control flow to be interwoven between multiple concerns. It's hard to do this cleanly with procedural programming because where you want to draw the module boundaries (e.g.: so as to separate logic and infrastructure concerns) doesn't line up with the sequential or hierarchical flow of the program. In that case you have to bring in some more powerful tools. Usually it means polymorphism. Depending on your language that might be using interfaces, typeclasses, callbacks, or something more exotic. But you pay for these more powerful tools! They are more complex to set up and harder to understand than simple straightforward procedural code.

In many cases judicious splitting of a "mixed-concern function" might be enough and that should probably be the first option on the list. But it's a tradeoff. For instance, you then could lose cohesion and invariance properties (a logically singular operation is now in multiple temporally coupled operations), or pay for the extra complexity of all the data types that interface between all the suboperations.

To give an example, in "classic" object-oriented Domain-Driven Design approaches, you use the Repository pattern. The Repository serves as the interface or hinge point between your business logic and database logic. Now, like I said in the last paragraph, you could instead design it so the business logic returned its desired side-effects to the co-ordinating layer and have it handle dispatching those to the database functions. But if a single business logic operation naturally intertwines multiple queries or other side-effectful operations then the Repository can sometimes be simpler.
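A minimal Rust sketch of the Repository idea (all names invented): business logic depends only on a trait, and the database adapter (or an in-memory test double) implements it.

```rust
use std::collections::HashMap;

// The hinge point between business logic and database logic.
trait CustomerRepository {
    fn balance(&self, id: u32) -> u64;
    fn set_balance(&mut self, id: u32, amount: u64);
}

// Business logic sees only the trait, never SQL.
fn apply_top_up(repo: &mut dyn CustomerRepository, id: u32, amount: u64) {
    let current = repo.balance(id);
    repo.set_balance(id, current + amount);
}

// In-memory implementation, fine for tests; a real adapter would wrap
// actual queries behind the same trait.
struct InMemoryRepo(HashMap<u32, u64>);
impl CustomerRepository for InMemoryRepo {
    fn balance(&self, id: u32) -> u64 {
        *self.0.get(&id).unwrap_or(&0)
    }
    fn set_balance(&mut self, id: u32, amount: u64) {
        self.0.insert(id, amount);
    }
}

fn main() {
    let mut repo = InMemoryRepo(HashMap::new());
    apply_top_up(&mut repo, 1, 50);
    assert_eq!(repo.balance(1), 50);
}
```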

brickers · a month ago
This stuff is quite new to me as I’ve been learning F#, so take this with a pinch of salt. Some of the things you’d want are:

- a function to produce a list of customers

- a function or two to retrieve the data, which would be passed into the customer list function. This allows the customer list function to be independent of the data retrieval. This is essentially functional dependency injection

- a function to take a list of customers and return a list of effects: things that should happen

- this is where I wave my hands as I’m not sure of the plumbing. But the final part is something that takes the list of effects and does something with them

With the above you have a core that is ignorant of where its inputs come from and how its effects are achieved - it’s very much a pure domain model, with the messy interfaces with the outside world kept at the edges

jimbokun · a month ago
Sounds like a chain of “fetch compute store” stages, where the output of one is used as input to the next, where you then decide what other data needs to be fetched. So a pipeline instead of just a single shell and a single core.
raegis · a month ago
Maybe check out Scott Wlaschin's videos on YouTube. There is one talk for his book "Domain Modeling Made Functional" which, if I remember, was very clear and easy to follow.
pdmccormick · a month ago
Conceptually, can you break your processing up into a more or less "pure" functional core, surrounded by some gooey, imperative, state-dependent input loading and output effecting stages? For each processing stage, implement functions of well-defined inputs and outputs, with any global side effects clearly stated (e.g. updating a customer record, sending an email). Then factor all the imperative-ish querying (that is to say, anything dependent on external state such as is stored in a database) to the earlier phases, recognizing that some of the querying is going to be data-dependent ("if customer type X, fetch the limits for type X accounts"). The output of these phases should be a sequence of intermediate records that contain all the necessary data to drive the subsequent ones.

Whenever there is an action decision point ("we will be sending an email to this customer"), instead of actually performing that step right then and there, emit a kind of deferred-intent action data object, e.g. "OverageEmailData(customerID, email, name, usage, limits)". Finally, the later phases are also highly imperative, and actually perform the intended actions that have global visibility and mutate state in durable data stores.

You will need to consider some transactional semantics, such as, what if the customer records change during the course of running this process? Or, what if my process fails half-way through sending customer emails? It is helpful if your queries can be point-in-time based, as in "query customer usage as-of the start time for this overall process". That way you can update your process, re-run it with the same inputs as of the last time you ran it, and see what your updates changed in terms of the output.

If those initial querying phases take a long time to run because they are computationally or database query heavy, then during your development, run those once and dump the intermediate output records. Then you can reload them to use as inputs into an isolated later phase of the processing. Or you can manually filter those intermediates down to a more useful representative set (e.g. a small number of customers of each type).

Also, it's really helpful to track the stateful processing of the action steps (e.g. for an email, track state as Queued, Sending, Success, Fail). If you have a bug that only bites during a later step in the processing, you can fix it and resume from where you left off (or only re-run for the affected failed actions). Also, by tracking the globally affecting actions you can actually take the results of previous runs into account during subsequent ones ("if we sent an overage email to this customer within the past 7 days, skip sending another one for now"). You now have a log of the stateful effects of your processing, which you can also query ("how many overage emails have been sent, and what numbers did they include?").

Good luck! Don't go overboard with functional purity, but just remember, state mutations now can usually be turned into data that can be applied later.
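A small Rust sketch of the state-tracking idea (types invented for illustration): because each queued action carries its status, a rerun skips completed work and is safe to resume after a failure.

```rust
#[derive(Clone, Copy, PartialEq, Debug)]
enum EmailState {
    Queued,
    Sending,
    Success,
    Fail,
}

struct QueuedEmail {
    customer_id: u32,
    state: EmailState,
}

// Process only entries that still need work; Success entries are
// skipped, so rerunning after a crash or bug fix is idempotent.
// In real code the Sending state would be persisted before the attempt.
fn resume(queue: &mut Vec<QueuedEmail>, send: impl Fn(u32) -> bool) {
    for e in queue.iter_mut() {
        if e.state == EmailState::Success {
            continue;
        }
        e.state = EmailState::Sending;
        e.state = if send(e.customer_id) { EmailState::Success } else { EmailState::Fail };
    }
}

fn main() {
    let mut q = vec![
        QueuedEmail { customer_id: 1, state: EmailState::Success }, // sent last run
        QueuedEmail { customer_id: 2, state: EmailState::Queued },
    ];
    resume(&mut q, |_| true); // stub sender always succeeds
    assert!(q.iter().all(|e| e.state == EmailState::Success));
}
```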

kridsdale1 · a month ago
I really like modern Swift. It makes a lot of what this author is complaining about impossible.

The worst file I ever inherited to work on was the ObjC class for Instagram’s User Profile page. It looked like it’d been written by a JavaScript fan. There were no types in the whole file, everything was an ‘id’ (aka void*) and there were ‘isKindOfClass’ and null checks all over the place. I wanted to quit when I saw it. (I soon did).

JackYoustra · a month ago
Modern Swift makes this technically possible, but so cluttered that it's effectively impossible, especially compared with TypeScript.

Swift distinguishes between inclusive and exclusive / exhaustive unions with enums vs protocols and provides no easy or simple way to bridge between the two. If you want to define something that TypeScript provides as easily as the vertical bar, you have to write an enum definition, a protocol bridge with a type identifier, a necessarily unchecked cast back (even if you can logically prove that the type enum has a 1:1 mapping), and loads of unnecessary forwarding code. You can try to elide some of it with (IIRC, it's been a couple of years) @dynamicMemberLookup, but the compiler often chokes on this, it kills autocomplete, and it explodes compile times, because Swift's type checker degrades to exponential far more frequently than other languages', especially when used in practice, such as in SwiftUI.

tizio13 · a month ago
I think you’re conflating 'conciseness' with 'correctness.' The 'clutter' you're describing in Swift, like having to explicitly define an enum instead of using a vertical bar (|), is exactly what makes it more robust than TS for large-scale systems.

In TypeScript a union like string | number is structural and convenient, but it lacks semantic meaning. In Swift, by defining an enum, you give those states a name and a purpose. This forces you to handle cases exhaustively and intentionally. When you're dealing with a massive codebase, 'easy' type bridging is often how you end up back in 'id' or 'any' hell. Swift's compiler yelling at you is usually it trying to tell you that your logic is too ambiguous to be safely compiled, which, in a safety-first language, is the compiler doing its job.

glenjamin · a month ago
Any advice on how to learn modern Swift?

When I tried to learn some to put together a little app, every search result for my questions was a quick blog seemingly aimed at iOS devs who didn’t want to learn and just wanted to copy-paste the answer - usually in the form of an extension method.

WalterBright · a month ago
> How many times did you leave a comment on some branch of code stating "this CANNOT happen" and thrown an exception?

My code is peppered with `assert(0)` for cases that should never happen. When they trip, then I figure out why it happened and fix it.

This is basic programming technique.

doug_durham · a month ago
Typing is great, presuming that the developer did a thorough job of defining their type system. If they get the model wrong, or it is incomplete then you aren't really gaining much out of a strictly typed language. Every change is a fight. You are likely to hack the model to make the code compile. There is a reason that Rust is most successful at low level code. This is where the models are concrete and simple to create. As you move up the stack, complexity increases and the ability to create a coherent model goes beyond human abilities. That's why coding isn't math or religion. Different languages and approaches for different domains.
einpoklum · a month ago
Well, you could argue whether or not coding is math or not, but coding is _certainly_ religion. Complete with wars among sects, inquisition, excommunication, priesthood and crusades.
ZebusJesus · a month ago
This was a great breakdown and very well written. I think you made one of the better arguments for Rust I've read on the internet, but you also made sure to acknowledge that large code bases are just a different beast altogether. Personally, I will say that AI has made making code proofs or "formal verification" more accessible. Actually writing a proof for your code, or code verification, is very hard for most programmers, which is why most programmers don't do it, but AI is making it accessible, and with formal verification you prevent so many problems. It will be interesting to see where programming and compilers go when "formal verification" becomes normal.
smj-edison · a month ago
> Rust makes it possible to safely manage memory without using a garbage collector, probably one of the biggest pain points of using low-level languages like C and C++. It boils down to the fact that many of the common memory issues that we can experience, things like dangling pointers, double freeing memory, and data races, all stem from the same thing: uncontrolled sharing of mutable state.

Minor nit: this should be mutable state and lifetimes. I worked with Rust for two years before recently working with Zig, and I have to say opt-in explicit lifetimes without XOR mutability requirements would be a nice combo.

lifeisstillgood · a month ago
There is a trade off here (of course) as in anything.

You can write the type-heavy language with the nullable type and the carefully thought-through logic. Or you can use the dynamic language with the likelihood that it will crash. The issue is not “you are a bad coder, and should feel guilty” but that there is a cost to a crash and a cost to moving wholesale to Haskell, or perhaps more realistically to typed Python, and those costs are quantifiable; perhaps sometimes the throwaway code that has made it to production is on the right side of the cost curve.