From what I understand, in JavaScript at least, putting `await foo()` inside an async function splits the calling function in two, with the second half being converted to a callback. (I'm pretty sure this is full of errors, so please correct me where I'm wrong.)
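To make the "split in two" idea concrete, here is a minimal sketch. `foo` is a hypothetical stand-in for any promise-returning operation, and the two versions are only roughly equivalent (error handling and exact scheduling details differ):

```javascript
// Hypothetical stand-in for an asynchronous operation.
function foo() {
  return new Promise((resolve) => setTimeout(() => resolve(21), 10));
}

// Version written with async/await.
async function withAwait() {
  const before = 2;          // "first half": runs synchronously when called
  const value = await foo(); // the function is suspended here
  return before * value;     // "second half": runs later, once foo() settles
}

// Roughly the same thing with the second half as an explicit callback.
function withCallback() {
  const before = 2;
  return foo().then((value) => before * value);
}

withAwait().then((result) => console.log("await:", result));
withCallback().then((result) => console.log("then:", result));
```

Both calls return a promise immediately and both eventually produce 42, so the mental model of "everything after the `await` becomes a callback" is a reasonable first approximation.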
Why can't non-async functions use await()?
I've also read that await() basically preempts the currently running function. How does this work?
Update: I'm re-reading http://journal.stuffwithstuff.com/2015/02/01/what-color-is-y... and I think the answer lies somewhere in this paragraph, but I can't wrap my head around it yet:
> The fundamental problem is “How do you pick up where you left off when an operation completes”? You’ve built up some big callstack and then you call some IO operation. For performance, that operation uses the operating system’s underlying asynchronous API. You cannot wait for it to complete because it won’t. You have to return all the way back to your language’s event loop and give the OS some time to spin before it will be done. Once operation completes, you need to resume what you were doing. The usual way a language “remembers where it is” is the callstack. That tracks all of the functions that are currently being invoked and where the instruction pointer is in each one. But to do async IO, you have to unwind and discard the entire C callstack. Kind of a Catch-22. You can do super fast IO, you just can’t do anything with the result! Every language that has async IO in its core—or in the case of JS, the browser’s event loop—copes with this in some way. [...]
I don't get the "You cannot wait for it to complete because it won’t." or the "But to do async IO, you have to unwind and discard the entire C callstack" parts.
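One way to see the "you cannot wait for it to complete because it won't" part in JavaScript is the sketch below. It uses a zero-delay `setTimeout` as a stand-in for an IO completion (an assumption for illustration, not what the article itself shows) and then tries to wait for it without returning to the event loop:

```javascript
let done = false;

// Ask the runtime to flip the flag as soon as it can.
setTimeout(() => { done = true; }, 0);

// Try to wait for the flag without returning to the event loop.
const deadline = Date.now() + 50;
while (!done && Date.now() < deadline) {
  // Busy-wait. The timer callback cannot run while this code holds the stack.
}

console.log("after busy-wait, done =", done); // false

// Only after this script finishes (the stack unwinds) can the callback run.
setTimeout(() => console.log("later, done =", done), 10); // true
```

The callback that would set `done` can only run once the current stack has unwound back to the event loop, which is why the loop never observes it, however long it spins.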
I'm also using these resources; they help, but I'm not there yet:
- https://stackoverflow.com/questions/47227550/using-await-ins...
I think what OP is assuming is a multi-threaded or multi-process environment, where the calling function can just block whatever execution context it's running in and wait until the async function returns.
The problem is that many environments are effectively non-multi-threaded, especially the (usually single) thread/queue/process that draws the UI and responds to user input. So if you block the UI thread, your whole app (at least from the standpoint of the user) stops responding.
Still, this should work in principle, but threads are more costly in terms of memory and context switching time than continuations, so it makes sense to allow the thread to continue to handle other tasks while waiting for e.g. I/O to complete.
This is what the author of the classic https://journal.stuffwithstuff.com/2015/02/01/what-color-is-... settles on as the best way of handling async tasks, but I recall there being some pushback on that here on HN.
> The problem is that many environments are effectively non-multi-threaded, especially the (usually single) thread/queue/process that draws the UI and responds to user input
Can you elaborate on why "especially" UI threads/queues/processes are "usually single"?
I don't think there's a particularly insightful answer to this question here. It's just that UI frameworks are almost universally written with the assumption that they are only used from one thread. UI frameworks can also use functionality spread across multiple components/libraries/systems, and making a UI kit thread-safe would likely require a lot of effort for dubious benefit (user input normally has to be processed strictly in order, because clicking on a button and then pressing "enter" is different from pressing "enter" and then clicking on a button).
There are some specific scenarios where multiple threads are safe in a UI. For example, it's relatively common to be able to pass off an OpenGL context to another thread... so you can do OpenGL rendering in a thread separate from the main UI thread, if you like. Some UI frameworks specifically support this use case, e.g., certain methods on the OpenGL widgets are described as thread-safe. Individual OpenGL contexts are also not thread-safe and must be used from a single thread at a time (and they usually involve some thread-local context).
The typical way you make a responsive UI is by doing only UI work in the UI thread, and passing off all long-running computations to background threads.
Making a function support being paused and resumed requires changes in how the function is run and the data that is maintained while it is executing. In addition to behaving differently than sync functions when they execute, the results are also handled differently. Async functions always return a promise - whether they `return` a literal value, return another promise or throw an exception.
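A quick sketch of the "always returns a promise" point. The three function names are made up for illustration:

```javascript
async function returnsLiteral() { return 1; }
async function returnsPromise() { return Promise.resolve(2); }
async function throwsError() { throw new Error("boom"); }

const a = returnsLiteral();
const b = returnsPromise();
const c = throwsError(); // does not throw here; the error becomes a rejection

console.log(a instanceof Promise, b instanceof Promise, c instanceof Promise);

a.then((v) => console.log("literal resolved with", v));
b.then((v) => console.log("promise resolved with", v));
c.catch((e) => console.log("rejected with", e.message));
```

In all three cases the caller gets a promise back synchronously, and the value or the error only becomes visible through `.then`/`.catch` (or `await`).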
Due to these differences, it makes sense that the special nature of async functions and generators must be declared up-front, with the `async` keyword or `*` for generators. It would be possible to design the language such that the function type was determined by looking at whether it contains `yield` or `await` keywords. Python does this with generator functions. However, this makes an important aspect of behavior less explicit.
1. Rewrite the async function to be a generator function with a wrapper that calls `next` on the generator whenever `await`-ed Promises resolve
2. Rewrite the generator function to be a big switch statement, together with a wrapper function that drives execution.
You can play around with generators and async functions in the TypeScript playground, with TSConfig set to target ES5, to see how this works: https://www.typescriptlang.org/play?target=1#code/GYVwdgxgLg...
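For anyone who doesn't want to open the playground, the two steps above can be sketched by hand. This is not the actual TypeScript output (which goes through helpers such as `__awaiter` and `__generator`); `runAsync`, `addGenerator` and `addStateMachine` are hypothetical names, and the example stands in for `async function add() { const a = await 1; const b = await 2; return a + b; }`:

```javascript
// Step 1 (sketch): drive a generator so that each `yield` behaves like `await`.
function runAsync(generatorFunction) {
  return new Promise((resolve, reject) => {
    const gen = generatorFunction();
    function step(method, input) {
      let result;
      try {
        result = gen[method](input); // resume the body until the next yield
      } catch (err) {
        reject(err);                 // uncaught error inside the body
        return;
      }
      if (result.done) {
        resolve(result.value);       // the body reached `return`
        return;
      }
      // Wait for the yielded value, then resume with its outcome.
      Promise.resolve(result.value).then(
        (value) => step("next", value),
        (err) => step("throw", err)
      );
    }
    step("next", undefined);
  });
}

function* addGenerator() {
  const a = yield Promise.resolve(1);
  const b = yield Promise.resolve(2);
  return a + b;
}

// Step 2 (sketch): the same generator body as a hand-written switch statement.
function addStateMachine() {
  let state = 0;
  let a, b; // locals are hoisted so they survive between calls to next()
  return {
    next(input) {
      switch (state) {
        case 0:
          state = 1;
          return { value: Promise.resolve(1), done: false };
        case 1:
          a = input;
          state = 2;
          return { value: Promise.resolve(2), done: false };
        case 2:
          b = input;
          state = 3;
          return { value: a + b, done: true };
        default:
          return { value: undefined, done: true };
      }
    },
    throw(err) {
      state = 3; // this sketch has no try/catch in the body, so just rethrow
      throw err;
    },
  };
}

runAsync(addGenerator).then((sum) => console.log("generator:", sum));
runAsync(addStateMachine).then((sum) => console.log("state machine:", sum));
```

Both versions resolve to 3. The point of the second one is that "pausing" is nothing more than returning from `next()` after recording which `case` to continue from.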
Can you expand on what the actual differences are?
That starts to make sense to me: normal functions/subroutines execute from start to finish, while these async functions (coroutines?) can be paused right in the middle of their execution. This is something fundamentally different, then, which could explain why we need the `async` keyword: those "asynchronous" functions are special.
I'd be interested to know the actual differences.
In C#, you can't use the "await" keyword, but you can use the result of a Task<T> in a non-async function with ".Result" or ".GetAwaiter().GetResult()". https://stackoverflow.com/a/47648318/5107208
I don't know if it extends to other language implementations, but C# does not seem to have a reliable sync-over-async story.
In the first case, I am telling an executor what it has to do (it has to execute the thing, then continue inside my function).
In the second case, I am myself blocked in a state that will be unblocked when the thing returns.
That is, in the first case my function must be of a type that can be stopped and continued, and I need to save its state somewhere.
In the second case, I don't need the ability to stop and continue my function.
That makes for two kinds of functions at some level. Low level languages will expose that. High level languages will hide it.
But there is growing demand for functions that can be both. I think Zig has them.
And that state is saved in the heap, I presume? Can you provide a concrete example perhaps?
I get that the state in the case of green threads (goroutines, for example) is saved in the call stack itself. Where is all this saved in async/await?
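A small sketch that may help with the "where is it saved" question, using a generator since it exposes the suspended state as an ordinary object. Where exactly an engine stores this is an implementation detail; the assumption here is only that, from the language's point of view, the suspended function's locals live in an object that outlives the call stack. `counter` and `unrelatedDeepCall` are made-up names:

```javascript
function* counter() {
  let total = 0;                // local variable of the suspended function
  while (true) {
    const amount = yield total; // suspension point: control returns to the caller
    total += amount;
  }
}

function unrelatedDeepCall(depth) {
  // Uses the call stack heavily while the generator is suspended.
  return depth === 0 ? 0 : 1 + unrelatedDeepCall(depth - 1);
}

const paused = counter(); // an object that holds the generator's saved state
paused.next();            // run to the first `yield`
paused.next(5);           // total is now 5; the function is suspended again

unrelatedDeepCall(1000);  // the stack is reused by other work and unwinds

const result = paused.next(10);
console.log(result.value); // 15: `total` survived outside the call stack
```

An async function works along the same lines, except that the saved state is reachable from the promise callbacks registered at each `await` rather than from a generator object you hold yourself.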
Well, I would say that it's the opposite of preemption. It's cooperative multitasking. Your async task is split into two tasks, with the await in the middle. When you call await, the first task finishes. The second task starts running once the await is done.
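Here is a minimal sketch of that cooperative behaviour in JavaScript. `task` is a hypothetical function, and `await null` is used only as the simplest possible suspension point:

```javascript
const log = [];

async function task(name) {
  log.push(`${name}: first half`);
  await null;                       // voluntarily give up control here
  log.push(`${name}: second half`);
}

const all = Promise.all([task("A"), task("B")]);
log.push("main: both tasks started");

all.then(() => console.log(log.join("\n")));
```

The log comes out as A's first half, B's first half, the line from main, then A's second half and B's second half. Nothing interrupts a task between suspension points; each one runs until it reaches an `await` and hands control back on its own.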
> Why can't non-async functions use await()?
In C#, they can, in a sense. You can block on a `Task` with `.Wait()` or `.Result`, or call `.RunSynchronously()` on a `Task` that has not been started yet. C# supports lots of different TaskSchedulers that handle this differently.
C# is almost pathologically flexible here. Other languages typically assume that there is only one async task scheduler, and it's handled by the runtime. This task scheduler may be less flexible, and there are various design tradeoffs.
When you call await, the function doesn't necessarily have to be preempted (maybe the async function you called has already returned), but if the async function hasn't returned, then it has to be.
In most systems like that there is a scheduler that keeps a list of things that are being awaited on and keeps track of which ones are ready to return. When you call await on one function the scheduler looks at that list and chooses an await that is ready to continue and executes it.
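A toy version of such a scheduler, as a sketch only. Real runtimes are driven by IO readiness and timers rather than a tick counter; here a task is a generator, and `yield n` is a made-up convention meaning "I'm awaiting something that takes n ticks":

```javascript
function* worker(name, delay, trace) {
  trace.push(`${name} start`);
  yield delay;                  // "await" something that takes `delay` ticks
  trace.push(`${name} resumed`);
}

function runScheduler(tasks) {
  // The list of things being awaited, each with the tick at which it is ready.
  const waiting = tasks.map((gen) => ({ gen, wakeAt: 0 }));
  let now = 0;
  while (waiting.length > 0) {
    // Choose an awaited task that is ready to continue.
    const index = waiting.findIndex((entry) => entry.wakeAt <= now);
    if (index === -1) {
      now += 1;                 // nothing is ready: let "time" pass
      continue;
    }
    const [entry] = waiting.splice(index, 1);
    const step = entry.gen.next(); // run the task until its next suspension
    if (!step.done) {
      waiting.push({ gen: entry.gen, wakeAt: now + step.value });
    }
  }
}

const trace = [];
runScheduler([worker("slow", 3, trace), worker("fast", 1, trace)]);
console.log(trace);
// [ 'slow start', 'fast start', 'fast resumed', 'slow resumed' ]
```

The "fast" worker is resumed before the "slow" one even though it was started later, because the scheduler only continues tasks whose awaited thing is ready.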
Why would it have to return a promise?
Normal functions can't do that. If they wait for something, they block; they can't give control back to the run loop.
And in a single-threaded environment, we _want_ this capability so that the loop is able to continue working even though the thing we're `await`ing hasn't completed yet.