This is awesome. I'm happy that Cloudflare is paying more attention to running Python via WebAssembly at the Edge.
I'll try to summarize how they got it running and the drawbacks of their current approach (note: I have deep context on running Python with WebAssembly at the Edge as part of my work at Wasmer).
Cloudflare Workers are enabling Python at the Edge by using Pyodide [1] (Python compiled to WebAssembly via Emscripten).
They bundled Pyodide into Workerd [2], and then use V8 snapshots [3] to try to accelerate startup times.
In the best case, cold starts of Python in Cloudflare Workers are about 1 second.
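For readers who haven't seen it, a Python Worker in this model is roughly the following shape (a minimal sketch based on the announcement's examples; the `js` module is how Pyodide exposes JavaScript APIs to Python):

```python
# Minimal Python Worker: the runtime calls on_fetch for each request.
# Response here is the JavaScript Response class, reached through
# Pyodide's foreign function interface via the js proxy module.
from js import Response

async def on_fetch(request, env):
    return Response.new("Hello from Python Workers!")
```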
While this release is great, as it allows them to gauge the interest in running Python at the Edge, it has some drawbacks. So, what are those?
* Being tied to a single version of Python/Pyodide (the one that Workerd embeds)
* Package resolution is quite hacky and tied to Workerd. Only precompiled "native packages" will be usable at runtime (e.g. using a specific version of numpy will prove challenging; see the micropip sketch after this list)
* Architecturally tied to the JS/V8 world, which may pose challenges as they aim to reduce cold start times (in my opinion, it will be quite hard for them to achieve <100ms startup times with their current architecture)
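To make the package point concrete, here is how installation normally looks in plain Pyodide (a sketch of stock Pyodide usage via micropip, not of the Workers packaging mechanism): pure-Python wheels generally install fine, but native packages such as numpy must come from builds compiled against that exact Pyodide/Emscripten release, which is why pinning arbitrary versions is hard.

```python
# Stock Pyodide package installation (top-level await works in Pyodide).
import micropip

# Pure-Python wheels from PyPI usually work as-is.
await micropip.install("httpx")

# Native packages resolve to the prebuilt wheel bundled with this
# Pyodide release; an arbitrary pinned version may simply not exist.
await micropip.install("numpy")
```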
In any case, I welcome this initiative with open arms and look forward to all the cool apps that people will now build with this!
[1] https://pyodide.org/
[2] https://github.com/cloudflare/workerd/blob/main/docs/pyodide...
[3] https://github.com/cloudflare/workerd/pull/1875
Edit: updated wording from "proof of concept" to "release" to reflect the clarification from the Cloudflare team
I believe that your summary misunderstands how we will handle versioning. The Pyodide/package versions will be controlled by the compatibility date, and we will be able to support multiple in production at once. For packages like langchain (or numpy, as you mentioned) the plan is to update quite frequently.
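For readers who haven't used Workers, the compatibility date lives in the project's wrangler.toml; a minimal sketch (field names per standard Wrangler configuration, with the `python_workers` flag from the announcement):

```toml
# wrangler.toml: the compatibility date selects runtime behavior and,
# per the comment above, the bundled Pyodide/package versions.
name = "my-python-worker"
main = "src/entry.py"
compatibility_date = "2024-04-02"
compatibility_flags = ["python_workers"]
```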
Could you expand on why you believe V8 will be a limiting factor? It is quite a powerful Wasm runtime, and most of the optimizations we have planned don’t really depend on the underlying engine.
Edit: Also just want to clarify that this is not a POC, it is a Beta that we will continue improving on and eventually GA.
> Pyodide/package versions will be controlled by the compatibility date
That's exactly the issue I'm pointing at. Ideally you should be able to pin any Python version you want in your app: 2.7, 3.8 or 3.9, regardless of a Workerd compatibility date. Some packages might work in Python 3.11 but not in 3.12, for example.
Unfortunately, Python doesn't have the full transpiler architecture that the JS ecosystem has, so "packaging" Python applications into different "compatibility" bundles will prove much more challenging (the webpack factor).
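A concrete example of that version sensitivity, since stdlib modules come and go between minor releases:

```python
# Works on Python 3.11; raises ModuleNotFoundError on Python 3.12,
# where distutils was removed from the standard library (PEP 632).
import distutils
print(distutils.__name__)
```

If the interpreter version is chosen by a platform-side compatibility date rather than pinned by the app, a change like this can break deployed code.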
> Could you expand on why you believe V8 will be a limiting factor?
Sure thing! I think we can all agree that V8 is a fantastic runtime. However, the tradeoffs that make V8 great for the browser use case make the runtime more challenging for Edge environments (where servers can run more specialized workloads in trusted environments).
Namely, those are:
* Cold starts: V8 Isolates are a bit heavy to initialize. In its current form, just initializing an Isolate can add ~2-5ms of startup time
* Snapshots can be quite heavy to save and restore
* Not architected with the Edge use case in mind: there are many tricks you can use if you skip the JS middleware and go all-in on a Wasm runtime that are hard to pull off with the current V8/Workerd architecture
In any case, I would love to be proven wrong in the long term, and I cheer for <100ms cold starts when running Python in Cloudflare Workers. Keep up the good work!
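For anyone who wants to see the cold/warm difference themselves, a rough client-side sketch (the URL is a hypothetical placeholder for your own deployed Worker, and the first request only approximates a cold start):

```python
# Rough cold-vs-warm latency probe; network time is included, so treat
# the numbers as indicative only.
import time
import urllib.request

URL = "https://my-python-worker.example.workers.dev"  # hypothetical

def timed_get(url: str) -> float:
    start = time.perf_counter()
    urllib.request.urlopen(url).read()
    return (time.perf_counter() - start) * 1000  # milliseconds

print(f"first request (possibly cold): {timed_get(URL):.1f} ms")
print(f"second request (warm):         {timed_get(URL):.1f} ms")
```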
As a side note, Wasmer offers an Edge product that has none of the drawbacks mentioned above when running Python, providing incredibly fast cold-start times: https://wasmer.io/templates?language=python
Wasmer claims to cold start in 50ns. This is obviously impossible: That's 1000x faster than an NVMe read, which is about the fastest cold storage you can get.
At least make your claims credible before posting them in competitors' HN threads.
Pyodide uses its own event loop which just subscribes to the JavaScript event loop. My suspicion is that this will be more efficient than using uvloop, since V8's event loop is quite well optimized. It also allows us to await JavaScript thenables from Python and Python awaitables from JavaScript, whereas I would be worried about how this behaves with separate event loops. Also, porting uvloop would probably be hard.
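To illustrate that interop, a minimal sketch of what awaiting a JavaScript thenable from Python looks like inside Pyodide (using Pyodide's `js` proxy module):

```python
# Inside Pyodide, JS globals are importable through the js proxy module,
# and JS Promises can be awaited directly because Pyodide's event loop
# schedules its callbacks onto the JavaScript event loop.
from js import fetch

async def get_status(url):
    response = await fetch(url)  # awaiting a JS thenable from Python
    return response.status
```

The reverse direction works too: JavaScript can await a Python coroutine, because Pyodide wraps Python awaitables in Promises.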
Cloudflare has a lot of great stuff for hosting and databases, but I think they haven't done a great job marketing themselves as a developer platform, which has led to platforms like Vercel and Netlify taking significant mindshare.
Tangential: does Cloudflare provide a container hosting service agnostic of language -- something like Google Cloud Run?
I agree something is wrong with their marketing. I was also initially drawn to Vercel and Netlify, but after extended use and not being happy with either, I eventually tried Cloudflare and discovered I love it. The pricing and the product are fantastic.
I think it's because the experience of familiarizing oneself with the platform and getting to a hello-world-level CRUD app or basic static site is done a lot better by Vercel and Netlify than by Cloudflare. Cloudflare's site and docs are not built with the approach of getting an app from 0 to 1 ASAP.
It is not only about marketing. Initially, I was optimistic about Cloudflare's offerings. However, I encountered significant issues with compatibility, especially with website generators such as Next.js and Astro. Some features didn't work at all, while others were only partially supported. Faced with the prospect of dedicating valuable development time to troubleshooting these issues, I found it more efficient to use alternative platforms. Services like Vercel, Netlify, and Deno Deploy offer a smoother experience for our team's needs, minimizing the overhead and enabling us to focus on development rather than infrastructure challenges.
I think Vercel and Netlify aren't aimed at developers, because if you are a developer and are using Vercel and Netlify you are literally getting robbed.
Bandwidth costs are 40x-50x higher on Vercel and Netlify than on the vast majority of cloud providers. Cloudflare bandwidth is barely a cost.
Edge function calls are 6x more expensive on Vercel and Netlify than on Cloudflare, and that's not counting compute time, which is free on Cloudflare.
I think the only reason Vercel is even popular is that it's by far the best place to host Next.js, and that might be why they make it hard to deploy Next.js elsewhere.
I believe the Cloudflare free tier was pretty limited until recently. D1 (their SQLite implementation) became generally available yesterday, and read replicas have been announced.
I've played with JS workers on a Cloudflare-fronted site and found them to be easy to use and very quick. Would love to port the whole Django app behind the site over, using their D1 database too.
Same here, I have a couple of mobile apps that use Cloudflare workers + KV/D1, and it’s been great. I’m low traffic enough to be on the free tier, but would happily pay given how easy it’s been to build on.
Agreed, this looks really cool. While there is no Django/DRF support at the moment, it does say that they'll be increasing the number of packages in the future.
There are two things to consider here:
1. Cold start perf
2. Post-cold start perf, which comes down to:
- The cost of bridging between JS and WebAssembly
- The speed of the Python interpreter running in WebAssembly
Today, Python cold starts are slower than cold starts for a JavaScript Worker of equivalent size. A basic "Hello World" Worker written in JavaScript has a near zero cold start time, while a Python Worker has a cold start under 1 second.
That's because we still need to load Pyodide into your Worker on-demand when a request comes in. The blog post describes what we're working on to reduce this — making Pyodide already available upfront.
Once a Python Worker has gone through a cold start, though, the differences are more on the margins: maybe a handful of milliseconds, depending on what happens during the request.
- There is a slight cost to crossing the "bridge" between JavaScript and WebAssembly, for example when performing I/O or async operations. This cost tends to be minimal: generally something measured in microseconds, not milliseconds. People with performance-sensitive Workers already write them in Rust (https://github.com/cloudflare/workers-rs), which also relies on bridging between JavaScript and WebAssembly.
- The Python interpreter that Pyodide provides, running in WebAssembly, hasn't benefited from the years and years of optimization that have gone into making JavaScript fast in V8. But it's still relatively early days for Pyodide compared to the JS engine in V8: there are parts of its code where we think there are big perf gains to be had. We're looking forward to upstreaming performance improvements, and there are WebAssembly proposals that help here too.
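For a rough sense of that bridging cost, here is a micro-benchmark sketch that assumes it runs inside Pyodide (where JavaScript globals are reachable through the `js` module); treat the numbers as order-of-magnitude only:

```python
# Each Date.now() call crosses from Python (Wasm) into JavaScript and
# back, so the average below approximates the per-crossing overhead.
import time
from js import Date

def bridge_cost_us(iterations: int = 100_000) -> float:
    start = time.perf_counter()
    for _ in range(iterations):
        Date.now()
    return (time.perf_counter() - start) / iterations * 1e6  # us/call
```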
Haha, we included it just because it's part of the standard library. Total coincidence in terms of timing but it's nice that using Wasm gives us isolation guarantees :-)
Tried this out today and it was great, was very quick to get up and running!
One question though: does anyone know how I can get my local dev environment to understand the libraries that are built into CFW's Python implementation? E.g. there is an `asgi` library that I do not want my linter to flag as unknown, but as it only exists at runtime in the `on_fetch` handler (and isn't actually present on my local dev machine), I couldn't figure this out.
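One workaround sketch (not an official solution): give your linter a local type stub so the runtime-only module resolves. The `fetch` signature below is an assumption based on the announcement's examples, not a documented API:

```python
# stubs/asgi.pyi: hypothetical stub for the runtime-only asgi module.
# The signature is a guess; adjust it to match what you actually call.
from typing import Any

async def fetch(app: Any, request: Any, env: Any) -> Any: ...
```

Then point the type checker at it, e.g. `mypy_path = stubs` in mypy's config or `stubPath` in pyright's.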
I’ve used CF Pages for static sites with great results and am intrigued by all their open-source-LLM-as-a-service offerings. Main issue preventing me from building more on CF is lack of Python support. Excited to try this out.
Yes! I'm also using CF Pages, and a couple Worker functions, and really love the CF ecosystem. Very easy to get something running quickly, and not have to worry much about infrastructure.
Very happy to see the Python addition. I'd like to see first-class Go support as well.
> I would love to be proven wrong in the long term, and I cheer for <100ms cold starts when running Python in Cloudflare Workers
Who's running Python workloads with sub-100ms latency requirements?
In any case, it should be possible to run uvloop fully inside of WebAssembly. However, doing so will prove challenging with their current architecture.
> Tangential: does Cloudflare provide a container hosting service agnostic of language -- something like Google Cloud Run?
No, but it would be awesome.
I've been using Workers for about 4 years in production and love them but containers are still where I run most of my apps.
Nope. Their Workers are V8-based, so JS or Wasm only.
https://github.com/cloudflare/workerd/discussions/categories...
Is that wise? One DDoS attack could break your budget.
Not that I'm expecting parity, but knowing the rough tradeoff would be helpful.
That certainly appears to be the intention.
> Been hoping for this for a while now.
You should check out the other two announcements from today as well if you haven't yet:
"Leveling up Workers AI: General Availability and more new capabilities"
https://blog.cloudflare.com/workers-ai-ga-huggingface-loras-...
"Running fine-tuned models on Workers AI with LoRAs"
https://blog.cloudflare.com/fine-tuned-inference-with-loras
https://blog.cloudflare.com/making-full-stack-easier-d1-ga-h...
https://developers.cloudflare.com/workers/platform/limits/#d...