So the thing is: the original "websocket" was prevented from being a full arbitrary connect() implementation out of security concerns. That is, the web page is running inside the user's network security boundary, and might be able to make connections which appear trusted.
If there's an API for "desktop application" web pages which can make arbitrary connections, what does the security model look like?
The only one I can imagine is still that the server can tell a client "you're allowed to open a direct socket to these ports on this DNS address specifically", with possibly some rules around DNS addresses similar to cookie sharing on domains, and with a mandatory CORS preflight request to an HTTPS server on the same target domain.
Why?
The server can't be handing out arbitrary permissions, for two major reasons. One is that you just can't have servers handing out permissions for networks they don't control, for a host of obvious reasons. The second is that while the Internet is sufficiently connected that we can often pretend it's one big happy IP namespace, that's not actually true. The addresses reserved for local networks are one obvious exception (there's a ton of 192.168.1.1s in the world), but in general, what the server thinks another resource's identifier is may not be that resource's identifier from the client's point of view. There's a lot of hacking opportunity in exploiting the gap between a server's concept of network identity and the client's.
DNS isn't enough, because I can set up a DNS subdomain to point at any IP I want. I'd need to pre-flight check the request to ensure a cert establishes at least some minimal level of ownership over the domain, and there's no protocol-generic way to check, so we have to reuse HTTPS.
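A rough sketch of what that model could look like from the page's side; the preflight path, header name, and socket call below are all invented for illustration, not a real API:

```ts
// Purely illustrative: the preflight path, header name, and openTCPSocket
// call below are all made up to show the shape of the model, not a real API.
async function connectWithPreflight(host: string, port: number) {
  // 1. A browser-mediated preflight over HTTPS to the *same* DNS name, so a
  //    valid certificate for `host` is implicitly required.
  const preflight = await fetch(`https://${host}/.well-known/direct-sockets`, {
    method: "OPTIONS",
  });

  // 2. A hypothetical header listing the ports this origin may connect to.
  const allowed = (preflight.headers.get("Allow-Direct-Socket-Ports") ?? "")
    .split(",")
    .map((p) => parseInt(p.trim(), 10));
  if (!allowed.includes(port)) {
    throw new Error(`port ${port} not granted by ${host}`);
  }

  // 3. Only now would the (equally hypothetical) raw-socket call be allowed:
  // return navigator.openTCPSocket({ remoteAddress: host, remotePort: port });
}
```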
Now, by the time this is all set up, you probably might as well just have set up a websocket proxy. You're certainly not using this to build a glorious P2P application or anything. (If you could convince your users to install a new root SSL cert, I can see making this work with some other grease, but without that I think you're stuck being the MitM router for all traffic, which is hardly P2P.)
There is also the YOLO option of just letting browsers open unrestricted sockets and letting the internet pick up the pieces. Which it eventually would. But it would probably result in an even more restrictive setup than we have now.
> Attackers may use the API to by-pass third parties' CORS policies.
> Mitigation
> We could forbid the API from being used for TCP with the well known HTTPS port, whenever the destination host supports CORS.
I'm really curious about the implicit ethics here.
There's this idea that the web should be able to do everything that native apps can, an idea that I'm inclined to agree with. One thing that native apps can of course do is bypass third parties' CORS policies. And there are many legitimate use cases for that, like feed readers for example. Right now, if you want a feed reader as a web app, you need a backend to be able to request feeds (and homepages, for autodiscovery). For a native app, you could do all of that client side.
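As a concrete (and entirely hypothetical) sketch of that feed-reader case, assuming a TCPSocket-style API along the lines of what the proposal gestures at; the names and stream shape here are illustrative, not a shipped interface:

```ts
// Sketch of the feed-reader case over a raw socket. TCPSocket here is an
// assumed/illustrative shape, declared only so the sketch type-checks.
declare class TCPSocket {
  constructor(remoteAddress: string, remotePort: number);
  opened: Promise<{
    readable: ReadableStream<Uint8Array>;
    writable: WritableStream<Uint8Array>;
  }>;
}

async function fetchFeed(host: string): Promise<string> {
  const socket = new TCPSocket(host, 80); // plain HTTP for brevity; real feeds would need TLS
  const { readable, writable } = await socket.opened;

  const writer = writable.getWriter();
  await writer.write(new TextEncoder().encode(
    `GET /feed.xml HTTP/1.1\r\nHost: ${host}\r\nConnection: close\r\n\r\n`));

  let response = "";
  const reader = readable.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    response += new TextDecoder().decode(value);
  }
  return response; // raw HTTP response containing the feed XML -- no CORS involved
}
```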
I also don't get it. The whole reason for CORS is "Resource Sharing" (e.g. indirectly using resources like cookies belonging to another domain). With direct sockets, no shared resource is being accessed, all the browser does is open a TCP connection (e.g. no cookie accessed or sent anywhere).
I can understand that TCP connections could be abused by some websites (e.g. using your browser for spamming, accessing unsecured local services, etc.) but this can be solved with a permission style popup just like with the geolocation or webcam APIs.
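For comparison, the geolocation flow being referenced, next to what a socket permission could look like (that second part is purely hypothetical):

```ts
// The geolocation pattern the comment refers to (this part is a real API):
// the browser prompts the user before the page gets anything back.
navigator.geolocation.getCurrentPosition(
  (pos) => console.log(pos.coords.latitude, pos.coords.longitude),
  (err) => console.warn("user declined or lookup failed", err),
);

// A direct-socket equivalent would presumably look similar. This part is
// purely hypothetical -- no such permission name exists today:
// const status = await navigator.permissions.query({ name: "direct-sockets" });
// if (status.state === "granted") { /* open the TCP connection */ }
```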
I'm pretty sure "Resource Sharing" is about resources on a server being loaded from other origins, not about sharing client-side information. This is a client-side restriction. By offering an alternative you ship a workaround for CORS, effectively disabling it.
You've got a point; however, almost every native app out there that constantly interacts with a web API is likely written with web technologies and shipped with Chromium / Node.js.
I wonder if it'd make more sense as a PWA-only feature. The use cases are definitely there for websites, but the security minefield makes the proposal extremely limited. If the user is already going to install the PWA, maybe that opens up room to relax things, like not requiring the user to type every accessed address manually, while also reducing the number of weird new permission prompts they see from ordinary websites.
In general though, it'd be really nifty to have this functionality. Chromium already runs most apps; it's just that right now they all ship their own Chromium.
Seeing that a big chunk of the use case is P2P: that's difficult to do well without a listening API, SO_REUSEPORT, and/or UPnP (or other port-mapping protocols). If P2P is an explicit use case, someone should prototype how that could work, and also make it better/simpler than WebRTC, before putting a new proposal forward for the entire web.
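For a sense of what such a prototype would need to expose, here's a purely hypothetical listening-side sketch; the class name and shape are invented, not something the proposal under discussion is assumed to define:

```ts
// Hypothetical only: what a listening API would have to cover for the P2P
// case (bind, accept, and learning the externally reachable port).
declare class TCPServerSocket {
  constructor(localAddress: string);
  opened: Promise<{ readable: AsyncIterable<unknown>; localPort: number }>;
}
declare function handlePeer(conn: unknown): void; // app-defined

const server = new TCPServerSocket("0.0.0.0");
const { readable: incoming, localPort } = await server.opened;
console.log("listening on", localPort);

// Without a port mapping (UPnP/NAT-PMP/PCP) or hole punching, `localPort` is
// typically only reachable from the local network -- exactly the gap noted above.
for await (const connection of incoming) {
  handlePeer(connection);
}
```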
Since this is meant for high trust anyway, perhaps just shipping the more familiar Node APIs for TCP/UDP would be better? That would let people do cool things (including P2P) that actually work, and it's pretty battle-tested. Or maybe I'm missing something? It's been a while since I used those APIs.
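For reference, the Node APIs in question (net and dgram exist today and are stable; the host and port values are just placeholders):

```ts
import * as net from "node:net";
import * as dgram from "node:dgram";

// TCP client
const tcp = net.createConnection({ host: "example.com", port: 4000 }, () => {
  tcp.write("hello\n");
});
tcp.on("data", (chunk) => console.log("tcp got", chunk.toString()));

// UDP socket
const udp = dgram.createSocket("udp4");
udp.send(Buffer.from("ping"), 4000, "example.com");
udp.on("message", (msg, rinfo) => console.log("udp got", msg.toString(), rinfo));
```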
Another thought: don't we have too many streaming protocols already? Do we really need SSE, WebSocket, WebTransport, and now raw sockets as well? It's a lot (though that's not this project's fault).
What do you find impenetrable about WebRTC? In most cases I have found the complexity necessary. When you find yourself in unique situations it’s a protocol that gives you the knobs.
To be able to accept connections, you would need a public IP, configure the router to map a port, and configure the firewall to allow connections. Too many things could go wrong.
With BitTorrent plenty of clients can't accept connections but that does not render them unusable.
These things may be a desired use case but as the explainer lays out they were already rejected for being too easy to abuse. This proposal is an attempt to get some basic portion of the functionality that is easier to limit even if there is no way for everyone to agree the rest can be secured too.
This would be so amazing. In order to access most vanilla services like redis, postgres, etc., you need to deploy a bridge like https://github.com/zquestz/ws-tcp-proxy
CloudFlare/Deno etc. all have these workarounds around tunnelling but all that would disappear with this protocol. I made a service for writing servers on the web (https://webcode.run -- somewhat abandoned at this point but it is still running), and a big problem with the approach was the web's inability to make TCP connections.
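For context, this is roughly what the bridge workaround looks like from the browser today; the endpoint URL and framing are illustrative and depend entirely on how the proxy is deployed:

```ts
// Everything goes through a WebSocket endpoint that a bridge (a ws-tcp-proxy
// style service) forwards to the real TCP backend.
const ws = new WebSocket("wss://bridge.example.com/redis");
ws.binaryType = "arraybuffer";

ws.onopen = () => {
  // RESP-encoded PING, forwarded verbatim to the redis server by the bridge.
  ws.send(new TextEncoder().encode("*1\r\n$4\r\nPING\r\n"));
};
ws.onmessage = (event) => {
  console.log(new TextDecoder().decode(event.data as ArrayBuffer));
};
```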
This would be so awful. Most vanilla services like redis, postgres, etc... would need to deal with the frontend spew directly instead of offloading that to a bridge as an intermediary.
You could still have a reverse proxy to deal with the worst of it. But if the client could already talk the right protocols it would remove a lot of unnecessary complexity.
The bridge is just a proxy; the volume of traffic is the same, but you've now added a regional hop. Exposing your redis and postgres on the public web is not a good idea, but there are tons of internal use cases, and things like postgres actually have a security mechanism, so it's not "insecure" by default.
This would also allow browsers to act as direct database clients if I am interpreting this right?
That means that your web app could act like "psql" and open a direct TCP socket connection to "postgresql://user:pass@host:5432/db", which would change everything
You'd no longer need a backend just to middleman SQL requests, assuming you have proper RLS implemented.
(Whether or not this is a good idea/applicable to all cases is debatable, but it at least becomes possible if I'm reading right)
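A sketch of what "no backend, RLS does the enforcement" would mean in practice; BrowserPgClient is an invented stand-in, since no browser-side Postgres driver exists, and the table/policy names are illustrative:

```ts
// BrowserPgClient stands in for whatever would speak the Postgres wire
// protocol over the raw socket; declared here only so the sketch type-checks.
declare class BrowserPgClient {
  constructor(opts: { host: string; port: number; user: string; password: string; database: string });
  query(sql: string): Promise<{ rows: unknown[] }>;
}
declare const sessionToken: string; // assumed to come from the app's login flow

const db = new BrowserPgClient({
  host: "db.example.com",
  port: 5432,
  user: "app_user",        // low-privilege role, never a superuser
  password: sessionToken,  // per-user credential, not a shared admin password
  database: "app",
});

// The safety argument rests entirely on policies applied in the database, e.g.:
//   ALTER TABLE notes ENABLE ROW LEVEL SECURITY;
//   CREATE POLICY notes_owner ON notes USING (owner = current_user);
// With those in place, even a fully direct connection only sees permitted rows.
const { rows } = await db.query("select id, body from notes");
```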
Very cool for projects like Hasura and Supabase which rely on middleman http servers (like PostgREST). But also great for building tools like database inspectors / SQL managers right in the browser!
For public data I am inclined to agree with the parent.
You could just pass a single auth token to the database (if it supported that), scoped to public data only, and fetch the data that way. Kinda like a bearer token.
That would give the client side direct access to the database, but only for public data.
This would be very beneficial to the web as a whole, since a lot of the data out there is public.
The separation of privilege/access then has to happen directly at the database level, which is totally possible.
It would be a nice addition to the web to treat public data differently.
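A sketch of that database-level separation for public data; the role name, tables, and connection config are illustrative:

```ts
// The role setup below happens inside Postgres (run by the DBA, not the page):
//
//   CREATE ROLE public_reader LOGIN PASSWORD 'publishable-token';
//   GRANT SELECT ON articles, comments TO public_reader;
//   -- no INSERT/UPDATE/DELETE, no access to private tables
//
// The page would then connect with that single shared, read-only credential,
// much like a publishable bearer token baked into the frontend. Whatever
// browser-side driver existed would take a config along these lines:
const publicDbConfig = {
  host: "db.example.com",
  port: 5432,
  user: "public_reader",
  password: "publishable-token", // safe to ship only because the role is read-only
  database: "app",
};
```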
This certainly looks simpler than WebRTC, which is absolutely impenetrable.
I created https://webrtcforthecurious.com to try and solve the accessibility/education issues around WebRTC
Connecting browsers directly to databases is generally a bad idea for other reasons.
Wouldn't they be accessible to anyone reading the front end source files or plugins installed in the browser context?