So the thing is: the original "websocket" was prevented from being a full arbitrary connect() implementation out of security concerns. That is, the web page is running inside the user's network security boundary, and might be able to make connections which appear trusted.
If there's an API for "desktop application" web pages which can make arbitrary connections, what does the security model look like?
The only one I can imagine is still that the server can tell a client "you're allowed to open a direct socket to these ports on this DNS address specifically", with possibly some rules around DNS addresses similar to cookie sharing on domains, and with a mandatory CORS preflight request to an HTTPS server on the same target domain.
Why?
The server can't be handing out arbitrary permissions, for two major reasons. One is that you just can't have servers handing out permissions for networks they don't control, for a host of obvious reasons. The second is that while the Internet is sufficiently connected that we can often pretend it's one big happy IP namespace, that's not actually true. The addresses reserved for local networks are one obvious exception (there's a ton of 192.168.1.1s in the world), but in general, what the server thinks another resource's identifier is may not be that resource's identifier from the client's point of view. There's a lot of hacking opportunity in exploiting the gap between a server's concept of network identity and the client's.
DNS isn't enough, because I can set up a DNS subdomain to point at any IP I want. I'd need to pre-flight check the request to ensure a cert establishes at least some minimal level of ownership over the domain, and there's no protocol-generic way to check, so we have to reuse HTTPS.
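A rough sketch of what that model could look like from the page's side; the preflight path, header name, and socket call below are all invented for illustration, not a real API:

```ts
// Purely illustrative: the preflight path, header name, and openTCPSocket
// call below are all made up to show the shape of the model, not a real API.
async function connectWithPreflight(host: string, port: number) {
  // 1. A browser-mediated preflight over HTTPS to the *same* DNS name, so a
  //    valid certificate for `host` is implicitly required.
  const preflight = await fetch(`https://${host}/.well-known/direct-sockets`, {
    method: "OPTIONS",
  });

  // 2. A hypothetical header listing the ports this origin may connect to.
  const allowed = (preflight.headers.get("Allow-Direct-Socket-Ports") ?? "")
    .split(",")
    .map((p) => parseInt(p.trim(), 10));
  if (!allowed.includes(port)) {
    throw new Error(`port ${port} not granted by ${host}`);
  }

  // 3. Only now would the (equally hypothetical) raw-socket call be allowed:
  // return navigator.openTCPSocket({ remoteAddress: host, remotePort: port });
}
```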
Now, by the time this is all set up, you probably might as well just have set up a websocket proxy. You're certainly not using this to build a glorious P2P application or anything. (If you could convince your users to install a new root SSL cert, I can see making this work with some other grease, but without that I think you're stuck being the MitM router for all traffic, which is hardly P2P.)
There is also the YOLO option of just letting browsers open unrestricted sockets and letting the internet pick up the pieces. Which it eventually would. But it would probably result in an even more restrictive setup than we have now.
> Attackers may use the API to by-pass third parties' CORS policies.
> Mitigation
> We could forbid the API from being used for TCP with the well known HTTPS port, whenever the destination host supports CORS.
I'm really curious about the implicit ethics here.
There's this idea that the web should be able to do everything that native apps can, an idea that I'm inclined to agree with. One thing that native apps can of course do is bypass third parties' CORS policies. And there are many legitimate use cases for that, like feed readers for example. Right now, if you want a feed reader as a web app, you need a backend to be able to request feeds (and homepages, for autodiscovery). For a native app, you could do all of that client side.
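As a concrete (and entirely hypothetical) sketch of that feed-reader case, assuming a TCPSocket-style API along the lines of what the proposal gestures at; the names and stream shape here are illustrative, not a shipped interface:

```ts
// Sketch of the feed-reader case over a raw socket. TCPSocket here is an
// assumed/illustrative shape, declared only so the sketch type-checks.
declare class TCPSocket {
  constructor(remoteAddress: string, remotePort: number);
  opened: Promise<{
    readable: ReadableStream<Uint8Array>;
    writable: WritableStream<Uint8Array>;
  }>;
}

async function fetchFeed(host: string): Promise<string> {
  const socket = new TCPSocket(host, 80); // plain HTTP for brevity; real feeds would need TLS
  const { readable, writable } = await socket.opened;

  const writer = writable.getWriter();
  await writer.write(new TextEncoder().encode(
    `GET /feed.xml HTTP/1.1\r\nHost: ${host}\r\nConnection: close\r\n\r\n`));

  let response = "";
  const reader = readable.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    response += new TextDecoder().decode(value);
  }
  return response; // raw HTTP response containing the feed XML -- no CORS involved
}
```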
I also don't get it. The whole reason for CORS is "Resource Sharing" (e.g. indirectly using resources like cookies belonging to another domain). With direct sockets, no shared resource is being accessed, all the browser does is open a TCP connection (e.g. no cookie accessed or sent anywhere).
I can understand that TCP connections could be abused by some websites (e.g. using your browser for spamming, accessing unsecured local services, etc.) but this can be solved with a permission style popup just like with the geolocation or webcam APIs.
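For comparison, the geolocation flow being referenced, next to what a socket permission could look like (that second part is purely hypothetical):

```ts
// The geolocation pattern the comment refers to (this part is a real API):
// the browser prompts the user before the page gets anything back.
navigator.geolocation.getCurrentPosition(
  (pos) => console.log(pos.coords.latitude, pos.coords.longitude),
  (err) => console.warn("user declined or lookup failed", err),
);

// A direct-socket equivalent would presumably look similar. This part is
// purely hypothetical -- no such permission name exists today:
// const status = await navigator.permissions.query({ name: "direct-sockets" });
// if (status.state === "granted") { /* open the TCP connection */ }
```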
I'm pretty sure "Resource Sharing" is about resources on a server being loaded from other origins, not about sharing client-side information. This is a client-side restriction. By offering an alternative you ship a workaround for CORS, effectively disabling it.
You've got a point; however, almost every native app out there that constantly interacts with a web API is likely written with web technologies and shipped with Chromium / Node.js.
I wonder if it'd make more sense as a PWA-only feature. The use cases are definitely there for websites, but the security minefield makes the proposal extremely limited. If the user is already going to install the PWA, maybe that opens up room to relax things, like not requiring the user to type every accessed address manually, while also reducing the number of weird new permission prompts they see from ordinary websites.
In general though, it'd be really nifty to have this functionality. Chromium already runs most apps; it's just that right now they all ship their own Chromium.
Seeing that a big chunk of the use case is P2P: that's difficult to do well without a listening API, SO_REUSEPORT, and/or UPnP (or other port-mapping protocols). If P2P is an explicit use case, someone should prototype how that could work, and also make it better/simpler than WebRTC, before putting a new proposal forward for the entire web.
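For a sense of what such a prototype would need to expose, here's a purely hypothetical listening-side sketch; the class name and shape are invented, not something the proposal under discussion is assumed to define:

```ts
// Hypothetical only: what a listening API would have to cover for the P2P
// case (bind, accept, and learning the externally reachable port).
declare class TCPServerSocket {
  constructor(localAddress: string);
  opened: Promise<{ readable: AsyncIterable<unknown>; localPort: number }>;
}
declare function handlePeer(conn: unknown): void; // app-defined

const server = new TCPServerSocket("0.0.0.0");
const { readable: incoming, localPort } = await server.opened;
console.log("listening on", localPort);

// Without a port mapping (UPnP/NAT-PMP/PCP) or hole punching, `localPort` is
// typically only reachable from the local network -- exactly the gap noted above.
for await (const connection of incoming) {
  handlePeer(connection);
}
```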
Since this is meant for high trust anyway, perhaps just shipping the more familiar Node APIs for TCP/UDP would be better? That would let people do cool things (including P2P) that actually work, and it's pretty battle-tested. Or maybe I'm missing something? It's been a while since I used those APIs.
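For reference, the Node APIs in question (net and dgram exist today and are stable; the host and port values are just placeholders):

```ts
import * as net from "node:net";
import * as dgram from "node:dgram";

// TCP client
const tcp = net.createConnection({ host: "example.com", port: 4000 }, () => {
  tcp.write("hello\n");
});
tcp.on("data", (chunk) => console.log("tcp got", chunk.toString()));

// UDP socket
const udp = dgram.createSocket("udp4");
udp.send(Buffer.from("ping"), 4000, "example.com");
udp.on("message", (msg, rinfo) => console.log("udp got", msg.toString(), rinfo));
```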
Another thought: don't we have too many streaming protocols already? Do we really need SSE, WebSocket, WebTransport, and now raw sockets as well? It's a lot (though that's not this project's fault).
What do you find impenetrable about WebRTC? In most cases I have found the complexity necessary. When you find yourself in unique situations it’s a protocol that gives you the knobs.
To be able to accept connections, you would need a public IP, configure the router to map a port, and configure the firewall to allow connections. Too many things could go wrong.
With BitTorrent plenty of clients can't accept connections but that does not render them unusable.
These things may be a desired use case but as the explainer lays out they were already rejected for being too easy to abuse. This proposal is an attempt to get some basic portion of the functionality that is easier to limit even if there is no way for everyone to agree the rest can be secured too.
This would be so amazing. In order to access most vanilla services like redis, postgres, etc., you need to deploy a bridge like https://github.com/zquestz/ws-tcp-proxy
CloudFlare/Deno etc. all have these workarounds around tunnelling but all that would disappear with this protocol. I made a service for writing servers on the web (https://webcode.run -- somewhat abandoned at this point but it is still running), and a big problem with the approach was the web's inability to make TCP connections.
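For context, this is roughly what the bridge workaround looks like from the browser today; the endpoint URL and framing are illustrative and depend entirely on how the proxy is deployed:

```ts
// Everything goes through a WebSocket endpoint that a bridge (a ws-tcp-proxy
// style service) forwards to the real TCP backend.
const ws = new WebSocket("wss://bridge.example.com/redis");
ws.binaryType = "arraybuffer";

ws.onopen = () => {
  // RESP-encoded PING, forwarded verbatim to the redis server by the bridge.
  ws.send(new TextEncoder().encode("*1\r\n$4\r\nPING\r\n"));
};
ws.onmessage = (event) => {
  console.log(new TextDecoder().decode(event.data as ArrayBuffer));
};
```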
This would be so awful. Most vanilla services like redis, postgres, etc... would need to deal with the frontend spew directly instead of offloading that to a bridge as an intermediary.
You could still have a reverse proxy to deal with the worst of it. But if the client could already talk the right protocols it would remove a lot of unnecessary complexity.
The bridge is just a proxy; the volume of traffic is the same, but you've now added a regional hop. Exposing your redis and postgres on the public web is not a good idea, but there are tons of internal use cases, and things like postgres actually have a security mechanism, so it's not "insecure" by default.
This would also allow browsers to act as direct database clients if I am interpreting this right?
That means that your web app could act like "psql" and open a direct TCP socket connection to "postgresql://user:pass@host:5432/db", which would change everything
You'd no longer need a backend just to middleman SQL requests, assuming you have proper RLS implemented.
(Whether or not this is a good idea/applicable to all cases is debatable, but it at least becomes possible if I'm reading right)
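A sketch of what "no backend, RLS does the enforcement" would mean in practice; BrowserPgClient is an invented stand-in, since no browser-side Postgres driver exists, and the table/policy names are illustrative:

```ts
// BrowserPgClient stands in for whatever would speak the Postgres wire
// protocol over the raw socket; declared here only so the sketch type-checks.
declare class BrowserPgClient {
  constructor(opts: { host: string; port: number; user: string; password: string; database: string });
  query(sql: string): Promise<{ rows: unknown[] }>;
}
declare const sessionToken: string; // assumed to come from the app's login flow

const db = new BrowserPgClient({
  host: "db.example.com",
  port: 5432,
  user: "app_user",        // low-privilege role, never a superuser
  password: sessionToken,  // per-user credential, not a shared admin password
  database: "app",
});

// The safety argument rests entirely on policies applied in the database, e.g.:
//   ALTER TABLE notes ENABLE ROW LEVEL SECURITY;
//   CREATE POLICY notes_owner ON notes USING (owner = current_user);
// With those in place, even a fully direct connection only sees permitted rows.
const { rows } = await db.query("select id, body from notes");
```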
Very cool for projects like Hasura and Supabase which rely on middleman http servers (like PostgREST). But also great for building tools like database inspectors / SQL managers right in the browser!
For public data I am inclined to agree with the parent.
You could just pass a single auth token to the database (if it supported that), scoped to public data only, and fetch the data that way. Kinda like a bearer token.
That would give the client side direct access to the database, but only for public data.
This would be very beneficial to the web as a whole, since a lot of the data out there is public.
The separation of privilege/access then has to happen directly at the database level, which is totally possible.
It would be a nice addition to the web to treat public data differently.
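A sketch of that database-level separation for public data; the role name, tables, and connection config are illustrative:

```ts
// The role setup below happens inside Postgres (run by the DBA, not the page):
//
//   CREATE ROLE public_reader LOGIN PASSWORD 'publishable-token';
//   GRANT SELECT ON articles, comments TO public_reader;
//   -- no INSERT/UPDATE/DELETE, no access to private tables
//
// The page would then connect with that single shared, read-only credential,
// much like a publishable bearer token baked into the frontend. Whatever
// browser-side driver existed would take a config along these lines:
const publicDbConfig = {
  host: "db.example.com",
  port: 5432,
  user: "public_reader",
  password: "publishable-token", // safe to ship only because the role is read-only
  database: "app",
};
```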
This certainly looks simpler than WebRTC, which is absolutely impenetrable.
I created https://webrtcforthecurious.com to try and solve the accessibility/education issues around WebRTC
Connecting browsers directly to databases is generally a bad idea for other reasons.
Wouldn't they be accessible to anyone reading the front end source files or plugins installed in the browser context?