This patch makes curl utilize a gateway for ipfs:// addresses. It prefers a local one but can also use public gateways. It makes no effort to verify that the final product's cryptographic hash corresponds to the one in the address.
This lack of verification is expected with HTTP, but not with IPFS. curl should verify that the resulting output matches the IPFS address, or else just have users type the gateway/ipfs HTTP address themselves, as they always could.
curl can operate in a pipe mode, and that adds additional complexity with respect to verification.
IPFS gateways can serve you in a manner that allows continuous(?) verification: <https://docs.ipfs.tech/reference/http/gateway/#trusted-vs-tr...>
This would in theory allow curl to block the pipe until it can confirm that an arriving piece is verified, or to abort a tampered-with file early. That would take quite a bit of work to implement, however, since there seems to be no maintained IPFS implementation in C: <https://docs.ipfs.tech/concepts/ipfs-implementations>
Question: is the in-URL hash some form of Merkle tree root hash? Or is another method used to avoid having to download all the data before the hash can be verified?
Searching for “Merkle” on https://ipld.io/specs/transport/car/carv2/ gives no results.
There’s an intro to IPFS content identifiers here: https://docs.ipfs.tech/concepts/content-addressing/.
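For reference: the in-URL hash is indeed a Merkle root. The CID wraps the sha2-256 digest of the root block of a Merkle DAG, so in principle blocks can be verified as they arrive rather than only after the full download. A minimal sketch of what sits inside a CIDv1 (assuming the base32 multibase, sha2-256, and single-byte varints, which holds for the common "bafy..." CIDs; the helper name is mine, not from any IPFS library):

```python
import base64

# Common multicodec values (these fit in single-byte varints):
RAW, DAG_PB, SHA2_256 = 0x55, 0x70, 0x12

def decode_cidv1(cid: str):
    """Return (codec, sha2-256 digest) for a base32 CIDv1 string."""
    assert cid[0] == "b", "only the base32 multibase prefix is handled here"
    b32 = cid[1:].upper()
    raw = base64.b32decode(b32 + "=" * (-len(b32) % 8))  # restore padding
    version, codec, hash_code, digest_len = raw[0], raw[1], raw[2], raw[3]
    assert version == 1 and hash_code == SHA2_256 and digest_len == 32
    return codec, raw[4:4 + digest_len]

codec, digest = decode_cidv1(
    "bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi")
# codec == DAG_PB here: the digest is the hash of the DAG *root block*,
# not of the whole file's bytes; for single-block RAW CIDs it is the
# hash of the content bytes themselves.
print(hex(codec), digest.hex())
```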
I continue to regret that the window for the major browsers to incubate and support a content-addressable URL scheme based on then-current distributed hash table algorithms closed ~20 years ago and has shown no signs of being reopened.
Because 99% of the time people don't want immutable addresses. They want something that points to the latest, most up-to-date version. And when you really do want to freeze something in time, you just use one of the page archive services.
P2P also just flat out doesn't work for mobile devices, which are pretty much the entire internet userbase now.
Had content-based URLs been a thing 20 years ago, presumably technology would have evolved for mobile ISPs to host border caches that end users could trust 100%, without worry of nefarious manipulation.
Perhaps in 99% of pages you want the latest version, but I think in 99% of requests you want an immutable thing. Generally speaking, web pages are made up of vast numbers of immutable resources.
http://bittorrent.org/beps/bep_0044.html
Tangential, but does anyone know why discovery seems to barely work in IPFS recently? A few years ago I could start two nodes in random places and copy files between them with just a little delay. These days I'm rarely able to start the transfer at all, even though both sides are connected to a few peers. Has something changed? Is it due to more/fewer people in the network?
How does that work? I would expect more nodes = more capacity for answering lookups / spreading the DHT. Why is the result the opposite? If I start a full node it's definitely not overwhelmed with traffic.
Do you have any sources where I can read up on this? I've also heard of this complaint and I'm interested in diving into it for my thesis. Do you know what algorithms are suitable to make IPFS scalable?
This does not appear to be IPFS; it's just a crude method of rewriting ipfs:// URLs into HTTP requests to gateways that then do the actual IPFS work. Which, much like Bitcoin, nobody seems to want to run locally, despite that mostly defeating the purpose. On the bright side, none of this ended up polluting libcurl.
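The rewrite rule itself is indeed tiny. A sketch under the standard gateway path layout (/ipfs/<cid>); the default gateway address and the function name are illustrative, not curl's actual code:

```python
from urllib.parse import urlsplit

def rewrite_ipfs_url(url: str, gateway: str = "http://127.0.0.1:8080") -> str:
    """Turn ipfs://<cid>/<path> into <gateway>/ipfs/<cid>/<path>."""
    parts = urlsplit(url)
    assert parts.scheme in ("ipfs", "ipns"), "not an ipfs:// or ipns:// URL"
    return f"{gateway.rstrip('/')}/{parts.scheme}/{parts.netloc}{parts.path}"

print(rewrite_ipfs_url("ipfs://bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi"))
# -> http://127.0.0.1:8080/ipfs/bafybei...
```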
I’d love to see IPFS native protocol support in curl, to be honest.
After all, curl already supports dozens of other obscure protocols: DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP, WS and WSS…
One challenge of doing this correctly is that curl is intended to be a “one-off” call that downloads and exits without participating in the swarm the way a good BitTorrent client should. Granted, presumably this IPFS gateway solution also has this problem.
Imo that’s better than it being married to crypto. IPFS has its uses.
Are there actual perverse incentives that come with "it being married to crypto" — keep in mind that there are demonstrable positive incentives — or is it just aesthetic distaste?
Honestly, I agree. This is not IPFS support, this is “we ship with a default URL rewriting rule and then make an HTTP call”.
Why not just curl the gateway then? You already could!
It’s a first step, though
At least the rule is invoked safely. curl doesn’t do localhost pings by default; instead it checks for an IPFS_GATEWAY environment variable or a ~/.ipfs/gateway file, and will fail with instructions if neither is present.
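A sketch of that lookup order as described above (this mirrors the description, not curl's source):

```python
import os
from pathlib import Path

def resolve_gateway() -> str:
    """IPFS_GATEWAY environment variable first, then ~/.ipfs/gateway,
    otherwise fail with instructions."""
    env = os.environ.get("IPFS_GATEWAY", "").strip()
    if env:
        return env
    gw_file = Path.home() / ".ipfs" / "gateway"
    if gw_file.is_file():
        return gw_file.read_text().strip()
    raise SystemExit(
        "Error: no IPFS gateway configured.\n"
        "Set the IPFS_GATEWAY environment variable or put a gateway URL "
        "in ~/.ipfs/gateway.")
```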
I'm not that familiar with IPFS, I must admit (though it looks great conceptually), but when using the curl CLI tool, why would the operator not just curl the public gateway address?
I'm confused about why such a shallow abstraction was put into something present on every device, and why this seems to be such a big deal to the decentralized community.
Even with its automatic gateway detection, it's purely used to rewrite the URL, which seems like something the operator could easily do themselves.
I could easily be missing something here, though.
I wonder what changed?
I wasn't able to find an equivalent in curl with a very cursory search, but wget has `--page-requisites`, which fetches (nominally) every file needed to display an HTML document. If curl does have something analogous, this change would allow HTML of the form:
<img src="ipfs://WHATEVER"/>
to be handled transparently (even when it occurs in a page that is not itself on IPFS). Ideally similar support would be added for "magnet:?..." and "[...].onion/..." URLs, for the same reason.
I didn't know what IPFS was, clearly I'm living under a rock.
From the above link:
> The InterPlanetary File System (IPFS) is according to the Wikipedia description: “a protocol, hypermedia and file sharing peer-to-peer network for storing and sharing data in a distributed file system.”. It works a little like bittorrent and you typically access content on it using a very long hash in an ipfs:// URL. Like this:
ipfs://bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi
My understanding is that the curl position was about making sure the UX (defaults) is safe and doesn't tie the user to any third-party gateway.
Default behavior in the merged curl PR got adjusted and now the only gateway that is used implicitly is the localhost one. Using an external, potentially untrusted public gateway requires explicit opt-in from the user via IPFS_GATEWAY env variable.
FWIW, in recent years the IPFS ecosystem has made content-addressing over HTTP viable and useful. Standards and specifications got created. Verifiable responses have standardized content types registered at IANA.
For practical info, see:
"Deserialized responses" vs "Verifiable responses" at https://curl.se/docs/ipfs.html
"Deserialized responses" are designed to be used on localhost; "Verifiable responses" are something one would use with a gateway they don't trust.
Client docs at https://docs.ipfs.tech/reference/http/gateway/#trustless-ver...
Server specification at https://specs.ipfs.tech/http-gateways/trustless-gateway/
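As a concrete example of a verifiable response: a trustless gateway will hand back raw block bytes when asked with Accept: application/vnd.ipld.raw, and the client can check them against the CID itself. A sketch for the single-block case, reusing decode_cidv1 from the earlier sketch (verifying a whole multi-block file would additionally require walking the Merkle DAG, e.g. from a CAR response):

```python
import hashlib
import urllib.request

def fetch_block_verified(gateway: str, cid: str) -> bytes:
    """Fetch one block from a possibly untrusted gateway and verify it.

    The CID's digest addresses the block bytes themselves, whatever the
    codec, so a plain sha2-256 comparison suffices for a single block.
    """
    _codec, digest = decode_cidv1(cid)  # from the CID sketch further up
    req = urllib.request.Request(
        f"{gateway.rstrip('/')}/ipfs/{cid}",
        headers={"Accept": "application/vnd.ipld.raw"})
    with urllib.request.urlopen(req) as resp:
        block = resp.read()
    if hashlib.sha256(block).digest() != digest:
        raise ValueError("gateway response does not match the CID")
    return block
```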
So, that's not really IPFS support in cURL. It is just support for IPFS URLs, as it actually consists of rewriting them to use an HTTP gateway, which does all the actual IPFS work.
I understand that implementing the IPFS protocol in a tool such as cURL does not make sense. But I don't really see the point of fake support like this.
> I have also learned that some of the IPFS gateways even do regular HTTP 30x redirects to bounce over the client to another gateway.
> Meaning: not only do you use and rely a total rando’s gateway on the Internet for your traffic. That gateway might even, on its own discretion, redirect you over to another host. Possibly run somewhere else, monitored by a separate team.
> I have insisted, in the PR for ipfs support to curl, that the IPFS URL handling code should not automatically follow such gateway redirects
I wonder what decision was made here.