This patch makes curl utilize a gateway for ipfs:// addresses. It prefers a local one but can also use public gateways. It makes no effort to verify that the final product's cryptographic hash corresponds to the one in the address.
This lack of verification is expected with HTTP, but not with IPFS. curl should verify that the resulting output matches the IPFS address, or else just have users type the gateway/ipfs HTTP address themselves, as they always could.
curl can operate in a pipe mode, and that adds additional complexity with respect to verification.
IPFS gateways can serve you in a manner that allows continuous(?) verification: <https://docs.ipfs.tech/reference/http/gateway/#trusted-vs-tr...>
This would in theory allow curl to block the pipe until it can confirm that an arriving piece is verified, or to abort a tampered-with file early. That would take quite a bit of work to implement, however, since there seems to be no maintained IPFS implementation in C: <https://docs.ipfs.tech/concepts/ipfs-implementations>
Question: is the in-URL hash some form of Merkle tree root hash? Or is another method used to avoid having to download all the data before the hash can be verified?
Searching for “Merkle” on https://ipld.io/specs/transport/car/carv2/ gives no results.
There’s an intro to IPFS content identifiers here: https://docs.ipfs.tech/concepts/content-addressing/.
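For reference: the in-URL hash is indeed a Merkle root. The CID wraps the sha2-256 digest of the root block of a Merkle DAG, so in principle blocks can be verified as they arrive rather than only after the full download. A minimal sketch of what sits inside a CIDv1 (assuming the base32 multibase, sha2-256, and single-byte varints, which holds for the common "bafy..." CIDs; the helper name is mine, not from any IPFS library):

```python
import base64

# Common multicodec values (these fit in single-byte varints):
RAW, DAG_PB, SHA2_256 = 0x55, 0x70, 0x12

def decode_cidv1(cid: str):
    """Return (codec, sha2-256 digest) for a base32 CIDv1 string."""
    assert cid[0] == "b", "only the base32 multibase prefix is handled here"
    b32 = cid[1:].upper()
    raw = base64.b32decode(b32 + "=" * (-len(b32) % 8))  # restore padding
    version, codec, hash_code, digest_len = raw[0], raw[1], raw[2], raw[3]
    assert version == 1 and hash_code == SHA2_256 and digest_len == 32
    return codec, raw[4:4 + digest_len]

codec, digest = decode_cidv1(
    "bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi")
# codec == DAG_PB here: the digest is the hash of the DAG *root block*,
# not of the whole file's bytes; for single-block RAW CIDs it is the
# hash of the content bytes themselves.
print(hex(codec), digest.hex())
```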
I continue to regret that the window for the major browsers to incubate and support a content-addressable URL scheme based on then-current distributed hash table algorithms closed ~20 years ago and has shown no signs of being reopened.
Because 99% of the time people don't want immutable addresses. They want something that points to the latest, most up-to-date version. And when you really do want to freeze something in time, you just use one of the page archive services.
P2P also just flat out doesn't work for mobile devices, which are pretty much the entire internet userbase now.
Had content-based URLs been a thing 20 years ago, presumably technology would have evolved for mobile ISPs to host border caches that end users could trust 100%, without worry of nefarious manipulation.
Perhaps in 99% of pages you want the latest version, but I think in 99% of requests you want an immutable thing. Generally speaking, web pages are made up of vast numbers of immutable resources.
http://bittorrent.org/beps/bep_0044.html
Tangential, but does anyone know why discovery seems to barely work in IPFS recently? A few years ago I could start two nodes in random places and copy files between them with just a little delay. These days I'm rarely able to start the transfer at all, even though both sides are connected to a few peers. Has something changed? Is it due to more/fewer people in the network?
How does that work? I would expect more nodes = more capacity for answering lookups / spreading the DHT. Why is the result the opposite? If I start a full node it's definitely not overwhelmed with traffic.
Do you have any sources where I can read up on this? I've also heard of this complaint and I'm interested in diving into it for my thesis. Do you know what algorithms are suitable to make IPFS scalable?
This does not appear to be IPFS; it's just a crude method of rewriting ipfs:// URLs into HTTP requests to gateways that then do the actual IPFS work. Which, much like Bitcoin, nobody seems to want to run locally, despite that mostly defeating the purpose. On the bright side, none of this ended up polluting libcurl.
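The rewrite rule itself is indeed tiny. A sketch under the standard gateway path layout (/ipfs/<cid>); the default gateway address and the function name are illustrative, not curl's actual code:

```python
from urllib.parse import urlsplit

def rewrite_ipfs_url(url: str, gateway: str = "http://127.0.0.1:8080") -> str:
    """Turn ipfs://<cid>/<path> into <gateway>/ipfs/<cid>/<path>."""
    parts = urlsplit(url)
    assert parts.scheme in ("ipfs", "ipns"), "not an ipfs:// or ipns:// URL"
    return f"{gateway.rstrip('/')}/{parts.scheme}/{parts.netloc}{parts.path}"

print(rewrite_ipfs_url("ipfs://bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi"))
# -> http://127.0.0.1:8080/ipfs/bafybei...
```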
I’d love to see IPFS native protocol support in curl, to be honest.
After all, curl already supports dozens of other obscure protocols: DICT, FILE, FTP, FTPS, GOPHER, GOPHERS, HTTP, HTTPS, IMAP, IMAPS, LDAP, LDAPS, MQTT, POP3, POP3S, RTMP, RTMPS, RTSP, SCP, SFTP, SMB, SMBS, SMTP, SMTPS, TELNET, TFTP, WS and WSS…
One challenge of doing this correctly is that curl is intended to be a “one-off” call that downloads and exits without participating in the swarm the way a good BitTorrent client should. Granted, presumably this IPFS gateway solution also has this problem.
Imo that’s better than it being married to crypto. IPFS has its uses.
Are there actual perverse incentives that come with "it being married to crypto" — keep in mind that there are demonstrable positive incentives — or is it just aesthetic distaste?
Honestly, I agree. This is not IPFS support, this is “we ship with a default URL rewriting rule and then make an HTTP call”.
Why not just curl the gateway then? You already could!
It’s a first step, though
At least the rule is invoked safely. curl doesn’t do localhost pings by default; instead it checks for an IPFS_GATEWAY environment variable or a ~/.ipfs/gateway file, and will fail with instructions if neither is present.
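A sketch of that lookup order as described above (this mirrors the description, not curl's source):

```python
import os
from pathlib import Path

def resolve_gateway() -> str:
    """IPFS_GATEWAY environment variable first, then ~/.ipfs/gateway,
    otherwise fail with instructions."""
    env = os.environ.get("IPFS_GATEWAY", "").strip()
    if env:
        return env
    gw_file = Path.home() / ".ipfs" / "gateway"
    if gw_file.is_file():
        return gw_file.read_text().strip()
    raise SystemExit(
        "Error: no IPFS gateway configured.\n"
        "Set the IPFS_GATEWAY environment variable or put a gateway URL "
        "in ~/.ipfs/gateway.")
```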
I'm not that familiar with IPFS, I must admit (though it looks great conceptually), but when using the curl CLI tool, why would the operator not just curl the public gateway address?
I'm confused about why such a shallow abstraction was put into something present on every device, and why this seems to be such a big deal to the decentralized community.
Even with its automatic gateway detection, it's purely used to rewrite the URL, which seems like something the operator could easily do themselves.
I could easily be missing something here, though.
I wonder what changed?
I wasn't able to find an equivalent in curl with a very cursory search, but wget has `--page-requisites`, which fetches (nominally) every file needed to display an HTML document. If curl does have something analogous, this change would allow HTML of the form:
<img src="ipfs://WHATEVER"/>
to be handled transparently (even when it occurs in a page that is not itself on IPFS). Ideally similar support would be added for "magnet:?..." and "[...].onion/..." URLs, for the same reason.
I didn't know what IPFS was, clearly I'm living under a rock.
From the above link:
> The InterPlanetary File System (IPFS) is according to the Wikipedia description: “a protocol, hypermedia and file sharing peer-to-peer network for storing and sharing data in a distributed file system.”. It works a little like bittorrent and you typically access content on it using a very long hash in an ipfs:// URL. Like this:
ipfs://bafybeigagd5nmnn2iys2f3doro7ydrevyr2mzarwidgadawmamiteydbzi
My understanding is that the curl position was about making sure the UX (defaults) is safe and doesn't tie the user to any third-party gateway.
Default behavior in the merged curl PR got adjusted and now the only gateway that is used implicitly is the localhost one. Using an external, potentially untrusted public gateway requires explicit opt-in from the user via IPFS_GATEWAY env variable.
FWIW, in recent years the IPFS ecosystem has made content-addressing over HTTP viable and useful. Standards and specifications got created. Verifiable responses have standardized content types registered at IANA.
For practical info, see:
"Deserialized responses" vs "Verifiable responses" at https://curl.se/docs/ipfs.html
"Deserialized responses" are designed to be used on localhost; "Verifiable responses" are something one would use with a gateway they don't trust.
Client docs at https://docs.ipfs.tech/reference/http/gateway/#trustless-ver...
Server specification at https://specs.ipfs.tech/http-gateways/trustless-gateway/
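As a concrete example of a verifiable response: a trustless gateway will hand back raw block bytes when asked with Accept: application/vnd.ipld.raw, and the client can check them against the CID itself. A sketch for the single-block case, reusing decode_cidv1 from the earlier sketch (verifying a whole multi-block file would additionally require walking the Merkle DAG, e.g. from a CAR response):

```python
import hashlib
import urllib.request

def fetch_block_verified(gateway: str, cid: str) -> bytes:
    """Fetch one block from a possibly untrusted gateway and verify it.

    The CID's digest addresses the block bytes themselves, whatever the
    codec, so a plain sha2-256 comparison suffices for a single block.
    """
    _codec, digest = decode_cidv1(cid)  # from the CID sketch further up
    req = urllib.request.Request(
        f"{gateway.rstrip('/')}/ipfs/{cid}",
        headers={"Accept": "application/vnd.ipld.raw"})
    with urllib.request.urlopen(req) as resp:
        block = resp.read()
    if hashlib.sha256(block).digest() != digest:
        raise ValueError("gateway response does not match the CID")
    return block
```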
So, that's not really IPFS support in cURL. It is just support for IPFS URLs, as it actually consists of rewriting them to use an HTTP gateway, which does all the actual IPFS work.
I understand that implementing the IPFS protocol in a tool such as cURL does not make sense. But I don't really see the point of fake support like this.
> I have also learned that some of the IPFS gateways even do regular HTTP 30x redirects to bounce over the client to another gateway.
> Meaning: not only do you use and rely a total rando’s gateway on the Internet for your traffic. That gateway might even, on its own discretion, redirect you over to another host. Possibly run somewhere else, monitored by a separate team.
> I have insisted, in the PR for ipfs support to curl, that the IPFS URL handling code should not automatically follow such gateway redirects
I wonder what decision was made here.