Cap: Lightweight, modern open-source CAPTCHA alternative using proof-of-work

Did no-one click through to the technical white paper?

https://www.researchgate.net/publication/374638786_Proof-of-...

"Proof-of-Work CAPTCHA with password cracking functionality"

The "work" is "to use the distributed power of webusers’ computers" to "obtain suspects’ passwords in order to access encrypted evidence" and "support law enforcement activities".

Funny how that isn't mentioned anywhere in the linked site.

mcpar-land · 9 months ago

> Normally, it is undesirable for users’ passwords to be cracked. However, in the case of law enforcement, we often need to obtain suspects’ passwords in order to access encrypted evidence. The obvious solution is to build powerful (and expensive) dictionary cryptanalysis computers. A less obvious approach is to use the distributed power of web users’ computers, as has been done in the Seti@Home (https://setiathome.berkeley.edu/ — suspended project) or Folding@Home projects (https://foldingathome.org/). The proposed approach can therefore support law enforcement activities while providing the desired functionality to the web community

"You're not allowed to visit this website unless you submit your computer to being part of the fed's password cracking botnet" that's a whole fresh hell. A better use case is right there in their own description! I'd love my captchas to be little Folding@Home problems.

downrightmike · 9 months ago

That is shady as hell. Welp this is dead on the vine

tiagorangel · 9 months ago

btw no, cap does not contribute to any "fed botnet". you can build the WASM binaries yourself and compare the hashes. added a clarification about that to the docs.
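A minimal sketch of that comparison, assuming you have already rebuilt the WASM locally and have both files on disk. The function names and paths here are hypothetical and are not part of Cap. A byte-for-byte match also assumes the build is reproducible with your toolchain version.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_binary(shipped: Path, rebuilt: Path) -> bool:
    """True if the shipped binary is byte-identical to the local rebuild."""
    return sha256_of(shipped) == sha256_of(rebuilt)
```

You would call it as something like same_binary(Path("shipped/cap_wasm_bg.wasm"), Path("rebuilt/cap_wasm_bg.wasm")), with both paths adjusted to wherever your copies live.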

GoblinSlayer · 9 months ago

Bitcoin network used to bruteforce 85 bits per year, which is slightly bigger than capacity of [a-z0-9]{16}
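For anyone who wants to check the arithmetic, here is a quick sketch. The 85 bits per year figure is taken from the comment as given, not verified. The keyspace of [a-z0-9]{16} works out to roughly 82.7 bits.

```python
import math

ALPHABET_SIZE = 26 + 10   # lowercase letters plus digits, i.e. [a-z0-9]
LENGTH = 16

keyspace = ALPHABET_SIZE ** LENGTH
bits = math.log2(keyspace)

print(f"{keyspace:.3e} candidates, about {bits:.1f} bits")

# How many times 2**85 hash attempts would cover that keyspace.
print(f"2**85 / keyspace = {2 ** 85 / keyspace:.1f}")
```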

ronsor · 9 months ago

Can't we just submit bogus hashes?

p0w3n3d · 9 months ago

2030: to enter the site you must allow us to mine a few ethereal on your pc...

SparkyMcUnicorn · 9 months ago

Definitely concerning, although I'm having trouble finding anything in the codebase to support this.

This paper even seems to contradict aspects of the project's no tracking stance. If someone told me this paper was for a different (but similar) project, I'd believe it after looking at the two side by side.

Would definitely want this to be addressed before I'd consider using it.

prophesi · 9 months ago

There are two binaries committed to the repo (cap_wasm_bg.wasm) but from what I can tell, it doesn't seem to be making any network calls or what have you. They still should get rid of them and add a Rust build step for their browser/node packages.

hathawsh · 9 months ago

Interesting discovery. This research sounds creepy and ill-advised, but my intuition suggests to me this is an innocent attempt to do something useful rather than waste energy on a PoW algorithm. My intuition also tells me that if this project became popular enough, attackers would break the algorithm fairly easily and the project would just revert to a more conventional PoW algorithm that doesn't try to be smart.

powgpu · 9 months ago

Wasn't there a crypto that uses GPUs to solve LLM computation as PoW? Wouldn't that be a better approach?

tiagorangel · 9 months ago

Cap does not send any of the calculated hashes ANYWHERE, the white paper just details a bit how proof-of-work works and I thought that it would be interesting to share.

tiagorangel · 9 months ago

added a note saying that to the docs, should hopefully clarify stuff a bit!

Cyphase · 10 months ago

I think there's a good chance they just linked to the paper for technical background, unrelated to the paper's mention of law enforcement usage. The website mentions self-hosted, no third-party requests, etc. Unless they're flat-out lying.

tiagorangel · 9 months ago

Yes, the code is open-source for you to check

I was wondering if more sites will start to drift to a system where they require you to be logged in to an account attached to a SIM card in some ways.

I feel like accounts that require phone verification are already similar in that they require some cost to access. It obviously wouldn't stop a large corporation from buying up thousands of numbers if they needed it for a specific purpose, but it would be prohibitively expensive for most to try this.

The benefit of the SIM system is it actually costs zero for people since they already have a cell phone.

jeroenhd · 9 months ago

> a SIM card

That's basically what remote attestation is. But it's using TPMs (or similar) rather than SIM cards. The TPM has a key signed by the manufacturer, and that key can be used to sign tokens to prove that you possess a physical TPM and have it in a mode that provides access to that key.

The problem with either is that the system doesn't work if you can get access to the keys behind the system. That means banning everyone who uses a vulnerable model of SIM card/TPM implementation. SIMs are cheaper to replace, but you'd have to replace millions of them every time someone manages to voltage glitch a SIM card.

If you own an iPhone or MacBook, you have access to a browser that already does this: https://developer.apple.com/news/?id=huqjyh7k

nailer · 9 months ago

> I was wondering if more sites will start to drift to a system where they require you to be logged in to an account attached to a SIM card in some ways.

I hope we move away from SIM cards - they'll require SIM based auth checks and low paid staff at cell phone companies will happily give away my SIM card to another phone to get a kickback from robbing people.

theamk · 9 months ago

Such a site had better provide some unique service no one else can.

There is no way I am sharing my phone number with random sites unless I absolutely have to, I get enough spam & scam already, and tracking potential is enormous.

olyjohn · 9 months ago

You might not, but most people don't care anymore, and they will give their personal data. And then you will have no choice, as you will be the outlier who is just an old man yelling at clouds.

hardwaresofton · 10 months ago

No need for the SIM, just being logged in to something will probably be enough to stop most crawlers.

Then, if someone is logged in, you can throw TOS their way, and make it a legal problem.

downrightmike · 9 months ago

Yes because having an account gets around adblockers, anti tracking, age verification and section 230 removal issues. ToS is already weaponized.

landl0rd · 9 months ago

Phone number is also good because you can be reasonably sure as to whether it's voip or not. It is literally the one non-awful solution to the sybil problem we have discovered (the awful ones being things like gov id).

subscribed · 9 months ago

Thank you, I hate it.

There's no way in hell I'm going to create an account on every site I want to read, and absolutely I'm not submitting my number for the eternal, unrelenting spam.

I have enough crap from the legitimate companies selling/leaking my number, to now deal with _that_.

nicwolff · 9 months ago

internetter · 9 months ago

> @cap.js/solver is a standalone library that can be used to solve Cap challenges from the server. Doesn't this defeat the purpose of Cap? Not really. Server-side solving is a core use case of proof-of-work CAPTCHAs like Cap or altcha. It's about proving effort, not necessarily involving a human.

I like this. Allows for reasonable bots like IA without the mindless wasteful AI scrapers.

Isn't IA's architecture pretty strained already without this?

areyourllySorry · 9 months ago

for manual saves, it might be able to offload the challenge to the saver's computer (but that means adding explicit support for this particular library, which might or might not happen...)

aiiotnoodle · 9 months ago

Sorry what is IA?

underyx · 9 months ago

Internet Archive

coppsilgold · 10 months ago

SHA-256 PoW will probably work until it doesn't (if bots choose to invest in ASICs, or services that offer this pop up). Also users may be at a disadvantage as JS crypto would not be optimized for PoW (for example lack parallel crypto capabilities or context switching between calls).

One advantage a PoW "CAPTCHA" system holds is that the service operator can change the algorithm whenever they want. This may make an ASIC approach too risky to bother with. The JS<>ASM crypto bridge would nevertheless require some optimization from the browser developers.

Some cryptocurrencies which aim for ASIC resistance create PoW algorithms that would require re-implementation of a significant fraction of the CPU die to be a viable ASIC attack vector. An example of that would be randomx[1]. Using it for in-browser PoW would require native support as it will not be competitive against the bots with just a JS or WASM implementation. A modification would need to be made to not be abused for crypto mining. This will also link the cost of the PoW solution to the opportunity cost of mining the respective cryptocurrency which is well understood.

[1] <https://github.com/tevador/RandomX>

> Also users may be at a disadvantage as JS crypto would not be optimized for PoW (for example lack parallel crypto capabilities or context switching between calls).

JS crypto is only used as a fallback, Rust WASM is used for solving.

satellite2 · 9 months ago

It's just going to make low-battery devices with consumer-grade compute drain faster, while bot farms with access to ASICs will see a negligible increase in cost. This approach is going to have all the same problems distributing work democratically as cryptocurrencies had. And as far as I know crypto didn't solve this.

marinmania · 9 months ago

mgrandl · 9 months ago

What does proof-of-work mean here and what makes it easy for humans and hard for bots?

stephantul · 9 months ago

Think of crawlers: a crawler typically makes hundreds or thousands of requests per second. The owners of the crawler then sell this data for X$, or gain X$ profit.

Proof of work adds a very small cost to each individual request, increasing the cost of crawling to a number higher than X. Because actual humans make very few requests, we don’t notice the increase in cost.
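A minimal hashcash-style sketch of that per-request cost, assuming SHA-256 and a "leading zero bits" target. This illustrates the general technique only and is not Cap's actual challenge format; the function names and the difficulty value are made up for the example.

```python
import hashlib
import itertools
import os


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: try nonces until the hash has `difficulty` leading zero bits."""
    for nonce in itertools.count():
        digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
        if leading_zero_bits(digest) >= difficulty:
            return nonce


def verify(challenge: bytes, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash checks the claimed solution."""
    digest = hashlib.sha256(challenge + str(nonce).encode()).digest()
    return leading_zero_bits(digest) >= difficulty


if __name__ == "__main__":
    challenge = os.urandom(16)               # issued by the server, unique per request
    nonce = solve(challenge, difficulty=16)  # about 2**16 hashes on average
    print(nonce, verify(challenge, nonce, 16))
```

The asymmetry is the point: the solver pays thousands of hashes per request, while the server pays one. A made-up nonce fails verify unless it happens to hit the target.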

timtom123 · 9 months ago

This exactly, having run very large scraping operations, it only takes a slight increase in cost to make it unprofitable for many use cases.

hombre_fatal · 9 months ago

When you use a captcha, you presumably want to defeat someone curling your CreatePost endpoint, not just make it more annoying to do it at only botnet scale.

This captcha still lets all traffic through. Except now you waste the battery of honest users.

Even HN proponents of the idea don't use it on their own sites.

skydhash · 9 months ago

It's equally easy for both. But people using browsers only do it a few times, while bots need to do it many times. A second for a human every X pages is not much, but it's a death-knell for the general practice of bots (and they can't store the cookies because you can rate-limit them that way).

Imagine scraping thousands of pages, but with an X>1 second wait for each. There wouldn't be a need to use such a solution if crawlers were rate-limiting themselves, but they don't.
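A back-of-the-envelope sketch of how that wait adds up. The page counts, the one-second challenge time and the worker count are illustrative assumptions, not measurements.

```python
def added_delay_hours(pages: int, seconds_per_challenge: float, workers: int = 1) -> float:
    """Total wall-clock hours spent on challenges, split evenly across workers."""
    return pages * seconds_per_challenge / workers / 3600


# One human reading 20 pages vs. a crawler fetching a million pages.
print(added_delay_hours(20, 1.0))                      # well under a minute
print(added_delay_hours(1_000_000, 1.0))               # about 278 hours on one core
print(added_delay_hours(1_000_000, 1.0, workers=100))  # still hours, on 100 cores
```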

reaperducer · 9 months ago

So is the solution to stymying bots to just add a page load delay of a second or two? Enough that people won't care, but it doesn't scale for bots?

So if you rate-limit to one request per second, a bot can just use 100 cookies to make 100 requests per second, at 1 request per second per cookie.
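A small sketch of why a per-cookie limit alone doesn't cap a bot's total throughput. The limiter class is a hypothetical fixed-window design written for this example, not taken from any particular library, and it never evicts old windows.

```python
from collections import defaultdict


class PerCookieLimiter:
    """Allow at most `limit` requests per cookie in each one-second window."""

    def __init__(self, limit: int = 1) -> None:
        self.limit = limit
        self.counts: dict[tuple[str, int], int] = defaultdict(int)

    def allow(self, cookie: str, now: float) -> bool:
        window = (cookie, int(now))  # one bucket per cookie per whole second
        if self.counts[window] >= self.limit:
            return False
        self.counts[window] += 1
        return True


limiter = PerCookieLimiter(limit=1)

# One cookie: only the first of 100 requests in the same second gets through.
single = sum(limiter.allow("cookie-0", now=0.5) for _ in range(100))

# 100 cookies: every client identity gets its own budget.
many = sum(limiter.allow(f"bot-{i}", now=0.5) for i in range(100))

print(single, many)  # prints "1 100"
```

A limit keyed on identity scales with the number of identities the bot can mint. A proof-of-work cost is charged for every challenge solved, however many cookies are in play.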

pixl97 · 9 months ago

I think it's only more expensive for bots, though just as easy for bots.

The problem with bots is they quite often farm this out to stolen resources. It makes sending whatever they are sending slower, but doesn't stop it.

emseetech · 9 months ago

It will make server hijacking more noticeable and harder to hide.

i wrote a bit about it here: https://capjs.js.org/guide/effectiveness.html

jbellis · 9 months ago

ahh, that makes sense, thanks

I do think that calling this a CAPTCHA when it's not actually intended to distinguish humans from computers is a bit misleading, but I can see why you would do that

pkkkzip · 9 months ago

How does this compare to Anubis, another similar PoW based CAPTCHA?

Paired with this, and if there is a way to block out DDOS https traffic then we might be able to stop dependence on Cloudflare altogether.

throitallaway · 9 months ago

I'd be so happy if the Internet moved away from Cloudflare for Captcha. I got on their "bad list" at one point (for who knows why), and no matter how many times I checked the "I am a human" box their Captcha wouldn't let me through for a few days. I was unable to login to the portal of a product that we pay for. It was such a frustrating experience.

zipping1549 · 9 months ago

One of many reasons people use Cloudflare is that they effectively hide your real static ip, at least for low effort actors. It's not just about DDOS.

In my case the harsh firewall settings made by my company on our laptops were showing a red flag on captchas, WAFs, etc.

anubis is more like Cap's checkpoint, but still the implementation is very different.

robbles · 9 months ago

This is a neat idea.

I don't know enough about the underlying proof-of-work stuff to comment on how effective this could be, but I think it's pretty funny that the UI examples say "I'm a human".

I guess "there's only a few of me at most" or "I could allocate enough computation to this that I'm probably not up to no good" don't read as clearly.