Readit News
costco commented on Show HN: Building a web search engine from scratch with 3B neural embeddings   blog.wilsonl.in/search-en... · Posted by u/wilsonzlin
costco · 16 days ago
This is awesome, and the low cost is especially impressive. I rarely have the motivation after working on a side project to actually document all the decisions made along the way, much less in such a thorough way. Regarding your CoreNN library, Clearview has a blog post [1] on how they index 30 billion face embeddings that you may find interesting. They combine RocksDB with faiss.

[1] https://www.clearview.ai/post/how-we-store-and-search-30-bil...

costco commented on Tor: How a military project became a lifeline for privacy   thereader.mitpress.mit.ed... · Posted by u/anarbadalov
thewebguyd · 20 days ago
The feds and equivalent agencies in other countries have been running exit nodes for years, but it's still better than most solutions even if not perfect. Anyone who has gotten caught, though, likely wasn't caught because of any flaws in Tor (or said exit nodes) but because of other lapses in OpSec.

That being said, yes, feds can de-anonymize traffic, probably reliably at this point. There are only about 7,000-8,000 active nodes, most in data centers. The fewer nodes you hop through, the more likely it is that traffic can be traced back to the entry point (guard node) and, combined with timing, reasonably traced back to the user. Tor works best with many, many nodes, and a minimum of three. There aren't as many nodes as there need to be, so quite often you are only going through three (guard node/entry point, middle node, exit node).

Plus, browsing habits can also be revealing. Just because someone is using Tor doesn't mean they have also disabled JavaScript, blocked cookies, stopped logging into accounts, etc.

costco · 20 days ago
This page on the mailing list has links to cases of people who were caught because of an unknown flaw in Tor: https://archive.torproject.org/websites/lists.torproject.org...

I can't find a link, but I think people have done simulations and the privacy benefits of more hops are not as great as one might think. If you control the guard and exit, then traffic confirmation is relatively easy by just looking at timing and volume of traffic no matter how many hops are in between.
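
To make the timing-and-volume point concrete, here is a minimal C++ sketch, not a description of any real attack tooling. It assumes an observer has already bucketed the bytes seen at the guard side and at the exit side into aligned time intervals (the function name `pearson` and the sample numbers are hypothetical). A high correlation between the two series suggests they are the same flow, and nothing in the calculation depends on how many hops sit in between.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Pearson correlation between two equal-length series of per-interval
// byte counts. Returns 0.0 if the series are empty, differ in length,
// or if either one is constant (the correlation is undefined there).
double pearson(const std::vector<double>& a, const std::vector<double>& b) {
    if (a.size() != b.size() || a.empty()) return 0.0;
    const double n = static_cast<double>(a.size());

    double mean_a = 0.0, mean_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        mean_a += a[i];
        mean_b += b[i];
    }
    mean_a /= n;
    mean_b /= n;

    double cov = 0.0, var_a = 0.0, var_b = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double da = a[i] - mean_a;
        const double db = b[i] - mean_b;
        cov += da * db;
        var_a += da * da;
        var_b += db * db;
    }
    if (var_a == 0.0 || var_b == 0.0) return 0.0;
    return cov / std::sqrt(var_a * var_b);
}
```

Feeding it a bursty guard-side series and an exit-side series with the same shape gives a value close to 1, while an unrelated flow scores much lower. A real observer would also have to handle latency offsets, padding, and multiplexed circuits, which this sketch ignores.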

costco commented on Tor: How a military project became a lifeline for privacy   thereader.mitpress.mit.ed... · Posted by u/anarbadalov
WarOnPrivacy · 20 days ago
> I ran a bridge until recently

A lifetime ago, I ran bridges from RAM-only distros. But early versions of the Dan list (1st in wide use) killed that.

DL didn't try hard to differentiate between bridge IPs and exit IPs. Server hosts just grabbed the first list they saw and blocked with it.

It was years before the notion of Exit != Bridge became understood, but by then everyone had moved on. We're at the entropic 'No One Cares Anymore' phase now.

costco · 20 days ago
Were you specifically running a bridge or just a non-exit relay? Bridges are generally unlisted and are somewhat expensive to mass scrape (the bridge distributors will require a captcha, email, Telegram, etc.), so they are less likely to show up in those lists, whereas all relays are listed in the consensus and can be trivially enumerated.
costco commented on Bloom Filters by Example   llimllib.github.io/bloomf... · Posted by u/ibobev
costco · 2 months ago
I had used bloom filters in the past without really understanding how they worked. Then one day I decided to implement them just going off the Wikipedia article with the 32-bit MurmurHash function and was surprised at how simple it was. If you're using C++ you can use std::vector<bool> (or std::bitset, if the number of bits is known at compile time) to make it even easier to store the bits in a space-efficient way.
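For anyone curious how little code that is, here is a minimal sketch of the idea rather than the implementation described above. It assumes the 32-bit MurmurHash3 variant, derives the k hash functions by using seeds 0..k-1, and stores bits in a std::vector<bool>. The names `BloomFilter`, `insert`, and `possibly_contains` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Mixing step applied to each 4-byte block in MurmurHash3 (32-bit).
inline std::uint32_t murmur_scramble(std::uint32_t k) {
    k *= 0xcc9e2d51u;
    k = (k << 15) | (k >> 17);
    k *= 0x1b873593u;
    return k;
}

// 32-bit MurmurHash3. Blocks are read in native byte order, so values
// match the reference implementation on little-endian machines.
std::uint32_t murmur3_32(const std::string& key, std::uint32_t seed) {
    const auto* data = reinterpret_cast<const unsigned char*>(key.data());
    const std::size_t len = key.size();
    std::uint32_t h = seed;
    std::size_t i = 0;
    for (; i + 4 <= len; i += 4) {
        std::uint32_t k;
        std::memcpy(&k, data + i, 4);
        h ^= murmur_scramble(k);
        h = (h << 13) | (h >> 19);
        h = h * 5u + 0xe6546b64u;
    }
    // Fold in the remaining 0-3 tail bytes, last byte first.
    std::uint32_t k = 0;
    for (std::size_t j = len; j > i; --j) {
        k = (k << 8) | data[j - 1];
    }
    h ^= murmur_scramble(k);
    // Finalization: mix in the length and avalanche the bits.
    h ^= static_cast<std::uint32_t>(len);
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    h *= 0xc2b2ae35u;
    h ^= h >> 16;
    return h;
}

class BloomFilter {
public:
    BloomFilter(std::size_t num_bits, std::uint32_t num_hashes)
        : bits_(num_bits, false), num_hashes_(num_hashes) {
        assert(num_bits > 0 && num_hashes > 0);
    }

    // Set one bit per hash function (one seed per function).
    void insert(const std::string& item) {
        for (std::uint32_t seed = 0; seed < num_hashes_; ++seed) {
            bits_[murmur3_32(item, seed) % bits_.size()] = true;
        }
    }

    // false: definitely never inserted. true: probably inserted.
    bool possibly_contains(const std::string& item) const {
        for (std::uint32_t seed = 0; seed < num_hashes_; ++seed) {
            if (!bits_[murmur3_32(item, seed) % bits_.size()]) return false;
        }
        return true;
    }

private:
    std::vector<bool> bits_;    // packed bit storage
    std::uint32_t num_hashes_;  // number of hash functions (k)
};
```

Constructing `BloomFilter f(1024, 3)`, inserting a few strings, and calling `possibly_contains` never produces a false negative. False positives become more likely as the bit array fills up, so the bit count and hash count should be sized for the expected number of items.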
costco commented on Unauthorized experiment on r/changemyview involving AI-generated comments   old.reddit.com/r/changemy... · Posted by u/xenophonf
costco · 4 months ago
Look at the accounts linked at the bottom of the post. They actually sound like real people, whereas you can usually spot bots from a mile away.
costco commented on I'm done with coding   neelc.org/2025/03/01/im-d... · Posted by u/neelc
costco · 6 months ago
Clicked on this post because the domain name sounded familiar from the Tor mailing list. I knew you ran a large set of relays but didn't know you were also a pretty extensive contributor over many years! You're definitely smart and if you were able to get a job at Microsoft you're capable of getting a job at most other places, so this doesn't really have to be a permanent decision if you don't want it to be. You can work at Let's Encrypt, Tor, Signal, etc while making an impact and still doing pretty well for yourself. Anyways, in the spirit of this forum, I wish you luck with your startup.
costco commented on Ask HN: Opinion on efforts to find prior art on outrageous priced drugs    · Posted by u/JPLeRouzic
JPLeRouzic · 6 months ago
I wonder if traders could incentivize such searches. I would appreciate pointers.
costco · 6 months ago
Going through the IPR process will cost half a million and a year of time at least. The filing fees alone are in the tens of thousands. It's unlikely anyone would pursue a case unless they thought they had a good chance of winning. Kyle Bass's strategy was just to hope that filing the patent dispute would cause the share price of the manufacturer to decline, which it did not in many cases. He ultimately lost most of the time in court, so I don't know how much he actually profited.

Have you heard of Cloudflare's Project Jengo [0] [1]? They were sued by Sable, a patent troll. So they made a website where anyone could submit prior art for any of Sable's patents and they would pay I think around $1000-2000 if it helped their case. Imagine if you had a website where you listed drugs alongside their patents, and a bounty in dollars if you found prior art. The bounty could be funded by hedge funds, generics companies, or other competitors. If the submissions were solid enough, they would take the case to lawyers and hopefully win.

    KEYTRUDA® (pembrolizumab)
    - Bounty: $100,000 
    - Patents:
        - U.S. Patent No. 8,354,509
        - U.S. Patent No. 8,900,587
        - U.S. Patent No. 9,834,605
        - U.S. Patent No. 11,117,961
        - U.S. Patent No. 9,220,776

    ...

You would probably need a lot of connections to make this work. You would also basically create a side hustle for bored patent lawyers or people with a lot of time on their hands. Though the people who are really good at this sort of work probably already make a lot of money, so maybe it wouldn't work.

This is basically your original idea, but there's a monetary incentive. I don't think people with the level of expertise needed to do this would do it for free.

[0] https://www.cloudflare.com/jengo/sable-prior-art-search/

[1] https://blog.cloudflare.com/the-project-jengo-saga-how-cloud...

Edit: but maybe I'm wrong. The CEO of Cloudflare says most of the people who submitted probably would have done it for free: https://news.ycombinator.com/item?id=41732580. But then again, Cloudflare was able to publicize their cause easily among technical people who can understand software patents on places like HN, and there was a moral righteousness element to it because patent trolls are parasites. It might be difficult to inspire the same level of enthusiasm about orphan drugs, and there is also likely a smaller number of people who have the skill to review drug patents.

costco commented on Ask HN: Opinion on efforts to find prior art on outrageous priced drugs    · Posted by u/JPLeRouzic
costco · 6 months ago
If you can successfully challenge pharma patents you can get rich by shorting the stock. A hedge fund manager named Kyle Bass tried this with a couple dozen drugs, with varying degrees of success. But also, a patent being invalidated doesn't mean prices come down immediately. ANDAs still take 3-4 years to get approved on average.

u/costco

Karma: 2180
Cake day: December 30, 2019
About
chrisjtarry \at\ gmail.com

https://github.com/chris124567