Readit News
neilv · 7 months ago
I used Tor for surveillance. But an appropriate kind, IMHO.

I used Tor as a small part of one of the capabilities of a supply chain integrity startup. I built a fancy scraper/crawler to discreetly monitor a major international marketplace (mainstream, not darknet), including selecting appropriate Tor exit nodes for each regional site, to try to ensure that we were seeing the same site content that people from those regions were seeing.
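
As a rough illustration of the per-region exit selection described above (a sketch, not the actual scraper): Tor's `ExitNodes` option accepts country codes in braces, so one way to do it is to run one Tor client per region, each with its own small torrc. The helper name, region table, ports, and paths below are all made up for the example.

```python
# Hypothetical helper: builds torrc text that prefers exits in one country.
# SocksPort, DataDirectory and ExitNodes are real torrc options; the region
# table, port numbers and directory paths are assumptions for illustration.

REGION_TO_COUNTRY = {"de": "de", "uk": "gb", "jp": "jp"}  # site region -> country code


def torrc_for_region(region: str, socks_port: int) -> str:
    """Return torrc contents for a Tor client dedicated to one regional site."""
    country = REGION_TO_COUNTRY[region]
    lines = [
        f"SocksPort {socks_port}",           # local SOCKS port the crawler connects to
        f"DataDirectory /tmp/tor-{region}",  # separate state directory per instance
        f"ExitNodes {{{country}}}",          # renders as e.g. "ExitNodes {gb}"
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(torrc_for_region("uk", 9061))
```

Each generated file would be passed to its own `tor -f <file>` process, and the crawler would pick the SOCKS port that matches the regional site it is fetching.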

Tor somehow worked perfectly for those needs. So my only big concern was making sure everyone in the startup knew not to go bragging about this unusually good data we had. Since we were one C&D letter away from not being able to get the data at all.

(Unfortunately, this had to be a little adversarial with the marketplace, not done as a data-sharing partnership, since the marketplace benefited from a cut of all the counterfeit and graymarket sales that we were trying to fight. But I made sure the scraper was gentle yet effective, both to not be a jerk, and also to not attract attention.)

(I can talk about it now, since the startup ran out of runway during Covid investor skittishness.)

cakealert · 7 months ago
This is not a good way to do this. Tor exit nodes are public and may be marked for special behavior by the marketplace you are surveying. There is no reason to believe you are getting good information this way.

The right way to do this would be through a VPN/tor + Residential proxy to hide your intentions from everyone involved.
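
To make the "exit nodes are public" point concrete, here is a minimal sketch of how a site could flag requests coming from known exits. It assumes a plain-text list with one IP address per line, similar in shape to published exit lists; the function names and the sample addresses (taken from documentation ranges) are invented for the example.

```python
import ipaddress


def parse_exit_list(text: str) -> set:
    """Parse one IP address per line, skipping blank lines and # comments."""
    exits = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        exits.add(ipaddress.ip_address(line))
    return exits


def is_known_exit(client_ip: str, exits: set) -> bool:
    """True if the client address appears in the exit list."""
    return ipaddress.ip_address(client_ip) in exits


# Made-up sample data; none of these are real Tor exits.
SAMPLE_EXIT_LIST = """# example exit list
192.0.2.10
198.51.100.7
2001:db8::1
"""

if __name__ == "__main__":
    exits = parse_exit_list(SAMPLE_EXIT_LIST)
    print(is_known_exit("192.0.2.10", exits))
    print(is_known_exit("203.0.113.5", exits))
```

A site that can run a check like this on every request can serve different content, a captcha, or a block page to exit traffic, which is the risk being described.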

neilv · 7 months ago
> There is no reason to believe you are getting good information this way.

Spot checks checked out. And it was a perfectly fine way to do it.

You are correct that Tor exit nodes often get special handling (at the moment, including by Cloudflare, and by Google reCAPTCHA). And the idea of poisoning of data is starting to propagate, due to anti-AI-scraper sentiment.

electroly · 7 months ago
Next time you find yourself in this situation, a $5 VPN subscription (Mullvad, etc.) gets you the same result without the IP being an obvious Tor exit node. Faster, too, in latency, bandwidth, and the time it takes to change locations. You only care about the VPN part for this, not the onion part.
gear54rus · 7 months ago
Are Mullvad IPs somehow non-obvious? I assume all mainstream VPNs are detected by their IPs and slapped with captchas at best (and blocked outright at worst).
apaprocki · 7 months ago
Worth mentioning a $5 cloud instance and installing Algo VPN on it gets you the same thing without having to trust a 3rd party VPN provider (only a generic VM provider such as AWS). It’s always worth minimizing companies you deal with if you already use AWS, GCP, etc.
amelius · 7 months ago
The $5 will only give you the equivalent of 1 exit node.
RGamma · 7 months ago
> selecting appropriate Tor exit nodes for each regional site

So, a proxy? Onion routing doesn't really play a role for this use case.

neilv · 7 months ago
> So, a proxy? Onion routing doesn't really play a role for this use case.

The onion routing obscured our identity from the "proxy" exit nodes.

Separately, Tor was also a convenient way to get a lot of arbitrary country-specific "proxies", without dealing with the sometimes sketchy businesses that are behind residential IP proxies.

(Counterfeiting/graymarket operations can be organized crime. I'd rather just fire up Tor, and trust math a little, than to try to vet the legitimacy and intentions of a residential IP broker.)

trod1234 · 7 months ago
Honestly what he describes sounds like Raptor (Princeton Report, 2015)
RobRivera · 7 months ago
HEH

I'm letting my imagination fill in the color on the specifics here and I'm working up a little grin.

A hat tip to you

globular-toast · 7 months ago
Is it that difficult for a business to colo in various locations? Not rhetorical, I've never thought about it.
sulandor · 7 months ago
exit node is not really a colo.

though there are commercial residential-proxy services available

cedws · 7 months ago
What was the scraper gathering specifically?
neilv · 7 months ago
Listings of items for sale (for ~100 brands), and how that changed over time. With the marketplace having a pretty rich schema to reconstruct from their server-side rendering.

One of the purposes was cold sales outreaches to an exec at a brand, maybe something like, "Here's a report about graymarket/counterfeit of your brand online, using data you probably haven't seen before; we have a solution we'd like to tell you about".

woadwarrior01 · 7 months ago
If I could wager a guess, it sounds like the startup was in the business of scraping Amazon.
anarbadalov · 7 months ago
For anyone interested in this author’s book on Tor, it’s available for free download! https://direct.mit.edu/books/oa-monograph/5761/TorFrom-the-D... (full disclosure: i work for MIT Press)
dannyobrien · 7 months ago
It's a really good book! I was on the very edges of this scene for a chunk of the time described, and I thought it managed to catch a lot of the complexities without picking one possible narrative over another.

Plus I learned a lot -- it came out of some academic research that pursued a unique angle: finding and talking to the Tor exit node operators about their experiences, rather than just say the developers, the executives, or the funders.

anarbadalov · 7 months ago
I'll share your kind words with the author!
bauruine · 7 months ago
You can also buy it if you want to support the author. https://mitpress.mit.edu/9780262548182/tor/
TMWNN · 7 months ago
Thanks for that. Is it available as epub? I would like to read it on Kindle.
daft_pink · 7 months ago
I think they publicized it so they could obscurely use it for military purposes. The users are easy to spot if they are all military users. Get tons and tons of regular users to use it and you obscure who is trying to hide.
matthewdgreen · 7 months ago
It's unclear if they really did this, or if this was just the pitch they gave to the government. But it was never secret that this was a goal they had explained to the US government: the inventors were pretty straightforward about everything.
fishgoesblub · 7 months ago
I've also read this at some point. Bit hard to have deniability if you're hacking into $ENEMY_COUNTRY servers using a network that only the US Government has access to.
esseph · 7 months ago
This is exactly it from what I have heard. I have heard this from a large number of trustworthy sources over the years.
schoen · 7 months ago
I think we have to distinguish between "the Navy proposed using it this way for this reason" (clearly they did, in writing!) and "the government actively uses it this way for this reason" (extremely hard to confirm).

I've met law enforcement people who talked about using Tor for anonymity during investigations, but in context they were looking for anonymity on the exit side rather than the entry side (so, a traditional VPN would have worked too). The original proposal about onion routing is focused on the security provided on the entry side (preventing local telecommunications operators from knowing whom you're communicating with).

palsecam · 7 months ago
Btw, a Tor relay can be relatively lightweight. I run one on a $5/mo VPS (which does many other things). You need 1 GiB of RAM, but a single basic CPU core largely suffices. My relay sends/receives ~150 GiB of traffic per day (~15 Mbits/s). It’s not an exit node, so no legal worries.

Here’s my torrc:

  SocksPort  0
  ExitRelay  0

  ORPort     NNNN
  DirPort    NNNN

  Nickname     X
  ContactInfo  X@X.com

  RelayBandwidthRate    80 megabits
  RelayBandwidthBurst  120 megabits

  MaxMemInQueues  384 megabytes

  AvoidDiskWrites  1
  HardwareAccel    1
  NoExec           1
  NumCPUs          1

Here’s my override config for systemd (Ubuntu 24.04):

  $ sudo systemctl edit tor@default
  [Service]
  Nice=15
  CPUAffinity=0
  CPUWeight=60
  StartupCPUWeight=6
  IOWeight=60
  TimerSlackNSec=100us

  MemoryMax=896M
  MemoryHigh=800M
  OOMScoreAdjust=1000

  LimitAS=2G
  LimitNPROC=512
  LimitNOFILE=10240

  PrivateDevices=true
  ProtectSystem=true
  ProtectHome=true
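
The traffic figure quoted above (~150 GiB per day, ~15 Mbits/s) can be sanity-checked with a few lines of arithmetic. This is only a unit conversion, assuming 1 GiB = 2^30 bytes, 1 Mbit = 10^6 bits, and traffic spread evenly over the day; the function names are made up.

```python
GIB = 2**30               # bytes per GiB
SECONDS_PER_DAY = 86_400


def gib_per_day_to_mbit_per_s(gib_per_day: float) -> float:
    """Average rate in Mbit/s for a given daily volume in GiB."""
    bits_per_day = gib_per_day * GIB * 8
    return bits_per_day / SECONDS_PER_DAY / 1e6


def gib_per_day_to_tib_per_month(gib_per_day: float, days: int = 30) -> float:
    """Monthly volume in TiB, useful when comparing against a VPS traffic quota."""
    return gib_per_day * days / 1024


if __name__ == "__main__":
    print(f"{gib_per_day_to_mbit_per_s(150):.1f} Mbit/s")
    print(f"{gib_per_day_to_tib_per_month(150):.2f} TiB per 30 days")
```

150 GiB per day works out to a little under 15 Mbit/s on average, which matches the figure in the comment, and to roughly 4.4 TiB over 30 days.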

ricardo81 · 7 months ago
I'd never used Tor, though had to scrape a bunch of things that required different IPs. I figured their endpoints were already tarred.

With the porn block in the UK though, the "New Private Window with Tor" in Brave is very convenient.

Maybe not for long, or maybe not. I guess websites don't need to comply beyond a certain point.

There are tons of "residential proxy" and similar services available, so IP as a source of truth doesn't seem to matter much in 2025. The recent Perplexity 'bot' topic is an example of that.

Basically if you want to access any resource on the web for a dollar a GB or so you can use millions of IPs.

SV_BubbleTime · 7 months ago
>With the porn block in the UK though, the "New Private Window with Tor" in Brave is very convenient.

As someone interested in seeing privacy secured into the future, I’ve been happy that governments are accelerating their censorship for this reason.

chii · 7 months ago
Tho trying to solve a social issue with technological solutions is just going to force the other side's hand more. Tor/vpn might work today, but there's no telling what new laws tomorrow will be enacted to ban vpns. And having such laws (that everyone breaks) is just a way for the gov't to selectively punish those they deem troublesome - a great chilling effect.

The way to fight these draconian laws is via democratic means - even if that is long and arduous. For example, enshrining privacy in constitutional guarantees.

freedomben · 7 months ago
Indeed, I've investigated some cyber attacks recently that came from residential IPs in California and NY, though investigation turned up the real origins as coming from India. It's pretty easy to pull off nowadays
deadbabe · 7 months ago
Any tutorial?
trod1234 · 7 months ago
The problem with most infrastructure is that there's a big gap in security where it centralizes, and it's transparent.

To understand how, you should review the Princeton Report's Raptor attack, and understand how it works (2015).

jmclnx · 7 months ago
I ran a bridge until recently, but the server died a heat death after I moved to another apartment :(

I have not yet had time to find a suitable replacement machine. But running a bridge is a cheap, safe, low-network-volume way for people to help out from home. I had it going to help people in 'bad' countries to get out to the rest of the world.

https://community.torproject.org/relay/setup/bridge/

WarOnPrivacy · 7 months ago
> I ran a bridge until recently

A lifetime ago, I ran bridges from RAM only distros. But early versions of the Dan list (1st in wide use) killed that.

DL didn't try hard to differentiate between bridge IPs and exit IPs. Server hosts just grabbed the first list they saw and blocked with it.

It was years before the notion of Exit != Bridge became understood but everyone had moved on. We're at the entropic 'No One Cares Anymore' phase now.

costco · 7 months ago
Were you running specifically a bridge or just a non exit relay? Bridges are generally unlisted and are somewhat expensive to mass scrape (the bridge distributors will require captcha or email or Telegram etc) so they are less likely to show up in those lists. Whereas all relays are listed in the consensus and can be trivially enumerated.
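
To show what "trivially enumerated" means in practice, here is a small sketch that reads router entries out of consensus text. It assumes the full (non-microdescriptor) consensus layout, where an `r` line carries nickname, identity, digest, publication date and time, IP, ORPort and DirPort, and the following `s` line carries flags. The sample entries are invented (documentation-range IPs, placeholder digests), and the class and function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Relay:
    nickname: str
    address: str
    or_port: int
    flags: set = field(default_factory=set)


def parse_consensus(text: str) -> list:
    """Collect relays from 'r' lines and attach flags from the following 's' line."""
    relays = []
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "r" and len(parts) >= 9:
            # r <nickname> <identity> <digest> <date> <time> <IP> <ORPort> <DirPort>
            relays.append(Relay(parts[1], parts[6], int(parts[7])))
        elif parts[0] == "s" and relays:
            relays[-1].flags = set(parts[1:])
    return relays


# Made-up fragment for illustration; not real relays.
SAMPLE_CONSENSUS = """\
r ExampleRelayA AAAAAAAAAAAAAAAAAAAAAAAAAAA BBBBBBBBBBBBBBBBBBBBBBBBBBB 2025-01-01 00:00:00 192.0.2.10 9001 0
s Fast Running Stable Valid
r ExampleExitB CCCCCCCCCCCCCCCCCCCCCCCCCCC DDDDDDDDDDDDDDDDDDDDDDDDDDD 2025-01-01 00:00:00 198.51.100.7 443 0
s Exit Fast Running Valid
"""

if __name__ == "__main__":
    for relay in parse_consensus(SAMPLE_CONSENSUS):
        kind = "exit" if "Exit" in relay.flags else "non-exit"
        print(relay.nickname, relay.address, relay.or_port, kind)
```

The same pass that lists every relay also shows which ones carry the `Exit` flag, so a blocklist that lumps non-exit relays in with exits is making a choice rather than facing a technical limit. Bridges do not appear in this document at all, which is why they are harder to scrape.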
crmd · 7 months ago
I assume when I’m using Tor that every packet is under the highest level of collection/analysis priority. I think maybe sometimes it’s better to blend into the crowd.
beeflet · 7 months ago
that short-term thinking is what makes it impossible to blend in the long run