Posted by u/miki123211 16 days ago
Tell HN: YC companies scrape GitHub activity, send spam emails to users
Hi HN,

I recently noticed that a YC company (Run Anywhere, W26) sent me the following email:

From: Aditya <aditya@buildrunanywhere.org>

Subject: Mikołaj, think you'd like this

[snip]

Hi Mikołaj,

I found your GitHub and thought you might like what we're building.

[snip]

I have also received a deluge of similar emails from another AI company, Voice.AI (doesn't seem to be YC affiliated). These emails indicate that those companies scrape people's GitHub activity, and if they notice users contributing to repos in their field of business, send marketing emails to those users without their consent. My guess is that they use commit metadata for this purpose. The recipients include people covered by the GDPR (AKA me).

I've sent complaints to both organizations, no response so far.

I have just contacted both GitHub and YC Ethics on this issue. I'll update here if I get a response.

martinwoodward · 16 days ago
Martin from GitHub here. This type of behaviour is explicitly against the GitHub terms of service. When we catch the accounts doing this we can (and do) take action against those accounts, including banning them. It's a game of whack-a-mole for sure, and it's not just start-ups that take part in this sketchy behaviour to be honest. I've seen plenty of examples in my time across the board.

The fundamental nature of Git makes this pretty easy for folks to scrape data from open source repositories. It's against our terms of service and those folks might want to talk with some lawyers about doing it - but as every Git commit contains your name and email address in the commit data it's not technically difficult even if it is unethical.

From the early days we've added features to help users anonymise their email addresses for commits posted to GitHub. Basically, you configure your local Git client to use your 'no-reply' email address in commits and that still links back to your GitHub account when you push: https://docs.github.com/en/account-and-profile/reference/ema...

I think that's still probably the best route. We want to keep open source data as open as possible, so I don't think locking down APIs etc is the right route. We do throttle API requests and scraping traffic, but then again there have been plenty of posts here over the years from people annoyed at hitting those limits so it's definitely a balancing act. Love to know what folks here think though.

david_allison · 15 days ago
> when we catch the accounts doing this we can (and do) take action against those accounts including banning the accounts.

This isn't my experience. I requested that you look into a spammer in July 2025, you ignored my reply and the account is still active.

----

Thank you so much for the report. We're sorry to hear you're receiving unwanted emails, but it's always a possibility when your public contact information is listed on the web. You can keep your email address private if you wish by following the steps here:

Setting your commit email address

We do expect our users to comply with our Terms of Service, which prohibit using information from GitHub (whether scraped, collected through our API, or obtained otherwise) for spamming purposes. I'm happy to look into it further to see if we can contact the reported user and let them know that this type of activity is not allowed.

Please let us know if you have any other questions or concerns.

----

My reply which was ignored:

----

I understand it will happen from time to time. I'd rather be contactable (I've received legitimate emails today because my email is on my profile).

Please take further action. My email is public with the expectation that the ToS will be enforced. If GitHub isn't discouraging spammers then it makes it much harder to justify being contactable.

All the best, David

gettingoverit · 15 days ago
I reported spammers ~5 times to GH, and every time the account went down in a couple of hours. Obviously mileage may vary, but I don't want the whole HN to think this process is completely broken.

Please keep reporting spammers, usually it works.

tom_m · 15 days ago
It's impossible for them to stop it if you list your email on there. They could make it harder of course. But if you put your email out there for a human to find, then a script or bot can also find it.

And yes of course they can also stop a specific spammer. But that spammer may pick up another account and email.

Aachen · 15 days ago
>> it's always a possibility when your public contact information is listed on the web

Sounds correct to me

> Please take further action. My email is public with the expectation that the ToS will be enforced.

What magic wand are you expecting they wave that distinguishes people who need your email address for legitimate from those who need it for illicit purposes? Why wouldn't we apply the same to the entire population and lock up criminals before they've committed crimes?

What you're asking is entirely impossible short of mandatory mind reading

Rapzid · 15 days ago
Yeah they likely rarely if ever "look into" it and certainly nobody has ever needed a lawyer over this.

As recently as a year or so ago, at least, you could list repo stargazers through their GraphQL API and get a TON of emails off that depending on the user settings.

retlehs · 16 days ago
I’ve made over five reports for this exact spam scenario, and never once have y’all acted on them. I have a hard time believing you ban spam accounts that clearly violate your ToS.

I even wrote about a specific example of a YC company spamming me from my GitHub email at https://benword.com/dont-tolerate-unsolicited-spam

eli · 16 days ago
How would you know whether the account that did the scraping was banned?
Aachen · 15 days ago
How did you connect joe@legitbusiness.com, where spam usually originates from for me (hacked email accounts), to a specific github user account that was used to scrape the data, which microsoft can choose to ban? And that's assuming they believe you're being truthful and not simply angry with the user whom you're reporting
koito17 · 16 days ago
I don't have any specific suggestions, but I do want to give thanks for implementing functionality to block pushes if the email field is *not* using an anonymized mail address.

It's one thing to offer anonymous e-mail addresses, but it's also awesome that GitHub can help prevent mistakes that would otherwise leak a user's e-mail address. I am not sure how many people try to be privacy conscious on GitHub, but I assume most users don't, so it's nice seeing this little feature exist.

dathinab · 15 days ago
It gets more complicated when commit signing, the widely broken web of trust (for the signing key) and similar are involved.

And not all devs want or need anonymity on github.

In general, just because information is publicly accessible in some form doesn't make it okay or legal to abuse it (accessible doesn't mean any form of usage rights are transferred to you, whether it's in the context of GDPR or in the context of copyright).

ayhanfuat · 16 days ago
I am also getting constant spam because apparently they can see who starred a repo (i.e. I see you starred repo x and we are doing something similar). I am not starring anything anymore.
skwashd · 16 days ago
I know it is against the ToS. I've reported multiple organisations doing this. Last time I reported one, support closed the ticket saying the activity is off platform so they can't do anything.
danesparza · 16 days ago
I didn't realize this was against the Github TOS - I just thought it was par for the course for recruiters nowadays. This is good to know!

How do I report that person, though? Your support page about reporting abuse assumes I know the person's Github account: https://docs.github.com/en/communities/maintaining-your-safe...

blobbers · 16 days ago
Scrape once, spam forever.

I think it's pretty clear you need to use an anonymization scheme in the way commits are handled so that it links back to your github account and the email addresses are kept private.

Privacy centric companies like Apple do this for users offering hashed emails, on a per login basis.

I'm sure this would not work in a world of scraping, but having that kind of ability to figure out bad actors would be nice. You could require authenticated users for certain kinds of requests, and block user information from non-authenticated requests.

david_allison · 15 days ago
They already do[0]

    62114487+david-allison@users.noreply.github.com
this includes a unique ID which survives account renames, and the name of the GitHub account at the time.

[0] https://docs.github.com/en/account-and-profile/reference/ema...
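For anyone who wants to check what their own address should look like, here is a minimal Python sketch of the ID-based format shown above. The function names `build_noreply` and `parse_noreply` are made up for illustration, and the pattern only covers the `ID+username` form quoted in this comment, not any older variant.

```python
import re

# Matches the ID-based form quoted above: "<numeric id>+<username>@users.noreply.github.com".
# Assumption: usernames consist of letters, digits and hyphens.
NOREPLY_RE = re.compile(
    r"^(?P<id>\d+)\+(?P<login>[A-Za-z0-9-]+)@users\.noreply\.github\.com$"
)


def build_noreply(user_id: int, login: str) -> str:
    """Build an ID-based no-reply address from a numeric ID and a username."""
    return f"{user_id}+{login}@users.noreply.github.com"


def parse_noreply(address: str):
    """Return (user_id, login) for an ID-based no-reply address, else None."""
    match = NOREPLY_RE.match(address.strip())
    if match is None:
        return None
    return int(match.group("id")), match.group("login")
```

The resulting string is what you would pass to `git config --global user.email` on your own machine. Since the numeric ID is the part that survives renames, `parse_noreply` returns it separately from the username.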

realityloop · 15 days ago
I've received several of these types of messages, including the Voice.ai one mentioned in comments, and the following today:

Tonho<tonho@tonho.wtf>

Hey, I found your GitHub profile and thought you might find this useful.

I've been building Omniget, a desktop downloader that works with YouTube, Telegram, Udemy, Hotmart and 1000+ other sites. It's open source and built with Rust and Tauri.

The part I'm most proud of: you don't even need to open the app. Just press a hotkey and it grabs whatever video you're watching.

I've been working on this for a while now, even got an artist to design a mascot. I'm shaping the app based on feedback from people who actually use it, so if you have any thoughts I'd love to hear them.

Here's the repo: https://github.com/tonhowtf/omniget

Thanks for your time!

Tonho

AznHisoka · 16 days ago
Maybe I am missing something, but can’t you simply not show the email address in a git commit? (Sincere question, not saying this is trivial. I am dumb and like to ask dumb questions even if it might be embarrassing)

If someone wants to message someone, it goes through github notifications or github emails them

Also, banning an account doesn't seem like a heavy punishment, given they can simply move to GitLab, Bitbucket etc

easton · 16 days ago
Git commits have an email address as a required field[0], although some people put something bogus in there. And then it's in the data provided when you clone the repo onto your machine even if you aren't using the GitHub APIs.

To his point, you can set that to the no-reply email address GitHub gives you if you don't want mail but do want the commit to be linked to your GitHub account.

[0]: https://git-scm.com/docs/git-commit#_commit_information
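Because the address travels with every clone, it can be worth checking which addresses your own history already exposes. A rough Python sketch, assuming you feed it one author email per line (for example the output of `git log --format=%ae` run in your own repository); `exposed_emails` is a hypothetical helper name, not part of git or GitHub.

```python
def exposed_emails(author_emails):
    """Return distinct addresses that are not GitHub no-reply addresses.

    Addresses are lowercased and returned in first-seen order.
    """
    seen = []
    for raw in author_emails:
        email = raw.strip().lower()
        if not email:
            continue  # skip blank lines in the input
        if email.endswith("@users.noreply.github.com"):
            continue  # already anonymised
        if email not in seen:
            seen.append(email)
    return seen
```

Anything this returns is already public in the pushed history. Switching to the no-reply address only protects future commits.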

EdNutting · 16 days ago
That would be a fundamental change to how Git works, not just GitHub. Even if the web UI didn't show it, a simple `git log` would reveal it.

You can mask your email address in git commits but a lot of open source projects won't accept that. And some pseudo-open-source ones insist on sending you an email to authenticate before they'll give you access to the GitHub repo (looking at you Unreal Engine!)

So, no, I don't think they could simply "not show the email address".

miki123211 · 16 days ago
Git commits are identified by a hash of their entire contents[1]. The way hashes work, if you change even one bit, the hash becomes completely different. Every commit contains the email address of the committer and the hash of the parent commit. If the email address in even one commit is changed or removed, that changes its hash, which in turn requires you to update its children, changing their hashes etc. So, updating a commit from n years ago requires you to update all commits that have been made since. By default, git will refuse to pull from such an updated repository, as commits are considered immutable once pushed.

[1] In practice, it's a bit more complicated. Merkle trees are involved, so it's hashes of hashes of hashes instead of hashing a multi-gigabyte blob on each commit, but that's a performance optimization that doesn't affect semantics much.
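The cascade is easy to see in a minimal Python sketch, assuming a SHA-1 repository. `commit_id` and `history` are hypothetical helper names, the author name, addresses and timestamp are made up, and both commits reuse git's empty tree to keep things short.

```python
import hashlib

# Well-known hash of git's empty tree object, used here so no real files are needed.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree, parent, name, email, message, timestamp=1700000000):
    """Hash a commit object the way git does for SHA-1 repositories."""
    who = f"{name} <{email}> {timestamp} +0000"
    lines = [f"tree {tree}"]
    if parent is not None:
        lines.append(f"parent {parent}")
    lines += [f"author {who}", f"committer {who}", "", message, ""]
    body = "\n".join(lines).encode()
    # git prefixes the content with "commit <byte length>\0" before hashing.
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def history(root_email):
    """Build a two-commit chain; only the root commit's email varies."""
    root = commit_id(EMPTY_TREE, None, "Example Dev", root_email, "first commit")
    child = commit_id(EMPTY_TREE, root, "Example Dev", "dev@example.com", "second commit")
    return root, child


original = history("dev@example.com")
rewritten = history("1+example@users.noreply.github.com")
```

Comparing `original` and `rewritten` shows both IDs differ, even though the second commit's own fields are identical in the two histories. Its parent line changed, so its hash changed too.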

dent9 · 16 days ago
You should be using the email address "username@users.noreply.github.com" or similar

There's never been an obligation to use a real email address for git

just6979 · 15 days ago
What section of the ToS prohibits this? In other words, what is the thing that is being done that is against the ToS? Looking up the creator of a repo, or the contributors of the repo?

I did a quick scan of the ToS and all I could find was D8, which states that automated access (scraping) used for "AI" applies a reciprocal license that prevents the scraper from restricting GitHub's access to the data (the whole model? the weights?) resulting from the scraping.

This makes it sound like any model trained on GitHub content cannot be commercialized, because charging for access to the output would be a "technical or other limit"... So you're obviously not really enforcing this, otherwise MS would be suing every big commercial model out there!

wrs · 15 days ago
It seems like a safe assumption that the big commercial models will have negotiated their own private GitHub terms of service, especially considering their many-digit annual contracts with Azure.
shawmakesmagic · 15 days ago
FYI I get about 5 of these a week. It is pervasive. If someone wants to scrape my email that's one thing, but a number of recruiters write things like "I saw your repo <some ancient repo of mine> and I think you'd be a great fit for our new position in AI agents...", so they are scraping both my e-mail and all the metadata to personalize their pitch to me (poorly).
ericol · 16 days ago
I've had more than a few instances of this over the past 2 years, and my reply is exactly the above.

"What you are doing is against Github's TOS"

nickphx · 15 days ago
How about improving the processing of abuse reports for repos hosting windows malware that is actively being advertised to potential victims? https://github.com/preconfigured/dl/blob/main/ms-update32.ex...
TheSaifurRahman · 16 days ago
Are no-reply emails associated with the accounts if the username is changed? That's one reason why I switched back to my personal email.
martinwoodward · 15 days ago
Since 2017 they are yes.
Foxboron · 15 days ago
I have reported several spam emails to Github and from what I can tell none has been acted upon.
dent9 · 16 days ago
Amazon did this to me. Their recruiters started hounding me at an email address that I only ever used to sign git commits on some repos used on GitHub. When I asked them how they got my email address they said "it was in [our] database"
trympet · 16 days ago
Nice, thank you Martin. How do you punish the fraudsters? Do you send them to prison over CFAA violation terms of service?
martinwoodward · 16 days ago
I kinda wish I had that much power. There would certainly be fewer people in the world listening to their phones without headphones..

Usually starts with contacting them over email reminding them of the terms of service and warning them to stop. Then their account might get deactivated and they need to write and promise to not be naughty again. If they ignore that then the account gets removed.

There are a bunch of automated checks that are running all the time as well and will take automated action that then gets reviewed later by humans. A lot of the time the process is fast-tracked.

The off-platform 'let's scrape a bunch of data and then spam nice people' is the hardest to police. Linking those mails to an offending GitHub account is hard and very manual. Also, anyone can send emails saying they are someone they are not, so anyone can deny they sent the mail, and they'll usually blame a rogue agency they were working with etc.

I probably shouldn't say it, but the public shame that comes from being mentioned on social, on Hacker News etc. stops people who want to be treated as legitimate from doing that sort of thing and helps educate the wider community around what is and isn't acceptable behaviour - that is why it's good to see this thread and see the issue getting attention.

nerdsniper · 16 days ago
> CFAA violation terms of service

This would be a gross miscarriage of justice and bringing successful action under this theory would do widespread harm by expanding the definition of the CFAA.

Just because a company can take some nuclear action, doesn't mean they should.

skeptic_ai · 16 days ago
Will send a strong email: Don’t do bad things.
miki123211 · 16 days ago
I've raised this as ticket ID 4114793, just in case.
blibble · 15 days ago
> it's not technically difficult even if it is unethical.

kettle, pot, black?

I received the following official spam last week from GitHub:

> Build AI agents with the new GitHub Copilot SDK

despite never granting consent for marketing material

(and yes, there's a GDPR complaint now working its way through the national regulator)

moomoo11 · 16 days ago
Ban them. Honestly I get the same and it is beyond frustrating.

I will pay more for GitHub if you go hard on these mfs.

observationist · 16 days ago
Hey, Martin - https://github.com/lucidrains

Mind fixing lucidrains account? Something happened without notice or recourse. He's one of, if not the most well known open source AI researchers on the planet, with implementations and explanations of papers and ideas that are wonderful. If you could bring some sanity to that situation and take it out of whatever kafkaesque account purgatory it fell into, you'd be doing the work of angels.

Thanks!

davnn · 16 days ago
What was happening with this account? I was often seeing popular but empty (only title of the paper and maybe a short readme) repositories that were created directly after a paper was published?
nextaccountic · 15 days ago
Is this mirrored on gitlab or somewhere else? Nobody should trust Github to store all their data
keiferski · 16 days ago
I've spent a lot of my career marketing to developers, and spamming their GitHub account might be one of the top 1 or 2 worst marketing tactics you can use.

Cold emailing rarely works by itself. Cold emailing developers via emails you pulled from their GitHub accounts? At that point, you're actively harming your brand, and may as well just send them spam diet pill ads.

RandallBrown · 15 days ago
If someone took the time to look through my GitHub contributions then pitched me with a job relevant to that work I would absolutely consider them. That's exactly the kind of recruiter I would like to work with.

If it's obviously just a bot scraping emails and sending generic job requests, that's very different.

devmor · 15 days ago
> If it's obviously just a bot scraping emails and sending generic job requests, that's very different.

It's not even that nice. They scrape emails and send cold calls to try to get you to purchase their services.

jamesfinlayson · 15 days ago
Yeah this - I got one of these emails from someone sniffing around my GitHub not that long ago and it wasn't immediately obvious that it was a scammy recruiter, so I responded to sound out if they were actually interested in one of my projects. Got the same generic response about let's work together on something so I didn't respond.
genxy · 15 days ago
Find everyone who starred this repo and did a PR against these 10 repos is within reach of all marketers now. I just told them how.
keiferski · 15 days ago
Yeah I mean as a marketing tactic to sell your product. An employer / recruiter offering you work this way is different.
polishdude20 · 15 days ago
Wait why? That seems like the high effort and high specificity thing that I'd love to get.

You searched for people who do what you need to have done, found me, looked at what I've worked on and determined I'd be a good fit and you reached out? That's the number one way to get me to want to work for you.

woah · 15 days ago
> You searched for people who do what you need to have done, found me, looked at what I've worked on and determined I'd be a good fit and you reached out? That's the number one way to get me to want to work for you.

No, their email templating tool finds an old throwaway repo you did 6 years ago, templates its name into a form email, and invites you to join a cattle call to be whiteboarded along with the rest of the shmucks

rapfaria · 15 days ago
"Work for you"? They ain't hiring my friend, they are spamming their product to your inbox, not sending a career opportunity
an0malous · 15 days ago
Ever wonder why YC has the "Describe a time you most successfully hacked some system to your advantage" question? It's because they select for founders that are willing to take advantage of legal gray areas. Airbnb repeatedly violated Craigslist terms of service and called it "growth hacking." Reddit stole content from Digg and faked users. OpenAI trains their models on copyrighted content.
armchairhacker · 16 days ago
I remember this being discussed a while ago

https://news.ycombinator.com/item?id=9332418 (11 years ago)

https://news.ycombinator.com/item?id=20660624 (7 years ago)

https://news.ycombinator.com/item?id=27855152 (5 years ago)

https://news.ycombinator.com/item?id=30900237 (4 years ago)

Seems it’s a recurring issue

unfunco · 16 days ago
I also had unsolicited spam from Vincent Jiang of Aden, another YC company.

    Hi Daniel,

    I just came across your profile on social media and wondered if you'd be interested in joining our Discord community for AI agent development. Currently, we see that agents break, loop, get lost, hallucinate, and cost a fortune, and therefore built a space where developers can share challenges and insights.

unfunco · 16 days ago
…and more from Backdrop.

    Hi Daniel, I found your GitHub profile while searching for anthropic projects, and got your email from your profile.

    I'm part of an online program for builders called Backdrop Build, and I think that program would be a great fit given what you are building. We have a track for builders in AI like you, it's fully online/remote and costs nothing to participate. It also works if you have a day job, it's light on time and perfect for side projects!
And then another after I marked the first one as spam and ignored it.

    Checking in one last time to see if you have any questions about the program or the application. If it's not for you, all good - just ignore the email because I won't be pinging you again :)

   Joey from Backdrop
Both companies have guaranteed that I won't use their services nor procure them for any organisation I work for.

agmater · 16 days ago
Hey it's Joey checking in again. We noticed you mentioned our company, let me know if you have any questions about our (free!) program. I'll go ahead and email you some more info, just in case.
foldr · 16 days ago
I had a similar one from that guy asking me to make open source PRs to some repo of theirs for, err, $25-50/hour. I replied explaining that senior software engineers in the UK aren’t quite as desperately poor as that, and got a canned response saying that they were looking forward to reviewing my PRs :D
shunia_huang · 15 days ago
Blows my mind that you guys are so expensive lol.
anezjonathan · 8 days ago
Hi git,

I just came across your profile on social media and wondered if you'd be interested in joining our Discord community for AI agent development. Currently, we see that agents break, loop, get lost, hallucinate, and cost a fortune, and therefore built a space where developers can share challenges and insights.

So far, more than 8,000 members are sharing technical insights, agent templates, and tooling on a daily basis. We're also open-sourcing new toolings based on everyone's feedback.

Let me know if you are interested in joining; you can find the Discord link invite on our site (Adenhq).

Best, Bryan Zhang

--

Hive - Framework for autonomous, adaptive agent Bryan Zhang Co-founder and CTO If you're not interested, please feel free to ignore this message; no further contact will be made.

cyann · 16 days ago
Got this spam today on my GitHub address, YC affiliated:

From: henry@joincactuscompute.com

Hey,

I hope all is well with you, just reaching out as you seem to be interested in on-device speech models.

Cactus is a low-latency AI engine for consumer devices like phones, Macs, wearables, Raspberry Pis, etc.

We support transcription models like Whisper & Parakeet, benchmarks available in the attached GitHub repo.

GitHub: https://github.com/cactus-compute/cactus

We are keen to get your feedback, and star if feeling generous.

Thanks a million

ignoramous · 16 days ago
> star if feeling generous ... Thanks a million

A 419 scam?

mattpal21 · 15 days ago
At least they didn't ask for stars lol, but great to see how fast they're iterating!
bakugo · 16 days ago
This sounded familiar, so I checked my inbox and I did indeed receive a similar email from sanchitmonga@runanywheresdk.com earlier this month:

> I came across your GitHub profile and thought you might be interested in what my team and I are building. We're developing an open source SDK that runs LLMs directly on-device.

What's even more interesting is that both buildrunanywhere.org and runanywheresdk.com show a stock hostinger parking page when accessed in a browser. Something tells me they're intentionally registering these "alternate" domains specifically for spam, to avoid tanking the email reputation of their main runanywhere.ai domain.

I guess I shouldn't be surprised given YC is going all in on AI and most AI companies are no better than the crypto scammers of yesteryear, but still.

Imustaskforhelp · 16 days ago
I observed the same thing and it was only when you told me the main domain that I found their website.

> Something tells me they're intentionally registering these "alternate" domains specifically for spam, to avoid tanking the email reputation of their main runanywhere.ai domain

This is a really bad look on them.

https://www.whatsmydns.net/domain-age?q=buildrunanywhere.org and https://www.whatsmydns.net/domain-age?q=runanywheresdk.com

Both of these domains were registered only 36 days ago

Their main domain has been around for 6 months (216 days) though: https://www.whatsmydns.net/domain-age?q=runanywhere.ai

(I also couldn't see any post created by them on YC checking algolia from their website fwiw)

Looking at the star history of their product, I noticed something interesting[0]: it was almost horizontal between December and February until it got vertical all of a sudden.

[0]:https://www.star-history.com/#runanywhere.ai/runanywhere.ai&...

I looked through their LinkedIn and found this website owned by them as well, https://www.openclawpi.com/, which also uses the YC brand. (registered 26 days ago)

This website looks fairly AI generated to me too. The original website has some bugs as well, and given the similarities between the two websites' UI/UX, I am now much less sure whether it was generated by AI or not.

elwebmaster · 16 days ago
Just got a SPAM email from a Github scraper while reading this thread:

From: james@techglobal.website Quick note – your GitHub profile Hi X,

I came across your profile on GitHub. Given you're based in the US, I thought it might be relevant to reach out.

Profile:

I run a technical team (full-stack, cloud, DevOps) that delivers for clients. We're looking to work with an engineer based in the US on client-facing coordination—discovery, requirements, alignment—while we handle delivery. If that might be relevant, I'd be glad to set up a short call.

Regards, James

If I had to guess, "James" is a North Korean looking to scam US clients, based on my experience with shady actors.

max__dev · 16 days ago
Checked my spam after seeing this thread and found the same sender/email. Subject and signature are slightly changed.

From: james@techglobal.website Brief note – Following up on your GitHub work

Hi ,

I came across your profile on GitHub. Given you're based in the US, I thought it might be relevant to reach out.

Profile:

I run a technical team (full-stack, cloud, DevOps) that delivers for clients. We're looking to work with an engineer based in the US on client-facing coordination—discovery, requirements, alignment—while we handle delivery. If that might be relevant, I'd be glad to set up a short call.

Best, James

vintagedave · 15 days ago
I'm curious, what leads you to North Korean from that email? Is it that there's an anonymous team, which has a US "front"?
elwebmaster · 15 days ago
Yes, having a US "front" is how North Koreans pass the identity verifications at US companies looking for remote workers. I have personally spoken with numerous such individuals. Think about it, if you were a legitimate organization attempting to gain US presence would your first action be to SPAM individuals on GitHub or to register a business and submit a job post on LinkedIn?