This is simultaneously epic clickbait and a very accurate representation of reality that is very boring and not shocking at all. Congrats to whoever wrote the line.
>So I obviously assumed they're letting China scan my private repos.
To clarify, it's Microsoft/Github doing the scanning of private repos on behalf of the partners. They're just forwarding the tokens that match the partners' regexp.
Yeah I read the article and the comments on HN so I know what it's about now. I still think they (not HN) should change the title to include what secret scanning means.
Edit: how about dropping the corporatese and title it "github will now scan public repos for secret WeChat tokens"?
Devil's advocate: I read recently that GitHub is being used to circumvent censorship in China. Does this system of allowing them to provide regexes allow China to automatically obtain lists of users who are mentioning certain words or phrases? Or is that nonsense?
I think the name of the service is a bit ambiguous; they could've called it "Access Key Scanning" or even just "Secret*s* Scanning". Even capitalizing it would set it apart as a service instead of regular words in a sentence.
But this is an excellent next step where they build an integration with these partners where, as soon as a secret is scanned, they can notify tencent/AWS/other providers automatically to instantly invalidate those keys before they’re abused.
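For what it's worth, the provider's side of that integration could look roughly like the sketch below. Everything in it is hypothetical (the `TokenStore` class, the `TC:` token shape, the notice text); it only illustrates the "receive a reported token, revoke it, tell the owner" flow, not how Tencent or AWS actually implement it.

```python
from dataclasses import dataclass, field


@dataclass
class TokenStore:
    """Hypothetical in-memory stand-in for a provider's credential database."""
    owners: dict[str, str]                                   # token -> account owner
    revoked: set[str] = field(default_factory=set)           # tokens already invalidated
    notices: list[tuple[str, str]] = field(default_factory=list)  # (owner, message)

    def handle_report(self, token: str, found_at: str) -> bool:
        """Revoke a reported token and queue a notice; return True if it was live."""
        owner = self.owners.get(token)
        if owner is None or token in self.revoked:
            # Unknown string or already handled: nothing to revoke.
            return False
        self.revoked.add(token)
        self.notices.append(
            (owner, f"Your key was found at {found_at} and has been revoked.")
        )
        return True


store = TokenStore(owners={"TC:abc123": "alice"})
print(store.handle_report("TC:abc123", "https://example.com/repo/config.py"))
print(store.handle_report("TC:notreal", "https://example.com/other"))
```

Note that a reported string that isn't a real key simply falls through: the provider learns that some text shaped like one of its tokens appeared somewhere, and nothing more.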
Hmm, 2022 (2023 soon!) and people are jumping and screaming on their first and incorrect reaction to some headline (not you, but scroll down for a comment doing just that). God Bless The Internet! /s
To everyone portraying this as harmless and as WeChat just looking for security breaches: Tencent itself is the security breach. Not only can Chinese people not sign up without providing a phone number, just to get a SIM card they now take your government ID, a picture of your face and a fingerprint! Xi is making absolutely sure that every single internet user is IDed and has their conversations tracked on apps like WeChat. WhatsApp, Signal & co are banned.
These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked. It might not be a WeChat secret at all who knows? They're not a trustworthy partner, nothing should be shared with this company.
And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHub's help. Obviously GitHub is supporting their scanning efforts here.
> And to the folks saying it's public information and they already have it: That makes no sense, then they don't need GitHub's help.
GitHub has a global stream API for all public events,[1] but it is delayed by five minutes, precisely so that sensitive actions like revoking leaked tokens can be performed before the world sees them. That’s what the secret scanning program is about, and you would have known if you spent 1/3 of the time of your rant learning about it.
Edit: Additionally, for private repos, secret scanning is opt-in and only alerts owners.
Wait a second, requiring a government ID to get a SIM card is pretty standard practice in multiple countries. Also, when it comes to privacy, US-based companies should be the last ones to talk, as if China were the only bad guy infringing on people's right to privacy.
China is dangerous, but it's not the only dangerous thing in the room.
Also, your comment doesn't make sense. If you are committing your credentials to a public repo while dissenting against the government, you are doing it wrong.
Also, any publicly committed credentials are literally picked up by thousands of bots within minutes. It's not as if China couldn't scan for them without GitHub telling them it found something.
You may have misunderstood. There is no way to anonymously access Weixin from China unless you have hacked credentials. You need a phone number. Note that local Weixin and foreign WeChat are not the same. Last time my Mainland friend bought a SIM card, the vendor had a government app on his phone, snapped a picture of my friend's face, scanned the ID (身份证) and had him take a fingerprint with a reader he also had connected to his phone. All this data gets uploaded directly to the Chinese government.
There isn't a country in the world which does this. But the details are also not the main point, it's how extremely restricted and controlled simple access to information or forums of free expression is for people in China. Tencent has party officials working within the company. This isn't a regular business as Westerners might imagine it, it's an extended part of the CCP just like any other large corporation under Xi.
Again, people are saying it's no big deal but why would GitHub help them at all? It's not a good cause.
This is offtopic (as it has nothing to do with the linked blogpost) but it's even worse. At the tail end of 2019 I went to China for a few weeks, I created a WeChat account at home without any problems. As soon as I stepped into China it got locked and I needed someone with a WeChat account to verify me. They can only verify (I think?) 3 new accounts per year, and 6 accounts that got locked out for whatever reason. This is (from my perspective) even worse than requiring ID for SIM etc. It links people together and I'm sure it brings some repercussions to the people that verified you if you make trouble down the line.
It was fascinating to see: near-total domination of WeChat everywhere, and relatively hard onboarding for new accounts. This is the opposite of the West, where most services try to streamline onboarding as much as possible; I guess that becomes an anti-feature when you have a total monopoly and _everyone_ has a WeChat account. I think it's a very effective (and very dystopian) form of control.
P.S.: Signal worked without any problems for me, even on a Chinese SIM. (One "trick" to get around most of the GFW was buying a HK SIM in HK. It works across China and has far fewer blocks, but for various reasons I got a China SIM too.)
This is a service running on public repos, anyone can scrape this which is the problem. GitHub does the scanning and all that is forwarded is the "secret" matching their regex. Tencent then identifies the account owner and informs them about the public secret. That's all.
GitHub is available in China, why shouldn't they protect their Chinese users?
And the SIM card requirements have nothing to do with Tencent, have you tried getting a SIM in Germany? Impossible without government ID and an address. And there are a lot of services which you can't sign up for without German ID / address. As a foreigner I also can't easily open a bank account in the US.
Without taking away from your first paragraph at all, if any dissidents are publishing their access codes to GitHub repos, they are 1) doing it completely wrong and 2) are already screwed.
The threat here, in the worst case, is associating a GitHub ID with a WeChat ID.
> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.
This is GitHub scanning private repos and telling WeChat about them.
WeChat can already scan public repos.
They are not already screwed if they’re publishing something to a private repo, it might be the wrong way to do it, but it doesn’t mean they’re already screwed.
If you don’t trust GitHub’s private repo security then why are you using it in the first place?
Imagine if someone protested against Finnish–Russian cooperation on search-and-rescue operations near their border because the evil Russian government could be searching for political dissidents to imprison. That’s what your comment sounds like.
This is about preventing things like API keys from being published to code. That’s not a dissident use-case…
While Tencent and Wechat sound absolutely dystopian, the "you need a Government ID and a picture of your face" is often a requirement for creating a Facebook account or retaining your old one as well. Twitter also used to require a phone number to retain an active account; and Google frequently locks people out of old accounts unless they provide a phone.
Is this whataboutism? Possibly – but what I'd actually like to happen is US-based companies are charged company-hurting fines for mismanaging PII like this (Twitter, for example, is currently openly planning to sell user phone data [1] that they previously gathered for security purposes).
All this to say, we can't reasonably call out other dystopian companies if the ones we use everyday are doing the exact same thing. So we should call out secret scanning from Meta [2] and (if it ever happens) Twitter as well.
> These "leaked" secrets GitHub forwards might be dissidents getting access without being tracked.
"Leaked" here means "made public", i.e. "published such that literally anyone can use them", for example when burned into a commit of a public repo. Even for a dissident, publishing an API key or other credential where literally anyone can find it to use it, is almost assuredly a mistake. Because external scrapers can also find it there, such that the key will be inevitably picked up and fed into a botnet to abuse — at which point the ops staff at the service will notice the abuse and revoke the key, thus "burning" it as useful from the dissident's perspective.
If you store a secret on Github somewhere that only you and people you trust have access to, rather than everyone having access to it, then this is not considered a "leak", and so Github does not detect this as a "leaked secret." For example, commit data of private repos is not scanned for secrets (if it were, GitOps as a concept would be impossible!); nor is a repo's formal Actions Secrets store (part of a repo's configuration readable only by triggered Github Actions CI jobs).
Github's own secret scanning here is trying to catch the cases where a user has done something stupid by accident. Whether or not they reported secrets to third parties, they'd still be doing leaked-secret scanning of their own Github API keys, to ensure that people aren't accidentally trying to configure Github Actions by burning their Github Actions CI API key into the workflow itself. If they find such keys, they revoke them.
The point of Github's secret-scanning partner program is that, because Github is doing this leaked-secret scanning for their own purposes anyway, you (the partner) can sign up to be told when API keys of yours are accidentally made public as well.
> That makes no sense, then they don't need GitHub's help.
Ignoring for a moment that Github is a website, and so anyone can just crawl it—
Did you know? Github pushes the commit data of all public repos to BigQuery as a public research dataset: https://codelabs.developers.google.com/codelabs/bigquery-git.... Literally anyone can do their own "secret scanning" with a simple BigQuery query. It costs about $500 to run such a query, because the Github dataset is pretty large. It's not a price most SMEs would pay. But it's definitely a price attackers could be willing to pay. It's a lot cheaper than running your own web-spider infrastructure!
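To make "a simple BigQuery query" concrete, here is a rough sketch of what a service operator looking for their own customers' leaked keys might build. The table and column names are as I recall them from the public `github_repos` dataset, so check the schema before running anything; `build_leak_query` and the `MYSVC_` token format are made up for illustration. The function only builds the SQL string and does not submit it.

```python
def build_leak_query(token_pattern: str, limit: int = 100) -> str:
    """Return a BigQuery Standard SQL query listing files whose content matches token_pattern."""
    # The pattern is embedded in a raw SQL string literal (r'...'),
    # so a single quote inside it would terminate the literal early.
    if "'" in token_pattern:
        raise ValueError("pattern must not contain single quotes")
    return (
        "SELECT f.repo_name, f.path\n"
        "FROM `bigquery-public-data.github_repos.contents` AS c\n"
        "JOIN `bigquery-public-data.github_repos.files` AS f ON f.id = c.id\n"
        f"WHERE NOT c.binary AND REGEXP_CONTAINS(c.content, r'{token_pattern}')\n"
        f"LIMIT {int(limit)}"
    )


# Hypothetical token shape for an imaginary service.
print(build_leak_query(r"MYSVC_[a-z0-9]{20}", limit=10))
```

A query like this has to read the file contents to match against them, which is where the cost comes from: billing is by bytes scanned, not by rows returned, so the `LIMIT` doesn't make it cheaper.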
The difference with Github's own secret scanning is that it happens synchronously, on push of commits, whereas the ETL of commit data to BigQuery et al. happens asynchronously, some time after commits happen. Tencent, like every other secret-scanning partner, depends on Github to stay ahead of any third-party attackers trying to scrape leaked credentials for use in botnets et al.
Also, FYI, you yourself can sign up to be a Github secret-scanning partner. You just need 1. a regex that uniquely identifies your secrets, so that Github can recognize them on push, and 2. a webhook URL to report them to. (https://docs.github.com/en/developers/overview/secret-scanni...)
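The "regex plus webhook URL" arrangement described above can be sketched like this. It is a toy model under stated assumptions, not Github's implementation: the `Partner` registry, the `EXSVC_` token format, the `.invalid` webhook URL and the report shape are all invented for illustration, and nothing is actually sent over the network.

```python
import re
from typing import NamedTuple


class Partner(NamedTuple):
    name: str
    pattern: re.Pattern
    webhook_url: str


class Report(NamedTuple):
    webhook_url: str
    token: str
    location: str


# Hypothetical registry; the token format and URL are made up.
PARTNERS = [
    Partner(
        "examplesvc",
        re.compile(r"\bEXSVC_[a-z0-9]{20}\b"),
        "https://examplesvc.invalid/leaks",
    ),
]


def scan_push(text: str, location: str, partners=PARTNERS) -> list[Report]:
    """Return one report per token-shaped string found in the pushed text."""
    reports = []
    for partner in partners:
        for match in partner.pattern.finditer(text):
            # Only the matching substring is reported, not the surrounding file.
            reports.append(Report(partner.webhook_url, match.group(0), location))
    return reports


pushed = 'API_KEY = "EXSVC_' + "a1" * 10 + '"\npassword = "hunter2"\n'
for report in scan_push(pushed, "octocat/demo:config.py"):
    print(report)
```

In this model the partner never sees `password = "hunter2"` or anything else in the file. The scanner does the matching, and only strings that fit the partner's own pattern leave the building.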
And by the way, this isn't a hypothetical nice-to-have. I run an API SaaS — and not one that's even very large, in relative terms. But my own customers' accidentally-leaked secrets have been scraped from their Github repos and used by botnets already! Signing up as a Github secret-scanning partner is on my to-do list.
Here’s what I just copied from the blog post without modification:
> We have partnered with Tencent WeChat to scan for their tokens and help secure our mutual users on all public repositories and private repositories with GitHub Advanced Security.
It’s not just public repos, it’s private repos too.
However, this is already a well established and useful thing. When you publish your AWS (for example) secrets to your public repo, it will scan it and stop it leaking before damage can be done. This is just the same for another service.
It would be nice of Github if they could publish a transparency repo with all the partners and all the regexes along with this initiative. I see a lot of people in this thread worried that "China gets their data" and this transparency repo could alleviate some of that.
Why do people worry about China so much? There is barely any cooperation between Chinese intelligence and the rest of the world.
If I was forced to pick one government to share my secrets with, it would be the Chinese, because there's nothing they can do about it. My own government and its allies are infinitely more dangerous to me than such a foreign one.
Because if you want to influence the voting patterns of a population, knowing as much as you can is useful. Search history + TikTok + FB would give you unbeatable datasets that you could use for the lifetime of the person. Take a dataset now, add a decade of AI progress and I’m pretty sure a nefarious actor would be leading many people around by the nose. Not all, but 5% would move many elections.
They wouldn’t even need to learn all that much about you as an individual. Just enough to match you with a cluster from their own population that they have infinite data on.
> If I was forced to pick one government to share my secrets with, it would be the Chinese, because there’s nothing they can do about it.
What makes you so sure about that?
I worry about China because there’s no internal checks to prevent them from doing anything.
Western governments and allies have a long culture of court systems and thinking about balancing constituent needs. That is eroding and becoming more dangerous to the extent western leaders are envious of dictatorial powers and trying to emulate Chinese totalitarianism, but there is a lot of institutional and cultural bulwark against it.
Any powerful totalitarian country should worry people. People underestimate the level of covert aggression in all facets of foreign involvement in regimes with no internal accountability.
I agree. For most people here, it is their own government (or allied governments) which are the ones to be most cautious about. They're the ones most likely to ruin your life if they don't agree with what you're up to - elected or not.
You may not have noticed this, but China is seeking to extend its influence, by whatever means required, including deadly force. There is a picture you might want to look at: search for "Tank Man". That is why people worry about China so much.
I don't think they would release the regex used to validate the API keys since it would help people automate scanning for API keys of all supported providers on public repos on any other site using the regex given by the provider itself.
For public repositories only though. For private repos it's optional, and when enabled the repo admins get an alert to handle it themselves without it going to the vendor.
You can already do the former by using GitHub Events API. This simply helps with the accidental leak of tokens into the public, so Tencent / Repo owner can revoke it before it gets abused.
https://docs.github.com/en/rest/activity/events?apiVersion=2...
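As a rough illustration of what watching that feed involves, here is a minimal sketch using only the standard library. It assumes the public events endpoint returns a JSON list of objects with a `type` field and a `repo.name` field; `fetch_public_events` and `pushed_repos` are names invented here, and the sample data is made up so the example runs without network access.

```python
import json
import urllib.request

EVENTS_URL = "https://api.github.com/events"


def fetch_public_events(url: str = EVENTS_URL) -> list[dict]:
    """Fetch one page of the public events feed (unauthenticated, so rate limited)."""
    request = urllib.request.Request(
        url,
        headers={"Accept": "application/vnd.github+json", "User-Agent": "events-sketch"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def pushed_repos(events: list[dict]) -> list[str]:
    """Return names of repositories that received a push, in feed order, without duplicates."""
    seen = []
    for event in events:
        if event.get("type") == "PushEvent":
            name = event.get("repo", {}).get("name")
            if name and name not in seen:
                seen.append(name)
    return seen


# Made-up sample in the assumed shape; call fetch_public_events() for live data.
sample = [
    {"type": "PushEvent", "repo": {"name": "octocat/demo"}},
    {"type": "WatchEvent", "repo": {"name": "octocat/other"}},
    {"type": "PushEvent", "repo": {"name": "octocat/demo"}},
]
print(pushed_repos(sample))
```

Anyone polling this still has to fetch the pushed commits and scan them, and all of that happens after the feed's delay, which is the head start the partner program gives the provider.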
Poor functionary creates political incident with humble template...sounds like a Gogol short story. "but it worked great for redirect.pizza!"
Btw I assume this recent thread was about the same feature:
Secret scanning is now available for free on public repositories - https://news.ycombinator.com/item?id=34007637 - Dec 2022 (70 comments)
Even though I'm a paid github customer, I had no idea they had a program called "secret scanning" and that it's actually beneficial.
So I obviously assumed they're letting China scan my private repos.
They really need to work on wording.
Fyi... this feature was also previously mentioned in the news for public repos: https://techcrunch.com/2022/12/15/github-brings-free-secret-...
Though perhaps that’s just my own bias on the subtle differences in the meanings of those words.
It's not scanning that they're doing in secret. "Credential scanning" removes the ambiguity.
That’s what’s novel here.
[1] https://docs.github.com/en/rest/activity/events?apiVersion=2...
----------------------------------------
[1] https://www.businessinsider.com/twitter-plans-to-force-users...
[2] https://developers.facebook.com/blog/post/2021/11/09/meta-jo...
It lets WeChat revoke tokens that GitHub finds in public repositories.
“GitHub will forward access tokens found in public repositories to Tencent WeChat, who will notify affected users.”
Are you talking about the China that bought huge areas in ports around the world? The same one that has secret police stations as well?
Unless you live or visit there. Weren't there reports of China having concentration camps?
What makes you think this?
Yep totally harmless.
> China operating over 100 police stations across the world with the help of some host nations, report claims
[0]: https://docs.github.com/en/enterprise-cloud@latest/code-secu...
I assume that the regex is `TC:[a-z0-9]{20}` or something uninteresting like that.
So any string (which Github deems an access token) is forwarded to Tencent?
Or will Tencent share all their current access tokens with github?