https://home.engineering.iastate.edu/~julied/classes/ee524/L...
That said, if you're not using it, it defeats the purpose. And the more you use it, the higher the likelihood you'll be detected down the line. Compare to SolarWinds.
https://www.ussc.gov/sites/default/files/pdf/research-and-pu...
In 2022, murderers sentenced in the Federal Southern District of New York (N = 58) received a median sentence of 231 months, which comes out to 19 years and change.
No crime of any type had a mean or median sentence higher than 25 years; most were far less than half that.
So it should be really hard to argue that this is a "light" sentence. If anything, it's excessive if you consider the nature of the crime relative to the nature of murder or kidnapping.
The formal specification for something like Redis is likely much more akin to your car bridge spec. And to continue your analogy, I imagine the specifications for big bridges (Golden Gate, etc) are much more thorough than the ones you built.
Only a small group of software engineers are interested in this principled approach. Hell, many software engineers are scared of recursion.
For a bridge, you specify that it needs to withstand specific static and dynamic loads, plus several other things like that. Once you have a handful of formulas sorted out, you can design thousands of bridges the same way; most of the implementation details, such as which color they paint it, don't matter much. I'm not talking out of my butt: I had a car bridge built and still have all the engineering plans. There's a lot of knowledge that goes into making them, but the resulting specification is surprisingly small.
Now try to formally specify a browser. The complexity of the specification will probably rival the complexity of the implementation itself. You can break it into small state machines and work with that, but if you prove the correctness of 10,000 state machines, what did you actually prove about the big picture?
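To make the state-machine point concrete, here is roughly what "verifying" one of those small machines amounts to: enumerate its reachable states and check an invariant on each. Everything below (the states, events, and invariant) is invented purely for illustration; it's not taken from any real browser.

    # Toy sketch: exhaustively check an invariant on one tiny, made-up state
    # machine. A real browser would need thousands of these, and verifying
    # each one says nothing about how they compose.
    from collections import deque

    # Hypothetical "download prompt" machine: (state, event) -> next state
    TRANSITIONS = {
        ("idle", "click_link"): "prompting",
        ("prompting", "accept"): "downloading",
        ("prompting", "cancel"): "idle",
        ("downloading", "finished"): "idle",
    }
    EVENTS = ["click_link", "accept", "cancel", "finished"]

    def invariant(state: str) -> bool:
        # Invented safety property: the machine never leaves its known states.
        return state in {"idle", "prompting", "downloading"}

    def check(initial: str = "idle") -> bool:
        """Breadth-first search over reachable states, checking the invariant."""
        seen, queue = {initial}, deque([initial])
        while queue:
            state = queue.popleft()
            if not invariant(state):
                return False
            for event in EVENTS:
                nxt = TRANSITIONS.get((state, event))
                if nxt is not None and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return True

    print(check())  # True -- and yet it proves nothing about the big picture

Each proof of this kind is cheap; composing 10,000 of them into a statement about the whole browser is the part nobody has a good answer for.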
If you want to eliminate security issues, what does it even mean for a browser to be secure? It runs code from the internet. It reads and writes files. It talks directly to hardware. We have some intuitive sense of what it is and isn't supposed to do, but now write that down as math...
These "data removal" services spend a lot of effort going after the frontends, which is pretty self-serving: they can show the customer that there's something new to remove every single month or quarter, so you have to keep paying forever.
I hash every string with a SimHash and perform a Hamming distance query against those hashes for any hash that belongs to more than 3 accounts, i.e., any full string (> 42 characters) which was posted as a post title, post body, comment body, or account "description" by more than 3 accounts.
This regularly exposes huge networks of both fresh accounts and what I have to assume are stolen, credentialed "aged" accounts being used to spam, all of which just recycle the same or very similar (Hamming distance < 5 on strings > 42 characters) titles/bodies. We're talking thousands of accounts over months posting the same content over and over to the same range of subreddits.
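Roughly, the text side of this looks like the sketch below. It's a simplified illustration, not production code: the character-shingling, MD5 hashing, and linear scan are arbitrary choices for the example, and only the > 42-character minimum, the distance-below-5 cutoff, and the more-than-3-accounts rule come from what's described above.

    import hashlib

    def simhash(text: str, bits: int = 64) -> int:
        """64-bit SimHash over character 3-gram shingles."""
        weights = [0] * bits
        shingles = [text[i:i + 3] for i in range(max(1, len(text) - 2))]
        for shingle in shingles:
            h = int.from_bytes(hashlib.md5(shingle.encode()).digest()[:8], "big")
            for bit in range(bits):
                weights[bit] += 1 if (h >> bit) & 1 else -1
        return sum(1 << bit for bit in range(bits) if weights[bit] > 0)

    def hamming(a: int, b: int) -> int:
        return bin(a ^ b).count("1")

    def flag_clusters(posts):
        """posts: iterable of (account_id, text). Returns the sets of accounts
        that shared a near-duplicate string across more than 3 accounts."""
        fingerprints = []  # list of (simhash, set_of_accounts)
        for account, text in posts:
            if len(text) <= 42:                  # only full strings > 42 chars
                continue
            fp = simhash(text)
            for existing, accounts in fingerprints:
                if hamming(fp, existing) < 5:    # near-duplicate threshold
                    accounts.add(account)
                    break
            else:
                fingerprints.append((fp, {account}))
        return [accts for _, accts in fingerprints if len(accts) > 3]

At any real volume you'd bucket the fingerprints (e.g., split the 64 bits into bands and index on each band) instead of scanning linearly, but the comparison itself is the same.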
I'm just some random Laravel enjoyer, and I've automated the 'banning' of these accounts (really, I flag the strings, and any account that posts them is then flagged).
This doesn't even touch on the media... (I've basically done the same thing with hashing the media to detect duplicate or very, very similar content via pHash). Thousands and thousands of accounts are spamming the same images over and over and over.
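For the image side, one way to do the same trick in Python is the third-party imagehash package plus Pillow; the snippet below is illustrative only, and the library choice and distance cutoff are assumptions, not what's actually running anywhere.

    from PIL import Image
    import imagehash

    def is_near_duplicate(path_a: str, path_b: str, max_distance: int = 8) -> bool:
        """Perceptual-hash two images and compare them by Hamming distance."""
        hash_a = imagehash.phash(Image.open(path_a))
        hash_b = imagehash.phash(Image.open(path_b))
        # Subtracting two ImageHash objects yields their Hamming distance.
        return (hash_a - hash_b) <= max_distance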
By my numbers, 59% of the content on Reddit is spam and 51% of the accounts are spam, and that's not even counting the media-flagged spammers.
They don't seem to care about the spam, or they're completely inept. With the resources at their disposal, a huge portion of this could be caught before it ever reaches the API or goes live.
I suspect that the objective of these bulk spamming operations isn't to promote stuff on the platform, but to mess with other apps. LLMs trained on Reddit content, search engines that rank Reddit posts highly, etc.
It seems like the real intent was to regain control over the surfaces users use to consume the site, especially on mobile.
Incognito mode is fine. It doesn't persist data, it doesn't give websites access to existing main-profile data (cookies, etc), and it actually honors the settings of the profile it was spawned from.
None of this robustly prevents fingerprinting, but neither does switching to another browser or wiping your profile clean. There's just a bunch of system and network characteristics that leak info because of how the web is designed. Google didn't make it so and I don't think they're using it to serve you ads.
I think two things can simultaneously be true: Google's privacy practices aren't great, and they weren't actually doing anything that a reasonable person wouldn't expect to be happening in incognito. This was a lawsuit filed to shake them down, not to benefit the consumer. And apparently it was flimsy enough that it started with a $5B demand and is ending with no payout at all.