https://home.engineering.iastate.edu/~julied/classes/ee524/L...
That said, if you're not using it, it defeats the purpose. And the more you use it, the higher the likelihood you'll be detected down the line. Compare to SolarWinds.
https://www.ussc.gov/sites/default/files/pdf/research-and-pu...
In 2022, murderers sentenced in the Federal Southern District of New York (N = 58) received a median sentence of 231 months, which comes out to 19 years and change.
No crime of any type had a mean or median sentence higher than 25 years; most were far less than half that.
So it should be really hard to argue that this is a "light" sentence. If anything, it's excessive if you consider the nature of the crime relative to the nature of murder or kidnapping.
The formal specification for something like Redis is likely much more akin to your car bridge spec. And to continue your analogy, I imagine the specifications for big bridges (Golden Gate, etc) are much more thorough than the ones you built.
Only a small group of software engineers are interested in this principled approach. Hell, many software engineers are scared of recursion.
For a bridge, you specify that it needs to withstand specific static and dynamic loads, plus several other things like that. Once you have a handful of formulas sorted out, you can design thousands of bridges the same way; most of the implementation details, such as which color they paint it, don't matter much. I'm not talking out of my butt: I had a car bridge built and still have all the engineering plans. There's a lot of knowledge that goes into making them, but the resulting specification is surprisingly small.
Now try to formally specify a browser. The complexity of the specification will probably rival the complexity of the implementation itself. You can break it into small state machines and work with that, but if you prove the correctness of 10,000 state machines, what did you actually prove about the big picture?
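To make the state-machine point concrete, here is roughly what "verifying" one of those small machines amounts to: enumerate its reachable states and check an invariant on each. Everything below (the states, events, and invariant) is invented purely for illustration; it's not taken from any real browser.

    # Toy sketch: exhaustively check an invariant on one tiny, made-up state
    # machine. A real browser would need thousands of these, and verifying
    # each one says nothing about how they compose.
    from collections import deque

    # Hypothetical "download prompt" machine: (state, event) -> next state
    TRANSITIONS = {
        ("idle", "click_link"): "prompting",
        ("prompting", "accept"): "downloading",
        ("prompting", "cancel"): "idle",
        ("downloading", "finished"): "idle",
    }
    EVENTS = ["click_link", "accept", "cancel", "finished"]

    def invariant(state: str) -> bool:
        # Invented safety property: the machine never leaves its known states.
        return state in {"idle", "prompting", "downloading"}

    def check(initial: str = "idle") -> bool:
        """Breadth-first search over reachable states, checking the invariant."""
        seen, queue = {initial}, deque([initial])
        while queue:
            state = queue.popleft()
            if not invariant(state):
                return False
            for event in EVENTS:
                nxt = TRANSITIONS.get((state, event))
                if nxt is not None and nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return True

    print(check())  # True -- and yet it proves nothing about the big picture

Each proof of this kind is cheap; composing 10,000 of them into a statement about the whole browser is the part nobody has a good answer for.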
If you want to eliminate security issues, what does it even mean for a browser to be secure? It runs code from the internet. It reads and writes files. It talks directly to hardware. We have some intuitive sense of what it is and isn't supposed to do, but now write that down as math...
These "data removal" services spend a lot of effort going after the frontends, which is pretty self-serving: they can show the customer that there's something new to remove every single month or quarter, so you have to keep paying forever.
I hash every string with a SimHash and perform a Hamming distance query against those hashes for any hash that belongs to more than 3 accounts, i.e., any full string (> 42 characters) which was posted as a post title, post body, comment body, or account "description" by more than 3 accounts.
This regularly exposes huge networks of both fresh accounts and what I have to assume are stolen, credentialed "aged" accounts being used to spam, all of which just recycle the same or very similar (Hamming distance < 5 on strings > 42 characters) titles/bodies. We're talking thousands of accounts over months posting the same content over and over to the same range of subreddits.
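Roughly, the text side of this looks like the sketch below. It's a simplified illustration, not production code: the character-shingling, MD5 hashing, and linear scan are arbitrary choices for the example, and only the > 42-character minimum, the distance-below-5 cutoff, and the more-than-3-accounts rule come from what's described above.

    import hashlib

    def simhash(text: str, bits: int = 64) -> int:
        """64-bit SimHash over character 3-gram shingles."""
        weights = [0] * bits
        shingles = [text[i:i + 3] for i in range(max(1, len(text) - 2))]
        for shingle in shingles:
            h = int.from_bytes(hashlib.md5(shingle.encode()).digest()[:8], "big")
            for bit in range(bits):
                weights[bit] += 1 if (h >> bit) & 1 else -1
        return sum(1 << bit for bit in range(bits) if weights[bit] > 0)

    def hamming(a: int, b: int) -> int:
        return bin(a ^ b).count("1")

    def flag_clusters(posts):
        """posts: iterable of (account_id, text). Returns the sets of accounts
        that shared a near-duplicate string across more than 3 accounts."""
        fingerprints = []  # list of (simhash, set_of_accounts)
        for account, text in posts:
            if len(text) <= 42:                  # only full strings > 42 chars
                continue
            fp = simhash(text)
            for existing, accounts in fingerprints:
                if hamming(fp, existing) < 5:    # near-duplicate threshold
                    accounts.add(account)
                    break
            else:
                fingerprints.append((fp, {account}))
        return [accts for _, accts in fingerprints if len(accts) > 3]

At any real volume you'd bucket the fingerprints (e.g., split the 64 bits into bands and index on each band) instead of scanning linearly, but the comparison itself is the same.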
I'm just some random Laravel enjoyer, and I've automated the 'banning' of these accounts (really, I flag the strings, and any account that posts them is then flagged).
This doesn't even touch on the media... (I've basically done the same thing with hashing the media to detect duplicate or very, very similar content via pHash). Thousands and thousands of accounts are spamming the same images over and over and over.
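For the image side, one way to do the same trick in Python is the third-party imagehash package plus Pillow; the snippet below is illustrative only, and the library choice and distance cutoff are assumptions, not what's actually running anywhere.

    from PIL import Image
    import imagehash

    def is_near_duplicate(path_a: str, path_b: str, max_distance: int = 8) -> bool:
        """Perceptual-hash two images and compare them by Hamming distance."""
        hash_a = imagehash.phash(Image.open(path_a))
        hash_b = imagehash.phash(Image.open(path_b))
        # Subtracting two ImageHash objects yields their Hamming distance.
        return (hash_a - hash_b) <= max_distance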
By my numbers, 59% of the content on Reddit is spam and 51% of the accounts are spam, and that's not even counting the media-flagged spammers.
They don't seem to care about the spam, or they're completely inept. With the resources at their disposal, a huge portion of this could be caught before it ever reaches the API or goes live.
I suspect that the objective of these bulk spamming operations isn't to promote stuff on the platform, but to mess with other apps. LLMs trained on Reddit content, search engines that rank Reddit posts highly, etc.
It seems like the real intent was to regain control over the surfaces users use to consume the site, especially on mobile.
Incognito mode is fine. It doesn't persist data, it doesn't give websites access to existing main-profile data (cookies, etc), and it actually honors the settings of the profile it was spawned from.
None of this robustly prevents fingerprinting, but neither does switching to another browser or wiping your profile clean. There's just a bunch of system and network characteristics that leak info because of how the web is designed. Google didn't make it so and I don't think they're using it to serve you ads.
I think two things can simultaneously be true: Google's privacy practices aren't great, and they weren't actually doing anything that a reasonable person wouldn't expect to be happening in incognito. This was a lawsuit filed to shake them down, not to benefit the consumer. And apparently it was flimsy enough that it started with a $5B demand and is ending with no payout at all.