Readit News
tooltower commented on We can't have nice things because of AI scrapers   blog.metabrainz.org/2025/... · Posted by u/LorenDB
tooltower · 2 months ago
> Rather than downloading our dataset in one complete download, they insist on loading all of MusicBrainz one page at a time.

Is there a standard mechanism for batch-downloading a public site? I'm not too familiar with crawlers these days.

tooltower commented on Zero knowlege proof of compositeness   johndcook.com/blog/2025/1... · Posted by u/ColinWright
tooltower · 4 months ago
Are we sure that the base reveals nothing about the factors if n is composite? I have never seen a proof of that.

Usually, zero knowledge proofs also require a prover who knows the answer (the factors in this case). This is just a primality test that can be performed locally.
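
To make the "performed locally" point concrete, here is a minimal sketch. It assumes the certificate in question is a Fermat-style witness: a base b with b^(n-1) mod n != 1. The function name `is_fermat_witness` is made up for illustration and is not taken from the article.

```python
def is_fermat_witness(base: int, n: int) -> bool:
    """Return True if `base` shows that n is composite via Fermat's little theorem."""
    # If n were prime and base were not a multiple of n,
    # pow(base, n - 1, n) would have to equal 1.
    return pow(base, n - 1, n) != 1


n = 91  # 7 * 13, chosen only as a small example
print(is_fermat_witness(2, n))   # base 2 exposes 91 as composite
print(is_fermat_witness(3, n))   # base 3 is a "Fermat liar" for 91
print(is_fermat_witness(2, 97))  # 97 is prime, so no base can be a witness
```

Anyone can run this check without knowing the factors. One caveat the sketch does not address: a base that shares a factor with n also passes as a witness, and in that case gcd(base, n) does leak a factor.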

tooltower commented on Why is hash(-1) == hash(-2) in Python?   omairmajid.com/posts/2021... · Posted by u/alexmolas
kstrauser · a year ago
But even that’s an implementation detail that happens to be true today, but could easily disappear tomorrow. There’s nothing in the docs saying it has to be that way, or that you can infer anything from a hash output other than that it’s going to be consistent within a single execution of that Python program.
tooltower · a year ago
This article is not about the API contract of the hash function, or the abstraction it provides. If you are just trying to hash things, you don't need any info here.

It's very much trying to go _under_ the abstraction layer to investigate its behavior. Because it's interesting.

This is very similar to how people investigate performance quirks or security issues.
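
For anyone who wants to see the quirk itself, a short check, assuming CPython (other implementations may behave differently, and nothing in the docs guarantees this):

```python
import sys

# CPython implementation detail: -1 is used as an error return value by the
# C-level hash functions, so a value that would hash to -1 reports -2 instead.
same = hash(-1) == hash(-2)

print(sys.implementation.name)
print(hash(-1), hash(-2))
print(same)
```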

tooltower commented on GenChess   labs.google/genchess... · Posted by u/xnx
LZ_Khan · a year ago
This is freakin awesome but do the creators even play chess? Playing chess from that viewing angle is unpleasant as hell.

Imagine putting all your work into creating an amazing demo and then drowning it with one design decision.

Edit: Ok I see that you can change the view in settings. They should make that option more visible.

tooltower · a year ago
I still don't see the settings. Can anyone help?
tooltower commented on Google’s TOS doesn’t eliminate a user’s Fourth Amendment rights, judge rules [pdf]   ww3.ca2.uscourts.gov/deci... · Posted by u/coloneltcb
naet · a year ago
It seems like a large part of the ruling hinges on the fact that Google matched the image hash to a hash of a known child pornography image, but didn't require an employee to actually look at that image before reporting it to the police. If they had visually confirmed it was the image they suspected it was based on the hash then no warrant would have been required, but the judge reads that the image hash match is not equivalent to a visual confirmation of the image. Maybe there's some slight doubt in whether or not the image could be a hash collision, which depends on the hash method. It may be incredibly unlikely (near impossible?) for any hash collision depending on the specific hash strategy.

I think it would obviously be less than ideal for Google to require an employee visually inspect child pornography identified by image hash before informing a legal authority like the police. So it seems more likely that the remedy to this situation would be for the police to obtain a warrant after getting the tip but before requesting the raw data from Google.

Would the image hash match qualify as probable cause enough for a warrant? On page 4 the judge stops short of setting precedent on whether it would have or not. Seems likely that it would be a solid probable cause to me, but sometimes judges or courts have a unique interpretation of technology that I don't always share, and leaving it open to individual interpretation can lead to conflicting results.

tooltower · a year ago
The hash functions used for these purposes are usually not cryptographic hashes. They are "perceptual hashes" that allow for approximate matches (e.g. if the image has been scaled or brightness-adjusted). https://en.wikipedia.org/wiki/Perceptual_hashing

These hashes are not collision-resistant.
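
A toy illustration of both points, using a simplified "average hash" on tiny grayscale grids. This is a sketch of the general idea only, not the algorithm any particular provider uses; real perceptual hashes first downscale the image, and all names and pixel values here are invented for the example.

```python
def average_hash(pixels):
    """Toy average hash: one bit per pixel, set when the pixel is above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Number of differing bits; small distances count as approximate matches."""
    return sum(x != y for x, y in zip(a, b))


original = [[10, 200], [220, 30]]
brighter = [[p + 20 for p in row] for row in original]  # brightness-adjusted copy
unrelated = [[0, 255], [255, 0]]                         # different pixel data
edited = [[10, 200], [220, 200]]                         # one pixel changed

print(average_hash(original) == average_hash(brighter))
print(average_hash(original) == average_hash(unrelated))
print(hamming(average_hash(original), average_hash(edited)))
```

The brightness-adjusted copy keeps the same hash, which is the intended robustness. The unrelated grid also lands on the same hash, which is the kind of collision a cryptographic hash is designed to make infeasible.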

tooltower commented on LHC experiments at CERN observe quantum entanglement at the highest energy yet   home.cern/news/press-rele... · Posted by u/gmays
guy234 · a year ago
So does this imply that the phrase "entangled system" doesn't mean anything about a specific system but rather indicates which types of statistics govern a class of systems produced en masse?
tooltower · a year ago
Kind of? Set aside quantum mechanics for a second. In any classical experiment that has random outcomes, would you say that the probability distribution is a property of a single system or a bunch?

You can only deduce a distribution from repeated measurements. But most physicists would have no problem talking about a single experiment having many possible outcomes, governed by a probability distribution. It's almost a philosophical question about whether probability means anything in single systems.

It's the same way in quantum mechanics. The effects of entanglement can only be discerned if you take repeated samples. But we still feel okay talking about single systems governed by such entanglement.

tooltower commented on LHC experiments at CERN observe quantum entanglement at the highest energy yet   home.cern/news/press-rele... · Posted by u/gmays
guy234 · a year ago
I am having trouble wrapping my head around the sense in which entanglement is a physical phenomenon as opposed to a semantical byproduct of the bookkeeping involved in modern quantum theory. How can an entangled system be differentiated from a nonentangled system? If the answer is that such an identification is nonfeasible, then in what sense is entanglement an actual physical phenomenon?

I was under the impression that a particular entangled system is defined in terms of a particular waveform, which means that the choice of another waveform including, say, an additional particle off to the side, would imply that the entanglement -- which is supposed to be the behaviour being described, not the theory used to describe it -- actually changes. So, substitution of separate waveforms for each component of the entanglement would imply that entanglement is not present. How would this be false in a way different from the inaccuracies present in any other choice of waveform?

tooltower · a year ago
> How can an entangled system be differentiated from a nonentangled system?

The canonical answer to your question is Bell's inequality: https://en.wikipedia.org/wiki/Bell's_theorem. But TL;DR: the distinction only shows up in the statistics of repeated experiments. There is _no way_ to distinguish them in single-fire experiments. Entanglement is defined in terms of "odd" statistics.

In repeated measurements of related properties (e.g. spin along varying angles), entangled systems show more correlation than should be possible classically.
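
A rough numerical sketch of that statement, using the CHSH form of Bell's inequality. It assumes the textbook singlet-state prediction E(a, b) = -cos(a - b) for analyzer angles a and b, and one standard choice of angles. It only evaluates the predicted correlations; it does not simulate individual measurements.

```python
import math


def singlet_correlation(a, b):
    """Quantum prediction for the spin correlation at analyzer angles a, b (radians)."""
    return -math.cos(a - b)


def chsh(a, a2, b, b2, corr):
    """CHSH combination S; any local classical model satisfies |S| <= 2."""
    return corr(a, b) - corr(a, b2) + corr(a2, b) + corr(a2, b2)


s = chsh(0.0, math.pi / 2, math.pi / 4, 3 * math.pi / 4, singlet_correlation)
print(abs(s))           # quantum prediction for these angles
print(2 * math.sqrt(2)) # the value 2*sqrt(2), for comparison
```

Each correlation E is itself an average over many runs, which is why the violation only shows up in repeated experiments and never in a single shot.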

tooltower commented on Chain of Thought empowers transformers to solve inherently serial problems   arxiv.org/abs/2402.12875... · Posted by u/krackers
nopinsight · a year ago
In the words of an author:

"What is the performance limit when scaling LLM inference? Sky's the limit.

We have mathematically proven that transformers can solve any problem, provided they are allowed to generate as many intermediate reasoning tokens as needed. Remarkably, constant depth is sufficient.

http://arxiv.org/abs/2402.12875 (ICLR 2024)"

https://x.com/denny_zhou/status/1835761801453306089

tooltower · a year ago
Constant depth circuits can solve everything? I feel like I missed some important part of circuit complexity. Or this is BS.
tooltower commented on CrowdStrike ex-employees: 'Quality control was not part of our process'   semafor.com/article/09/12... · Posted by u/everybodyknows
Alupis · 2 years ago
> “Speed was the most important thing,” said Jeff Gardner, a senior user experience designer at CrowdStrike who said he was laid off in January 2023 after two years at the company. “Quality control was not really part of our process or our conversation.”

This type of article - built upon disgruntled former employees - is worth about as much as the apology GrubHub gift card.

Look, I think just as poorly about CrowdStrike as anyone else out there... but you can find someone to say anything, especially when they have an axe to grind and a chance at some spotlight. Not to mention this guy was a designer and wouldn't be involved in QC anyway.

> Of the 24 former employees who spoke to Semafor, 10 said they were laid off or fired and 14 said they left on their own. One was at the company as recently as this summer. Three former employees disagreed with the accounts of the others. Joey Victorino, who spent a year at the company before leaving in 2023, said CrowdStrike was “meticulous about everything it was doing.”

So basically we have nothing.

tooltower · 2 years ago
This is like online reviews. If you selectively take positive or negative reviews and somehow censor the rest, the reviews are worthless. Yet, if you report on all the ones you find, it's still useful.

Yes, I'm more likely to leave reviews if I'm unsatisfied. Yes, people are more likely to leave CS if they were unhappy. Biased data, but still useful data.

tooltower commented on "Firefox added [ad tracking] and has already turned it on without asking you"   mastodon.social/@mcc/1127... · Posted by u/notamy
tooltower · 2 years ago
I have been donating a few hundred dollars to Mozilla every year (or at least most years) for the last 7 years. It's not much, but I might stop that donation now.

u/tooltower

Karma: 637 · Cake day: November 9, 2020