Readit News
joefkelley commented on SWE-Bench Pro   github.com/scaleapi/SWE-b... · Posted by u/tosh
gpt5 · 3 months ago
Slightly tangential question: they said that they have protected the public test set with a strong copyleft license to prevent training private models on it.

Does it actually work? Hasn't AI training so far simply ignored all license and copyright restrictions?

joefkelley · 3 months ago
I happen to have worked on exactly this at Google. No: to the best of our abilities, we don't train on restrictively licensed code.
joefkelley commented on After 20 years, the globally optimal Boggle board   danvk.org/2025/04/23/bogg... · Posted by u/danvk
joefkelley · 8 months ago
This reminded me of one of my high school computer science assignments: simply find all words in a single Boggle board, and try to optimize your solution a bit. The point was to teach recursion/backtracking and data structures. The intended solution was roughly: start at a square, check if your current prefix is a valid prefix, move to a neighbor recursively, and emit any words you find. Trying to optimize naturally motivates a trie data structure.

I found it to be at least an order of magnitude faster, though, to invert the solution: loop through each word in the dictionary and check whether it exists in the grid! The dictionary is small compared to the number of grid paths, and checking whether a word exists in the grid is very very fast, requiring not much backtracking, and lends itself well to heuristic filtering.
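A minimal sketch of the inverted approach described above, not the original assignment code: for each dictionary word, backtrack through the grid looking for that one word. The class and method names (`BoggleWordSearch`, `contains`, `findWords`) are made up for illustration, and the sketch assumes a non-empty rectangular grid with one letter per cell (no "Qu" tile) and no heuristic pre-filtering.

```java
import java.util.ArrayList;
import java.util.List;

class BoggleWordSearch {

    // True if `word` can be traced through adjacent cells (diagonals included)
    // without visiting any cell twice.
    static boolean contains(char[][] grid, String word) {
        if (word.isEmpty()) {
            return false;
        }
        boolean[][] used = new boolean[grid.length][grid[0].length];
        for (int r = 0; r < grid.length; r++) {
            for (int c = 0; c < grid[r].length; c++) {
                if (search(grid, word, 0, r, c, used)) {
                    return true;
                }
            }
        }
        return false;
    }

    // Backtracking step: does word[i..] start at cell (r, c)?
    private static boolean search(char[][] grid, String word, int i,
                                  int r, int c, boolean[][] used) {
        if (r < 0 || r >= grid.length || c < 0 || c >= grid[r].length) {
            return false;
        }
        if (used[r][c] || grid[r][c] != word.charAt(i)) {
            return false;
        }
        if (i == word.length() - 1) {
            return true;
        }
        used[r][c] = true;
        boolean found = false;
        for (int dr = -1; dr <= 1 && !found; dr++) {
            for (int dc = -1; dc <= 1 && !found; dc++) {
                if (dr != 0 || dc != 0) {
                    found = search(grid, word, i + 1, r + dr, c + dc, used);
                }
            }
        }
        used[r][c] = false; // undo the mark so other paths can use this cell
        return found;
    }

    // The "inverted" loop: iterate over the dictionary, not over grid paths.
    static List<String> findWords(char[][] grid, List<String> dictionary) {
        List<String> found = new ArrayList<>();
        for (String word : dictionary) {
            if (contains(grid, word)) {
                found.add(word);
            }
        }
        return found;
    }
}
```

Most words fail on their first or second letter, which is presumably why little backtracking is needed in practice. A cheap pre-filter, such as skipping words that contain a letter absent from the grid, would slot in before the call to `contains`.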

joefkelley commented on The Origins of Wokeness   paulgraham.com/woke.html... · Posted by u/crbelaus
throwaway_2494 · a year ago
Summary: Rich white guy complains that it's too much effort to figure out what we're supposed to call 'coloured people' these days. It reads like the lament of a sore winner who has been forced to think of others' feelings against his will.

And all of this is couched in a pseudo-historical style that perhaps the author hopes will shield it from being read as an 'emotional' argument.

And you know what's the worst thing? We live in a conservative world. They set the rules of the game, they draw the chalk outlines of the playing field, they own the ball, the stadium and the referees.

And now they tell us we have to be silent when they rough us up too?

joefkelley · a year ago
Yeah I guess wokeness and cancel culture are what you complain about now when your life is so free from challenges that you have nothing else to complain about.
joefkelley commented on New Docs Reveal US DHS Conspiracy to Violate First Amendment   public.substack.com/p/new... · Posted by u/IG_Semmelweiss
uLogMicheal · 2 years ago
I welcome critical points of view but the fact remains that few are allowed to even see posts like this, much less discuss them with logic. The root comment here dismisses the article without any argument. How is a ticketing portal between government and social media companies for post take downs on subjective matters not evidence of censorship? Why is our government subsidizing moderation?
joefkelley · 2 years ago
The burden of proof is on the person making a positive claim. So I didn't think I really needed any more argument than "the article's argument is insufficient for me to believe their claims". I wasn't trying to make any more general point about censorship or social media.

I don't have a problem in general with a ticketing system for the government to request specific kinds of moderation. There do exist valid carve-outs of free speech, and I see no reason why government and industry couldn't collaborate on that. If it were used to censor political speech, then yes, I would have a problem with that. But I haven't seen evidence of that.

joefkelley commented on New Docs Reveal US DHS Conspiracy to Violate First Amendment   public.substack.com/p/new... · Posted by u/IG_Semmelweiss
joefkelley · 2 years ago
The evidence presented in the article does not match the claims.

The emails they include show that there were meetings between DHS and Twitter and between DHS and Stanford, on the topic of election integrity. And that there was a Signal chat (I guess this is kind of sketchy).

But there's no evidence of censorship or anything politically-motivated that I can discern.

joefkelley commented on Retrofitting null-safety onto Java at Meta   engineering.fb.com/2022/1... · Posted by u/mikece
driggs · 3 years ago
It's so sad that Java 8 had the chance to really fix the null problem, but gave us only the half-assed `java.util.Optional<T>`. Rather than implementing optional values at the language level, it's just another class tossed into the JRE.

This is perfectly legal code, where the optional wrapper itself is null:

    Optional<String> getMiddleName() {
        return null;
    }
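To make the hazard concrete, here is a small sketch of what happens at the call site, next to the conventional alternative of never returning null from an `Optional`-returning method. The class `MiddleNameExample`, its field, and the helper `describe` are hypothetical names used only for illustration.

```java
import java.util.Optional;

class MiddleNameExample {
    // Hypothetical backing field; may legitimately be null.
    private final String middleName;

    MiddleNameExample(String middleName) {
        this.middleName = middleName;
    }

    // Compiles fine, but the wrapper itself is null, so any call on the
    // result throws NullPointerException.
    Optional<String> getMiddleNameUnsafe() {
        return null;
    }

    // Conventional alternative: wrap the possibly-null value instead.
    Optional<String> getMiddleName() {
        return Optional.ofNullable(middleName);
    }

    // A typical caller that trusts the Optional to be non-null.
    static String describe(Optional<String> name) {
        return name.map(String::toUpperCase).orElse("(none)");
    }
}
```

Passing the result of `getMiddleNameUnsafe()` to `describe` fails with a `NullPointerException` at `name.map(...)`, while `getMiddleName()` yields `"(none)"` for a missing name. Nothing in the compiler distinguishes the two methods, which is the complaint above.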

joefkelley · 3 years ago
Do you feel that a language has to have something at the language level to prevent NPEs?

In my experience, Scala does pretty well without it.

I guess your point is that the language should make it impossible to write bad code, not just make it easy to write good code?

joefkelley commented on Zoom: Remote Code Execution with XMPP Stanza Smuggling   bugs.chromium.org/p/proje... · Posted by u/Flowdalic
Diggsey · 4 years ago
There are just so many issues here.

1) Don't rely on two parsers having identical behaviour for security. Yes parsers for the same format should behave the same, but bugs happen, so don't design a system where small differences result in such a catastrophic bug. If you absolutely have to do this, at least use the same parser on both ends.

2) Don't allow layering violations. All content of XML documents is required to be valid in the configured character encoding. That means layer 1 of your decoder should be converting a byte stream into a character stream, and layers 2+ should not even have the opportunity to mess up decoding a character. Efficiency is not a justification, because you can use compile-time techniques to generate the exact same code as if you combined all layers into one. This has the added benefit that it removes edge-cases (if there is one place where bytes are decoded into characters, then you can't get a bug where that decoding is only broken in tag names, and so your test coverage is automatically better).

3) Don't transparently download and install stuff without user interaction, regardless of where it comes from!

4) Revoke certificates for old compromised versions of an installer so that downgrade attacks are not possible.
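As a rough illustration of point 2, a single strict decoding layer might look like the sketch below. This is not Zoom's code or any particular XML parser's API; the class name `StrictUtf8Layer` is invented, and UTF-8 is assumed as the configured encoding. The idea is that every later layer receives only a `String` and never sees raw bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class StrictUtf8Layer {
    // Layer 1: bytes -> characters. Malformed input is reported as an
    // exception, unlike new String(bytes, UTF_8), which substitutes U+FFFD.
    static String decode(byte[] input) throws CharacterCodingException {
        return StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(input))
                .toString();
    }
}
```

With this in place, a truncated multi-byte sequence anywhere in the stanza, whether in a tag name or in text content, is rejected before tokenizing starts, so the decoding cannot be broken in only one part of the grammar.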

joefkelley · 4 years ago
> 3) Don't transparently download and install stuff without user interaction, regardless of where it comes from!

This is an interesting one. I totally get your point. But also users are terrible about updating their software if you give them the choice. Automatic updates have very practical security benefits. I've witnessed non-technical folks hit that "remind me later" button for years.

joefkelley commented on Why I Work on Ads   jefftk.com/p/why-i-work-o... · Posted by u/benjaminjosephw
beloch · 5 years ago
The crux of his justification for what he does is that, he argues, people wouldn't want to pay a monthly fee for services like youtube bundled with complete respect for their privacy.

First, users do not currently have that choice. Sure, you can pay for some things (e.g. youtube premium), but it does nothing for your privacy. If you buy youtube premium you'll very likely see more ads for youtube premium (if you're not already blocking ads).

Second, the real benefit of ads is that it lets small sites that might get a single one-time visit from a user monetize that visit. A blog with a trending post is not going to be able to sell micro-subscriptions to one-time users, but they can get some ad revenue. The only current alternative here is begging for donations. That takes some effort and can piss off readers.

Ironically, although Kaufman mentions that micropayments are hard, Google is one of the few companies currently situated to implement them in a way that would actually improve user privacy. e.g. If a user paid a "Premium Internet" monthly fee, Google Ads could have a flag that turns its data collection/sharing off and replaces it with micropayments to any site that user visits that are running Google Ads.

Of course, it does seem a little bit like a mafia protection racket for a company devoted to invading user privacy and selling their data to turn around and offer to stop doing that if paid by users!

joefkelley · 5 years ago
> If you buy youtube premium you'll very likely see more ads for youtube premium

Really? I have youtube premium and I can't recall seeing ads for it. Why would they advertise a product to people that already have that product?

> Google Ads could have a flag that turns its data collection/sharing off and replaces it with micropayments to any site that user visits that are running Google Ads.

FWIW, this flag already exists, except you don't have to do micropayments: https://adssettings.google.com/

I guess you still see ads with this setting, they're just not personalized. Hypothetically you could imagine a "stronger" setting that doesn't just do away with personalization, it does away with ads altogether by allowing the user to "outbid" any advertiser. But I suspect there will be some surprised users who get a bill for hundreds of dollars by doing some particularly high-value searches like "personal injury lawyer" or "mortgage" or something.

And if it were a flat rate, my intuition is that the fee would have to be much higher than most would expect or be willing to pay.

joefkelley commented on Enter Sandbox: Google is building an internet without cookies   pressgazette.co.uk/death-... · Posted by u/wolverine876
shostack · 5 years ago
Do you use any signals from people who opt out of Web and App activity to feed into models that are used not just for measurement but for targeting?

What if someone is identified by a model as being in market for a TV and then opts out? Would they still be classified as in market at that point?

I work in digital media, feel free to get technical with your response.

joefkelley · 5 years ago
I'll hedge this by saying I can only speak for the teams I've worked on, plus my somewhat limited understanding of company-wide policies.

The answer to both questions is no. If you opt out, your data is not used for modeling or targeting or anything. Perhaps some internal reporting that isn't used for anything other than like PMs wanting to understand user behavior? Even that I'm not sure about.

If you are identified as being in market for something based on activity and then opt out, you will no longer be classified as being in market. That classification will be deleted, though perhaps not immediately: within some reasonable time frame, say 24 hours or so.

u/joefkelley

Karma: 519 · Cake day: October 29, 2013
About
[ my public key: https://keybase.io/joefkelley; my proof: https://keybase.io/joefkelley/sigs/tQvRyEmh9cYiyGuHqWJiUOQyCysGDoulVS3wCX_PG7M ]