So, I'm all for anything that shakes up messaging and maybe returns some of it to users.
By merging the code of StringUTF16.compress and StringUTF16.toBytes into one function (they are both very small), this wouldn't even have to slow things down: once you notice that location 'i' has char 'c' which isn't LATIN1, you make the new buffer, copy 0..i-1 over, then your known non-LATIN1 'c' at 'i', then i+1..len-1.
This might be very slightly slower, but would fix (I think) the problem of breaking interning, which feels quite painful.
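To make the idea concrete, here is a rough sketch in plain Java. It is not the JDK's code: the class name, the Result holder, and the big-endian byte layout are all made up for illustration (the real StringUTF16 uses native byte order and is intrinsified), and it ignores overflow of 2 * len for huge inputs. It also widens 0..i-1 from the bytes already captured rather than re-reading the source, so every element of the source array is read exactly once.

```java
// Hypothetical sketch, not JDK code: a merged "compress or copy" routine.
final class LatinOrUtf16 {
    static final byte LATIN1 = 0;
    static final byte UTF16 = 1;

    // Illustrative holder for the coder plus the encoded bytes.
    static final class Result {
        final byte coder;
        final byte[] value;
        Result(byte coder, byte[] value) { this.coder = coder; this.value = value; }
    }

    // Tries LATIN1 first; on the first char that does not fit, switches to UTF16.
    static Result encode(char[] src, int off, int len) {
        byte[] latin1 = new byte[len];
        for (int i = 0; i < len; i++) {
            char c = src[off + i];            // the only read of this element
            if (c > 0xFF) {
                return new Result(UTF16, widenFrom(latin1, i, c, src, off, len));
            }
            latin1[i] = (byte) c;
        }
        return new Result(LATIN1, latin1);
    }

    private static byte[] widenFrom(byte[] latin1, int i, char c,
                                    char[] src, int off, int len) {
        byte[] utf16 = new byte[2 * len];
        // 0..i-1: widen the bytes we already captured, without touching src again.
        for (int j = 0; j < i; j++) {
            putChar(utf16, j, (char) (latin1[j] & 0xFF));
        }
        // i: store the char we already know is non-LATIN1, so the UTF16 result
        // always contains at least one char that justifies its coder.
        putChar(utf16, i, c);
        // i+1..len-1: first and only read of the remaining elements.
        for (int j = i + 1; j < len; j++) {
            putChar(utf16, j, src[off + j]);
        }
        return utf16;
    }

    // Big-endian layout is an assumption made for this sketch only.
    private static void putChar(byte[] dst, int index, char c) {
        dst[2 * index] = (byte) (c >> 8);
        dst[2 * index + 1] = (byte) c;
    }
}
```

Even if another thread overwrites the source array mid-call, a UTF16 result here still holds the non-LATIN1 'c' that was observed, which is the property the interning problem seems to hinge on.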
It's not quite as easy as it seems, because these methods are "intrinsics". That is, the pure Java code you see is not always what gets used; instead, the JIT compiler can substitute a faster implementation that uses vectorized assembly or whatever. (That's why you see "@IntrinsicCandidate" on compress() and toBytes() in StringUTF16.java.)
But I think your idea would also be possible to implement in vectorized assembly, so it still works!
The rule for most things (such as ArrayList) in the JDK is: "if you use race conditions to break this thing, that's your problem, not ours". But String is different: it's one of the few things meant to be "rock-solid, can't break it even if you try", so I think this bug in String would qualify as a potential security issue in their mind: there are many places in Java that trust Strings not to act weird, and some of them are even in native code deep in the guts of the VM.
On the other hand, String is used all over the place so having to introduce a performance regression to fix this bug would be quite painful. All of the other proposed solutions in this thread introduce an extra copy, and an extra pass over the string. Your fix is basically no extra cost, and as a bonus, can be tweaked so each char in the array is read only once.
Which means that the bug can now be fixed "guilt-free", if anyone from the JDK team is reading this thread. Though they have some pretty clever people there too, so they might have thought of it eventually for all I know.
Example: suppose I offer you free email hosting "forever". Should you believe me? What if I go out of business? Well, in that case, I can at least ensure that your email address gets transferred to a different provider, so I'll still have kept my promise. So maybe this particular promise is believable.
But that only works because email is more or less a standard so there are many providers. Suppose I offer something that no one else does. Can you trust my promise that it will be "free forever"? Clearly, if I go out of business, then no.
Can I at least make a conditional promise that it will be free "as long as I'm in business"? But suppose I'm a month from bankruptcy, and my accountant tells me that getting rid of my free tier would save me. Surely, it's better for my free users to lose service rather than ALL my users losing service. So, unless you're sure I'd never be in that situation, you shouldn't believe me when I say "as long as I'm in business this will be free".
Okay, how about a vague promise like "this service will keep its free tier around unless the business is in a desperate situation of some kind"? That's a promise that I could indeed keep, if I decide it's important enough... unless... how sure are you that I'll remain in full control of what is currently "my" business? What if I take my company public? I might then be kicked out by the board. (It happened to Steve Jobs, right?) So either my "free forever" promise means I'm not allowed to go public, or at least I need to do some very careful legal acrobatics to ensure that the board can't go back on my promise, even if they kick me out.
Still, if you find yourself needing to break a promise you made about "forever" a mere month after you made it, you should probably at least apologize instead of just hoping that no one remembers that you ever made such a promise. Chances are, they will indeed remember. If "it's the right thing to do" is not enough motivation, then do it for the brownie points, you'll get more of them this way. ("Who cares about brownie points?" you ask? Well, clearly, if they weren't worried about brownie points then they wouldn't be playing this weird game of "let's pretend we never said the forever".)
This is a service that OpenCage provides, and for whatever reason OpenCage happens to be one of the popular services for this use case. (Maybe it’s because you get the text description of location back right away without having to do a round trip through a heavyweight on-screen map, maybe their free tier allows more requests than most, maybe their api is easier to use, maybe they are lucky or skilled with SEO and their tutorial happens to be the first result for some common phrases, who knows.)
So there’s this process that starts with a search for “convert phone location to address”, often involves the OpenCage api, and ends with a happy developer getting the information they wanted. Various algorithms pick up on the existence and repeated traversal of this happy path.
In another part of the internet, code tutorial content farms notice a demand for determining an incoming call’s location from the number that’s calling. They search for things like “convert phone number to location” and “convert phone number to address”. Some of these searches end up falling into the nearby well-trodden path of “convert phone location to address” and the content farmer is presented with the OpenCage api. They mess around with the api for a bit and find they can start from a phone number and get a successful api call that returns a lat/lon pair. A successful api call that returns legitimate-looking lat/lon data is all they need to make a video, so they make it and post it. Higher-quality, more scrupulous code tutorials attempt to answer this same demand but find it’s not possible, so those tutorials don’t get made, leaving the less scrupulous ones that stop with a successful-looking api call to flourish in this space. The tutorial is doing well, so the content farms endlessly recycle it into blogspam.
As a result, OpenCage starts getting weird usage patterns, tracks them down, finds the source is these tutorials, and makes a post about it.
Some time later, ChatGPT is released. People are astounded with its ability to write code and start using it for this purpose. Naturally, some of those people have the same demand as the previous generation of devs who stumbled onto the unscrupulous code tutorials. Because of the blogspam, ChatGPT’s training data includes many variations on the tutorial, and just as naturally it ends up reproducing that tutorial when asked - except ChatGPT’s magic kicks in and instead of including (what its embeddings see as) some weird unrelated area-code-to-string nonsense from the tutorial, it just bullshits some plausible-sounding data plumbing code instead. Unfortunately, because the tutorial never worked in the first place, that weird hacky irrelevant bit that ChatGPT ignored happened to be the secret sauce that makes the whole thing superficially appear to work.
As a result, OpenCage starts getting weird usage patterns, tracks them down, finds the source is ChatGPT, and makes a post about it.
In deference to Hacker News’ policy of keeping comments pleasant, I will elide the analysis of the process that leads to comments accusing OpenCage of nefariously engineering the whole thing for attention.
And it further implies that these people don't immediately follow that thought with: "That's surely impossible, since it would be a privacy nightmare if literally everyone in the world could track everyone else in the world's every move".
Or perhaps with this alternative thought, which would lead to the same conclusion: "let's not worry about privacy, how would this even work? Does every phone company in the world pro-actively send every customer's location data to OpenCage, just in case someone queries it? Or does OpenCage wait until it gets a query, and only then query the cell phone company 'just-in-time'? Both of these sound like a lot of work for each phone company to support ... what's the incentive?"
Honestly, I'm a bit surprised that the OpenCage blog post is so calm about this, instead of just yelling incoherently "why WHY why would anyone think like this?!?"
> note that back then, it was a signed integer.
Still and always is. We'd be having a 2106 problem instead of 2038 if it was unsigned 32-bit.
It's somewhat arbitrary, but by making it a signed integer, Dennis Ritchie figured it was good enough to represent dates spanning his entire lifetime. He probably thought Unix's lifetime would be significantly shorter rather than outliving him.
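For anyone who wants to check the 2038 vs 2106 arithmetic, here is a quick Java sketch. It assumes a 32-bit counter of whole seconds since 1970-01-01 UTC; the class and method names are just for illustration.

```java
import java.time.Instant;

// Illustrative only: range limits of a 32-bit seconds-since-1970 counter.
final class EpochLimits {
    // Largest value of a signed 32-bit counter: 2^31 - 1 seconds.
    static Instant signedMax() {
        return Instant.ofEpochSecond(Integer.MAX_VALUE);
    }

    // Smallest value of a signed 32-bit counter: -2^31 seconds.
    static Instant signedMin() {
        return Instant.ofEpochSecond(Integer.MIN_VALUE);
    }

    // Largest value of an unsigned 32-bit counter: 2^32 - 1 seconds.
    static Instant unsignedMax() {
        return Instant.ofEpochSecond(0xFFFFFFFFL);
    }

    public static void main(String[] args) {
        System.out.println(signedMin());   // 1901-12-13T20:45:52Z
        System.out.println(signedMax());   // 2038-01-19T03:14:07Z
        System.out.println(unsignedMax()); // 2106-02-07T06:28:15Z
    }
}
```

So the sign bit trades the years 2038 to 2106 for the years 1901 to 1970.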
https://www.tuhs.org/cgi-bin/utree.pl?file=V3/man/man2/time....
vs
https://www.tuhs.org/cgi-bin/utree.pl?file=V4/man/man2/time....
EDIT: note that back then, it was a signed integer. Turns out that people wanted to be able to represent dates from before 1970 so we lost one bit.
Also you have to go over your bag, clothes, car, etc. and get rid of anything that could out you.
I eventually landed on "I don't date" as my official presentation, but after about 25, that starts looking really weird too. If I ever have to go back into the closet now, I'd probably claim to be divorced or widowed. "Traumatized by a bad marriage" is still easier for lots of people than "rug muncher".
I'm a lesbian, and the main problems I've encountered are American culture/organizations acting as though everybody has a spouse (when you're homosexual, your dating pool is small enough that if there's nobody around, there's nobody around, but this hits single people in general), assuming that I could move or live anywhere (the suggestion to live in the middle of a rural area to save money was a lot more dicey for us a decade or two ago, and there are still a lot of places I can't/won't travel), and trying to be sure not to out myself on accident during interviews. I had to practice saying I had a boyfriend so I wouldn't slip up. That kind of thing.
That said, I also wasn't visibly gay until last year. I've never run the interview gauntlet as a butch woman, and I imagine that being a butch dyke or an effeminate male adds a new layer of issue.
They never TELL you they're rejecting you because you're gay (or a woman. Or too young. Or disabled). You just get fewer jobs than your peers.