josephg (u/josephg) - Readit News

josephg commented on Leaving Gmail for Mailbox.org giuliomagnifico.blog/post... · Posted by u/giuliomagnifico

You can find several discussions of this practice online, including people commenting that they receive email for previous holders of those ids.

The commenter above may have never deleted the alias to release it for reuse.

Reusing email addresses is pretty universally considered terrible practice. So you may want to discuss it with your friends there.

josephg · 6 hours ago

I will. But for everyone on this site, the real answer is to own your own domain and use that. Then your email service is just a commodity.

josephg commented on I'm too dumb for Zig's new IO interface openmymind.net/Im-Too-Dum... · Posted by u/begoon

hu3 · 17 hours ago

I beg to differ. Rust's async implementation is contentious and often criticized. Sometimes you just miss the mark despite pondering it for a while. Same with Go.

josephg · 7 hours ago

Yeah async rust is definitely the exception. That and Pin, which in my opinion totally missed the mark. The feature rust needs is the ability to have self reference in a struct. Pin is a hacky, inadequate half solution.

josephg commented on I'm too dumb for Zig's new IO interface openmymind.net/Im-Too-Dum... · Posted by u/begoon

j1elo · 19 hours ago

> To convert the Stream.Reader to an std.Io.Reader, we need to call its interface() method. To get a std.io.Writer from an Stream.Writer, we need the address of its &interface field. This doesn't seem particularly consistent.

That made me think of how that change would be received in Go (it would probably be discarded). The way they approach changes is with extremely deep analysis, taking as much time as needed to avoid mistakes and reach a consistent solution (or as close as possible).

This has been my favorite for a while: https://github.com/golang/go/issues/45624

4 years to decide on something relatively minor, which right now can be done with a bit of extra one-liner work. But things need to be well thought out. Inconsistencies are pointed out. Design concerns are raised. Actual code usage in the real world is taken into account... too slow for some people, but I think it's just as slow as it needs to be. The final decision is shaping up to be very nice.

josephg · 19 hours ago

Rust is the same. It grinds my goat a little how many useful features are implemented - but only available in nightly rust. Things like generators.

But when rust ships features to stable, they’re usually pretty well thought through. I’m impatient. But the rust language & compiler teams probably have the right idea.

josephg commented on Leaving Gmail for Mailbox.org giuliomagnifico.blog/post... · Posted by u/giuliomagnifico

ants_everywhere · a day ago

> Do that, it's a non-issue

I think the issue is why use an email provider that has designed such a glaring security hole into their system? Does it not raise questions about their judgment in other matters that are less visible to the user?

josephg · a day ago

First, it’s not been established that they do have that security hole. Someone upthread said the email address they used during a Fastmail trial was no longer available when they tried to sign up later, because Fastmail didn’t want to give out the address again.

Second, and I don’t know how much weight this carries - but I personally know some of the people on the Fastmail team. They’re some of the most thoughtful, steady engineers I’ve ever met. Every time I’ve criticised something about Fastmail to my friends there, it turns out they’ve had the same discussion internally and immediately tell me about a bunch of arguments I hadn’t thought of which explain their final product choices. I wish much more of my software was made at companies like that. They have excellent judgement. They’re absolutely the right kind of people to host a long lived email service.

josephg commented on Show HN: OS X Mavericks Forever mavericksforever.com/... · Posted by u/Wowfunhappy

philistine · 3 days ago

Both those projects can only go skin deep. The macOS experience is not only how it looks, but the depth of its interactivity and the thoughtful implementations within that depth.

I still shudder when I see the limitations of dragging files in Windows. The fact I can drag a folder to a save dialog to jump to that folder is so natural to me, and Windows and Linux never bothered with those details.

josephg · 2 days ago

Yep. And you can drag from the title bar in a lot of applications to get the open file. And all the shortcut keys are consistent across applications.

I daily drive Linux mint. I can’t use ctrl+C in the terminal for Copy because that’s reserved for the interrupt signal. Fine - I’ll use meta+C. But I can’t use meta+C to copy in IntelliJ because the meta key isn’t a modifier key in Java. I’ve ended up needing to memorise different keys for copy+paste in every program I use. I mess it up on a daily basis. It’s madness!

Linux is like that everywhere. I like smooth scrolling. Some applications support it properly. Some half support it, or add scrolling lag for no reason (Firefox) and some break completely, assuming every scroll event should scroll a few lines down. I eventually solved my software problems by buying a worse mouse without smooth scrolling support.

Alt+mouse drag moves windows around. I love that feature! I can’t believe windows and macOS are missing it! But - oops. Alt+click is a thing in davinci resolve for adding keyframes. Urgh. It’s this. Over and over again constantly.

josephg commented on It’s not wrong that "\u{1F926}\u{1F3FC}\u200D\u2642\uFE0F".length == 7 (2019) hsivonen.fi/string-length... · Posted by u/program

account42 · 2 days ago

> Eg maybe insert “a” at position 0 is valid, but inserting at position 1 would be invalid because it might insert in the middle of a codepoint.

You have the same problem with code points, it's just hidden better. Inserting "a" between U+0065 and U+0308 may result in a "valid" string but is still as nonsensical as inserting "a" between UTF-8 bytes 0xC3 and 0xAB.

This makes code points less suitable than UTF-8 bytes as mistakes are more likely to not be caught during development.

josephg · 2 days ago

I hear your point, but invalid codepoint sequences are way less of a problem than strings with invalid UTF8. Text rendering engines deal with weird Unicode just fine. They have to since Unicode changes over time. Invalid UTF8 on the other hand is completely unrepresentable in most languages. I mean, unless you use raw byte arrays and convert to strings at the edge, but that’s a terrible design.

> This makes code points less suitable than UTF-8 bytes as mistakes are more likely to not be caught during development.

Disagree. Allowing 2 kinds of bugs to slip through to runtime doesn’t make your system more resilient than allowing 1 kind of bug. If you’re worried about errors like this, checksums are a much better idea than letting your database become corrupted.
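To make the contrast concrete, here is a small sketch using the two examples from the parent comment. Python is only an illustrative choice (its strings are indexed by code point), and `is_valid_utf8` is a helper name made up for this example.

```python
# Code-point insert: splitting "e" + U+0308 (combining diaeresis) gives an odd
# but still perfectly representable string.
decomposed = "e\u0308"
odd = decomposed[:1] + "a" + decomposed[1:]

# Byte insert: splitting the UTF-8 encoding of U+00EB (bytes 0xC3 0xAB) gives
# a byte sequence that is no longer valid UTF-8 at all.
encoded = "\u00eb".encode("utf-8")
broken = encoded[:1] + b"a" + encoded[1:]


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

The first result can still be stored, sent and rendered as a string. The second cannot even be turned into a string without an error or a replacement character.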

josephg commented on It’s not wrong that "\u{1F926}\u{1F3FC}\u200D\u2642\uFE0F".length == 7 (2019) hsivonen.fi/string-length... · Posted by u/program

account42 · 2 days ago

> Number of code points when parsing.

You shouldn't really ever care about the number of code points. If you do, you're probably doing something wrong.

josephg · 2 days ago

It’s a bit of a niche use case, but I use the codepoint counts in CRDTs for collaborative text editing.

Grapheme cluster counts can’t be used because they’re unstable across Unicode versions. Some algorithms use UTF8 byte offsets - but I think that’s a mistake because they make input validation much more complicated. Using byte offsets, there’s a whole lot of invalid states you can represent easily. Eg maybe insert “a” at position 0 is valid, but inserting at position 1 would be invalid because it might insert in the middle of a codepoint. Then inserting at position 2 is valid again. If you send me an operation which happened at some earlier point in time, I don’t necessarily have the text document you were inserting into handy. So figuring out if your insertion (and deletion!) positions are valid at all is a very complex and expensive operation.

Codepoints are way easier. I can just accept any integer up to the length of the document at that point in time.
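A rough sketch of the difference in validation cost, with hypothetical function names. The code-point check only needs the document length at that point in time. The byte-offset check needs the actual document bytes, which is the part that may not be handy when an old operation arrives.

```python
def valid_codepoint_pos(doc_len: int, pos: int) -> bool:
    """Code-point offsets: any integer from 0 to doc_len is a legal insert position."""
    return 0 <= pos <= doc_len


def valid_byte_pos(doc: bytes, pos: int) -> bool:
    """UTF-8 byte offsets: positions inside a multi-byte code point are illegal."""
    if not 0 <= pos <= len(doc):
        return False
    if pos == len(doc):
        return True
    # UTF-8 continuation bytes have the bit pattern 10xxxxxx.
    return (doc[pos] & 0xC0) != 0x80
```

For the two-character document "ëa" (UTF-8 bytes 0xC3 0xAB 0x61), byte positions 0, 2 and 3 are valid but 1 is not, while every code-point position from 0 to 2 is fine.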

josephg commented on Io_uring, kTLS and Rust for zero syscall HTTPS server blog.habets.se/2025/04/io... · Posted by u/guntars

avar · 2 days ago

    every HTTP session was commonly a forked
    copy of the entire server in the CERN
    and Apache lineage!

And there's nothing wrong with that for application workers. On *nix systems fork() is very fast, you can fork "the entire server" and the kernel will only COW your memory. As nginx etc. showed you can get better raw file serving performance with other models, but it's still a legitimate technique for application logic where business logic will drown out any process overhead.
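A minimal sketch of the fork-per-request idea, assuming a Unix system. `handle_in_child` is a made-up name and the upper-casing stands in for real application logic; this is not how CERN httpd or Apache are actually written.

```python
import os


def handle_in_child(request: bytes) -> bytes:
    """Fork a worker for one request and return its response (Unix only)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: starts with a copy-on-write view of the parent's memory,
        # handles exactly one request, then exits without returning.
        try:
            os.close(read_fd)
            os.write(write_fd, request.upper())  # stand-in for business logic
        finally:
            os._exit(0)
    # Parent: close our copy of the write end, read the response, reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        response = reader.read()
    os.waitpid(pid, 0)
    return response
```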

josephg · 2 days ago

So long as you have something like nginx in front of your server. Otherwise your whole site can be taken down by a slowloris attack over a 33.6k modem.

josephg commented on CRDT: Text Buffer madebyevan.com/algos/crdt... · Posted by u/skadamat

yladiz · 4 days ago

I'm starting to learn more about RGAs and CRDTs in general, so I'm not sure if this makes sense, but when you say replacing the sequence number for the "right parent" (`originRight` in your code?), so you mean replacing the Lamport timestamp for the node/operation with a pointer to the element adjacent to the right, correct? One alternative way to approach it that comes to mind is to introduce transaction semantics so that you can consider a node to be identified by a [Lamport timestamp, site ID, transaction sequence] and the parent, and use a sequence number within the transaction to sort, but it seems like it would add additional data and complexity compared to the "right parent" approach, so it might not be ideal, and may fall victim to the same downside as the original RGA.

josephg · 3 days ago

> so you mean replacing the Lamport timestamp for the node/operation with a pointer to the element adjacent to the right, correct?

Yeah that's right. It's a GUID, because they need to be sent over the wire. For text editing, we usually use {site ID, transaction sequence} because they compress better than random IDs.
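One way to read "compress better" (an assumption for illustration, not a description of any particular library's wire format): a run of typing from one site produces consecutive sequence numbers, so many IDs collapse into a single run. Random IDs have no such structure. `compress_ids` is a hypothetical helper.

```python
from typing import List, Tuple


def compress_ids(ids: List[Tuple[str, int]]) -> List[Tuple[str, int, int]]:
    """Collapse consecutive (site, seq) IDs into (site, first_seq, count) runs."""
    runs: List[Tuple[str, int, int]] = []
    for site, seq in ids:
        if runs and runs[-1][0] == site and runs[-1][1] + runs[-1][2] == seq:
            # This ID directly continues the previous run, so extend it.
            prev_site, first, count = runs[-1]
            runs[-1] = (prev_site, first, count + 1)
        else:
            runs.append((site, seq, 1))
    return runs
```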

> One alternative way to approach it that comes to mind is to introduce transaction semantics so that you can consider a node to be identified by a [Lamport timestamp, site ID, transaction sequence] and the parent, and use a sequence number within the transaction to sort, ...

Maybe? I don't fully understand what you mean. And even if I did, I'm not clever enough to infer all the implications of that construction. But yes, I suspect you're right that in the best case, it would be equivalent to fuguemax. And in the worst case, it would introduce new bugs.

josephg commented on CRDT: Text Buffer madebyevan.com/algos/crdt... · Posted by u/skadamat

archagon · 3 days ago

> If you ask me, this is a mistake. The algorithm is simpler if you primarily think of it as inserts into a list. (Thanks Kevin Jahns for this insight!)

Can you elaborate? This sounds a little tautological, so I must be missing something.

josephg · 3 days ago

The way RGA and Fugue are usually described, all the inserted items form a tree. In RGA, each item has {parent: ID, seq: number}. The item is inserted as a child of its specified parent. Fugue is a little more complex because items specify 2 parent IDs. You can store this as a tree where every item has left children and right children.

But if you actually implement your CRDT like this, you'll find the tree is incredibly unbalanced. You'll end up with runs of thousands of items where you have (x)->(y)->(z)->(q) and so on. It resembles a linked list more than anything. Performance is abysmal as a result. This is one of the causes for the terrible performance of early versions of automerge.

Here's the trick: Flatten the tree. Store all items in a list instead, in the order all the items show up in the document. But this presents a new problem: how do you correctly handle inserts? We need to insert new items in the list in the correct location, as if we inserted into a tree then flattened it afterwards. But it turns out that this translation is quite simple in practice. It's like ~10-20 lines of code.
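As a rough illustration, here is a simplified RGA-style insert into a flat list. It is a sketch under stated assumptions, not the code from the linked repository: the `Item` shape, the `find` helper, and the tie-break rule (higher `seq` first, then higher agent name) are all choices made for this example, and `find` is a linear scan to keep it short.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Id = Tuple[str, int]  # (agent name, per-agent counter); hypothetical ID shape


@dataclass
class Item:
    id: Id
    parent: Optional[Id]  # item this one was inserted after; None = document start
    seq: int              # Lamport-style sequence number
    content: str


def find(doc: List[Item], target: Optional[Id]) -> int:
    """Index of the item with this ID, or -1 for the document start."""
    if target is None:
        return -1
    for i, item in enumerate(doc):
        if item.id == target:
            return i
    raise KeyError(target)


def integrate(doc: List[Item], new: Item) -> None:
    """Insert `new` where a depth-first walk of the RGA tree would place it."""
    parent = find(doc, new.parent)
    dest = parent + 1
    while dest < len(doc):
        other = doc[dest]
        other_parent = find(doc, other.parent)
        if other_parent < parent:
            break  # we have walked out of our parent's subtree
        if other_parent == parent and (new.seq, new.id[0]) > (other.seq, other.id[0]):
            break  # `other` is a sibling that sorts after us
        dest += 1  # sibling that sorts before us, or one of its descendants
    doc.insert(dest, new)


def apply_all(items: List[Item]) -> str:
    """Integrate the items in the given order and return the document text."""
    doc: List[Item] = []
    for item in items:
        integrate(doc, item)
    return "".join(i.content for i in doc)


# Agent A types "a" then "b". Concurrently, agent B types "x" then "y" after "a".
item_a = Item(("A", 0), None, 0, "a")
item_b = Item(("A", 1), ("A", 0), 1, "b")
item_x = Item(("B", 0), ("A", 0), 1, "x")
item_y = Item(("B", 1), ("B", 0), 2, "y")
```

Applying the four items in any order that respects causality (for example a, b, x, y or a, x, y, b) yields the same text, "axyb", without ever building the tree.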

Interestingly, the fugue paper first describes fugue (as a tree). Then it identifies & fixes a problem in the algorithm to produce fuguemax. If you do the list insertion order translation on both fugue and fuguemax, fugue ends up with an extra if() statement that causes this problem. If you remove that if statement, you get the (better) fuguemax algorithm.

This transformation results in much better performance, and much lower memory usage. Counter-intuitively, you get another order of magnitude improved performance if you then store this flattened list once more in a b-tree.

If you're curious, here's the equivalent insertion code for fuguemax, rga and yjs. These are all fuzz tested against their upstream reference implementations to verify equivalence. Fugue is also somewhere in this file, if you want to compare.

Here's FugueMax[1]: https://github.com/josephg/reference-crdts/blob/c53947408770...

RGA: https://github.com/josephg/reference-crdts/blob/c53947408770...

And Yjs: https://github.com/josephg/reference-crdts/blob/c53947408770...

As I said, I didn't come up with this idea. Kevin Jahns figured out this trick for Yjs. I adapted it to the other algorithms.

[1] Fuguemax is called "yjsmod" in this repository because this code predates the fugue paper. It turns out our algorithms are equivalent.