Readit News
dloreto commented on Rethinking Infrastructure as Code from Scratch   nathanpeck.com/rethinking... · Posted by u/dguo
codethief · 3 years ago
My thoughts have been going into another direction entirely:

- We need to get rid of YAML. Not only because it's a horrible file format but also because it lacks proper variables, proper type safety, proper imports, proper anything. To this day, usage & declaration search in YAML-defined infrastructure still often amounts to a repo-wide string search. Why are we putting up with this?

- The purely declarative approach to infrastructure feels wrong. For instance, if you've ever had to work on Gitlab pipelines, chances are that already on day 1 you started banging your head against the wall because you realized that what you wanted to implement is not possible currently (at least not without jumping through a ton of hoops), and there's already an open ticket from 2020 in Gitlab's issue tracker. I used to think, how could the Gitlab devs possibly forget to think of that one really obvious use case?! But I've come to realize that it's not really their fault: If you create any declarative language, you as the language creator will have to define what all those declarations are supposed to mean and what the machine is supposed to do when it encounters them. Behind every declaration lies a piece of imperative code. Unfortunately, this means you'll need to think of all potential use cases of your language and your declarations, including combinations and permutations thereof. (There's a reason why it's taken so long for CSS to solve even the most basic use cases.) Meanwhile, imperative languages simply let the user decide what they want. They are much more flexible and powerful. I realize I'm not saying anything new here but it often feels as if DevOps people have forgotten about the benefits of high-level programming languages. Now this is not to say we should start defining all our infrastructure in Java but let's at least allow for a little bit of imperativeness and expressiveness!

dloreto · 3 years ago
I have a similar view to yours: as soon as you need variables, imports, functions or any other type of logic ... the existing "data-only" formats break down. Over time people either invent new configuration languages that enable logic (e.g. cue or jsonnet), or they try to bolt some limited version of these primitives onto their configuration.

My personal take is that at some point you are better off just using a full programming language like TypeScript. We created TySON https://github.com/jetpack-io/tyson to experiment with that idea.

dloreto commented on TySON: TypeScript Object Notation. Use TS as an embeddable config language.   github.com/jetpack-io/tys... · Posted by u/dloreto
dloreto · 3 years ago
JSON + types + functions using TypeScript syntax. Makes it possible to use TypeScript as a configuration language for applications written in `go`, and soon `rust` and other major languages.
dloreto commented on Type-safe, K-sortable, globally unique identifier inspired by Stripe IDs   github.com/jetpack-io/typ... · Posted by u/dloreto
klabb3 · 3 years ago
A couple of suggestions:

Lock down the prefix string now before it’s too late and document it. I see in Go that it’s lowercase ascii, which seems fine except for compound types (like “article-comment”). May be worth looking at allowing a single separator given that many complex projects (and ORMs) can’t avoid them.

The Go implementation has no tests. This is very unit-testable. Add tests goddammit!

For Go, I’d align with Google's UUID implementation, with proper parse functions and an internal byte array instead of strings. Strings are for rendering (and in your case, the prefix). Right now, it looks like the parsing is too permissive, and goes into generation mode if the suffix is empty. And the SplitN+index thing will panic if there are no underscores, no? Anyway, tests will tell.

As for the actual design decisions, I tried to poke holes but I fold! I think this strikes the sweet spot between the different tradeoffs. Well done!

dloreto · 3 years ago
A follow up:

1. We've now implemented pretty thorough testing: https://github.com/jetpack-io/typeid-go/blob/main/typeid_tes...

2. I clarified the prefix in the spec

Thanks for the feedback!
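
The parsing concerns raised in this thread (a missing underscore, an empty suffix) can be illustrated with a small strict parser. This is only a sketch in Python, not the typeid-go code: the function name `parse_typeid`, the non-empty lowercase ASCII prefix rule, and the 26-character suffix length (128 bits at 5 bits per character, rounded up) are assumptions made for illustration.

```python
import re

# Assumed rules for this sketch: a non-empty lowercase ASCII prefix and a
# 26-character suffix drawn from the lowercase Crockford alphabet
# (digits and letters, without i, l, o, u).
PREFIX_RE = re.compile(r"[a-z]+")
SUFFIX_RE = re.compile(r"[0-9a-hjkmnp-tv-z]{26}")


def parse_typeid(text: str) -> tuple[str, str]:
    """Split 'prefix_suffix' strictly; raise ValueError instead of crashing."""
    prefix, sep, suffix = text.partition("_")
    if not sep:
        # No underscore at all: report it rather than indexing a missing part.
        raise ValueError("missing '_' separator")
    if not PREFIX_RE.fullmatch(prefix):
        raise ValueError("prefix must be lowercase ASCII letters")
    if not SUFFIX_RE.fullmatch(suffix):
        # An empty suffix is an error here, never a request to generate an ID.
        raise ValueError("suffix must be 26 lowercase base32 characters")
    return prefix, suffix
```

Parsing and generation stay separate in this sketch, so an empty suffix can never silently produce a fresh identifier.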

dloreto commented on Type-safe, K-sortable, globally unique identifier inspired by Stripe IDs   github.com/jetpack-io/typ... · Posted by u/dloreto
dolmen · 3 years ago
There are no tests.

There is just a single test. Which only tests the decoding of a single known value. No encoding test.

Go has infrastructure for benchmarking and fuzzing. Use it!

Also, you took code from https://github.com/oklog/ulid/blob/main/ulid.go which has "Copyright 2016 The Oklog Authors" but this is not mentioned in your base32.go.

dloreto · 3 years ago
We've now implemented pretty thorough testing: https://github.com/jetpack-io/typeid-go/blob/main/typeid_tes...

Thanks for the feedback!

dloreto commented on Type-safe, K-sortable, globally unique identifier inspired by Stripe IDs   github.com/jetpack-io/typ... · Posted by u/dloreto
inopinatus · 3 years ago
I'm not wild about the Crockford encoding. In practice I've found it to be a flat-out mistake when you come to provide technical support or analysis for values encoded this way. The Crockford alphabet is based on design goals that are rarely encountered in practice, such as pronouncing identifiers over the phone. It introduces ambiguity, which is a disaster for grepping logs or any other circumstances where you might query or cross-reference based on the encoded string instead of the decoded value, then permits hyphens, a leading source of cut-and-paste and line-break errors.

Note that people generally do not type in object identifiers, but they do frequently cut-and-paste them between applications and chat/forum interfaces, forward them by email, search for them in log files. Verbal transmission is rare to non-existent. Under these conditions, pronunciation proves irrelevant, and case-insensitivity becomes an impediment, but consistency and paste/break resilience become necessary.

Base 58 offers a bijective encoding that fits these concerns much more effectively and is more compact to boot. Similarly inspired by Stripe, I've been using type-prefixed base58-encoded UUIDs for object identifiers for some years. user_1BzGURpnHGn6oNru84B3Ri etc.

Edit to add: to be fair to Douglas Crockford, his encoding of base 32 was designed two decades ago, when the usage landscape looked quite different.
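
For comparison, a type-prefixed base58 identifier of the kind described above could be produced roughly as follows. This is a sketch under assumptions: the commenter does not say which base58 alphabet is used, so the widely used Bitcoin-style alphabet is assumed here, and the function names are hypothetical. Padding to a fixed 22 characters keeps the mapping one-to-one, since 58^22 exceeds 2^128 while 58^21 does not.

```python
import uuid

# Assumed alphabet: the Bitcoin-style base58 set (no 0, O, I or l).
B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"
WIDTH = 22  # smallest width that can hold any 128-bit value in base 58


def b58_encode_uuid(value: uuid.UUID) -> str:
    """Encode a UUID as exactly 22 case-sensitive base58 characters."""
    n = value.int
    chars = []
    while n:
        n, rem = divmod(n, 58)
        chars.append(B58[rem])
    return "".join(reversed(chars)).rjust(WIDTH, B58[0])


def b58_decode_uuid(text: str) -> uuid.UUID:
    """Decode strictly: fixed width, known characters only, at most 128 bits."""
    if len(text) != WIDTH:
        raise ValueError("expected 22 characters")
    n = 0
    for ch in text:
        idx = B58.find(ch)
        if idx < 0:
            raise ValueError(f"invalid character {ch!r}")
        n = n * 58 + idx
    if n >> 128:
        raise ValueError("value does not fit in 128 bits")
    return uuid.UUID(int=n)
```

The result is four characters shorter than a 26-character base32 suffix, at the cost of being case-sensitive.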

dloreto · 3 years ago
I hear you ... and I debated using either base58 or base64url. I do like the more compact encoding they provide.

Ultimately I ended up leaning towards a base32 encoding, because I didn't want to pre-suppose case sensitivity. For example, you might want to use the id as a filename, and you might be in an environment where you're stuck with a case insensitive filesystem.

Note that TypeID is using the Crockford alphabet and always in lowercase – *not* the full rules of Crockford's encoding. There's no hyphens allowed in TypeIDs, nor multiple encodings of the same ID with different variations of the ambiguous characters.
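
To make the "alphabet only, always lowercase" point concrete, here is a minimal sketch of such an encoding in Python. It is not the typeid-go implementation; the function names and the fixed 26-character width (128 bits at 5 bits per character, rounded up) are assumptions for illustration.

```python
import uuid

# Crockford's base32 alphabet in lowercase: digits and letters without i, l, o, u.
ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"


def encode_suffix(value: uuid.UUID) -> str:
    """Encode the 128-bit value as 26 lowercase base32 characters."""
    n = value.int
    chars = []
    for _ in range(26):  # 26 * 5 = 130 bits, so the top two bits are always zero
        chars.append(ALPHABET[n & 31])
        n >>= 5
    return "".join(reversed(chars))


def decode_suffix(text: str) -> uuid.UUID:
    """Decode strictly: no hyphens, no uppercase, no aliases for i, l, o or u."""
    if len(text) != 26:
        raise ValueError("suffix must be 26 characters")
    n = 0
    for ch in text:
        idx = ALPHABET.find(ch)
        if idx < 0:
            raise ValueError(f"invalid character {ch!r}")
        n = (n << 5) | idx
    if n >> 128:
        raise ValueError("suffix does not fit in 128 bits")
    return uuid.UUID(int=n)
```

Because the decoder accepts exactly one spelling per value, the encoded string is safe to grep for and to compare directly, and it survives a case-insensitive filesystem.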

dloreto commented on Type-safe, K-sortable, globally unique identifier inspired by Stripe IDs   github.com/jetpack-io/typ... · Posted by u/dloreto
klabb3 · 3 years ago
A couple of suggestions:

Lock down the prefix string now before it’s too late and document it. I see in Go that it’s lowercase ascii, which seems fine except for compound types (like “article-comment”). May be worth looking at allowing a single separator given that many complex projects (and ORMs) can’t avoid them.

The Go implementation has no tests. This is very unit-testable. Add tests goddammit!

For Go, I’d align with Google's UUID implementation, with proper parse functions and an internal byte array instead of strings. Strings are for rendering (and in your case, the prefix). Right now, it looks like the parsing is too permissive, and goes into generation mode if the suffix is empty. And the SplitN+index thing will panic if there are no underscores, no? Anyway, tests will tell.

As for the actual design decisions, I tried to poke holes but I fold! I think this strikes the sweet spot between the different tradeoffs. Well done!

dloreto · 3 years ago
Thanks for the feedback!

We have tests for the base32 encoding which is the most complicated part of the implementation (https://github.com/jetpack-io/typeid-go/blob/main/base32/bas...) but your point stands. We'll add a more rigorous test suite (particularly as the number of implementations across different languages grows, and we want to make sure all the implementations are compatible with each other)

Re: prefix, is the concern that I haven't defined the allowed character set as part of the spec?

dloreto commented on Type-safe, K-sortable, globally unique identifier inspired by Stripe IDs   github.com/jetpack-io/typ... · Posted by u/dloreto
aartav · 3 years ago
I've been doing this kind of thing for years with two notable differences:

1. I don't believe people actually hand type-in these values, so I'm not really concerned about the 'l' vs '1' issue. I do base 32 without `eiou` (vowels) to reduce the likelihood of words (profanity) sneaking in.

2. I add two base-32 characters as a checksum (salted of course). This prevents having to go look at the datastore when the value is bogus either by accident or malice. I'm unsure why other implementations don't do this.

dloreto · 3 years ago
The checksum idea is interesting. I'm considering whether it makes sense to add it as part of the TypeID spec.
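
One way the salted two-character checksum could work is sketched below. The commenter does not say which algorithm is used and nothing like this is part of the TypeID spec, so the choice of HMAC-SHA256 truncated to 10 bits, the alphabet, and all function names are assumptions for illustration only.

```python
import hashlib
import hmac

ALPHABET = "0123456789abcdefghjkmnpqrstvwxyz"  # assumed 32-character alphabet


def checksum(body: str, salt: bytes) -> str:
    """Derive two base32 characters (10 bits) from a keyed hash of the body."""
    digest = hmac.new(salt, body.encode("utf-8"), hashlib.sha256).digest()
    bits = int.from_bytes(digest[:2], "big") >> 6  # keep the top 10 of 16 bits
    return ALPHABET[bits >> 5] + ALPHABET[bits & 31]


def with_checksum(body: str, salt: bytes) -> str:
    """Append the two checksum characters to the identifier body."""
    return body + checksum(body, salt)


def verify(value: str, salt: bytes) -> bool:
    """Check the trailing two characters without touching the datastore."""
    if len(value) < 3:
        return False
    body, tag = value[:-2], value[-2:]
    expected = checksum(body, salt)
    return hmac.compare_digest(expected.encode("utf-8"), tag.encode("utf-8"))
```

With only 10 bits, a random bogus value still passes about once in 1024 attempts, so this filters out most junk before a lookup but does not replace the lookup itself.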
dloreto commented on Type-safe, K-sortable, globally unique identifier inspired by Stripe IDs   github.com/jetpack-io/typ... · Posted by u/dloreto
jtmarmon · 3 years ago
This looks great! Is there a reason one couldn't use this with v4 UUIDs? A quick test shows that they encode/decode just fine. Wondering if I could use the encoded form as a way to niceify our URLs without having to change how the IDs (currently v4 uuids) are stored
dloreto · 3 years ago
The CLI tool will support encoding/decoding any valid UUID, whether v1, v4, or v7. We picked v7 as the definition of the spec, because we need to choose one of them when generating a new random ID, and our opinion is that by default, that should be v7.

We might add a warning in the future if you decode/encode something that is not v7, but if it suits your use-case to encode UUIDv4 in this way, go for it. Just keep in mind that you'll lose the locality property.
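
The locality property mentioned above comes from the UUIDv7 layout, which puts a millisecond timestamp in the most significant 48 bits. The sketch below builds a v7-style value by hand to show the effect; `uuid7_like` is a hypothetical helper written for this illustration, not a TypeID or standard-library function.

```python
import os
import uuid


def uuid7_like(unix_ms: int) -> uuid.UUID:
    """Build a UUIDv7-style value: 48-bit millisecond timestamp, then random bits."""
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits
    value = ((unix_ms & ((1 << 48) - 1)) << 80) | rand
    value = (value & ~(0xF << 76)) | (0x7 << 76)  # version field = 7
    value = (value & ~(0x3 << 62)) | (0x2 << 62)  # variant bits = 10
    return uuid.UUID(int=value)


earlier = uuid7_like(1_700_000_000_000)
later = uuid7_like(1_700_000_000_001)

# IDs created later compare greater, both as integers and as strings, so
# neighbouring inserts land near each other in an ordered index.
print(earlier < later, str(earlier) < str(later))

# A v4 UUID is random throughout, so its sort position says nothing about
# when it was created.
print(uuid.uuid4())
```

Any fixed-width encoding whose alphabet is in ascending character order preserves this ordering in the encoded string as well.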

u/dloreto

Karma: 238 · Cake day: August 15, 2011