A Reply to “Let’s stop copying C”

>NULL/nil is just one of many invalid memory addresses, and in practice most of invalid memory address are not NULL.

I want a language that can check at compile/check time that none of my pointers will have invalid addresses. Recognizing null as just another invalid value makes it more obvious to me that I want a language to handle it differently than how C does it.

In memory-safe languages, it's already unthinkable (or at least very rare outside of situations where you're deliberately doing unsafe/native code integrations) to get non-null invalid pointers that point to a different type of thing than you want. When was the last time you had a Java program crash because a typed reference actually had a different type of reference at runtime somehow? Isn't it great how that basically never happens? But if you consider nulls a different type of reference, then it does actually happen sometimes. It would be great if we could try to close off that issue.

I'm a huge fan of languages with non-nullable references as the standard, like Kotlin, Typescript, and Rust. In my experience, a codebase's use of nulls is much easier to understand when references are non-nullable by default, so bugs where a later programmer passes a nullable value somewhere it doesn't belong happen much less often.

enriquto · 6 years ago

> I want a language that can check at compile/check time that none of my pointers will have invalid addresses.

This is incompatible with pointer arithmetic, as it would require solving the halting problem. You could still verify it at run time.

AgentME · 6 years ago

Right, but another option is to eschew pointer arithmetic. Iterators in many languages cover many uses of pointer arithmetic, and can be designed to compile to the same sort of code that pointer-arithmetic-heavy C compiles to.

C wouldn't ever be able to take out pointer arithmetic, so I worry anyone envisioning a new language as some diff from C is probably going to get stuck on that sort of thing too. I'm a big fan of the original referenced article by Eevee for bringing this sort of thing up.

kragen · 6 years ago

It's entirely possible for a compiler to reject all programs that it can't verify as safe at compile time, including array bounds checks and pointer arithmetic. It's true that the uncomputability of the halting problem means that the compiler must reject some safe programs to achieve this. Doing this in practice probably requires using dependent types.

dgellow · 6 years ago

> I'm a huge fan of languages with non-nullable references as the standard, like Kotlin, Typescript, and Rust.

What about C++ references? They cannot be null, have to be initialized, and cannot be reinitialized or made to target another object.

pjc50 · 6 years ago

You can trivially initialize one to null by setting it to reference a dereference of a null pointer.

That's undefined behavior, but the big weakness of C is that undefined behavior nearly always compiles without even a warning.

(I was going to suggest that a dialect of C in which undefined behavior would never compile would be very useful, but then I remembered that it wouldn't be able to do integer arithmetic)

kaetemi · 6 years ago

In theory. In practice, there are almost always ways.

TIL the C variable declaration syntax is meant to evoke how you would use the variable. For instance,

  int *x[3]

means that to get an int, I type

  *x[3]

(well ok, 3 would be a poor choice). Whereas if I have

  int (*x)[3]

I'd dereference it as

  (*x)[0]

meaning, it's a pointer to array of ints, while the first is an array of int pointers.

This is mildly life-changing.

ensiferum · 6 years ago

You've got an off-by-one bug in your first example ;-)

wruza · 6 years ago

If we used languages with 1-based arrays, the first element would be 1, the last would be n, and find(x) could return 0 as an “item not found” marker instead of returning -1 or, worse, uint_max. Reasoning about counting from the end would be less prone to off-by-one errors: n+1-i instead of n-i-1, n-1-i or n-(i+1), whichever nonsense you like better. Looping: i=1, i<=n. Setting next: a[n+1]=x, where the capacity allows.

But when you mention one of these languages, you get a bunch of “oh, 1-based arrays, so uncool, not an option”.

jlebar · 6 years ago

You mean "(well ok, 3 would be a poor choice)"?

alfiedotwtf · 6 years ago

You want something even more life-changing? There's the Spiral Rule: http://c-faq.com/decl/spiral.anderson.html

rcxdude · 6 years ago

I think the 'declaration follows use' insight is far more useful than the 'spiral rule', which manages to get the same result but completely obscures the intuition.

dooglius · 6 years ago

Looking at it this way also makes it easier to remember/understand function pointer syntax, as well as attributes like const and volatile when placed between asterisks.

dang · 6 years ago

The referenced article was discussed last year: https://news.ycombinator.com/item?id=18977460

and a bit at the time: https://news.ycombinator.com/item?id=13079341

eximius · 6 years ago

Hadn't read the original essay, but Eevee is always a great blog.

Honestly, Rust hits a lot of good points for me. My only concern so far has been `.await` and a minor concern that they'll keep adding junk and end up like Perl with too many features.

zozbot234 · 6 years ago

Rust has a standardized 'edition' system, so they can deprecate superseded/junk features from newer editions of the language while still playing nice with legacy code targeting an older 'edition'.

That's true and I hadn't really thought of it being used for removing features!

But, also, the features can only be removed from the language. The compiler must support them forever. :/

DarthGhandi · 6 years ago

It's sugar for existing features. "await"ing existed before but required some really messy syntax to achieve the same thing. It's very much an incredible improvement over the status quo.

I'm aware. I just believe that the design decision around `future.await` was dumb.

Preface: this is a minor syntactical annoyance that aesthetically and in principle annoys me a great deal, but in practice is unlikely to cause much confusion. The rest of this should be considered a lighthearted rant.

`future.await` looks like it should be field access, not running arbitrary code for a future. It should have been either a keyword (a la `await future`) or some language built-in trait method (like `future.await()`), for which there is precedent among things that require compiler magic. The arguments against macro- or method-like syntax were "but it's not a function in the mathematical sense", which is ironic given they chose something that looks like field access, and irrelevant in that I could have a function call inline assembly and just jump elsewhere, ignoring their concept of mathematical functions and stacks anyway. I'll admit that the post-`future` syntaxes have the benefit of chaining, which is why I prefer `future.await()` over `await future` or `await!(future)`, but I still maintain that `future.await` was the wrong choice.

So my main problem is that they made a poor design choice from what seemed an obvious pool of candidates (to me). The worry is that it'll happen again for more bizarre features that happen to be lobbied for.

/Endrant

zzo38computer · 6 years ago

I do not agree with all of them. I think assignment expressions are good, and textual inclusion is good (but should not be the only kind of inclusion), and increment/decrement operators are good, and macros are good (although the way C does it is not good enough; there are many things it doesn't do), and pointer arithmetic is good, and goto is good. But I agree with them that the C syntax for octal numbers is bad, and the C syntax for types is bad. Identifiers should not have hyphens if the same sign is also used as an operator. Strings should be byte strings which have no "default" encoding other than ASCII (although UTF-8 and other encodings are still usable too, if they are compatible with ASCII). Another problem with C is that it does not allow null declarations, and does not allow duplicate declarations even if they are the same. There are many other problems with C as well; some kinds of low-level stuff are too difficult with C.

RealityVoid · 6 years ago

>> some kind of low-level stuff is too difficult with C.

Surprised to hear this. I do C for a living, including some low level stuff (not x86) and there aren't a lot of low level things I feel I couldn't do. Can you expand a bit?

jstimpfle · 6 years ago

> Another problem with C is that it does not allow null declarations, and does not allow duplicate declarations even if they are the same.

What do you mean here?

Well, it works now (so that problem no longer exists); but on the computer I used before this one, it didn't work.

axaxs · 6 years ago

I'm having trouble following the negative modulo wording or formula. It says % and %% are identical for unsigned integers then below defines it as

a %% b == ((a % b) + a) % b

If that's the case I can't figure out how they are identical for unsigned integers. Or is that example only a hint for how it would work with negatives?

klyrs · 6 years ago

I don't precisely follow the text either; it looks like that formula computes ±(2a)%b.

Perhaps it's a typo for

a %% b == ((a % b) + b) % b

Which brings a negative modulus into the range [0, b)

mjevans · 6 years ago

I think you are probably correct. Let's try a set of simple cases.

    ( 4  % 3) == 1
    ( 4 %% 3) == 1

Are these two correct answers?

    (-4  % 3) == -1 ??
    (-4 %% 3) == 2 ??

Let's assume I got that correct and plug in some numbers.

     a %% b == (( a % b) +  **b** ) % b
    -4 %% 3 == ((-4 % 3) + 3 ) % 3
    -4 %% 3 == (-1 + 3) % 3
    -4 %% 3 == 2

However I think it would be clearer for maintenance if the extra operator wasn't used and instead some 'math.absolute()' function were used.

PS: Hopefully those are short enough to not die on mobile

gingerBill · 6 years ago

That was a typo and has now been corrected.

mckinney · 6 years ago

I hadn't given Odin a look before I read this post. As a fellow general purpose language author (Gosu) I applaud the author's pragmatic, albeit unpopular at times, point of view. For instance, his position regarding Null is spot on re the "streetlight effect" reference. Others include:

* pascal-style declarations
* type system
* multiple returns
* strings
* switch

Also I would not downplay the advantages of operator tokens && ||. More than just familiarity with C-family developers, in my experience, they are more generally effective as expression delimiters. They stand out better than 'and' 'or', which is better for readability.

kazinator · 6 years ago

The direction of modulo with negative operands (and of division) was implementation-defined in C90.

At least it is pinned down now.

Common Lisp has a number of different modulo operators that go every which way.

Firstly there are the single-valued mod and rem:

http://clhs.lisp.se/Body/f_mod_r.htm

Secondly, the floor, ceiling, truncate and round functions also return a remainder:

http://clhs.lisp.se/Body/f_floorc.htm