Brings back some memories because this was the first "large" program available to me for study when I was first learning to program, around 1978.
Having talked my way into using the local university EE dept's 11/45, I found the source tree was mounted (they had a 90MB CDC "washing machine" drive) and decided to print it out on the 132-col Tally dot-matrix printer.
Some minutes into this, the sysadmin bursts into the terminal room, angrily asking "who's running this big print job".
I timidly raise my hand. He asks what I'm printing. I tell him the C compiler source code because I want to figure out how it works. He responds "Oh, that's ok then, no problem, let me know if you need more paper loaded or a new ribbon".
I don't think it would look that out of place if he had used the ternary operator, which is the same thing after all.
E.g.:
if (peekc) {
    c = peekc;
    peekc = 0;
} else
    eof ?
        return(0) :
        c = getchar();
The first else clause still looks weird, but the final part isn't nearly as out of place (well, I guess assigning in a ternary would be weird, but in terms of indentation it fits), and it's not like we actually changed anything.
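(Worth noting: return can't appear inside ?: in standard C, so a version that actually compiles has to keep the EOF check as an if. A sketch, reusing the names from the snippet above:)

if (peekc) {
    c = peekc;
    peekc = 0;
} else {
    if (eof)
        return 0;    /* can't be folded into the ternary */
    c = getchar();
}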
Why nightmarish? A reviewer may explain that it makes it easy for another human to overlook the end of the conditional and carry on as usual. I never understood the emotional component of blaming someone for their personal code style, as if it were a religion with sacrifices and blasphemy instead of just a practical way of coding in a heterogeneous team.
This triggers me because many people jump on “bad c0de” in the forums, but then you read some practical code on GitHub and it is (a) not nearly as perfectly beautiful as they imagine, (b) still readable, without the nightmares they promised, and (c) the algorithms and structure themselves demand a level of perception and understanding far beyond “it's Monday morning, so I feel like missing the end of a statement in a hand-written scanner” anyway.
It makes sense that long would have needed a comment. It needs a comment because “long” and “double” are terrible names for data types. Long what? Double-length what? Those type names could easily have had the opposite meanings: long floating point / double-length integer. WORD/DWORD are nearly as bad; calling something a "word" incorrectly implies the data type has something to do with strings.
If you don't believe me, ask a non-programmer friend what kind of thing an "integer" is in a computer program. Then ask them to guess what kind of thing a "long" is.
The only saving grace of these terms is they’re relatively easy to memorise. int_16/int_32/int_64 and float_32/float_64 (or i32/i64/f32/f64/...) are much better names, and I'm relieved that’s the direction most modern languages are taking.
(Edit: Oops I thought Microsoft came up with the names WORD / DWORD. Thanks for the correction!)
Why should a non-programmer understand programming terms? Words have different meanings in different contexts. That's how words work. There is no need to make these terms understandable to everyone. The layman does not need to understand the meaning of long or word in C source code.
Ask a non-golfer what an eagle is, or ask a physicist, a mathematician, and a politician the meaning of "power".
Word and long may have been poor word choices, but asking a non-programmer is not a good way to test it.
int is neither 32 nor 64 bits. Its width corresponded to a platform's register width, which is now even blurrier: today's 64 bits is often too big for practical use, CPUs have separate instructions for 16/32/64-bit operations, and we have informally agreed that int should probably be 32 bits and long 64, but the latter depends on the platform ABI. So ints with modifiers may be 16, 32, 64, or even 128 bits on some future platform. The intN_t names are separate fixed-width types (see also int_leastN_t, int_fastN_t, etc. in stdint.h; see also limits.h).
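A quick C99 sketch of that difference (nothing here is from the 1972 compiler; it just prints the widths via sizeof and CHAR_BIT):

#include <stdio.h>
#include <stdint.h>
#include <limits.h>

int main(void) {
    /* int and long widths depend on the platform ABI... */
    printf("int:           %zu bits\n", sizeof(int) * CHAR_BIT);
    printf("long:          %zu bits\n", sizeof(long) * CHAR_BIT);
    /* ...while the stdint.h names are exact (intN_t) or bounded (int_leastN_t, int_fastN_t) */
    printf("int32_t:       %zu bits\n", sizeof(int32_t) * CHAR_BIT);
    printf("int64_t:       %zu bits\n", sizeof(int64_t) * CHAR_BIT);
    printf("int_least16_t: %zu bits\n", sizeof(int_least16_t) * CHAR_BIT);
    printf("int_fast32_t:  %zu bits\n", sizeof(int_fast32_t) * CHAR_BIT);
    return 0;
}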
Also, don’t forget about short, it feels so sad and alone!
In C, long is not the name of a data type; it is a modifier. The default C type is int, so if you say long without another data type (such as double, for example), it means long int.
That made me curious. From section 2.2 of the 2nd edition of K&R, 'long' is not a type but a qualifier that applies to integers (not floats though), so you can declare a 'long int' type if you prefer.
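To make that concrete (modern C, not the 1972 compiler), a bare "long" and "long int" name the same type:

long a;            /* "long" alone is shorthand for "long int"   */
long int b;        /* the same type, spelled out                  */
unsigned long c;   /* likewise shorthand for "unsigned long int"  */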
It was written in B. And B was written in B. To trace the bootstrap up into a high-level language, you have to go further back than the PDP-11. To bootstrap B on the PDP-7 and GE-635 machines he had access to, Thompson wrote the minimal subset required to compile a compiler in B, in a language known as TMG (in the vein of, and a big influence on, Yacc and similar tools). This minimal bootstrap was then used to compile a B compiler written in B, and further development was self-hosted.
Later, the language would be retargeted to the PDP-11 while on the PDP-7. Various changes, like byte-addressed rather than word-addressed memory, led to it morphing into C after it was moved. There was no clear line between B and C -- the language was self-hosting the whole time as it changed from B into C.
I like that the Bootstrappable Builds folks are working on a bootstrap process that goes from a small amount of machine code (~512 bytes) all the way up to a full Linux distro, including compilers/interpreters for modern languages.
When people say bootstrapping, is this how all computer languages came to be? Or have there been multiple attempts at bootstrapping? What I'm getting at is whether all languages have a single root.
Thank you for pointing this out. It obviously can't be his first C compiler because it's written in C! :)
I'm not even sure it's the first C compiler written in C, though - it just says in the github description "the very first c compiler known to exist in the wild."
Regardless, if it's from 1972 it's a very early version.
> Thank you for pointing this out. It obviously can't be his first C compiler because it's written in C! :)
This isn't obvious to me.
I just assumed that the first iteration was compiled by hand to bootstrap a minimum workable version. Then the language would be extended slightly, and that version would be compiled with the compiler v. n-1 until a full-feature compiler is made.
It makes sense to write a compiler in a different language, but given the era, I could see hand-compilation still being a thing.
dmr was one of the first C programmers, if not the first.
The C language (and all of Unix) was designed to be very terse as a consequence.
https://news.ycombinator.com/item?id=14669709
https://news.ycombinator.com/item?id=5748672
“auto” exists in the initial version.
Edit: I know it's also a storage specifier, but it also does type deduction here; hope I'm not confusing the terminology.
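My reading, for whatever it's worth: what looks like deduction in this early code is the type defaulting to int when only a storage class is written; auto didn't gain real type inference until C23. A rough sketch (inside a function body):

auto c;       /* old C: automatic storage, type defaults to int */
auto int d;   /* the same thing, spelled out                    */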
Interestingly, long was commented
"Word" as a term has been in the wide use since at least 50s-60s, you can't really blame MS for that
https://en.wikipedia.org/wiki/Word_(computer_architecture)
The PDP-7, PDP-9, and PDP-15 had 18-bit registers, and the PDP-10 had 36-bit words. The world had not settled on 16/32/64 bits at all.
Even the Intel 80286's far pointers only address 24 bits of physical memory.
"long" was commented out.
The real mistake in retrospect is that int and long are platform-dependent. This is an amazing time sink when writing portable programs.
For some reason C programmers looked down on the exact-width integer types for a long time.
The base types should have been exact width from the start, and the cool-sounding names like int and long should have been typedefs.
In practice, I consider this a larger problem than the often-cited NULL.
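A small illustration of that time sink, assuming the common LP64 (Linux/macOS) vs. LLP64 (64-bit Windows) ABIs, where long is 64 and 32 bits respectively; the fixed-width type plus its inttypes.h format macro is the portable spelling:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
    int64_t big = (int64_t)1 << 40;

    /* Non-portable: "long x = big;" silently truncates on LLP64,
       and "%ld" would be the wrong format specifier there anyway. */

    /* Portable: exact-width type plus its matching format macro. */
    printf("big = %" PRId64 "\n", big);
    return 0;
}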
I think you misunderstood. There's no explanatory comment. The "long" keyword is commented out, meaning that it was planned but not yet implemented.
(b) Many important machines had word sizes that were not a multiple of 8.
(I know it's floating point, but it's the same as long/double).
"for" is missing too.
Mr. Ritchie wrote a history from his perspective published in 1993. I've mostly just summarized it above. It's available here: https://web.archive.org/web/20150611114355/https://www.bell-...
https://bootstrappable.org/
- Rewrite the B compiler in B (generating threaded code).
- Extend B into a language Ritchie called NB ("new B"). This compiler generated PDP assembly. There is no version of the NB compiler known to exist.
- Continue extending NB until it became the very early versions of C.
You can read the longer version of this history here:
https://www.bell-labs.com/usr/dmr/www/chist.html
https://sci-hub.do/10.1145/155360.155580
https://github.com/mortdeus/legacy-cc/blob/master/prestruct/...
char waste[however-many-bytes-are-needed];
?