Brings back some memories because this was the first "large" program available to me for study when I was first learning to program, around 1978.
Having talked my way into using the local university EE dept's 11/45, I found the source tree was mounted (they had a 90MB CDC "washing machine" drive) and decided to print it out on the 132-col Tally dot-matrix printer.
Some minutes into this, the sysadmin bursts into the terminal room, angrily asking "who's running this big print job".
I timidly raise my hand. He asks what I'm printing. I tell him the C compiler source code because I want to figure out how it works. He responds "Oh, that's ok then, no problem, let me know if you need more paper loaded or a new ribbon".
I don't think it would look that out of place if he had used the ternary operator, which is the same thing after all.
E.g.:
if (peekc) {
    c = peekc;
    peekc = 0;
} else
    eof ?
        return(0) :
        c = getchar();
The first else clause still looks weird, but the final part isn't nearly as out of place (well, I guess assigning in a ternary would be weird, but in terms of indentation it fits), and it's not like we actually changed anything.
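(Worth noting: return can't appear inside ?: in standard C, so a version that actually compiles has to keep the EOF check as an if. A sketch, reusing the names from the snippet above:)

if (peekc) {
    c = peekc;
    peekc = 0;
} else {
    if (eof)
        return 0;    /* can't be folded into the ternary */
    c = getchar();
}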
Why nightmarish? A reviewer may explain that it makes it easy for another human to overlook the end of the conditional and carry on as usual. I never understood the emotional component of blaming someone for their personal code style, as if it were a religion with sacrifices and blasphemy instead of just a practical way of coding in a heterogeneous team.
This triggers me because many people jump on “bad c0de” in the forums, but then you read some practical code on GitHub and it is (a) not nearly as perfectly beautiful as they imagine, (b) still readable, without the nightmares they promised, and (c) the algorithms and structure themselves demand a level of perception and understanding far beyond “it's Monday morning, so I feel like missing the end of a statement in a hand-written scanner” anyway.
It makes sense that long would have needed a comment. It needs a comment because “long” and “double” are terrible names for data types. Long what? Double-length what? Those type names could easily have had the opposite meanings: long floating point / double-length integer. WORD/DWORD are nearly as bad; calling something a "word" incorrectly implies the data type has something to do with strings.
If you don't believe me, ask a non-programmer friend what kind of thing an "integer" is in a computer program. Then ask them to guess what kind of thing a "long" is.
The only saving grace of these terms is they’re relatively easy to memorise. int_16/int_32/int_64 and float_32/float_64 (or i32/i64/f32/f64/...) are much better names, and I'm relieved that’s the direction most modern languages are taking.
(Edit: Oops I thought Microsoft came up with the names WORD / DWORD. Thanks for the correction!)
Why should a non-programmer understand programming terms? Words have different meanings in different contexts. That's how words work. There is no need to make these terms understandable to everyone. The layman does not need to understand the meaning of long or word in C source code.
Ask a non-golfer what an eagle is, or ask a physicist, a mathematician, and a politician the meaning of "power".
Word and long may have been poor word choices, but asking a non-programmer is not a good way to test it.
int is neither 32 nor 64 bits. Its width corresponded to a platform's register width, which is now even blurrier: today's 64 bits is often too big for practical use, CPUs have separate instructions for 16/32/64-bit operations, and we have informally agreed that int should probably be 32 bits and long 64, but the latter depends on the platform ABI. So ints with modifiers may be 16, 32, 64, or even 128 bits on some future platform. The intN_t names are separate fixed-width types (see also int_leastN_t, int_fastN_t, etc. in stdint.h; see also limits.h).
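A quick C99 sketch of that difference (nothing here is from the 1972 compiler; it just prints the widths via sizeof and CHAR_BIT):

#include <stdio.h>
#include <stdint.h>
#include <limits.h>

int main(void) {
    /* int and long widths depend on the platform ABI... */
    printf("int:           %zu bits\n", sizeof(int) * CHAR_BIT);
    printf("long:          %zu bits\n", sizeof(long) * CHAR_BIT);
    /* ...while the stdint.h names are exact (intN_t) or bounded (int_leastN_t, int_fastN_t) */
    printf("int32_t:       %zu bits\n", sizeof(int32_t) * CHAR_BIT);
    printf("int64_t:       %zu bits\n", sizeof(int64_t) * CHAR_BIT);
    printf("int_least16_t: %zu bits\n", sizeof(int_least16_t) * CHAR_BIT);
    printf("int_fast32_t:  %zu bits\n", sizeof(int_fast32_t) * CHAR_BIT);
    return 0;
}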
Also, don’t forget about short, it feels so sad and alone!
In C, long is not the name of a data type; it is a modifier. The default C type is int, so if you say long without another data type (such as double, for example), it means long int.
That made me curious. From section 2.2 of the 2nd edition of K&R, 'long' is not a type but a qualifier that applies to integers (not floats though), so you can declare a 'long int' type if you prefer.
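To make that concrete (modern C, not the 1972 compiler), a bare "long" and "long int" name the same type:

long a;            /* "long" alone is shorthand for "long int"   */
long int b;        /* the same type, spelled out                  */
unsigned long c;   /* likewise shorthand for "unsigned long int"  */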
It was written in B. And B was written in B. To trace the bootstrap up into a high-level language, you have to go further back than the PDP-11. To bootstrap B on the PDP-7 and GE-635 machines he had access to, Thompson wrote the minimal subset required to compile a compiler in B, in a language known as TMG (in the vein of, and a big influence on, Yacc and similar tools). This minimal bootstrap was then used to compile a B compiler written in B, and further development was self-hosted.
Later, the language would be retargeted to the PDP-11 while on the PDP-7. Various changes, like byte-addressed rather than word-addressed memory, led to it morphing into C after it was moved. There was no clear line between B and C -- the language was self-hosting the whole time as it changed from B into C.
I like that the Bootstrappable Builds folks are working on a bootstrap process that goes from a small amount of machine code (~512 bytes) all the way up to a full Linux distro, including compilers/interpreters for modern languages.
When people say bootstrapping, is this how all computer languages came to be? Or have there been multiple attempts at bootstrapping? What I'm getting at is whether all languages have a single root.
Thank you for pointing this out. It obviously can't be his first C compiler because it's written in C! :)
I'm not even sure it's the first C compiler written in C, though - it just says in the github description "the very first c compiler known to exist in the wild."
Regardless, if it's from 1972 it's a very early version.
> Thank you for pointing this out. It obviously can't be his first C compiler because it's written in C! :)
This isn't obvious to me.
I just assumed that the first iteration was compiled by hand to bootstrap a minimum workable version. Then the language would be extended slightly, and that version would be compiled with the compiler v. n-1 until a full-feature compiler is made.
It makes sense to write a compiler in a different language, but given the era, I could see hand-compilation still being a thing.
dmr was one of the first C programmers, if not the first.
The C language (and all of Unix) was designed to be very terse as a consequence.
https://news.ycombinator.com/item?id=14669709
https://news.ycombinator.com/item?id=5748672
“auto” exists in the initial version.
Edit: I know it's also a storage specifier, but it also does type deduction here; hope I'm not confusing the terminology.
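My reading, for whatever it's worth: what looks like deduction in this early code is the type defaulting to int when only a storage class is written; auto didn't gain real type inference until C23. A rough sketch (inside a function body):

auto c;       /* old C: automatic storage, type defaults to int */
auto int d;   /* the same thing, spelled out                    */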
Interestingly, long was commented
"Word" as a term has been in the wide use since at least 50s-60s, you can't really blame MS for that
https://en.wikipedia.org/wiki/Word_(computer_architecture)
The PDP-7, PDP-9, and PDP-15 had 18-bit registers, and the PDP-10 had 36-bit words. The world had not settled on 16/32/64 bits at all.
Even the Intel 80286's far pointers only address 24 bits of physical memory.
"long" was commented out.
The real mistake in retrospect is that int and long are platform-dependent. This is an amazing time sink when writing portable programs.
For some reason C programmers looked down on the exact-width integer types for a long time.
The base types should have been exact width from the start, and the cool-sounding names like int and long should have been typedefs.
In practice, I consider this a larger problem than the often-cited NULL.
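A small illustration of that time sink, assuming the common LP64 (Linux/macOS) vs. LLP64 (64-bit Windows) ABIs, where long is 64 and 32 bits respectively; the fixed-width type plus its inttypes.h format macro is the portable spelling:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>

int main(void) {
    int64_t big = (int64_t)1 << 40;

    /* Non-portable: "long x = big;" silently truncates on LLP64,
       and "%ld" would be the wrong format specifier there anyway. */

    /* Portable: exact-width type plus its matching format macro. */
    printf("big = %" PRId64 "\n", big);
    return 0;
}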
I think you misunderstood. There's no explanatory comment. The "long" keyword is commented out, meaning that it was planned but not yet implemented.
(b) Many important machines had word sizes that were not a multiple of 8.
(I know it's floating point, but it's the same as long/double).
"for" is missing too.
Mr. Ritchie wrote a history from his perspective published in 1993. I've mostly just summarized it above. It's available here: https://web.archive.org/web/20150611114355/https://www.bell-...
https://bootstrappable.org/
- Rewrite the B compiler in B (generating threaded code).
- Extend B into a language Ritchie called NB ("new B"). This compiler generated PDP assembly. There is no version of the NB compiler known to exist.
- Continue extending NB until it became the very early versions of C.
You can read the longer version of this history here:
https://www.bell-labs.com/usr/dmr/www/chist.html
https://sci-hub.do/10.1145/155360.155580
https://github.com/mortdeus/legacy-cc/blob/master/prestruct/...
char waste[however-many-bytes-are-needed];
?