The first publicly available version of Oracle Database (v2, released in 1979) was written in assembly for the PDP-11. Oracle then rewrote v3 in C (1983) for portability across platforms. The mainframes of the time didn't have C compilers, so instead of writing a mainframe-specific database product in a different language (COBOL?), they just wrote a C compiler for the mainframe too.
UNIX was ported to the System/370 in 1980, but it ran on top of TSS, which I understand was an obscure product.
"Most of the design for implementing the UNIX system for System/370 was done in 1979, and coding was completed in 1980. The first production system, an IBM 3033AP, was installed at the Bell Laboratories facility at Indian Hill in
early 1981."
Interesting. Summer 84/85 (maybe 85/86) I used a port of PCC to System/360 (done, I believe, by Scott Kristjanson) on the University of British Columbia mainframes (Amdahls running MTS). I was working on mail software, so I had to deal with EBCDIC/ASCII issues, which was no fun.
I sometimes wonder if that compiler has survived anywhere.
> The first publicly available version of Oracle Database (v2 released in 1979) was written in assembly for PDP-11.
I wonder if anybody still has a copy of Oracle v2 or v3?
Oldest I've ever seen on abandonware sites is Oracle 5.1 for DOS
> The mainframes at the time didn't have C compilers
Here's a 1975 Bell Labs memo mentioning that C compilers at the time existed for three machines [0] – PDP-11 UNIX, Honeywell 6000 GCOS, and "OS/370" (which is a bit of a misnomer, I think it actually means OS/VS2 – it mentions TSO on page 15, which rules out OS/VS1)
That said, I totally believe Oracle didn't know about the Bell Labs C compiler, and Bell Labs probably wouldn't share it if they did, and who knows if it had been kept up to date with newer versions of C, etc...
SAS paid Lattice to port their C compiler to MVS and CMS circa 1983/1984, so probably around the same time Oracle was porting Oracle to IBM mainframes – because I take it they also didn't know about or couldn't get access to the Bell Labs compiler
Whereas, Eric Schmidt succeeded in getting Bell Labs to hand over their mainframe C compiler, which was used by the Princeton Unix port, which went on to evolve into Amdahl UTS. So definitely Princeton/Amdahl had a mainframe C compiler long before SAS/Lattice/Oracle did... but maybe they didn't know about it or have access to it either. And even though the original Bell Labs C compiler was for MVS (aka OS/VS2 Release 2–or its predecessor SVS aka OS/VS2 Release 1), its Amdahl descendant may have produced output for Unix only
I assume whatever C compiler AT&T's TSS-based Unix port (UNIX/370) used was also a descendant of the Bell Labs 370 C compiler. But again, it probably produced code only for Unix not for MVS, and probably wasn't available outside of AT&T either
I very much doubt anyone from the time wants to talk about it, but there is substantial bad blood between Oracle and Ingres. I believe not all of this story is in the public domain, nor capable of being discussed without lawyers.
Was it actually that uncommon back then? My understanding is that other things (including Unix itself, which predated C and was only rewritten in it later) were initially written in assembly back in the 70s. Maybe Oracle is much larger than other things done this way than I realize, or maybe the veneration of Unix history has just been part of my awareness for too long, but for some reason hearing that this happened with Oracle doesn't hit as hard for me as it seems to for you. It's possible I've become so accustomed to something historically significant that I fail to be impressed by a similar feat, but I genuinely thought that assembly was just the language used for low-level stuff for a long time (not that I'm saying there weren't other systems languages besides C, but my recollection is having read that for a while some people were skeptical of the idea of using any high-level language in place of assembly for systems programming).
SQLite error messages are similarly spartan. I wrote a SQLite extension recently and didn't find it difficult to have detailed/dynamic error messages, so it may have just been a preference of the author.
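For anyone curious, this is roughly all it takes in a user-defined function. The function and message below are made up, but sqlite3_result_error() copies whatever string you hand it, so building a dynamic message with snprintf is straightforward (a minimal sketch, not SQLite's own code):

    #include <sqlite3.h>
    #include <stdio.h>

    /* Hypothetical scalar function: rejects negative arguments with a
       detailed, dynamically built error message. */
    static void half_func(sqlite3_context *ctx, int argc, sqlite3_value **argv)
    {
        (void)argc;
        double v = sqlite3_value_double(argv[0]);
        if (v < 0) {
            char msg[128];
            snprintf(msg, sizeof msg,
                     "half(): argument must be non-negative, got %g", v);
            sqlite3_result_error(ctx, msg, -1);  /* SQLite copies the string */
            return;
        }
        sqlite3_result_double(ctx, v / 2);
    }

Register it with sqlite3_create_function() and the message shows up as the failing statement's error text, so the spartan messages elsewhere really do look like a stylistic choice.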
It's an awkward way to reserve memory. The important detail here is that both compiler phases do this, and the way the programs are linked guarantees that the reserved region has the same address in both phases. Therefore an expression tree involving pointers can be passed to the second phase very succinctly. Not pretty, no, but hardware limitations force you to come up with strange solutions sometimes.
It's much less of your own code if you use TCL (THINK Class Library), which shipped with THINK C 4.0 (and THINK Pascal) in mid 1989.
Your System 6.0.8 is from April 1991, so TCL was well established by then and the C/C++ version in THINK C 5 even used proper C++ features instead of the hand-rolled "OOP in C" (nested structs with function pointers) used by TCL in THINK C 4.
I used TCL for smaller projects, mostly with THINK Pascal which was a bit more natural using Object Pascal, and helped other people use it and transition their own programs that previously used the Toolbox directly, but my more serious programs used MacApp which was released for Object Pascal in 1985, and for C++ in 1991.
Thanks for this. I was using THINK C 3.x last night, unaware that there was a 5.0. I figured it out as I typed and googled this morning. I will have to revisit 5.0 and pick up a digitised book.
There is a variable declared right before the waste space function. The 'wasted' space is statically allocated memory for the variable 'ospace' just before it.
There's nothing in that repo that says, but at a guess: old machines often had non-uniform ways to access memory, so it may have been to test that the compiler would still work if the binary grew over some threshold.
Even today's machines often have a limit as to the offset that can be included in an instruction, so a compiler will have to use different machine instructions if a branch or load/store needs a larger offset. That would be another thing that this function might be useful to test. Actually that seems more likely.
It might be instructive to compare the binary size of this function to the offset length allowed in various PDP-11 machine instructions.
Wild guess: it was a way to offset the location of the "main" function by an arbitrary amount of bytes. In the a.out binary format, this translates to an entry point which is not zero.
" A second, less noticeable, but astonishing peculiarity is the space allocation: temporary storage is allocated that deliberately overwrites the beginning of the program, smashing its initialization code to save space. The two compilers differ in the details in how they cope with this. In the earlier one, the start is found by naming a function; in the later, the start is simply taken to be 0. This indicates that the first compiler was written before we had a machine with memory mapping, so the origin of the program was not at location 0, whereas by the time of the second, we had a PDP-11 that did provide mapping. (See the Unix History paper). In one of the files (prestruct-c/c10.c) the kludgery is especially evident. "
Looks like "extern" is used to bring global symbols into function scope. Everything looks to be "int" by default. Some array declarations are specifying a size, others are not. Are the "sizeless" arrays meant to be used as pointers only?
>Looks like "extern" is used to bring global symbols into function scope.
a better way to think of extern is, "this symbol is not declared/defined/allocated here, it is declared/defined/allocated someplace else"
"this is its type so your code can reference it properly, and the linker will match up your references with the declared/defined/allocated storage later"
(i'm using reference in the generic english sense, not pointer or anything. it's "that which can give you not only an r-value but an l-value")
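A tiny two-file sketch of what that means in practice (names invented; the pattern, an extern declaration inside a function, is what the old compiler source does):

    /* storage.c -- this file actually allocates the storage */
    int peekc = 0;

    /* user.c -- this file only references it; extern says "the storage
       lives someplace else", and the linker matches the two up */
    int peek(void)
    {
        extern int peekc;   /* declaration only, no storage allocated here */
        return peekc;
    }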
Yes, pretty much. To be fair, C at this point was basically BCPL with slightly different syntax (and better char/string support). The introduction of structs (and then longs) changed it forever.
"auto" used to mean automatic memory management because if you are coming from assembly or even some other older higher-level languages you can't just declare a local variable and use it as you please. You must declare somewhere to store it and manage its lifetime (even if that means everything is global).
C and its contemporaries introduced automatic or in modern terms local or stack allocated values, often with lexically-scoped lifetimes. extern meaning something outside this file declares the storage for it and register meaning the compiler should keep the value in a register.
However auto has always been the default and thus redundant and style-wise almost no one ever had the style of explicitly specifying auto so it was little-used in the wild. So the C23 committee adopted auto to mean the same as C++: automatically infer the type of the declaration.
You can see some of B's legacy in the design of C. Making everything int by default harkens back to B's lack of types because everything was a machine word you could interpret however you wanted.
The same goes for original C's function declarations, which don't really make sense by today's standards. The declaration only gives the name (and return type), and the function definition then lists the parameter names in the parentheses and declares their types between the closing paren and the opening brace. There was no attempt whatsoever to have the compiler verify you passed the correct number or types of arguments.
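For anyone who hasn't seen the old style, here's a hypothetical example; the declaration says nothing about parameters, and a call like add(1) or add(1, 2, 3) would sail through the compiler unchecked:

    /* old-style ("K&R") declaration: name and (implicit int) return type only */
    int add();

    /* old-style definition: parameter names in the parens,
       their types declared between the ')' and the '{' */
    add(a, b)
    int a, b;
    {
        return a + b;
    }

(Modern compilers still accept this in older standard modes, though C23 finally removed old-style definitions.)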
In C23, auto doesn't have a default type, if you write auto without a type then you get the C++ style "type deduction" instead. This is part of the trend (regretted by some WG14 members) of WG14 increasingly serving as a way to fix the core of C++ by instead mutating the C language it's ostensibly based on.
You can think of deduction as crap type inference.
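Concretely, both meanings in one hypothetical snippet (the last declaration needs a C23 compiler):

    void demo(void)
    {
        auto int i = 0;    /* classic meaning: automatic storage duration,
                              which is already the default for locals */
        double d = 1.5;
        auto x = d * 2;    /* C23 meaning: no type given, so the type is
                              deduced (double), as in C++ */
        (void)i; (void)x;
    }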
Declaring a variable or function as extern(al) just tells the compiler to assume that it is defined "externally", i.e. in another source file. The compiler will generate references to the named variable/function, and the linker will substitute the actual address of the variable/function when linking all the object files together.
Modern C still lets you put extern declarations inside a function like this, but it's considered bad practice and makes the code less readable. Better to put them at global scope (e.g. at the top of the source file), or better yet in a header file, with your code organized into modules of paired .h interface and .c implementation files.
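A minimal sketch of that organization, with invented names:

    /* counter.h -- the module's public interface */
    #ifndef COUNTER_H
    #define COUNTER_H
    extern int counter_value;   /* declaration only */
    void counter_bump(void);
    #endif

    /* counter.c -- the implementation owns the storage */
    #include "counter.h"
    int counter_value = 0;      /* the one real definition */
    void counter_bump(void) { counter_value++; }

    /* main.c -- users just include the header */
    #include <stdio.h>
    #include "counter.h"
    int main(void)
    {
        counter_bump();
        printf("%d\n", counter_value);
        return 0;
    }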
Reminds me of the humility every programmer should have: for the most part we're standing on the shoulders of giants, and on layers of abstraction. 80+ years of computer science.
Cool kids may talk about memory safety but ultimately someone had to take care of it, either in their code or abstracted out of it.
Memory safety predates C by a decade, in languages like JOVIAL (1958), ESPOL/NEWP (1961) and PL/I (1964), and it carried on in the same decade as C outside Bell Labs with PL/S (1970), PL.8 (1970), Mesa (1976) and Modula-2 (1978).
If anything, the cool kids are rediscovering what we lost in systems programming safety due to the wide adoption of C and its influence on the industry, because the cool kids of the 1980s decided memory safety wasn't something worth caring about.
"A consequence of this principle is that every occurrence of every subscript of every subscripted variable was on every occasion checked at run time against both the upper and the lower declared bounds of the array. Many years later we asked our customers whether they wished us to provide an option to switch off these checks in the interests of efficiency on production runs. Unanimously, they urged us not to--they already knew how frequently subscript errors occur on production runs where failure to detect them could be disastrous. I note with fear and horror that even in 1980 language designers and users have not learned this lesson. In any respectable branch of engineering, failure to observe such elementary precautions would have long been against the law."
-- C.A.R. Hoare, "The 1980 ACM Turing Award Lecture"
Guess what programming language he is referring to by "1980 language designers and users have not learned this lesson".
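For contrast, a hypothetical C snippet: the out-of-bounds write below compiles cleanly and, at run time, nothing checks the subscript; it's simply undefined behaviour unless you opt into a sanitizer such as -fsanitize=address:

    #include <stdio.h>

    int main(int argc, char **argv)
    {
        (void)argv;
        int a[4] = {0, 1, 2, 3};
        int i = argc + 3;    /* 4 when the program is run with no arguments */
        a[i] = 42;           /* one past the end: no compile-time or run-time check */
        printf("%d\n", a[0]);
        return 0;
    }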
The thing I always loved about C was its simplicity, but in practice it's actually very complex with tons of nuance. Are there any low level languages like C that actually are simple, through and through? I looked into Zig and it seems to approach that simplicity, but I have reservations that I can't quite put my finger on...
The reality is, the only languages that are truly simple are Turing tarpits, like Brainfuck.
Reality is not simple. Every language that’s used for real work has to deal with reality. It’s about how the language helps you manage complexity, not how complex the language is.
Maybe Forth gets a pass but there’s good reason why it’s effectively used in very limited circumstances.
The perceived complexity from a semantic standpoint comes from the weakly-typed nature of the language. When the operands of an expression have different types, implicit promotions and conversions take place. This can be avoided by using the appropriate types in the first place. Modern compilers have warning flags that can spot such dodgy conversions.
The rest of the complexity stems from the language being a thin layer over a von Neumann abstract machine. You can mess up your memory freely, and the language doesn’t guarantee anything.
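The canonical example of such a dodgy conversion, which -Wextra / -Wsign-compare (GCC and Clang) will flag:

    #include <stdio.h>

    int main(void)
    {
        unsigned int u = 1;
        int i = -1;
        /* The usual arithmetic conversions turn i into unsigned int, so -1
           becomes UINT_MAX and the comparison goes the "wrong" way. */
        if (i < u)
            printf("expected\n");
        else
            printf("surprise: -1 is not less than 1u here\n");
        return 0;
    }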
Representing computation as words of a fixed bit length, in random access memory, is not (see The Art of Computer Programming). And the extent to which other languages simplify is by creating simpler memory models.
What about C is simple? Its syntax is certainly not simple, it's hard to grok and hard to implement parsers for, and parsing depends on semantic analysis. Its macro system is certainly not simple; implementing a C preprocessor is a huge job in itself, it's much more complex than what appears to be necessary for a macro system or even general text processor. Its semantics are not simple, with complex aliasing rules which just exist as a hacky trade-off between programming flexibility and optimizer implementer freedom.
C forces programs to be simple, because C doesn't offer ways to build powerful abstractions. And as an occasional C programmer, I enjoy that about it. But I don't think it's simple, certainly not from an implementer's perspective.
It’s not really clear to me how you could have a simple low level language without tons of nuance. Something like Go is certainly simple without tons of nuance, but it’s not low level, and I think extending it to be low level might add a lot of nuance.
Lisp could be simple... but there's a lot of reasons it isn't.
It uses a different memory model than current hardware, which is optimized for C. While I don't know what goes on under SBCL's hood, the simpler Lisps I'm familiar with usually have a chunk of space for cons cells and a chunk of "vector" space kinda like a heap.
Lisp follows s-expression rules... except when it doesn't. Special forms, macros, and fexprs can basically do anything, and it's up to the programmer to know when sexpr syntax applies and when it doesn't.
Lisp offers simple primitives, but often also very complex functionality as part of the language. Just look at all the crazy stuff that's available in the COMMON-LISP package, for instance. This isn't really all that different than most high level languages, but no one would consider those "simple" either.
Lisp has a habit of using "unusual" practices. Consider Scheme's continuations and use of recursion, for example. Some of those - like first-class functions - have worked their way into modern languages, but imagine how they would have seemed to a Pascal programmer in 1990.
Finally, Lisp's compiler is way out there. Being able to recompile individual functions during execution is just plain nuts (in a good way). But it's also the reason you have EVAL-WHEN.
All that said, I haven't investigated microcontroller Lisps. There may be one or more of those that would qualify as "simple."
I would say Rust. Once you learn the basics, Rust is very simple and will point out any errors you have, so you get basically no runtime errors. Also, the type system is extremely clean, making the code very readable.
But C itself is also a very simple language. I do not mean C++, but pure C. I would probably start with this. Yes, you will crash with runtime errors, but besides that it's a very, very simple language, which will give you a good understanding of memory allocation, pointers, etc.
Got through C and K&R with no runtime errors, on four platforms, but the first platform... Someone asked the teacher why a struct would not work in Lattice C. The instructor looked at the code, sat down at the student's computer, typed in a small program, compiled it, and calmly put the disks in the box with the manual and threw it in the garbage. "We will have a new compiler next week." We switched to Manx C, which is what we had on the Amiga. Structs worked in MS C, which I thought was the Lattice compiler. (Apparently a different fork of the portable C compiler, but they admitted that it was still big-endian years later.)
Best programming joke: the teacher said "when your code becomes recalcitrant", and we had no idea what he meant. This was on the bottom floor of the library, so on break we went upstairs and used the dictionary. Recalcitrant means not obeying authority. We laughed out loud, and then went silent. Oops.
The instructor was a commentator on the cryptic-C challenges, and would often say... "That will not do what you think it will do" and then go on and explain why. Wow. We learned a lot about the pre-processor, and more about how to write clean and useful code.
It depends what you mean by simple. C still is simple, but it doesn't include a lot of features that other languages do, and to implement them in C is not simple.
C is simple for some use cases, and not for others.
The appeal of C is that you're just operating on raw memory, with some slight conveniences like structs and arrays. That's the beauty of its simplicity. That's why casting a struct to its first member works, why everything has an address, or why pointer arithmetic is so natural. Higher level langs like C++ and Go try to retain the usefulness of these features while abstracting away the actuality of them, which is simultaneously sad and helpful.
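A hypothetical example of that first point; C guarantees that a pointer to a struct, suitably converted, points to its first member, which is the basis of many "inheritance in C" schemes:

    #include <stdio.h>

    struct header { int tag; };

    struct message {
        struct header hdr;   /* first member, at offset 0 */
        const char *body;
    };

    int main(void)
    {
        struct message m = { { 7 }, "hello" };
        struct header *h = (struct header *)&m;   /* cast to first member */
        printf("tag=%d\n", h->tag);
        return 0;
    }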
Turing Tarpits like Brainfuck or the Binary Lambda Calculus are a more extreme demonstration of the distinction, they can be very tiny languages but are extremely difficult to actually use for anything non-trivial.
I think difficulty follows a "bathtub" curve when plotted against language size. The smallest languages are really hard to use, as more features get added to a language it gets easier to use, up to a point where it becomes difficult to keep track of all the things the language does and it starts getting more difficult again.
"Most of the design for implementing the UNIX system for System/370 was done in 1979, and coding was completed in 1980. The first production system, an IBM 3033AP, was installed at the Bell Laboratories facility at Indian Hill in early 1981."
https://web.archive.org/web/20240930232326/https://www.bell-...
[0] https://archive.org/details/ThePortableCLibrary_May75/page/n...
I remember a few buddies using a similar pattern in ASM that just added n NOPs into the code to allow patching, thus eliminating possible recompilation.
Boy it took a lot of code to get a window behaving back in the day... And this is a much more modern B/C; it's actually ANSI C but the API is thick.
I did really enjoy the UX of macOS 6 and its terse look, if you can call it that [3].
[1] https://www.gryphel.com/c/minivmac/start.html
[2] https://archive.org/details/think_c_5
[3] https://miro.medium.com/v2/resize:fit:1024/format:webp/0*S57...
And don't answer "to waste space of course" please. :)
https://www.nokia.com/bell-labs/about/dennis-m-ritchie/bintr...
Have a look at the early history of C document on DMR's site; it mentions that the initial syntax for pointers was that form.
IMO it is young people that have trouble understanding.
The same mistakes are made over and over; lessons learned long ago are ignored in the present.
It's easier to write than to read, easier to talk than to listen, easier to build new than to expand the old.
It's still a tad more complicated than it needs to be - e.g. you could drop non-0-based arrays, and perhaps sets and even enums.
Syntactically, yes. Semantically, no.
There are languages with tons of "features" with far, far less semantic overhead than C.
https://blog.regehr.org/archives/767
FWIW, writing programs in C has been my day job for a long time.
I would say Zig is the spiritual successor to the first two, while Go follows up the Oberon and Limbo heritage.
That's because computers are very complex with tons of nuance.