Animats · 9 years ago
I'm still struggling with my own legacy code problem. I'm reviving an old LISP program from the early 1980s. Parts of it were written for the original Stanford AI Lab SAIL system in the 1970s. It last ran under Franz LISP in 1986.

My current struggle is with one line of code:

   (defun getenode (l) (cadr l))
That ought to be simple enough. But it's being applied not to a list, but to a "hunk". A "hunk" is an obsolete MacLISP concept.[1] It's a block of memory which holds N contiguous LISP cells, each with two pointers. This is the memory object underlying structures and arrays in MacLISP. Macros were used to create the illusion of structured data objects, with hunks underneath. However, you could still access a "hunk" with car, cdr, cxr, etc.

I'm converting this to Common LISP, which has real structures, but not hunks. That, with some new macro support, works for the regular structure operations. So far, so good.

But which element of the structure does (cadr l), which usually means the same thing as "(car (cdr l))", access? (cadr (list 0 1 2 4)) returns 1, so you'd think it would be field 1 of the structure. But no. It's more complicated and depends on how hunks are laid out in memory.

The Franz LISP manual from 1983 [2] says "Although hunks are not list cells, you can still access the first two hunk elements with cdr and car and you can access any hunk element with cxr†." At footnote "†", "In a hunk, the function cdr references the first element and car the second." This is backwards from the way lists behave.

A blog posting from 2008 about MacLISP says "A Maclisp hunk was a structure like a cons cell that could hold an arbitrary number of pointers, up to total of 512. Each of these slots in a hunk was referred to as a numbered cxr, with a numbering scheme that went like this: ( cxr-1 cxr-2 cxr-3 ... cxr-n cxr-0 ). No matter how many slots were in the hunk, car was equivalent to (cxr 1 hunk) and cdr was equivalent to (cxr 0 hunk)." Note that element 0 is at the end, which is even stranger. The documentation is silent about what "cadr" would do. Does it get element 2, or get element 0 and then apply "car" to it?
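To keep the two candidate readings straight, here is a toy model in Python. The Hunk class and its slot layout are my own guesses for illustration, following only the Franz footnote that cdr reads cxr-0 and car reads cxr-1; this is not how any real Lisp stored hunks.

```python
# Toy model of a MacLisp "hunk", per the Franz manual footnote:
# cdr reads slot cxr-0, car reads slot cxr-1.

class Hunk:
    def __init__(self, *slots):
        # slots[0] is cxr-0, slots[1] is cxr-1, and so on.
        self.slots = list(slots)

def cxr(n, h):
    return h.slots[n]

def car(h):
    # car of a hunk is cxr-1; car of an ordinary list is its first element
    return cxr(1, h) if isinstance(h, Hunk) else h[0]

def cdr(h):
    # cdr of a hunk is cxr-0; cdr of an ordinary list is its tail
    return cxr(0, h) if isinstance(h, Hunk) else h[1:]

# The two candidate readings of (cadr h):
def cadr_composed(h):
    # reading A: literally (car (cdr h))
    return car(cdr(h))

def cadr_second_slot(h):
    # reading B: "element 2 of the hunk", i.e. (cxr 2 h)
    return cxr(2, h)

inner = Hunk("i0", "i1")
outer = Hunk(inner, "o1", "o2")
print(cadr_composed(outer))     # -> i1  (cxr-0 of outer, then cxr-1 of that)
print(cadr_second_slot(outer))  # -> o2
```

Under this model the two readings give different answers whenever cxr-2 differs from the car of cxr-0, which is exactly why the conversion is ambiguous without more context.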

The original code [3] contains no relevant comments. I'm trying to figure out from the context what the original author, Greg Nelson, had in mind. He died in 2015.[4]

[1] http://www.mschaef.com/blog/tech/lisp/car-cdr.html [2] http://www.softwarepreservation.org/projects/LISP/franz/Fran... [3] https://github.com/John-Nagle/pasv/blob/master/src/CPC4/z.li... [4] https://en.wikipedia.org/wiki/Greg_Nelson_(computer_scientis...

sillysaurus3 · 9 years ago
There are only a few possibilities for what (cadr hunk) could mean. One way to solve this is to try them all and see which one runs successfully.

Based on †, it sounds like (cadr (hunk (hunk 1 2) 3)) should return 2.

Is the old LISP code available online somewhere? I'm curious to see it.

lokedhs · 9 years ago
You can run it on an emulated PDP-10 running ITS. A friend of mine has been putting together an easy-to-use project that sets all of this up. MacLisp is included in the system.

https://github.com/PDP-10/its

Animats · 9 years ago
See the link listed above: https://github.com/John-Nagle/pasv/blob/master/src/CPC4/z.li...

I put all the code on Github. The oldest version of each file is exactly what ran in 1986.

The code is delicate. It's a theorem prover, and there's much manipulation of complex data structures, with little explanation of what's going on. The overall theory is documented; this is the original Oppen-Nelson simplifier and there are published papers. But the code has few comments.

nickynickell · 9 years ago
Having done a bit of lisp spelunking myself, I would suggest that you dig into the source of the lisp your program ran under. Trying to figure out behavior from the surrounding context can be rather difficult when you're playing with an archaic, sparsely documented lisp.
junke · 9 years ago
Note that this is the order of display:

> The order of display of hunk slots is historical in nature. For better or worse, the elements of a hunk display in order except that the 0th element is last, not first. e.g., for a hunk of a length n+1, (cxr1 . cxr2 . ... . cxrn . cxr0 .)

It could still make sense for the layout in memory to be sequential, i.e. (cxr-0 == CDR, cxr-1 == CAR, ... others ...).

Note also that CAR extracts the leftmost element of a hunk, just as it addresses the leftmost element of a cons. Similarly, CDR extracts the rightmost element of hunks and conses.

It seems more logical that CADR is just the composition of CAR and CDR. I don't think the designers would have tried to carry over the fact that it means "second" for proper lists to hunks. It just seems unlikely, but I have no proof.

Also:

> (Note that the operation CAR is undefined on hunk-1's, but CDR is not.) This means that if you want to make a plist for a hunk of your own, you can use its cdr as a hunk; it does not mean that you can blindly assume that any hunk wants its CDR treated that way. The exact use of the slots of a hunk is up to the creator; it's a good idea to mark your hunks (e.g., by placing a distinctive object in their cxr-1 slot) so that you can tell them from hunks created by other programs.

My guess is that there is some metadata associated with a hunk, stored in CXR-0, a.k.a. CDR.

http://www.maclisp.info/pitmanual/hunks.html

larsbrinkhoff · 9 years ago
You should consult the Maclisp manuals. There is one in ITS, and the Pitmanual is here: http://maclisp.info/pitmanual/index.html

You should probably also test some code in Maclisp.

skissane · 9 years ago
Is someone in the US government really still using an IBM 7074? Really? I'm shocked. How could that possibly be cost-effective?

Actually, this blog post makes the story clearer:

http://nikhilism.com/post/2016/systems-we-love/

It isn't a physical IBM 7074.

When it came time to migrate from 7074 to S/360, rather than rewriting their 7074 software, they just wrote a 7074 emulator for S/360. And, it sounds like, they are still running their 7074 software, under their 7074 emulator, most likely on a recent z/Architecture mainframe.

The article makes it sound like people still use "1960s mainframes", when I very much doubt anyone is still running 1960s hardware in production. People use modern machines: modern IBM mainframes, which are multicore 64-bit processors, or systems from other mainframe vendors such as Unisys or Fujitsu, which I believe mostly run on x86-64 hardware under Linux with a software emulator for the old mainframe CPU.

A lot of legacy, sure, but I think this article makes it sound even more legacy than it really is.

Animats · 9 years ago
The picture of an IBM 7074 is from Wikipedia.

IBM offered 7074 emulation as a standard IBM System/360 product.[1] On an S/360, it required some special hardware support. In 1972, IBM gave users a free IBM 7074 emulator, software only, for System/370 machines.[2] They may still be running that program on a Z-series mainframe.

[1] http://bitsavers.trailing-edge.com/pdf/ibm/370/compatibility...

[2] https://books.google.com/books?id=p5zVQgaQ-N0C&pg=PA11

YeGoblynQueenne · 9 years ago
It's possible Ms Bellotti and her team didn't realise they were working with an emulator. Mainframes being what they are, the programmers were probably never in the same room as the machines they were programming, and it does take a bit of digging to figure out that the architecture you see before your eyes is emulated on another machine (like in "The Story of Mel").

Still, even the blog post you link to doesn't make it absolutely clear that the Java was running on an emulated S/370. It says that the decision was made to emulate the older architecture rather than rewrite the old programs, but then it goes on to say "These are still operational". Does it mean the old programs? Or the old machines? It's hard to say.

As to how unlikely it is to see a very old machine still in use, instead of one made in more recent times: last year I talked to an engineer who claimed he had seen a PDP still in operation at some transport company, if memory serves.

abakker · 9 years ago
I believe (a coworker of mine worked on S/360) that all the old software can be effectively run through emulators on the current system z. The feeling was that once you had done the development and testing of an older system, IBM was never going to force you to rewrite your code. As a result, the upgrades were pretty seamless over the years. Many of those customers never had to upgrade and never had a reason to.
vidarh · 9 years ago
PDP-10 was discontinued in 1983, but PDP-11 wasn't discontinued until 1997, with third-parties continuing to sell parts, so it's really not that unlikely to come across PDPs, depending on which line.
1812Overture · 9 years ago
I 100% buy they're using 1960s hardware. I've talked to some people who had to spend half their day on ebay trawling for parts to keep their ancient systems running. I've personally worked with medical offices still using 1970s hardware, it's not that rare. Many places have an "if it ain't broke don't fix it" attitude.
wwweston · 9 years ago
And it's an open question whether or not that attitude is better. A day or two of office admin time every 3-6 months is quite possibly more cost effective than hiring one of us for weeks/months/years to create a new system...
tomcam · 9 years ago
I/O speed reported in the article seemed highly unlikely for mag tape
mbellotti · 9 years ago
> A lot of legacy, sure, but I think this article makes it sound even more legacy than it really is.

Indeed. The point of the talk was that 1) legacy is often assumed to be bad not for any real technical reasons but just because it is legacy and 2) a lot of what was being presented as legacy wasn't even legacy. Their OS 2200 version was actually newer than the Oracle DB they were using on the "modern" side of the stack.

nickpsecurity · 9 years ago
You might enjoy this article given the kind of work you do:

http://www.pcworld.com/article/249951/computers/if-it-aint-b...

You've probably seen plenty of crazy stuff in legacy systems but I'm hoping at least one surprises you. Maybe the first one. :)

wolf550e · 9 years ago
I know of an S/360 assembly code base that (as of a decade ago) still ran code that assumed a 24-bit address space and used 31-bit pointers with tags stuffed in the unused 7 bits. So it had to run in the lowest 16MB of memory, on a 64-bit machine. I don't think they had the budget to rewrite that subsystem to fix it. So yes, new hardware, but some of the legacy problems in the software run very deep.
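The trick looks roughly like this as a Python sketch. The field widths follow the description above (24-bit address, 7 tag bits); the actual layout of that code base isn't public, so the names and positions here are illustrative only.

```python
# Sketch of the tagged-pointer trick: a 24-bit address packed together
# with 7 tag bits into one word, as old S/360 code commonly did with
# the bits above the 24-bit address.

ADDR_BITS = 24
ADDR_MASK = (1 << ADDR_BITS) - 1   # 0x00FFFFFF, addresses below 16 MB
TAG_MASK = (1 << 7) - 1            # 7 tag bits stored above the address

def pack(addr, tag):
    # The whole scheme only works while every address fits in 24 bits.
    assert addr <= ADDR_MASK, "address must fit below 16 MB"
    assert tag <= TAG_MASK
    return (tag << ADDR_BITS) | addr

def address(word):
    return word & ADDR_MASK

def tag(word):
    return (word >> ADDR_BITS) & TAG_MASK

w = pack(0x123456, 0b101)
print(hex(address(w)))   # -> 0x123456
print(bin(tag(w)))       # -> 0b101
```

The pitfall is visible in pack(): as soon as real addresses can exceed 24 bits, the packing breaks, so all the data has to stay pinned in the low 16 MB, which matches the situation described above.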
fennecfoxen · 9 years ago
They're often virtualized but don't underestimate mainframe persistence.

At my last job as a consultant someone made https://github.com/manheim/antimony for testing IBM TN5250 mainframe screens in Ruby - it's like Selenium but even more brittle. :>

cle · 9 years ago
It's "cost effective" because "cost effectiveness" is not as objective as we think, once we start considering things like risk tolerance.
kevin_thibedeau · 9 years ago
ClearPath Dorado still had a Univac on a chip until 2015. This is actual hardware, not just software emulation.
acqq · 9 years ago
Yes. And they never accessed a 1960s IBM mainframe at all with their software. The title is probably intentionally misleading.

And the one they actually accessed, which returned query results in 6 milliseconds, was introduced in 2008; it's not so old and slow:

https://www.app5.unisys.com/offerings/ClearPathConnection/st...

"Single image performance range of 300 MIPS at the entry level and maximum single image performance of approximately 5,700 MIPS (32 processor system)."

"An expanded memory subsystem supports larger memory capabilities and offers memory configurations that include the ability to expand up to 4GW per cell and up to 32GW for a maximum eight-cell system."

kchoudhu · 9 years ago
Mazal tov, that's containerization...from the 1970s.

Everything old is new again. I love our profession.

facepalm · 9 years ago
I was going to ask if they couldn't just replace the hardware with emulators :-)
kevin_thibedeau · 9 years ago
Does ext4 journal mode not journal everything?
mbellotti · 9 years ago
This article was such a pleasant surprise! I loved talking at Systems We Love and I love talking about legacy architecture in general. Bryan and the Joyent team were very accommodating and understanding. The White House is an amazing place to work (even now), but it's not a system that understands conference talks very well. Wish we could have a video, but it was a heavy lift just getting the bureaucracy okay with the idea that I was going to talk about information that was already public, without naming agencies or projects and without going into detail beyond what was in a user manual, and that this was not a security risk.
benballjr · 9 years ago
Hi Marianne! I'm with The New Stack (not the writer of this article), and was wondering about the video myself. Interesting to see you confirm that there's definitely no public-facing version, as it seems a lot of the commenters here would love to see it. I can only imagine the bureaucratic complexity involved.
mbellotti · 9 years ago
Oh yeah, when I talked to the agency that owns the 7074 (more on this in a second) they were like "You can't do this because we will get hacked"

"How are you going to get hacked if I talk about your mainframe? It's not connected to the public internet, is it?"

"No. Well... we don't know... but ... hackers! Hackers are really smart Marianne."

Part of the compromise was that I promised I would only use information that was already available publicly through government reports and news articles. I went back through my talk and documented where each fact was already published somewhere else until they were comfortable with it. So the ambiguity on whether the 7074 was the actual machine or an emulator was deliberate... there were certain things I could not find a public comment on and therefore agreed to avoid making direct statements about.

This all seems super annoying, but it makes sense when you realize how heavily scrutinized public servants are. In the end they are only trying to protect me, my organization and Obama's legacy. Three things that are really important to me. So I can't exactly blame them for it. I was happy to be able to find a middle ground where they felt comfortable, the organizers weren't too badly inconvenienced and I got to give the talk I wanted to.

YeGoblynQueenne · 9 years ago
>> It was at this point that the seasoned data architects in the department began expressing their exasperation. “15 years ago, everybody was telling us ‘Get off the mainframe, get on AT&T applications, build these thick clients. Mainframes are out.’ And now thick clients are out, and everybody’s moving to APIs and microservices, which basically are very similar to the thin client that a terminal uses to interact with a mainframe.”

A.k.a. "Them as not knows their history are doomed to repeat it". But I suppose it's more a case of business imperatives than real ignorance that drives this mad race to make new stuff that works worse than the old stuff, only so we can then go back to the old stuff under a different name.

Btw, that lady is my new tech hero:

“The systems that I love are really the systems that other engineers hate,” Bellotti told the audience — “the messy, archaic, half chewing gum and duct tape systems that are sort of patched together.”

<3 <3 <3

eternalban · 9 years ago
"15 years ago" doesn't sound right.

Thick clients were being pushed in the mid-90s. But then the Clipper Chip plans went south ...

toyg · 9 years ago
Consider that state/federal bureaucracies get on "trends" with a delay of 2 or 3 years minimum, and the story is probably from 2015 or thereabouts; the agency was launched in 2014.
krylon · 9 years ago
Wish I could upvote this twice.
jonnycowboy · 9 years ago
The thing is that there was a lot about those old systems that was slow, so you were very, very careful how you programmed. You tended not to use vast library stacks, you went close to the metal and you coded in languages like Assembler, COBOL or FORTRAN. I/O was often run through specialised co-processors (such as IBM's channel processors) and the terminals could sometimes help too.

I have friends who have been looking after legacy applications for an airline running on Unisys. The core apps for reservation, Cargo booking and weight/balance were written in FORTRAN. In recent times, the front end was written in Java to give web access. They tried to rewrite the core apps but it was impossible to do so and get the performance.

YeGoblynQueenne · 9 years ago
>> They tried to rewrite the core apps but it was impossible to do so and get the performance.

Well, Cobol is a bit like the C of mainframes - you can manipulate memory directly and so on. You can't really do that sort of thing with Java.

stefs · 9 years ago
a) if it was really running on the old hardware, then Ruby on a modern machine would have been several orders of magnitude faster than the original code, at least because of the faster I/O

b) if the whole thing was indeed running in an emulator, the emulation overhead would have negated all the direct-memory-access advantages

eru · 9 years ago
> The thing is that there was a lot about those old systems that was slow, so you were very, very careful how you programmed.

That's a common sentiment. I wish I could find the quote by someone who made the transition; it was about how happy they were to be able to compile so much quicker, and how getting immediate feedback made them so much more productive.

3n7r0pY · 9 years ago
The notion of waiting ages for programs to compile or assemble is mostly related to the older hardware.

I compile/assemble COBOL and IBM's assembly language on a z13 daily and it's pretty much instantaneous.

nl · 9 years ago
Did anyone read this and go "huh"?

There's this whole thing about how they are getting data from mainframes where "the data was being returned in between one and six milliseconds".

But then: "harvest that data from the magnetic tape and load them up into more traditional databases. That Java application was extracting the data from the databases"

But then: "But the data from the mainframes was actually arriving (from its new home in the database) in less than six milliseconds. The bottleneck was — of course — the Java application."

So of course it is entirely possible to write slow Java applications. But then the story seems to end! So what happened? Did they fix the application?

geodel · 9 years ago
I have heard from many of my friends about massive projects of 'modernizing' mainframe applications with a Java stack. Java did not deliver the improvement that management was expecting. Once the projects had consumed their entire budget at ~25% completion, they were scrapped.

I think, though without any proof, that overreliance on hundreds of mixed-quality libraries, combined with the 'best practices' of enterprise development and heavy application of design patterns, creates a very large surface area for change. This makes a reasonable translation of functionality to Java almost impossible.

neeleshs · 9 years ago
Was it Java that did not deliver improvement, or was it the team? :)
acdha · 9 years ago
… or organizational culture? A lot of places really wanted to believe the problem was the technology because that's relatively much easier to change than going to a bunch of very senior managers and telling them that the way they're used to doing business is too expensive to continue.
geodel · 9 years ago
I am inclined to say Java as I have not seen mythical teams who work on Java without whole caboodle of 'Enterprise Apps' culture.
walrus01 · 9 years ago
When I first saw the Chromebook, my thought was that Larry Ellison's 1998 dream of a network computer thin terminal had come true. It's smarter than a true thin terminal, but everything lives in the cloud (err, butt).
lokedhs · 9 years ago
Being someone who worked for Sun at the time, I'd rather say it was Scott McNealy's dream.
bitwize · 9 years ago
That Chrome extension is clbuttic.
TeMPOraL · 9 years ago
Indeed.

> but everything lives in my butt (err, butt).

I had to open the comment in porn (er, incognito) mode in order to parse it :D.