Origins of J - Readit News

hackernoteng · 2 years ago

Arthur Whitney is a mad genius. We were an early customer of KX systems and I programmed with KDB for a few years. Arthur Whitney once sat at my desk and helped me debug my code. Very nice guy. Super smart and hilariously knowledgeable about the low level performance of chips, caches, etc. Ask him how many nanoseconds it takes to divide an array of doubles by an array of ints, and he knows. He just knows.

kelas · 2 years ago

(author of the modern port here)

> Arthur Whitney is a mad genius

atw is atw :) our old man is awesome. a bit grumpy at times, but then who isn’t :)

7thaccount · 2 years ago

This kind of story is what I enjoy so much about HN. I wish Kdb+ or Shakti had dramatically lower costs for those industries that don't have access to banking cash. I know open source versions exist, but I understand them to mostly be toys and not really production worthy.

RyanHamilton · 2 years ago

As the author of an open source version (https://www.timestored.com/jq/) I wish the same but I fear the time has passed. Two factors:

1. The other technologies are evolving to take parts of kdb+ that made it special quicker than kdb+ is evolving. See arrow / parquet / numpy / kafka, they each solve parts but kdb+ had them all 10 years ago in <2MB.

2. The ratio of learners to advanced programmers has increased every year for the last 20 years. The languages that have gained popularity in that time range are those with the easiest learning curve. Most beginners no longer want to sit with a book frustrated on 2 characters for half a day.

eismcc · 2 years ago

KlongPy sits on NumPy so gets pretty far. For some features, it’s still early days.

http://klongpy.org

DrDroop · 2 years ago

K is at the end of the day a fancy calculator, I think for most workloads you can use the open source version called ngn/k

anonu · 2 years ago

I love kdb but First Derivatives should take it the MongoDB route... Open source it, make it widely available. Build a community around it. Build a package manager. Could be way bigger than it is... But maybe they're happy with their existing business model. I just don't see how you build a moat. Over time new tech will start to take over as the key innovator is no longer there.

tangentstorm · 2 years ago

Nice to see this getting some attention again. I hope some people venture out to learn about the actual J language:

https://code.jsoftware.com/wiki/Guides/GettingStarted

For what it's worth, I've also studied this code a bit. This repo has an annotated and (somewhat) reformatted version of the code:

https://github.com/tangentstorm/j-incunabulum

JHonaker · 2 years ago

J is really great. I spent some time last year playing around with it, Dyalog APL, and some other new array languages like BQN.

I was extremely impressed by the breadth of integrations into different ecosystems that the J community had created (like R and the web tech).

Using the language reminds me of using Common Lisp. There are a lot of things that seem odd now, like how you define new words (i.e. functions), how namespaces work, or how the FFI/system calls work (i.e. !: ) [1]. Kind of like how in CL things are named "mapc", "mapcar", "mapcan", etc. Both kinds of quirks come from the fact that these people were really innovating in new frontiers, and Ken Iverson and Roger Hui just kept on developing their ideas.

[1]: https://code.jsoftware.com/wiki/Vocabulary/bangco for how it works and https://code.jsoftware.com/wiki/Vocabulary/Foreigns for what you do with it.

bbwbsb · 2 years ago

Since we are on the topic, I've thought about APLs a decent amount so here are some other resources/notes. I'm not an expert on this topic - I don't work with or research the language or anything. These probably are not good getting-started resources.

There is a VM model for APL languages[1] which can make optimizations comparable to those made by CLP(FD). If you read about CLP(FD) implementations[2], you'll see operations similar to what the "An APL Machine" paper calls beating. I'm not sure if any APL-like languages actually implement such optimizations.

There are different models of arrays (and their types) used by APL-like languages[3]. Also array frame agreement can be statically typed[4], though it usually isn't.

Some other OSS implementations of similar languages include Nial[5], ngn/k[6], and GNU APL[7]. My favorite is ngn/k. If you use a K-like language, a great source of inspiration is nsl[8].

There is an unusual and fun calculus book that uses J, by Iverson, but it moves somewhat quickly and loosely[9]. It perhaps gives a good example of what APL was intended to be(?). On that note, his original paper, "Notation as a Tool of Thought" is interesting[10]. There is also a podcast interview with Robert Kowalski, one of the creators of Prolog, who says - if I remember correctly - that he was looking for a better way of thinking when he came up with SLD resolution[11]. It's interesting how these languages came out of different paths towards a similar goal.

Also beware the reverence of Arthur Whitney. His work is definitely inspired, but the community around K can seem schizoid-like[12], in a way comparable to Wolfram's projects[13].

That said, J is an exceptionally fun language to use. My favorite generalizable insight from an APL-like language is how K encourages writing functions that converge: its easiest-to-use loop operator is one that applies a function to an argument repeatedly until the output stops changing.

---

[1]: https://www.softwarepreservation.org/projects/apl/Papers/197...

[2]: http://cri-dist.univ-paris1.fr/diaz/publications/GNU-PROLOG/... (there are probably more to the point papers, this is just the one I read when I noticed the similarities).

[3]: https://aplwiki.com/wiki/Array_model

[4]: https://www.khoury.northeastern.edu/home/jrslepak/typed-j.pd... (implemented in racket iirc)

[5]: https://www.nial-array-language.org/

[6]: https://codeberg.org/ngn/k (honestly it is a miracle this exists)

[7]: https://www.gnu.org/software/apl/

[8]: https://nsl.com

[10]: https://www.eecg.utoronto.ca/~jzhu/csc326/readings/iverson.p...

[11]: https://thesearch.space/episodes/1-the-poet-of-logic-program...

[12]: https://www.ijpsy.com/volumen3/num2/63/the-schizoid-personal...

[13]: http://genius.cat-v.org/richard-feynman/writtings/letters/wo...

mlochbaum · 2 years ago

Here are my two cents on array compilation. I think a lot of the research goes in the direction of immediately fixing types and breaking array operations into scalar components because it's easy to compile, but this ignores some advantages of dynamic typing and immutable arrays. When you can implement most operations with SIMD, a smaller type always means faster code, so dynamic types with overflow checking can be very powerful on code that deals with a lot of small integers.

https://mlochbaum.github.io/BQN/implementation/compile/intro...
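
A rough sketch of the point about dynamic types and overflow checking, in plain Python rather than SIMD code: each result is stored in the narrowest integer type that holds it, and widened only when a value would overflow. The names `TYPES`, `narrowest`, and `add` are hypothetical, and the sketch assumes the usual 1-, 2-, and 8-byte sizes for the `array` typecodes `'b'`, `'h'`, and `'q'`. It is not how BQN or any other implementation actually does it.

```python
from array import array

# (typecode, min, max) from narrowest to widest; assumes the usual
# 1-, 2- and 8-byte sizes for signed char, short and long long.
TYPES = [
    ("b", -2**7, 2**7 - 1),
    ("h", -2**15, 2**15 - 1),
    ("q", -2**63, 2**63 - 1),
]

def narrowest(values):
    """Store values in the smallest integer type that can hold them all."""
    values = list(values)
    lo, hi = min(values, default=0), max(values, default=0)
    for code, tmin, tmax in TYPES:
        if tmin <= lo and hi <= tmax:
            return array(code, values)
    raise OverflowError("value does not fit in 64 bits")

def add(a, b):
    """Elementwise sum; the result type is chosen after checking the range."""
    return narrowest(x + y for x, y in zip(a, b))

small = narrowest([100, 27])   # fits in one byte per element
total = add(small, small)      # 200 overflows a signed byte, so widen
print(small.typecode, total.typecode, list(total))
```

With fixed static types the whole computation would have to use the wide type up front; here the data stays at one byte per element until a result actually needs more.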

I'm somewhat skeptical of the virtual optimizations on indices, "beating" and similar. They sound nice because you get to eliminate some operations completely! But if you end up with non-contiguous indices then you'll pay for it later when you can't do vector loads. Slicing seems fine and is implemented in J and BQN. Virtual subarrays, reverse, and so on could be okay, I don't know. I'm pretty sure virtual transpose is a bad idea and wrote about it here:

https://mlochbaum.github.io/BQN/implementation/primitive/tra...
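
To make the contiguity concern concrete, here is a small sketch of which flat memory offsets one row touches under a strided ("virtual") transpose. The helper `row_offsets` and the shape/stride tuples are illustrative assumptions for a row-major 2x3 array, not taken from any implementation.

```python
def row_offsets(row, shape, strides):
    """Flat offsets read when scanning one row of a strided 2-D view."""
    return [row * strides[0] + col * strides[1] for col in range(shape[1])]

# A 2x3 row-major array: stepping along a row moves 1 element in memory.
shape, strides = (2, 3), (3, 1)

# A virtual transpose swaps shape and strides and copies no data.
t_shape, t_strides = shape[::-1], strides[::-1]

print(row_offsets(0, shape, strides))      # adjacent offsets
print(row_offsets(0, t_shape, t_strides))  # offsets 3 apart
```

Rows of the original sit next to each other in memory and can be read with a plain vector load; rows of the virtual transpose are spread out by the stride, which is the cost that gets paid later.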

moonchild · 2 years ago

> VM model for APL languages

It's cute—but from my skimming a while ago fairly primitive. We can do much better with less effort using more general mechanisms. (Not a knock—it's a product of its time—a lot of old compiler tech was not very good and even so remains unsurpassed.)

> statically typed

I in principle espouse a much more nuanced view than this, but in short: just don't.

082349872349872 · 2 years ago

> applies a function to an argument repeatedly until the output stops changing

In other words: instead of worrying about which n to use for "loop n times", it just always loops (effectively) an infinite number of times...
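
The "repeat until the output stops changing" loop discussed here can be sketched in a few lines of Python. The names `converge` and `bubble_pass` and the `max_steps` guard are hypothetical, added for illustration; K's own operator is a primitive, and as far as I know it also stops when the result returns to the initial value, which this sketch does not model.

```python
def converge(f, x, max_steps=10_000):
    """Apply f repeatedly until the output equals the input (a fixed point)."""
    for _ in range(max_steps):
        y = f(x)
        if y == x:      # output stopped changing
            return x
        x = y
    raise RuntimeError("no fixed point reached within max_steps")

def bubble_pass(xs):
    """One left-to-right pass of adjacent swaps over a tuple."""
    ys = list(xs)
    for i in range(len(ys) - 1):
        if ys[i] > ys[i + 1]:
            ys[i], ys[i + 1] = ys[i + 1], ys[i]
    return tuple(ys)

print(converge(lambda n: n // 2, 100))    # halve until nothing changes
print(converge(bubble_pass, (3, 1, 2)))   # swap passes until sorted
```

The caller never picks an iteration count; the function being iterated decides when the loop ends by reaching a fixed point.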

lkuty · 2 years ago

What is reference [9] ?

countWSS · 2 years ago

The code in C that is supposed to be written like this is usually never written first like that; it's like pretending to write minified JS by hand from scratch. Usually the code is contracted and "minified" from a larger program so that the entire program fits into 1-3 screens. The person who manually "minified" it to that state will know its expansion, but other people will dismiss it as obfuscated C. It's an old technique to fit lots of code into an 80x25 terminal. Not surprising since J is optimized for code density per screen.

kelas · 2 years ago

> The code in C that is supposed to be written like this is usually never written first like that

usually not. but we prefer to write it first this exact way, and there are good reasons for that.

> obfuscated c

it is not. this style is extremely regular, very readable and writable, and escapes a whole galaxy of typical C blunders. i can expand on that if you wish.

hnfong · 2 years ago

Please do. Although I'll probably never write C in that style, most of us here will probably learn a few things that will eventually prove useful. (And it probably will also serve as a historical document of a "skill"(?) that is apparently soon to be lost to obscurity...)

n0pa1n · 2 years ago

> it is not. this style is extremely regular, very readable and writable, and escapes a whole galaxy of typical C blunders. i can expand on that if you wish.

Please do! I'd like to learn. If you can expand on this in the README section as well that would be great.

colonwqbang · 2 years ago

> Not surprising since J is optimized for code density per screen.

I don't think that's the whole story. J is dense like traditional mathematical notation, but can be executed by machine. Experienced J programmers use it to convey mathematical ideas. See for instance:

https://www.jsoftware.com/jwiki/Puzzles/Unit_Fraction_Sum

Although I can't read the notation, I appreciate the role it can play. Plain C (or whatever) code isn't an efficient vehicle for ideas like that. Numpy/matlab comes closer but J is a stronger approximation of traditional maths notation.

crabbone · 2 years ago

I like J. Especially because it has a saner way to write it (it doesn't have to look as if you accidentally forgot a null terminator in a C string; all the traditionally short identifiers have a long and understandable form).

I feel like it's very regrettable that the superficial aspect of J (the very hard to read syntax) is standing in the way of some very nice ideas.

To comment on mathematical notation. Before I was a programmer, I was a typographer. During my study in art academy, I invented a bunch of fonts, one of my long time projects was to make Hebrew look more like Latin fonts for example (this is a long-standing issue in Hebrew typography, with several historical attempts, but still not quite resolved). Afterwards I worked in a printing house, paginated a newspaper, typeset a bunch of books etc.

Among my coworkers (esp. in the newspaper) I was sort of known for trying to automate stuff, so, I was often suggested as a candidate for "difficult" typographical tasks, like setting sports tables, chess diagrams, music sheets and the most damned and hated kind of typographical work: math formulas.

I've helped publish a course book on linear algebra for a university. It was a multi-year project which I joined in the middle. I have never seen so much pain, struggle and reluctance as I've encountered while working on this thing. People tasked with proofreading demanded extra pay for proof-reading this stuff, and still wouldn't do it. Just put it away and later explain that they had other things to do. The lady who had to translate the mostly hand-written, or sometimes typed on a typewriter manuscript would just skip work on the days she was supposed to input the manuscript into our digital system.

Everyone passionately wanted this project to burn in hell. And the reason for this was the mathematical notation. Typical proofreading techniques don't work on math formulas. The text is impenetrable to anyone, often even to the people who wrote it, including both the author and the editor. Parentheses are a curse, because in the manuscript they are one of the elements that is most commonly forgotten or misplaced. Single-letter variables are the other one. Overloading the same symbols with different meaning is yet another one. It gets worse when the same symbol is used in its normal size, subscript and superscript.

----

When I talked about my experiences to people with degrees in math, the way they tend to respond is by saying that "math is overall so hard, that mathematicians don't typically notice the extra struggle they incur on themselves by the bad language choices, it pales in comparison to the difficulty of the main problem they need to solve".

And, I kind of can see it... on the other hand, I see no reason _the students_ have to endure the same torture. They aren't solving any novel mathematical problems. Their task is usually reading-comprehension combined with memorization.

And then I saw Sussman's book where he uses Scheme to write math formulas (I think it was about physics, but it still used a lot of math). Dear lord, it was so immeasurably better than the traditional mathematical notation. I really wish more people joined this movement of ditching the mathematical notation in favor of something more regular and typography-friendly like Scheme...

andrewla · 2 years ago

While there are people that do this, I do not think that Whitney is one of them. This code is not obfuscated; it uses macros and strategically defined functions to allow writing code in a style similar to APL that appears natural (-ish?) to someone fluent in that programming style.

kelas · 2 years ago

> not obfuscated

absolutely not. porting it to ISO C was a very fun and smooth ride, also added two adverbs atw forgot to add in 1989 (see over/scan) and a header file with some handy accessors (atw usually does that, but he was lazy that day)

> to someone fluent in that programming style

what people often don’t realize is just how fast one can pick up atwc style, and how hard it is to ever go back :)

clausecker · 2 years ago

Nope, Whitney just codes like this.

JaumeGreen · 2 years ago

This has been discussed before (through other links):

* https://news.ycombinator.com/item?id=8533843 [2014]

* https://news.ycombinator.com/item?id=28491562 [2021]

* https://news.ycombinator.com/item?id=25902615 [2021]

* https://news.ycombinator.com/item?id=34050715 [2022]

A commentary about the code https://news.ycombinator.com/item?id=22831931 [2020]

rramadass · 2 years ago

A previous HN thread on Arthur Whitney's "B" is relevant here. Follow my previous comment here - https://news.ycombinator.com/item?id=30416737 where user "yiyus" goes through the code line-by-line adding detailed notes for comprehension.

Also see https://github.com/tlack/b-decoded

zengid · 2 years ago

Obligatory posting of Bryan Cantrill's interview with Arthur Whitney https://queue.acm.org/detail.cfm?id=1531242

bcantrill · 2 years ago

The full audio of that was recorded but was never released -- this is reminding me that I should loop back with ACM to see if they still have it and can release it. In particular, I want to see how long the pause was when Arthur responded to my question "What do you think the analog for software is?": I won't give away his answer, but it more or less detonated my brain -- and it took me what felt like minutes (but was surely only seconds?) to put myself back together and ask a follow-up.

082349872349872 · 2 years ago

I agree with Arthur's answer, but unfortunately it runs afoul of Conway's Law: auteurs have a voice* and are capable of producing software in that style, but large organisations? By necessity, they must produce something qualitatively different, that anyone can slot anything into anywhere, optimised for superficial comprehensibility over elegance.

* sometimes small groups? Doug McIlroy said he was lucky to have managed a software group whose members would sit around in the lunch room and brainstorm (reminiscent of the Little Prince?) not what they could add, but what they could remove.

« Par ma foi, il y a plus de quarante ans que je dis de la prose sans que j'en susse rien, et je vous suis le plus obligé du monde de m'avoir appris cela. » —JBP

kelas · 2 years ago

sir, please - make an effort. recover that audio.

it is a f%ck#g shame that CHM failed to do their job and lost the legendary footage of the celebration of KEI. Arthur took the last word and opened with the iconic line “deeds of great men don’t need words, they need more deeds, so i’ll keep it short”. check out who else took the mike that afternoon:

https://computerhistory.org/events/celebration-kenneth-ivers...

kstrauser · 2 years ago

“…and that’s why we have code formatters.”

That’s neat and impressive. I’m glad I’m not required to read or understand it.

gfv · 2 years ago

It really doesn't help that it's written in ancient K&R C, but if you spend ten or so minutes just staring at it, familiar shapes and patterns start to appear. (Give it a try!)

Incidentally, it's in line with how APL code looks like an alien artifact at first, but you get used to it fast if you have spatial reasoning to wrap your head around reshaping and transposing.

hackernoteng · 2 years ago

If you focus on the middle and move your head back and forth eventually you see a 3D image of Dijkstra pop out and he doesn't look happy at all.

kelas · 2 years ago

porting from k&r to iso is super easy. fun, too.

rbonvall · 2 years ago

I once put in the effort to understand code like this, and it turned out to be straightforward once you learn the conventions:

https://news.ycombinator.com/item?id=19421524
