Very recently, Los Alamos National Lab published a report, "An evaluation of risks associated with relying on Fortran for mission critical codes for the next 15 years" [1]. In their summary, they write:
<quote>
Our assessment for seven distinct risks associated with continued use of Fortran are that in the next fifteen years:
1. It is very likely that we will be unable to staff Fortran projects with top-rate computer scientists and computer engineers.
2. There is an even chance that we will be unable to staff Fortran projects with top-rate computational scientists and physicists.
3. There is an even chance continued maintenance of Fortran codes will lead to expensive human or financial maintenance costs.
4. It is very unlikely that codes that rely on Fortran will have poor performance on future CPU technologies.
5. It is likely that codes that rely on Fortran will have poor performance for GPU technologies.
6. It is very likely that Fortran will preclude effective use of important advances in computing technology.
7. There is an even chance that Fortran will inhibit introduction of new features or physics that can be introduced with other languages.
</quote>
In my view, a language is destined for being a "maintenance language" if all of these are simultaneously true:
1. There is a dearth of people who know the language well.
2. Few people are opting to learn it in their free time, and/or seek it out for a job.
3. Companies do not seriously invest in training in learning the language.
4. Companies don't pay enough to convince an engineer who otherwise loves using other languages and has better prospects with them to use it.
I've experienced unique challenges in hiring Lisp programmers, but the fact it remains sufficiently interesting to enough software engineers (who are usually good programmers) has been a boon, and likewise providing space to learn it helps even more.
Fortran, though, is teetering on its historical significance and prevalence in HPC. Aside from the plethora of existing and typically inscrutable scientific code, I'm not sure what the big iron imperative language offers over the dozens of other superficially similar choices these days, except beaten-path HPC integration. Scientists are more and more writing their code in C++ and Python—definitively more complicated than Fortran but still regarded as premier languages for superlative efficiency and flexibility respectively. Julia has had a few moments, but (anecdotally) it doesn't seem to have taken off with a quorum of hardcore scientist-programmers.
I am one of the co-founders of the fortran-lang effort. I did it while I was at LANL, where I worked as a scientist for almost 9 years. I think the report is overly pessimistic. Here is my full reply on the report: https://fortran-lang.discourse.group/t/an-evaluation-of-risk....
And yet, we now have a compiler (Codon) that can compile Python into native code, also allowing direct use of LLVM IR and the ability to import C (and Fortran?) libraries.
So, I have a question (I didn't read the LANL paper, save for the highlights above, but I did read your initial post that you linked to).
And this isn't meant as some kind of bait, just honest intellectual curiosity (it's the internet, so you can't see how it's presented; just read the raw words).
Simply, why is it important for Fortran to survive in this space? Why not let the field move on? Why fight this fight?
> big iron imperative language offers over the dozens of other superficially similar choices
Given the capabilities of modern machines and the fact that non-homogenous hardware (GPUs, different processors like in Apple Silicon) is back, the "winning" strategy is to have high-level scripting languages where you can ignore most of the details, which call into hyper-optimized, high performance libraries. For instance, when you're using Scipy, you call into Fortran and C almost interchangeably.
Since most likely you aren't writing full programs in the low-level language, it doesn't need to be a general-purpose language offering the same affordances that, say, C++ is supposed to provide.
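A small illustration of that division of labor (using NumPy's LAPACK-backed solver as a stand-in for the SciPy case mentioned above; the mechanics are the same):

```python
import numpy as np

# Solving a dense linear system from Python: np.linalg.solve hands the
# work to LAPACK's gesv routine, which is compiled Fortran under the
# hood. The scripting layer never touches the hot loop.
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

x = np.linalg.solve(A, b)   # LAPACK does the heavy lifting
print(x)                    # → [2. 3.]
```

The Python code reads like the math, while all the cycles are spent inside a Fortran routine that predates most of its users.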
For numerical/scientific code, Fortran is much, much easier to use, especially for people whose background isn't computer science. Being able to write your code in terms of matrices and arrays rather than "pointer to const restrict" is a godsend when you're programming what is often very difficult mathematical code. When compared to modern C or C++ you won't get any advantage in terms of performance, but you don't lose much either, and in return you get to use a language that's much more suited to the task.
The historical Achilles' heel of Fortran is that it's kind of awkward as a general-purpose language, but that's negated by the "compiled core, interpreted shell" approach that's dominant these days.
> the "winning" strategy is to have high-level scripting languages where you can ignore most of the details, which call into hyper-optimized, high performance libraries. For instance, when you're using Scipy, you call into Fortran and C almost interchangeably.
That gets you about 70% of the way there, but the remaining 30% is tricky. Unfortunately, the library-call stage doesn't easily (with generality) let you fuse operations. For example, consider the example of a finite-volume code, where the flux is some nonlinear function of the interface value, taken to be the average of left and right cells:
flux = nonlinear_flux_op(0.5*(f[:-1] + f[1:])) # Compute interface fluxes from cell averages
f[1:-1] += dt/dx*(flux[:-1] - flux[1:]) # Update cell volumes based on flux differences
# Do something special for the boundaries
Each of the substeps here is fast. Sub-indexing these 1D arrays is trivial, and their local operations are highly vectorized. The nonlinear flux operator operates element-wise, so it's easy to compile it into a version that properly (CPU) vectorizes (and/or it might already be provided, like np.exp or *2). This code is also very readable.
However, real performance would require fusing operations. There's no need to compute a full temporary array for the interface values, but that's exactly what would happen in naive NumPy code. Likewise, there's no need to compute all of the fluxes before taking their differences.
A hand-optimized code would perform the full computation in a single scan of the array, keeping all of the data in CPU cache (blocking by a cache line or so in order to use vector registers effectively). Even given naively written code, an optimizing compiler would try loop fusion and get most of the way to the hand-optimized result.
Code such as this is relatively FLOPS-light compared to the amount of memory accessed, so effective use of cache can lead to an order of magnitude speed-up.
With a fair amount of work you can still achieve this result in Python, but to do so you need to use one of the optimizing Python compilation tools like numba. Unfortunately, this is not easy for the not-a-programmer domain scientist, since working with numba means understanding the intersection of Python and numba's imposed compilation rules.
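To make the fusion point concrete, here is a sketch of both versions in plain Python, using a hypothetical Burgers-type flux (0.5*u*u) as a stand-in for the nonlinear operator (not the original code). The fused loop is exactly the kind of function one would hand to numba's @njit, or write in Fortran:

```python
import numpy as np

def flux_op(u):
    # Stand-in nonlinear flux (Burgers-type); the real operator is
    # whatever the physics dictates.
    return 0.5 * u * u

def step_numpy(f, dt, dx):
    # Naive NumPy: materializes full temporary arrays for the interface
    # values and the fluxes before differencing them.
    iface = 0.5 * (f[:-1] + f[1:])
    flux = flux_op(iface)
    out = f.copy()
    out[1:-1] += dt / dx * (flux[:-1] - flux[1:])
    return out

def step_fused(f, dt, dx):
    # Fused single scan: one pass over the array, carrying only the
    # previous interface flux in a scalar, so the data stays in cache.
    out = f.copy()
    c = dt / dx
    flux_left = flux_op(0.5 * (f[0] + f[1]))
    for i in range(1, len(f) - 1):
        flux_right = flux_op(0.5 * (f[i] + f[i + 1]))
        out[i] += c * (flux_left - flux_right)
        flux_left = flux_right
    return out
```

Both produce identical results; the fused version simply never allocates the intermediate arrays. As pure Python the loop is of course slow, which is why it needs a compiler (numba, or Fortran) to actually pay off.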
> the "winning" strategy is to have high-level scripting languages where you can ignore most of the details, which call into hyper-optimized, high performance libraries. For instance, when you're using Scipy, you call into Fortran and C almost interchangeably.
Well, no. This is python's strategy. Doesn't make it the winning strategy. Python implicitly forces multiple languages upon you. A scripting one, and a performance one. Meanwhile languages such as Julia, Rust, etc. allow you to do the work in a single (fast/compiled) language. Much lower cognitive load, especially if you have multiple cores/machines to run on.
Another point I've been making for 30+ years in HPC is that data motion is hard. Not simply between machines, but between process spaces. Take large, slightly complex data structures in a fast compiled language and move them back and forth to a scripting front end. This is hard, as each language has its own specific memory layout for the structures, and impedance matching between them means you have to make trade-offs. These trade-offs often result in surprising issues as you scale up data structure size. Which is one of the reasons that only the simplest of structures (vectors/arrays) get implemented in cross-language scenarios.
Moreover, these cross language boundaries implicitly prevent deeper optimization. Which leads to development of rather different scenarios for code development, including orthogonal not-quite-python based things (Triton, numba, etc.).
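One small, everyday way the layout mismatch shows up, even in the NumPy/Fortran happy path (a sketch, not tied to any particular code):

```python
import numpy as np

# NumPy arrays are C-ordered (row-major) by default; Fortran routines
# expect column-major. Handing a C-ordered 2-D array across the
# boundary forces a copy to rearrange the memory layout.
a = np.arange(6.0).reshape(2, 3)
assert a.flags['C_CONTIGUOUS'] and not a.flags['F_CONTIGUOUS']

fa = np.asfortranarray(a)           # layout conversion => a real copy
assert fa.flags['F_CONTIGUOUS']
assert not np.shares_memory(a, fa)  # the data was duplicated

# A 1-D array is trivially both layouts at once, which is one reason
# simple vectors are what usually survive the language boundary intact.
v = np.arange(4.0)
assert v.flags['C_CONTIGUOUS'] and v.flags['F_CONTIGUOUS']
```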
Fortran is a great language, and as one of the comments pointed out, it's really not that hard to learn/use. The rumors of its demise are greatly exaggerated. And I note with some amusement that they've been going on since I was in graduate school, some 30-35 years ago. Yet people keep using it.
> the "winning" strategy is to have high-level scripting languages where you can ignore most of the details
From personal experience, I don't believe this works in practice. Invariably, some level of logic is implemented in said scripting language, and it becomes a huge bottleneck. Counterintuitively, when you have a central thread doling out work to many subroutines, the performance of that thread becomes more critical, not less.
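A toy illustration of the effect, using a square root as a stand-in for the real per-item work (the shape of the problem, not an actual workload):

```python
import numpy as np

x = np.linspace(0.0, 10.0, 10_000)

# Central Python loop handing one element at a time to a fast kernel:
# each call is cheap, but the interpreted dispatch around it dominates
# the runtime as the number of items grows.
def per_element(x):
    return np.array([np.sqrt(v) for v in x])

# The same work pushed down into a single library call: the loop runs
# in compiled code and the scripting layer is out of the hot path.
def one_call(x):
    return np.sqrt(x)

assert np.allclose(per_element(x), one_call(x))
```

Both compute the same thing; timing them shows the dispatch loop losing by a wide margin, and no amount of speeding up the kernel itself fixes that.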
(I should preface this by saying that my claims are all based on what I see in a specific scientific community and US laboratory environment. I can't say for sure my observations are true more generally.)
I think what you say is true in principle, basically on all counts, but Fortran's niche that its advantages serve best has been continually eaten away at by the Python/C++ combination.
The "I don't know pointers" crowd steers toward Python, and the "I care about CPU cycles" crowd steers toward C++, and that relationship is symbiotic.
Julia promised to be a general-purpose language, both as fast (or faster) than Fortran/C++ and as easy (or easier) than Fortran/Python. But it doesn't seem to have panned out as they'd perhaps have hoped.
> "the "winning" strategy is to have high-level scripting languages where you can ignore most of the details, which call into hyper-optimized, high performance libraries. For instance, when you're using Scipy, you call into Fortran and C almost interchangeably"
That's nice as long as it works, but in my experience gets ugly when it doesn't. While we can currently witness this being a successful model, the impedance mismatch between both worlds is often a pain.
I bet on a language like Rust as the winning strategy in the long run - high level like Python but still close to the metal and all in one language, one compiler, one ecosystem.
In my experience, "very difficult mathematical code" stems from employing little or no abstraction of the problem domain, mashing together numerical algorithms with physics formulas. Using Fortran is IMO as much a prerequisite for this as a result of it.
I fully agree with you that scripting languages are a great fit for scientific computing. But I just don't see how this is a case for Fortran.
R or Python are still much easier to use and give you the same access to high-performance numerical accelerators via BLAS/LAPACK backends. And if you need a low-level language (e.g when developing an optimised routine that solves a specific problem), I would use a high-performance widely supported systems programming language like C++ or Rust.
Fortran seems to be around mostly for legacy reasons. And I am not sure that legacy considerations alone make a good foundation.
> 1. It is very likely that we will be unable to staff Fortran projects with top-rate computer scientists and computer engineers.
They are mentioning "top-rate", so I'm assuming salary is not an issue. I can understand for a scientist whose job is not programming (where programming is just a necessary activity). But for a "top-rate" computer engineer, given a good enough salary and an interesting field, how hard can it be to learn enough Fortran for this?
Or am I misunderstanding what is a computer engineer?
Even for a top-rate scientist actually. Surely the programming language is not the hardest part of the job?
(not saying they should not try to move away from Fortran, there are better solutions for this field now, I mostly agree with this list full of insight)
Salaries for software engineers at a US national lab like LANL don't deviate far from the $100k range [1]. For many projects, a security clearance and on-site presence are required. Add to that the requirement to train in and use Fortran (old and new), and the job proposition only looks good to (1) people who love the problem domain (e.g., nuclear stockpile stewardship), (2) people who are specialized scientists who moonlight as programmers, or (3) people who simply like government lab environments (stability, science, ...).
It's not that learning Fortran is hard; obviously if you're good you can acquire any language, especially in a paradigm you already know. But why?
That's the thing for these "top-rate" posts, you need to sell the candidates on why they want to take your offer, not some other role they could have. Yes, one option is to "just" pay them a tremendous amount of money, but remember you're competing with the banks who have the same strategy and their whole business is making money. So that would be very expensive.
Instead it makes sense to compete on other considerations. Candidate doesn't like to work in the office? No problem. Stay where you are, work for us remotely. Or maybe they'd like to move their grandparents and extended family to your country? You could make that happen. Or could you? Do you think the US Federal Government will accept "We need to naturalize these sixteen random people immediately to get one Fortran programmer"?
And the choice of language is one of those other considerations. To a first approximation nobody is more keen to take your offer because they get to write Fortran.
While I agree that a computer scientist should be able to pick up any language with a bit of training, becoming effective with it can sometimes take years of continued use. That's a steep investment for any company/lab.
I think it's the oldest joke in CS: "I don't know what we will be programming in the year 3000, but I know it will be called FORTRAN."
My own renaissance came via the Jetson Nano, where CUDA Fortran has some nice examples for image processing via convolution kernels. It's blazingly fast: both design time and run time!
This short film from 1982 shows the love scientists feel for "the infantile disorder":
> My own renaissance came via the Jetson Nano, where CUDA Fortran has some nice examples for image processing via convolution kernels. It's blazingly fast: both design time and run time!
CUDA Fortran was amazing. It had a really nice syntax, which did not feel that odd next to standard Fortran, and great performance. But it faced an uphill battle, was not really well managed and suffered from being coupled with the PGI compiler. I wish they’d put it in gfortran instead.
When a job I'm interested in uses a language I haven't used before I say that I've learned a couple already and I'll quickly learn the next one. I'd generally say each additional language has been easier than the last. I've landed contracts to code in languages I didn't know and got positive feedback about my performance.
Is there anything unique about Fortran that prevents experienced engineers from quickly learning it on the job?
> ...prevents experienced engineers from quickly learning it on the job
The problem is in the longevity of Fortran. The variety of codes accumulated over time makes for more than one Fortran: modern Fortran may appear to be a new language, yet it remains largely compatible with 'classic' Fortran. MPI adds another dimension to learning it, though there's tooling around it.
The tolerance for single-letter variables and cryptic function names could also be a factor, no kidding.
In general, asking for prior experience with Fortran is reasonable, just as hiring someone to "quickly learn" C on the job is quite non-pragmatic.
> Is there anything unique about Fortran that prevents experienced engineers from quickly learning it on the job?
No. I think that Fortran is a convenient scapegoat for LANL, not an actual obstacle. I discuss what I think the real problem is in this comment: https://news.ycombinator.com/item?id=37292569
The language itself? In broad strokes, not really. The typical tooling? Definitely yes, it's usually very HPC centric (MPI, OpenMP, batch job schedulers, ...).
On LANL's points numbers 1 to 3, here's a perspective from the other side:
I think their main problem hiring is that they are looking for perfect candidates. (The same may be true for some of the other DOE labs like LLNL as well.) I've interviewed for at least 3 positions at LANL. I have a security clearance and would love to work for them, but I have never received an offer from them since finishing my PhD. (I did a summer internship with them a long time ago.) Issues like there being relatively few Fortran programmers are convenient scapegoats. They seem to want someone who not only knows Fortran, but also can easily get a security clearance or already has one, can write highly optimized MPI code, is familiar with the unusual numerical methods LANL uses, etc. They seem unwilling to hire people who are 80% of the way there. I've never programmed with MPI, for example, but understand the basics of parallel programming and have programmed shared memory codes. The most recent LANL group I interviewed with also didn't like that I quit a previous postdoc due to disagreements about the direction of the project.
In contrast, in my current job at a DoD contractor, they didn't seem to care much about what I did before or knew previously. They apparently hired me because I had some familiarity with their problem and could learn what I didn't know. I've done well in this job working on things mostly new to me. I'd like to leave because of the bad job security and bad benefits, but otherwise, I have no major problems. (And yes, I do use Fortran on a daily basis here, including some horrendous legacy Fortran.)
Given this, I don't think LANL's hiring problems will stop if they switch all their active software to C++ or some other language, because that's only part of the problem.
Edit: Another problem is that LANL's hiring process is convoluted. I almost got an offer from them two years ago, as the group I interviewed with liked me. But they required me to write an internal proposal which would have to be approved by someone outside of the group. I think this is part of LANL's standard post-doc hiring process, though at the time I thought it indicated that they didn't have the money to hire me and were trying to get it. There was no guarantee that the proposal would be approved. I didn't have the time then and backed out, though in retrospect I wish I had gone through with it, as I've heard this is more routine than I thought it was.
Edit 2: Also, for clarity, not every position I applied to required Fortran knowledge. But I think my basic message is correct, in that LANL tends to be fixated on hiring "perfect candidates" when there are plenty of capable people that they could hire instead who just need to learn some things, or might not have a perfect job record (gaps in employment, etc.).
> I think their main problem hiring is that they are looking for perfect candidates.
I've noticed this with government and government-adjacent stuff: they are choosy beggars.
The salaries they offer tend to cap well below what you can make in the private sector, the jobs are often in weird locations and don't offer relocation, and advancement just seems terribly bureaucratic.
And yet, it seems they won’t hire you if you don’t hit their checklist to a T. I applied for a NOAA job, that wasn’t too crazy, and for the most part, was a pretty good fit for my resume. However, the application had a checkbox question along the lines of “have you ever written software for [hyper specific domain]”. I answered no and was unsurprisingly rejected. By the way, this wasn’t some wild scientific computing project, it was basically a job writing CRUD software for something that sounded interesting to me.
I really wonder how some roles get filled, I’ve seen some ridiculous asks.
As a "way forward" report, it covers only half the problem.
Identification of the risks including the likelihoods and hazards is pretty good. I like the notion of contrasting opinions included in the report.
But the big thing missing is mitigations: what can they do about training, about recruiting, and especially about "industry" investing in compilers? This report says nothing about ways to assure continued usability of Fortran-based codes in the next 10 or 15 years. It just lists risks. What can they do to improve the capability of GPU compilers, CPU compilers, training or staff development?
And, setting Fortran aside, what are the plans to assure continued capability in supporting and developing all of their codes, in whatever languages, in the next 10 or 15 years? This evaluation might well be replayed for CUDA, Python or Julia in the next 5 years.
The US budget for supercomputing must be in the $2B to $5B per year range, yet it seems the notion of planning a software strategy for supercomputing is not achievable.
About 4: I think Fortran looks a lot like C from the CPU architecture's point of view. Both are somewhat low-level procedural languages.
As long as C works well on future CPUs, Fortran is probably fine, and I don't see C disappearing overnight. At the very least, Fortran compilers can be updated to produce reasonable code for new CPUs, which should happen naturally because of the shared compiler architecture (backend).
About 5: GPU architectures seem less stable and seem to require specific development. If those specific developments require language support, that's probably not coming to Fortran if the assumption is that Fortran is going maintenance-only.
About 6: advances in computing technology are not restricted to CPUs and GPUs.
(disclaimer: I've only seen Fortran code from far away. And code targeted to GPUs too).
Is stack-free function calling equivalent to tail call optimization? If so, then many CL implementations support it, including SBCL. For Scheme, it’s part of the standard.
The report mentions both "modern Fortran" (which they explicitly specify to be Fortran 2008) and "legacy Fortran" (which they don't further specify, but I assume is 77).
Fortran is in a very strange position in the current tech landscape.
There is a race of sorts going on, to simplify and commoditize high performance computing in the post-Moore's era. This means enabling things like easy GPU programming, easy multi-core programming, easy "number crunching" etc. Historically this was the stuff of esoteric HPC (high performance computing) pursued by highly specialized computational scientists etc. (read -> an important but non-economic domain that subsisted largely on government funds).
Yet suddenly the same arcane HPC is effectively also the domain of "AI" and the associated planetary hyperventilation. Now all sorts of modern initiatives and language ecosystems aim to reposition to gainfully intermediate that AI revolution. Python was the scripting language that somewhat unexpectedly enabled all this, but now its warts are under the microscope and people think of various alternatives (Julia), supersets (Mojo), newcomers like Rust, etc. But that race is definitely not settled, and it might not be for quite some time. The requirements are highly non-trivial.
As far as taking part in this race, Fortran seems to have missed not one but several boats. Yet the fact remains that it was a language designed for number crunching and evolved as a language for HPC, with parallel computing features (since F90) etc. It combines easy-to-read imperative numerical code with near-optimal computational performance.
It's like a former world champion in a sport that fell out of favor for a long time but is now unexpectedly back in vogue. So the question is whether the champ can get back into shape or should make room for new contenders :-)
Fortran-lang's role (as an open-source org) has been 4-pronged: Tooling (build system and package manager, testing, eventually compilers etc.), modernized and maintained libraries (stdlib, minpack, fftpack, etc.), community space (Discourse), and evangelism/marketing (website, Twitter, blog posts etc.). Some members participate in the standardization process of the language, but the groups and processes are separate and complementary.
It's true that one goal may be to pick an important race and try to win it.
Another goal, in my view more important, is to make Fortran more pleasant to use for people/orgs who need it (there are many) and for people who love it (there are many).
I've found that more often than not, people/teams first like working with a technology, and then come up with technical arguments for why that technology is the best choice. Often the arguments are valid, sometimes they're made up, but ultimately underneath it all you either like it or not and that's all that matters. My goal with Fortran-lang has been to slowly and continuously increase the surface area of Fortran's likability. Fortran is not for everyone, but for people who think it may be, we can work to make it better and more pleasant to use.
As one example, we just released a small library to make high-level HTTP requests from Fortran applications: https://github.com/fortran-lang/http-client. This was a product of one of our Google Summer of Code contributors.
> people/teams first like working with a technology, and then come up with technical arguments for why that technology is the best choice.
Words of wisdom (:
Maybe it's not the way things should be, but it's common and very human. Sometimes I'll even catch myself retconning explanations for my choices in my own head, despite myself.
Very well put. Also a subset of Python: https://lpython.org/, and it shares the internals with https://lfortran.org/, both run at the same speed. If you have any feedback, please let us know.
I've noticed that people balk at Fortran's line length as if it were some archaic holdover (which it somewhat is) that proves its obsolescence, but in the next breath they berate everyone for not using a formatter that chunks lines into 100 characters in their favorite "modern" language. I guess the problem is that Fortran chose 80 as the line length and not 100.
Having used both Fortran and Julia, Julia generally had less friction and comparable speed. JIT compilation is great for keeping the edit-run loop short. I definitely prefer Fortran to Python, though, for math programming.
As weird as it may sound, I prefer the old fixed-form source, with continuation characters and statements mandated to start in column 7. Like Python, it made for consistent code formatting across people and libraries.
Looking forward to Mojo. Right now it's not production ready, but it'll be great for some of my ML-adjacent code. For other HPC work I've used Rust, which has some annoying syntax and lacks some of the simplicity of Python.
I can recommend this recorded live stream to get an impression of what happens when trying to use Fortran without previous experience:
“Learning Fortran in 2023” https://www.youtube.com/watch?v=PvUQndB8R9s
Very understanding people from what I see. I might pick Fortran to do my course exercises this year alongside the required use of Python.
This might be interesting, and worthwhile.
I wish there were more serious studies on "Matlab risk". There's a lot on $old_lang risk already, and Excel risk is now a serious thread of research, but many in academia (and very possibly greybeards in industrial R&D) just won't budge from Matlab. So e.g. all the codes from my thesis are in Matlab because my advisor wouldn't Python.
Matlab in theory has OO, but it's very slow and who bothers to verify that? So practically everything is matrices and there's very little "semantic expressed in code itself" (to paint issues with a broad brush). Also matrix calculations can get botched numerically, particularly matrix exponentials, but the whole culture (that I can't even expunge from my own thinking) is that I have a theorem so tests schmests.
Matlab is great for a lot of things, though it's not really a great general-purpose language. Its main problem is that it's not free (though Octave exists, obviously), which limits interoperability. Octave can trivially be embedded in C++ and Python; if it were easier to do that with Matlab, I wonder if NumPy would have ever existed... (after all, NumPy is essentially a port of Matlab semantics to Python; most Python numeric programmers are unwittingly writing Matlab already).
This process of language love and hatred over a short period is what's called a language fad. Ten years ago, people wrote articles praising MATLAB over established languages. I do not recall any of those writings ever mentioning MATLAB's licensing as an issue. MATLAB is now >1000-fold better than ten years ago. Yet the new generation throws it under the bus daily because it's not their favorite. Change is the only constant in the world of programming fashion and language fads.
MATLAB is incredible for "I have some data, I want to do a bunch of calculations and then spit out nice plots". It's why matplotlib is a thing. But it's not at all well suited for OO, building larger software applications, or CI & testing, which is at least partially why the former is the case.
I am currently working on a project trying to convert some numerical linear algebra from Fortran 90 into Python, and was frankly shocked at how much more performant some relatively naive Fortran is, even over using NumPy/SciPy for matrix operations. Orders and orders of magnitude faster. No, I still don't totally understand why.
I would absolutely expect Python to perform orders of magnitude slower. Even optimal Python calling NumPy is going through multiple levels of abstraction, copying data around. Fortran is raw operations, compiled and with the data packed together.
I would strongly suspect that you are misusing Numpy in some way.
Python is horrifically slow, everyone knows that, but if used correctly NumPy can level the playing field a great deal. I've seen lots of Python/NumPy written using nested loops, and this is always going to be really, really slow.
This is somewhat of a link dump, since the code is definitely too hard to interpret for someone not familiar with it, but this is me converting some fairly performant Fortran code to Python with numba at only a 1-2% performance penalty.
You're probably doing some unnecessary data movement. Using a modern profiler like scalene might point out where some improvements are possible: https://github.com/plasma-umass/scalene
There are lots of avoidable copies in NumPy, and even if you manage to get rid of them, the Python interpretation overhead isn't amortized until your matrix dimensions are in the several hundreds at least.
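A sketch of where those copies come from, and a common (if tedious) workaround; this is generic NumPy behavior, not specific to the parent's code:

```python
import numpy as np

a = np.ones(1_000)
b = np.ones(1_000)

# Naive expression: NumPy materializes two temporaries, one for a*2
# and another for (a*2) + b, before the result is bound to c.
c = a * 2 + b

# Copy-avoiding version: reuse a single preallocated buffer via out=.
out = np.empty_like(a)
np.multiply(a, 2, out=out)
np.add(out, b, out=out)

assert np.allclose(c, out)   # same result, fewer allocations
```

For small arrays the interpreter overhead per call still dominates either way, which is the parent's second point.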
Consistently separating words by spaces became a general custom about the tenth century A.D., and lasted until about 1957, when FORTRAN abandoned the practice.
-- Sun FORTRAN Reference Manual
Fortran itself is pretty easy to learn; the stuff people usually have problems with is linking with external libraries, job control, and the execution environment generally. Meh, same with most languages, I suppose.
I've been a bit immersed in pre-Fortran-77 code (so FORTRAN IV and FORTRAN 66). It was an interesting time: FORTRAN was the most standard language, so people used it for things like compilers. An example I have is Motorola's 6800 assembler, which was delivered as a tape for various common machines of the day:
But this was before FORTRAN had strings, or even bit-wise operators. You could, however, read an input line into an array of integers. Then, to read each character from this array, you used integer multiplication and division to perform shifting and masking. Although in this version, I can see calls to assembly-language helper functions.
[1] https://permalink.lanl.gov/object/tr?what=info:lanl-repo/lar...
Perhaps interoperability is the way forward.
And this isn't meant as some kind of bait, just honest intellectual curiosity (it's the internet, so you can't see how this is presented; just read the raw words).
Simply, why is it important for Fortran to survive in this space? Why not let the field move on? Why fight this fight?
Given the capabilities of modern machines and the fact that non-homogeneous hardware (GPUs, different processors like in Apple Silicon) is back, the "winning" strategy is to have high-level scripting languages where you can ignore most of the details, which call into hyper-optimized, high-performance libraries. For instance, when you're using SciPy, you call into Fortran and C almost interchangeably.
Since most likely you aren't writing full programs in the low-level language, it doesn't need to be a general-purpose language offering the same affordances, say, C++ is supposed to provide.
For numerical/scientific code, Fortran is much, much easier to use, especially for people whose background isn't computer science. Being able to write your code in terms of matrices and arrays rather than "pointer to const restrict" is a godsend when you're programming what is often very difficult mathematical code. When compared to modern C or C++ you won't get any advantage in terms of performance, but you don't lose much either, and in return you get to use a language that's much more suited to the task.
The historical Achilles' heel of Fortran is that it's kind of awkward as a general-purpose language, but that's negated by the "compiled core, interpreted shell" approach that's dominant these days.
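For comparison, the whole-array style being praised here has a close NumPy analogue (a sketch; the variable names are arbitrary):

```python
import numpy as np

# Whole-array arithmetic, similar in flavor to Fortran's array syntax
# (no explicit loops, no pointer bookkeeping):
a = np.arange(4, dtype=float)    # [0., 1., 2., 3.]
b = np.ones(4)
c = a + 2.0 * b                  # one expression over entire arrays
```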
That gets you about 70% of the way there, but the remaining 30% is tricky. Unfortunately, the library-call stage doesn't easily (with generality) let you fuse operations. For example, consider the example of a finite-volume code, where the flux is some nonlinear function of the interface value, taken to be the average of left and right cells:
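The original snippet isn't reproduced here, but a minimal NumPy sketch of the substeps described might look like this (illustrative names, with Burgers' flux standing in for "some nonlinear function"):

```python
import numpy as np

# Illustrative sketch only. u holds the 1D cell averages; Burgers' flux
# is a stand-in for whatever nonlinear flux the scheme uses.
def flux(v):
    return 0.5 * v * v

def step(u, dt_over_dx):
    u_face = 0.5 * (u[:-1] + u[1:])   # interface values: average of L/R cells
    f = flux(u_face)                  # element-wise nonlinear flux
    u_new = u.copy()
    u_new[1:-1] -= dt_over_dx * (f[1:] - f[:-1])   # flux differences
    return u_new
```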
Each of the substeps here is fast. Sub-indexing these 1D arrays is trivial, and their local operations are highly vectorized. The nonlinear flux operator operates element-wise, so it's easy to compile it into a version that properly (CPU) vectorizes (and/or it might already be provided, like np.exp or *2). This code is also very readable.

However, real performance would require fusing operations. There's no need to compute a full temporary array for the interface values, but that's exactly what would happen in naive NumPy code. Likewise, there's no need to compute all of the fluxes before taking their differences.
A hand-optimized code would perform the full computation in a single scan of the array, keeping all of the data in CPU cache (blocking by a cache line or so in order to use vector registers effectively). Even if given the code naively written, an optimizing compiler would try loop fusion and get most of the way to the hand-optimized result.
Code such as this is relatively FLOPS-light compared to the amount of memory accessed, so effective use of cache can lead to an order of magnitude speed-up.
With a fair amount of work you can still achieve this result in Python, but to do so you need to use one of the optimizing Python compilation tools like numba. Unfortunately, this is not easy for the not-a-programmer domain scientist, since working with numba means understanding the intersection of Python and numba's imposed compilation rules.
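For illustration, the fused single-scan version might look like the following (plain Python for clarity; in practice a Fortran compiler's loop fusion, or numba's @njit applied to a loop like this, is what makes it fast):

```python
# A sketch of the hand-fused update: one pass over the data, no temporary
# arrays, each interface flux computed exactly once and carried forward.
def fused_step(u, dt_over_dx, flux):
    out = list(u)                      # boundary cells left untouched
    f_left = flux(0.5 * (u[0] + u[1]))
    for i in range(1, len(u) - 1):
        f_right = flux(0.5 * (u[i] + u[i + 1]))
        out[i] = u[i] - dt_over_dx * (f_right - f_left)
        f_left = f_right               # reuse instead of recompute
    return out
```

All the intermediate values stay in registers or cache instead of being written out as full-length temporaries, which is exactly the fusion the naive array code misses.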
Well, no. This is python's strategy. Doesn't make it the winning strategy. Python implicitly forces multiple languages upon you. A scripting one, and a performance one. Meanwhile languages such as Julia, Rust, etc. allow you to do the work in a single (fast/compiled) language. Much lower cognitive load, especially if you have multiple cores/machines to run on.
Another point I've been making for 30+ years in HPC, is that data motion is hard. Not simply between machines, but between process spaces. Take large slightly complex data structures in a fast compiled language, and move them back and forth to a scripting front end. This is hard as each language has their own specific memory layout for the structures, and impedance matching between them means you have to make trade-offs. These trade-offs often result in surprising issues as you scale up data structure size. Which is one of the reasons that only the simplest of structures (vectors/arrays) are implemented in a cross language scenario.
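A toy illustration of the layout issue, using only the standard library (real cross-language structures are messier, but the copy/convert cost at the boundary is the same in kind):

```python
import array

# A Python list of floats is an array of pointers to boxed heap objects;
# a compiled kernel wants one contiguous buffer of raw doubles.
boxed = [float(i) for i in range(4)]
packed = array.array('d', boxed)        # copy into a C-layout double[] buffer

addr, n = packed.buffer_info()          # (buffer address, element count)
assert n == 4 and packed.itemsize == 8  # 4 contiguous 8-byte doubles
assert list(packed) == boxed            # crossing back re-boxes: another copy
```

Every trip across the boundary is a conversion between the two layouts, which is cheap for flat vectors and increasingly painful for anything nested.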
Moreover, these cross language boundaries implicitly prevent deeper optimization. Which leads to development of rather different scenarios for code development, including orthogonal not-quite-python based things (Triton, numba, etc.).
Fortran is a great language, and as one of the comments pointed out, it's really not that hard to learn or use. The rumors of its demise are greatly exaggerated. And I note with some amusement that they've been going on since I was in graduate school, some 30-35 years ago. Yet people keep using it.
From personal experience, I don't believe this works in practice. Invariably, some level of logic gets implemented in said scripting language, and it becomes a huge bottleneck. Counterintuitively, when you have a central thread doling out work to many subroutines, the performance of that thread becomes more critical, not less.
I think what you say is true in principle, basically on all counts, but Fortran's niche that its advantages serve best has been continually eaten away at by the Python/C++ combination.
The "I don't know pointers" crowd steers toward Python, and the "I care about CPU cycles" crowd steers toward C++, and that relationship is symbiotic.
Julia promised to be a general-purpose language, both as fast (or faster) than Fortran/C++ and as easy (or easier) than Fortran/Python. But it doesn't seem to have panned out as they'd perhaps have hoped.
That's nice as long as it works, but in my experience gets ugly when it doesn't. While we can currently witness this being a successful model, the impedance mismatch between both worlds is often a pain.
I bet on a language like Rust as the winning strategy in the long run - high level like Python but still close to the metal and all in one language, one compiler, one ecosystem.
R or Python are still much easier to use and give you the same access to high-performance numerical accelerators via BLAS/LAPACK backends. And if you need a low-level language (e.g when developing an optimised routine that solves a specific problem), I would use a high-performance widely supported systems programming language like C++ or Rust.
Fortran seems to be around mostly for legacy reasons. And I am not sure that legacy considerations alone make a good foundation.
> 1. It is very likely that we will be unable to staff Fortran projects with top-rate computer scientists and computer engineers.
They are mentioning "top-rate", so I'm assuming salary is not an issue. I can understand this for a scientist whose job is not programming (programming being just a necessary activity). But for a "top-rate" computer engineer, given a good enough salary and an interesting field, how hard can it be to learn enough Fortran for this?
Or am I misunderstanding what is a computer engineer?
Even for a top-rate scientist actually. Surely the programming language is not the hardest part of the job?
(not saying they should not try to move away from Fortran, there are better solutions for this field now, I mostly agree with this list full of insight)
[1] A SWE4 (BS + 12 YoE) has a pay scale of $109k - $182k and requires a Q clearance (top secret for nuclear): https://lanl.jobs/search/jobdetails/senior-software-applicat...
That's the thing for these "top-rate" posts, you need to sell the candidates on why they want to take your offer, not some other role they could have. Yes, one option is to "just" pay them a tremendous amount of money, but remember you're competing with the banks who have the same strategy and their whole business is making money. So that would be very expensive.
Instead it makes sense to compete on other considerations. Candidate doesn't like to work in the office? No problem, stay where you are and work for us remotely. Or maybe they'd like to move their grandparents and extended family to your country? You could make that happen. Or could you? Do you think the US Federal Government will accept "We need to naturalize these sixteen random people, immediately, to get one Fortran programmer"?
And the choice of language is one of those other considerations. To a first approximation nobody is more keen to take your offer because they get to write Fortran.
The challenge isn't that people can't learn it, it's that they don't want to because it's a bad career move.
My own renaissance came via the Jetson Nano, where CUDA Fortran has some nice examples for image processing via convolution kernels. It's blazingly fast: both design time and run time!
This short film from 1982 shows the love scientists feel for "the infantile disorder":
The Beginnings of FORTRAN (Complete)
https://www.youtube.com/watch?v=KohboWwrsXg
CUDA Fortran was amazing. It had a really nice syntax, which did not feel that odd next to standard Fortran, and great performance. But it faced an uphill battle, was not really well managed and suffered from being coupled with the PGI compiler. I wish they’d put it in gfortran instead.
Is there anything unique about Fortran that prevents experienced engineers from quickly learning it on the job?
The problem is the longevity of Fortran. The variety of code accumulated over time adds up to more than one Fortran: modern Fortran may appear to be a new language, yet it remains very compatible with 'classic' Fortran. MPI adds another dimension to learning it, though there's tooling around it.
Tolerance for single-letter variables and cryptic function names could also be a factor, no kidding.
In general, asking for prior experience with Fortran is reasonable, just as hiring someone to "quickly learn" C on the job is impractical.
Perceived prestige, or rather the lack thereof. People with CS background often feel that learning Fortran is beneath them.
No. I think that Fortran is a convenient scapegoat for LANL, not an actual obstacle. I discuss what I think the real problem is in this comment: https://news.ycombinator.com/item?id=37292569
I think their main problem hiring is that they are looking for perfect candidates. (The same may be true for some of the other DOE labs like LLNL as well.) I've interviewed for at least 3 positions at LANL. I have a security clearance and would love to work for them, but I have never received an offer from them since finishing my PhD. (I did a summer internship with them a long time ago.) Issues like there being relatively few Fortran programmers are convenient scapegoats. They seem to want someone who not only knows Fortran, but also can easily get a security clearance or already has one, can write highly optimized MPI code, is familiar with the unusual numerical methods LANL uses, etc. They seem unwilling to hire people who are 80% of the way there. I've never programmed with MPI, for example, but understand the basics of parallel programming and have programmed shared memory codes. The most recent LANL group I interviewed with also didn't like that I quit a previous postdoc due to disagreements about the direction of the project.
In contrast, in my current job at a DoD contractor, they didn't seem to care much about what I did before or knew previously. They apparently hired me because I had some familiarity with their problem and could learn what I didn't know. I've done well in this job working on things mostly new to me. I'd like to leave because of the bad job security and bad benefits, but otherwise, I have no major problems. (And yes, I do use Fortran on a daily basis here, including some horrendous legacy Fortran.)
Given this, I don't think LANL's hiring problems will stop if they switch all their active software to C++ or some other language, because the language is only part of the problem.
Edit: Another problem is that LANL's hiring process is convoluted. I almost got an offer from them two years ago, as the group I interviewed with liked me. But they required me to write an internal proposal which would have to be approved by someone outside of the group. I think this is part of LANL's standard post-doc hiring process, though at the time I thought it indicated that they didn't have the money to hire me and were trying to get it. There was no guarantee that the proposal would be approved. I didn't have the time then and backed out, though in retrospect I wish I went through with it as I've heard this is more routine than I thought it was.
Edit 2: Also, for clarity, not every position I applied to required Fortran knowledge. But I think my basic message is correct, in that LANL tends to be fixated on hiring "perfect candidates" when there are plenty of capable people that they could hire instead who just need to learn some things, or might not have a perfect job record (gaps in employment, etc.).
I’ve noticed this with government and government adjacent stuff: they are choosing beggars.
The salaries they offer tend to cap well below what you can make in the private sector, the positions are often in weird areas and don't offer relocation, and advancement just seems terribly bureaucratic.
And yet, it seems they won’t hire you if you don’t hit their checklist to a T. I applied for a NOAA job, that wasn’t too crazy, and for the most part, was a pretty good fit for my resume. However, the application had a checkbox question along the lines of “have you ever written software for [hyper specific domain]”. I answered no and was unsurprisingly rejected. By the way, this wasn’t some wild scientific computing project, it was basically a job writing CRUD software for something that sounded interesting to me.
I really wonder how some roles get filled, I’ve seen some ridiculous asks.
Identification of the risks including the likelihoods and hazards is pretty good. I like the notion of contrasting opinions included in the report.
But the big thing missing is mitigations: what can they do about training, about recruiting, and especially about "industry" investing in compilers? This report says nothing about ways to assure continued usability of Fortran-based codes in the next 10 or 15 years. It just lists risks. What can they do to improve the capability of GPU compilers, CPU compilers, training or staff development?
And, setting Fortran aside, what are the plans to assure continued capability in supporting and developing all of their codes, in whatever languages, in the next 10 or 15 years? This evaluation might well be replayed for CUDA, Python or Julia in the next 5 years.
The US budget for supercomputing must be in the $2G to $5G per year range, yet it seems the notion of planning a software strategy for supercomputing is not achievable.
- "Very likely" poor or non-existent robust GPU support,
- Unlikely robust support for specialized processors (i.e., non-general purpose CPU) like "Data Flow processors" and "processing in/near memory",
- Fortran advocacy and advancement is likely limited to DOE, which the report implies is insufficient, and
- Vendors of hardware have prioritized C++.
About 4: I think Fortran looks quite like C when it comes to the CPU architecture. Both are somewhat low level procedural languages.
As long as C works well on future CPUs, Fortran is probably fine and I don't see C disappear overnight. At least, Fortran compilers can be updated to produce reasonable code for new CPUs. Which should happen because of the shared compiler architecture (backend).
About 5: GPUs architectures seem less stable and seem to require specific development. If those specific developments require language support, that's probably not coming to Fortran if the assumption is that Fortran is going maintenance-only.
About 6: advances in computing technology are not restricted to CPUs and GPUs.
(disclaimer: I've only seen Fortran code from far away. And code targeted to GPUs too).
I studied Fortran in my college programming languages class (in the early 2010s) and loved it! Always wondered why it isn't used more.
Does any other language have a built-in COMPLEX type? Or funny (but fast!) stack-free function calling?
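For what it's worth on the first question: Python is one example; complex is a built-in type there too:

```python
# Python's complex is built in, much like Fortran's COMPLEX:
z = 3 + 4j
assert abs(z) == 5.0            # modulus
assert z.conjugate() == 3 - 4j
assert 1j * 1j == -1            # i**2 == -1
```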
Is stack-free function calling equivalent to tail call optimization? If so, then many CL implementations support it, including SBCL. For Scheme, it’s part of the standard.
I guess most of the legacy code for which they did this risk assessment is Fortran 77, not anything more modern.
And getting it up to date with modern Fortran would be as much effort as porting it to any other modern language.
There is a race of sorts going on, to simplify and commoditize high performance computing in the post-Moore's era. This means enabling things like easy GPU programming, easy multi-core programming, easy "number crunching" etc. Historically this was the stuff of esoteric HPC (high performance computing) pursued by highly specialized computational scientists etc. (read -> an important but non-economic domain that subsisted largely on government funds).
Yet suddenly the same arcane HPC is effectively also the domain of "AI" and the associated planetary hyperventilation. Now all sorts of modern initiatives and language ecosystems aim to reposition themselves to gainfully intermediate that AI revolution. Python was the scripting language that somewhat unexpectedly enabled all this, but now its warts are under the microscope and people are considering various alternatives (Julia), supersets (Mojo), newcomers like Rust, etc. But that race is definitely not settled, and it might not be for quite some time. The requirements are highly non-trivial.
As far as taking part in this race goes, Fortran seems to have missed not one but several boats. Yet the fact remains that it was a language designed for number crunching that evolved into a language for HPC, with parallel-computing features (since F90) and more. It combines easy-to-read imperative numerical code with near-optimal computational performance.
It's like a former world champion in a sport that fell out of favor for a long time but is now unexpectedly back in vogue. So the question is whether the champ can get back into shape or make room for new contenders :-)
It's true that one goal may be to pick an important race and try to win it.
Another goal, in my view more important, is to make Fortran more pleasant to use for the people and orgs who need it (there are many) and for the people who love it (there are many).
I've found that more often than not, people/teams first like working with a technology, and then come up with technical arguments for why that technology is the best choice. Often the arguments are valid, sometimes they're made up, but ultimately underneath it all you either like it or not and that's all that matters. My goal with Fortran-lang has been to slowly and continuously increase the surface area of Fortran's likability. Fortran is not for everyone, but for people who think it may be, we can work to make it better and more pleasant to use.
As one example, we just released a small library to make high-level HTTP requests from Fortran applications: https://github.com/fortran-lang/http-client. This was a product of one of our Google Summer of Code contributors.
Words of wisdom (:
Maybe it's not the way things should be, but it's common and very human. Sometimes I'll even catch myself retconning explanations for my choices in my own head, despite myself.
And here is the reaction of the fortran-lang.org people to that stream, and to later ones where he went on to implement tic-tac-toe in Fortran, using for the GUI the C library raylib (https://www.raylib.com/), for which he made his own bindings live on screen: https://fortran-lang.discourse.group/t/tsoding-on-fortran/61...
Matlab in theory has OO, but it's very slow and who bothers to verify that? So practically everything is matrices and there's very little "semantic expressed in code itself" (to paint issues with a broad brush). Also matrix calculations can get botched numerically, particularly matrix exponentials, but the whole culture (that I can't even expunge from my own thinking) is that I have a theorem so tests schmests.
http://www.bitsavers.org/components/motorola/6800/cross-asse...
Note the Hollerith constants.. (no strings).