Personally, I think Python's success is down to the productivity of its pseudocode-like syntax letting you hack prototypes out quickly and easily. In turn, that makes building libraries more attractive, and these things build on each other. FORTRAN is very fast, but its syntax is less forgiving, especially coming from Python.
In that regard, I'm surprised Nim hasn't taken off for scientific computing. It has a similar syntax to Python with good Python interop (e.g. Nimpy), but is competitive with FORTRAN in both performance and bit twiddling. I would have thought it'd be an easier move to Nim than to FORTRAN (or Rust/C/C++). Does anyone working in SciComp have any input on this - is it just a lack of exposure/PR, or something else?
Most code in science is written by grad students and postdocs. For them, trying a new language is an enormous career risk. Your advisor might not understand it, and you might be all alone in your department if you try Nim.
That makes any sort of experimentation a really tough sell.
As a rule, I have found scientific computing (at least in astronomy, where I work) to be very socially pressured. Technical advantages are not nearly as important as social ones for language or library choice.
Change does happen, but extremely slowly. I am not exaggerating when I say that even in grant applications to the NSF as recently as 2020, using Python was considered a risky use of unproven technology that needed justification.
So, yeah, Nim is going to need a good 30 years before it could plausibly get much use.
Yep, going against the grain in graduate school is counterproductive unless there's a compelling reason.
Many grad students forget that their main purpose is to generate research results and to publish papers that advance the field, not to play around with cool programming languages (unless their research is about coding).
Here's a bunch of mistakes I made in grad school which unnecessarily lengthened my time in the program (and nearly made me run out of stipend money):
* Started out in Ruby because I liked the language, but my research involved writing numerical codes, and at the time there just wasn't much support for it so I ended up wasting a lot of time writing wrappers etc. There was already an ecosystem of tools I could use in MATLAB and Python but nooo, I wanted to use Ruby. This ended up slowing me down. I eventually gave in to MATLAB and Python and boy everything just became a lot easier.
* Using a PowerPC-based iBook instead of an Intel Linux machine. Mac OS X is a BSD (plus I was using the PPC arch) and Homebrew didn't exist back then, so I ended up troubleshooting a lot of compile errors and tiny incompatibilities because I liked being seen using a Mac. When I eventually moved to Linux on Intel, things became so much easier. I could compile stuff without any breakages in one pass.
I also knew a guy who used Julia in grad school because it was the hot new performant thing when all the tooling was in Python. I think he spent a lot of time rejigging his tooling and working around stuff.
Ah the follies of youth. If only someone had pulled me aside to tell me to work backwards from what I really needed to achieve (3 papers for a Ph.D.) and to play around with cool tech in my spare time.
I guess the equivalent of this today is a grad student in deep learning wanting to use Rust (fast! memory-safe! cool!) even though all the tooling is in Python.
A grad student using a new language definitely does not face any career risk IMO... I can't imagine a single professor or recruiter caring about something like this over material progress in their work.
My guess is that grad students are swamped and are looking for the shortest path to getting an interesting result, and that is most likely done with a tool they already somewhat know.
The question for Nim, like many other new products, is: why is it worth the onboarding cost?
Python itself isn't really used for scientific computing. Python's bindings to high-performance libraries, many of which use Fortran under the hood, are used for scientific computing.
Combine that with the ease of displaying results a la MATLAB but with much less of the jank, and you have an ideal general-purpose sci-comp environment.
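A minimal sketch of that division of labor, using NumPy (whose `linalg.solve` is documented to dispatch to LAPACK's `gesv` routine, i.e. compiled Fortran; the matrix values here are arbitrary):

```python
import numpy as np

# A small linear system Ax = b; the heavy lifting is done by LAPACK's
# gesv routine (compiled Fortran), not by the Python interpreter.
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([5.0, 6.0, 9.0])

x = np.linalg.solve(A, b)       # dispatches to LAPACK under the hood
assert np.allclose(A @ x, b)    # residual check: solution is exact to fp precision
print(x)
```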
Back when I worked in scientific programming, we adopted a similar approach. The heavy lifting functions we wrote in C, but they were called from R which allowed us to plot the results, etc., easily. And the libraries we used (for solving differential equations) were all old school Fortran libraries.
If I were to start again today, I think I'd give Julia a look, though.
I love Nim and would absolutely use it for every piece of native code I need to write. Unfortunately, I find it suffers from a few big problems. First, the developer tooling and documentation is kind of inconsistent. The build tool, which is also used to install packages, supports a different set of args than the main compiler, which causes some weirdness. Second, the network effect. Most libraries are maintained by a single person, who put in a lot of effort, but a lot of bugs, edge cases, missing features and other weirdnesses remain. It's usually best to use libraries made for C or Python instead, really.
I work in scientific computing and I'm a huge fan of nim. I started writing a few tools at work in nim and was quite quickly asked to stop by the software development team. In their eyes, they are responsible for the long term maintenance of projects (it's debatable how much they actually carry out this role), and they didn't want the potential burden of a codebase in a language none of them are familiar with.
It's sad, as I feel nim would be easier to maintain compared to a typical c or R codebase written by a biologist, but that's what's expected.
I second this. There's often a huge difference between the languages and tools we'd love to be using and those that we are allowed/forced to use in the workplace.
I for instance just moved to a company where the data stack is basically OracleSQL and R. And I dislike both. But as _Wintermute pointed out, a whole company / department won't change their entire tech stack just to please one person.
Python is very easy to teach because the syntax doesn't get in the way as much as with other languages. You can basically start with mostly English and then slowly introduce more complex concepts. With C, for example, you would have to delve into data types as soon as you declare the first variable.
I'm trying to switch from traditional software engineering to something sciencier--I've been taking computational biology classes and learning Nim.
I like Nim a lot. And I know that it'll scratch a necessary itch if I'm working with scientists. I also know that it's too much to ask that the scientists just buckle down and learn Rust or something like that.
But as someone who is not afraid of Rust but is learning Nim because of its applicability to the crowd that I want to help... The vibrancy of the Rust community is really tempting me away from this plan.
I've really enjoyed the Nim community also. I even contributed some code into the standard library (a first) and was surprised at how easy they made it.
But I have also written issues against Nim libraries which have gone unanswered for months. Meanwhile, certain rust projects (helix, wezterm, nushell) just have a momentum that only Nim itself can match.
Python benefitted from there being no nearby neighbors which resembled it (so far as I'm aware). If you needed something like python, you needed python.
Rust and Go and Zig are not for scientists, but they're getting developer attention that Nim would get if they didn't exist. Also, Julia is there to absorb some of the scientist attention. It's a Tower of Babel problem.
I can't say why the scientists aren't flocking to Nim, but as someone who wants to support them wherever they go, this is why I'm uncertain if Nim is the right call. But when I stop and think about it, I can't see a better call either.
> I can't say why the scientists aren't flocking to Nim, but as someone who wants to support them wherever they go, this is why I'm uncertain if Nim is the right call.
Because most scientists are only using programming as a tool and don't care one bit about it beyond what they need it to do. They don't go looking for new tools all the time, they just ask their supervisor or colleague, and then by default/network effects you get Python, Fortran, or C++. You need a killer argument to convince them to do anything new. To most of them, suggesting a new language is like suggesting a hammer of a different color to a smith - pointless. With enough time and effort you can certainly convince people, but even then it's hard. It took me years to convince even just one person to use matplotlib instead of gnuplot when I was working in academia. You can obviously put that on my lack of social skills, but still.
Why is Go often lumped in with languages that don't have garbage collectors? I'm always confused by this. Is Go suitable for systems programming? I myself use Go, but for web development.
Yes, I agree that Python's success is most probably due to the productivity of its pseudocode-like syntax, which makes building libraries more attractive.
In addition to Nim, the D programming language is also Pythonic due to its GC-by-default approach, and it is a very attractive Fortran alternative for HPC, numerical computation, bit twiddling, etc. D's support for C is excellent, the latest D compiler can compile C code natively, and it is in the GCC ecosystem, similar to Fortran. Heck, D's native numerical library GLAS was already faster than OpenBLAS and Eigen seven years ago [1]. In terms of compilation speed, D is second to none [2].
[1] Numeric age for D: Mir GLAS is faster than OpenBLAS and Eigen: http://blog.mir.dlang.io/glas/benchmark/openblas/2016/09/23/...
[2] C++ Compilation Speed: https://news.ycombinator.com/item?id=1617133
Nim's syntax only looks like Python's on the surface. It actually feels quite different when more complex language features are involved. Nim is more restrictive than Python and harder to write. IMHO, Nim is not a language that typical Python programmers would like, especially if they only know Python.
Absolutely. Fortran might take about 500 lines of code vs. <20 lines for Python. The ease of use and flexibility of Python across so many application types makes for a good reason for its popularity. The rise of hardware computing performance makes the speed tradeoff trivial.
For code implementing a numerical algorithm, I think the ratio of lines needed in Fortran vs. Python is much less than 25, maybe 2 or 3. And once the code is written in Fortran you just compile with -O3 etc. to get good performance, and you don't need to think about translating to Cython, Numba, or some other language.
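As a rough illustration of the line-count point, a complete least-squares fit is a handful of NumPy lines (a toy sketch, not taken from either comment):

```python
import numpy as np

# Fit y = a*x + b by least squares in a few lines -- the kind of task
# that needs considerably more boilerplate in hand-written Fortran 77.
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + 0.01 * np.sin(40 * x)   # synthetic data, slope 2, intercept 1

A = np.vstack([x, np.ones_like(x)]).T       # design matrix [x, 1]
(a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(a, b)   # close to 2.0 and 1.0
```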
I think I asked this in a Nim thread a month or two ago, but to me I don't see a chance at competing in scientific computing without a good interactive EDA story, and Python, with a good out-of-the-box IDE, Jupyter notebooks, and IPython, has an amazing story for interactive scientific computing.
IMO the popularity of Python has as much, if not a lot more, to do with the available libraries and frameworks as the language itself. The language itself seems more inherently suited as a successor to Perl - as a powerful scripting language, rather than one really suited to large, complex, multi-person projects.
What seems to have bootstrapped the success of Python for ML and scientific use was early adoption by people in these communities who were not hard core programmers, and found it easy to get started with. Once SciPy and NumPy were available, and NumPy became used for ML, then the momentum of ML helped further accelerate the adoption of Python.
> popularity of Python has as much, if not a lot more, to do with the available libraries and frameworks as the language itself
> What seems to have bootstrapped the success of Python for ML and scientific use was early adoption by people in these communities who were not hard core programmers
What if these people (non-hard-core programmers) were attracted to the language itself because it is almost pseudocode-like? So it becomes a gift that keeps on giving. Attract domain experts and you get more batteries attached for your project.
> hard core programmers
What if these people are 'hard-core' in their specific domain, but not 'hard-core' in whatever hardware architecture carries the day due to historical mishaps and marketing trends of the day?
The availability of libraries is because of the language. It's a (good) flexible general purpose dynamically typed language which makes writing libraries and good code in general easy.
Why it's deemed unsuitable for large, complex, multi-person projects is that enterprise types know only the byzantine OOP mess. And when all you have is OOP, everything is a FactoryFactoryFactory and everything else is "unmanageable".
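A small sketch of why library authoring feels cheap in Python: the operator protocol lets a class opt into natural syntax with a few dunder methods (the `Vec2` class here is purely illustrative):

```python
# A toy 2-vector showing how Python's operator protocol lets a library
# give users natural syntax (v + w, v * s) with very little code.
class Vec2:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __add__(self, other):   # enables v + w
        return Vec2(self.x + other.x, self.y + other.y)

    def __mul__(self, s):       # enables v * scalar
        return Vec2(self.x * s, self.y * s)

    def __repr__(self):
        return f"Vec2({self.x}, {self.y})"

v = Vec2(1, 2) + Vec2(3, 4)
print(v)        # Vec2(4, 6)
print(v * 2)    # Vec2(8, 12)
```

This is essentially the mechanism NumPy itself uses to make `a + b` or `a @ b` work on arrays.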
The reason it's not well suited for larger/etc projects has nothing to do with OOP - it's about things like dynamic typing and indent-based structure that make it good for interactive REPL use, scripting and rapid prototyping, but less suited for more complex cases where you'd prefer more compile-time vs run-time checking, and where ease of lifetime maintenance trumps up-front coding time.
I disagree that the availability of good libraries for Python is because of the language - especially if we're talking about scientific and ML libraries. In many of these cases the Python libraries are just pass-throughs to the underlying libraries written in C, which was chosen because of its performance and suitability to the task.
> Why it's deemed unsuitable for large, complex, multi-person projects is that enterprise types know only the byzantine OOP mess. And when all you have is OOP, everything is a FactoryFactoryFactory and everything else is "unmanageable".
Have you ever worked in a large, dynamically typed codebase written by other people?
Can you elaborate on the distinction you're making?
Python is the default language in which people express their scientific computations. It may execute C code in the end, but so does any language that ever executes a system call.
Little fun fact: NumPy doesn't even come with an efficient, blocked matmul procedure. It has to be linked against a BLAS implementation to really provide any decent performance. This also explains why NumPy performance can vary from distribution to distribution: Anaconda ships it with a different BLAS than PyPI.
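To see which BLAS a given NumPy build is linked against, `numpy.show_config()` reports the build-time configuration (its output varies by distribution):

```python
import numpy as np

# Which BLAS this NumPy build uses varies by distribution (conda vs.
# pip wheels); show_config() reports the build-time linkage.
np.show_config()

# The matmul API is the same regardless of backend; only speed differs.
a = np.random.rand(200, 200)
b = np.random.rand(200, 200)
c = a @ b   # dispatched to the linked BLAS (dgemm) when one is present
```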
As someone who was at CERN when Python started to be adopted in the early 2000s, Python got popular as a saner Perl alternative for UNIX scripting and build tools (CMT), and as a means to provide a REPL to C++ and Fortran libraries instead of dealing with ROOT.
HEP is a somewhat peculiar community. They tend to rely on their own tools, their own libraries. I wouldn't take them as a model to understand historical python adoption in science.
I'd bet that their python usage is still mostly as a REPL to ROOT (which by the way has its own REPL), so no numpy, maybe little pandas, no matplotlib.
Programming languages are a bit like social networks. There's some network effect. People go where other people are. Python is currently where things happen.
I imagine part of it is also that a lot of the code isn't the science part. It's all the setup, things like parsing data for input or output. Languages like Python and Perl have very rich standard library stuff for massaging strings, data formats, etc.
If your data is big you may be amazed at how expensive string parsing can be. You can do a lot of FLOPS in the time it takes to serialize and deserialize a large matrix to ASCII for instance.
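An illustrative sketch of that cost, timing an ASCII round-trip against a binary one with NumPy (absolute timings will vary by machine; the gap is the point):

```python
import io
import time
import numpy as np

# Serializing a matrix to ASCII text (savetxt) is far more expensive
# than a binary dump (save) of the same data.
m = np.random.rand(500, 500)

txt = io.StringIO()
t0 = time.perf_counter()
np.savetxt(txt, m)                  # formats every float as text
ascii_s = time.perf_counter() - t0

raw = io.BytesIO()
t0 = time.perf_counter()
np.save(raw, m)                     # writes the buffer almost verbatim
binary_s = time.perf_counter() - t0

print(f"ascii: {ascii_s:.4f}s  binary: {binary_s:.4f}s")

# The binary round-trip is also bit-exact.
raw.seek(0)
assert np.array_equal(np.load(raw), m)
```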
I'm so old that Fortran was actually my first language. Over the years I've seen language bindings to the old Fortran numerical libraries we all rely on but Python/numpy is the first wrapper I've actually enjoyed using. It's more than a wrapper in that it brings a compact representation of slices and vectors.
However, if I didn't know how things work underneath I'd be a little uneasy. You can always profile after the fact but it helps knowing how to avoid inefficient approaches.
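The "more than a wrapper" point shows up in a few lines: NumPy slices are views with their own stride bookkeeping, so none of the hand-written loops of old Fortran are needed (toy example):

```python
import numpy as np

# NumPy's slice notation is a compact stand-in for the loop-and-stride
# bookkeeping you would write by hand in Fortran 77.
a = np.arange(12).reshape(3, 4)

col = a[:, 1]        # second column, as a view (no copy)
evens = a[:, ::2]    # every other column, also a view
a[1, :] *= 10        # scale one row in place

print(col)           # the view reflects the in-place change: [ 1 50  9]
```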
The slowness of Python meant that nobody thought "it'll be easier just to write this routine" as opposed to looking to re-use existing (most often compiled) code. And if you are doing science, the less time you spend re-inventing code, the more science you will get done.
End users are using Python; the advantage of modern computing is that whatever happens underneath is irrelevant to them.
It's tempting to get lazy and just use a for-loop to iterate over an array sometimes and that will absolutely kill your performance.
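A minimal sketch of the difference (function names here are illustrative):

```python
import numpy as np

# The tempting-but-slow way: a Python-level loop over every element.
def loop_sum_squares(a):
    total = 0.0
    for x in a:
        total += x * x
    return total

# The vectorized way: one call that runs in compiled code.
def vec_sum_squares(a):
    return float(np.dot(a, a))

a = np.random.rand(100_000)
assert np.isclose(loop_sum_squares(a), vec_sum_squares(a))
```

Both compute the same number, but the loop pays Python's interpreter overhead on every element, while `np.dot` does one call into compiled BLAS code.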
Python's success starts with academia's drive to replace MATLAB with free software, namely numpy/scipy.
Python is a popular user interface for scientific computing, data science or high-performance computing.
That was all, it could have been Tcl instead.