I recently began learning Julia and initially everything was amazing, except for 1-based indexing, but I could overlook that. Then I attempted building something medium-sized and it all fell apart. I feel like it needs some serious work on tooling, the module system, packages, etc.
Has anyone built something medium-large sized in Julia? Maybe I'm missing something. When I was trying to use modules to organize my code the response I got was that modules are more trouble than they are worth so just don't use them. So why do they even exist? That really put me off from Julia despite really liking it otherwise.
I would want to build things with it not just play in a REPL and notebooks.
Yeah, I built several libraries that replace IEEE floating points with alternatives. You should really be using modules to organize your code. I don't know who told you that they are more trouble than they are worth.
I would also suggest aggressively unit testing all the parts. Numerical developers are not often in that habit, which is a shame.
> Numerical developers are not often in that habit, which is a shame.
For certain! I don't know how anyone can do any serious numerical development without a whole lot of test cases, including esoteric PhD-level numerical analysis stuff.
I remember working with the IMSL libraries back in the 1970's and being in awe of the huge set of numerical tests they had in the test suite which gave them, from my point of view, an unbeatable lead in development. These were tests that were often about highly esoteric aspects of numerical stability in floating-point algorithms and were written by extremely well educated mathematicians with tons of numerical experience.
For those who aren't greybeards: IMSL (the International Mathematical Subroutine [now Statistical] Library) was one of the first examples of reusable software in application programming. It started as decks of punch cards: you would put in a request, and the computing facility would punch a copy of the appropriate routines for you to include in your program deck.
> When I was trying to use modules to organize my code the response I got was that modules are more trouble than they are worth so just don't use them.
Really?
I normally break up my code across modules, but also across libraries.
I'd say a few of my projects are at least medium sized. I discuss an example here, where much of the code is split across many separate modules: https://bayeswatch.org/2019/01/29/optimizing-a-gibbs-sampler...
In one example I achieve roughly 1700x the performance of a JAGS model; a C++ version does a bit better at 2000x when compiled with Clang.
Having all these dependencies checked out for development is not best practice, so I wouldn't recommend strictly following my example.
Agreed, that's my experience as well. Julia is amazing for scripts and smaller projects, but I wish there was something like Swift (which I'm increasingly convinced is closest to the ultimate general-purpose language) with all the nice things that Julia has.
Specifically, these things make Julia less suitable for larger projects:
- Lack of support for OOP. And no, purely functional programming is not the best way to code all projects.
- Dynamic typing.
- Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).
- Module system doesn't seem to be designed well.
- You can't use a function before it's defined.
Sorry, why do you need OOP? I haven't coded OO in about 10 years now. (Mostly Julia, elixir, and functional JavaScript). It's great. Would never go back.
- Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).
What do you mean by that? It absolutely is compiled to e.g. x86_64 instructions by default, going through LLVM. Do you mean no standalone binaries? If so, that's actively being worked on by e.g. PackageCompiler: https://github.com/JuliaLang/PackageCompiler.jl
I'm not sure what you mean by this. You obviously can't call a function before it is defined, but you can use a function in another function without any problems:
julia> foo(x) = bar(x)
foo (generic function with 1 method)
julia> bar(x) = x+2
bar (generic function with 1 method)
julia> foo(3)
5
> - Lack of support for OOP. And no, purely functional programming is not the best way to code all projects.
...and neither is OOP? For doing data things I greatly prefer functional programming for ease of understanding how the data flows and gets changed.
> - Doesn't compile to native code (there are ways to do it, but too complicated and there are caveats).
Neither does Python, neither does R. AOT compilation is being worked on, I expect it to get pretty great. C and FORTRAN do, sure, but most people are not writing C and FORTRAN in production.
> - Module system doesn't seem to be designed well.
It’s maybe not perfect, but it’s honestly miles better than Python.
> - You can't use a function before it's defined.
This feels like an incredibly unfair and unreasonable complaint given not many languages that aren’t AOT compiled actually support this and Python and R certainly do not.
This is something I don't fundamentally understand. As someone who works a lot with MATLAB I'm very used to and like 1-based indexing. But when I use C or Python, 0-based indexing is not something I complain about or hold against the language. It's just the way things are.
Maybe if you don't think of it in terms of a different index basis and instead you think of it as indexing vs. offsets then it becomes easier to switch between the two?
My favourite fact about this stuff: in VB (or was it VBA?) when you asked for an array of size n, you actually got an array of size n+1. So people could do 0-based or 1-based indexing and be none the wiser...
I do not recall the exact issue anymore, unfortunately; I abandoned the project because of it. But the general sentiment on the Julia discourse seemed to be to just avoid them. This blog post sums up my issues with modules and namespaces pretty well, though: http://luthaf.fr/julia-some-criticism.html
I think that these issues are generally acknowledged although I don't know if they will be addressed. Seems like major pain points for library development should have been addressed before 1.0.
The main issue I encountered as a Julia user is that multiple dispatch doesn't scale very well.
When you start building out a project, it's easy to keep track of things and debug when multiple dispatch starts failing (i.e. the `Any` type starts spreading everywhere and Julia slows to Python-like speeds).
In medium-to-large projects, it becomes extremely cumbersome to manage this. It's doable, but adds a layer of complexity management to projects that simply doesn't exist in strictly typed or pure scripting languages.
Of course, you can just decide to explicitly type everything - but the issue here again is the lack of enforcement.
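One way to get a bit of enforcement today is to assert inferrability in the test suite. A minimal sketch (the function names are invented for illustration):

```julia
using Test

# A hypothetical type-unstable function: the return type depends on a
# runtime value, so inference can only infer Union{Int, String}.
unstable(x) = x > 0 ? 1 : "negative"

# A stable counterpart: every branch returns the same concrete type.
stable(x) = x > 0 ? 1 : -1

# Test.@inferred throws unless the inferred return type matches the
# actual result's type, turning instability into a test failure.
@inferred stable(3)
# @inferred unstable(3)   # would throw an ErrorException
```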
In a nutshell: Julia is great when you're a grad student working mostly by yourself on small scale projects! But not so great in prod.
And there's really no problem with that; that's who the language was designed for!
This was published on their own website, so it is biased toward a good perception of them. It doesn't seem like a fair and independent review to help build an opinion.
I think multiple dispatch will scale perfectly fine for large projects when it has better IDE support.
There's no reason Juno or any other IDE couldn't display the output of a static analyser inline, or allow you to command-click a function call to go to the site of the exact function being called, or show a list of alternatives, and so on.
Give Julia and its IDEs a few years to improve and you might find it much better suited to large projects. I wouldn't consider Java any good for large projects either, if it didn't have the excellent IDE support it now enjoys.
The optional typing seems like the perfect solution to this... skip explicit typing for small scale projects, but make sure you add it for production...
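For what it's worth, annotations are cheap to add when you want them; a small sketch of argument and return-type declarations (names invented):

```julia
# The ::Float64 on the signature converts/asserts the return value,
# so a wrong-typed result fails loudly instead of silently spreading.
price(qty::Int, unit::Float64)::Float64 = qty * unit

price(3, 2.5)    # 7.5
# price(3, 2)    # MethodError: no method matching price(::Int, ::Int)
```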
Both Cassette-based performance linting and static type checking will solve this problem. There are already the beginnings of the tooling being formed.
That certainly depends on your use case. For instance, the launch time is ridiculously slow, so you cannot realistically run a small matrix computation in Julia from within a shell loop. It is better to use Octave for that, where the startup time is almost negligible (just a bit slower than starting a subshell).
I agree. I meta-programmed (enormous) expressions from analytical formulas exported from Mathematica into Julia, because I found Julia to be ~3000 times faster than Mathematica at calculating eigenvalues. Using BigFloat for higher precision, my matrix function in Julia took ~20 minutes to compile on the first run and used ~20 GB of RAM. Smooth once compiled, but I was the only one of my collaborators who had the capacity to run it.
Note that I still do like using Julia. I'm a physicist and need to do a lot of computations in a hassle-free way, and Jupyter + Julia (+ SymPy) is the best available toolset for that.
The above may only be an issue of BigFloat, to be fair, since Float64 compiled in an instant (never measured, and the time never bothered me).
So Julia has solved a lot of problems for me, and I see great potential for it in the future.
> I think it requires a JVM mindset: warm up once and iterate inside rather than outside.
I do not have this mindset, then. I prefer tools that are mindset-oblivious. They are really useful!
For example, imagine I have a collection of a few hundred images with their projection matrices (in text files). I want to crop them and apply a simple ImageMagick operation (which is not available from inside Julia). The elementary solution is to run a shell loop to apply the crop and call Julia to perform a simple adaptation of each projection matrix. This is impossible today: most of the running time of such a loop is spent on Julia initialization. Half a second to do nothing is simply unacceptable in a serious scripting language.
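The usual workaround is to invert the loop: pay the startup cost once and iterate over the files inside Julia. A hedged sketch (the file layout and the `adapt` transformation are invented for illustration):

```julia
using DelimitedFiles  # stdlib reader/writer for whitespace-separated matrices

# Placeholder for the real projection-matrix adjustment.
adapt(P::AbstractMatrix) = 2.0 .* P

# Process every matrix file in one Julia process instead of one per file.
function process_all(dir)
    for path in filter(p -> endswith(p, ".txt"), readdir(dir; join=true))
        writedlm(path, adapt(readdlm(path)))
    end
end
# process_all(".")   # one startup cost, however many files
```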
$ time ./julia -e 'print(1)'
1
real 0m0.184s
user 0m0.156s
sys 0m0.168s
So yes, that’s a slow launch time. If you use Julia in a shell script and it’s starting up Julia on each calculation it will be brutally slow. Five orders of magnitude slower, in this case.
First of all, that is ridiculously slow. Secondly, unlike Matlab, you don't have everything you need available after starting the REPL. For instance, if you want to plot something you might run `using Gadfly`. How long does that take?
16 seconds. Sixteen seconds. For real. This is not usable.
I like Julia a lot, and thought I understood some of it, but this article puzzles me:
> This output is saying that a floating point multiplication operation is performed and the answer is returned.
But "this output" is:
%2 = mul i64 %1, %0
ret i64 %2
which looks very much like an int64 operation and return.
Similarly:
> Here we get an error. In order to guarantee to the compiler that ^ will give an Int64 back, it has to throw an error. If you do this in MATLAB, Python, or R, it will not throw an error.
while the quoted input and output is
In [6]: 2^-5
Out[6]: 0.03125
(ie, no error, but the correct (floating point) result.)
This was written back in Julia v0.5, and the compiler has gotten smarter since, so I need to update my examples :). Julia now specializes on the fact that `-5` is a literal, inlines it, and corrects the output type using that value. If you define it as a variable and stop constant propagation, it'll error. Stuff like this is making it harder to write tutorials that show what's actually going on, because literals and constants are all getting optimized now!
Another confounding fact is that Julia optimizes on small unions, so the generated code isn't that bad anymore. Now it just creates a branch. It used to have to do all of inference and dynamic dispatching on the fly, which is what it has to do in fully uninferrable code of course, but a union of two things just does a type check and splits. So now... that example is not as bad as it used to be...
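To make the literal-versus-variable distinction concrete, here is roughly what current Julia does with a negative integer exponent (my sketch, not from the article):

```julia
n = -5
2.0^n       # fine: a Float64 base can absorb a negative power
# 2^n       # DomainError: an Int to a negative Int power can't stay an Int
inv(2^5)    # 0.03125, the usual workaround from Int inputs
```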
I did a 5000-line dissertation project in Octave after rejecting Julia. Reason: I had derived the math in linear algebra, including Kronecker products; the math mapped to Octave pretty directly, but Julia required me to translate all the Kronecker products to loops (yuck!). kron(A, B) would become 12 lines of weird indices and for loops. On the listserv I was told that Julia was great because it didn't require vectorization for performance, but I only wanted vectorization for graceful expression.
Plus I got annoyed with extra weird syntax, but I can't remember the specifics.
Basically, Julia required more lines and characters and wasn't as close to the math.
Aside: I think Matlab/Octave is a lot like SQL and Tcl: lots of haters, unfashionable, but usually the most elegant solution.
As notation for array based algorithms, Octave/Matlab is vastly better than anything else I've found. Some guy did PRML in Matlab, others have done it in Python. The Matlab version is like reading a book; clear, concise, correct.
For such things, Julia is often very close to Matlab indeed, or at least it can be used that way.
You could translate most of those files line-for-line, with quite a few lines identical or trivially changed (bracket shape, or max -> maximum). This is often a useful thing to do, get a transliterated version running, and then re-write bits of it more idiomatically, while checking that the output is identical.
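For instance, a made-up Matlab/Octave line next to its Julia transliteration, showing the typical mechanical changes:

```julia
# Matlab/Octave:  v = A \ b;  m = max(abs(v));  c = v(1);
A = [2.0 0.0; 0.0 4.0]
b = [2.0, 8.0]

v = A \ b                 # identical syntax for the linear solve
m = maximum(abs.(v))      # max -> maximum, elementwise via the broadcast dot
c = v[1]                  # round brackets -> square brackets for indexing
```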
I haven't dipped into Julia's macro side, but I wonder how much work it would be to write macros that map an infix Kronecker-product operator to the `kron` function.
There are so many Julia packages that do similar stuff that I imagine it can't be all that hard for people who have become fluent with the macro system.
I want to like Julia but after a decade of Python, every time I try it, it’s a death by a thousand cuts (and outdated Google results). I just can’t afford to have productivity drop to near zero for the learning curve plus reimplement everything.
Also, I have found multiple dispatch methods to be harder than regular OO methods to locate (for IDEs, but also grep).
When I search Julia questions on Google I always restrict the search to the past year or six months. You are right that there is an enormous amount of irrelevant info out there from earlier versions of the language.
Yes: doing `@edit f(arg1, arg2, ...)` opens the file at the relevant method definition (i.e. the one matching the given argument types) in my editor, whether in the Julia source code or my own modules.
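Beyond `@edit`, the built-in reflection makes dispatch targets reasonably discoverable; a small sketch with an invented function:

```julia
using InteractiveUtils  # auto-loaded in the REPL; explicit in scripts

describe(x::Int) = "an integer"
describe(x::Float64) = "a float"

methods(describe)     # every method of `describe`, with file:line locations
@which describe(1)    # the exact method that describe(1) dispatches to
# @edit describe(1)   # opens that definition in $EDITOR
```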
Honestly, my biggest gripe with Julia was the lack of a good programming environment. I detest MATLAB, but I can be very productive in that IDE. PyCharm is pretty good too when using NumPy. For Julia, though, the tooling just didn't seem there yet.
I didn't know about the Unicode tab completion. I gave it a whirl on the cli and loved it. Then I tried it out in julia-mode in Emacs, and it works there too! What other editors support this?
vim: https://github.com/JuliaEditorSupport/julia-vim
I have an F-key shortcut in my .vimrc to turn this on mainly for non-Julia files, so that I can type (maths-y) unicode in Markdown / text files.
https://github.com/REX-Computing/unumjl
https://github.com/interplanetary-robot/SigmoidNumbers
Julia doesn't rely on functional programming and there are also structs and operator overloading. What feature do you think is missing specifically?
> Dynamic typing.
Julia is not dynamically typed, though if you write a type unstable function, you will get back an Any type.
> - Module system doesn't seem to be designed well.
What specifically is problematic?
Please. No. Languages that try to do everything are crap. If you want to do something OOP, why don't you grab a language built for it?
It makes sense with matrices.
- zero-indexed: i×m+j
- one-indexed: (i-1)×m+j
OTOH one-based is slightly better for trees stored in 1D arrays:
- zero-indexed: parent=(child-1)/2; children=2×parent+(1, 2).
- one-indexed: parent=child/2; children=2×parent+(0, 1).
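In Julia's 1-based arrays, the implicit-tree arithmetic looks like this (a quick sketch; the function names are mine):

```julia
# Implicit binary tree stored in a 1-based array, root at index 1.
parentidx(i) = i ÷ 2
childidxs(i) = (2i, 2i + 1)

heap = [10, 20, 30, 40, 50]   # heap[1] is the root
childidxs(1)   # (2, 3): heap[2] and heap[3] are the root's children
parentidx(5)   # 2: heap[5]'s parent is heap[2]
```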
https://docs.julialang.org/en/v1/devdocs/offset-arrays/
Some people would disagree with that
https://juliacomputing.com/case-studies/celeste.html
I would also argue that the large open-source Julia packages are great examples of Julia "in prod".
Just highlighting what I think is a significant con in a language with many pros!
That said, instantiating Julia at every step of a bash loop... I think it requires a JVM mindset: warm up once and iterate inside rather than outside.
You can even use nice Unicode notation such as A ⊗ B ⊗ C.
So, `kron` is actually provided by the standard library [1]; are you saying that this Kronecker product didn't do the job?
[1] search in this file: https://github.com/JuliaLang/julia/blob/master/stdlib/Linear...
I wasn't comparing Julia to Python or C++, but to Matlab/Octave.
https://github.com/locklin/PRMLT
IMO, for most people and for array-based algorithms, Matlab hits Iverson's "notation as a tool of thought" ideal in a way that even APL didn't quite manage.
using LinearAlgebra  # `kron` lives in the LinearAlgebra stdlib

const ⊗ = kron
A = rand(5,5)
B = rand(3,3)
A ⊗ B   # a 15×15 matrix
Tada! I'm not sure how MATLAB/Octave's kron(A,B) looks more like math than A⊗B, but everyone can have their own opinion.