I’ve given a bit of thought to Rust, since that’s what Polars is written in and I want to move away from pandas.
Is nim a good place to go?
My initial impression though is that the scope is very broad. Trying to be scikit-learn, numpy, and torch all at once seems like a recipe for doing none of them very well.
It's interesting to contrast this with the visions/aspirations of other new-ish deep learning frameworks. Starting with my favorite: Jax offers "composable function transformations + autodiff". Obviously there is still a tonne of work to do this well, support multiple accelerators, etc. But notably I think they made the right call to leave high-level abstractions (like fully-fledged NN libraries or optimisation libraries) out of the Jax core. It does what it says on the box. And it does it really, really well.
TinyGrad seems like another interesting case study, in the sense that it is aggressively pushing to reduce complexity and LOC while still providing the relevant abstractions to do ML on multiple accelerators. It is quite young still, and I have my doubts about how much traction it will gain. Still a cool project though, and I like to see people pushing in this direction.
PyTorch obviously still has a tonne of mind-share (and I love it), but it is interesting to see the complexity of that project grow beyond what is arguably necessary. (e.g. having a "MultiHeadAttention" implementation in PyTorch is a mistake in my opinion).
Personally I’d like Arraymancer to be a great tensor library (basically a very good and ideally faster alternative to numpy and base Matlab). Frankly I think that it’s nearly there already. I’ve been using Arraymancer to port a 5G physical layer simulator from Matlab to nim and it’s been a joy. It’s not perfect by any means but it’s already very good. And given how fast nim’s scientific ecosystem keeps improving it will only get much better.
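To give a flavour of what that looks like (a minimal sketch from memory; check the Arraymancer docs for the exact API, the operator spellings have shifted a bit between versions):

    import arraymancer

    let a = [[1.0, 2.0],
             [3.0, 4.0]].toTensor

    echo a * a       # matrix multiplication for rank-2 tensors
    echo a +. 1.0    # broadcasted element-wise addition
    echo a.mean      # reductions, much like numpy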
Programming in nim feels a lot like programming in a statically typed Python, except that you get small, fast, single file executables out of the box.
Compared to C++ you don't need to worry about memory allocation much, except when you really want to, and when you do it is much simpler than in C or C++. It is also much easier to control mutability, and the type system is much better. You can even use most of the language at compile time, in a much more elegant way than in modern C++.
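A small illustration of the compile-time part (an ordinary proc, no template metaprogramming needed):

    proc fib(n: int): int =
      if n < 2: n
      else: fib(n - 1) + fib(n - 2)

    const f = fib(20)  # evaluated entirely at compile time
    echo f             # prints 6765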
Another surprising thing is that while the ecosystem is obviously much smaller than those of more popular languages, the built-in package manager gives you access to it much more easily than in C++. There are a lot of high-quality packages (such as datamancer, arraymancer and ggplotnim) which make nim very productive in a lot of domains.
That's not even counting advanced features such as nim's excellent macro system, which I personally don't use (but which enables some of the best nim libraries).
Oh, and I _love_ nim’s uniform function call syntax. Every other language should copy that feature.
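For anyone who hasn't seen it: UFCS means any proc can be called method-style on its first argument, so these are all the same call (a trivial sketch):

    proc double(x: int): int = x * 2

    echo double(5)   # regular call syntax
    echo 5.double    # method-style, same proc
    echo 5.double()  # parentheses are optional here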
I almost forgot to list the drawbacks. The main one, of course, is that it is not the most popular language (though the ecosystem is big enough that I haven't found that to be a big problem for my use case in practice). Other than that, the editing and debugging experience could be improved. There is a decent VS Code plug-in (look for the one made by "saem") but it is just OK, not great. There is some integration with gdb, but it is a bit rough. I usually end up adding logs when I need to debug something.
https://forum.nim-lang.org/t/5734
The objective is to make nim suitable for embedded programming and other use cases for which garbage collection is a non-starter. This new --gc:arc mode is already available in the nightly builds and the benchmarks are already impressive. I believe that the plan is to make arc the default "garbage collector mode" in nim 1.2.
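Trying it on a nightly build is just a compile switch:

    nim c --gc:arc myprogram.nim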
Disclaimer: I'm a PM on PowerShell at Microsoft, and the following is based on my personal observations. There's also no current plan to do any of what I suggested (but it's certainly giving me a lot to think about at the start of my work week).
The default ExecutionPolicy was ingrained and controversial long before I joined the PowerShell Team (or even Microsoft), and I'll be the first to admit that, as a long-time Linux user, I didn't GET why I couldn't just run arbitrary scripts.
The public Microsoft position on the matter has always been that ExecutionPolicy is there to keep you from shooting yourself in the foot. In Linux, it's somewhat similar to having to add +x to a script before you can run it (although that's clearly much simpler than signing a script).
I'd say it's also akin to the red browser pages warning you about self-signed certificates, or Microsoft/Apple "preventing" you from running certain app packages that are "untrusted" in one form or another. In one sense, you could actually argue that PowerShell was ahead of the curve.
Now as a power user with much less at stake than, say, an IT administrator at a Fortune 500 company, the first thing I do with all these restrictions, warnings, and confirmation prompts is to turn them off.
But those warnings (often with no clear way to disable them) are there for a reason. PowerShell was built at a time when Microsoft admins were GUI-heavy, and PowerShell did its best to shepherd them into a scripting world while fostering best practices. And if you're using a bunch of scripts sourced from SMB shares within your domain as a domain administrator, you don't want to accidentally run a script that hasn't gone through your internal validation process (hopefully culminating in the script getting signed by your domain CA).
So let me assume that you agree with everything so far. Why does this experience still stink? Any Microsoft administrator worth their salt uses PowerShell, and many of even the power-est of users find ExecutionPolicy annoying.
In my opinion, it's too hard to sign scripts. We should be following in the footsteps of the drive to get everything on HTTPS (props to Let's Encrypt and HTTPS Everywhere, along with many others). We should have guidance on signing if you have a CA, we should have guidance on getting scripts signed by third party CAs, and we should probably offer a cheap/free service for signing stuff that goes up on the PowerShell Gallery.
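(For reference, the current flow, assuming you already have a code-signing certificate installed, looks something like this; the script name is made up:)

    # find a code-signing cert in the current user's store
    $cert = Get-ChildItem Cert:\CurrentUser\My -CodeSigningCert | Select-Object -First 1

    # sign the script, then verify the result
    Set-AuthenticodeSignature -FilePath .\Deploy.ps1 -Certificate $cert
    Get-AuthenticodeSignature .\Deploy.ps1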
Oh, and we should make it easier to read the red error that gets thrown, which explains how to look up a help topic telling you how to bypass/change ExecutionPolicy.
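(For what it's worth, the change most people end up making is a one-liner:)

    Set-ExecutionPolicy RemoteSigned -Scope CurrentUser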
Unfortunately, that's all easier said than done. But the default being what it is puts the onus on us to do something to make security easier.
I understand your reasoning, and I like some of your proposals; I'm all for making it easier to sign scripts, for example. Yet I feel that the solutions you mention ignore the fact that this "security" mechanism can be bypassed with a simple BAT file.
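For the record, the bypass is as simple as this (script names made up):

    @echo off
    powershell.exe -ExecutionPolicy Bypass -File "%~dp0payload.ps1"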
Why do PowerShell scripts require more security than a BAT file or an executable? Are users really safer thanks to the ExecutionPolicy check? Or are they simply worse off, because people will either use less powerful BAT files or completely opaque executables? At least with a PowerShell script you are able to inspect the code if you are so inclined. By pushing people to use executables instead, we make it less likely that they know what changes will be made to their system.
If the problem is admins accidentally double-clicking on unsigned scripts, by all means show a confirmation dialog the _first_ time an unsigned script is executed. Actually, do that for BAT files and perhaps even for executables as well. But don't do it (by default, at least) when someone calls a script explicitly from a command line or from another script. IMHO that would really make us all safer and would make PowerShell a real replacement for BAT files.
I understand that this is done for security reasons, but Windows already lets you execute any executable or BAT file that you might have downloaded from the Internet. So I'm not sure that disabling PowerShell script execution really gains you much (and there are probably other better solutions anyway).
So IMHO, as long as cmd.exe is available and PowerShell is so limited by default, BAT files will not go away, which is unfortunate.
- Commit 1 -> Commit 2 -> Commit 3 (master/default) - current branch
Now Commit 3 is the HEAD of your current branch (master, trunk, default, whatever). The way I understand your "problem", you want to create a new branch based on an older, non-HEAD commit. For instance, you check out the branch as it was at "Commit 2" and make a new commit, "Commit 4". This creates a new implicit branch:
- Commit 1 -> Commit 2 +-> Commit 3 (master/default)
                       +-> Commit 4 (implicit branch) - new current branch
Now you have two branches. How do you switch between them? How do you operate on those branches? Do you cycle through them? Do you have to remember the last commit message of a branch to find it again? To be absolutely clear: what I'm curious about is how you relate to and navigate between branches when they are all anonymous. What mechanisms do you have to aid identification?
So in your example you would do:
"hg update 2" to checkout Commit 3, and "hg update 3" to checkout Commit 4
Note that these revision numbers are _local_ to your repository. That is, another clone of the same repository may have different revision numbers assigned to different revisions. The revision numbers are assigned in the order in which commits have been added to that particular clone.
You could of course also use the revision id (i.e. the SHA1 hash) to identify them. You do not need to use the whole revision id, just the part that makes it unique, usually the first few (e.g. 6-8) characters of the hash.
In addition, Mercurial has a very rich mini-language for querying and identifying revisions in your repository. These queries are called "revsets". Most mercurial commands accept revsets wherever they would need to receive a revision identifier. With them you can identify revisions in many ways, such as by date, by belonging to a given branch, by being committed by a certain author, by containing a certain file, etc. (and any combination of those and many others).
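For example (the author, branch, and path here are made up, but the revset functions are real):

    hg log -r "author(alice) and date('>2023-01-01')"
    hg log -r "branch(default) and file('src/main.c')"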
Finally, if you use a good mercurial GUI (such as TortoiseHg) the whole thing is moot because the GUI will show you the whole DAG and you can just click on the revision that you want and click the "Update" button.
I actually find the ability to create these "anonymous" branches really useful. Naming things is hard so I find that coming up with a name for each new short lived branch is annoying.
There are all these different levels of "saveyness" and the lower levels are just not as interesting.
In particular, I don't want stupid untested typos in the commit history because they just aren't interesting or helpful.
But I'm willing to say that I'm probably missing something… I just don't see what it is yet. :-)
Those hidden, obsolete revisions are not shown in your DAG and they are generally neither pushed nor pulled. In most respects they behave as if they were not even there. It is only when you need them that you can show them or go back to them (by using the --hidden flag of some of mercurial's commands, such as hg log or hg update). This gives you a nice safety net (since rewriting history is no longer a destructive operation) that you can use _if you want_. It also makes it possible to rewrite revisions that you have already shared with other users (since when you push a successor revision you also push the list of revisions that it supersedes).
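For example (a sketch; <rev> is whatever hidden revision you want back):

    hg log --hidden            # include obsolete revisions in the log
    hg update --hidden <rev>   # check out a hidden revision again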
I think evolve is a significant step forward on the DVCS paradigm as it enables safe, distributed, collaborative history rewriting. This is something that, AFAIK, was not possible up until now.
Thanks for the tip.
You can work around this surprising behavior by always using inline dependencies, or by using the `--project` argument, but the latter requires that you type the script path twice, which is pretty inconvenient.
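For reference, "inline dependencies" here means the PEP 723 metadata block at the top of the script, which uv picks up automatically (the dependency is just an example):

    # /// script
    # dependencies = ["requests"]
    # ///
    import requests
    print(requests.get("https://example.com").status_code)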
Other than that, uv is awesome; this small quirk is just quite annoying.