I’ve given a bit of thought to Rust, since that’s what Polars is written in and I want to move away from pandas.
Is nim a good place to go?
My initial impression though is that the scope is very broad. Trying to be scikit-learn, numpy, and torch all at once seems like a recipe for doing none of them very well.
It's interesting to contrast this with the visions/aspirations of other new-ish deep learning frameworks. Starting with my favorite: Jax offers "composable function transformations + autodiff". Obviously there is still a tonne of work to do this well, support multiple accelerators, etc. But notably I think they made the right call to leave high-level abstractions (like fully-fledged NN libraries or optimisation libraries) out of the Jax core. It does what it says on the box. And it does it really, really well.
TinyGrad seems like another interesting case study, in the sense that it is aggressively pushing to reduce complexity and LOC while still providing the relevant abstractions to do ML on multiple accelerators. It is quite young still, and I have my doubts about how much traction it will gain. Still a cool project though, and I like to see people pushing in this direction.
PyTorch obviously still has a tonne of mind-share (and I love it), but it is interesting to see the complexity of that project grow beyond what is arguably necessary. (e.g. having a "MultiHeadAttention" implementation in PyTorch is a mistake in my opinion).
Personally I’d like Arraymancer to be a great tensor library (basically a very good and ideally faster alternative to numpy and base Matlab). Frankly I think that it’s nearly there already. I’ve been using Arraymancer to port a 5G physical layer simulator from Matlab to nim and it’s been a joy. It’s not perfect by any means but it’s already very good. And given how fast nim’s scientific ecosystem keeps improving it will only get much better.
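To give a flavour of what that looks like (a minimal sketch from memory; check the Arraymancer docs for the exact API, the operator spellings have shifted a bit between versions):

    import arraymancer

    let a = [[1.0, 2.0],
             [3.0, 4.0]].toTensor

    echo a * a       # matrix multiplication for rank-2 tensors
    echo a +. 1.0    # broadcasted element-wise addition
    echo a.mean      # reductions, much like numpy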
Programming in nim feels a lot like programming in a statically typed Python, except that you get small, fast, single file executables out of the box.
Compared to C++ you don't need to worry about memory allocation much, except when you really want to, and when you do it is much simpler than in C or C++. It is also much easier to control mutability, and the type system is much better. You can even use most of the language at compile time, in a much more elegant way than in modern C++.
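A small illustration of the compile-time part (an ordinary proc, no template metaprogramming needed):

    proc fib(n: int): int =
      if n < 2: n
      else: fib(n - 1) + fib(n - 2)

    const f = fib(20)  # evaluated entirely at compile time
    echo f             # prints 6765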
Another surprising thing is that while the ecosystem is obviously much smaller than those of more popular languages, the built-in package manager gives you access to it much more easily than in C++. There are a lot of high-quality packages (such as datamancer, arraymancer and ggplotnim) which make nim very productive in a lot of domains.
That's not even counting advanced features such as nim's excellent macro system, which I personally don't use (but which enables some of the best nim libraries).
Oh, and I _love_ nim’s uniform function call syntax. Every other language should copy that feature.
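For anyone who hasn't seen it: UFCS means any proc can be called method-style on its first argument, so these are all the same call (a trivial sketch):

    proc double(x: int): int = x * 2

    echo double(5)   # regular call syntax
    echo 5.double    # method-style, same proc
    echo 5.double()  # parentheses are optional here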
I almost forgot to list the drawbacks. The main one, of course, is that it is not the most popular language (though the ecosystem is big enough that I haven't found that to be a big problem for my use case in practice). Other than that, the editing and debugging experience could be improved. There is a decent VS Code plug-in (look for the one made by "saem") but it is just OK, not great. There is some integration with gdb, but it is a bit rough. I usually end up adding logs when I need to debug something.
https://forum.nim-lang.org/t/5734
The objective is to make nim suitable for embedded programming and other use cases for which garbage collection is a non-starter. This new --gc:arc mode is already available in the nightly builds and the benchmarks are already impressive. I believe that the plan is to make arc the default "garbage collector mode" in nim 1.2.
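Trying it on a nightly build is just a compile switch:

    nim c --gc:arc myprogram.nim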
Disclaimer: I'm a PM on PowerShell at Microsoft, and the following is based on my personal observations. There's also no current plan to do any of what I suggested (but it's certainly giving me a lot to think about at the start of my work week).
The default ExecutionPolicy was ingrained and controversial long before I joined the PowerShell Team (or even Microsoft), and I'll be the first to admit that, as a long-time Linux user, I didn't GET why I couldn't just run arbitrary scripts.
The public Microsoft position on the matter has always been that ExecutionPolicy is there to keep you from shooting yourself in the foot. In Linux, it's somewhat similar to having to add +x to a script before you can run it (although that's clearly much simpler than signing a script).
I'd say it's also akin to the red browser pages warning you about self-signed certificates, or Microsoft/Apple "preventing" you from running certain app packages that are "untrusted" in one form or another. In one sense, you could actually argue that PowerShell was ahead of the curve.
Now as a power user with much less at stake than, say, an IT administrator at a Fortune 500 company, the first thing I do with all these restrictions, warnings, and confirmation prompts is to turn them off.
But those warnings (often with no clear way to disable them) are there for a reason. PowerShell was built at a time when Microsoft admins were GUI-heavy, and PowerShell did its best to shepherd them into a scripting world while fostering best practices. And if you're using a bunch of scripts sourced from SMB shares within your domain as a domain administrator, you don't want to accidentally run a script that hasn't gone through your internal validation process (hopefully culminating in the script getting signed by your domain CA).
So let me assume that you agree with everything so far. Why does this experience still stink? Any Microsoft administrator worth their salt uses PowerShell, and many of even the power-est of users find ExecutionPolicy annoying.
In my opinion, it's too hard to sign scripts. We should be following in the footsteps of the drive to get everything on HTTPS (props to Let's Encrypt and HTTPS Everywhere, along with many others). We should have guidance on signing if you have a CA, we should have guidance on getting scripts signed by third party CAs, and we should probably offer a cheap/free service for signing stuff that goes up on the PowerShell Gallery.
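(For reference, the current flow, assuming you already have a code-signing certificate installed, looks something like this; the script name is made up:)

    # find a code-signing cert in the current user's store
    $cert = Get-ChildItem Cert:\CurrentUser\My -CodeSigningCert | Select-Object -First 1

    # sign the script, then verify the result
    Set-AuthenticodeSignature -FilePath .\Deploy.ps1 -Certificate $cert
    Get-AuthenticodeSignature .\Deploy.ps1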
Oh, and we should make it easier to read the red error that gets thrown, which explains how to look up a help topic telling you how to bypass/change ExecutionPolicy.
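(For what it's worth, the change most people end up making is a one-liner:)

    Set-ExecutionPolicy RemoteSigned -Scope CurrentUser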
Unfortunately, that's all easier said than done. But the default being what it is puts the onus on us to do something to make security easier.
I understand your reasoning, and I like some of your proposals; I'm all for making it easier to sign scripts, for example. Yet I feel that the solutions you mention ignore the fact that this "security" mechanism can be bypassed with a simple BAT file.
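For the record, the bypass is as simple as this (script names made up):

    @echo off
    powershell.exe -ExecutionPolicy Bypass -File "%~dp0payload.ps1"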
Why do PowerShell scripts require more security than a BAT file or an executable? Are users really safer thanks to the ExecutionPolicy check? Or are they simply worse off, because people will either use less powerful BAT files or completely opaque executables? At least with a PowerShell script you are able to inspect the code if you are so inclined. By pushing people to use executables instead, we make it less likely that they know what changes will be made to their system.
If the problem is admins accidentally double-clicking on unsigned scripts, by all means show a confirmation dialog the _first_ time an unsigned script is executed. Actually, do that for BAT files and perhaps even for executables as well. But don't do it (by default, at least) when someone calls a script explicitly from a command line or from another script. IMHO that would really make us all safer and would make PowerShell a real replacement for BAT files.
I understand that this is done for security reasons, but Windows already lets you execute any executable or BAT file that you might have downloaded from the Internet. So I'm not sure that disabling PowerShell script execution really gains you much (and there are probably other better solutions anyway).
So IMHO, as long as cmd.exe is available and PowerShell is so limited by default, BAT files will not go away, which is unfortunate.
- Commit 1 -> Commit 2 -> Commit 3 (master/default) - current branch
Now Commit 3 is the HEAD of your current branch (master, trunk, default, whatever). The way I understand your "problem", you want to create a new branch based on an older, non-HEAD commit. For instance, you check out the branch as it was at "Commit 2" and make a new commit, "Commit 4". This creates a new implicit branch:
- Commit 1 -> Commit 2 +-> Commit 3 (master/default)
                       +-> Commit 4 (implicit branch) - new current branch
Now you have two branches. How do you switch between them? How do you operate on those branches? Do you cycle through them? Do you have to remember the last commit message of a branch to find it again? To be absolutely clear: what I'm curious about is how you relate to and navigate between branches when they are all anonymous. What mechanisms do you have to aid identification?
So in your example you would do:
"hg update 2" to checkout Commit 3, and "hg update 3" to checkout Commit 4
Note that these revision numbers are _local_ to your repository. That is, another clone of the same repository may have different revision numbers assigned to different revisions. The revision numbers are assigned in the order in which commits have been added to that particular clone.
You could of course also use the revision id (i.e. the SHA1 hash) to identify them. You do not need to use the whole revision id, just the part that makes it unique, usually the first few (e.g. 6-8) characters of the hash.
In addition, Mercurial has a very rich mini-language for querying and identifying revisions in your repository. These queries are called "revsets". Most mercurial commands accept revsets wherever they would need to receive a revision identifier. With them you can identify revisions in many ways, such as by date, by belonging to a given branch, by being committed by a certain author, by containing a certain file, etc. (and any combination of those and many others).
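For example (the author, branch, and path here are made up, but the revset functions are real):

    hg log -r "author(alice) and date('>2023-01-01')"
    hg log -r "branch(default) and file('src/main.c')"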
Finally, if you use a good mercurial GUI (such as TortoiseHg) the whole thing is moot because the GUI will show you the whole DAG and you can just click on the revision that you want and click the "Update" button.
I actually find the ability to create these "anonymous" branches really useful. Naming things is hard so I find that coming up with a name for each new short lived branch is annoying.
There are all these different levels of "saveyness" and the lower levels are just not as interesting.
In particular, I don't want stupid untested typos in the commit history because they just aren't interesting or helpful.
But I'm willing to say that I'm probably missing something… I just don't see what it is yet. :-)
Those hidden, obsolete revisions are not shown in your DAG and they are generally neither pushed nor pulled. In most respects they behave as if they were not even there. It is only when you need them that you can show them or go back to them (by using the --hidden flag of some of mercurial's commands, such as hg log or hg update). This gives you a nice safety net (since rewriting history is no longer a destructive operation) that you can use _if you want_. It also makes it possible to rewrite revisions that you have already shared with other users (since when you push a successor revision you also push the list of revisions that it supersedes).
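For example (a sketch; <rev> is whatever hidden revision you want back):

    hg log --hidden            # include obsolete revisions in the log
    hg update --hidden <rev>   # check out a hidden revision again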
I think evolve is a significant step forward on the DVCS paradigm as it enables safe, distributed, collaborative history rewriting. This is something that, AFAIK, was not possible up until now.
Thanks for the tip.
You can work around this surprising behavior by always using inline dependencies, or by using the `--project` argument, but the latter requires that you type the script path twice, which is pretty inconvenient.
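For reference, "inline dependencies" here means the PEP 723 metadata block at the top of the script, which uv picks up automatically (the dependency is just an example):

    # /// script
    # dependencies = ["requests"]
    # ///
    import requests
    print(requests.get("https://example.com").status_code)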
Other than that, uv is awesome; this small quirk is just quite annoying.