Posted by u/tillahoffmann 5 months ago
Show HN: Localscope – Limit scope of Python functions for reproducible execution (localscope.readthedocs.io...)
localscope is a small Python package that disassembles functions to check if they access global variables they shouldn't. I wrote this a few years ago to detect scope bugs which are common in Jupyter notebooks. It's recently come in handy writing jax code (https://github.com/jax-ml/jax) because it requires pure functions. Thought I'd share.
nine_k · 5 months ago
Nice! This approach can be used to implement coeffects (as e.g. seen in Hack [1]), by only passing explicit effect-producing objects. Imagine a function that's guaranteed to not write to any files, because it can't access `open()`, or can only access a version of `open()` that only accepts read-only access.

[1]: https://docs.hhvm.com/hack/contexts-and-capabilities/introdu...
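
To make the read-only `open()` idea concrete, here is a rough Python sketch. It does not use localscope; the names `read_only_open` and `count_lines` are made up for illustration, and the capability is simply passed in as an argument. A scope check would still be needed on top of this to stop the function from reaching for the built-in `open()` directly.

```python
import builtins
import os
import tempfile


def read_only_open(path, mode="r", **kwargs):
    """Hypothetical capability: like open(), but refuses any mode that can write."""
    # Only "r", "b" and "t" are allowed, so "w", "a", "x" and "+" are rejected.
    if set(mode) - set("rbt"):
        raise PermissionError(f"mode {mode!r} is not read-only")
    return builtins.open(path, mode, **kwargs)


def count_lines(path, open_file):
    """Uses only the capability it was handed, never the global open()."""
    with open_file(path) as handle:
        return sum(1 for _ in handle)


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "notes.txt")
    with open(path, "w") as handle:
        handle.write("first\nsecond\n")

    # Reading through the capability works.
    line_count = count_lines(path, read_only_open)

    # Asking the capability for write access is refused.
    try:
        read_only_open(path, "w")
        write_refused = False
    except PermissionError:
        write_refused = True
```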

simonw · 5 months ago
I was curious how this works - it's using the Python standard library disassembler: https://docs.python.org/3/library/dis.html

    for instruction in dis.get_instructions(code):
        ...
Code here: https://github.com/tillahoffmann/localscope/blob/092392d5bdb...
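
For anyone who wants to see the general idea without reading the linked source, here is a minimal sketch of a `dis`-based check. This is my own simplification, not localscope's actual implementation: the helper name `global_names` is hypothetical, and it only looks at `LOAD_GLOBAL` and `LOAD_NAME` instructions and ignores builtins.

```python
import builtins
import dis
import types


def global_names(func):
    """Return the non-builtin names that func's bytecode loads from global scope."""
    found = set()
    pending = [func.__code__]
    while pending:
        code = pending.pop()
        for instruction in dis.get_instructions(code):
            if instruction.opname in {"LOAD_GLOBAL", "LOAD_NAME"}:
                # argval holds the variable name for these instructions.
                if not hasattr(builtins, instruction.argval):
                    found.add(instruction.argval)
        # Nested functions and lambdas are stored as code objects in co_consts.
        pending.extend(c for c in code.co_consts if isinstance(c, types.CodeType))
    return found


def scaled_sum(xs):
    # `scale` is deliberately never defined: the function is only inspected,
    # not called, which mimics a notebook variable that leaked into a function.
    return scale * sum(xs)


def pure_sum(xs):
    return sum(xs)


leaked = global_names(scaled_sum)
clean = global_names(pure_sum)
```

In this sketch `leaked` contains `scale`, while `clean` is empty because `sum` is a builtin.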

kazinator · 5 months ago
Simple localscope in TXR Lisp:

  $ txr -i localscope.tl
  Pour le service en la langue Shell, appuyez sur Ctrl-D.
  1> (localscope (+ %pi% %pi%)) ;; OK, %pi% is built-in
  6.28318530717959
  2> (localscope *print-base*) ;; likewise
  10
  3> (localscope (+ a b c)) ;; bad
  ** expr-3:1: localscope: global variables (c b a) used
Code:

  (defmacro localscope (:form f :env e . forms)
    (let ((body ^(progn ,*forms)))
      (tree-bind (exp fv-inner ff-inner fv-outer ff-outer) (expand-with-free-refs
                                                             body e)
        (ignore ff-inner fv-outer ff-outer)
        (let* ((usr (find-package :usr))
               (globals [keep-if (do or (not (boundp @1))
                                        (neq (symbol-package @1) usr))
                                 fv-inner]))
          (if globals
            (compile-error f "global variables ~s used" globals)))
        body)))
We get the list of free variable references emanating from the enclosed forms and filter it. We keep every symbol that does not have a global binding from the standard library: that is, a symbol that either is not bound at all or, if it is bound, does not belong to the usr ("user space") package.

kazinator · 5 months ago
Of course, we need test cases like this:

  1> (let (x) (localscope (+ x y)))
  ** expr-1:1: localscope: global variables (y) used

mpeg · 5 months ago
I wrote myself a similar decorator for a completely different purpose – ensuring a function I'm going to serialise over the network doesn't have any outside dependencies

This is actually a cleaner API so might switch my code to it, amazing work
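
A decorator along those lines might look roughly like the sketch below. This is a guess at the general shape, not mpeg's code or localscope's API: the name `self_contained` is hypothetical, and it relies on the standard library's `inspect.getclosurevars`, which reports names found in the function's own code object only (nested functions are not inspected).

```python
import inspect


def self_contained(func):
    """Hypothetical decorator: reject functions that read outside variables."""
    closure = inspect.getclosurevars(func)
    # Names resolved from an enclosing function or from module globals.
    outside = set(closure.nonlocals) | set(closure.globals)
    if outside:
        raise ValueError(
            f"{func.__name__} depends on outside names: {sorted(outside)}"
        )
    return func


@self_contained
def add(a, b):
    return a + b


offset = 10
message = None
try:

    @self_contained
    def shifted(a):
        # `offset` lives outside the function, so decoration fails.
        return a + offset

except ValueError as error:
    message = str(error)
```

Here `add` is accepted unchanged, while decorating `shifted` raises a `ValueError` that names `offset`.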

dleeftink · 5 months ago
> Everything works nicely, and you package the code in a function for later use but forget about the scale factor introduced earlier in the notebook.

You see a problem, you fix it with a library, and I applaud that. You have to wonder, though: how many years does it take for a reproducible notebook environment to implement out-of-scope variable guards?

jampekka · 5 months ago
Jupyter-style notebooks are a good example of a deep architectural mistake that needs hacks on hacks on hacks to remain barely serviceable.

Luckily there are new approaches, e.g. Marimo and Pluto, that don't have the same root issue.

tillahoffmann · 5 months ago
Yes, Jupyter notebooks are flawed (there's a great talk at https://www.youtube.com/watch?v=7jiPeIFXb6U). But they are also very convenient for scrappy work and exploratory data analysis.
nerdponx · 5 months ago
This is for people who don't want to switch notebook environments, because Jupyter(lab) is getting better faster than alternatives (which support things like reactive cell execution) are becoming usable for day-to-day work.

It's just a safeguard for well-intentioned people to prevent themselves from making mistakes with their existing tools, instead of changing to a completely different set of tools.

escapecharacter · 5 months ago
Since I started using Jupyter notebooks, I've wanted a feature to "undo" running a cell. This feels so important for spontaneous exploration. This work feels like an important building block for this!
olejorgenb · 5 months ago
A crazy, but cool idea to implement this is using fork: https://github.com/thomasballinger/rlundo
Vaslo · 5 months ago
Only solved by the ever efficient clear outputs, restart, then run all the code all over again…
mythrowaway49 · 5 months ago
Agree! I feel like there must be a good workaround. Currently, I just have to go back and run a bunch of cells again.
nathan_compton · 5 months ago
I have this idea of tools which help you do the right thing and tools that let you do the wrong thing longer. Jupyter is already the latter sort of tool, and this is the bad kind of tool on top of the bad kind of tool.
anitil · 5 months ago
I think it's a neat approach to a practical problem. I'd love an equivalent in C to be able to smoke out this sort of issue (though of course we'd have to fix 'errno' first)
veilgen · 5 months ago
This looks like a handy tool, especially for catching unintended global variable access in environments like Jupyter notebooks, where scope bugs can easily creep in. The ability to disassemble functions and analyze their dependencies could be particularly useful for debugging and enforcing best practices in functional programming, as seen in JAX.

Would be interesting to see how it compares to static analysis tools like mypy or linters—does it catch edge cases they might miss? Nice work!