Netflix open-sources Polynote, an IDE-inspired polyglot notebook

neves · 6 years ago

I only ever find reproducible notebooks on the internet. It is really rare to get them from coworkers, and if they aren't trained as developers, it is almost impossible. Their solution to this problem looks really efficient, and it is both simple and brilliant:

> Writing Polynote’s code interpretation from scratch allowed us to do away with this global, mutable state. By keeping track of the variables defined in each cell, Polynote constructs the input state for a given cell based on the cells that have run above it. Making the position of a cell important in its execution semantics enforces the principle of least surprise, allowing users to read the notebook from top to bottom. It ensures reproducibility by making it far more likely that running the notebook sequentially will work.
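A minimal sketch of the idea in that quote, not Polynote's actual implementation: each cell runs against an input state built only from the cells above it, so there is no shared global namespace. The helper names (`run_cell`, `input_state_for`, `run_notebook`) are made up for illustration.

```python
# Toy model of position-based cell state. This is an assumption-laden sketch,
# not how Polynote is implemented; all helper names are hypothetical.

def run_cell(source, input_state):
    """Run one cell on a copy of its input state; return the names it (re)binds."""
    scope = dict(input_state)          # copy, so the cell cannot rebind names upstream
    exec(source, scope)
    scope.pop("__builtins__", None)    # exec adds this entry; it is not a cell variable
    return {name: value for name, value in scope.items()
            if name not in input_state or input_state[name] is not value}

def input_state_for(outputs, index):
    """Merge the outputs of all cells above `index`, in notebook order."""
    state = {}
    for i in range(index):
        state.update(outputs.get(i, {}))
    return state

def run_notebook(cells):
    """Run cells top to bottom, recording what each one defines."""
    outputs = {}
    for i, source in enumerate(cells):
        outputs[i] = run_cell(source, input_state_for(outputs, i))
    return outputs

outputs = run_notebook(["x = 1", "y = x + 1", "z = x + y"])
print(outputs[2])  # {'z': 3}

# Deleting the middle cell removes `y` from the input state of the last cell.
try:
    run_notebook(["x = 1", "z = x + y"])
    deleted_cell_still_visible = True
except NameError:
    deleted_cell_still_visible = False
print(deleted_cell_still_visible)  # False
```

In this toy model a cell only ever sees what the cells above it defined, which is why a top-to-bottom run is the only state there is.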

friggeri · 6 years ago

Some related links if you’re interested in this kind of approach:

https://nbviewer.jupyter.org/github/friggeri/notebooks/blob/...

https://github.com/jupytercalpoly/reactivepy

https://dataflownb.github.io/

https://github.com/stitchfix/nodebook

type_enthusiast · 6 years ago

Thanks for the kind feedback. It's a young project to be honest, but I'm pretty proud of what we've done with only two contributors so far. With community participation I think we could support many more languages pretty quickly!

yomritoyj · 6 years ago

I have always wished for reproducibility. Thanks for taking up this feature. How do you handle aliasing and references inside objects? Suppose I have

    # Cell 1
    a = [1, 2, 3]
    b = (a, True)

    # Cell 2
    b[0][0] = 5

    # Cell 3
    print(sum(a))

Now if I change Cell 2 to

    # Cell 2'
    b[0][0] = 4

and execute, Cell 3's result becomes stale. Do you track such dependencies? Would really love to read more about the underlying implementation.
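To make the concern concrete, here is a small plain-Python sketch of the three cells. It says nothing about how Polynote handles this case; it only shows that Cell 2 rebinds no names, so a tracker that watches variable bindings alone (an assumption about one possible design) would not notice that Cell 3 depends on it.

```python
# Cell 1
a = [1, 2, 3]
b = (a, True)          # b[0] is the same list object as a

# Snapshot which objects the names point at before Cell 2 runs.
bindings_before = {"a": id(a), "b": id(b)}

# Cell 2: mutates the list through the alias, rebinds nothing
b[0][0] = 5

bindings_after = {"a": id(a), "b": id(b)}
bindings_unchanged = bindings_before == bindings_after
print(bindings_unchanged)  # True

# Cell 3
first_total = sum(a)
print(first_total)  # 10

# Cell 2' (the edited cell): Cell 3's earlier output of 10 is now stale
b[0][0] = 4
second_total = sum(a)
print(second_total)  # 9
```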

datagy · 6 years ago

You could also try Voilà to make a reproducible standalone app from a notebook: https://blog.jupyter.org/and-voilà-f6a2c08a4a93

aargh_aargh · 6 years ago

According to the article, the most interesting feature compared to Jupyter is that there is no hidden state - if you delete a cell, the variables it set are gone. Also, you can mix languages - you'll be able to access variables set by previously executed cells in another language.

Personally, I'm looking forward to trying out the SQL support. I haven't seen an elegant solution for SQL notebooks in Jupyter, it was always second-class via Python or some such. Or have I missed something?

capableweb · 6 years ago

> Also, you can mix languages

Interesting. Since it seems to be implemented in a JVM language and a screenshot shows "Scala" as a supported language, I'm guessing at least all the JVM languages are supported (personally hoping for Clojure), but I can't seem to find a list of supported languages anywhere in the post or on the website.

What languages are supported by Polynote?

type_enthusiast · 6 years ago

Currently just Scala and Python (via jep). Looking to add more (probably starting with Java and Clojure) but haven't had time yet. There are just two of us working on it so far. PRs welcome!

lalaithion · 6 years ago

Looks like just Scala and Python right now:

https://github.com/polynote/polynote/blob/08f0751138e2991cf7...

mch82 · 6 years ago

> Currently Scala, Python, and SQL cell types are supported.

type_enthusiast · 6 years ago

The SQL support is done through Spark, so it's not particularly novel – Zeppelin for example supports SQL similarly. We've talked about adding a more general SQL interpreter, though. Happy to hear any suggestions about it!

chrisjc · 6 years ago

Do you know of any generalized SQL interpreter that allows push-downs to the underlying engine where possible, but can also allocate compute resources to post-push-down operations? E.g., merging disparate result sets or making up for features the underlying engines lack.

Closest thing that comes to mind is something like Apache Drill, which coincidentally also uses Apache Calcite as the SQL interpreter.

Also wondering why I would use this over Zeppelin which can support other interpreters like Flink?

truculent · 6 years ago

R notebooks do a lot of this stuff reasonably well, too. But then you have R as the primary language (other langs are well supported though).

As ever, the best answer is the Notebooks Are Bad, Actually

vilos1611 · 6 years ago

If anyone would like a docker image, I created one today: https://hub.docker.com/r/greglinscheid/polynote https://github.com/Vilos92/polynote

airstrike · 6 years ago

I like this as a concept, but the JDK / jep requirements are a bit of a turn off, personally... I understand they want it to speak Spark but that's not exactly how I would imagine it worked from the name or the "polyglot notebook" description

zmmmmm · 6 years ago

While the reproducibility problem is definitely an issue, I'm not sure it's such a big issue that I'd switch to a whole different notebook solution for it. For most notebook scenarios, running from scratch works fine to ensure it reproduces. Apart from this one feature, BeakerX does all the same things and fits a lot better into the existing Jupyter ecosystem.

type_enthusiast · 6 years ago

To be clear, we're not out to supplant Jupyter. Anybody who's happy with their Jupyter setup will likely find little value in Polynote. But it has plugged some gaps we've had in our Scala ML research team at Netflix, so we thought others might see some value as well.

type_enthusiast · 6 years ago

And there are lots of teams at Netflix that are investing in Jupyter as well! Plenty of room for both options.

airstrike · 6 years ago

Somewhat off-topic, but what's with the lambda replacing the "n" letter? I'm no expert in Greek but I thought lambda was the equivalent to the letter "l"...

type_enthusiast · 6 years ago

The logo was hastily designed by an amateur (me). I figured most people would figure it out, pedantic people would complain, and we'd all have a good time :)

We've had some better options contributed in the past couple of weeks, but as long as we're going to change it I didn't want to rush that. So we stuck with my questionable typographic treatment for the blog post.

(Edit: autocorrect typo)

nsgf · 6 years ago

Atm it reads 'polilote' in Greek. You might want to replace 'λ' with 'ν'.

kazinator · 6 years ago

Right? It would work as Poλynote.

type_enthusiast · 6 years ago

See, we tried that, and to me it just looked like "ponynote". So far everyone who's mentioned "polylote" has been a current or former physicist, so maybe there's an interesting correlation there...

aaronbrethorst · 6 years ago

Perhaps the product name is actually pronounced "Polllote"

type_enthusiast · 6 years ago

You're more than welcome to pronounce it that way! :)

gen3 · 6 years ago

It looks like the editor this uses is Monaco, the editor in VS Code. That’s pretty cool.

type_enthusiast · 6 years ago

It does! Monaco is one of the many awesome open source libraries that made Polynote possible. We'll be discussing that at Scale by the Bay; check out our talk if you're going!

prestonh · 6 years ago

It seems like the tool was mainly invented to deal with the issue of hidden state in notebooks, but I honestly don't see what the big deal is. Hidden state in Jupyter notebook is a gotcha that you can learn to deal with extremely quickly. I've been a Jupyter notebook user for several years, so I haven't had this problem often in recent memory, but I've led workshops where we teach users how to use the notebook. Inevitably hidden state issues come up, but students very quickly learn that restarting the kernel is a necessary part of the workflow and figure out when they need to do it.
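For anyone who hasn't hit the gotcha, a rough simulation in plain Python: a single dict stands in for the kernel's global namespace (that stand-in is an assumption for illustration, not Jupyter's real machinery). A variable from a cell that was run and then deleted keeps working until the "kernel" is restarted.

```python
kernel = {}  # stands in for a kernel's single, shared global namespace

# A cell that was run once and later deleted from the notebook.
exec("threshold = 10", kernel)

# A surviving cell still works, because the variable lingers in the kernel.
exec("result = threshold * 2", kernel)
print(kernel["result"])  # 20

# "Restart the kernel": the namespace is wiped, and the deleted cell is gone.
kernel = {}
try:
    exec("result = threshold * 2", kernel)
    survived_restart = True
except NameError:
    survived_restart = False
print(survived_restart)  # False
```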