simonw · 2 years ago
Wow. A now-CC-licensed 4 day (when presented in-person) Python training course that's been iterated on for 16 years!

David wrote https://www.dabeaz.com/generators/ which remains one of my all-time favourite Python tutorials. Looking forward to digging into this.

sebk · 2 years ago
Beazley's Concurrency From the Ground Up is one of my favorite tech talks ever: In about 45 minutes he builds an async framework using generators, while live coding in an emacs screen that shows only about 20 lines and without syntax highlighting, and not breaking stride with his commentary and engaging with the audience.

It's 8 years old, but definitely worth a watch: https://www.youtube.com/watch?v=MCs5OvhV9S4
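
For a flavor of what the talk builds, here is a minimal sketch (not the talk's actual code; the names `countdown` and `run` are made up) of a round-robin scheduler driving plain generators, which is the core trick the talk starts from:

```python
from collections import deque

def countdown(label, n):
    # A "task" is just a generator; each yield hands control back to the scheduler
    while n > 0:
        print(f"{label} {n}")
        yield
        n -= 1

def run(tasks):
    # Round-robin scheduler: resume each task in turn until all are finished
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
            ready.append(task)  # still alive: put it back in the queue
        except StopIteration:
            pass                # task finished: drop it

run([countdown("foo", 2), countdown("bar", 2)])
```

The two countdowns interleave (foo 2, bar 2, foo 1, bar 1) even though there are no threads involved.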

boredemployee · 2 years ago
Exactly what I thought while watching that video; it's as if he's spitting out the characters as he speaks:

"A fantastic, entertaining and highly educational talk. It always bothers me that I can't play the piano and talk at the same time (my wife usually asks me things while I'm playing). But David can even type concurrent Python code in Emacs in Allegro vivace speed and talk about it at the same time. An expert in concurrency in every sense of the word. How enviable!"

globular-toast · 2 years ago
There's also the one where he live codes a WebAssembly interpreter. But my favourite is his talk on lambda calculus. It's incredibly fun to follow along.
heap_perms · 2 years ago
Yes, that talk is legendary. Very impressive how he is able to both talk and write the code at the same time.
philipov · 2 years ago
Is that the one where he modified his interpreter to provide slapstick comedy as part of the talk?
_ank_it · 2 years ago
The love of programming and engineering... this is what motivated me in the first place to do CS
samstave · 2 years ago
Wow - that was awesome - wish I had known of this vid for years - but thank you.
terlfgy · 2 years ago
It is entertaining and intelligent, but you won't learn Python from it and you won't get anywhere near a production ready implementation, since it glosses over all the hard parts.
drexlspivey · 2 years ago
Yes! My favorite Python talk, dude is a wizard
throwaway290 · 2 years ago
I wish they turned off ads on this. If this is PyCon surely they get PSF sponsorship money anyway...
faizshah · 2 years ago
Since we are sharing resources: Fluent Python is my favorite reference on Python. It covers so many advanced features like concurrency, functools, etc. It's not the kind of book you read cover to cover; it's one that you go to as you need it. When I was working on Python stuff I would read it once a month.

My favorite introductory book (not an introduction to programming but an introduction to the language) is “Introducing Python by Lubanovic” because it’s one of the only beginner books that actually covers the python module system with enough depth and the second half of the book gives a quick overview of a lot of different python libraries.

manvillej · 2 years ago
If we're talking favorite python tutorials, I am a huge fan of this tutorial on python entrypoints: https://amir.rachum.com/python-entry-points/
Al0neStar · 2 years ago
I'm guessing this was submitted after the "Ask HN" about leveling up to a production python programmer and i'm surprised no one mentioned these books:

1. Test-Driven Development with Python

2. Architecture Patterns with Python

The 2nd one is the closest you're gonna get to a production-grade tutorial book.

Related to this topic, these resources by @dbeazley:

Barely an Interface

https://github.com/dabeaz/blog/blob/main/2021/barely-interfa...

Now You Have Three Problems

https://github.com/dabeaz/blog/blob/main/2023/three-problems...

A Different Refactoring

https://github.com/dabeaz/blog/blob/main/2023/different-refa...

His youtube channel:

https://youtube.com/@dabeazllc

DarkNova6 · 2 years ago
Am I the only one who is not particularly impressed by any of these links? Maybe I should see them as exemplary, but they would not make it through a code review.

-> Currently serving as Application Architect for a medium sized Python application.

cauthon · 2 years ago
For the purposes of discussion, it'd probably be helpful to describe the issues you would identify during review
synergy20 · 2 years ago
'written by the same author' -- I thought it was dbeazley, but it's somebody else in fact.
Al0neStar · 2 years ago
Fixed it. I meant that both of the books mentioned are written by Harry Percival (and the 2nd was written as a sequel).
mafuku · 2 years ago
damn, I feel really dumb trying to follow the logic of all those function compositions in the three problems post
ddejohn · 2 years ago
You shouldn't. Author seems pretty tongue-in-cheek about it:

> lambda has the benefit of making the code compact and foreboding. Plus, it prevents people from trying to add meaningful names, documentation or type-hints to the thing that is about to unfold.

Disclaimer, I did not read the entire post.

george_____t · 2 years ago
Python isn't the best language for exploring this sort of thing, by the author's own admission.
underdeserver · 2 years ago
David Beazley will forever have my respect for his talk where he uses Python to untangle 1.5T of C++ code on an airgapped computer, as an expert witness in a court case:

https://youtu.be/RZ4Sn-Y7AP8

It's 47 minutes and totally worth it.

faitswulff · 2 years ago
I don't even write Python but after watching Beazley's concurrency talk I will watch David Beazley talks all day. Adding this to the list
jamesdutc · 2 years ago
I taught this course to corporate clients for three or four years before developing my own materials.

The materials for this course and the introductory course (“Practical Python”[1]) are quite thorough, but I've always found the portfolio analysis example very hokey.

There's enormous, accessible depth to these kinds of P&L reporting examples, but the course evolves this example in a much less interesting direction. Additionally, while the conceptual and theoretical material is solid, the analytical and technical approach that the portfolio example takes quickly diverges from how we would actually solve a problem like this. (These days, attendees are very likely to have already been exposed to tools like pandas!) This requires additional instructor guidance to bridge the gap, to reconcile the pure Python and “PyData” approaches. (Of course, no other Python materials or Python instruction properly address and reconcile these two universes, and most Python materials that cover the “PyData” universe—especially those about pandas—are rife with foundational conceptual errors.)

Overall, David is an exceptional instructor, and his explanations and his written materials are top notch. He is one of the most thoughtful, most intelligent, and most engaging instructors I have ever worked with.

I understand from David that he rarely teaches this course or Practical Python to corporate audiences, instead preferring to teach courses directly to the public. (In fact, I took over a few of his active corporate clients when he transitioned away from this work, which is what led me to drafting my own curricula.) I'm not sure if he still teaches this course at all anymore.

However, I would strongly encourage folks to look into his new courses, which cover a much broader set of topics (and are not Python-specific)! [2]

Also, if you do happen to be a Python programmer, be sure to check out his most recent book, “Python Distilled”[3]!

[1] https://dabeaz-course.github.io/practical-python/

[2] https://www.dabeaz.com/courses.html

[3] https://www.amazon.com/Python-Essential-Reference-Developers...

diimdeep · 2 years ago
> https://www.dabeaz.com/courses.html

Well, unfortunately the 5-day courses listed there are $1500 each.

If the free of charge course discussed here is really that good, it is a nice promo to go and pay for another. Ed Tech Lo-Fi style.

kingkongjaffa · 2 years ago
Something I hate in my own code is this pattern of instantiating an empty list and then appending to it in a loop when reading files. Is there a better way than starting with lst = [] and then later doing lst.append()?

This is an example from the linked course https://github.com/dabeaz-course/python-mastery/blob/main/Ex...:

```
# readport.py

import csv

# A function that reads a file into a list of dicts
def read_portfolio(filename):
    portfolio = []
    with open(filename) as f:
        rows = csv.reader(f)
        headers = next(rows)
        for row in rows:
            record = {
                'name' : row[0],
                'shares' : int(row[1]),
                'price' : float(row[2])
            }
            portfolio.append(record)
    return portfolio
```

nicwolff · 2 years ago
Whenever you see this pattern, think of using a generator instead:

    def read_portfolio(filename):
        with open(filename) as f:
            rows = csv.reader(f)
            headers = next(rows)
            for row in rows:
                yield {
                    'name' : row[0],
                    'shares' : int(row[1]),
                    'price' : float(row[2]),
                }
Now you can call read_portfolio() to get an iterable that lazily reads the file and yields dicts:

    portfolio = read_portfolio(filename)
    for record in portfolio:
        print('{shares} shares of {name} at ${price}'.format_map(record))

drexlspivey · 2 years ago
or use the built-in csv.DictReader :)
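
A minimal sketch of that suggestion (the inline data is hypothetical, standing in for the course's portfolio file):

```python
import csv
import io

# Hypothetical CSV data in place of the course's portfolio file
data = io.StringIO("name,shares,price\nAA,100,32.2\nIBM,50,91.1\n")

# DictReader uses the header row as keys, so no manual indexing is needed;
# values still arrive as strings and must be converted ourselves
portfolio = [
    {'name': row['name'], 'shares': int(row['shares']), 'price': float(row['price'])}
    for row in csv.DictReader(data)
]
```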
PennRobotics · 2 years ago
It's not better than a generator, but I'm surprised nobody has mentioned the very terse and still mostly readable

    header, *records = [row.strip().split(',') for row in open(filename).readlines()]
but then you need a way to parse the records, which could be Template() from the string library or something like...

    type_record = lambda r : (r[0], int(r[1]), float(r[2]))
At this point, the two no longer mesh well, unless you would be able to unpack into a function/generator/lambda rather than into a variable. (I don't know but my naive attempts and quick SO search were unfruitful.) Also, you're potentially giving up benefits of the CSV reader. Plus, as others have clarified, brevity does not equal readability or relative lack of bugs:

In the course example, it's reasonably easy to add some try blocks/error handling/default values while assigning records, giving you the chance to salvage valid rows without affecting speed or readability. In fact, error handling would be a necessity if that CSV file is externally accessible. Contrast that with my two lines, where there's not an elegant way to handle a bad row or escaped comma or missing file or virtually any other surprise.

Anything else I can think of off-hand (defaultdict, UserList, "if not portfolio:") has the same initialization step, endures some performance degradation, is more fragile, and/or is needlessly unreadable, like this lump of coal:

    portfolio = [record] if 'portfolio' not in globals() else portfolio + [record]
So... your technique and generators. Those are safe-ish, readable, relatively concise, etc.

CogitoCogito · 2 years ago
> It's not better than a generator, but I'm surprised nobody has mentioned the very terse and still mostly readable

> header, *records = [row.strip().split(',') for row in open(filename).readlines()]

Better would be:

    header, *records = [row.strip().split(',') for row in open(filename)]
No need to read the lines all into memory first.

Edit: Also if you want to be explicit with the file closing, you could do something like:

    with open(filename) as infile:
        header, *records = [row.strip().split(',') for row in infile]
That is if we wanted to protect against future changes to semantics for garbage collection/reference counting. I always do this, but I kind of doubt it will ever really matter in any code I write.

mekoka · 2 years ago

    def read_portfolio(filename):
        record = lambda r: {
            'name': r[0],
            'shares': int(r[1]),
            'price': float(r[2]),
        }
        with open(filename) as f:
            rows = csv.reader(f)
            headers = next(rows)
            return [record(r) for r in rows]

KMnO4 · 2 years ago
Swap the square brackets for parentheses in the return statement and it will return a generator instead of a list.

That will read the file as needed (ie as you iterate over it) instead of loading the entire thing in memory.

    for record in read_portfolio(fn):
        # do stuff
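
One caveat worth noting: if the parentheses swap happens inside the `with` block, the file is closed before the caller iterates, and iteration then raises `ValueError: I/O operation on closed file`. A sketch (the function name `read_portfolio_lazy` is made up) that keeps the lazy behavior while holding the file open:

```python
import csv

def read_portfolio_lazy(filename):
    # Because this is a generator function, the 'with' block stays live
    # across yields; the file remains open exactly as long as iteration
    # continues, unlike a bare genexp returned from inside the 'with'
    with open(filename) as f:
        rows = csv.reader(f)
        next(rows)  # skip the header row
        yield from (
            {'name': r[0], 'shares': int(r[1]), 'price': float(r[2])}
            for r in rows
        )
```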

kstrauser · 2 years ago
Or even:

  def read_portfolio(filename):
      with open(filename) as f:
          rows = csv.reader(f)
          headers = next(rows)
          return [
              {
                  "name": r[0],
                  "shares": int(r[1]),
                  "price": float(r[2]),
              }
              for r in rows
          ]

tomn · 2 years ago
I don't think you can really improve on this.

You could use a list comprehension, but that can be unclear and hard to extend, depending on the situation. It can be a nice option if most of the parts in the generator can be broken out into functions with their own name, though.

You could turn it into a generator, which can cause some fun bugs (e.g. everything works fine when you first iterate over it, but not afterwards), so IMO that's best used when it needs to be a generator, for semantics or performance.

You could turn it into a generator, then add a wrapper that turns it into a list (keeping the inner function private), or use a decorator that does the same, but it's less clear than this pattern.

So, i'd just learn to live with it.

CogitoCogito · 2 years ago
Yeah, I think using a list comprehension is overkill. The main reason I like list comprehensions is that I don't introduce variables (even temporarily) that I don't really need. I think that clarifies the code. But putting the code in a separate function also avoids introducing those variables into the current scope, at the cost of putting the code somewhere else (which I personally think has a cost). In this case I would just use a function or (probably) just inline it in the way you don't like.
thrdbndndn · 2 years ago
There is always list comprehension.

So `portfolio = [{'name': row[0], 'shares': int(row[1]), 'price': float(row[2])} for row in rows]`

But if it's more complicated than this (like if there are conditionals inside the loop), I'd recommend just sticking with the current approach. It's possible to have even multiple conditionals in a list comprehension, but it's not really very readable. If you do want to, the walrus operator can make things better

(something like `numbers = [m[1] for s in array if (m := re.search(r'^.*(\d+).*$', s))]`)

nurbl · 2 years ago
They can be more readable than that at least, e.g.:

    keys = "name", "shares", "price"
    portfolio = [
        dict(zip(keys, row))
        for row in rows
    ]
If I had to do more complex stuff than building a dict like this I'd move it into a function. That tends to make the purpose more clear anyway.

That said, it's fine to append to a list too, I just prefer comprehensions when they fit the job. In particular, if you're just going to iterate once over this list anyway, you can turn it into a generator expression by replacing [] with () and save some memory.

roywiggins · 2 years ago
You can alternately stick the logic into a function, which maintains the readability.

    def get_record(row):
        return {
                'name': row[0],
                'shares': int(row[1]),
                'price': float(row[2])
        }
    return [ get_record(r) for r in rows ]
or

    return list(map(get_record, rows))

twism · 2 years ago
Sigh (re: sibling comments) whatever happened to PEP 20, in particular:

> There should be one-- and preferably only one --obvious way to do it.

BerislavLopac · 2 years ago
This is absolutely the most misinterpreted line in the Zen of Python. The key word there is obvious, not one.
WesolyKubeczek · 2 years ago

    with open(filename) as f:
        rows = csv.reader(f)
        next(rows)
        return [
            {
                'name': row[0],
                'shares': int(row[1]),
                'price': float(row[2]),
            } for row in rows
        ]

ayhanfuat · 2 years ago
Why don't you like it? I am asking because you are asking for a better way. Better in what way?
senex · 2 years ago
You could “yield” the record instead of constructing the list. This makes “read_portfolio” a generator function, which returns an iterator instead of a list. Use a list comprehension or the list() constructor to convert the iterator to a list if needed.


rjh29 · 2 years ago
You could make read_portfolio a generator ( https://wiki.python.org/moin/Generators ). But that might confuse inexperienced Python programmers.

Personally, I think the way you've done it is the most Pythonic. List comprehensions are great but would be less readable in this case.

felixhummel · 2 years ago
If there is no library for your case like pandas (or even csv.DictReader), you could always use an iterator:

    def iter_portfolio(rows):
         for row in rows:
             yield {'name': row[0]}
    
    rows = ...
    portfolio = list(iter_portfolio(rows))

Sentack · 2 years ago
I too have run into this situation, and while list comprehension makes it possible, it's never clean looking.

Honestly, this is the approach I've been using even though I hate it. Especially if your code is going to be read by anyone other than you.


drcongo · 2 years ago
I actually kinda like that pattern in terms of readability, though I think a generator would outperform it.
travisjungroth · 2 years ago
If it’s a lot of data and part of a pipeline you’ll get memory saving. 1,000 lines and reading from a CSV it won’t matter really.
BeetleB · 2 years ago
You could try to squeeze it all into a list comprehension.
rami3l · 2 years ago
I have also encountered this quite often. I'd say the ideal solution would be "postfix streaming methods" like `.filter` and `.map`. Unfortunately, Python doesn't have those (prefix `filter` and `map` are not even close), and you have comprehension expressions at best. To make things worse, complex comprehensions can also create confusion, so for your particular example I'd probably say it's acceptable. It could be better if you used unpacking instead of indexing, though, as others have pointed out.
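
For instance, the indexing in the course example could be replaced by unpacking at the loop head (a sketch with made-up inline data):

```python
import csv
import io

# Hypothetical inline data in place of the course's portfolio file
data = io.StringIO("name,shares,price\nAA,100,32.2\nIBM,50,91.1\n")

rows = csv.reader(data)
next(rows)  # skip the header row

# Tuple unpacking names each column at the loop head,
# which reads better than row[0]/row[1]/row[2] indexing
portfolio = [
    {'name': name, 'shares': int(shares), 'price': float(price)}
    for name, shares, price in rows
]
```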
charlysl · 2 years ago
Another good (and entertaining) resource is James Powell's talk "So you want to be a Python expert" [1], the best explanation I've seen of decorators, generators and context managers. Good intro to the Python data (object) model too.

[1] https://youtu.be/cKPlPJyQrt4

agumonkey · 2 years ago
Let's assemble an 'expert python' curated list.
xwowsersx · 2 years ago
Yep, just recently rewatched this. James is an extremely clear presenter. I recommend this to everyone trying to get to that next level with Python.
greatpostman · 2 years ago
David Beazley, also known as the Jimi Hendrix of Python
bshipp · 2 years ago
I saw his talk live in 2014 and the dude is amazing. I loved his account of building Python libraries from the ground up during legal discovery, after he discovered a hidden Python installation on the terminal his opponents gave him, which allowed him to parse thousands of documents very quickly.

https://youtube.com/watch?v=RZ4Sn-Y7AP8

mark_l_watson · 2 years ago
This is very cool, good for Beazley for making this freely available. I really should take the time to work through this material. For 40 years I have been a “Lisp guy”, slightly looking down on other languages I sometimes used at work like C++, Java, etc.

However, because of available ML/DL/LLM frameworks and libraries in Python, Python has been my go-to language for years now. BTW, I love the other comment here that Beazley is the Jimi Hendrix of Python. Only those of us who enjoyed hearing Hendrix live can really get this.

heap_perms · 2 years ago
You've listened to Hendrix live? Well now I'm envious ;)
mark_l_watson · 2 years ago
Well, it was just the one time.