As noted in the PEP, data classes are a less fully-featured stdlib implementation of what attrs already provides. Unless you’re constrained to the stdlib (as those who write CPython itself of course are), you should consider taking a look at attrs first.
This is spot on. The design of attrs reminds me a little of the syntax of a declarative ORM, for example. I'm sure it can do very powerful things that I've not had occasion to use, but it is heavy. The @dataclass format is very clean and seems more like the syntactic sugar that I expect from Python.
One of the prime uses of a dataclass is to be a mutable namedtuple. And the syntax can be almost identical:
Part = make_dataclass('Part', ['part_num', 'description', 'quantity'])
(from Raymond Hettinger's twitter)
This has the added benefit of not requiring type hinting, if you don't want to bother with such things.
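A runnable sketch of that one-liner in use (the part values here are made up for illustration):

```python
from dataclasses import make_dataclass

# Create a record type without writing a class body or type hints
Part = make_dataclass('Part', ['part_num', 'description', 'quantity'])

p = Part('BR-549', 'sprocket', 12)
p.quantity = 11  # unlike a namedtuple, fields are mutable
```

Fields declared this way default to `typing.Any`, so no annotations are needed at the call site.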
attrs also has a feature that dataclasses don't currently [0]: an easy way to use __slots__ [1].
It cuts down on the per-instance memory overhead, for cases where you're creating a ton of these objects. It can be useful even when not memory-constrained, because it will throw AttributeError, rather than succeeding silently, if you make a typo when assigning to an object attribute.
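What __slots__ buys you, shown here with plain Python rather than attrs (attrs' slots=True generates the equivalent of this; the class and field names are illustrative):

```python
class Point:
    # No per-instance __dict__; only these attributes can ever exist
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1, 2)
p.x = 5          # assigning to a declared slot still works
try:
    p.z = 3      # typo / undeclared attribute: AttributeError, not silence
except AttributeError:
    print('no attribute z')
```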
Raymond Hettinger had a pretty good presentation on Data Classes and how they relate to things like named tuples and a few recipes/patterns. It was linked on Reddit[0] but it looks like the video has been removed from YouTube. His slides are online[1], though.
I love using attrs, like the idea of bringing something similar to the standard library, but strongly disagree with the dataclasses API. It treats untyped Python as a second class citizen.
This is what I'd prefer
from dataclasses import dataclass, field

@dataclass
class MyClass:
    x = field()
but it produces an error because fields need to be declared with a type annotation. This is the GvR recommended way to get around it:
@dataclass
class MyClass:
    x: object
You could use the typing.Any type instead of object, but then you need to import a whole typing library to use untyped dataclasses. I highly prefer the former code block.
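For reference, the typing.Any spelling mentioned above:

```python
from dataclasses import dataclass
from typing import Any

@dataclass
class MyClass:
    x: Any  # an annotation is required, but Any imposes no actual type
```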
Yeah, it seems strange to force people to use type hints when it has had such a mixed reception. I really tried to use type hints with a new project a few months ago, but ended up stripping it all out again because it's just so damn ugly. I wish it were possible to fully define type hints in a separate file for linters, and not mix it in with production code. It's kind of possible to do it, but not fully [1], and mixing type hints inline and in separate files is in my opinion even worse than one or the other.
It's great that we have simple/clean declarations for NamedTuples and (Data)classes now. But I wonder why they chose two different styles for creating them. This for NamedTuples:
from typing import NamedTuple

class Foo(NamedTuple):
    bar: str
    baz: int
The short answer is that the only way to do what dataclasses do as a base class is via python metaclasses, and you can only have one metaclass. So this way, you can dataclassify something that inherits from a metaclass.
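A sketch of that point: because @dataclass is just a decorator, it composes with whatever metaclass the class already uses, whereas a base-class approach would need its own metaclass and collide with it. (The class names here are illustrative.)

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

# ABC brings in the ABCMeta metaclass; @dataclass still works because it
# never needs to control class creation itself.
@dataclass
class Shape(ABC):
    name: str

    @abstractmethod
    def area(self):
        ...

@dataclass
class Square(Shape):
    side: float = 0.0

    def area(self):
        return self.side * self.side
```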
I'm happy to see data classes. I think something like this exists in 3.6:
class Person(typing.NamedTuple):
    name: str
    age: int
But I don't think it supports __post_init__; however, constructors have no business doing parsing like this anyway, so unless I'm missing something, deriving from `typing.NamedTuple` seems strictly better than `@dataclass`, insofar as it seems less likely to be abused.
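For comparison, this is the __post_init__ hook that @dataclass provides and typing.NamedTuple lacks (the validation logic here is made up):

```python
from dataclasses import dataclass

@dataclass
class Person:
    name: str
    age: int

    def __post_init__(self):
        # Runs after the generated __init__; a common (if debatable)
        # use is validating or normalizing fields.
        if self.age < 0:
            raise ValueError('age must be non-negative')
```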
Coming from C++ it feels really weird that you can simply assign instance.new_name = value from anywhere without properly declaring it beforehand. You also never really know what you get or if somebody modified your instance members from the outside.
I can only imagine how weird it must seem that you can override methods of instance objects and even classes, or even replace a whole class of an instance with another.
>>> class Foo:
... def bar(self):
... print('foo')
...
>>> class Baz:
... def bar(self):
... print('baz')
...
>>> f = Foo()
>>> f
<__main__.Foo object at 0x7fa311e7a278>
>>> f.bar()
foo
>>> f.__class__ = Baz
>>> f
<__main__.Baz object at 0x7fa311e7a278>
>>> f.bar()
baz
Does that work even if the types had fields? What about if the fields had a different total size? What if Baz had no parameterless constructor (i.e. only had a constructor that guaranteed arg > 0, for example)?
Is this like an unsafe pointer cast where “you are responsible, and it will likely blow up spectacularly if you don’t know what you are doing” or is it something safer that will magically work e.g with types of different size?
JS & PHP let you do this as well. One advantage is that you don't have to adhere to a rigid class structure and be forced to refactor or create a new class every time you need to add a new property or method. And sometimes you want a property/method for just that particular instance, and not all members.
> One advantage is that you don't have to adhere to a rigid class structure and be forced to refactor or create a new class every time you need to add a new property or method.
I wouldn't qualify this as an advantage; it encourages bad code and it precludes a lot of good tooling (including tooling which would automate the sort of refactoring you'd like to avoid).
In practice this doesn't happen maliciously, and it can be very handy when you need to attach a little extra data for the ride. If you need extra assurance, there are techniques to make the instance "very" read-only.
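One such technique, from the PEP under discussion: frozen=True makes any assignment raise (the class name and fields here are illustrative):

```python
from dataclasses import dataclass, FrozenInstanceError

@dataclass(frozen=True)
class Config:
    host: str
    port: int

c = Config('localhost', 8080)
try:
    c.port = 9090  # frozen dataclass: the generated __setattr__ raises
except FrozenInstanceError:
    print('instances are read-only')
```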
If you run a linter, the cases where you are doing this outside of __init__ will usually be pointed out. You can silence the warning/error on a case by case basis if you really need to do it.
It’s not too bad I think, it’s just an evolution really. You can probably grok the basics of the type annotations in a short sit down. I can’t even remember when decorators were introduced but that even more greatly changed how python was written. I’ve been using python since 1.6 and I always thought the amount of repetition was ridiculous. I bet I’m not the only one that has written a “dsl” of what attrs and this pep does 1000 times using the facilities python had at the time: metaclasses, then decorators. Of course all these implementations were rushed, half assed and barely production quality. Despite any warts attrs is a pleasure to use. Type annotations boost IntelliJ/pycharm already quite clever assistance. One lingering thing is attrs named_attrs that while syntactically the best approach in my mind doesn’t work well with IntelliJ. So hopefully this will address it.
It's relatively recent. IMHO Python 3.5 to 3.7 feel like the language is going in a different direction than it did before -- type hints and the handling of asynchrony in particular.
After seeing the huge improvements that JavaScript has gone through over the years I'm all for language updates. Same with Java and C++ (although not as much for Java and I don't know C++ but I always hear C++11 is "new").
Python has grown a lot since then. Back then it was this "better scripting language" that every Linux user kinda knew. Now it's being used much more widely and that just wouldn't cut it any more.
http://www.attrs.org/en/stable/
0: https://www.python.org/dev/peps/pep-0557/#support-for-automa...
1: http://www.attrs.org/en/stable/examples.html#slots
One thing the stdlib implementation has going for it: better naming. attr.ib() is not exactly crystal-clear.
[0] https://www.reddit.com/r/Python/comments/7tnbny/raymond_hett...
[1] https://twitter.com/i/web/status/959358630377091072
There's a big thread discussing the issue on python-dev somewhere. Also some discussion in https://github.com/ericvsmith/dataclasses/issues/2#issuecomm...
Anyway, it's not a huge issue—attrs is great and there's no reason not to use it instead for untyped Python.
[1] https://stackoverflow.com/questions/47350570/is-it-possible-...
As with most things, there are trade-offs.
C++11, though, is quite a bit different. Probably the biggest change is that raw pointers should, for the most part, not be used anymore.