smokel · 2 years ago
I really like the sneers about some terrible naming, e.g. in the slide on "The self-attention operation":

> “keys”, “queries”, “values”, in one of the least-meaningful semantic designations we have in deep learning

And in the context of LSTMs:

> throwing in some other names, like “forget gate”, “input gate”, “output gate” for good measure

This makes me feel more confident about actually understanding these topics. Before, I was totally misled by the awkward terminology.
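
For reference, here is a minimal NumPy sketch of what those three names refer to, assuming the standard single-head scaled dot-product formulation (the variable names are mine, not from the course):

    import numpy as np

    def self_attention(X, Wq, Wk, Wv):
        # X: (seq_len, d_model) input vectors; Wq/Wk/Wv: (d_model, d_k) projections.
        # "queries", "keys", "values" are just three learned linear views of X.
        Q = X @ Wq                                    # what each position is looking for
        K = X @ Wk                                    # what each position can be matched on
        V = X @ Wv                                    # what each position contributes
        d_k = Q.shape[-1]
        scores = Q @ K.T / np.sqrt(d_k)               # pairwise query/key compatibility
        scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
        weights = np.exp(scores)
        weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
        return weights @ V                            # attention-weighted sum of values

All three are just learned linear projections of the same input; the names only describe the roles they play in the dot-product lookup.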

sva_ · 2 years ago
It gets even worse when you see how the ML community likes to bastardize terms from neuroscience.

But I think it is good to have memorable names that one can use to talk about the concepts verbally.

jeron · 2 years ago
simply operator overloading
visarga · 2 years ago
> “forget gate”, “input gate”, “output gate”

These are legit:

cell_t = cell_(t-1) * forget_gate + tanh(linear(input_t)) * input_gate

out_t = cell_t * output_gate

see? forget_gate masks the previous cell state by multiplying it with numbers in [0, 1], input_gate controls how much of the external input gets in, and output_gate controls the output, of course
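
Roughly, in code (a NumPy sketch of one step, with my own variable names and shapes; the gates and candidate here also see the previous hidden state, as in the usual formulation):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def lstm_step(x_t, h_prev, c_prev, W, U, b):
        # W: (4*hidden, input_dim), U: (4*hidden, hidden), b: (4*hidden,)
        # One linear map produces the pre-activations for all three gates
        # plus the candidate cell update ("g").
        z = W @ x_t + U @ h_prev + b
        f, i, o, g = np.split(z, 4)
        forget_gate = sigmoid(f)   # in [0, 1]: how much of the old cell state survives
        input_gate  = sigmoid(i)   # in [0, 1]: how much of the new candidate gets in
        output_gate = sigmoid(o)   # in [0, 1]: how much of the cell state is exposed
        c_t = c_prev * forget_gate + np.tanh(g) * input_gate
        h_t = c_t * output_gate    # as in the equations above; most formulations use np.tanh(c_t) here
        return h_t, c_t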

p1esk · 2 years ago
The names made sense to me once I understood what they represent. What else would you call them?
smokel · 2 years ago
I understand the terminology in the context of the original papers, but to me the metaphors don't seem to generalize well, or at least not in the suggested direction.

This is probably just an unfortunate situation, due to progressive understanding. Pointing this out in the slides gave me a sense of relief.

Personally, I wouldn't put names to every minor part of an algorithm or formula that was discovered to work empirically. But then again, I haven't discovered anything, and the authors of the respective papers certainly deserve some credit for their inventions!

0cf8612b2e1e · 2 years ago
I also enjoy how ML likes to twist statistical nomenclature enough to be irritating.
chefandy · 2 years ago
I find open educational resources just so dang heartwarming.
junrushao1994 · 2 years ago
This is a particularly unique course, offering an introduction to ML compilation and deployment :)
__rito__ · 2 years ago
I really liked the instructor's (Kolter's) style, and the reason I like this course so much is that each lecture is followed by an implementation video along with the notebook file.

In most Deep Learning courses, the implementation is left to TAs and neither recorded nor made available. This course is an exception. Another bright exception is the NYU Deep Learning course [0] by Yann LeCun and Alfredo Canziani. In that course, too, all recitations ("Practica") are recorded and made available. And Canziani is a great teacher.

[0]: https://atcold.github.io/pytorch-Deep-Learning

lyapunova · 2 years ago
I also really like the instructor for this course!

Seems like he really cares. I looked him up and I guess he was a student of Andrew Ng (the legendary ML lecturer!!) so it makes sense.

Deleted Comment

borg16 · 2 years ago
Thanks, this is a wonderful recommendation
a-dub · 2 years ago
Very nice! I'm also a big fan of the VU Amsterdam deep learning lectures on YouTube. Less systems focus, but a really good intro to modern neural-network-based ML.
osti · 2 years ago
Are they going to offer this course again this fall? I think you have to sign up in order to submit assignments, so I'd like it if they offered it again soon.
gdiamos · 2 years ago
Excited to see the MLSys field growing.

Deep learning methods are so computationally intensive that many advances have come through new algorithms and optimization methods.

abalaji · 2 years ago
Took this class the first time it was offered, when I was at CMU. It's a really great course and well organized!
quickthrower2 · 2 years ago
This looks good as it covers hardware acceleration, which is a gap in my knowledge I would like to start filling.