So out of interest: during inference, the embedding layer is simply a lookup table mapping token ID -> embedding vector. Mathematically, you could represent this as encoding the token ID as a (very long) one-hot vector, then passing that through a linear layer to get the embedding vector. The linear layer's weight matrix would contain exactly the information in the lookup table.

My question: is this also how the embeddings are trained? I.e. is the table just treated as a linear layer's weights and included in the normal backpropagation of the model?
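(To make the question concrete, here's a minimal PyTorch sketch of what I mean; the sizes are arbitrary:)

```
import torch
import torch.nn.functional as F

vocab_size, d_model = 10, 4
emb = torch.nn.Embedding(vocab_size, d_model)
token_ids = torch.tensor([3, 7])

# Lookup-table view: index straight into the weight matrix.
looked_up = emb(token_ids)

# One-hot view: encode the IDs, then apply a linear map with the same weights.
one_hot = F.one_hot(token_ids, num_classes=vocab_size).float()
via_linear = one_hot @ emb.weight

assert torch.allclose(looked_up, via_linear)

# Backprop goes through the lookup too: only the rows that were
# actually indexed receive a nonzero gradient.
looked_up.sum().backward()
print(emb.weight.grad.abs().sum(dim=1))  # nonzero only at rows 3 and 7
```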
(If you're curious about the details, there's an example of making indexing differentiable in my minimal deep learning library here: https://github.com/sradc/SmallPebble/blob/2cd915c4ba72bf2d92...)
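(As a toy illustration of the idea in plain numpy — a stripped-down sketch, not the library's actual code:)

```
import numpy as np

def embedding_forward(weights, token_ids):
    # Forward pass is plain indexing.
    out = weights[token_ids]
    def backward(grad_out):
        # Scatter the incoming gradient back onto the rows that were used;
        # np.add.at accumulates correctly when a token ID repeats.
        grad_weights = np.zeros_like(weights)
        np.add.at(grad_weights, token_ids, grad_out)
        return grad_weights
    return out, backward
```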
Yes, I agree with the author: list comprehensions are readable, and I'd add, practical.
> it gets worse as the complexity of the logic increases
Ok, well this is something that someone would be unlikely to write... unless they wanted to make a contrived example to prove a point. It would be written more like:
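(sketching it with made-up names, since the exact logic doesn't matter — name the steps as simple comprehensions instead of nesting everything into one:)

```
# keep/transform are hypothetical stand-ins for the actual logic:
kept = [x for group in data for x in group if keep(x)]
result = [transform(x) for x in kept]
```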
See also the Google Python style guide, which says not to do the kind of thing in the contrived example above: https://google.github.io/styleguide/pyguide.html

(Surely in any language it's possible to write very bad, confusing code using some feature of the language...)
And note:
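(roughly, with `lines` standing in for whatever iterable of strings is being processed:)

```
result = []
for line in lines:
    result.append(line.split())

# vs.

result = [line.split() for line in lines]
```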
^- a list comprehension is just convenient shorthand for a `for` loop, i.e. just moving the `line.split()` to the front and removing the empty-list creation and the `append`.