Readit News
whimsicalism commented on XSLT removal will break multiple government and regulatory sites   github.com/whatwg/html/is... · Posted by u/colejohnson66
whimsicalism · 2 days ago
i’m strongly in favor of simplifying the standard
whimsicalism commented on CEO pay and stock buybacks have soared at the largest low-wage corporations   ips-dc.org/report-executi... · Posted by u/hhs
tw04 · 2 days ago
What’s the motivation of the CEO to increase employee wages if his compensation isn’t tied to theirs?

There’s this perverse belief that companies should exist to enrich the wealthy shareholders at the expense of the workers and it’s put us dangerously close to a complete collapse of the social contract.

whimsicalism · 2 days ago
> What’s the motivation of the CEO to increase employee wages if his compensation isn’t tied to theirs?

To pay the asking price for the needed labor?

whimsicalism commented on AI tooling must be disclosed for contributions   github.com/ghostty-org/gh... · Posted by u/freetonik
hodgehog11 · 3 days ago
How does this not lead to a situation where no honest person can use any AI in their submissions? Surely pull requests that acknowledge AI tooling will be given significantly less attention, on the grounds that no one wants to read work that they know is written by AI.
whimsicalism · 2 days ago
i'm happy to read work written by AI and it is often better than a non-assisted PR
whimsicalism commented on "Remove mentions of XSLT from the html spec"   github.com/whatwg/html/pu... · Posted by u/troupo
dev0001 · 4 days ago
The vast majority of feedback on the GitHub issue was respectful — unless you consider opposing the proposal disrespectful.
whimsicalism · 4 days ago
There aren't nearly enough comments for “vast majority” to be a useful descriptor, and I saw a significant number of uncivil, rude comments.
whimsicalism commented on "Remove mentions of XSLT from the html spec"   github.com/whatwg/html/pu... · Posted by u/troupo
Y_Y · 5 days ago
> @whatwg whatwg locked as too heated and limited conversation to collaborators

Too heated? Looked pretty civil and reasonable to me. Would it be ridiculous to suggest that the tolerance for heat might depend on how commenters are aligned with respect to a particular vendor?

whimsicalism · 4 days ago
I disagree. I saw a number of comments I would consider rude and unprofessional, and once a PR gets posted on HN, frankly it typically gets much worse.

I find people on HN are often very motivated reasoners when it comes to judging civility, but there’s basically no excuse for calling people “fuckers” or whatever.

whimsicalism commented on Who Invented Backpropagation?   people.idsia.ch/~juergen/... · Posted by u/nothrowaways
DoctorOetker · 5 days ago
Not at all.

There are mainly 2 forms of AD: forward mode (optimal when the function being differentiated has more outputs than latent parameter inputs) and reverse mode (when it has more latent parameter inputs than outputs). If you don't understand why, you don't understand AD.

If you understand AD, you'd know why, but then you'd also see a huge difference from symbolic differentiation. In symbolic differentiation, the input is an expression or DAG, and the intermediate variables computed along the way are themselves symbolic expressions (typically worked out in reverse order, as in high school or university, so the expression grows exponentially with each deeper nested function; only at the end are the input coordinates substituted into the final expression to obtain the gradient). In both forward and reverse mode, the variables being calculated are numeric values, not symbolic expressions.
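To make the forward/reverse trade-off concrete, here is a minimal sketch using JAX purely as an illustration (the toy function f and the input size are my own choices, not something from the discussion): jacfwd builds the result one input component at a time, jacrev one output at a time, so a scalar loss over many parameters wants reverse mode.

    import jax
    import jax.numpy as jnp

    def f(x):
        # one scalar output from many inputs: the case reverse mode is built for
        return jnp.sum(jnp.sin(x) * x**2)

    x = jnp.arange(1.0, 6.0)            # 5 "latent parameter" inputs, 1 output

    g_fwd = jax.jacfwd(f)(x)            # forward mode: one JVP per input component
    g_rev = jax.jacrev(f)(x)            # reverse mode: one VJP for the single output
    print(jnp.allclose(g_fwd, g_rev))   # same numbers; only the pass structure differs

Note that the intermediate quantities carried in either mode are floats, not formulas, which is the point above.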

The third "option" is numeric differentiation, but for N latent parameter inputs this requires (N+1) forward evaluations: N of the function f(x1,x2,..., xi + delta, ..., xN) and 1 reference evaluation at f(x1, ..., xN). Picking a smaller delta makes it closer to a real gradient assuming infinite precision, but in practice there will be irregular rounding near the pseudo "infinitesimal" values of real world floats; alternatively take delta big enough, but then its no longer the theoretical gradient.

So symbolic differentiation was destined to fail due to ever increasing symbolic expression length (the chain rule).

Numeric differentiation was destined to fail due to imprecise gradient computation and huge amounts (N+1, many billions for current models) of forward passes to get a single (!) gradient.

AD gives the theoretically correct result with a single forward and backward pass, as opposed to the N+1 passes of numeric differentiation (billions for current models), and without the storage needed for ever-growing strings of formulas.
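As a hedged illustration of that last claim (JAX again standing in for any AD system, with the same toy f as before): one grad call, internally one forward plus one backward pass, yields the whole gradient, while a single forward-difference probe on one coordinate shows the approximation error AD avoids.

    import jax
    import jax.numpy as jnp

    def f(x):
        return jnp.sum(jnp.sin(x) * x**2)     # same toy function as above

    x = jnp.arange(1.0, 6.0)
    g_ad = jax.grad(f)(x)                     # full gradient: one forward + one backward pass

    delta = 1e-4                              # one forward-difference probe for comparison
    fd_0 = (f(x.at[0].add(delta)) - f(x)) / delta
    print(g_ad[0], fd_0)                      # AD is exact to float precision; FD is approximate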

whimsicalism · 5 days ago
I simply do not agree that you are making a real distinction and I think comments like "If you don't understand why, you don't understand AD" are rude.

AD is just a simple application of the pushforwards/pullbacks from differential geometry, which are themselves just the chain rule. It is important to distinguish between a mathematical concept and a particular algorithm/computation for implementing it. The symbolic manipulation with an 'exponentially growing nested function' is a particular way of applying the chain rule, but it is not the only way.

The problem you describe with symbolic differentiation (exponential growth of expressions) is not inherent to symbolic differentiation itself, but to a particular naïve implementation. If you represent computations as DAGs and apply common subexpression elimination, the blow-up you mention can be avoided. In fact, forward- and reverse-mode AD can be viewed as particular algorithmic choices for evaluating the same derivative information that symbolic differentiation encodes. If you represent your function as a DAG and propagate pushforwards/pullbacks, you’ve already avoided the expression swell (a small sketch follows below the link).

https://emilien.ca/Notes/Notes/notes/1904.02990v4.pdf
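As a small sketch of that point (sympy here is my choice of tool and the function is made up; neither is taken from the linked paper): differentiating a nested expression naively repeats the same subexpressions in every partial derivative, and common subexpression elimination recovers the shared DAG structure that reverse-mode AD keeps implicitly.

    import sympy as sp

    x = sp.symbols('x0:5')
    u = sp.sin(sum(xi**2 for xi in x))        # shared inner subexpression
    f = u * sp.exp(u)

    grads = [sp.diff(f, xi) for xi in x]      # tree form: sin, cos, exp terms repeated in every partial
    shared, reduced = sp.cse(grads)           # CSE rebuilds the DAG; shared terms appear once
    print(len(shared), [sp.count_ops(g) for g in grads], [sp.count_ops(r) for r in reduced])

Evaluating the shared assignments in order and then the reduced expressions is, in effect, the straight-line program a reverse-mode tape would execute.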

whimsicalism commented on Who Invented Backpropagation?   people.idsia.ch/~juergen/... · Posted by u/nothrowaways
dicroce · 6 days ago
Isn't it just kinda a natural thing once you have the chain rule?
whimsicalism · 5 days ago
yes
whimsicalism commented on Who Invented Backpropagation?   people.idsia.ch/~juergen/... · Posted by u/nothrowaways
whimsicalism · 5 days ago
Despite the common refrain about how different symbolic differentiation and AD are, they are actually the same thing.
whimsicalism commented on Anna's Archive: An Update from the Team   annas-archive.org/blog/an... · Posted by u/jerheinze
baq · 5 days ago
> ability to fund shadow libraries without fear of censorship

Bitcoin is much worse than cash in that regard

whimsicalism · 5 days ago
sure except for all the reasons cash doesn’t work for this
whimsicalism commented on OpenAI Progress   progress.openai.com... · Posted by u/vinhnx
ACCount37 · 7 days ago
GPT-2 was the first wake-up call - one that a lot of people slept through.

Even within ML circles, there was a lot of skepticism or dismissive attitudes about GPT-2 - despite it being quite good at NLP/NLU.

I applaud those who had the foresight to call it out as a breakthrough back in 2019.

whimsicalism · 7 days ago
i think it was already pretty clear among practitioners by 2018 at the latest

u/whimsicalism

Karma: 15917 · Cake day: October 2, 2020
About
consequentialist | i [care about improving people's lives | research inference on sequences for work] | opinions absolutely my own