Readit News
ASpring commented on AccountingBench: Evaluating LLMs on real long-horizon business tasks   accounting.penrose.com/... · Posted by u/rickcarlino
Havoc · 2 months ago
Remember that test where you ask a LLM whether 9.11 or 9.9 is the bigger number? [Just checked gpt-4o still gets it wrong]

I don't think you'll find many sane CFOs willing to send the resulting numbers to the IRS based on that. That's just asking to get nailed for tax fraud.

It is coming for the very bottom end of bookkeeping work quite soon though, especially for first drafts. There are a lot of people doing stuff like expense classification. And if you give an LLM an invoice it can likely figure out whether it's stationery or rent with high accuracy. OCR and text classification are easier for LLMs than numbers. Things like Concur can basically do this already.

ASpring · 2 months ago
> Remember that test where you ask a LLM whether 9.11 or 9.9 is the bigger number? [Just checked gpt-4o still gets it wrong]

Interesting, 4o got this right for me in a couple different framings including the simple "Which number is larger, 9.9 or 9.11?". To be a full apologist, there are a few different places (a lot of software versioning as one) where 9.11 is essentially the bigger number so it may be an ambiguous question without context anyway.
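To make the ambiguity concrete, here is a minimal Python sketch of the two readings. The helper names `larger_as_decimal` and `larger_as_version` are made up for illustration, and the version comparison assumes plain dot-separated integers rather than any particular versioning scheme.

```python
from decimal import Decimal


def larger_as_decimal(a, b):
    """Return whichever string is larger when both are read as decimal numbers."""
    return a if Decimal(a) > Decimal(b) else b


def larger_as_version(a, b):
    """Return whichever string is larger when read as dot-separated version parts."""
    def parts(s):
        # "9.11" -> (9, 11), so components compare as whole integers
        return tuple(int(p) for p in s.split("."))
    return a if parts(a) > parts(b) else b


print(larger_as_decimal("9.9", "9.11"))  # 9.9
print(larger_as_version("9.9", "9.11"))  # 9.11
```

Same two strings, opposite answers, which is why the question is only well-posed once the context (number or version) is stated.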

ASpring commented on Reversal of autism symptoms among dizygotic twins: A case report   telegraph.co.uk/news/2024... · Posted by u/doener
ASpring · a year ago
Aside from the methodological concerns (very valid) this is also published in a set of journals of controversial repute: https://en.wikipedia.org/wiki/MDPI
ASpring commented on Relating t-statistics and the relative width of confidence intervals   statmodeling.stat.columbi... · Posted by u/luu
mjburgess · 2 years ago
One important caveat to all these methods is that the central limit theorem must hold for the sample means and this is an empirical condition, not something you can know statistically.

Another important caveat: many things we want to measure are not well-distributed to allow the CLT to hold. If it doesn't, the bulk of statistical methods don't work and the results are bunk.

Many quantities follow power-law distributions which would require trillions+ data points for the CLT to do its magic, i.e., for the sample means of set A to be statistically-significantly different from set B would require 10^BIG data points if the property measured in A/B is power-law distributed.

Now, even worse: many areas of "science" study phenomena which are almost certainly power-law distributed, and use these methods to do so.
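A rough way to see the power-law caveat is to simulate it. The sketch below is an illustration, not anything from the linked post: it assumes a Pareto distribution with minimum value 1, and the function names, sample sizes, and seed are arbitrary choices. The classical CLT needs a finite variance, which a Pareto variable only has when its tail index is above 2.

```python
import math
import random
import statistics


def pareto_variance(alpha):
    """Theoretical variance of a Pareto variable with minimum 1 and tail index alpha."""
    if alpha <= 2:
        return math.inf  # second moment is not finite, so the classical CLT does not apply
    return alpha / ((alpha - 1) ** 2 * (alpha - 2))


def sample_means(alpha, n, trials, seed=0):
    """Draw `trials` samples of size `n` and return the mean of each one."""
    rng = random.Random(seed)  # fixed seed so reruns give the same draws
    return [
        statistics.fmean(rng.paretovariate(alpha) for _ in range(n))
        for _ in range(trials)
    ]


# Compare a heavy tail (alpha = 1.5) with a much lighter one (alpha = 5).
for alpha in (1.5, 5.0):
    means = sample_means(alpha, n=1000, trials=200)
    print(alpha, pareto_variance(alpha), min(means), max(means))
```

With the heavy tail, the gap between the smallest and largest sample mean will typically be far wider than with the light tail, because a handful of extreme draws dominate individual samples.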

ASpring · 2 years ago
I'm not sure I'm fully understanding your point. Is it that constructing confidence intervals using t-statistics is inappropriate for a lot of real data that isn't distributed somewhat normally?
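For context, one way to read the relationship in the thread title follows from two textbook definitions. The symbols are notation chosen for this sketch, not taken from the linked post: \hat\theta is the estimate, SE its standard error, and t^* the critical value for the chosen confidence level.

```latex
\[
t = \frac{\hat\theta}{\mathrm{SE}}, \qquad
\mathrm{CI} = \hat\theta \pm t^{*}\,\mathrm{SE}
\quad\Longrightarrow\quad
\frac{\text{half-width}}{|\hat\theta|}
  = \frac{t^{*}\,\mathrm{SE}}{|\hat\theta|}
  = \frac{t^{*}}{|t|}
\]
```

So with t^* close to 2, an estimate with |t| = 2 has an interval whose half-width is about as large as the estimate itself. Both quantities lean on the sampling distribution of the estimate being roughly normal, which is the assumption being questioned in the parent comment.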
ASpring commented on Bayesian Structural Equation Modeling using blavaan (2022)   mc-stan.org/users/documen... · Posted by u/gravitate
3abiton · 2 years ago
I feel the technical barrier to adoption of Bayesian methods is still high enough to deter potential users.
ASpring · 2 years ago
It is still the Linux of statistical software. Rstanarm is getting quite good though for R users.
ASpring commented on A11y Is Not Accessible   mobilea11y.com/blog/inacc... · Posted by u/PennRobotics
Garvi · 2 years ago
> The common abbreviation of accessibility is a11y.

What?

> We take the A and Y from the beginning and end of accessibility, and 11 for the number of letters in between. This abbreviation also creates a pleasing homophone for ‘ally.’

It took me way too long to figure out what this story is about and that it's not about feminism "allies", as men who align with feminist views are often called. It's all very confusing and the overtures on how clear it is feel out of touch with reality.

Don't get me wrong, I'm in full support of this. Of course I am, as it could happen to anyone, through no fault of their own, to find themselves in a situation needing help.

A11y, really? This reeks of marketing. But we're talking about it, so it's effective.

ASpring · 2 years ago
It's pretty common in large tech companies to see this but agreed it is not a super intuitive abbreviation.
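The abbreviation rule quoted above (first letter, count of the letters in between, last letter) can be sketched in a few lines of Python. The function name `numeronym` and the short-word cutoff are assumptions made for this example.

```python
def numeronym(word):
    """Abbreviate a word as first letter + number of letters in between + last letter."""
    if len(word) < 4:
        return word  # assumption: leave very short words alone
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("accessibility"))         # a11y
print(numeronym("internationalization"))  # i18n
```

The same pattern gives the other abbreviations that are common in large tech companies, such as i18n.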
ASpring commented on Stanford Graduate Students Won Their Union Vote   twitter.com/StanfordGWU/s... · Posted by u/xavierstein
hedora · 2 years ago
When was this? Just before the pandemic, the grad students protested at UAW meetings and went on a wildcat strike after UAW ratified a contract the campus voted against.

Later, Janet Napolitano released police drones and set up barricades to try to shut down the picket lines. Eventually covid ended the drama, but only after some students were deported (I assume. The plan was to deport them, but the story stopped making news once the 2020 lockdowns hit.)

Anyway, the UAW was a similar disaster at UC Berkeley a while back. There weren’t widespread protests, but there were salary caps for grad students, and the union eliminated health care coverage for a number of female problems (over student objections).

ASpring · 2 years ago
The wildcat strike is exactly what I was referring to as "We at UCSC didn't always agree with the course of the larger UAW 2865..."

The wildcat strike was led by the local union leadership after they abdicated their official positions iirc. Having that previous level of organization and identified leadership certainly made organizing wildcat actions easier.

Unions are more than just the highest level of leadership.

ASpring commented on Stanford Graduate Students Won Their Union Vote   twitter.com/StanfordGWU/s... · Posted by u/xavierstein
djcapelis · 2 years ago
> The reason I had healthcare during grad school was because the union won it right before I joined.

I’m sorry if this is weird, but as someone who also went to UCSC for grad school I found this a bit confusing. So I looked it up and you started at UCSC in 2014, yeah?

UCSC grad students had GSHIP coverage for years before that time. I myself was on it when I joined starting in 2009, and there’s plenty of documentation of fights folks had over trying to get better rates and coverage on GSHIP well before both our times: https://www.indybay.org/newsitems/2007/05/13/18415831.php (Which personally I thought was pretty good especially after the expansion of airlift coverage which was an unfortunately common problem for UCSC’s location “over the hill” from many tier 1 emergency rooms.)

Maybe I missed something when I was there 2009-2015. But what did the union representation and bargaining bring to the table there?

From a couple years ago, it doesn’t seem to have resulted in anything close to a reasonable or even livable stipend for a researcher. It was bad when I was in grad school, but I was pretty appalled to hear during the wildcat strikes ten years later that despite the increase in costs there didn’t seem to be that much change in the stipend amounts for graduate researchers. The students who were wildcatting out of frustration seemed to have a pretty good reason IMO.

I think that meets a pretty similar pattern of unions focusing on fighting about healthcare while leaving wages to stagnate over years of price increases, which I guess also applies in many unrepresented UC roles and in dynamics elsewhere. I personally didn’t see much difference between UAW’s representation and not when I was there, but I guess I didn’t have a huge point of comparison.

I hope whatever this new swell of support is provides livable stipends for young researchers though. So I hope I’m either wrong or grad student unions are able to win more in bargaining in the future. :)

ASpring · 2 years ago
Nice to see a fellow slug! I think you are correct on the timeline being further back. The narrative I recalled was that there was a major victory around health care fee remission before I joined but it looks as if that was part of the original contract the union negotiated [1].

I spent my final years at UCSC working through the systems they had set up internally (administration meetings with GSA, getting on committees of administrators as a grad student voice, working with on campus housing developers[2]) in order to improve housing availability and cost. We had marginal wins if anything. The strike the next year won everyone thousands of dollars toward housing every year. I understand the nuance of it being a wildcat strike but the entire organizing infrastructure there was from the union.

I agree with your final points and hope stipends will follow upwards in the near future.

[1] https://livinghistory.as.ucsb.edu/tag/uaw-local-2865/

[2] https://payusmoreucsc.com/history-of-students-attempts-to-en...

ASpring commented on Stanford Graduate Students Won Their Union Vote   twitter.com/StanfordGWU/s... · Posted by u/xavierstein
CraigRo · 2 years ago
It was a disaster at Wisconsin. The union favored the 8th-year humanities students at the expense of the 1st- and 2nd-year STEM students, and traded cash for benefits that really only affected people with families. Stipends were the lowest in the country among major research universities, and the union refused to let wages rise for in-demand TAs, so a bunch of them went off and got private sector jobs instead, limiting the size of the undergrad population. Scott Walker killed it, and those problems haven't come back.
ASpring · 2 years ago
It was great for us at UC Santa Cruz. The reason I had healthcare during grad school was because the union won it right before I joined. The reason they have a housing stipend now (in the most expensive rental market in the US[1]) is because the union fought for a cost-of-living allowance. We at UCSC didn't always agree with the course of the larger UAW 2865 but they did a lot for us.

I'm not sure what the system was like in the union in Wisconsin but I'm surprised that more STEM students didn't join and change the course of the union if they were that negatively affected. Our union was democratic almost to a fault but maybe the structure in Wisconsin was different.

[1] https://www.sfchronicle.com/realestate/article/most-expensiv...

ASpring commented on Moving beyond “algorithmic bias is a data problem”   cell.com/patterns/fulltex... · Posted by u/mwexler
fennecfoxen · 4 years ago
> I'm not convinced that is a responsibility that belongs to private corporations.

Private corporations are, by and large, the entities which execute their business using these algorithms, which their employees write.

They are already responsible for business decisions whether made using computers or otherwise. Indeed, who else would possibly manage such a thing? This is tantamount to saying that private corporations should have no business deciding how to execute their business: definitely an opinion you can have, it's just that it's an incredibly statist-central-planning opinion in the end.

ASpring · 4 years ago
Maybe I wasn't very clear, I don't think every single machine learning model should be subject to regulation.

Rather I view it more along the lines of how the US currently regulates accessibility standards for the web or enforces mortgage non-discrimination in protected categories. The role of government here is to identify a class of tangible harms that can result from unfair models deployed in various contexts and to legislate in a way that ensures those harms are avoided.

ASpring commented on Moving beyond “algorithmic bias is a data problem”   cell.com/patterns/fulltex... · Posted by u/mwexler
ASpring · 4 years ago
I wrote about this exact topic a few years back: "Algorithmic Bias is Not Just Data Bias" (https://aaronlspringer.com/not-just-data-bias/).

I think the author is generally correct but there is a lot of focus on algorithmic design and not on how we collectively decide what is fair and ethical for these algorithms to do. Right now it is totally up to the algorithm developer to articulate their version of "fair" and implement it however they see fit. I'm not convinced that is a responsibility that belongs to private corporations.

u/ASpring

Karma: 552 · Cake day: November 19, 2012
About
Human-Centered AI Researcher

aaronlspringer.com
