Readit News
dongobread commented on Open models by OpenAI   openai.com/open-models/... · Posted by u/lackoftactics
teitoklien · 25 days ago
For now, I wouldn't rank any model from OpenAI in coding benchmarks, despite all the false messaging they are giving. Almost every single model OpenAI has launched, even the expensive high-end o3 models, is absolutely, monumentally horrible at coding tasks. So this is expected.

If it's decent at other tasks, which I do often find OpenAI better than others at, then I think it's a win, especially for the open-source community: even the AI labs that pioneered the Gen AI hype and never wanted to launch open models are now being forced to do so. That is definitely a win, and not something that was certain before.

dongobread · 25 days ago
It is absolutely awful at writing and general knowledge. IMO coding is its greatest strength by far.
dongobread commented on Open models by OpenAI   openai.com/open-models/... · Posted by u/lackoftactics
cco · 25 days ago
The lede is being missed imo.

gpt-oss:20b is a top-ten model on MMLU (right behind Gemini-2.5-Pro), and I just ran it locally on my MacBook Air M3 from last year.

I've been experimenting with a lot of local models, both on my laptop and on my phone (Pixel 9 Pro), and I figured we'd be here in a year or two.

But no, we're here today. A basically frontier model, running for the cost of electricity (free with a rounding error) on my laptop. No $200/month subscription, no lakes being drained, etc.

I'm blown away.

dongobread · 25 days ago
How up to date are you on current open weights models? After playing around with it for a few hours I find it to be nowhere near as good as Qwen3-30B-A3B. The world knowledge is severely lacking in particular.
dongobread commented on Why Austin Is Falling Out of Favor for Tech Workers   wsj.com/podcasts/tech-new... · Posted by u/CharlesW
dongobread · 2 months ago
This is a little misleading. The data they quote is based on their previous article[1], which just uses an analysis[2] provided by a VC company. Funnily enough, the same VC company put out a separate clickbaity article[3] just a year before that one, claiming the exact opposite findings (about startups ditching SV).

I would guess a lot of these annual trends are just random fluctuations in their dataset, though to be honest I wonder how they're even trying to estimate this kind of information.

[1] https://www.wsj.com/articles/austins-reign-as-a-tech-hub-mig...

[2] https://www.signalfire.com/blog/signalfire-state-of-talent-r...

[3] https://www.signalfire.com/blog/state-of-talent-tech-trends

dongobread commented on Meta invests $14.3B in Scale AI to kick-start superintelligence lab   nytimes.com/2025/06/12/te... · Posted by u/RyanShook
paxys · 3 months ago
This is exactly why Zuck feels he needs a Sam Altman type in charge. They have the labs, the researchers, the GPUs, and unlimited cash to burn. Yet it takes more than all that to drive outcomes. Llama 4 is fine but still a distant 6th or 7th in the AI race. Everyone is too busy playing corporate politics. They need an outsider to come shake things up.
dongobread · 3 months ago
The corporate politics at Meta is the result of Zuck's own decisions. Even in big tech, Meta is (along with Amazon) rather famous for its highly political and backstabby culture.

This is because these two companies have extremely performance-review-oriented cultures, where results need to be proven every quarter or you become grounds for a layoff.

Labs known for being innovative all share the same trait of allowing researchers to go YEARS without high-impact results. But both Meta and Scale are known for being grind shops.

dongobread commented on Trump temporarily drops tariffs to 10% for most countries   cnbc.com/2025/04/09/trump... · Posted by u/bhouston
barbazoo · 5 months ago
I believe at this point it's a simple game of leverage. Look how much damage I can inflict. Now let's negotiate.
dongobread · 5 months ago
The US has crashed its own stock market, tanked its own government's approval ratings, and had its own business leaders speak out against the government. This definitely does not increase leverage.
dongobread commented on Andrew Gelman: Is marriage associated with happiness for men or for women?   statmodeling.stat.columbi... · Posted by u/paulpauper
danenania · a year ago
From a quoted article in this piece[1]:

> unmarried and childless women are the happiest subgroup in the population

Isn't this kind of scary from a sociological and demographic perspective? It would seem to indicate that we've built a socioeconomic system with self-terminating incentives.

I consider myself very liberal/libertarian and individualist vs. collectivist, and I have a daughter, so I'm (angrily) unsympathetic to ideas that even hint at restricting women's freedom on this basis. I'd easily prefer the gradual dissolution of western civilization to my daughter being trapped in an abusive marriage with no right to divorce, being forced to give birth to a child she doesn't want, and so on.

However, all that aside, it does seem like a serious bug in our system. I wonder how we can flip this statistic without restricting anyone's freedom?

1 - https://www.theguardian.com/lifeandstyle/2019/may/25/women-h...

dongobread · a year ago
The paragraph immediately after that one explains that the study was based on a faulty analysis (and links to the article below).

https://www.vox.com/future-perfect/2019/6/4/18650969/married...

dongobread commented on GPT-4 LLM simulates people well enough to replicate social science experiments   treatmenteffect.app/... · Posted by u/thoughtpeddler
dongobread · a year ago
I'm very skeptical of this; the paper they link is not convincing. It says GPT-4 is correct at predicting the direction of an experiment's outcome 69% of the time, versus 66% for human forecasters. But this is a silly benchmark, because people don't trust human forecasters in the first place; that's the whole reason the experiment is run. Knowing that GPT-4 is slightly better at predicting experiments than a human guessing doesn't make it a useful substitute for the actual experiment.
dongobread commented on XLSTMTime: Long-Term Time Series Forecasting with xLSTM   arxiv.org/abs/2407.10240... · Posted by u/beefman
optimalsolver · a year ago
Reminder: If someone's time series forecasting method worked, they wouldn't be publishing it.
dongobread · a year ago
They definitely would, and do. The vast majority of time series work is not about asset prices or beating the stock market.
dongobread commented on XLSTMTime: Long-Term Time Series Forecasting with xLSTM   arxiv.org/abs/2407.10240... · Posted by u/beefman
sigmoid10 · a year ago
Transformers are just MLPs with extra steps. So in theory they should be just as powerful. The problem with transformers is simultaneously their big advantage: They scale extremely well with larger networks and more training data. Better so than any other architecture out there. So if you had enormous datasets and unlimited compute budget, you could probably do amazing things in this regard as well. But if you're just a mortal data scientist without extra funding, you will be better off with more traditional approaches.
dongobread · a year ago
I think what you say is true when comparing transformers to CNNs/RNNs, but not to MLPs.

Transformers, RNNs, and CNNs are all techniques for reducing parameter count compared to a pure-MLP model. If you took a transformer and replaced each self-attention layer with a linear layer plus an activation function, you'd have a pure MLP that can model every relationship the transformer does, and more possible relationships besides, but at the cost of vastly more parameters. MLPs are more powerful, but transformers are more parameter-efficient.

Compared to MLPs, transformers save on parameter count by skimping on the number of parameters devoted to modeling the relationships between tokens. This works in language modeling, where relationships between tokens aren't that important: you can jumble up the words in this sentence and it still mostly makes sense. It doesn't work in time series, where relationships between tokens (timesteps) are the most important thing of all. The LTSF paper linked in the OP paper mentions this same problem: https://arxiv.org/pdf/2205.13504 (see section 1)
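A rough back-of-the-envelope sketch of that parameter-count argument (the function names and the dimensions are mine, purely illustrative):

```python
def self_attention_params(d_model):
    # One self-attention layer: W_Q, W_K, W_V, W_O, each d_model x d_model
    # (biases and the FFN sublayer omitted). Independent of sequence length.
    return 4 * d_model * d_model

def flattened_mlp_params(seq_len, d_model):
    # A dense layer mixing the whole flattened sequence: a
    # (seq_len * d_model) x (seq_len * d_model) weight matrix.
    return (seq_len * d_model) ** 2

d = 512
for L in (128, 1024):
    ratio = flattened_mlp_params(L, d) // self_attention_params(d)
    print(f"seq_len={L}: dense token mixing needs {ratio:,}x the parameters of attention")
```

The dense layer's cost grows with the square of the sequence length, while attention's weight count doesn't grow at all; that's the parameter budget transformers save by not modeling every token-to-token relationship with its own weight.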

dongobread commented on XLSTMTime: Long-Term Time Series Forecasting with xLSTM   arxiv.org/abs/2407.10240... · Posted by u/beefman
carbocation · a year ago
> In recent years, transformer-based models have gained prominence in multivariate long-term time series forecasting

Prominence, yes. But are they generally better than non-deep learning models? My understanding was that this is not the case, but I don't follow this field closely.

dongobread · a year ago
From experience in payments/spending forecasting, I've found that deep learning models generally underperform gradient-boosted tree models. Deep learning tends to be good at learning seasonality but does not handle complex trends or shocks very well. Economic/financial data tends to have straightforward seasonality with complex trends, so deep learning tends to do quite poorly.

I do agree with this paper - all of the good deep learning time series architectures I've tried are simple extensions of MLPs or RNNs (e.g. DeepAR or N-BEATS). The transformer-based architectures I've used have been absolutely awful, especially the endless stream of transformer-based "foundational models" that are coming out these days.
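A toy illustration of the seasonality-vs-shock point (the series and the shock are synthetic numbers of my own construction): a model that only repeats last season's pattern is perfect on stable seasonal data and blind to a level shift.

```python
# Synthetic series: weekly seasonality plus a +15 level shock at t = 20
series = [10 + (t % 7) + (15 if t >= 20 else 0) for t in range(28)]

# Seasonal-naive forecast: repeat the value from one season (7 steps) ago
forecasts = [series[t - 7] for t in range(7, 28)]
errors = [abs(series[t] - forecasts[t - 7]) for t in range(7, 28)]

# Before the shock the seasonal-naive forecast is exact (error 0);
# at the shock it misses by the full shock size (error 15)
print(max(errors[:13]), errors[13])
```

A gradient-boosted model with trend/level features can react to the shift after a few observations, which is the kind of behavior pure seasonality-learners miss.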

u/dongobread

Karma: 628 · Cake day: December 6, 2022