What I read in this paper blew that idea out of the water! I mean, it's still doable, but you've far exceeded it.
I love that you covered many types of structures, used 8x consumer GPUs more like OSS folks do (widely-accessible pretraining), claim no copyright infringement for pretraining, and use enough ML techniques that people can enjoy Googling stuff for days.
I do have some questions about what I might have overlooked in the paper.
1. Are the training data and code available to reproduce the model, and to iteratively improve its architectural decisions?
2. Most authors claiming their data was legal or open were actually committing copyright infringement. Your method might dodge that if users generate their own synthetic data using methods they can verify aren't themselves encumbered. Is that code available under an open license? If not, would you offer it for a fee to companies and for free to researchers?
3. What specific, common uses could amateurs try that would showcase the model's ability in a business setting? (Both to drive more research and to build products on the model.)
I thank you for your time.
Thanks :)
1. Only for the first version, not for this version. I am sorry!
2. Yeah, ours is guaranteed OK, as we wrote code to generate it basically just from plain torch ops. The code to run inference is available, just not the training code and data generation. (See the sketch below for what inference looks like.)
3. We have put it to work on time series data, which is very business-relevant, for example https://github.com/liam-sbhoo/tabpfn-time-series, and we have a table in the Appendix with all datasets we evaluate on in our main analysis to give you some ideas for possible datasets.
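For anyone wanting to try the released inference code, here is a minimal sketch assuming the sklearn-style `TabPFNClassifier` interface from the public `tabpfn` package; the exact class names, defaults, and options may differ, so check the repository.

```python
# Minimal sketch of running inference with the pre-trained model, assuming the
# sklearn-style interface of the `tabpfn` package (fit / predict_proba).
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = TabPFNClassifier()           # loads the pre-trained checkpoint
clf.fit(X_train, y_train)          # "fit" essentially stores the context
proba = clf.predict_proba(X_test)  # one forward pass over context + queries
print(proba.shape)                 # (n_test_samples, n_classes)
```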
Just looking through the code a bit, it seems that the model supports a (custom) attention mechanism both between features and between rows (the code uses the term "items")? If so, does the attention between rows significantly improve accuracy?
Generally, for standard regression and classification use cases, rows (observations) are assumed to be independent, but I'm guessing cross-row attention might help the model see the gestalt of the data in some way that improves accuracy even when the independence assumption holds?
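For intuition, here is a toy sketch (not the authors' actual implementation; shapes and module choices are illustrative assumptions) of two-axis attention over a table: one pass attends across features within each row, the other across rows (items) within each feature. The cross-row pass is what lets a query row "look at" the labeled training rows.

```python
# Toy sketch of alternating feature-wise and item-wise attention on a table of
# per-cell embeddings. Dimensions and modules are illustrative assumptions.
import torch
import torch.nn as nn

batch, n_items, n_features, d = 1, 128, 10, 64
x = torch.randn(batch, n_items, n_features, d)  # one embedding per table cell

feature_attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)
item_attn = nn.MultiheadAttention(d, num_heads=4, batch_first=True)

# 1) Attention between features: fold items into the batch dimension.
xf = x.reshape(batch * n_items, n_features, d)
xf, _ = feature_attn(xf, xf, xf)
x = xf.reshape(batch, n_items, n_features, d)

# 2) Attention between rows/items: fold features into the batch dimension.
xi = x.permute(0, 2, 1, 3).reshape(batch * n_features, n_items, d)
xi, _ = item_attn(xi, xi, xi)
x = xi.reshape(batch, n_features, n_items, d).permute(0, 2, 1, 3)

print(x.shape)  # torch.Size([1, 128, 10, 64])
```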
Could you please explain, like I'm five, what the trick is? You have a model pre-trained on a large set of small datasets, and you leverage it to boost performance?
Training is fast, a few seconds, but how much time is needed to compute predictions?
How large is the model?
To draw a parallel to NLP: previously, people trained a neural network for each kind of text classification they wanted to do, but then LLMs came around that were pre-trained to perform new tasks on the fly. Similarly, TabPFN learns to do new tasks on the fly just from the context (dataset) it is given.
Training and prediction in these models are by default one and the same, similar to how predicting the next token in an LLM is not split into learning from the context and then doing the actual prediction. There is a way to split this up, though; then the predictions, I believe, take something like 1/10 s for medium-sized datasets.
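To make the "one forward pass" point concrete, here is a toy, hypothetical sketch of the call structure (the model class, signature, and shapes are stand-ins, not the actual TabPFN code): the labeled rows are the context, and all query rows are answered in the same pass, with no gradient steps on the new dataset.

```python
# Conceptual sketch: in-context prediction as a single forward pass.
import torch
import torch.nn as nn

class DummyInContextModel(nn.Module):
    """Stand-in for the pre-trained transformer; a real PFN would attend from
    the query rows to the (X_train, y_train) context, this dummy ignores it."""
    def __init__(self, n_features=4, n_classes=2):
        super().__init__()
        self.head = nn.Linear(n_features, n_classes)

    def forward(self, X_train, y_train, X_test):
        # Context and queries go in together; class probabilities come out.
        return self.head(X_test).softmax(-1)

model = DummyInContextModel()
X_train, y_train = torch.randn(100, 4), torch.randint(0, 2, (100,))
X_test = torch.randn(20, 4)

with torch.no_grad():
    proba = model(X_train, y_train, X_test)  # one pass: context + queries
print(proba.shape)  # torch.Size([20, 2])
```

Splitting it up, as mentioned above, would mean encoding the context once, caching that state, and then answering many query batches against the cache.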