If you try to train a DNN on a classical ML problem like the "Wine Quality" dataset from the UCI Machine Learning repo [0], you will get abysmal results: the dataset is only a few thousand rows, so a network of any real size just memorizes the training set and overfits.
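For a concrete illustration, here's a minimal sketch (assuming scikit-learn and pandas are installed, and that the UCI CSV is still served from its long-standing URL; the network size and split are arbitrary choices of mine). The point is the gap between train and test score:

```python
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import r2_score

# Red-wine subset of the UCI Wine Quality data: ~1.6k rows, 11 features.
url = ("https://archive.ics.uci.edu/ml/machine-learning-databases/"
       "wine-quality/winequality-red.csv")
df = pd.read_csv(url, sep=";")
X, y = df.drop(columns="quality").values, df["quality"].values
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# A network that is large relative to ~1.3k training rows.
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(512, 512, 512), max_iter=2000, random_state=0),
).fit(X_tr, y_tr)

# A big gap between these two numbers is the overfitting in action.
print("train R^2:", r2_score(y_tr, mlp.predict(X_tr)))
print("test  R^2:", r2_score(y_te, mlp.predict(X_te)))
```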
The "magic" of LLMs comes from the training paradigm. Because the optimization target is next-word prediction, every word in the corpus is effectively a training example, so your sample size is the size of the corpus - an inconceivably vast number. Because you are training against a vast dataset, you can justify a proportionally immense model (e.g. 400B parameters) without overfitting. This vast (but justified) model complexity is what creates the amazing abilities of GPT/etc.
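To make the "every word is a training example" point concrete, here's a toy next-token-prediction setup in PyTorch (the tiny GRU "LM", vocab size, and random token batch are placeholders I made up; real LLMs use transformers and real text, but the loss has the same shape):

```python
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 1000, 64, 32, 8

class TinyLM(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.rnn = nn.GRU(d_model, d_model, batch_first=True)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)  # (batch, seq_len, vocab_size) logits

model = TinyLM()
tokens = torch.randint(0, vocab_size, (batch, seq_len))  # stand-in for a corpus chunk

logits = model(tokens[:, :-1])   # predict token t+1 from tokens up to t
targets = tokens[:, 1:]          # the "labels" are just the text, shifted by one
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab_size), targets.reshape(-1)
)
loss.backward()  # one gradient step of "word prediction"

# Every position in every sequence is a supervised example for free,
# which is why the effective sample size is the size of the corpus.
print(batch * (seq_len - 1), "prediction targets in this one small batch")
```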
What wasn't obvious 10 years ago was the principle of "reusability" - the idea that the vastly complex model you trained under the LLM paradigm would have any practical value. Why is it useful to build an immensely sophisticated word prediction machine? Who cares about predicting words? The reason is that all the concepts the model had to learn in order to predict words can be reused for related NLP tasks.
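This is exactly the pretrain-then-reuse pattern you see everywhere today. A minimal sketch with the Hugging Face transformers library (assuming it's installed and can download its default checkpoint; the input sentence is just my example):

```python
from transformers import pipeline

# The default sentiment model is a language model that was pretrained on
# token prediction and then fine-tuned for sentiment classification.
classifier = pipeline("sentiment-analysis")
print(classifier("This wine is surprisingly good for the price."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```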
Zhang et al. (2021), 'Understanding deep learning (still) requires rethinking generalization'
If someone has something that explains this, I'd be grateful.