Readit News
Posted by u/atomicnature 2 years ago
Ask HN: Daily practices for building AI/ML skills?
Say I have around 1 hour daily allocated to developing AI/ML skills.

What in your opinion is the best way to invest the time/energy?

1. Build small projects (build what?)

2. Read blogs/newsletters (which ones?)

3. Take courses (which courses?)

4. Read textbooks (which books?)

5. Kaggle competitions

6. Participate in AI/ML forums/communities

7. A combination of the above (if possible share time % allocation/weightage)

Asking this in general to help experienced software engineers build up capabilities in ML.

janalsncm · 2 years ago
I got a masters degree in ML at a good school. I will say there’s pretty much nothing they taught me that I couldn’t have learned myself. That said, school focused my attention in ways I wouldn’t have alone, and provided pressure to keep going.

The single thing which I learned the most from was implementing a paper. Lectures and textbooks to me are just words. I understand them in the abstract but learning by doing gets you far deeper knowledge.

Others might suggest a more varied curriculum but to me nothing beats a one hour chunk of uninterrupted problem solving.

Here are a few suggested projects.

Train a baby neural network to learn a simple function like ax^2 + bx + c.

MNIST digits classifier. Basically the “hello world” of ML at this point.

Fine-tune GPT-2 on a specialized corpus like Shakespeare.

Train a Siamese network with triplet loss to measure visual similarity, e.g. to find out which celeb you look most like.
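The first of these projects fits in a few lines. A sketch in NumPy, fitting y = ax^2 + bx + c by plain gradient descent (the coefficient values here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth coefficients the model should recover (arbitrary choices)
a_true, b_true, c_true = 2.0, -3.0, 0.5

x = rng.uniform(-2, 2, size=200)
y = a_true * x**2 + b_true * x + c_true

# Model: y_hat = a*x^2 + b*x + c, trained by gradient descent on MSE.
# The model is linear in its parameters, so the gradient is closed-form.
params = np.zeros(3)  # [a, b, c]
features = np.stack([x**2, x, np.ones_like(x)], axis=1)

lr = 0.05
for _ in range(2000):
    y_hat = features @ params
    grad = 2 * features.T @ (y_hat - y) / len(x)  # d(MSE)/d(params)
    params -= lr * grad

print(params)  # should be close to [2.0, -3.0, 0.5]
```

Doing the same thing in a framework like PyTorch, with autograd instead of the hand-derived gradient, is a natural second step.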

My $0.02: don’t waste your time writing your own neural net and backprop. It’s a biased opinion, but this would be like implementing your own HashMap: no company will ask you to do it. Instead, learn how to use profiling and debugging tools like TensorBoard and the TF profiler.

DeathArrow · 2 years ago
ML is so much more than just neural networks.

I would start by taking a free university-level course in statistics. Then I would continue with the basics: SVMs, linear regression, naive Bayes, gradient boosting, neural nets, etc. I would not only train and fine-tune them, but also build simple ones myself instead of just using libraries. Then I would continue to what you said: participate in Kaggle competitions, try to solve real-world problems.

I think that understanding the field from bottom up is priceless. Many people fine tune and train models, but they don't understand how that model works, nor do they know if the model they've chosen is the best fit for the problem they are trying to solve.

It's a rather long path if you really want to get good at it. Like in music: you can learn to play a tune by ear or you can learn to have a good, deep and thorough understanding of music.

wanderingmind · 2 years ago
This type of bottom-up approach is a terrible idea for a fast-moving area like ML. Ultimately, to get a job and make money in the area you need to solve customer problems, so starting with libraries and fine-tuning is what needs to happen first. You should still try to learn the fundamentals as you go and when you get stuck. The bottom-up route takes a very long time if you have a full-time job or are a student studying something else, and it doesn't help you get a job.
amanda99 · 2 years ago
The risk with this is that you spend 10 hours learning statistics, then get demotivated and never do the other 190 hours to get to the good stuff. Then you quickly forget the 10 hours of stats you learnt too, since it's irrelevant to anything you're doing and you don't use it.

For me, playing with things and doing cool & fun stuff is always the way to get deeper into something.

mickael-kerjean · 2 years ago
> SVM, linear regression, naive Bayes

When I studied ML in 2012, the very first course started with naive Bayes and went on from there. Coming back after a decade away, I see a lot of people around me starting with neural nets to train models that naive Bayes would be plenty good enough for, having never heard of naive Bayes. Is that only my experience?
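For anyone who hasn't met it, naive Bayes is genuinely tiny. A sketch of the Gaussian variant in NumPy (toy data, not production code):

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, means, and variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
    return classes, priors, means, vars_

def predict(X, classes, priors, means, vars_):
    # log P(c) + sum_i log N(x_i | mean_ci, var_ci): the "naive"
    # part is treating every feature as independent given the class.
    log_lik = -0.5 * (np.log(2 * np.pi * vars_)[None] +
                      (X[:, None, :] - means[None]) ** 2 / vars_[None]).sum(-1)
    return classes[np.argmax(np.log(priors)[None] + log_lik, axis=1)]

# Two well-separated Gaussian blobs as toy data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = fit_gaussian_nb(X, y)
print((predict(X, *model) == y).mean())  # essentially 1.0 on this toy data
```

For plenty of text-classification-style problems this kind of model is a strong, cheap baseline before reaching for a neural net.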

IKantRead · 2 years ago
> don’t waste your time writing your own neural net and backprop.

I don't think you should lump writing a neural network together with writing backprop, since I don't know anyone doing serious ML who isn't using some sort of automatic differentiation library to handle the backprop part for them. I'm not entirely sure people even know what they're saying when they talk about backprop these days, and I suspect they're confusing it with gradient optimization.

But anyone seriously interested in ML absolutely should be building their own models from scratch and training them with gradient descent, ideally starting by building out their own optimization routine rather than using a prepackaged one.

This is hugely important since the optimization part of the learning is really the heart of modern machine learning. If you really want to understand ML you should have a strong intuition about various methods of optimizing a given model. Additionally, there are lots of details and tricks behind these models that you never see if you're only calling an API around them.

There's a world of difference between implementing an LSTM and calling one. You learn significantly more about what's actually happening by doing the former.
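To make the implementing-vs-calling point concrete, here's a single LSTM cell step written out in NumPy, following the standard textbook equations rather than any particular library's weight layout (sizes and initialization are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,), gates stacked [i, f, g, o]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate
    g = np.tanh(z[2*H:3*H])    # candidate cell state
    o = sigmoid(z[3*H:4*H])    # output gate
    c = f * c_prev + i * g     # gated update of the cell state
    h = o * np.tanh(c)         # hidden state exposed to the next layer
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 4                    # input and hidden sizes (arbitrary)
W = rng.normal(0, 0.1, (4*H, D))
U = rng.normal(0, 0.1, (4*H, H))
b = np.zeros(4*H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):  # unroll over a length-5 sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

Writing this once makes the forget-gate/cell-state story in the papers concrete in a way that `nn.LSTM(...)` never does.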

janalsncm · 2 years ago
> the optimization part of the learning is really the heart of modern machine learning

It’s an important component but I wouldn’t say it’s the main factor. ML is ultimately about your data, so understanding it is critical. Feature selection and engineering, sampling, subspace optimization (e.g. ESMMs) and interpreting the results correctly are really the main places you can squeeze the most juice out. Optimizing the function is the very last step.

Basically, you can go ahead and optimize down to the very bottom of the global min but a model with better features and better feature interactions is going to win.

Further, there are a ton of different optimizers available. SGD, Adam, Adagrad, RMSProp, FTRL, etc. With just one hour a day, you could spend six months simply writing and understanding the most popular ones.
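For a flavor of what those optimizer implementations involve, here are the SGD and Adam update rules applied to a toy quadratic loss (a sketch of the textbook formulas, not a library-grade implementation):

```python
import numpy as np

def grad(w):
    return 2 * w  # gradient of the toy loss f(w) = w^2

# Plain SGD: step directly against the gradient
w = np.array([5.0])
for _ in range(100):
    w -= 0.1 * grad(w)

# Adam: momentum on the gradient (m) plus a running scale estimate (v)
w_adam = np.array([5.0])
m = v = np.zeros_like(w_adam)
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 101):
    g = grad(w_adam)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g**2
    m_hat = m / (1 - b1**t)  # bias correction for the zero-initialized averages
    v_hat = v / (1 - b2**t)
    w_adam -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(w, w_adam)  # SGD is essentially at the minimum; Adam is well on its way
```

Even a toy like this surfaces real design questions: why the bias correction exists, and why Adam's effective step size shrinks as the gradient history accumulates.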

mortallywounded · 2 years ago
> I will say there’s pretty much nothing they taught me that I couldn’t have learned myself. That said, school focused my attention in ways I wouldn’t have alone, and provided pressure to keep going.

I have a masters degree in computer science and took a fair share of ML graduate courses. That pretty much summed up what I was thinking. They basically forced me to sit and learn something I wouldn't have alone.

Now-- I'm not saying you need to go to grad school. You could buy some ML textbooks and force yourself through them and go from there... but how many people have that grit? I wouldn't have been one of them :)

janalsncm · 2 years ago
I hate to say it, but the diploma also matters. Having an MS next to your name means employers will give you the time of day that others won’t get. I really don’t like that this is the way things work, but it is.
RamblingCTO · 2 years ago
Also a compsci master with only ML courses here. I actually enjoyed it and learned tons of stuff I'd never have learned on my own. Who learns Boltzmann machines, self-organizing maps, or Fourier transforms/wavelets by themselves? I've never seen any of those in most ML books or courses, and I really enjoyed learning all of them (and those are only the things I can think of right now; it's been a while).
dacryn · 2 years ago
I'd argue backprop is still handy just to learn the basics

It doesn't have to be production-ready of course, but spending 3-4 hours writing it out in code, debugging a few steps, etc. is useful in my opinion.

or at least watch the karpathy video and try to follow along

manojlds · 2 years ago
As Andrej Karpathy says, understanding it deeply is required to take your skills to the next level and not make mistakes. As a practice, one should implement their own neural net and manual backprop to start building their understanding.
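In the spirit of Karpathy's micrograd, a minimal scalar autograd engine is small enough to write in an afternoon. A sketch supporting only addition and multiplication:

```python
class Value:
    """A scalar that records enough of its history to backpropagate through."""
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._grad_fns = grad_fns  # local derivative w.r.t. each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g, o=other: g * o.data,
                      lambda g, s=self: g * s.data))

    def backward(self):
        # Topological order, then apply the chain rule from the output backwards
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, fn in zip(v._parents, v._grad_fns):
                parent.grad += fn(v.grad)

x = Value(3.0)
y = x * x + x          # dy/dx = 2x + 1 = 7 at x = 3
y.backward()
print(y.data, x.grad)  # 12.0 7.0
```

Extending it with subtraction, powers, and tanh, then training a tiny MLP on top, is exactly the exercise the Zero to Hero videos walk through.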
culi · 2 years ago
> My $0.02: don’t waste your time writing your own neural net and backprop.

I think it's worth doing a very simple implementation at least once to ensure you have the fundamentals memorized. It's not actually that complicated to implement a very simple one; maybe a one- or two-day project.

vintermann · 2 years ago
I think that getting a feel for gradients and how it all works is a good reason for implementing your own - once.

Don't worry about what companies will ask you to do unless you absolutely have to.

atomicnature · 2 years ago
Love the suggestions, especially the one about implementing papers.

Do you have any starters on how one selects papers in the early days, to implement?

Also - any great papers you recommend beginners expose themselves to?

janalsncm · 2 years ago
The best site imo is Papers With Code. State of the art benchmarks, the papers which achieved them (along with previous papers) and github repos to actual implementations.

I wouldn’t recommend papers to absolute beginners though. For them, it’s best to go to HuggingFace, find a model that seems interesting and play with it in a Jupyter notebook. You’ll get a lot more bang for your buck.

IshanMi · 2 years ago
Try the "historical papers" on this repo: https://github.com/aimerou/awesome-ai-papers You can also find papers with their implementations in code here: http://paperswithcode.com
vintermann · 2 years ago
I don't know if anyone does it still, but a few years ago there were a lot of papers suggesting more or less clever alternatives to ReLU as activation function. There was also a whole zoo of optimizers as alternatives to SGD.

Those papers were within reach for me. Even if the math (or the colossal search effort) needed to find them was out of reach, implementing them wasn't.

There were some things besides optimizers and activation functions too. In particular I remember Dmitry Ulyanov's "Deep Image Prior" paper. He did publish code, but the thing he explored - using the implicit structure in a model architecture without training (or, training on just your input data!) - is actually dead simple to try yourself.

I'm sure if you just drink from the firehose of the arxiv AI/ML feeds, you'll find something that tickles your interest that you can actually implement. Or at least play with published code.

spaniard89277 · 2 years ago
I only have old hardware at home. How viable is it to practice this stuff on my own projects? (I'd also like to use JS as much as possible, despite everyone being on Python.)
janalsncm · 2 years ago
You don’t need your own hardware. You can use Google Colab for free.

Most of the action happens in Python. That being said, there's a library called TensorFlow.js. It has some pre-trained models you can use off the shelf and run from your browser, for things like face detection and sentiment analysis.

sanderjd · 2 years ago
This seems like great advice.

You say don't write your own neural net and backprop implementation. That makes sense to me. What do you suggest using instead, for your suggested projects? I'm guessing tensorflow, based on your suggestions on profiling and debugging tools? Do the papers / projects you suggest map straightforwardly onto a tensorflow implementation, rather than a custom one?

uoaei · 2 years ago
Implementations in TensorFlow are widely considered to be technical debt. Internally, Google has mostly switched to JAX. PyTorch now has torch.compile and exports to ONNX, so there's little reason to use TensorFlow these days except in niche cases.
alephnan · 2 years ago
> I got a masters degree in ML at a good school. I will say there’s pretty much nothing they taught me that I couldn’t have learned myself.

Are people actually going into master's degrees to learn? I thought the whole point of paying for a master's was just credentialism.

janalsncm · 2 years ago
Technically you could do it on your own. Practically speaking? I quit my job and went to school full time. I spent 2 years studying the stuff practically 10 hours a day. The only way this is socially acceptable is if you get a piece of paper at the end which says you did it.
viksit · 2 years ago
(Former AI researcher + current technical founder here)

I assume you’re talking about the latest advances and not just regression and PAC learning fundamentals. I don’t recommend following a linear path - there are too many rabbit holes. Do 2 things - a course and a small course project. Keep it time-bound and aim to finish no matter what. Do not dabble outside of this for a few weeks :)

Then find an interesting area of research, find their github and run that code. Find a way to improve it and/or use it in an app

Some ideas.

- do the fast.ai course (https://www.fast.ai/)

- read Karpathy’s blog posts about how transformers/LLMs work (https://lilianweng.github.io/posts/2023-01-27-the-transforme... for an update)

- Stanford CS231n on vision basics (https://cs231n.github.io/)

- Stanford CS324 on language models (https://stanford-cs324.github.io/winter2022/)

Now, find a project you’d like to do.

eg: https://dangeng.github.io/visual_anagrams/

or any of the ones that are posted to hn every day.

(posted on phone in transit, excuse typos/formatting)

manojlds · 2 years ago
Would recommend Zero to Hero by Karpathy as well

https://karpathy.ai/zero-to-hero.html

krmboya · 2 years ago
I also recommend fastai. It gets you hands on from the very beginning with links to extra resources like papers and articles you can read to improve your understanding.

Doing fastai while solving comparative problems on your own in kaggle is quite enlightening

zupatol · 2 years ago
Ah, visual anagrams, that was exactly the idea I had for a project that would let me learn. I hadn't dared to check whether it already existed. I will try to pretend it doesn't and find my own way...
ru552 · 2 years ago
fast.ai course (https://www.fast.ai/) gets a thumbs up from me as well
_bramses · 2 years ago
I think a lot of these comments will highlight the lower-level parts of ML, but what ML needs right now, in my opinion, is really smart people at the implementation level. As an analogy, there are way fewer “frontend” ML practitioners than “backend” ones.

Leveraging existing LLM technologies and putting them in software where regular people can use them and have a great experience is important, necessary work. When I studied CS in college the data structure kids were the “cool kids”, but I don’t think that’s the case in ML.

The daily practice is to sketch applications, configure prompts and function calls, learn to market what you create, and try to create zero-to-one type tools. Here are two examples I made: one where I took the commonplace book technique of the era of Aristotle and brought it into our modern embeddings era [1], and one where I really pushed to understand the pure Markdown spec and integrate streaming generative models into it [2].

[1] - https://github.com/bramses/commonplace-bot

[2] - https://github.com/bramses/chatgpt-md

duckworthd · 2 years ago
What's worked well for me: find a way to put AI/ML on your critical path. Think of it like learning a new language: classes, lessons, and watching TV help, but nothing works like full-on immersion. In the context of AI/ML, that means finding a way to turn AI/ML into your full-time job or school. It's not easy! But if you do, you'll see endless returns.

If you don't have a solid enough footing to get a job in the field yet, the next best thing in my opinion: find a passion project and keep cooking up new ways to tackle it. On the way to solving your problem, you'll undoubtedly begin absorbing the tools of the trade.

Lastly, consider going back to school (a Bachelor's or Master's, perhaps?). It'll take far more than 1 hour/day, but I promise you, you'll see results far faster and far more concretely than any other learning strategy.

Good luck!

Context: I've been a Researcher/Engineer at Google DeepMind (formerly Google Brain) for the last ~7 years. I studied AI/ML in my BS and MS, but burnt out of a PhD before publishing my first paper. Now I do AI/ML research as a day job.

atomicnature · 2 years ago
Yes, I was leaning more towards the "personal project" idea as well, something around document understanding. I subscribe to the "learning by doing/immersion" philosophy too (to a large extent).

The problem with projects is that one's understanding tends to become more and more specialised, while collaborating/connecting with other ML engineers sometimes requires a broader knowledge base.

Also, for giving advice and useful inputs to others (on their projects), I feel a balanced knowledge base is useful.

Hence the question.

markha · 2 years ago
Greg Brockman's blog[1] has a few links on how he picked up ML. Another link at [2] describes the path Michal (the blog's author) followed (though it's framed as "how I got into..."). Both these blogs walk through how they got into the ML side of things, and they have a bunch of links (ex: [3]).

I think it'll help if you can get a job at a company whose main focus is ML; you'll talk to folks who are doing research or solving problems using ML, and you'll learn. If not, I hope these links help, as the folks there (people way smarter than me, a SWE) had a similar question and documented the steps they took to close the gaps in their understanding.

[1] - https://blog.gregbrockman.com/how-i-became-a-machine-learnin... [2] - https://agentydragon.com/posts/2023-01-11-how-i-got-to-opena... [3] - https://github.com/jacobhilton/deep_learning_curriculum

TrackerFF · 2 years ago
Roughly speaking, the roadmap for a typical ML/AI student looks like this:

0) Learn the pre-requisites of math, CS, etc. That usually means calc 1-3, linear algebra, probability and statistics, fundamental cs topics like programming, OOP, data structures and algorithms, etc.

1) Elementary machine learning course, which covers all the classic methods.

2) Deep Learning, which covers the fundamental parts of DL. Note, though, this one changes fast.

From there, you kind of split between ML engineering, or ML research.

For ML engineering, you study more technical things that relate to the whole ML-pipeline. Big data, distributed computing, way more software engineering topics.

For ML research, you focus more on the science itself - which usually involves reading papers, learning topics which are relevant to your research. This usually means having enough technical skills to translate research papers into code, but not necessarily at a level that makes the code good enough to ship.

I'll echo what others have said, though: use the tools at hand to implement stuff. It is fun and helpful to implement things from scratch, for the learning, but it is easy to get extremely bogged down trying to implement every model out there.

When I tried to learn "practical" ML, I took some model, and tried to implement it in such a way that I could input data via some API, and get back the results. That came with some challenges:

- Data processing (typical ETL problem)

- Developing and hosting software (core software engineering problems)

- API development

And then you have the model itself, lots of work goes toward that alone.

te_chris · 2 years ago
As someone a wee bit along the journey but with the maths dragging me down a bit, I've found that, while in a perfect world I'd love to get my maths up to solid 2nd year undergrad level, it's just going to take me another year or so. That hasn't stopped me moving forwards. I understand y = ax + b, bits of linear algebra, gradient descent, but I still don't have the critical intuition to pass a college level maths exam.

This has helped me build the intuition for understanding these concepts in ML, and as an experienced developer I've found I've been able to pick up the ML stuff relatively easily: at the practical level it's mostly libraries. This has in turn shown me two things. First, ML is mostly data quality, prep, and monitoring. Second, I actually like the maths: it annoys me that there's this whole branch of knowledge I don't grok intuitively, and I want to know more. As I go deeper into the maths, I find myself retrospectively contextualising my ML knowledge.

So: do both and they'll reinforce each other - just accept you'll be lost for a bit.

Also: working with LLMs is incredible, as you can skip the training step and go straight to using the models. They're fucking wild technology.

TrackerFF · 2 years ago
I personally think it's possible to grasp how many ML models learn from intuition alone, without the formal math knowledge, but only up to a certain point.

From my time in college studying this, you had approximately four types of students:

1) Those that didn't understand how models worked, and lacked the math to theoretically understand the models (dropped out class after a couple of weeks)

2) Those that understood (intuitively) how the models worked, but lacked the math to read and formalize models. Lots of students from the CS program fell under this group - but I think that is due to CS programs here having less math requirements than traditional engineering and science majors.

3) Those that understood how the models worked, and had the math knowledge. This was the majority of students.

4) Those that did not understand the models, but had the math knowledge.

Of these, 2-3 were the most common types of students. On rare occasions you had type 4 students. They would have no problem deriving formulas or proving things, but they'd more or less freeze up or start to stumble when asked to explain on a blackboard how the models worked.

With that said, if someone has any ambition of doing ML research, I think math prereqs are a must. Hell, even people with good (graduate level) math skills can have a hard time reading papers, as there are so many different fields/branches of math involved. Lots and lots of inconsistent math notation, overloading, and all that.

There's a lot of contrived "mathiness" in papers, even where simple diagrams would do the trick. If your paper doesn't include a certain amount of equations/math, people won't take it seriously... so some authors just spam their papers with somewhat related equations, using whatever notation they're most comfortable with.

opportune · 2 years ago
If I can offer some unsolicited advice, try to seek out ML educational material with polished visualizations. If you're struggling with the math, trying to learn the concepts by reading libraries or textbooks or papers will be very hard - you might understand a specific thing if you look at it closely, but it will be hard to develop an intuition for why it works. A good visualization, or a strong educator explaining by way of analogy, can make a huge difference.

For example, gradient descent in conjunction with your learning rate can be visualized as computing your error vector (your gradient), scaling it by your learning rate, applying it to your parameters, computing the next error vector, and so on. If you think of what applying this vector might look like in 3D space, training your model is basically getting all your parameters to fall into a hole (an optimum). This kind of conceptualization helps you understand the purpose and impact of the learning rate: a way to scale the steps you take descending into holes, so that you can hopefully "shoot over" local non-global optima while still being able to "fall into" other optima.
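The overshooting intuition is easy to check numerically. On the one-parameter bowl f(w) = w^2, each gradient step multiplies w by (1 - 2*lr), so descent converges only when the learning rate stays below 1.0 (a toy sketch):

```python
def descend(lr, steps=20, w=1.0):
    """Run gradient descent on f(w) = w^2, whose gradient is 2w."""
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(descend(0.1))   # shrinks steadily toward the minimum at 0
print(descend(0.9))   # overshoots past 0 each step, but still converges
print(descend(1.1))   # diverges: each step lands farther from the minimum
```

The middle case is the "shoot over but still fall in" regime described above; the last one is what a too-large learning rate looks like in training curves.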

You could read papers and stare at code for a very very long time without developing that kind of intuition. I don't think I could ever come up with this myself just dabbling.

And as a side note, in mathematics, at least for me, the most unexpectedly important factor in understanding something is exposure time. In college and grad school I found I didn't fully intuit most material until about 12 months after I had finished studying it - even if I hadn't actively studied it at all in the interim. I think it has something to do with the different ways our brains encode recent/medium/long-term knowledge, or sleep, or something - not really sure. But I do know that the earlier you start learning something and exposing yourself to the concepts, the sooner your subconscious builds that intuitive understanding. So you can do yourself a huge favor by just diving into the math material even if it feels like a slog or you're not getting it right this minute - you might wake up one day in a few months and just "get it" somehow.

gwbas1c · 2 years ago
One thing to point out: Try not to let your imagination run away, or get overconfident in what AI/ML can do.

I worked for a major company on an ML project for 2 years. By the time I left, I realized that:

1: The project I was working on had no improvement over ordinary statistical methods; and since people could understand the statistics (unlike the black box of ML), the project offered no tangible improvement over the processes we were trying to replace.

2: A lot of the ML I was working on was a solution in search of a problem.

I personally found the ML system I was working on fascinating; but the overconfidence about what it can infer, and the way that non-developers thought ML could make magical inferences, frustrating.

---

One other thing: Make sure you understand how to use databases, both SQL and non-SQL. In order to use ML effectively, you will need to be excellent at programming with large volumes of data in a performant manner.
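As a tiny illustration of the habits that matter at volume, here's a sqlite3 sketch of batching inserts inside one transaction and indexing the column you filter by (the table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, label INTEGER)")

# One transaction + executemany: far faster than committing row-at-a-time
rows = [(i % 1000, i % 2) for i in range(100_000)]
with conn:
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# Index the column your feature-extraction queries filter on
conn.execute("CREATE INDEX idx_user ON events(user_id)")

count, = conn.execute(
    "SELECT COUNT(*) FROM events WHERE user_id = ?", (42,)
).fetchone()
print(count)  # 100 rows for this user in the synthetic data
```

The same two ideas (batch your writes, index your filters) carry over directly to the warehouse-scale systems that feed real training pipelines.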

borg16 · 2 years ago
This answers a different question: what is the simplest possible solution to the problem at hand?

Answering that requires a good understanding of the problem at hand, as well as the knowledge to propose a simple solution as a starting point, and then to search for improvements over it - assuming the improvements are useful to the solution at hand.

I guess what I'm trying to say is that the question asked by OP and your suggestion are orthogonal, imo :)

hereonout2 · 2 years ago
Presuming you want to work in the field and already have software development experience, why not look at the confluence between ML and engineering?

Things like MLOps; applying DevOps, testing, and CI/CD in the ML space; training across multiple GPUs; and how to actually host an LLM, especially at scale and affordably.

In my experience there are hundreds of candidates coming from academia with strong academic backgrounds in ML. There are very few experienced engineers available to help them realise their ambitions!

brainbag · 2 years ago
Do you have any recommended resources on those topics? I'm coming from a strong ~30 year software engineering background which has been excellent, until now, as ML requires a completely different background. I'm trying to decide if I should start a new game+ with academic background, or get some expansion packs with what I already know and move into ML that way. I've found plenty of resources for the former and practically nothing for the latter.
hereonout2 · 2 years ago
Things like this give a good overview of the problems being faced in productionising ML:

https://research.google/pubs/whats-your-ml-test-score-a-rubr...

Note they start to discuss things like unit testing, integration testing, processing pipelines, canary tests, rollbacks, etc. Sound familiar yet?

The same author has also written this book:

https://www.oreilly.com/library/view/reliable-machine-learni...

I don't see a software engineer's skills becoming redundant in this field, especially if you have a good level of experience in cloud infra and tooling. It seems more valuable than ever to me (e.g. I have worked with ML researchers who don't grasp HTTP, let alone could set up a fleet of servers to run their model developed entirely in a Jupyter notebook).

I have found it helpful to acquaint myself with the correct tools and terminology in order to speak the right language. There are specific tools lots of people use, such as Weights & Biases for "experiment tracking"; terms like "model repository", which is just what it sounds like; "vector databases" (Elasticsearch has had this feature for years); and "feature stores", which feel familiar from big-table-type databases.

Reading up on a typical use case like RAG (Retrieval-Augmented Generation) is a good idea - alongside starting to think about how you'd actually build and deploy one.
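Stripped of the vector database and the LLM call, the retrieval half of RAG is just nearest-neighbor search over embeddings. A toy sketch using bag-of-words vectors in place of a real embedding model (the corpus and query are made up):

```python
import numpy as np

corpus = [
    "the cat sat on the mat",
    "gradient descent minimizes the loss",
    "transformers use attention layers",
]

# Toy "embeddings": word counts over a shared vocabulary.
# A real RAG system would call an embedding model here instead.
vocab = sorted({w for doc in corpus for w in doc.split()})

def embed(text):
    words = text.split()
    return np.array([words.count(w) for w in vocab], dtype=float)

doc_vecs = np.stack([embed(d) for d in corpus])

def retrieve(query, k=1):
    """Return the k corpus passages most cosine-similar to the query."""
    q = embed(query)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) *
                           (np.linalg.norm(q) + 1e-9))
    return [corpus[i] for i in np.argsort(-sims)[:k]]

print(retrieve("how does attention work in transformers"))
# The retrieved passages would then be pasted into the LLM prompt.
```

Swapping in a real embedding model and an approximate-nearest-neighbor index is essentially what the vector database products do for you.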

Above all, having a decent background in cloud infra, engineering, and how to optimise systems and code for production deployment at scale is very in demand at the moment.

Being the person who helps these teams of PhDs (many of whom have little industry experience) productionise and deploy is where I am right now; it feels like a fruitful place to be :)

nkzd · 2 years ago
Hey, I am a classic backend software engineer looking to learn how to do the things you mentioned. I believe if I learn these skills, I will know how to make "shovels" during the gold rush :)

Can you recommend any learning resources for the things you mentioned? I don't have the option to learn these at my current job, so it will be hard to structure my CV to prove to future employers that I know them when I don't have real-world experience.

hereonout2 · 2 years ago
Check out my reply to the sibling comment
rramadass · 2 years ago
1. Get An Introduction to Statistical Learning with Applications in R/Python (aka ISLR/ISLP) by James, Witten, Hastie, and Tibshirani. Read it cover to cover and make sure you understand the concepts/ideas/nuances/subtleties explained.

2. Keep a couple of Mathematics/Statistics books handy while you are going through the above. When the above book talks about some Maths technique you don't know/understand you should immediately consult these books (and/or watch some short Youtube videos) to grasp the concept and usage. This way you learn/understand the necessary Mathematics inline without being overwhelmed.

This is the simplest and most direct route to studying and understanding AI/ML. Everything else mentioned in this thread should only come after this.