Readit News
Posted by u/atomicnature 2 years ago
Ask HN: Daily practices for building AI/ML skills?
Say I have around 1 hour daily allocated to developing AI/ML skills.

What in your opinion is the best way to invest the time/energy?

1. Build small projects (build what?)

2. Read blogs/newsletters (which ones?)

3. Take courses (which courses?)

4. Read textbooks (which books?)

5. Kaggle competitions

6. Participate in AI/ML forums/communities

7. A combination of the above (if possible share time % allocation/weightage)

Asking this in general to help experienced software engineers build up capabilities in ML.

janalsncm · 2 years ago
I got a masters degree in ML at a good school. I will say there’s pretty much nothing they taught me that I couldn’t have learned myself. That said, school focused my attention in ways I wouldn’t have alone, and provided pressure to keep going.

The single thing which I learned the most from was implementing a paper. Lectures and textbooks to me are just words. I understand them in the abstract but learning by doing gets you far deeper knowledge.

Others might suggest a more varied curriculum but to me nothing beats a one hour chunk of uninterrupted problem solving.

Here are a few suggested projects.

Train a baby neural network to learn a simple function like ax^2 + bx + c.

MNIST digits classifier. Basically the “hello world” of ML at this point.

Fine-tune GPT-2 on a specialized corpus like Shakespeare.

Train a Siamese network with triplet loss to measure visual similarity, e.g. to find out which celeb you look most like.
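The first of these projects fits in a few lines. A sketch in NumPy, fitting y = ax^2 + bx + c by plain gradient descent (the coefficient values here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth coefficients the model should recover (arbitrary choices)
a_true, b_true, c_true = 2.0, -3.0, 0.5

x = rng.uniform(-2, 2, size=200)
y = a_true * x**2 + b_true * x + c_true

# Model: y_hat = a*x^2 + b*x + c, trained by gradient descent on MSE.
# The model is linear in its parameters, so the gradient is closed-form.
params = np.zeros(3)  # [a, b, c]
features = np.stack([x**2, x, np.ones_like(x)], axis=1)

lr = 0.05
for _ in range(2000):
    y_hat = features @ params
    grad = 2 * features.T @ (y_hat - y) / len(x)  # d(MSE)/d(params)
    params -= lr * grad

print(params)  # should be close to [2.0, -3.0, 0.5]
```

Doing the same thing in a framework like PyTorch, with autograd instead of the hand-derived gradient, is a natural second step.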

My $0.02: don’t waste your time writing your own neural net and backprop. It’s a biased opinion, but this would be like implementing your own HashMap: no company will ask you to do it. Instead, learn how to use profiling and debugging tools like TensorBoard and the TF profiler.

DeathArrow · 2 years ago
ML is so much more than just neural networks.

I would start by taking a free university-level course in statistics. Then I would continue with the basics: SVMs, linear regression, naive Bayes, gradient boosting, neural nets, etc. I would not only train and fine-tune them, but also build simple ones myself instead of just using libraries. Then I would continue to what you said: participate in Kaggle competitions, try to solve real-world problems.

I think that understanding the field from bottom up is priceless. Many people fine tune and train models, but they don't understand how that model works, nor do they know if the model they've chosen is the best fit for the problem they are trying to solve.

It's a rather long path if you really want to get good at it. Like in music: you can learn to play a tune by ear or you can learn to have a good, deep and thorough understanding of music.

wanderingmind · 2 years ago
This type of bottom-up approach is a terrible idea for a fast-moving area like ML. Ultimately, to get a job and make money in the area you need to solve customer problems, so starting with libraries and fine-tuning is what needs to happen first. You should still try to learn the fundamentals as you go and when you get stuck. The bottom-up route takes a very long time if you have a full-time job or are a student studying something else, and it doesn't help you get a job.
amanda99 · 2 years ago
The risk with this is that you spend 10 hours learning statistics, then get demotivated and never do the other 190 hours to get to the good stuff. Then you quickly forget the 10 hours of stats you learnt too, since it's irrelevant to anything you're doing and you don't use it.

For me, playing with things and doing cool & fun stuff is always the way to get deeper into something.

mickael-kerjean · 2 years ago
> SVM, linear regression, naive Bayes

When I studied ML in 2012, the very first course started with naive Bayes and went on from there. Coming back after a decade away, I see a lot of people around me starting with neural nets to train models that naive Bayes would be plenty good enough for, having never heard of naive Bayes. Is that only my experience?
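For anyone who hasn't met it, naive Bayes is genuinely tiny. A sketch of the Gaussian variant in NumPy (toy data, not production code):

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate per-class priors, means, and variances."""
    classes = np.unique(y)
    priors = np.array([np.mean(y == c) for c in classes])
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    vars_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in classes])
    return classes, priors, means, vars_

def predict(X, classes, priors, means, vars_):
    # log P(c) + sum_i log N(x_i | mean_ci, var_ci): the "naive"
    # part is treating every feature as independent given the class.
    log_lik = -0.5 * (np.log(2 * np.pi * vars_)[None] +
                      (X[:, None, :] - means[None]) ** 2 / vars_[None]).sum(-1)
    return classes[np.argmax(np.log(priors)[None] + log_lik, axis=1)]

# Two well-separated Gaussian blobs as toy data
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = fit_gaussian_nb(X, y)
print((predict(X, *model) == y).mean())  # essentially 1.0 on this toy data
```

For plenty of text-classification-style problems this kind of model is a strong, cheap baseline before reaching for a neural net.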

IKantRead · 2 years ago
> don’t waste your time writing your own neural net and backprop.

I don't think you should lump writing a neural network together with writing backprop, since I don't know anyone doing serious ML who isn't using some sort of automatic differentiation library to handle the backprop part for them. I'm not entirely sure people even know what they're saying when they talk about backprop these days, and I suspect they're confusing it with gradient optimization.

But anyone seriously interested in ML absolutely should be building their own models from scratch and training them with gradient descent, ideally starting by building out their own optimization routine rather than using a prepackaged one.

This is hugely important since the optimization part of the learning is really the heart of modern machine learning. If you really want to understand ML you should have a strong intuition about various methods of optimizing a given model. Additionally, there are lots of details and tricks behind these models that you never see if you're only calling an API around them.

There's a world of difference between implementing an LSTM and calling one. You learn significantly more about what's actually happening by doing the former.
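To make the implementing-vs-calling point concrete, here's a single LSTM cell step written out in NumPy, following the standard textbook equations rather than any particular library's weight layout (sizes and initialization are arbitrary):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. W: (4H, D), U: (4H, H), b: (4H,), gates stacked [i, f, g, o]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate
    g = np.tanh(z[2*H:3*H])    # candidate cell state
    o = sigmoid(z[3*H:4*H])    # output gate
    c = f * c_prev + i * g     # gated update of the cell state
    h = o * np.tanh(c)         # hidden state exposed to the next layer
    return h, c

rng = np.random.default_rng(0)
D, H = 3, 4                    # input and hidden sizes (arbitrary)
W = rng.normal(0, 0.1, (4*H, D))
U = rng.normal(0, 0.1, (4*H, H))
b = np.zeros(4*H)

h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):  # unroll over a length-5 sequence
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape)  # (4,)
```

Writing this once makes the forget-gate/cell-state story in the papers concrete in a way that `nn.LSTM(...)` never does.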

janalsncm · 2 years ago
> the optimization part of the learning is really the heart of modern machine learning

It’s an important component but I wouldn’t say it’s the main factor. ML is ultimately about your data, so understanding it is critical. Feature selection and engineering, sampling, subspace optimization (e.g. ESMMs) and interpreting the results correctly are really the main places you can squeeze the most juice out. Optimizing the function is the very last step.

Basically, you can go ahead and optimize down to the very bottom of the global min but a model with better features and better feature interactions is going to win.

Further, there are a ton of different optimizers available. SGD, Adam, Adagrad, RMSProp, FTRL, etc. With just one hour a day, you could spend six months simply writing and understanding the most popular ones.
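For a flavor of what those optimizer implementations involve, here are the SGD and Adam update rules applied to a toy quadratic loss (a sketch of the textbook formulas, not a library-grade implementation):

```python
import numpy as np

def grad(w):
    return 2 * w  # gradient of the toy loss f(w) = w^2

# Plain SGD: step directly against the gradient
w = np.array([5.0])
for _ in range(100):
    w -= 0.1 * grad(w)

# Adam: momentum on the gradient (m) plus a running scale estimate (v)
w_adam = np.array([5.0])
m = v = np.zeros_like(w_adam)
lr, b1, b2, eps = 0.1, 0.9, 0.999, 1e-8
for t in range(1, 101):
    g = grad(w_adam)
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g**2
    m_hat = m / (1 - b1**t)  # bias correction for the zero-initialized averages
    v_hat = v / (1 - b2**t)
    w_adam -= lr * m_hat / (np.sqrt(v_hat) + eps)

print(w, w_adam)  # SGD is essentially at the minimum; Adam is well on its way
```

Even a toy like this surfaces real design questions: why the bias correction exists, and why Adam's effective step size shrinks as the gradient history accumulates.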

mortallywounded · 2 years ago
> I will say there’s pretty much nothing they taught me that I couldn’t have learned myself. That said, school focused my attention in ways I wouldn’t have alone, and provided pressure to keep going.

I have a masters degree in computer science and took a fair share of ML graduate courses. That pretty much summed up what I was thinking. They basically forced me to sit and learn something I wouldn't have alone.

Now-- I'm not saying you need to go to grad school. You could buy some ML textbooks and force yourself through them and go from there... but how many people have that grit? I wouldn't have been one of them :)

janalsncm · 2 years ago
I hate to say it, but the diploma also matters. Having an MS next to your name means employers will give you the time of day that others won’t get. I really don’t like that this is the way things work, but it is.
RamblingCTO · 2 years ago
Also a compsci master with only ML courses here. I actually enjoyed it and learned tons of stuff I'd never have learned on my own. Who learns Boltzmann machines, self-organizing maps, or Fourier transforms/wavelets by themselves? I've never seen any of those in most ML books or courses, and I really enjoyed learning all of them (and those are only the things I can think of right now; it's been a while).
dacryn · 2 years ago
I'd argue backprop is still handy just to learn the basics

It doesn't have to be production-ready of course, but spending 3-4 hours writing it out in code, debugging a few steps, etc. is useful in my opinion.

or at least watch the karpathy video and try to follow along

manojlds · 2 years ago
As Andrej Karpathy says, understanding it deeply is required to take your skills to the next level and not make mistakes. As a practice, one should implement their own neural net and manual backprop to start building their understanding.
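In the spirit of Karpathy's micrograd, a minimal scalar autograd engine is small enough to write in an afternoon. A sketch supporting only addition and multiplication:

```python
class Value:
    """A scalar that records enough of its history to backpropagate through."""
    def __init__(self, data, parents=(), grad_fns=()):
        self.data = data
        self.grad = 0.0
        self._parents = parents
        self._grad_fns = grad_fns  # local derivative w.r.t. each parent

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data + other.data, (self, other),
                     (lambda g: g, lambda g: g))

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        return Value(self.data * other.data, (self, other),
                     (lambda g, o=other: g * o.data,
                      lambda g, s=self: g * s.data))

    def backward(self):
        # Topological order, then apply the chain rule from the output backwards
        order, seen = [], set()
        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)
        visit(self)
        self.grad = 1.0
        for v in reversed(order):
            for parent, fn in zip(v._parents, v._grad_fns):
                parent.grad += fn(v.grad)

x = Value(3.0)
y = x * x + x          # dy/dx = 2x + 1 = 7 at x = 3
y.backward()
print(y.data, x.grad)  # 12.0 7.0
```

Extending it with subtraction, powers, and tanh, then training a tiny MLP on top, is exactly the exercise the Zero to Hero videos walk through.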
culi · 2 years ago
> My $0.02: don’t waste your time writing your own neural net and backprop.

I think it's worth doing a very simple implementation at least once to ensure you have the fundamentals memorized. It's not actually that complicated to implement a very simple one; maybe a one- or two-day project.

vintermann · 2 years ago
I think that getting a feel for gradients and how it all works is a good reason for implementing your own - once.

Don't worry about what companies will ask you to do unless you absolutely have to.

atomicnature · 2 years ago
Love the suggestions, especially the one about implementing papers.

Do you have any starters on how one selects papers in the early days, to implement?

Also - any great papers you recommend beginners expose themselves to?

janalsncm · 2 years ago
The best site imo is Papers With Code. State of the art benchmarks, the papers which achieved them (along with previous papers) and github repos to actual implementations.

I wouldn’t recommend papers to absolute beginners though. For them, it’s best to go to HuggingFace, find a model that seems interesting and play with it in a Jupyter notebook. You’ll get a lot more bang for your buck.

IshanMi · 2 years ago
Try the "historical papers" on this repo: https://github.com/aimerou/awesome-ai-papers You can also find papers with their implementations in code here: http://paperswithcode.com
vintermann · 2 years ago
I don't know if anyone does it still, but a few years ago there were a lot of papers suggesting more or less clever alternatives to ReLU as activation function. There was also a whole zoo of optimizers as alternatives to SGD.

Those papers were within reach for me. Even if the math (or the colossal search effort) needed to find them was out of reach, implementing them wasn't.

There were some things besides optimizers and activation functions too. In particular I remember Dmitry Ulyanov's "Deep Image Prior" paper. He did publish code, but the thing he explored - using the implicit structure in a model architecture without training (or, training on just your input data!) - is actually dead simple to try yourself.

I'm sure if you just drink from the firehose of the arxiv AI/ML feeds, you'll find something that tickles your interest that you can actually implement. Or at least play with published code.

spaniard89277 · 2 years ago
I only have old hardware at home. How viable is it to practice this stuff on my own projects? (I'd also like to use JS as much as possible, despite everyone being on Python.)
janalsncm · 2 years ago
You don’t need your own hardware. You can use Google Colab for free.

Most of the action happens in Python. That being said, there's a library called TensorFlow.js. It has some pre-trained models you can use off the shelf and run from your browser, for things like face detection and sentiment analysis.

sanderjd · 2 years ago
This seems like great advice.

You say don't write your own neural net and backprop implementation. That makes sense to me. What do you suggest using instead, for your suggested projects? I'm guessing tensorflow, based on your suggestions on profiling and debugging tools? Do the papers / projects you suggest map straightforwardly onto a tensorflow implementation, rather than a custom one?

uoaei · 2 years ago
Implementations in TensorFlow are widely considered to be technical debt. Internally, Google has mostly switched to JAX. PyTorch now has torch.compile and exports to ONNX, so there's little reason to use TensorFlow these days except in niche cases.
alephnan · 2 years ago
> I got a masters degree in ML at a good school. I will say there’s pretty much nothing they taught me that I couldn’t have learned myself.

Are people actually going into master's degrees to learn? I thought the whole point of paying for a master's was just credentialism.

janalsncm · 2 years ago
Technically you could do it on your own. Practically speaking? I quit my job and went to school full time. I spent 2 years studying the stuff practically 10 hours a day. The only way this is socially acceptable is if you get a piece of paper at the end which says you did it.
viksit · 2 years ago
(Former AI researcher + current technical founder here)

I assume you’re talking about the latest advances and not just regression and PAC learning fundamentals. I don’t recommend following a linear path - there are too many rabbit holes. Do 2 things - a course and a small course project. Keep it time-bound and aim to finish no matter what. Do not dabble outside of this for a few weeks :)

Then find an interesting area of research, find their github and run that code. Find a way to improve it and/or use it in an app

Some ideas.

- do the fast.ai course (https://www.fast.ai/)

- read Karpathy’s blog posts about how transformers/LLMs work (https://lilianweng.github.io/posts/2023-01-27-the-transforme... for an update)

- Stanford CS231n on vision basics (https://cs231n.github.io/)

- Stanford CS324 on language models (https://stanford-cs324.github.io/winter2022/)

Now, find a project you’d like to do.

eg: https://dangeng.github.io/visual_anagrams/

or any of the ones that are posted to hn every day.

(posted on phone in transit, excuse typos/formatting)

manojlds · 2 years ago
Would recommend Zero to Hero by Karpathy as well

https://karpathy.ai/zero-to-hero.html

krmboya · 2 years ago
I also recommend fastai. It gets you hands on from the very beginning with links to extra resources like papers and articles you can read to improve your understanding.

Doing fastai while solving comparative problems on your own in kaggle is quite enlightening

zupatol · 2 years ago
Ah, visual anagrams, that was exactly the idea I had for a project that would let me learn. I hadn't dared to check whether it already existed. I will try to pretend it doesn't and find my own way...
ru552 · 2 years ago
fast.ai course (https://www.fast.ai/) gets a thumbs up from me as well
_bramses · 2 years ago
I think a lot of these comments will highlight the lower-level parts of ML, but what ML needs right now, in my opinion, is really smart people at the implementation level. As an analogy, there are way fewer “frontend” ML practitioners than “backend” ones.

Leveraging existing LLM technologies and putting them in software where regular people can use them and have a great experience is important, necessary work. When I studied CS in college the data structure kids were the “cool kids”, but I don’t think that’s the case in ML.

The daily practice is to sketch applications, configure prompts and function calls, learn to market what you create, and try to create zero-to-one type tools. Here are two examples I made: one where I took the commonplace book technique of the era of Aristotle and brought it into our modern embeddings era [1], and one where I really pushed to understand the pure Markdown spec and integrate streaming generative models into it [2].

[1] - https://github.com/bramses/commonplace-bot

[2] - https://github.com/bramses/chatgpt-md

duckworthd · 2 years ago
What's worked well for me: find a way to put AI/ML on your critical path. Think of it like learning a new language: classes, lessons, and watching TV help, but nothing works like full-on immersion. In the context of AI/ML, that means finding a way to turn AI/ML into your full-time job or school. It's not easy! But if you do, you'll see endless returns.

If you don't have a solid enough footing to get a job in the field yet, the next best thing in my opinion: find a passion project and keep cooking up new ways to tackle it. On the way to solving your problem, you'll undoubtedly begin absorbing the tools of the trade.

Lastly, consider going back to school (a Bachelor's or Master's, perhaps?). It'll take far more than 1 hour/day, but I promise you, you'll see results far faster and far more concretely than any other learning strategy.

Good luck!

Context: I've been a Researcher/Engineer at Google DeepMind (formerly Google Brain) for the last ~7 years. I studied AI/ML in my BS and MS, but burnt out of a PhD before publishing my first paper. Now I do AI/ML research as a day job.

atomicnature · 2 years ago
Yes, I was leaning more towards the "personal project" idea as well, something around document understanding. I subscribe to the "learning by doing/immersion" philosophy too (to a large extent).

The problem with projects is that one's understanding tends to become more and more specialised, while collaborating/connecting with other ML engineers sometimes requires a broader knowledge base.

Also, for giving advice and useful inputs to others (on their projects), I feel a balanced knowledge base is useful.

Hence the question.

markha · 2 years ago
Greg Brockman's blog[1] has a few links on how he picked up ML. Another link at [2] describes the path Michal (the blog's author) followed (though it's framed as "how I got into..."). Both these blogs walk through how they got into the ML side of things, and they have a bunch of links (ex: [3]).

I think it'll help if you can get a job at a company whose main focus is ML; you'll talk to folks who are doing research or solving problems using ML, and you'll learn. If not, I hope these links help, as the folks there (people way smarter than me, a SWE) had a similar question and documented the steps they took to close the gaps in their understanding.

[1] - https://blog.gregbrockman.com/how-i-became-a-machine-learnin... [2] - https://agentydragon.com/posts/2023-01-11-how-i-got-to-opena... [3] - https://github.com/jacobhilton/deep_learning_curriculum

TrackerFF · 2 years ago
Roughly speaking, the roadmap for a typical ML/AI student looks like this:

0) Learn the pre-requisites of math, CS, etc. That usually means calc 1-3, linear algebra, probability and statistics, fundamental cs topics like programming, OOP, data structures and algorithms, etc.

1) Elementary machine learning course, which covers all the classic methods.

2) Deep Learning, which covers the fundamental parts of DL. Note, though, this one changes fast.

From there, you kind of split between ML engineering, or ML research.

For ML engineering, you study more technical things that relate to the whole ML-pipeline. Big data, distributed computing, way more software engineering topics.

For ML research, you focus more on the science itself - which usually involves reading papers, learning topics which are relevant to your research. This usually means having enough technical skills to translate research papers into code, but not necessarily at a level that makes the code good enough to ship.

I'll echo what others have said, though: use the tools at hand to implement stuff. It is fun and helpful to implement things from scratch, for the learning, but it is easy to get extremely bogged down trying to implement every model out there.

When I tried to learn "practical" ML, I took some model, and tried to implement it in such a way that I could input data via some API, and get back the results. That came with some challenges:

- Data processing (typical ETL problem)

- Developing and hosting software (core software engineering problems)

- API development

And then you have the model itself, lots of work goes toward that alone.

te_chris · 2 years ago
As someone a wee bit along the journey but with the maths dragging me down a bit, I've found that, while in a perfect world I'd love to get my maths up to solid 2nd year undergrad level, it's just going to take me another year or so. That hasn't stopped me moving forwards. I understand y = ax + b, bits of linear algebra, gradient descent, but I still don't have the critical intuition to pass a college level maths exam.

This has helped me build the intuition for understanding these concepts in ML, and as an experienced developer I've found I've been able to pick up the ML stuff relatively easily: at the practical level it's mostly libraries. This has in turn shown me two things. First, ML is mostly data quality, prep, and monitoring. Second, I actually like the maths: it annoys me that there's this whole branch of knowledge I don't grok intuitively, and I want to know more. As I go deeper into the maths, I find myself retrospectively contextualising my ML knowledge.

So: do both and they'll reinforce each other - just accept you'll be lost for a bit.

Also: working with LLMs is incredible, as you can skip the training step and go straight to using the models. They're fucking wild technology.

TrackerFF · 2 years ago
I personally think it's possible to grasp how many ML models learn from intuition alone, without the formal math knowledge, but only up to a certain point.

From my time in college studying this, you had approximately four types of students:

1) Those that didn't understand how models worked, and lacked the math to theoretically understand the models (dropped out class after a couple of weeks)

2) Those that understood (intuitively) how the models worked, but lacked the math to read and formalize models. Lots of students from the CS program fell under this group - but I think that is due to CS programs here having less math requirements than traditional engineering and science majors.

3) Those that understood how the models worked, and had the math knowledge. This was the majority of students.

4) Those that did not understand the models, but had the math knowledge.

Of these, 2-3 were the most common types of students. On rare occasions you had type 4 students. They would have no problem deriving formulas or proving things, but they'd more or less freeze up or start to stumble when asked to explain on a blackboard how the models worked.

With that said, if someone has any ambition of doing ML research, I think math prereqs are a must. Hell, even people with good (graduate level) math skills can have a hard time reading papers, as there are so many different fields/branches of math involved. Lots and lots of inconsistent math notation, overloading, and all that.

There's a lot of contrived "mathiness" in papers, even where simple diagrams would do the trick. If your paper doesn't include a certain amount of equations/math, people won't take it seriously... so some authors just spam their papers with somewhat related equations, using whatever notation they're most comfortable with.

opportune · 2 years ago
If I can offer some unsolicited advice, try to seek out ML educational material with polished visualizations. If you're struggling with the math, trying to learn the concepts by reading libraries or textbooks or papers will be very hard - you might understand a specific thing if you look at it closely, but it will be hard to develop an intuition for why it works. A good visualization, or a strong educator explaining by way of analogy, can make a huge difference.

For example, gradient descent in conjunction with your learning rate can be visualized as computing your error vector (your gradient), scaling it by your learning rate, applying it to your parameters, computing the next error vector, and so on. If you think of what applying this vector might look like in 3D space, training your model is basically getting all your parameters to fall into a hole (an optimum). This kind of conceptualization helps you understand the purpose and impact of the learning rate: a way to scale the steps you take descending into holes, so that you can hopefully "shoot over" local non-global optima while still being able to "fall into" other optima.
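The overshooting intuition is easy to check numerically. On the one-parameter bowl f(w) = w^2, each gradient step multiplies w by (1 - 2*lr), so descent converges only when the learning rate stays below 1.0 (a toy sketch):

```python
def descend(lr, steps=20, w=1.0):
    """Run gradient descent on f(w) = w^2, whose gradient is 2w."""
    for _ in range(steps):
        w -= lr * 2 * w
    return w

print(descend(0.1))   # shrinks steadily toward the minimum at 0
print(descend(0.9))   # overshoots past 0 each step, but still converges
print(descend(1.1))   # diverges: each step lands farther from the minimum
```

The middle case is the "shoot over but still fall in" regime described above; the last one is what a too-large learning rate looks like in training curves.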

You could read papers and stare at code for a very very long time without developing that kind of intuition. I don't think I could ever come up with this myself just dabbling.

And as a side note, in mathematics, at least for me, the most unexpectedly important factor in understanding something is exposure time. In college and grad school I found I didn't fully intuit most material until about 12 months after I had finished studying it - even if I hadn't actively studied it at all in the interim. I think it has something to do with the different ways our brains encode recent/medium/long-term knowledge, or sleep, or something - not really sure. But I do know that the earlier you start learning something and exposing yourself to the concepts, the sooner your subconscious builds that intuitive understanding. So you can do yourself a huge favor by just diving into the math material even if it feels like a slog or you're not getting it right this minute - you might wake up one day in a few months and just "get it" somehow.

gwbas1c · 2 years ago
One thing to point out: Try not to let your imagination run away, or get overconfident in what AI/ML can do.

I worked for a major company on an ML project for 2 years. By the time I left, I realized that:

1: The project I was working on had no improvement over ordinary statistical methods; and since people could understand the statistics (unlike the black box of ML), the project offered no tangible improvement over the processes we were trying to replace.

2: A lot of the ML I was working on was a solution in search of a problem.

I personally found the ML system I was working on fascinating; but the overconfidence about what it can infer, and the way that non-developers thought ML could make magical inferences, frustrating.

---

One other thing: Make sure you understand how to use databases, both SQL and non-SQL. In order to use ML effectively, you will need to be excellent at programming with large volumes of data in a performant manner.
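As a tiny illustration of the habits that matter at volume, here's a sqlite3 sketch of batching inserts inside one transaction and indexing the column you filter by (the table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id INTEGER, label INTEGER)")

# One transaction + executemany: far faster than committing row-at-a-time
rows = [(i % 1000, i % 2) for i in range(100_000)]
with conn:
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# Index the column your feature-extraction queries filter on
conn.execute("CREATE INDEX idx_user ON events(user_id)")

count, = conn.execute(
    "SELECT COUNT(*) FROM events WHERE user_id = ?", (42,)
).fetchone()
print(count)  # 100 rows for this user in the synthetic data
```

The same two ideas (batch your writes, index your filters) carry over directly to the warehouse-scale systems that feed real training pipelines.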

borg16 · 2 years ago
This answers a different question: what is the simplest possible solution to the problem at hand?

Answering that requires a good understanding of the problem at hand, as well as the knowledge to propose a simple solution as a starting point, and then to search for improvements over it - assuming the improvements are useful to the solution at hand.

I guess what I'm trying to say is that the question asked by OP and your suggestion are orthogonal, imo :)

hereonout2 · 2 years ago
Presuming you want to work in the field and already have software development experience, why not look at the confluence between ML and engineering?

Things like MLOps; applying DevOps, testing, and CI/CD in the ML space; training across multiple GPUs; and how to actually host an LLM, especially at scale and affordably.

In my experience there are hundreds of candidates coming from academia with strong academic backgrounds in ML. There are very few experienced engineers available to help them realise their ambitions!

brainbag · 2 years ago
Do you have any recommended resources on those topics? I'm coming from a strong ~30 year software engineering background which has been excellent, until now, as ML requires a completely different background. I'm trying to decide if I should start a new game+ with academic background, or get some expansion packs with what I already know and move into ML that way. I've found plenty of resources for the former and practically nothing for the latter.
hereonout2 · 2 years ago
Things like this give a good overview of the problems being faced in productionising ML:

https://research.google/pubs/whats-your-ml-test-score-a-rubr...

Note they start to discuss things like unit testing, integration testing, processing pipelines, canary tests, rollbacks, etc. Sound familiar yet?

The same author has also written this book:

https://www.oreilly.com/library/view/reliable-machine-learni...

I don't see a software engineer's skills becoming redundant in this field, especially if you have a good level of experience in cloud infra and tooling. It seems more valuable than ever to me (e.g. I have worked with ML researchers who don't grasp HTTP, let alone could set up a fleet of servers to run their model developed entirely in a Jupyter notebook).

I have found it helpful to acquaint myself with the correct tools and terminology in order to speak the right language. There are specific tools lots of people use, such as Weights & Biases for "experiment tracking"; terms like "model repository", which is just what it sounds like; "vector databases" (Elasticsearch has had this feature for years); and "feature stores", which feel familiar from big-table-type databases.

Reading up on a typical use case like RAG (Retrieval-Augmented Generation) is a good idea - alongside starting to think about how you'd actually build and deploy one.
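Stripped of the vector database and the LLM call, the retrieval half of RAG is just nearest-neighbor search over embeddings. A toy sketch using bag-of-words vectors in place of a real embedding model (the corpus and query are made up):

```python
import numpy as np

corpus = [
    "the cat sat on the mat",
    "gradient descent minimizes the loss",
    "transformers use attention layers",
]

# Toy "embeddings": word counts over a shared vocabulary.
# A real RAG system would call an embedding model here instead.
vocab = sorted({w for doc in corpus for w in doc.split()})

def embed(text):
    words = text.split()
    return np.array([words.count(w) for w in vocab], dtype=float)

doc_vecs = np.stack([embed(d) for d in corpus])

def retrieve(query, k=1):
    """Return the k corpus passages most cosine-similar to the query."""
    q = embed(query)
    sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) *
                           (np.linalg.norm(q) + 1e-9))
    return [corpus[i] for i in np.argsort(-sims)[:k]]

print(retrieve("how does attention work in transformers"))
# The retrieved passages would then be pasted into the LLM prompt.
```

Swapping in a real embedding model and an approximate-nearest-neighbor index is essentially what the vector database products do for you.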

Above all, having a decent background in cloud infra, engineering, and how to optimise systems and code for production deployment at scale is very in demand at the moment.

Being the person who helps these teams of PhDs (many of whom have little industry experience) productionise and deploy is where I am right now; it feels like a fruitful place to be :)

nkzd · 2 years ago
Hey, I am a classic backend software engineer looking to learn how to do the things you mentioned. I believe if I learn these skills, I will know how to make "shovels" during the gold rush :)

Can you recommend any learning resources for the things you mentioned? I don't have the option to learn these at my current job, so it will be hard to structure my CV to prove to future employers that I know them when I don't have real-world experience.

hereonout2 · 2 years ago
Check out my reply to the sibling comment
rramadass · 2 years ago
1. Get An Introduction to Statistical Learning with Applications in R/Python (aka ISLR/ISLP) by James, Witten, Hastie, and Tibshirani. Read it cover to cover and make sure you understand the concepts/ideas/nuances/subtleties explained.

2. Keep a couple of Mathematics/Statistics books handy while you are going through the above. When the above book talks about some Maths technique you don't know/understand you should immediately consult these books (and/or watch some short Youtube videos) to grasp the concept and usage. This way you learn/understand the necessary Mathematics inline without being overwhelmed.

This is the simplest and most direct route to studying and understanding AI/ML. Everything else mentioned in this thread should only come after this.