angarg12 · 4 years ago
Little offtopic about the curse of knowledge.

I recently started interviewing ML Engineers for my company. In general I'm quite surprised by the lack of knowledge of people applying for the job. People seem to have several misconceptions, very surface-level knowledge, and lack even the fundamentals.

That made me question if my expectations are set right. Is it possible that, working in the field every day, I expect candidates to know way more than is reasonable? I'm not sure, and I don't feel we have a solid way to deal with that.

cjbgkagh · 4 years ago
I consider myself a decent ML expert, but the field is so vast that I think I could be tripped up in an interview rather easily. Plus I tend to get a bit rusty with the basics that I never use for anything.
anton_ai · 4 years ago
Same experience hiring Junior Data Scientists, it's a shitshow out there. The level is so low that I had to hire an "old school" statistician for a senior position.
melling · 4 years ago
These fields are relatively new. There doesn’t seem to be a clear path to break into them.

Personally, I believe Kaggle is one of the ways to slowly gain some practical experience: https://www.kaggle.com/

However, I’m not sure if it’s sufficient.

Recently, I’ve been taking a deeper dive into studying various types of competitions. For example, I’ve created a repo where I’m organizing notebooks, etc for a regression competition:

https://github.com/melling/ml-regression

I’m creating others for classification, nlp, vision, etc

Of course, the self-study method means people have knowledge gaps, because there's no syllabus tailored for an interview.

nl · 4 years ago
I once interviewed a ML engineer candidate who had two PhDs in ML who couldn't properly identify regression vs classification problems.

> and lack even the fundamentals

I don't think there is a strong consensus on what the fundamentals are. I've also noticed that the fundamentals differ remarkably between people who think of themselves as "data scientists" vs those who think of themselves as "machine learning practitioners".

mkl · 4 years ago
Multiple PhDs in the same field is a big red flag. A PhD is to teach you how to do research, so if you get to the end and need another one, you've failed.
neodypsis · 4 years ago
In your opinion, what are some examples of ML fundamentals? Would you include intermediate linear algebra?
antman · 4 years ago
What exactly do you ask an ML engineer? In my experience, if we ask 10 people about the scope of work of an ML engineer, we'll get 10 different answers, 9 of which will be all-inclusive, i.e. a Data Scientist who can also build robust production systems.

I ask for the actual fundamental skills in the job ad. Say 5 skills for a junior, 15 skills for a senior, and organizing other people (incl. clients) for a manager.

angarg12 · 4 years ago
This is a fairly new role in our company. Officially it's described as 50% data scientist and 50% developer, but the loop composition is the same as for a developer, replacing one of the coding rounds with an ML round.

I'm still experimenting with the format, but I do a mix of asking theoretical ML questions, prior experience with ML, and designing a system to solve a business problem using ML.

tehsauce · 4 years ago
Curious, any specific examples of missing knowledge or misconceptions?
deepsquirrelnet · 4 years ago
I tend to notice that model evaluation is lacking, perhaps because it’s not especially interesting.

But to me, most business applied ML falls under the optimization umbrella. For some reason it’s never portrayed this way, but perhaps if it were, junior practitioners would more commonly pay attention to learning to thoroughly examine how their trained models will perform.
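A toy illustration (with made-up numbers) of why evaluation deserves that attention: on an imbalanced problem, raw accuracy can look fine while the model is useless for the thing you actually care about.

```python
# Hypothetical imbalanced dataset: 95% negatives, 5% positives.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100                    # a "model" that always predicts 0

# Accuracy looks great...
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# ...but recall on the positive class tells the real story.
true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
recall = true_pos / sum(y_true)

print(accuracy)  # 0.95
print(recall)    # 0.0: the model never detects a positive
```

Digging past the headline metric into per-class behavior is exactly the kind of thorough examination the parent comment is asking for.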

angarg12 · 4 years ago
Misconceptions: not knowing regression vs classification, supervised vs unsupervised, or thinking that ML is just neural networks.
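For what it's worth, the first distinction fits in a few lines. A minimal sketch with made-up numbers: regression predicts a continuous value, classification a discrete label.

```python
# Regression: fit y = a*x + b by least squares (closed form for one feature).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 8.1]   # roughly y = 2x (toy data)
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - a * mx
# a is about 2.03: the model can output any real number.

# Classification: predict one of a fixed set of labels, e.g. a threshold rule.
def classify(x, threshold=2.5):
    return "big" if x > threshold else "small"
```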
strikelaserclaw · 4 years ago
ML is the "hot" field right now, like "big data" was a couple of years ago. It attracts a vast number of people who are interested only in money (and a huge subset of those people aren't really that interested in "knowing").
Silica6149 · 4 years ago
Were they mostly new graduates, or did they have some years of experience already?
angarg12 · 4 years ago
Both juniors and mid levels.
arolihas · 4 years ago
Do you have any examples? What are your expectations?
angarg12 · 4 years ago
The most baffling example is a candidate who admitted they didn't expect ML specific questions and hence hadn't prepared. I spent a minute figuring out if we were interviewing the right candidate.

My expectations are always evolving since this is a new role for us. The current guidelines are that candidates should have broad knowledge of ML fundamentals. We also work through a design challenge together where a candidate solves a business problem using ML.

I'm still figuring out the best ways to evaluate these.

imaltont · 4 years ago
The way I like to explain it, which is how I have seen it explained several times, is that all AI can be reduced to search/optimization. ML is just applying that search to the function that will produce the final answer, over a dataset (either generated on the fly or prepared beforehand). For neural networks, the hypothesis space (all the solutions you are searching through to find the best ones) is the weights of the network, and your search strategy/optimization is (usually) backpropagation. If you translated the weights into something traversable by other algorithms, they could do just "fine" (assuming infinite time and space) in its place. It really opens the mind up for experimentation on every bit of the process. The book that really hammered it in for me was Intelligence Emerging by Keith Downing: a short, great book on bio-inspired AI.
tehsauce · 4 years ago
The search strategy/optimizer is actually gradient descent, backpropagation is just an efficient way to compute gradients.
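A minimal sketch of that distinction, minimizing a toy objective where the gradient is known analytically, so no backpropagation is needed at all: the search through weight space is gradient descent itself.

```python
# Minimize f(w) = (w - 3)^2 by gradient descent.
w = 0.0
lr = 0.1
for _ in range(100):
    grad = 2 * (w - 3)   # derivative of (w - 3)^2, written by hand
    w -= lr * grad       # the gradient-descent update: the actual "search"

# w converges to the minimizer at 3.
```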
imaltont · 4 years ago
Both Tom M. Mitchell's "Machine Learning" and Russell & Norvig's "Artificial Intelligence: A Modern Approach" define backpropagation as the whole process: propagating the input forward until you have an output, calculating the gradient, and updating the weights.
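A toy version of that whole process (forward pass, chain-rule gradient, weight update) on a single linear unit with squared-error loss; the numbers are made up for illustration.

```python
x, y_target = 2.0, 10.0   # one hypothetical training example
w = 1.0

for _ in range(200):
    # forward: propagate the input to an output and a loss
    y = w * x
    loss = (y - y_target) ** 2
    # backward: chain rule, d(loss)/dw = d(loss)/dy * dy/dw
    dloss_dy = 2 * (y - y_target)
    dy_dw = x
    grad_w = dloss_dy * dy_dw
    # update the weight
    w -= 0.01 * grad_w

# w converges to 5, since 5 * 2 = 10.
```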
amelius · 4 years ago
Can anyone explain: why is the Adam optimizer so unreasonably effective? Is anyone even using a different optimizer anymore?
Yenrabbit · 4 years ago
A good run-down of the different algorithms: https://ruder.io/optimizing-gradient-descent/ I think people favour adaptive learning rate options like Adam in practice since they generally do seem to perform well, and are often less sensitive to initial conditions and the exact hyper-parameters used. There will always be people who like to test N optimizers with parameter sweeps to squeeze a tiny bit of extra performance out, but for the rest of us the default Adam or AdamW options are good, unobjectionable choices :)
NavinF · 4 years ago
It’s really hard to compare optimizers. Common architectures and default hyperparameters were discovered alongside Adam so you’d have to redo a bunch of sweeps if you wanted a “fair” comparison. In practice this doesn’t really matter and everyone just uses Adam. If you had infinite compute, you’d try every combo and select the one with the best results.
locuscoeruleus · 4 years ago
Adam was very effective when it got introduced, so it was widely adopted. Since then, only models that work well with Adam have made it from the idea stage to actually working. I think there's reason to believe we have overfit our model architectures to our loss functions and optimizers.
jonbaer · 4 years ago
What is the current SOTA? Demon Adam? https://arxiv.org/pdf/1910.04952v4.pdf
yobbo · 4 years ago
Adam functions as a low-pass filter (and/or "compressor") on the gradient. It filters out "noise", which is "wild" during the start of training.

This is basically what all "optimizers" achieve in various ways, including momentum.

Adam uses several times more memory than momentum or plain SGD, and is slower. That's a reason not to use it if it isn't needed.
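A sketch of the Adam update for a single scalar parameter, following the defaults from the original paper; note the two extra state buffers m and v per parameter, which is the memory overhead mentioned above, and the exponential moving averages, which are the "low-pass filter" behavior.

```python
import math

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * grad          # EMA of the gradient (the "filter")
    v = b2 * v + (1 - b2) * grad ** 2     # EMA of the squared gradient
    m_hat = m / (1 - b1 ** t)             # bias correction for the zero init
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v

# Minimize the toy objective f(w) = (w - 3)^2 with Adam.
w, m, v = 0.0, 0.0, 0.0
for t in range(1, 5001):
    grad = 2 * (w - 3)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.05)

# w approaches the minimizer at 3.
```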

seanmor5 · 4 years ago
It’s a pleasant surprise to see this shared here, I am the author of this piece. Honestly I wrote this post for myself more than anything else. I also find that my knowledge in a lot of areas is very “surface level.” It’s really easy to regurgitate definitions, but it’s definitely harder to get to the core of those ideas. I hope you enjoyed!
mikewarot · 4 years ago
Machine learning - the unreasonable effectiveness of trillions of matrix multiplications, and backpropagation as a universal function approximator.

Every time they pull off another thing that's just "too dang silly to work", and yet it does... it really makes me smile.

I took the free Machine Learning course at Stanford way back. It was fun to get another toolkit, should I ever actually need it.

aaccount · 4 years ago
Statistics