I am currently working as a senior full-stack web software engineer. I have a BSc in Computer Science, and on my own, I've been learning more about AI/ML/deep learning. I really enjoy working with it, and I'd love to find a way to work on AI stuff professionally. The problem is that I've been working as a web developer professionally for about 10 years now, and I have no idea how I would pivot to more of an AI/data science role.
Does anyone have experience making this transition? As a web dev, I am senior level, but I'm sure I'd have to start from scratch on some things in the AI space. At least I have a good foundation in programming in general, math, and computer science.
If the former, I suggest digging into things like the excellent Fast AI course: https://course.fast.ai/
If the latter, the (relatively new) keyword you are looking for is likely "AI Engineer" - https://www.latent.space/p/ai-engineer
There's an argument that deep knowledge of how to train models isn't actually that useful when working with generative AI (LLMs etc) - knowing how to train or fine-tune a new model is less useful than developing knowledge of the other weird things you have to figure out about prompting, evals and using these models to build production-quality apps.
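To make "evals" a bit more concrete, here is a minimal sketch of what a tiny eval harness might look like. Everything in it is an assumption for illustration: `run_eval`, `fake_model`, and the prompt template are hypothetical names, and the "model" is a stub standing in for whatever real API call you would make. The scoring rule (substring match) is also just the simplest thing that could work, not a recommended metric.

```python
from typing import Callable, Iterable


def run_eval(
    model: Callable[[str], str],
    cases: Iterable[tuple[str, str]],
    template: str = "Q: {question}\nA:",
) -> float:
    """Return the fraction of cases whose expected answer appears in the output.

    `model` is any function that maps a prompt string to a response string,
    so a real API client can be swapped in for the stub below.
    """
    cases = list(cases)
    if not cases:
        return 0.0
    passed = 0
    for question, expected in cases:
        output = model(template.format(question=question))
        # Hypothetical scoring rule: case-insensitive substring match.
        if expected.lower() in output.lower():
            passed += 1
    return passed / len(cases)


def fake_model(prompt: str) -> str:
    """Stub standing in for a real LLM call (assumption, not a real API)."""
    return "4" if "2 + 2" in prompt else "I don't know"


if __name__ == "__main__":
    score = run_eval(
        fake_model,
        [("What is 2 + 2?", "4"), ("What is the capital of France?", "Paris")],
    )
    print(f"pass rate: {score:.2f}")
```

The point of keeping the model behind a plain callable is that you can rerun the same cases every time you change a prompt or switch models, and see whether the pass rate moved.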
So some options are:
* Building ML models for traditional applications - forecasting, ranking, recommending, all of that. Data Scientists and ML Engineers haven't gone away.
* Working as an "AI Engineer" building on top of existing APIs and models. Very hip, but also in flux - I couldn't tell you whether that role will still be considered rare & valuable in a few years' time, or what skills will be core to it.
* ML Ops, engineering work for building & serving AI & ML models. It's always good to be selling shovels.
I would try to get into a team you're interested in as a SWE and then upskill or pivot. In my experience this is more effective than trying to completely reskill and sell yourself as an unproven prospect for MLE or AI Scientist work. AI/ML teams still need software, and in fact many of the best researchers are not great software engineers.
At 100x the operating cost.
If you have any kind of scale, using a model that's appropriately sized to your task is going to be better.
Why is that not the job of a software/product engineer?
> none of the highly effective AI Engineers I named above have done the equivalent work of the Andrew Ng Coursera courses, nor do they know PyTorch, nor do they know the difference between a Data Lake or Data Warehouse
they're explicitly not trained in ML/AI. any software engineer can write a good prompt, call an API, and deploy that on an http server.
why is that not just software/product engineering?
Like any other web dev job there are differences across domains that can make it valuable to hire someone with past experience in your particular industry (in this case LLMs), but the only reason this gets a brand new title and others don't is hype.
It might not even still be a speciality in a few years' time. Right now though there's a lot of depth to the field that people who aren't focusing on it are missing out on.
While there are advantages to going "full-stack" in this analogy, most people focus on one or the other.
For a foundation you need what, a primer on linear algebra, calculus, statistics, and probability? At a minimum you have to know the language. Then, do you need anything of classical AI? Anything of classical ML? Do you even need to have any knowledge of Deep Learning other than that it exists? Or transformers? Or do you just need a bunch of tutorials on how to use the GPT APIs, structure and manage prompts, etc.?
I know all this information is out there, but has anyone linked it together? If so, is it paywalled? I did a quick search on AI Engineer and before I knew it a site was asking me for 20k+ and 1.5 years of my life. I already have a master's in CS, so I assume I can make faster strides and do it for less. Can anyone advise?
It depends on what type of role you want. If you'd be happy building the application layer and doing prompt engineering, just build applications that call LLM APIs.
If you want a research position at the top labs, the interviews really are passable by people without PhDs. They focus on strong fundamentals. I've seen people make this leap, but it can take years of preparation: actually reading textbooks, implementing low-level details like backprop, re-implementing papers, and doing non-trivial personal projects. Essentially, you're self-studying a Master's degree. Blog about it. Post about it here. I've found that people who make this transition just generally love learning.
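For a sense of what "implementing low-level details like backprop" can mean in practice, here is a rough sketch in plain Python, with no framework. The network shape (one tanh hidden layer, scalar linear output, squared-error loss) and all function names are my own choices for illustration, not something from a particular course or interview. The finite-difference check is a common way to convince yourself the hand-derived gradients are right.

```python
import math
import random


def make_params(n_in, n_hidden, seed=0):
    """Random weights for a net with one tanh hidden layer and a scalar output."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = rng.uniform(-1, 1)
    return W1, b1, W2, b2


def forward(params, x):
    """Return hidden activations h and scalar output y."""
    W1, b1, W2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sum(w * hi for w, hi in zip(W2, h)) + b2
    return h, y


def loss(params, x, target):
    """Squared-error loss 0.5 * (y - target)^2."""
    _, y = forward(params, x)
    return 0.5 * (y - target) ** 2


def backprop(params, x, target):
    """Gradients of the loss w.r.t. W1, b1, W2, b2 via the chain rule."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    dy = y - target                                   # dL/dy
    dW2 = [dy * hi for hi in h]                       # dL/dW2[j] = dy * h[j]
    db2 = dy
    # tanh'(z) = 1 - tanh(z)^2, so dL/dz[j] = dy * W2[j] * (1 - h[j]^2)
    dz = [dy * w * (1.0 - hi * hi) for w, hi in zip(W2, h)]
    dW1 = [[d * xi for xi in x] for d in dz]          # dL/dW1[j][i] = dz[j] * x[i]
    db1 = dz[:]
    return dW1, db1, dW2, db2


def numeric_grad_W1(params, x, target, j, i, eps=1e-6):
    """Central finite-difference estimate of dL/dW1[j][i], for checking backprop."""
    W1, b1, W2, b2 = params

    def loss_with_shift(delta):
        W1_shifted = [row[:] for row in W1]
        W1_shifted[j][i] += delta
        return loss((W1_shifted, b1, W2, b2), x, target)

    return (loss_with_shift(eps) - loss_with_shift(-eps)) / (2 * eps)


if __name__ == "__main__":
    params = make_params(n_in=3, n_hidden=4)
    x, target = [0.5, -1.0, 2.0], 0.3
    dW1, _, _, _ = backprop(params, x, target)
    worst = max(abs(dW1[j][i] - numeric_grad_W1(params, x, target, j, i))
                for j in range(4) for i in range(3))
    print("max gradient error:", worst)
```

Doing this once by hand, and then checking it numerically, is roughly the level of detail people mean when they talk about strong fundamentals. After that, using a framework's autodiff feels a lot less like magic.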
Do you mean collecting, cleaning data? Or setting up databases (if yes, how is it different from me managing my employer's databases, except for size)?
In other words, what does a data engineer do all day?
This comes from someone doing web dev after studying data science, so I don't know how well it reflects reality, as I've never worked as a Data Eng. myself.
In the Hollywood example, from what I know, small "tiger teams" assemble with fundamentals, then quickly farm out the sexy work to disposable contracting firms, who then hire even more disposable people with various skill levels. In other words, lots of fun and excitement but also lots of workplace abuses and low stability. Over time, Hollywood formed unions for a surprisingly large number of roles (like writers) because the real truth of business is not pretty. Needless to say, Silicon Valley has moved very quickly, and the Hollywood stories are not exactly applicable.
If you want to apply AI, there are lots of really useful projects that are just calling the Anthropic or OpenAI API for the AI part. Or replicate.com image models etc. That wasn't the case a few years ago before we had the general purpose models. I have been doing a lot of those types of projects and I don't have a machine learning background.
There are ML Ops jobs that don't require a lot of machine learning knowledge.
There are ML researcher jobs that are just training LLMs, which is more practical than theoretical.
I think the only thing that really requires years of study is doing novel machine learning research, or at least significant variations on popular neural network architectures. But there is a very large gap between that type of work and web development, which is why I was very happy to see the progress in general purpose models.
Feel free to reach out if you’re in the EU (email in profile), we’re hiring. Also happy to give some pointers on how to approach these conversations.
Plenty of careers just go away. Might as well pick one where you can stay relevant by picking up incremental/adjacent skills continuously.
Maybe we'd get bored though.
but yes, anecdotally, compared to all of my friends and family, i don't know any profession with those same expectations. to name a few - market researchers, psychologists, primary school teachers.
It seems the author just wants to change their career, not necessarily because they won't be able to earn money if they don't.
You either stay relevant (or close to relevant) or lose your job. Unfortunately, this is the field in the 2020s.
I mean, I just had a PR merged in a language I had literally never used before. It took me five minutes to pick up the basics. Sure it would take much longer to be fully productive, but it would be a comfortable transition.
Because I am still in the academic process, I had the opportunity to take a couple of classes on the subject. Three books that I would recommend going over to make sure your foundation in ML and mathematics is solid are:
- Pattern Recognition and Machine Learning by Christopher Bishop
- Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal and Cheng Soon Ong
- Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville
All three are legally available online in some form. I can't say I have any experience in finding a job related to ML though.