Tesla has bet the company on robotaxis, but their vision-only tech stack doesn’t seem capable of solving it, which is a problem because Tesla has repeatedly promised FSD is right around the corner, or less than a year away. It’s hard to believe Karpathy would step down if he felt they were close to solving the problem anytime soon.
This announcement comes after a four-month sabbatical during which Karpathy said he wanted to take some time off to “sharpen my technical edge,” which makes it sound like this is the result of frustration with the technical approach rather than burnout.
FSD beta tester here. I think they are a minimum of 3 years away from anything beyond an exciting beta, but localization, mapping, and visualization are not the reason. I don't think LIDAR would contribute substantially to improvement.
The fundamental flaws are in the decision-making being based upon 10-30 second feature memory, ignoring features outright, and only depending on visible road features instead of persisted map data.
For instance, near my house there's an intersection where it will try to use a turn-only lane with a red arrow when it's trying to go straight thru a light. 100% of the time. Even if I'm in the correct lane with no traffic around.
That's because the turn arrow on the ground is worn off. It is not a perception problem; it is:
a) deliberately ignoring the obvious red turn-arrow signal's significance for lane selection,
b) making no attempt to persist or consult map data for which lanes go where, and
c) completely disregarding the painted "no drive here" striping on the ground.
Also, one block from my house, FSD will stay still indefinitely waiting for trash cans (displayed as trash cans) to clear the leftmost lane so that it can turn left.
None of the failures I encounter are due to lack of perception.
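The missing map consultation in point (b) is easy to sketch. Below is a toy illustration (all names, structures, and intersection data are hypothetical, not anything Tesla actually uses) of how persisted lane-connectivity data could override worn-off paint:

```python
# Sketch: consulting persisted lane-connectivity data instead of relying
# solely on visible paint. All structures and names are hypothetical.

# Persisted map: for each (intersection, heading), which lanes allow
# which maneuvers.
LANE_MAP = {
    ("elm_and_3rd", "westbound"): {
        0: {"left"},             # leftmost lane: turn-only
        1: {"straight"},
        2: {"straight", "right"},
    },
}

def pick_lane(intersection, heading, desired_maneuver, visible_hints):
    """Prefer persisted map data; fall back to perception-only hints."""
    lanes = LANE_MAP.get((intersection, heading))
    if lanes:
        candidates = [i for i, moves in lanes.items() if desired_maneuver in moves]
        if candidates:
            return candidates[0]
    # Fallback: whatever lane the (possibly worn-off) paint suggests.
    return visible_hints.get(desired_maneuver)

# Even with the painted arrow worn off, the map still says lane 0 is
# turn-only, so a "straight" request selects lane 1.
print(pick_lane("elm_and_3rd", "westbound", "straight", {}))  # -> 1
```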
> I think they are a minimum of 3 years away from anything beyond an exciting beta
I think that's optimistic. Ten plus years.
There are too many exceptions. Take an intersection in my home town. Imagine, if you will:
Traveling westbound there are four straight-ahead lanes, each with its own traffic light. The lights alternate: "left two straight lanes green, right two red," then "left two red, right two green." It does this because there's a tunnel and a roundabout right there. I guarantee that FSD will choke on this.
Tesla has been collecting thousands of dollars, each, from car buyers and utterly failing to deliver what it represented, keeping the money year after year. Would we let GM, Toyota, or Audi do this? Where is the criminal prosecution? Where are the refunds?
I vividly remember a conversation, circa late 2018, with an acquaintance who had recently taken an engineering position at an AV company. He claimed they were at most a year away. In fact, his exact words were something along the lines of "They're already here. It's just a few edge cases to work out and some regulatory hurdles to overcome."
The reality is that the first 80% of the problem had been solved quickly and significant progress had been made at the time on the next 10%. The end was in sight. Unfortunately, that next 10% ended up taking as long as the first 80% to solve, and the final 10% will likely take decades if it's even possible.
It’s naive to call yourself a beta tester. People with access to the beta are basically part of a PR exercise by Tesla, one that tries to hide the lack of meaningful progress behind multiple missed deadlines.
You aren’t beta testing a complex automation system operating on public roads. To do any meaningful testing you need a defined operational domain, specific behaviors to test, a direct line to the engineering team to report issues, etc.
That’s not 3 years away; it’s lacking fundamental knowledge of the world, which might require AGI to begin with. It is just shitty driver-assist features sold as much more.
Isn't Tesla supposed to be producing Optimus, their human-like android, next year?
Elon has been over-promising (i.e. flat-out lying) about self-driving every year since 2014 (there's a YouTube compilation of it).
It seems like his strategy is to just come up with increasingly grandiose promises every year when he fails to deliver on his past promises. He's trapped in his swirling vortex of bullshit. Very worrying to see Karpathy leaving...
I don't recall Tesla ever saying they were producing Optimus as a product any time soon. Musk has said there will be a prototype demo later this year, and a prototype is very far from a finished product. As something businesses or consumers can buy, it's at least 5+ years out.
Yeah, but Google's vision + lidar tech doesn't seem any better at solving it either. They have been working on this problem the longest and they aren't even confident enough to produce a product with it. Google is probably the leader in AI and AI research. They are also the leader in data and mapping. They have billions in cash to play with. Yet it seems like they haven't gotten much closer to solving this problem either.
They are just going about it better, but not trying to sell it.
Any reason why everyone seems to be stuck on this problem?
Waymo and Cruise routinely have driverless cars on city streets. In California, all collisions, however minor, have to be reported, and DMV posts them on their web site.[1] Most are very minor. Here's a more serious one from last month:
"A Cruise autonomous vehicle ("Cruise AV") operating in driverless autonomous mode was traveling eastbound on Geary Boulevard toward the intersection with Spruce Street. As it approached the intersection, the Cruise AV entered the left-hand turn lane, turned the left turn signal on, and initiated a left turn on a green light onto Spruce Street. At the same time, a Toyota Prius traveling westbound in the rightmost bus and turn lane of Geary Boulevard approached the intersection in the right turn lane. The Toyota Prius was traveling approximately 40 mph in a 25 mph speed zone. The Cruise AV came to a stop before fully completing its turn onto Spruce Street due to the oncoming Toyota Prius, and the Toyota Prius entered the intersection traveling straight from the turn lane instead of turning. Shortly thereafter, the Toyota Prius made contact with the rear passenger side of the Cruise AV. The impact caused damage to the right rear door, panel, and wheel of the Cruise AV. Police and Emergency Medical Services were called to the scene, and a police report was filed. The Cruise AV was towed from the scene. Occupants of both vehicles received medical treatment for allegedly minor injuries."
Now, this shows the strengths and weaknesses of the system. The Cruise vehicle was making a left turn from Geary onto Spruce. Eastbound Geary at this point has a dedicated left turn lane cut out of a grass median, two through lanes, a right turn bus/taxi lane, and a bus stop lane. It detected cross traffic that shouldn't have been in that lane and was going too fast. So it stopped, and was hit.
It did not take evasive action, which might have worked. Or it might have made the situation worse. By not doing so, it did the legally correct thing. The other driver will be blamed for this. But it may not have done the thing most likely to avoid an accident. This is the real version of the trolley problem.
Because self-driving has a bunch of tricky edge cases and most of them will kill people. Problems with hundreds of important edge cases cannot be solved by simply throwing more training data at the problem; that's how you solve AI problems in a "dumb" manner, and it works for lots of problems (like recognizing dogs in images) -- but not for self-driving.
To solve the self-driving problem we need "smart" A.I., which means we have to approach it with systematic engineering, and the solution will probably involve some combination of better sensors, introspectable neural nets, symbolic A.I., and logical A.I.
> Any reason why everyone seems to be stuck on this problem?
Because it's really, really difficult. A lot of AI-ish stuff pretty rapidly gets to the point where it _looks_ quite impressive, but struggles to make the jump to actual feasibility. Like, there were convincing demos of voice recognition in the mid-90s. You could buy software to transcribe voice on your home computer, and people did. And, now, well, it's better than in the mid-90s certainly, but you wouldn't trust it to write a transcript, not of anything important. Maybe in 2040 we'll have voice recognition that can produce a perfect transcript, and human transcription will be a quaint old-fashioned concept. But I wouldn't like to bet on it, honestly.
And voice recognition is arguably a far, far easier problem.
> Any reason why everyone seems to be stuck on this problem?
ML maximalism focused on the narrow problem of 'solving driving' while not recognizing that any task as complex as driving probably requires something closer to general intelligence. Theoretically, the field has been impoverished in favor of "throw more graphics cards at everything".
Driving is a social problem, not a technical one. It's functionally the same as walking down a crowded sidewalk. The car is just a tool, just an extension of our bodies.
We can't build a robot which can walk down a sidewalk without running into people either. The sensor tech and mapping fidelity are red herrings. People drive well because only people are good at predicting human behavior.
I paid ~$10 for two rides after signing up as a regular ole user in Mesa AZ. It was great, the first ride was a bit nerve wracking, but the second felt very nice.
I certainly wouldn't argue with you that it isn't ready for prime time and wide distribution, but it is interesting to see their progress in San Francisco, a much different driving problem.
If it takes them 10 years to get to prod in Mesa, two (maybe three?) in SF, maybe they start shrinking that a lot in metros without winters. ¯\_(ツ)_/¯
There's been a lot of progress despite AVs not meeting initial hyped predictions. Waymo and Cruise are operating driverless robotaxis in SF. We're probably a couple of years from many major cities having them.
We often can figure out how to take a product that works 90% and bring it to 99%, and then to 99.9%. The engineering challenges involved in each nine are often vastly more than the percentages indicate, but they're conceivable. With AI we have absolutely no idea how much effort might be required to get to that next level of reliability. We hope that bigger models, or better AI technology might get us there, but there's also a chance that they won't.
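To put rough numbers on the "nines" point: each extra nine of per-mile reliability multiplies the interval between interventions by ten. The figures below are illustrative arithmetic, not measured AV data:

```python
# Illustrative only: miles between interventions at a given per-mile
# reliability. Each extra nine is a 10x jump in required robustness.
for reliability in (0.90, 0.99, 0.999, 0.99999):
    failures_per_mile = 1.0 - reliability
    miles_between = 1.0 / failures_per_mile
    print(f"{reliability:.5f} -> roughly 1 intervention every {miles_between:,.0f} miles")
```

The engineering effort to go from "1 intervention every 100 miles" to "1 every 100,000 miles" is exactly the part nobody knows how to estimate.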
Probably because it's really, really, really hard to solve the thousands of edge cases that occur in real-world driving situations. I don't think FSD happens until government gets behind it and starts putting infrastructure behind it. If we start building roads (and cars) to be highly visible to AI one way or another, it all becomes much easier.
> Any reason why everyone seems to be stuck on this problem?
Because they're all trying visual or line-of-sight methods only. I call this the "robo-human" fallacy in ML: trying to automate the processes that humans perform so that you eventually have a drop-in replacement for a human. But that is a myopic and unimaginative approach, because you could instead be re-assessing the system itself and eliminating the inefficiencies that lead to poor performance.
In the autonomous vehicles space, there is massive potential for self-organizing swarm algorithms to control pelotons of cars, rather than individual cars with no intrinsic sense of the general flow of traffic. You wouldn't need a top-down "commander" style architecture, it could be designed so that cars only talk to their immediate neighbors and emergent patterns keep traffic flowing smooth and fast.
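As a toy illustration of that neighbor-only idea: a simple local averaging rule (a consensus/flocking-style update; the parameters and numbers are made up) lets a line of cars converge to a common speed with no central coordinator:

```python
# Sketch: emergent traffic smoothing from purely local rules. Each car
# adjusts its speed toward the average of its immediate neighbors; there
# is no central "commander". Parameters are illustrative, not from any
# real system.

def step(speeds, gain=0.3):
    """One update: every car nudges toward its neighbors' mean speed."""
    new = []
    for i, v in enumerate(speeds):
        neighbors = speeds[max(0, i - 1):i] + speeds[i + 1:i + 2]
        target = sum(neighbors) / len(neighbors)
        new.append(v + gain * (target - v))
    return new

speeds = [30.0, 10.0, 30.0, 30.0, 5.0]  # two slow cars disturb the flow
for _ in range(50):
    speeds = step(speeds)
print([round(s, 1) for s in speeds])  # speeds drift toward a common value
```

This is the standard result for consensus on a connected graph: local averaging alone is enough for a shared flow to emerge.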
I have always been skeptical of the attempts to reduce the amount of information about the road that a car receives. (Moving from stereoscopic to monocular vision to save the cost of one camera seems just stupid.) But people who dream of "smart cities" really seem to see little more than The Jetsons in their mind, and it limits the scope of research to our detriment.
How is AI supposed to confidently distinguish a real stop sign from someone/something holding up a picture of a stop sign?
Yes, this is a weird edge case, but it gets at the core issue: it takes far more sophistication to release this tech into the wild than people would like to admit.
Google/Alphabet is stuck and will make advances in specific territories but will never get there without a fundamental change in approach. Their approach relies on very detailed mapping/modeling of specific terrain, so they can make a usable case sooner, but outside of the map/model territory, they're literally lost. And maps/models change constantly and rapidly.
Tesla is taking a fundamentally broader and deeper approach, working from the fact that a pair of visual sensors and a compute engine (eyes and brain) can successfully figure out driving in strange areas in real time; ergo, it should be possible without a map/model or lidar. Once they get it solved, it will be solved once and for all. Bigger gamble, bigger payoff. Equipping the car with dozens of eyes is the easy part. The question is whether enough compute power can be brought to bear on the recognition problems and the edge cases. They have obvious issues with failing to recognize large objects like trucks in unexpected orientations, left turns, etc.

Using millions of miles of live human driver data as a training set is great, except that the average driver is really bad, so the data is polluted with bad examples, ESPECIALLY around the edge cases that get people killed. There, examples from professionally trained drivers, who really understand the physics and limits of the car, adhesion, traffic dynamics, etc., are what you want to train on, but that isn't what they have.

It is also possible that even if the training data were sufficient, the big question will kill them: perhaps the solution requires orders of magnitude more compute power to approach human performance, and they just don't have the hardware to match it. So, have they just hit the limits of what their compute power can do?
I think Tesla's approach is fundamentally the way to go, as it is a general solution, compared to everyone else's limited map/model approach.
But both may require more specifically programmed higher-level behaviors, and/or something much closer to AGI than currently exists: something with actual understanding of the machine-learned objects and relationships. If such a system is known, please correct me; I'd love to know about it.
Uber bet the company on robotaxis and lost. Tesla is still building very good cars that happen to not be able to drive themselves. Just like every other car. If they could lose their obsession with self-driving and just focus on their incredible cars, they'd still make money.
They are very much "default alive," BUT I get the sense that a lot of the company's plans and valuation are based on the idea that Tesla will not "just" produce competitive electric cars. They could make money doing that, but they would never meet the expectations around the company's trajectory that way, and so they would be a "bad" investment (in ROI terms). So I think everyone still working on the "more than a car maker" goal is going to insist the company will be more than that until they really can't.
Tesla is so overvalued only because of their claims of self driving. Musk utters a lot of bs but he's right that without this Tesla is doomed.
They have a big head start, but other car companies are now investing much more in battery tech etc. and will quickly catch up. Not to mention Teslas have terrible build quality, and the company has a lot of shady business practices, like overcounting sales and reusing sold parts, which came out in the recent leak.
Things like the 4680 battery, which was so hyped, turn out to be not that much better. FSD is years away. Stop selling vaporware.
What they need to focus on is the things they innovated on: OTA updates, integrated systems, no dealerships, etc.
They have terrible manufacturing quality. The screens melt in Arizona heat. Maybe the ride is cool and feels good, but the car itself is not incredible.
> If they could lose their obsession with self-driving and just focus on their incredible cars, they'd still make money.
They may want to think about that strategy soon. Model 3 is starting to seem dated (not to mention Model S, which is ten years old). There are very competitive alternatives on the market now that have strengths where Tesla is weak, and which are not especially weak in the areas Tesla is strong.
How so? They're not selling robotaxis or building factories to build them
> Tesla has repeatedly promised FSD is right around the corner
Which means it's years away and/or "FSD" means "automatic cruise control and lane keep assist" or whatever standard feature from auto manufacturers they've renamed
Because they chose to back themselves into that corner. Musk says that Tesla is worth nothing without full self-driving. Certainly it's the only thing left to justify the stock price.
The lies have been profitable so far. People have bought into the false promises. Perhaps they'll start demanding refunds for the full self-driving they paid for that has still not been delivered.
> This announcement comes after a four-month sabbatical during which Karpathy said he wanted to take some time off to “sharpen my technical edge,” which makes it sound like this is the result of frustration with the technical approach rather than burnout.
That's exactly what I would expect someone burning out to say. You feel the burnout so you need time to get over it and feel 100% (regain your technical edge). You're still burnt out after 4 months, so you don't come back.
Frustration with the technical approach can also cause burnout.
Yes, the vision-only approach isn't something that you do in robotics.
There is usually a hierarchy of sensors, mainly for redundancy.
Example: bumper sensors at the wheel base, sonar/lidar at mid-height, and a camera at the top for advanced sensing.
For the sake of cost cutting, Tesla has done away with the radar sensors at the front of the vehicle. Keeping them would be a substantial cost overhead, but removing them has very real repercussions for safety, since radar also provides a "ground truth" for what at least the front-facing cameras are seeing.
I don't think lidar is a practical sensor for them to adopt, because it is quite bulky and has limited viewing angles, but I would have expected them to adopt some novel, lower-cost radar solution.
Apart from the lower cost of the cameras, I think Elon's rationale for camera-only FSD is not valid, and it has made the problem needlessly complex and unsafe. He believes that since we have eyes and we can drive a car, eyes should be sufficient to drive the car. But we use only eyes because those are the sensors we were born with; they are the best we have. In my mind, Elon's approach is like looking at a horse and deciding to build a car based on a horse: instead of wheels you have four mechanical legs, and those mechanical legs are limited in so many ways. They might still "work," but there is no reason to limit locomotion that way. The same goes for the vision system in FSD: the whole spectrum of light is available, in any number of configurations, providing data at rates and with precision far beyond what a camera system can do.
IIRC there was a presentation from Karpathy talking about the challenges with sensor fusion, particular in resolving divergence between e.g. the vision and the radar stack: https://www.youtube.com/watch?v=NSDTZQdo6H8&t=1949s
My background is in physics, but I find myself having a growing appreciation for the vision-only stack. It's really challenging to build a formal understanding of the world that is robust to outliers as numerous as those encountered navigating an urban environment. With vision, you have multiple kinds of information that are highly correlated and self-consistent (colour, spatial distribution, depth, etc.). Fusing radar with vision, where object responses to radar are highly geometry- and material-dependent, is a much harder task.
I'm really not an expert, so this reads more as an opinion than an experienced view, but I can see the merits in doubling down on vision.
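The fusion difficulty described above shows up even in the simplest textbook scheme, inverse-variance weighting of two range estimates (the numbers below are purely illustrative):

```python
# Sketch of the sensor-fusion divergence problem: combining a vision
# depth estimate with a radar range via inverse-variance weighting
# (a standard static fusion rule; all numbers are illustrative).

def fuse(z_vision, var_vision, z_radar, var_radar):
    """Inverse-variance weighted fusion of two range estimates (meters)."""
    w_v = 1.0 / var_vision
    w_r = 1.0 / var_radar
    fused = (w_v * z_vision + w_r * z_radar) / (w_v + w_r)
    fused_var = 1.0 / (w_v + w_r)
    return fused, fused_var

# Agreement: both sensors see the car ahead at ~50 m.
print(fuse(50.2, 4.0, 49.8, 1.0))  # lands close to the radar (lower variance)

# Divergence: radar returns a spurious 12 m bounce (say, an overhead sign).
# Naive fusion drags the estimate far from what vision reports; deciding
# which sensor to believe is the hard part, not the weighted average.
print(fuse(50.2, 4.0, 12.0, 1.0))
```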
Monocular vision alone seems pretty clearly incapable of solving the problem. Stereo/multi-view systems have a shot (humans are proof), but Tesla bet against that long ago. I wonder what could have been with a proper multi-view setup.
Humans are perfectly capable of driving on racing simulators using a flat screen though, where binocular vision makes no difference.
And Tesla cars have more than one camera on them. The front-facing camera is actually an array of 3 cameras (the two farthest ones are at about human eyes distance), but they're also equipped with forward and rearward looking side cameras, and back cameras.
I think Tesla underestimated how hard vision-only FSD is, but having a single camera (they don't) is not the reason.
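Worth noting on the stereo point: depth from a two-camera rig follows Z = f * B / d, and a human-eye-scale baseline gets coarse quickly at range. A back-of-envelope sketch (the camera parameters here are assumptions for illustration, not Tesla's actual specs):

```python
# Stereo depth from disparity: Z = f * B / d. Parameters below are
# rough guesses for a human-eye-scale rig, purely for illustration.

def stereo_depth(focal_px, baseline_m, disparity_px):
    """Depth in meters of a point seen with the given pixel disparity."""
    return focal_px * baseline_m / disparity_px

f = 1000.0   # focal length in pixels (assumed)
B = 0.065    # ~human-eye baseline in meters, as the comment above notes
print(stereo_depth(f, B, 10.0))  # -> 6.5 m
print(stereo_depth(f, B, 1.0))   # -> 65 m: out here, a 1 px error is huge
```

The second case is why a narrow baseline limits long-range stereo: at highway distances the disparity shrinks to a pixel or two, and depth precision collapses.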
BTW, Andrej, if you're reading this: minGPT is not just excellent, it is beyond excellent. I do a lot of tinkering with transformers and other models lately, and base them all on minGPT. My fork is now growing into a kind of monorepo for deep-learning experimentation, though lately it has started looking like a repo of Theseus, and the boat is not so simple anymore :)
> but their vision-only tech stack doesn’t seem capable of solving it
Well, I'm not sure that anyone's tech stack is capable of solving it; the live examples of robotaxis are, well, not something you'd bet your company on (and generally their creators are _not_ betting their companies on them). There was, I think, a decade ago the idea that fully self-driving cars were a near-term inevitability. That's fading, now.
I think a lot of that came from the Tesla hype machine creating a strong association between electric and self-driving as the immediate future of cars in the popular consciousness. So when people saw electric becoming a reality, they assumed self-driving was right around the corner, when in actuality the two maturity levels aren't related much at all. Fallacious thinking that may doom a few companies among Lyft, Uber, and Tesla.
I don't see how smart roads would solve the issues of "the unexpected".. weird pedestrian behavior, getting cut off, parked trucks, construction, etc. It seems to me that smart roads would only solve the issue of general routing, and that already seems to be dealt with as far as I can tell.
How is that any different from geofencing? The goal of an AV should be a car that can drive on any country's roads, not just fancy smart roads in Western countries.
I've long had this fantasy of a smart road pilot program wherein manufacturers would partner with governments (or toll road owners in some places?) to make self-driving-only smart lanes, where you have to surrender control but the car goes 120+mph. I imagine getting even a handful of popular longer routes enabled for that would be quite popular.
> if he felt they were close to solving the problem anytime soon
He felt? It's evident enough that the approach they used doesn't allow them to prepare FSD for real life and real streets. I think he just understood that the approach has to be changed or improved significantly to reach the goal.
If humans can master driving with 2 eyes looking forward, why would a car with plenty of cameras in all directions not have sufficient sensory input to master it?
Same reason airplanes don't fly by flapping their wings like birds. There's not always a biological equivalent for solving a problem, especially when you take into account the human brain's millions of years of evolution. Sometimes computers need more help.
> This announcement comes after a four-month sabbatical during which Karpathy said he wanted to take some time off to “sharpen my technical edge,” which makes it sound like this is the result of frustration with the technical approach rather than burnout.
I think Karpathy realized (probably way back) that cheap sensors + no HD maps + their (reckless) public testing feedback loop doesn't advance towards L5 self driving and is bailing out. Karpathy has always backed Elon Musk whenever he talks about their technical approach, so it can't be frustration with the approach all of a sudden.
If Musk actually said this he's an even bigger idiot than I gave him credit for. He makes the best cars in the world (obviously IMHO) and that's worth a lot more than 0. I couldn't give less of a damn if they can't drive themselves.
I ended up selling mine because the build quality started getting on my nerves. Misaligned panels that are "within spec" or interior rattles that make the ride completely unpleasant (at only 10k miles).
The tech is all right, and I got to try auto pilot at a discount. Unfortunately the phantom braking made AP completely useless with passengers who would freak out and complain. However, when it worked it was quite nice but I ended up using it way less than I'd hoped. Glad I didn't pay 12k for it!
The best part of owning the car was the insane acceleration and, at the beginning, the Supercharger network. But that got annoying as more people started getting Teslas. Going on longer trips meant a ton of anxiety, especially since some Superchargers in cities would be packed. Worse, some would be out of order or slow to charge. After the gimmick wore off, wasting 45 minutes to go another 2-3 hours started becoming annoying. And before someone asks why 2-3 hours: it's called hills. California is full of them, and where I live especially I lose a lot of efficiency climbing hills.
Anyway overall I'd rate the car 5/10. Fastest car I've ever owned. Beyond that it was pretty much exactly as they described - a beta product. I'll probably try a Tesla again in 5-10 years.
As an owner for 5+ years, the cars show well but pretty quickly become a pain in the ass.
They'll get eclipsed by other electric car manufacturers real quick.
Edit: more specifically, the parts break and they are difficult to replace. The battery degrades. They stopped providing maps to the vehicle unless I'm willing to spend several hundred dollars to replace the media console, they've told me I'm covered by a recall/warranty but have been unable to schedule the appointment.
I was prepared to buy a Tesla, and even planned on doing the full acceptance process (https://github.com/polymorphic/tesla-model-y-checklist) at the factory (a short drive from my house). But after Musk announced the Model Y was losing the radar, and after reading a bit more about their problems with FSD Beta (which I was planning to spend $10K on), I concluded their cars were not the best in the world. I went and got a Toyota, which could pretty much have its engine compartment welded shut for ten years (modulo oil changes) and run perfectly.
It's a real shame because in principle having a manufacturing facility making great electric cars in the bay area would be a real win. Musk's reality distortion field is cracking.
IMHO, the best cars in the world are still Toyotas - not the bleeding edge, mind you, just more reliability, overall "gets out of the way and does the job" in the most effective manner possible.
In theory if Tesla just figures out how to make batteries at scale they could be quite successful even if their sales numbers plateau. But in practice it doesn't seem like auto part wholesalers have a whole hell of a lot of power to set prices. Some of the better known brands are subsidiaries, and a few seem to have started as such.
I think his point is that FSD is an inevitability and when it comes it will generally upend the way we think about cars. Manual driving may always be a thing but it would become more of a passion than a way to get to work in the morning. The vast majority of value would come from autonomous transportation and logistics.
It was exaggeration as a figure of speech to suggest that if self-driving is solved, it will make the EV business look incredibly small.
What's shocking is how many people interpreted it literally--that the value is literally zero without self-driving--as if the successful EV business is in fact unsuccessful.
How are they the best cars in the world? I saw you add IMHO, but unless you mean it the way children say T. rex is the best dino, it doesn't square with reality.
At least give one metric on which they are measurably "the best".
The cars are real but they are... Mediocre. The first year of ownership is an amazing honeymoon period, assuming you got one with decent build quality. But the parts are cheap, and they break quickly. I've had one for 5+ years and it's gradually become something I prefer driving less and less.
For how long, though? It seems like traditional automakers have mostly figured out EVs now. I love my Chevy EV, and it's 4 years old now. Similarly, the Kia EV I recently drove was excellent (although the Bolt's one-pedal driving is better IMHO).
But in itself, just making one of the best electric cars today would justify a valuation of 1/10 of what Tesla currently has.
I still admire Musk and Tesla for having started the electric revolution. But by 2025 (and maybe already), they will just be one of many electric car manufacturers, somewhere in the middle of the pack.
> Does anybody seriously think Karpathy would step down if FSD was really close to be released ??
Life is not only about work. I stepped down from the company I founded after 14 years; it never stopped growing after that. Some people just get bored after doing one thing for a long time and want to explore other areas, especially if they have always had broad interests.
> It really starts to feel like Tesla is a huge fraud which is about to be uncovered.
I seriously doubt that my Model S will somehow stop working so well and turn out to be just a fake car after 5 years.
Tesla has been over promising for sure. But it’s not a fraud. Had it been it would have met the same fate as Nikola. I am for heavy fines on Tesla for their lofty claims, but I don’t know why people want to see Tesla destroyed. They do have a profitable electric car business with no fraud.
Karpathy would step down if his vesting schedule is over and he failed to negotiate another stock option grant. TSLA is public; its value was inflated by at least an order of magnitude last year, so any rational person would cash out hugely on that.
I've really gotten into self-driving the past couple of years. Watched every video I can find on it. Got a Toyota RAV4, bought a Comma 2, then later a Comma 3. Really enjoy Comma AI as a product on the highway and some in town. That being said, self-driving on city streets is a solution to a problem that literally doesn't exist. On the highway you can relax, mostly because the car is keeping a straight line and not much is happening. As soon as the car is negotiating stop signs, making turns, etc., it stops being fun and relaxing and starts becoming anxiety-inducing.
The only thing that needs to be solved fully is highway driving with automatic lane changes. That's literally it. And the system needs to do eye tracking like Comma AI's. If your adaptive cruise control makes you put your hand on the wheel, it's useless.
You're mostly right about long distance commuting/travel... but FSD in cities is needed for trucking/delivery/taxis to be fully automated. If you need to pay for a driver to drive then a lot of value is going to be left on the table still.
It's starting to look like we'll need general-intelligence AI to solve self-driving fully within cities. For trucking, let AI solve the long-distance stuff, then have drivers drive the last few miles. I don't know if it would work, but it's worth thinking about. That seems like a much closer thing than having AI negotiate an 18-wheeler through a city.
That's pretty much a totally different category of employee, though:
"Tesla is laying off 229 data annotation employees who are part of the company’s larger Autopilot team [...] Most of the workers were in moderately low-skilled, low-wage jobs, such as Autopilot data labeling"
Data annotation can be cheaply outsourced or scaled up and down without much affecting progress on self-driving. Karpathy leaving says more about that progress to me.
On top of that, this might signal that Tesla is confident enough in their autolabeling that they no longer need as much human intervention. This job title theoretically should be temporary.
> Data annotation can be cheaply outsourced or scaled up
Only if you don't care about quality. In my experience using annotation companies leads to a conflict of interest - annotate more to earn more, or annotate better.
I shared a desk with Andrej when he was interning at our lab for the summer at the University of Toronto. One of the smartest people I ever connected with about all CS topics. Hopefully he finds what he is looking for and takes a well earned break.
1) It sometimes can be hard to leave a company when you are "in the thick of it." A sabbatical can give you personal time to reflect on whether you want to stay or not.
2) Sometimes people use sabbaticals to prep/perform job interviews or plan career transitions.
3) Sabbaticals can allow you to quit early while waiting for vesting restricted stock units, employee stock plan sales, retirement contributions (matches), etc. There are certainly many more timed bonuses available for senior leaders.
At some companies sabbaticals are an explicit part of the benefits package. Every X years, you'll get a paid sabbatical of Y months. If you're close to X, why not wait and get the Y before you leave?
When a key person leaves, a lot of things can go haywire. If they take a break first, they can be semi-available to help resolve the most urgent issues. Kind of like turning off the circuit at the breaker to find out what is plugged into it.
Traveling westbound, there are four straight-ahead lanes, with a traffic light for each. The lights alternate: left two lanes green, right two red; then left two red, right two green. It works this way because there's a tunnel and a roundabout right there. I guarantee that FSD will choke on this.
Tesla has been collecting thousands of dollars, each, from car buyers and utterly failing to deliver what it represented, keeping the money year after year. Would we let GM, Toyota, or Audi do this? Where is the criminal prosecution? Where are the refunds?
I vividly remember a conversation I had with an acquaintance who had recently taken an engineering position with an AV company that occurred circa late 2018. He claimed that they were at most 1 year away. In fact, his exact words were something along the lines of "They're already here. It's just a few edge cases to work out and some regulatory hurdles to overcome."
The reality is that the first 80% of the problem had been solved quickly and significant progress had been made at the time on the next 10%. The end was in sight. Unfortunately, that next 10% ended up taking as long as the first 80% to solve, and the final 10% will likely take decades if it's even possible.
You aren't beta testing a complex automation system that's operating on public roads. To do any meaningful testing you need a defined operational domain, specific behaviors to test, a direct line to the engineering team to report issues, etc., etc.
Elon has been over-promising (i.e. flat-out lying) about self-driving every year since... 2014? (There's a YouTube video compilation of it.)
It seems like his strategy is to just come up with increasingly grandiose promises every year when he fails to deliver on his past promises. He's trapped in his swirling vortex of bullshit. Very worrying to see Karpathy leaving...
Steve Jobs was also known for his “reality distortion field”, so maybe it comes with the territory.
That said, FSD progress seems asymptotic and the Optimus thing always seemed like bullshit.
Elon in 2026: “by 2028 we’ll have FTL drives”
Elon in 2028: “time machine!”
They are just going about it better, but not trying to sell it.
Any reason why everyone seems to be stuck on this problem?
"A Cruise autonomous vehicle ("Cruise AV") operating in driverless autonomous mode, was traveling eastbound on Geary Boulevard toward the intersection with Spruce Street. As it approached the intersection, the Cruise AV entered the left hand turn lane, turned the left turn signal on, and initiated a left turn on a green light onto Spruce Street. At the same time, a Toyota Prius traveling westbound in the rightmost bus and turn lane of Geary Boulevard approached the intersection in the right turn lane. The Toyota Prius was traveling approximately 40 mph in a 25 mph speed zone. The Cruise AV came to a stop before fully completing its turn onto Spruce Street due to the oncoming Toyota Prius, and the Toyota Prius entered the intersection traveling straight from the turn lane instead of turning. Shortly thereafter, the Toyota Prius made contact with the rear passenger side of the Cruise AV. The impact caused damage to the right rear door, panel, and wheel of the Cruise AV. Police and Emergency Medical Services were called to the scene, and a police report was filed. The Cruise AV was towed from the scene. Occupants of both vehicles received medical treatment for allegedly minor injuries."
Now, this shows the strengths and weaknesses of the system. The Cruise vehicle was making a left turn from Geary onto Spruce. Eastbound Geary at this point has a dedicated left turn lane cut out of a grass median, two through lanes, a right turn bus/taxi lane, and a bus stop lane. It detected cross traffic that shouldn't have been in that lane and was going too fast. So it stopped, and was hit.
It did not take evasive action, which might have worked. Or it might have made the situation worse. By not doing so, it did the legally correct thing. The other driver will be blamed for this. But it may not have done the thing most likely to avoid an accident. This is the real version of the trolley problem.
[1] https://www.dmv.ca.gov/portal/vehicle-industry-services/auto...
[2] https://earth.google.com/web/@37.78169591,-122.45337171
[3] https://patch.com/california/san-francisco/speed-limit-lower...
To solve the self-driving problem we need "smart" A.I., which means we have to approach it with systematic engineering, and the solution will probably involve some combination of better sensors, introspectable neural nets, symbolic A.I., and logical A.I.
Because it's really, really difficult. A lot of AI-ish stuff pretty rapidly gets to the point where it _looks_ quite impressive, but struggles to make the jump to actual feasibility. Like, there were convincing demos of voice recognition in the mid-90s. You could buy software to transcribe voice on your home computer, and people did. And, now, well, it's better than in the mid-90s certainly, but you wouldn't trust it to write a transcript, not of anything important. Maybe in 2040 we'll have voice recognition that can produce a perfect transcript, and human transcription will be a quaint old-fashioned concept. But I wouldn't like to bet on it, honestly.
And voice recognition is arguably a far, far easier problem.
ML maximalism focused on the narrow problem of 'solving driving' while not recognizing that any task as complex as driving probably requires something closer to general intelligence; theoretically, the field has been impoverished in favor of "throw more graphics cards at everything".
We can't build a robot which can walk down a sidewalk without running into people either. The sensor tech and mapping fidelity are red herrings. People drive well because only people are good at predicting human behavior.
I certainly wouldn't argue with you that it isn't ready for prime time and wide distribution, but it is interesting to see their progress in San Francisco, a much different driving problem.
If it takes them 10 years to get to prod in Mesa, two (maybe three?) in SF, maybe they start shrinking that a lot in metros without winters. ¯\_(ツ)_/¯
I thought they had real self-driving taxis in Phoenix that you can order? Real ones, with no safety driver.
That definitely sounds "better", even if it is heavily geo-fenced.
Deleted Comment
Because they're all trying visual- or line-of-sight methods only, I call this the "robo-human" fallacy in ML: trying to automate the processes that humans undergo so that you eventually have a drop-in replacement for a human. But that is a myopic and unimaginative approach because you could be re-assessing the system itself and eliminating inefficiencies that lead to poor performance.
In the autonomous vehicles space, there is massive potential for self-organizing swarm algorithms to control platoons of cars, rather than individual cars with no intrinsic sense of the general flow of traffic. You wouldn't need a top-down "commander" style architecture; it could be designed so that cars talk only to their immediate neighbors and emergent patterns keep traffic flowing smooth and fast.
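As a toy illustration of the local-rules idea (a hypothetical sketch with invented numbers, not any real vehicle-to-vehicle protocol), a 1-D platoon where each car senses only the gap to its immediate leader already converges to uniform spacing with no central commander:

```python
# Toy 1-D "platoon" simulation: each car senses only the gap to its
# immediate leader and adjusts speed locally; uniform spacing emerges
# without any central coordinator.

def step(positions, target_gap=10.0, gain=0.5):
    """One control tick. positions[0] is the lead car (constant speed)."""
    new = [positions[0] + 1.0]  # leader cruises at 1 unit/tick
    for i in range(1, len(positions)):
        gap = positions[i - 1] - positions[i]
        # speed up if the gap is too big, slow down if too small
        speed = 1.0 + gain * (gap - target_gap)
        new.append(positions[i] + max(0.0, speed))
    return new

cars = [100.0, 95.0, 80.0, 78.0]  # uneven initial spacing
for _ in range(200):
    cars = step(cars)

gaps = [cars[i] - cars[i + 1] for i in range(len(cars) - 1)]
print([round(g, 2) for g in gaps])  # gaps settle near target_gap
```

The point is only that purely local feedback can yield coherent global flow; real platooning would of course need collision guarantees, latency handling, and merging logic far beyond this sketch.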
I have always been skeptical of the attempts to reduce the amount of information about the road that a car receives. (Moving from stereoscopic to monocular vision to save the cost of one camera seems just stupid.) But people who dream of "smart cities" really seem to see little more than The Jetsons in their mind, and it limits the scope of research to our detriment.
How is AI supposed to confidently distinguish a real stop sign from someone/something holding up a picture of a stop sign?
Yes, this is a weird edge case, but I think it gets at the core issue: it takes way more sophistication to release this tech into the wild than people would like to admit.
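One hedged sketch of the kind of cross-check an AV stack could apply here (all constants below are invented for illustration): a regulation stop sign has a roughly known physical size, so under a pinhole camera model the apparent size and the estimated range should agree. A hand-held photo of a sign reconstructs as far too small for its distance:

```python
# Pinhole-model sanity check: estimated physical size = pixel size * depth / focal length.
# A real stop sign is roughly 0.75 m across; a photo of one held up close
# reconstructs as much smaller than that.

REAL_SIGN_M = 0.75     # nominal width of a real stop sign (assumed value)
FOCAL_PX = 1000.0      # camera focal length in pixels (made-up value)

def plausible_stop_sign(width_px, depth_m, tolerance=0.33):
    """Flag detections whose reconstructed size is inconsistent with a real sign."""
    est_width_m = width_px * depth_m / FOCAL_PX
    return abs(est_width_m - REAL_SIGN_M) / REAL_SIGN_M < tolerance

print(plausible_stop_sign(width_px=75, depth_m=10.0))  # ~0.75 m -> True
print(plausible_stop_sign(width_px=75, depth_m=2.0))   # ~0.15 m -> False
```

Note the catch: this check presumes a reliable depth estimate, which a vision-only stack must itself infer, so it illustrates the sophistication problem rather than solving it.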
Deleted Comment
Tesla is taking a fundamentally more broad and deep approach - working with the fundamental fact that a pair of visual sensors and a compute engine (eyes & brain) can successfully figure out driving in strange areas in real time, ergo, it should be possible without a map/model or lidar. Once they get it solved, it will be solved once and for all. Bigger gamble, bigger payoff.

Equipping the car with dozens of eyes is the easy part. The question is whether enough compute power can be brought to bear on solving the recognition problems, and the edge cases. They have obvious issues with failing to recognize large objects like trucks in unexpected orientations, left turns, etc.

Using millions of miles of live human driver data as a training set is great, except that the average driver is really bad, so it's entirely polluted with bad examples, ESPECIALLY around the edge cases that get people killed. There, examples from professionally trained drivers, who really understand the physics and limits of the car, adhesion, traffic dynamics, etc., are what you want to train on, but that isn't what they have.

It is also possible that even if the set of training data would actually be sufficient, the big question will kill them - perhaps the solution requires orders of magnitude more compute power to approach human performance, and they just don't have the hardware to simulate human compute power. So, have they just hit the limits of what their compute power can do?
I think Tesla's approach is fundamentally the way to go, as it is a general solution, compared to everyone else's limited map/model approach.
But both may require more specifically programmed higher-level behaviors, and/or something much closer to AGI than exists today - something with actual understanding of the machine-learned objects and relationships, which does not yet exist (if one is known, please correct me - I'd love to know about it).
They have a big head start, but other car companies are now investing much more in battery tech etc. and will quickly catch up. Not to mention Teslas have terrible build quality, and the company has a lot of shady business practices, like overcounting sales and reusing sold parts, which came out in the recent leak.
Things like the 4680 battery, which were so hyped, turned out to be not that much better. FSD is years away. Stop selling vaporware.
What they need to focus on is things they innovated on like OTA updates, integrated systems, no dealerships etc.
They have terrible manufacturing quality. The screens melt in Arizona heat. Maybe the ride is cool and feels good, but the car itself is not incredible.
They may want to think about that strategy soon. Model 3 is starting to seem dated (not to mention Model S, which is ten years old). There are very competitive alternatives on the market now that have strengths where Tesla is weak, and which are not especially weak in the areas Tesla is strong.
Deleted Comment
How so? They're not selling robotaxis or building factories to build them
> Tesla has repeatedly promised FSD is right around the corner
Which means it's years away and/or "FSD" means "automatic cruise control and lane keep assist" or whatever standard feature from auto manufacturers they've renamed
Because they chose to back themselves into that corner. Musk says that Tesla is worth nothing without full self-driving. Certainly it's the only thing left to justify the stock price:
https://electrek.co/2022/06/15/elon-musk-solving-self-drivin...
> Which means it's years away and/or "FSD" means "automatic cruise control and lane keep assist"
Well, more precisely it means Musk has been lying about it for nine years straight:
https://jalopnik.com/elon-musk-promises-full-self-driving-ne...
The lies have been profitable so far. People have bought into the false promises. Perhaps they'll start demanding refunds for the full self-driving they paid for that has still not been delivered.
That's exactly what I would expect someone burning out to say. You feel the burnout so you need time to get over it and feel 100% (regain your technical edge). You're still burnt out after 4 months, so you don't come back.
Frustration with the technical approach can also cause burnout.
There is usually a hierarchy of sensors, mainly for redundancy. Example: bumper sensors at the wheel base, sonar/lidar at mid-height, and a camera at the top for advanced sensing.
For the sake of cost cutting, Tesla has done away with the radar sensors at the front of the vehicle. Radar is a substantial cost overhead, but removing it has very real repercussions for safety, since radar also provided a "ground truth" against which at least the front-facing cameras could be checked.
I don't think lidar is a practical sensor for them to adopt, because it is quite bulky and has limited viewing angles, but I would have expected them to adopt some novel, lower-cost radar solution.
Apart from the lower cost of cameras, I think Elon's rationale for camera-only FSD is not valid, and it has made the problem needlessly complex and unsafe. He believes that since we have eyes and can drive a car, cameras should be sufficient to drive the car. But we only use eyes because those are the sensors we were born with; they are the best we have. In my mind, Elon's approach is like looking at a horse and deciding to build a car based on a horse: instead of wheels, you have four mechanical legs, and those legs are limited in so many ways. They might still at least "work", but there is no reason to limit locomotion that way. The same goes for the vision system of an FSD car: the whole spectrum of light is available, in any number of sensor configurations, providing data at rates and with precision far beyond what a camera system can do.
My background is in physics, but I find myself having a growing appreciation for the vision-only stack. It's really challenging to build a formal understanding of the world that is robust to outliers as numerous as those encountered navigating an urban environment. With vision, you have multiple kinds of information that are highly correlated (colour, spatial distribution, depth, etc.) and self-consistent. Whereas fusing radar with vision, where object responses to radar are highly geometry- and material-dependent, is a much harder task.
I'm really not an expert, so this reads more as an opinion than an experienced view, but I can see the merits in doubling down on vision.
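For intuition on why fusion hinges on knowing each sensor's error model, here is a textbook inverse-variance weighting sketch (nothing Tesla-specific; all numbers are invented). It only beats either sensor alone if the variances fed in are right, and radar variance shifting with geometry and material is exactly what makes that hard:

```python
def fuse(z_cam, var_cam, z_radar, var_radar):
    """Minimum-variance combination of two independent range estimates.

    Weights are inversely proportional to each sensor's noise variance,
    so the noisier sensor contributes less to the fused estimate.
    """
    w = var_radar / (var_cam + var_radar)  # weight on the camera estimate
    z = w * z_cam + (1 - w) * z_radar
    var = (var_cam * var_radar) / (var_cam + var_radar)
    return z, var

# Camera says 20 m (variance 4), radar says 21 m (variance 1):
# radar gets 4x the weight, and the fused variance is tighter than either.
z, var = fuse(20.0, 4.0, 21.0, 1.0)
print(z, var)  # 20.8 0.8
```

If the assumed radar variance is wrong for a given target (a flat metal sign vs. a pedestrian, say), the fused estimate can end up more confident and more wrong than vision alone, which is the usual argument for why naive fusion can hurt.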
And Tesla cars have more than one camera on them. The front-facing camera is actually an array of 3 cameras (the two farthest ones are at about human eyes distance), but they're also equipped with forward and rearward looking side cameras, and back cameras.
I think Tesla underestimated how hard vision-only FSD is, but having a single camera (they don't) is not the reason.
LOL no, he was jumping ship already.
BTW, Andrej, if you're reading this: minGPT is not just excellent, it is beyond excellent. I do a lot of tinkering with transformers and other models lately, and base them all on minGPT. My fork is now growing into a kind of monorepo for deep learning experimentation, though lately it has started looking like a repo of Theseus, and the boat is not as simple anymore :)
Well, I'm not sure that anyone's tech stack is capable of solving it; the live examples of robotaxis are, well, not something you'd bet your company on (and generally their creators are _not_ betting their companies on them). There was, I think, a decade ago the idea that fully self-driving cars were a near-term inevitability. That's fading, now.
Both clauses seem wrong.
How so?
If humans can master driving with 2 eyes looking forward, why would a car with plenty of cameras in all directions not have sufficient sensory input to master it?
The problem is the software, not the sensors.
You're using Elon's own argument, btw. Are you repeating that knowingly?
I think Karpathy realized (probably way back) that cheap sensors + no HD maps + their (reckless) public testing feedback loop doesn't advance towards L5 self driving and is bailing out. Karpathy has always backed Elon Musk whenever he talks about their technical approach, so it can't be frustration with the approach all of a sudden.
Tesla filed with the FCC in May to get authorization for a new radar system.
Does anybody seriously think Karpathy would step down if FSD was really close to be released ??
It really starts to feel like Tesla is a huge fraud which is about to be uncovered.
In fairness, this is just another in the long line of ridiculous things that he is prone to saying.
The tech is all right, and I got to try auto pilot at a discount. Unfortunately the phantom braking made AP completely useless with passengers who would freak out and complain. However, when it worked it was quite nice but I ended up using it way less than I'd hoped. Glad I didn't pay 12k for it!
The best part of owning the car was the insane acceleration and the Supercharger network, at the beginning. But that got annoying as more people started getting Teslas. Going on longer trips meant a ton of anxiety, especially since some Superchargers in cities would be packed. Worse, some would be out of order or slow to charge. After the gimmick wore off, wasting 45 minutes to go another 2-3 hours started becoming annoying. And before someone asks why 2-3 hours: it's called hills. California is full of them, and especially where I live I lose so much efficiency climbing hills.
Anyway overall I'd rate the car 5/10. Fastest car I've ever owned. Beyond that it was pretty much exactly as they described - a beta product. I'll probably try a Tesla again in 5-10 years.
They'll get eclipsed by other electric car manufacturers real quick.
Edit: more specifically, the parts break and they are difficult to replace. The battery degrades. They stopped providing maps to the vehicle unless I'm willing to spend several hundred dollars to replace the media console, they've told me I'm covered by a recall/warranty but have been unable to schedule the appointment.
It's a real shame because in principle having a manufacturing facility making great electric cars in the bay area would be a real win. Musk's reality distortion field is cracking.
As for luxury, quality, ride comfort - they’re just ok
What's shocking is how many people interpreted it literally--that the value is literally zero without self-driving--as if the successful EV business is in fact unsuccessful.
Like, at least give one metric on which they are measurably “the best”.
But are the financials of the company also real ? The prospects of future products ? Robotaxis, FSD, Cybertruck, Semi ?
But in itself, just making one of the best electric cars today would justify a valuation of 1/10 of what Tesla currently has.
I still admire Musk and Tesla for having started the electric revolution. But by 2025 (and maybe it is already true), they will just be one of many electric car manufacturers - somewhere in the middle of the pack.
Life is not only about work. I stepped down from the company I founded after 14 years; it never stopped growing after that. Some people just get bored after doing one thing for a long time and want to explore other areas, especially if they have always had broad interests.
> It really starts to feel like Tesla is a huge fraud which is about to be uncovered.
If so, I must be imagining that I've actually been driving an EV for the past four years.
I must’ve dreamt having to drive only 2 hours out of a 12 hour road trip.
Bugs? Yea. Some really dangerous ones we all have read about. Manufacturing defects? Yup! Missed Deadlines? Hell yeah.
But a fraud? Nope.
It was always the case.
Just perfect highway driving first. Trucking alone would massively change if goods could be transported to edge of urban areas autonomously.
I'm sure it's being worked on too, but it's puzzling why that use case is not priorities 1, 2, and 3.
Why do people leave companies in this manner?
Employer: Are you sure? Why don’t you take some time off and think about it?
Deleted Comment
Deleted Comment