Cruise AVs are remotely assisted (RA) 2-4% of the time on average in complex urban environments. That is already low enough that there isn't a huge cost benefit to optimizing much further, especially given how useful it is to have humans review things in certain situations.
The stat quoted by the NYT is how frequently the AVs initiate an RA session. Many of those are resolved by the AV itself before the human even looks at them, since we often have the AV initiate proactively, before it is certain it will need help. Many sessions are quick confirmation requests ("is it OK to proceed?") that are resolved in seconds. Some take longer and involve guiding the AV through tricky situations. Again, in aggregate this is 2-4% of time in driverless mode.
In terms of staffing, we are intentionally overstaffed given our small fleet size in order to handle localized bursts of RA demand. With a larger fleet we expect to handle bursts with a smaller ratio of RA operators to AVs. Lastly, I believe the staffing numbers quoted by the NYT include several other functions involved in operating fleets of AVs beyond remote assistance (people who clean, charge, maintain, etc.), all of which also improve significantly with scale and over time.
https://images.ctfassets.net/95kuvdv8zn1v/6h1C7lPC79OLOlddEE...
They and their VC backers are clearly betting that radar + lidar + cameras will be the ultimate winning sensor suite for fully self-driving cars, a design and engineering philosophy completely opposite to Tesla's attempt to do "full self driving" with camera sensors alone and a categorical rejection of lidar.
It is interesting to me that right now this is sitting on the HN homepage directly adjacent to: "Tesla to recall vehicles that may disobey stop signs (reuters.com)"
Our strategy has been to solve the challenges needed to operate driverless robotaxis on a well-equipped vehicle, then aggressively drive the cost down. Many OEMs are doing this in reverse order: they're trying to squeeze orders of magnitude of performance gains out of really low-cost hardware. Today it's unclear which strategy will win.
In a few years, our next generation of low-cost compute and sensing will land in these vehicles, and our service area will be large enough that you forget there is even a geofence. If OEMs still haven't managed to squeeze out the performance gains needed to go fully driverless, we'll know which move was the right one.
We shared several details on how our system works and our future plans here: https://www.youtube.com/playlist?list=PLkK2JX1iHuzz7W8z3roCZ...
> Cruise has had state authority to test autonomous vehicles on public roads with a safety driver since 2015 and authority to test autonomous vehicles without a driver since October 2020.
> Waymo has had state authority to test autonomous vehicles on public roads with a safety driver since 2014 and received a driverless testing permit in October 2018.
More info at the DMV's website (although it's a lot to digest): https://www.dmv.ca.gov/portal/vehicle-industry-services/auto...
- Cruise permit is for robo-taxi service, available to public (fully driverless, nobody in the car)
- Waymo permit is for robo-taxi service, available to public (human safety driver behind the wheel at all times)
- Nuro permit is for robo-delivery, available to public (no human passengers)
Excuse my ignorance, and I'm sure 2 seconds is probably an engineering feat, but I'm genuinely curious: what prevents latency from going down to a few hundred ms (pretty much an IP round trip)?
1) If you want very low latency, any network jitter or delays will cause pauses on the viewer side and "skips" when the feed catches up after a brief dropout. This is fine for video chat, where a little blip doesn't interrupt the experience. For live streams with 30k+ viewers, it's pretty annoying and very noticeable if the audio cuts out or skips. A 2-3 second window is typically large enough to paper over any jitter or retransmits due to packet loss between the broadcaster and Twitch servers.
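To see why a 2-3 second buffer papers over jitter so effectively, here's a minimal simulation. This is purely illustrative: the function names, jitter model, and numbers are my own assumptions, not anything from Twitch's actual system.

```python
import random

def simulate_playback(buffer_ms, jitter_ms, n_packets=1000, interval_ms=33, seed=42):
    """Simulate a live stream: a packet is sent every interval_ms, and each
    one is delayed by random network jitter. Playback starts buffer_ms after
    the first packet arrives; an underrun (a visible skip or audio dropout)
    occurs whenever a packet arrives after its playback deadline."""
    rng = random.Random(seed)
    # Arrival time of packet i = its send time + random one-way jitter.
    arrivals = [i * interval_ms + rng.uniform(0, jitter_ms) for i in range(n_packets)]
    playback_start = arrivals[0] + buffer_ms
    underruns = 0
    for i, arrival in enumerate(arrivals):
        deadline = playback_start + i * interval_ms
        if arrival > deadline:
            underruns += 1
    return underruns

# With up to 500 ms of jitter, a 100 ms buffer stutters repeatedly,
# while a 2000 ms buffer absorbs every late packet.
print(simulate_playback(buffer_ms=100, jitter_ms=500))
print(simulate_playback(buffer_ms=2000, jitter_ms=500))
```

The takeaway: the buffer only has to exceed the worst-case jitter spread, which is why a couple of seconds is enough even on messy consumer connections.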
2) Transcoding can be done with very low latency, but it's harder to scale horizontally and uses more bandwidth than if you give yourself a few hundred ms of buffer. Larger buffers enable better compression. Transcoding is needed if you want to stream to mobile, web, etc. in multiple formats, bitrates, or resolutions.
3) Chunked HTTP content is much easier to serve than RTMP or WebRTC-style content. You can use nginx or drop your content on a low-cost CDN. The caveat is that chunking generally introduces latency unless you do something fancy such as streaming chunks as they're being written to disk.
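As a toy sketch of that "fancy" approach, here's what streaming a chunk while it's still being written might look like. This assumes a simple polling file reader of my own invention; real low-latency chunked delivery uses purpose-built servers, but the core trick is the same.

```python
import os
import tempfile
import threading
import time

def tail_chunks(path, chunk_size=4096, poll_s=0.01, done=lambda: False):
    """Yield bytes from a file as it grows, so a client can start receiving
    a chunk before the encoder has finished writing it."""
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if data:
                yield data          # send whatever is available immediately
            elif done():
                return              # writer finished and the file is drained
            else:
                time.sleep(poll_s)  # wait for more bytes to land on disk

def demo():
    # Simulate an encoder writing a segment while a reader streams it out.
    fd, path = tempfile.mkstemp()
    os.close(fd)
    finished = threading.Event()

    def writer():
        with open(path, "wb") as f:
            for i in range(5):
                f.write(b"segment-%d " % i)
                f.flush()
                time.sleep(0.02)
        finished.set()

    t = threading.Thread(target=writer)
    t.start()
    data = b"".join(tail_chunks(path, done=finished.is_set))
    t.join()
    os.remove(path)
    return data

print(demo())
```

The reader never waits for the chunk to be complete, which is how you claw back most of the latency that naive chunking adds.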
Source: I designed the video streaming network that Twitch was running when Amazon acquired it. More info here http://highscalability.com/blog/2010/3/16/justintvs-live-vid...
By this point in history, it wasn’t just him anymore and we’d done a few rounds of improvements already out of necessity. As I recall, he got us up and running at PAIX based mostly on research, but most of the other data centers were built out by a network engineer(1) we hired away from YouTube.
While he was working on the network engineering and keeping the original system afloat, I did a lot of the software work for the system described here.
(1) Name withheld out of courtesy
"Vogt, as the sole Director of Cruise Automation, Inc., authorized the issuance of 50% of the Company’s stock to Guillory;"
That seems like a very clear statement. It's either true, in which case Guillory has a very strong case, or it's false and he doesn't.