I don’t have (nor do I need to have) such a plan. I offer X service with Y guarantees, paying out Z dollars if I don’t hold up my part of the bargain. In this hypothetical situation, if Visa signs up, I’d assume they wanted to host their marketing website or some other low-hanging fruit. It’s not my job to check what they’re using it for (in fact, it would be preferable for me not to check, as I’d otherwise be seeing unencrypted card numbers and PII).
That aside, I think the example is good. It's a bit like priority inversion in scheduling. With no agreement from the lemonade seller they've suddenly changed greatly in terms of their criticality to some value creation chain.
I get that FSD (maybe) has/requires better hardware than my car. But what I hate about autopilot is all around basic driving:
* Lane centering. It's extremely aggressive about lane centering: if you're in the right lane and an onramp joins from the right, the car aggressively drives to the right as soon as it perceives that the lane is wider.
* Throttle/brake behavior. It waits too long to brake (despite having radar in my car, which can supposedly "see" more than one car ahead), and when it does apply the brakes it doesn't do so smoothly. It tips in somewhat aggressively, and you can feel the discrete steps in brake force application change. Ditto for acceleration when the traffic in front of me moves.
There's no reason to think that any of this has anything to do with compute power. It all seems to come down to programming decisions that were made for whatever reasons, so I can't see why FSD would be different.
And yet, if FSD drives like this, I don't get how anyone can think it's good? On the other hand, I've also heard people say they think autopilot is good, which it's clearly not, so it makes me judge their driving skill rather than the different models. But perhaps there's some matrix of hardware revisions and software/decision models out there that I'm unaware of, that explains differences in driving behavior, if they exist?
Isn't there a trial you can try?
> Three out of three one-shot debugging hits with no help is extremely impressive. Importantly, there is no need to trust the LLM or review its output when its job is just saving me an hour or two by telling me where the bug is, for me to reason about it and fix it.
The approach described here could also be a good way for LLM-skeptics to start exploring how these tools can help them without feeling like they're cheating, ripping off the work of everyone whose code was used to train the model, or taking away the most fun part of their job (writing code).
Have the coding agents do the work of digging around hunting down those frustratingly difficult bugs - don't have it write code on your behalf.
As I grew up, I started seeing/hearing about IMAX movies, and didn't realize they were different until I went to one in another part of the country. I was very excited to go, as it had been a long time since I had been to an Omnimax.
I was pretty confused and disappointed, which is a weird reaction to have the first time in an IMAX theater. "It's just a big screen... Where's the dome?"
"Who put the bomp in the bomp sha bomp!"
I too had a similar reaction the first time I saw an IMAX.
A few takeaways:
- The sampling rate is slightly off between devices - approximately ±1 sample per second - not a lot, but you need to take that into account.
- Spectral characteristics in consumer microphones are all over the place - two phones of the same model, right out of the box, will have not only measurable, but also audible differences.
- Sound bounces off of everything, particularly concrete walls.
- A car is the closest thing to an anechoic chamber you can readily access.
- The Fourier transform of a Gaussian is a Gaussian, which is very helpful when you need to estimate the frequency of a harmonic signal (like speech) with a wavelength shorter than half your window, but just barely.
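
For anyone curious how the Gaussian point can be used in practice, here is a minimal sketch of one common approach, not necessarily what was done in the project above. It assumes you window the signal with a Gaussian: the spectral peak then has a Gaussian shape, so the log-magnitudes around it lie on a parabola, and fitting that parabola gives a sub-bin frequency estimate. The function names, the `sigma_fraction` parameter, and the plain-Python DFT are all illustrative choices.

```python
import cmath
import math


def gaussian_window(n, sigma):
    """Gaussian window of length n, centered, with std-dev sigma (in samples)."""
    mid = (n - 1) / 2
    return [math.exp(-0.5 * ((i - mid) / sigma) ** 2) for i in range(n)]


def dft_magnitude(samples, k):
    """Magnitude of DFT bin k, computed directly (slow, but dependency-free)."""
    n = len(samples)
    return abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                   for i, x in enumerate(samples)))


def estimate_frequency(samples, sample_rate, sigma_fraction=0.125):
    """Estimate the dominant frequency in Hz with sub-bin resolution.

    Assumption: a single dominant sinusoid, well away from DC and Nyquist.
    """
    n = len(samples)
    window = gaussian_window(n, sigma_fraction * n)
    windowed = [x * w for x, w in zip(samples, window)]
    mags = [dft_magnitude(windowed, k) for k in range(n // 2)]

    # Coarse estimate: the strongest bin (excluding the edges).
    peak = max(range(1, n // 2 - 1), key=mags.__getitem__)

    # The log of a Gaussian is a parabola, so fit one through the peak bin
    # and its two neighbours and take the vertex as the refined position.
    a, b, c = (math.log(mags[peak + d]) for d in (-1, 0, 1))
    offset = 0.5 * (a - c) / (a - 2 * b + c)
    return (peak + offset) * sample_rate / n


if __name__ == "__main__":
    rate, n, true_freq = 8000, 256, 440.3
    tone = [math.sin(2 * math.pi * true_freq * i / rate) for i in range(n)]
    print(estimate_frequency(tone, rate))
```

With these numbers the bin spacing is 31.25 Hz, so any estimate much closer than that to the true frequency comes from the parabola fit rather than from the DFT resolution itself.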
If you have an AI that can answer 90% of queries correctly AND (this is the key) it knows which 90% it can answer correctly, then a human in the loop can be incredibly valuable for answering the other 10%.
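
A rough sketch of what that routing could look like, assuming the AI returns a confidence score alongside each answer and that the score is actually trustworthy (which is the hard part). All names here (`ai_answer`, `ask_human`, `min_confidence`) are hypothetical placeholders, not any real API.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Routed:
    answer: str
    answered_by: str  # "ai" or "human"


def answer_with_fallback(
    query: str,
    ai_answer: Callable[[str], Tuple[str, float]],  # hypothetical: returns (answer, confidence in [0, 1])
    ask_human: Callable[[str], str],                # hypothetical: escalates to a person
    min_confidence: float = 0.9,                    # assumed threshold, would need tuning
) -> Routed:
    """Let the AI answer only when it claims to be confident enough."""
    answer, confidence = ai_answer(query)
    if confidence >= min_confidence:
        return Routed(answer, "ai")
    # The AI "knows it doesn't know", so hand the query to a human instead.
    return Routed(ask_human(query), "human")
```

The whole scheme only works if the confidence is well calibrated; if the AI is confidently wrong, the wrong answers go straight through and the human never sees them.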