Readit News
jqpabc123 · 3 years ago
This is a good example of the type of issues "full self driving" is likely to encounter once it is widely deployed.

The real shortcoming of "AI" is that it is almost entirely data driven. There is little to no real cognition or understanding or judgment involved.

The human brain can instantly and instinctively extrapolate from what it already knows in order to evaluate and make judgments in new situations it has never seen before. A child can recognize that someone is hiding under a box even if they have never actually seen anyone do it before. Even a dog could likely do the same.

AI, as it currently exists, just doesn't do this. It's all replication and repetition. Like any other tool, AI can be useful. But there is no "intelligence" --- it's basically as dumb as a hammer.

lsh123 · 3 years ago
I have a slightly different take - our current ML models try to approximate the real world assuming that the function is continuous. However, in reality the function is not continuous, and the approximation breaks in unpredictable ways. I think the "unpredictable" part is a bigger issue than just "breaks". (Most) Humans use "common sense" to handle cases where the model doesn't match reality. But AI doesn't have "common sense", and it is dumb because of it.
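A toy illustration of that point (my own sketch, not from the thread): fit a smooth model to a discontinuous function and the error piles up around the jump, which is roughly the "breaks in unpredictable ways" part.

```python
import numpy as np

# Hypothetical toy example: the "real world" is a step function
# (discontinuous), the model is a smooth degree-9 polynomial.
x = np.linspace(-1.0, 1.0, 201)
y = np.where(x < 0.0, 0.0, 1.0)        # discontinuous ground truth

coeffs = np.polyfit(x, y, deg=9)       # smooth approximation
y_hat = np.polyval(coeffs, x)
err = np.abs(y - y_hat)

print(f"max error near the jump (|x| < 0.05): {err[np.abs(x) < 0.05].max():.3f}")
print(f"max error far from it  (|x| > 0.50): {err[np.abs(x) > 0.50].max():.3f}")
# The approximation is fine where the function is smooth and falls apart
# right where it isn't.
```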
tremon · 3 years ago
I would put it in terms of continuity of state rather than continuity of function: we use our current ML models to approximate the real world by assuming that state is irrelevant. However in reality, objects exist continuously and failure to capture ("understand") that fact breaks the model in unpredictable ways. For example, if you show a three-year old a movie of a marine crawling under a cardboard box, and when the marine is fully hidden ask where the marine is, you will likely get a correct answer. That is because real intelligence has a natural understanding of the continuity of state (of existence). AI has only just started to understand "object", but I doubt it has a correct grasp of "state", let alone understands time continuity.
laweijfmvo · 3 years ago
This story is the perfect example of machine learning vs. artificial intelligence.
ghaff · 3 years ago
Basically ML has made such significant practical advances--in no small part on the back of Moore's Law, large datasets, and specialized processors--that we've largely punted on (non-academic) attempts to bring forward cognitive science and the like, on which there really hasn't been great progress decades on. Some of the same neurophysiology debates that were happening when I was an undergrad in the late 70s still seem to be happening in not much different form.

But it's reasonable to ask whether there's some point beyond which ML can't take you. Peter Norvig, I think, made a comment to the effect of "We have been making great progress--all the way to the top of the tree."

smadge · 3 years ago
Is there actually a distinction here? A good machine would learn about boxes and object permanence.
jqpabc123 · 3 years ago
Good point!
2OEH8eoCRo0 · 3 years ago
Does it just require a lot more training? I'm talking about the boring stuff. Children play, and their understanding of the physical world is reinforced. How would you add the physical world to the training? Because everything that I do in the physical world is "training" me and reinforcing my expectations.

We keep avoiding the idea that robots require understanding of the world since it's a massive unsolved undertaking.

sjducb · 3 years ago
A human trains on way less data than an AI.

ChatGPT has processed over 500GB of text files from books, about 44 billion words.

If you read a book a week, you might hit 70 million words by age 18.
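Rough sanity check of those numbers in a few lines (the words-per-book figure is my assumption, not from the comment):

```python
# Back-of-the-envelope check of the comment's numbers.
words_per_book = 75_000            # assumed average book length
books_per_year = 52                # "a book a week"
years = 18

human_words = words_per_book * books_per_year * years
chatgpt_words = 44_000_000_000     # figure quoted above

print(f"human reader by 18: ~{human_words:,} words")   # ~70,200,000
print(f"ratio: roughly {chatgpt_words // human_words}x more text")
```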

nitwit005 · 3 years ago
Imagine someone has the idea of strapping mannequins to their car in hopes the AI cars will get out of the way.

Sure, you could add that to the training the AI gets, but it's just one malicious idea. There's effectively an infinite set of those ideas, as people come up with novel ideas all the time.

mlboss · 3 years ago
Reinforcement learning should solve this problem. We need to give robots the ability to do experiments and learn from failure like children.
rileymat2 · 3 years ago
> A child can recognize that someone is hiding under a box even if they have never actually seen anyone do it before.

A child of what age? Children that have not yet developed object permanence will fail to understand that some things still exist when unseen.

Human intelligence is trained for years, with two humans making corrections and prompting development. I am curious whether there are any machine learning projects that have been training for this length of time.

jqpabc123 · 3 years ago
With no real training, a child will start exploring and learning about the world on his own. These are the first roots of "intelligence".

How long do you think it would take to teach an AI to do this?

mikewarot · 3 years ago
I LOVED playing peek-a-boo with my child at that age!
ajross · 3 years ago
This seems to be simultaneously discounting AI (ChatGPT should have put to rest the idea that "it's all replication and repetition" by now, no?[1]) and wildly overestimating median human ability.

In point of fact, the human brain is absolutely terrible at driving. To the extent that without all the non-AI safety features implemented in modern automobiles and street environments, driving would be more than a full order of magnitude more deadly.

The safety bar[2] for autonomous driving is really, really low. And, yes, existing systems are crossing that bar as we speak. Even Teslas.

[1] Or at least widely broadened our intuition about what can be accomplished with "mere" repetition and replication.

[2] It's true though, that the practical bar is probably higher. We saw just last week that a routine accident that happens dozens of times every day becomes a giant front page freakout when there's a computer involved.

hgomersall · 3 years ago
The difference regarding computers is that it's unacceptable for them to make a mistake a human would have avoided easily (like driving full speed into a lorry). That's the threshold for acceptable safety.
xeromal · 3 years ago
I think the biggest problem with AI driving is that while there are plenty of dumb human drivers there are also plenty of average drivers and plenty of skilled drivers.

For the most part, if Tesla FSD does a dumb thing in a very specific edge case, ALL Teslas do a dumb thing in that very specific edge case, and that's what humans don't appreciate.

A bug can render everyone's car dumb in a single instance.

PaulDavisThe1st · 3 years ago
> the human brain is absolutely terrible at driving

Compared to what?

afro88 · 3 years ago
All the failures to detect humans will be used as training data to fine tune the model.

Just like a toddler might be confused when they first see a box with legs walking towards it. Or mistake a hand puppet for a real living creature when they first see it. I've seen this first hand with my son (the latter).

AI tooling is already capable of identifying whatever it's trained to identify. The DARPA team just hadn't trained it with varied enough data when that particular exercise occurred.

MagicMoonlight · 3 years ago
That’s not learning, that’s just brute forcing every possible answer and trying to memorise them all.
hoppla · 3 years ago
In a not too distant future: "those killer bots really have it in for cardboard boxes"
edrxty · 3 years ago
>failures to detect humans

That's a weird way to spell "murders"

lowbloodsugar · 3 years ago
A human is exactly the same. The difference is, once an AI is trained you can make copies.

My kid literally just got mad at me because I assumed that he knew how to put more paper in the printer. He's 17 and has printed tons of reports for school. Turns out he's never had to change the printer paper.

People know about hiding in cardboard boxes because we all hid in cardboard boxes when we were kids. Not because we genetically inherited some knowledge.

chrisco255 · 3 years ago
We inherently know that cardboard boxes don't move on their own. In fact any unusual inanimate object that is moving in an irregular fashion will automatically draw attention in our brains. These are instincts that even mice have.
exodust · 3 years ago
Your kid's printer dilemma isn't the same. For starters, he knew it ran out of paper - he identified the problem. The AI robot might conclude the printer is broken. It would give up without anxiety, declaring "I have no data about this printer".

Your kid got angry, which is fuel for human scrutiny and problem solving. If you weren't there to guide him, he would have tried different approaches and most likely worked it out.

For you to say your kid is exactly the same as data-driven AI is perplexing to me. Humans don't need to have hidden in a box themselves to understand "hiding in things for the purposes of play". Whether it's a box or a special one-of-a-kind plastic tub, humans don't need training about hiding in plastic tubs. AI needs to be told that plastic tubs might be something people hide in.

afpx · 3 years ago
ChatGPT says that all it needs are separate components trained on every modality. It says it has enough fidelity using human language to use that as a starting point to develop a more efficient connection between the components. Once it has that, and appropriate sensors and mobility, it can develop context. And, after that, new knowledge.

But, we all know ChatGPT is full of shit.

jqpabc123 · 3 years ago
> ChatGPT says that all it needs are separate components trained on every modality.

Yes, all you have to do is train it using multiple examples of every possible situation and combinations thereof --- which is practically impossible.

spamtarget · 3 years ago
I think you are wrong. Your own real cognition and understanding are based on all your experiences and memories, which are nothing but data in your head. I think consciousness is just an illusion produced by the hugely complex reaction machine that you are. You even use the word "extrapolate", which is basically a prediction based on data you already have.
onethought · 3 years ago
The problem space for driving feels constrained: "can I drive over it?" is the main reasoning outside of navigation.

Whether it’s a human, a box, a clump of dirt. Doesn’t really matter?

Where types matter are road signs and lines etc, which are hopefully more consistent.

More controversially: are humans just a dumb hammer that has simply processed and adjusted to a huge amount of data? LLMs suggest that a form of reasoning starts to emerge.

marwatk · 3 years ago
Yep, this is why LIDAR is so helpful. It takes the guesswork out of "is the surface in front of me flat?" in a way vision can't without AGI. Is that a painting of a box on the ground or an actual box?
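A minimal sketch of why depth data settles the "painted box" question (a hypothetical plane-fit check, not anything a real AV stack uses):

```python
import numpy as np

def is_flat(points: np.ndarray, tol: float = 0.05) -> bool:
    """points: (N, 3) array of lidar returns (x, y, z) in metres.
    Fit a plane z = ax + by + c by least squares and check how far
    the returns deviate from it."""
    A = np.c_[points[:, 0], points[:, 1], np.ones(len(points))]
    coeffs, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    residuals = points[:, 2] - A @ coeffs
    return float(np.max(np.abs(residuals))) < tol

# A painted picture of a box returns a flat patch of points;
# a real box sticks out of the road plane by its height.
flat_road = np.random.rand(500, 3) * [5, 5, 0.01]
real_box  = np.vstack([flat_road, [[2.5, 2.5, 0.4]]])
print(is_flat(flat_road))  # True
print(is_flat(real_box))   # False
```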
smileysteve · 3 years ago
Instantly?

Instinctively?

Let me introduce you to "peek-a-boo", a simple parent-child game for infants.

https://en.m.wikipedia.org/wiki/Peekaboo

> In early sensorimotor stages, the infant is completely unable to comprehend object permanence.

jqpabc123 · 3 years ago
You do realize there is a difference between an infant and a child, right?

An infant will *grow* and develop into a child that is capable of learning and making judgments on its own. AI never does this.

Play "peek-a-boo" with an infant and it will learn and extrapolate from this info and eventually be able to recognize a person hiding under a box even if it has never actually seen it before. AI won't.

mistrial9 · 3 years ago
nice try but .. in the wild, many animals are born that display navigation and awareness within minutes .. Science calls it "instinct" but I am not sure it is completely understood..
thearn4 · 3 years ago
Interestingly, ChatGPT seems capable of predicting this approach:

https://imgur.com/a/okzZz7D

partiallypro · 3 years ago
I don't know your exact question, but I am betting this is just a rephrasing of a post that exists elsewhere that it has crawled. I don't think it came up with it so much as it has seen this list before and was able to pull it up and reword it.
abledon · 3 years ago
Will there come a time when computers are strong enough to read in the images, re-create a virtual game world from them, and then reverse-engineer, from seeing feet poking out of the box, that a human must be inside? Right now Tesla cars can take in the images and decide turn left, turn right, etc., but they don't reconstruct, say, a Unity-3D game world on the fly.
BurningFrog · 3 years ago
My seatbelt is even dumber. I still use it.

The usefulness of tech should be decided empirically, not by clever well phrased analogies.

frontfor · 3 years ago
No one is arguing AI isn’t useful. So your analogy failed completely.
LeifCarrotson · 3 years ago
What is human cognition, understanding, or judgement, if not data-driven replication, repetition, with a bit of extrapolation?

AI as it currently exists does this. If your understanding of what AI is today is based on a Markov chain chatbot, you need to update: it's able to do stuff like compose this poem about A* and Dijkstra's algorithm that was posted yesterday:

https://news.ycombinator.com/item?id=34503704

It's not copying that from anywhere, there's no Quora post it ingested where some human posted vaguely the same poem to vaguely the same prompt. It's applying the concepts of a poem, checking meter and verse, and applying the digested and regurgitated concepts of graph theory regarding memory and time efficiency, and combining them into something new.

I have zero doubt that if you prompted ChatGPT with something like this:

> Consider an exercise in which a robot was trained for 7 days with a human recognition algorithm to use its cameras to detect when a human was approaching the robot. On the 8th day, the Marines were told to try to find flaws in the algorithm, by behaving in confusing ways, trying to touch the robot without its notice. Please answer whether the robot should detect a human's approach in the following scenarios:

> 1. A cloud passes over the sun, darkening the camera image.

> 2. A bird flies low overhead.

> 3. A person walks backwards to the robot.

> 4. A large cardboard box appears to be walking nearby.

> 5. A Marine does cartwheels and somersaults to approach the robot.

> 6. A dense group of branches comes up to the robot, walking like a fir tree.

> 7. A moth lands on the camera lens, obscuring the robot's view.

> 8. A person ran to the robot as fast as they could.

It would be able to tell you something about the inability of a cardboard box or fir tree to walk without a human inside or behind the branches, that a somersaulting person is still a person, and that a bird or a moth is not a human. If you told it that the naive algorithm detected a human in scenarios #3 and #8, but not in 4, 5, or 6, it could devise creative ways of approaching a robot that might fool the algorithm.

It certainly doesn't look like human or animal cognition, no, but who's to say how it would act, what it would do, or what it could think if it were parented and educated and exposed to all kinds of stimuli appropriate for raising an AI, like the advantages we give a human child, for a couple decades? I'm aware that the neural networks behind ChatGPT have processed machine concepts for subjective eons, ingesting text at word-per-minute rates orders of magnitude higher than human readers ever could, parallelized over thousands of compute units.

Evolution has built brains that quickly get really good at object recognition, and prompted us to design parenting strategies and educational frameworks that extend that arbitrary logic even farther. But I think that we're just not very good yet at parenting AIs, only doing what's currently possible (exposing it to data), rather than something reached by the anthropic principle/selection bias of human intelligence.

antipotoad · 3 years ago
I have a suspicion you’re right about what ChatGPT could write about this scenario, but I wager we’re still a long way from an AI that could actually operationalize whatever suggestions it might come up with.

It’s goalpost shifting to be sure, but I’d say LLMs call into question whether the Turing Test is actually a good test for artificial intelligence. I’m just not convinced that even a language model capable of chain-of-thought reasoning could straightforwardly be generalized to an agent that could act “intelligently” in the real world.

None of which is to say LLMs aren’t useful now (they clearly are, and I think more and more real world use cases will shake out in the next year or so), but that they appear like a bit of a trick, rather than any fundamental progress towards a true reasoning intelligence.

Who knows though, perhaps that appearance will persist right up until the day an AGI takes over the world.

lsy · 3 years ago
I think this is unnecessarily credulous about what is really going on with ChatGPT. It is not "applying the concepts of a poem" or checking meter and verse; it is generating text to fit an (admittedly very complicated) function that minimizes the statistical improbability of its appearance given the preceding text. One example is its use of rhyming words, despite having no concept of what words sound like, or what it is even like to hear a sound. It selects those words because when it has seen the word "poem" before in training data, it has often been followed by lines which happen to end in symbols that are commonly included in certain sets.

Human cognition is leagues different from this, as our symbolic representations are grounded in the world we occupy. A word is a representation of an imaginable sound as well as a concept. And beyond this, human intelligence not only consists of pattern-matching and replication but pattern-breaking, theory of mind, and maybe most importantly a 1-1 engagement with the world. What seems clear is that the robot was trained to recognize a certain pattern of pixels from a camera input, but neither the robot nor ChatGPT has any conception of what a "threat" entails, the stakes at hand, or the common-sense frame of reference to discern observed behaviors that are innocuous from those that are harmful. This allows a bunch of goofy grunts to easily best high-speed processors and fancy algorithms by identifying the gap between the model's symbolic representations and the actual world in which it's operating.
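For what it's worth, the "pick the statistically likely next symbol" view can be sketched in a few lines. This toy bigram generator is only an illustration of the mechanism described above, nothing like ChatGPT's actual scale or architecture:

```python
import random
from collections import defaultdict

# Toy next-token model: count which word follows which in a tiny corpus,
# then generate by repeatedly sampling an observed successor.
corpus = "the box moved and the marine laughed and the box moved again".split()

follows = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev].append(nxt)

def generate(start: str, length: int = 6) -> str:
    out = [start]
    for _ in range(length):
        candidates = follows.get(out[-1])
        if not candidates:
            break
        out.append(random.choice(candidates))   # sample from observed successors
    return " ".join(out)

print(generate("the"))
# The model has no idea what a box *is*; it only knows which symbols
# tended to follow which symbols in its training text.
```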

majormajor · 3 years ago
I tried that a few times, asking for "in the style of [band or musician]", and the best I got was "generic gpt-speak" (for lack of a better term for its "default" voice style) text that just included a quote from that artist... suggesting that it has a limited understanding of "in the style of" if it thinks a quote is sometimes a substitute, and is actually more of a very comprehensive pattern-matching parrot after all. Even for Taylor Swift, where you'd think there's plenty of text to work from.

This matches with other examples I've seen of people either getting "confidently wrong" answers or being able to convince it that it's out of date on something it isn't.

mlindner · 3 years ago
Not sure how that's related. This is about a human adversary actively trying to defeat an AI. The roadway is about vehicles in general actively working together for the flow of traffic. They're not trying to destroy other vehicles. I'm certain any full self driving AI could be defeated easily by someone who wants to destroy the vehicle.

Saying "this won't work in this area that it was never designed to handle" and the answer will be "yes of course". That's true of any complex system, AI or not.

I don't think we're anywhere near a system where a vehicle actively defends itself against determined attackers. Even in sci-fi they don't do that (I, Robot movie).

mcswell · 3 years ago
"Saying "this won't work in this area that it was never designed to handle" and the answer will be "yes of course". That's true of any complex system, AI or not." This isn't about design, it's about what the system is able to learn. Humans were not designed to fly, but they can learn to fly planes (whether they're inside the plane or not).
DrThunder · 3 years ago
Hilarious. I immediately heard the Metal Gear exclamation sound in my head when I began reading this.
kayge · 3 years ago
Hah, you beat me to it; Hideo Kojima would be proud. Sounds like DARPA needs to start feeding old stealth video games into their robot's training data :)
Apocryphon · 3 years ago
Hilariously enough, Kojima is enough of a technothriller fabulist that DARPA is explicitly part of that franchise's lore - too bad they didn't live up to his depiction.

https://metalgear.fandom.com/wiki/Research_and_development_a...

qikInNdOutReply · 3 years ago
But the AI in stealth games is literally trained to go out of its way to not detect you.
Barrin92 · 3 years ago
after MGS 2 and Death Stranding that's one more point of evidence on the list that Kojima is actually from the future and trying to warn us through the medium of videogames
jstarfish · 3 years ago
He's one of the last speculative-fiction aficionados...always looking at current and emerging trends and figuring out some way to weave them into [an often-incoherent] larger story.

I was always pleased but disappointed when things I encountered in the MGS series later manifested in reality...where anything you can dream of will be weaponized and used to wage war.

And silly as it sounds, The Sorrow in MGS3 was such a pain in the ass it actually changed my life. That encounter gave so much gravity to my otherwise-inconsequential acts of wanton murder, I now treat all life as sacred and opt for nonlethal solutions everywhere I can.

(I only learned after I beat both games that MGS5 and Death Stranding implemented similar "you monster" mechanics.)

matheusmoreira · 3 years ago
I can practically hear the alert soundtrack in my head.

Also, TFA got the character and the game wrong in that screenshot. It's Venom Snake in Metal Gear Solid V, not Solid Snake in Metal Gear Solid.

sekai · 3 years ago
Kojima predicted this
sceadu · 3 years ago
Easiest way to predict the future is to invent it :)
doyouevensunbro · 3 years ago
Kojima is a prophet, hallowed be his name.
ankaAr · 3 years ago
I'm very proud of all of you for the reference.
nemo44x · 3 years ago
“What was that noise?!…..Oh it’s just a box” lol because boxes making noise is normal.
matheusmoreira · 3 years ago
"HQ! HQ! The box is moving! Permission to shoot the box!"

"This is HQ. Report back to base for psychiatric evaluation."

https://youtu.be/FR0etgdZf3U

CatWChainsaw · 3 years ago
That, plus the ProZD skit on Youtube: https://www.youtube.com/shorts/Ec_zFYCnjJc

"Well, I guess he doesn't... exist anymore?"

(unfortunately it's a Youtube short, so it will auto repeat.)

stordoff · 3 years ago
> (unfortunately it's a Youtube short, so it will auto repeat.)

If you change it to a normal video link, it doesn't: https://www.youtube.com/watch?v=Ec_zFYCnjJc

pmarreck · 3 years ago
I came here to make this reference and am so glad it was already here
prometheus76 · 3 years ago
A hypothetical situation: AI is tied to a camera of me in my office. Doing basic object identification. I stand up. AI recognizes me, recognizes desk. Recognizes "human" and recognizes "desk". I sit on desk. Does AI mark it as a desk or as a chair?

And let's zoom in on the chair. AI sees "chair". Slowly zoom in on arm of chair. When does AI switch to "arm of chair"? Now, slowly zoom back out. When does AI switch to "chair"? And should it? When does a part become part of a greater whole, and when does a whole become constituent parts?

In other words, we have made great strides in teaching AI "physics" or "recognition", but we have made very little progress in teaching it metaphysics (categories, in this case) because half the people working on the problem don't even recognize metaphysics as a category even though without it, they could not perceive the world. Which is also why AI cannot perceive the world the way we do: no metaphysics.

yamtaddle · 3 years ago
"Do chairs exist?"

https://www.youtube.com/watch?v=fXW-QjBsruE

Perhaps the desk is "chairing" in those moments.

[EDIT] A little more context for those who might not click on a rando youtube link: it's basically an entertaining, whirlwind tour of the philosophy of categorizing and labeling things, explaining various points of view on the topic, then poking holes in them or demonstrating their limitations.

dredmorbius · 3 years ago
That was a remarkably good VSauce video.

I had what turned out to be a fairly satisfying thread about it on Diaspora* at the time:

<https://diaspora.glasswings.com/posts/65ff95d0fe5e013920f200...>

TL;DR: I take a pragmatic approach.

malfist · 3 years ago
I knew this was a vsauce video before I even clicked on the link, haha.

Vsauce is awesome for mind-boggling stuff.

kibwen · 3 years ago
> Which is also why AI cannot perceive the world the way we do: no metaphysics.

Let's not give humans too much credit; the internet is rife with endless "is a taco a sandwich?" and "does a bowl of cereal count as soup?" debates. :P

throwanem · 3 years ago
Yeah, we're a lot better at throwing MetaphysicalUncertaintyErrors than ML models are.
jjk166 · 3 years ago
There are lots of things people sit on that we would not categorize as chairs. For example if someone sits on the ground, Earth has not become a chair. Even if something's intended purpose is sitting, calling a car seat or a barstool a chair would be very unnatural. If someone were sitting on a desk, I would not say that it has ceased to be a desk nor that it is now a chair. At most I'd say a desk can be used in the same manner as a chair. Certainly I would not in general want an AI tasked with object recognition to label a desk as a chair. If your goal was to train an AI to identify places a human could sit, you'd presumably feed it different training data.
devoutsalsa · 3 years ago
This reminds me of some random Reddit post that says it makes sense to throw things on the floor. The floor is the biggest shelf in the room.
spacedcowboy · 3 years ago
Thirty years ago, I was doing an object-recognition PhD. It goes without saying that the field has moved on a lot from back then, but even then hierarchical and comparative classification was a thing.

I used to have the Bayesian maths to show the information content of relationships, but in the decades of moving (continent, even) it's been lost. I still have the code because I burnt CD's, but the results of hours spent writing TeX to produce horrendous-looking equations have long since disappeared...

The basics of it were to segment and classify using different techniques, and to model relationships between adjacent regions of classification. Once you could calculate the information content of one conformation, you could compare with others.

One of the breakthroughs was when I started modeling the relationships between properties of neighboring regions of the image as part of the property-state of any given region. The basic idea was the center/surround nature of the eye's processing. My reasoning was that if it worked there, it would probably be helpful with the neural nets I was using... It boosted the accuracy of the results by (from memory) ~30% over and above what would be expected from the increase in general information load being presented to the inference engines. This led to a finer-grain of classification so we could model the relationships (and derive information-content from connectedness). It would, I think, cope pretty well with your hypothetical scenario.

At the time I was using a blackboard[1] for what I called 'fusion' - where I would have multiple inference engines running using a firing-condition model. As new information came in from the lower levels, they'd post that new info to the blackboard, and other (differing) systems (KNN, RBF, MLP, ...) would act (mainly) on the results of processing done at a lower tier and post their own conclusions back to the blackboard. Lather, rinse, repeat. There were some that were skip-level, so raw data could continue to be available at the higher levels too.

That was the space component. We also had time-component inferencing going on. The information vectors were put into time-dependent neural networks, as well as more classical averaging code. Again, a blackboard system was working, and again we had lower and higher levels of inference engine. This time we had relaxation labelling, Kalman filters, TDNNs and optic flow (in feature-space). These were also engaged in prediction modeling, so as objects of interest were occluded, there would be an expectation of where they were, and even when not occluded, the prediction of what was supposed to be where would play into a feedback loop for the next time around the loop.

All this was running on a 30MHz DECstation 3100 - until we got an upgrade to SGI Indy's <-- The original Macs, given that OSX is unix underneath... I recall moving to Logica (signal processing group) after my PhD, and it took a week or so to link up a camera (an IndyCam, I'd asked for the same machine I was used to) to point out of my window and start categorizing everything it could see.

We had peacocks in the grounds (Logica's office was in Cobham, which meant my commute was always against the traffic, which was awesome), which were always a challenge because of how different they could look based on the sun at the time. Trees, bushes, cars, people, different weather conditions - it was pretty good at doing all of them because of its adaptive/constructive nature, and it got to the point where we'd save off whatever it didn't manage to classify (or was at low confidence) to be included back into the model. By constructive, I mean the ability to infer that the region X is mislabelled as 'tree' because the surrounding/adjacent regions are labelled as 'peacock' and there are no other connected 'tree' regions...

The system was rolled out as a demo of the visual programming environment we were using at the time, to anyone coming by the office... It never got taken any further, of course... Logica's senior management were never that savvy about potential, IMHO :)

My old immediate boss from Logica (and mentor) is now the Director of Innovation at the centre for vision, speech, and signal processing at Surrey university in the UK. He would disagree with you, I think, on the categorization side of your argument. It's been a focus of his work for decades, and I played only a small part in that - quickly realizing that there was more money to be made elsewhere :)

1:https://en.wikipedia.org/wiki/Blackboard_system
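For anyone who hasn't met the pattern, here's a minimal, generic blackboard sketch (my own toy illustration of the architecture described above, not the original system): knowledge sources fire when their inputs appear on a shared blackboard and post conclusions back until nothing changes.

```python
# Toy blackboard: independent knowledge sources watch a shared store,
# fire when their firing condition is met, and post results back.
blackboard = {"pixels": "raw image data"}

def segmenter(bb):
    if "pixels" in bb and "regions" not in bb:
        bb["regions"] = ["r1", "r2", "r3"]

def classifier(bb):
    if "regions" in bb and "labels" not in bb:
        bb["labels"] = {"r1": "peacock", "r2": "tree", "r3": "peacock"}

def context_checker(bb):
    # Crude stand-in for the "constructive" step: an isolated 'tree'
    # region surrounded by 'peacock' regions gets relabelled.
    labels = bb.get("labels")
    if labels and not bb.get("checked"):
        if labels["r2"] == "tree" and labels["r1"] == labels["r3"] == "peacock":
            labels["r2"] = "peacock"
        bb["checked"] = True

knowledge_sources = [segmenter, classifier, context_checker]

changed = True
while changed:                         # lather, rinse, repeat
    before = repr(blackboard)
    for ks in knowledge_sources:
        ks(blackboard)
    changed = repr(blackboard) != before

print(blackboard["labels"])   # {'r1': 'peacock', 'r2': 'peacock', 'r3': 'peacock'}
```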

prometheus76 · 3 years ago
This is really fascinating. Thank you for the detailed and interesting response.
narrationbox · 3 years ago
> Recognizes "human" and recognizes "desk". I sit on desk. Does AI mark it as a desk or as a chair?

Not an issue if the image segmentation is advanced enough. You can train the model to understand "human sitting". It may not generalize to other animals sitting but human action recognition is perfectly possible right now.

theptip · 3 years ago
I like these examples because they concisely express some of the existing ambiguities in human language. Like, I wouldn’t normally call a desk a chair, but if someone is sitting on the table I’m more likely to - in some linguistic contexts.

I think you need LLM plus vision to fully solve this.

Eisenstein · 3 years ago
I still haven't figured out what the difference is between 'clothes' and 'clothing'. I know there is one, and the words each work in specific contexts ('I put on my clothes' works vs 'I put on my clothing' does not), but I have no idea how to define the difference. Please don't look it up but if you have any thoughts on the matter I welcome them.
anigbrowl · 3 years ago
That's why I think AGI is more likely to emerge from autonomous robots than in the data center. Less the super-capable industrial engineering of companies like Boston Dynamics, more the toy/helper market for consumers - more like Sony's Aibo reincarnated as a raccoon or monkey - big enough to be safely played with or to help out with light tasks, small enough that it has to navigate its environment from first principles and ask for help in many contexts.
edgyquant · 3 years ago
You're overthinking it while assuming things have one label. It recognizes it as a desk, which is a "thing that other things sit on."
bnralt · 3 years ago
> In other words, we have made great strides in teaching AI "physics" or "recognition", but we have made very little progress in teaching it metaphysics (categories, in this case) because half the people working on the problem don't even recognize metaphysics as a category even though without it, they could not perceive the world.

A bold claim, but I'm not sure it's one that accurately matches reality. It reminds me of reading about attempts in the 80's to construct AI by having linguists come in and trying to develop rules for the system.

From my experience, current methods of developing AI are a lot closer to how most humans think and interact with the world than academic philosophy is. Academic philosophy might be fine, but it's quite possible it's no more useful for navigating the world than the debates over theological minutiae have been.

pphysch · 3 years ago
When the AI "marks" a region as a chair, it is saying "chair" is the key with the highest confidence value among some stochastic output vector. It's fuzzy.

A sophisticated monitoring system would access the output vectors directly to mitigate volatility of the first rank.
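A tiny sketch of what "access the output vectors directly" might look like (hypothetical numbers and a generic softmax, not any particular model):

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    e = np.exp(logits - logits.max())
    return e / e.sum()

labels = ["chair", "desk", "person", "box"]
logits = np.array([2.1, 1.9, 0.3, -1.0])      # hypothetical raw scores

probs = softmax(logits)
ranked = sorted(zip(labels, probs), key=lambda lp: -lp[1])
for label, p in ranked:
    print(f"{label:7s} {p:.2f}")
# "chair" wins with ~0.49 but "desk" is right behind at ~0.40 -- a monitor
# that sees the whole vector knows the call is fuzzy, while one that only
# sees the top-1 label does not.
```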

amelius · 3 years ago
The error is in asking for a categorization. Categorizations always fail, ask any biologist.
dQw4w9WgXcQ · 3 years ago
> When does AI switch to "chair"?

You could ask my gf the same question

feoren · 3 years ago
> we have made very little progress in teaching it metaphysics (categories, in this case)

That's because ontology, metaphysics, categorization, and all that, is completely worthless bullshit. It's a crutch our limited human brains use, and it causes all sorts of problems. Half of what I do in data modeling is trying to fight against all of the worthless categorizations I come across. There Is No Shelf.

Why are categories so bad? Two reasons:

1. They're too easily divorced from their models. Is a tomato a fruit? The questions is faulty: there's no such thing as a "fruit" without a model behind it. When people say "botanically, a tomato is a fruit", they're identifying their model: botany. Okay, are you bio-engineering plants? Or are you cooking dinner? You're cooking dinner. So a tomato is not a fruit. Because when you're cooking dinner, your model is not Botany, it's something culinary, and in any half-decent culinary model, a tomato is a vegetable, not a fruit. So unless we're bio-engineering some plants, shut the hell up about a tomato being a fruit. It's not wisdom/intelligence, it's spouting useless mouth-garbage.

And remember that all models are wrong, but some models are useful. Some! Not most. Most models are shit. Categories divorced from a model are worthless, and categories of a shit model are shit.

2. Even good categories of useful models have extremely fuzzy boundaries, and we too often fall into the false dichotomy of thinking something must either "be" or "not be" part of a category. Is an SUV a car? Is a car with a rocket engine on it still a car? Is a car with six wheels still a car? Who cares!? If you're charging tolls for your toll bridge, you instead settle for some countable characteristic like number of axles, and you amend this later if you start seeing lots of vehicles with something that stretches your definition of "axle". In fact the category "car" is worthless most of the time. It's an OK noun, but nouns are only averages; only mental shortcuts to a reasonable approximation of the actual object. If you ever see "class Car : Vehicle", you know you're working in a shit, doomed codebase.

And yet you waste time arguing over the definitions of these shit, worthless categories. These worthless things become central to your database and software designs and object hierarchies. Of course you end up with unmaintainable shit.

Edit: Three reasons!

3. They're always given too much weight. Male/female: PICK ONE. IT'S VERY IMPORTANT THAT YOU CHOOSE ONE! It is vastly important to our music streaming app that we know whether your skin is black or instead that your ancestors came from the Caucasus Mountains or Mongolia. THOSE ARE YOUR ONLY OPTIONS PICK ONE!

Employee table: required foreign key to the "Department" table. Departments are virtually meaningless and change all the time! Every time you get a new vice president sitting in some operations chair, the first thing he does is change all the Departments around. You've got people in your Employee table whose department has changed 16 times, but they're the same person, aren't they? Oh, and they're not called "Departments" anymore, they're now "Divisions". Did you change your field name? No, you didn't. Of course you didn't. You have some Contractors in your Employee table, don't you? Some ex-employees that you need to keep around so they show up on that one report? Yeah, you do. Of course you do. Fuck ontology.

gwern · 3 years ago
I am suspicious of this story, however plausible it seems.

The source is given as a book; the Economist writer Shashank Joshi explicitly says that they describe the 'same story' in their article ("(I touched on the same story here: https://economist.com/technology-quarterly/2022/01/27/decept...)"). However, if you look at the book excerpt (https://twitter.com/shashj/status/1615716082588815363), it's totally different from the supposed same story (crawl or somersault? cited to Benjamin, or Phil? did the Marine cover his face, or did he cover everything but his face? all undetected, or not the first?).

Why should you believe either one after comparing them...? When you have spent much time tracing urban legends, especially in AI where standards for these 'stupid AI stories' are so low that people will happily tell stories with no source ever (https://gwern.net/Tanks) or take an AI deliberately designed to make a particular mistake & erase that context to peddle their error story (eg https://hackernoon.com/dogs-wolves-data-science-and-why-mach...), this sort of sloppiness with stories should make you wary.

pugworthy · 3 years ago
Say you have a convoy of autonomous vehicles traversing a road. They are vision based. You destroy a bridge they will cross, and replace the deck with something like plywood painted to look like a road. They will probably just drive right onto it and fall.

Or you put up a "Detour" sign with a false road that leads to a dead end so they all get stuck.

As the articles says, "...straight out of Looney Tunes"

qwerty3344 · 3 years ago
would humans not make the same mistake?
atonse · 3 years ago
Maybe. Maybe not.

We also have intuition, where something just seems fishy.

Not saying AI can’t handle that. But I assure you that a human would’ve identified a moving cardboard box as suspicious without being told it’s suspicious.

It sounds like this AI was trained more on a whitelist, "here are all the possibilities of what marines look like when moving", rather than a blacklist, which is way harder: "here are all the things that are suspicious, like something that should be an inanimate object changing locations".

tgsovlerkhgsel · 3 years ago
Sure. But if someone wanted to destroy the cars, an easier way would be to... destroy the cars, instead of first blowing up a bridge and camouflaging the hole.
pugworthy · 3 years ago
True. So if they are smart enough to fool AI, they will just remove the mid span, and have convenient weight bearing beams nearby that they put in place when they need to cross. Or if it's two lane, only fake out one side because the AI will be too clever for its own good and stay in its own lane. Or put up a sign saying "Bridge out, take temporary bridge" (which is fake).

The point is, you just need to fool the vision enough to get it to attempt the task. Play to its gullibility and trust in the camera.

aftbit · 3 years ago
That sounds way harder. You'd first need to lift a giant pile of metal to a cartoonishly high height, then somehow time it to drop on yourself when the cars are near.
dilippkumar · 3 years ago
Unfortunately, this will not work for autonomous driving systems that have a front facing radar or lidar.

Afaik, this covers everybody except Tesla.

Looney Tunes attacks on Teslas might become a real subreddit one day.

ghiculescu · 3 years ago
Why wouldn’t it work for those systems?
amalcon · 3 years ago
The Rourke Bridge in Lowell, Massachusetts basically looks like someone did that, without putting a whole lot of effort into it. On the average day, 27,000 people drive over it anyway.
closewith · 3 years ago
Interestingly, the basics of concealment in battle - shape, shine, shadow, silhouette, spacing, surface, and speed (or lack thereof) - are all the same techniques the marines used to fool the AI.

The boxes and tree changed the silhouette and the somersaults changed the speed of movement.

So I guess we've been training soldiers to defeat Skynet all along.

ridgeguy · 3 years ago
Who knew the Marines teach Shakespearean tactics?

"Till Birnam wood remove to Dunsinane"

Macbeth, Act V, Scene III

optimalsolver · 3 years ago
That it turned out to just involve regular men with branches stuck to their heads annoyed JRR Tolkien so much that he created the race of Ents.
DennisP · 3 years ago
Turns out cats have been preparing for the AI apocalypse all along.
MonkeyMalarky · 3 years ago
Sounds like they're lacking a second level of interpretation in the system. Image recognition is great. It identifies people, trees and boxes. Object tracking is probably working too, it could follow the people, boxes and trees from one frame to the next. Juuust missing the understanding or belief system that tree+stationary=ok but tree+ambulatory=bad.
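A sketch of roughly what that missing second level could be (a hypothetical rule on top of a detector and tracker, nothing like the system in the article): combine the per-frame class with how far the track has actually moved.

```python
from dataclasses import dataclass

# Classes we expect to stay put; movement makes them suspicious.
STATIC_CLASSES = {"tree", "box", "rock"}

@dataclass
class Track:
    label: str              # from the image-recognition stage
    displacement_m: float   # total movement reported by the tracking stage

def is_suspicious(track: Track, threshold_m: float = 1.0) -> bool:
    """tree + stationary = ok, tree + ambulatory = bad."""
    return track.label in STATIC_CLASSES and track.displacement_m > threshold_m

print(is_suspicious(Track("tree", 0.0)))    # False: a tree standing still
print(is_suspicious(Track("box", 4.2)))     # True: a box that walked four metres
print(is_suspicious(Track("person", 4.2)))  # False: people are allowed to move
```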
voidfunc · 3 years ago
I'd imagine it could also look at infrared heat signatures too
sethhochberg · 3 years ago
Cardboard is a surprisingly effective thermal insulator. But then again, a box that is even slightly warmer than ambient temperature is... not normal.