There are 12 frogs: 5 green, 3 red, and 4 yellow. Two donkeys are counting the frogs. One of the donkeys is yellow, the other green. Each donkey is unable to see frogs that are the same color as itself; also, each donkey was careless and missed one frog while counting. How many frogs does the green donkey count?
GPT4 answers 6 every time for me.
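For reference, here's a quick sketch of the arithmetic the model has to get right (this is just my intended reading of the puzzle spelled out):

```python
# Green donkey's count: it can't see the 5 green frogs,
# and it carelessly misses 1 of the frogs it can see.
total = 12
visible = total - 5    # 7 frogs are red or yellow
counted = visible - 1  # the careless miss
print(counted)         # 6, matching GPT4's answer
```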
My point is that GPT is capable of a certain amount of "reasoning" about puzzles that most certainly don't exist in its training data. Playing with it, it's clear that in this current generation the reasoning ability doesn't go very deep - just change the above puzzle a little to make it even slightly more complicated and it breaks. The amazing thing isn't how good at reasoning it is, but that a computer can reason at all.
Edit: I do see now that "He saw" kind of messes the question up. My intent would have been better expressed with "There were". But again this proves my point! GPT4 is able to (most of the time) correctly work through the poor wording and interpret the question the way I meant it, and I think the way most people would read it.
>> A magical frog was counting unicorns. He saw 5 purple unicorns, 2 green unicorns, and 7 pink unicorns. However, he made a mistake and didn't see 2 unicorns: one purple and one green. Also, since he was a magical frog, he didn't see unicorns that were the same color as himself. How many unicorns did he count?
It correctly answers 11 for me.
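Spelled out, the chain of reasoning it has to follow (assuming, as it does, that the magical frog is green) is:

```python
# Unicorn puzzle arithmetic, assuming a green frog.
total = 5 + 2 + 7  # 14 unicorns in all
total -= 2         # a green frog sees no green unicorns at all
total -= 1         # the purple one he carelessly missed
# (the missed green unicorn is already excluded above)
print(total)       # 11
```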
To me this has demonstrated:
* "Understanding": It understood that "didn't see" implies he didn't count.
* "Knowledge": It knew enough about the world to know that frogs are often green.
* "Reasoning": It was able to correctly reason about how many should be subtracted from the final result.
* "Math: It successfully did some basic additions and subtractions arriving at the correct answer.
Crucially, I made this up right here on the spot, and used a die for some of the numbers. This question does not exist anywhere in the training corpus!
I think this demonstrates an impressive level of intelligence, beyond what, up until about a year ago, I thought a computer would ever be capable of in my lifetime. Now, in absolute terms, of course current gen ChatGPT is clearly far worse at reasoning and understanding than most people (well, specifically it seems to me that its knowledge and reasoning are super-humanly broad, but child-level deep).
Can future improvements to this architecture improve the depth up to "AGI", whatever that means? I have no idea. It doesn't automatically seem impossible, but maybe what we see now is already near the limit? I guess only time will tell.
It might be beneficial to start your dataset at the key (word) level: generate embeddings for each key pair in the source and target languages and stash them, then do the same at the sentence level and, just for fun, the paragraph level. (I believe you could get enough context from the sentence level, since a paragraph is just a group of sentences, but it would still be interesting to generate paragraph-level key pairs, I think.)
From there you'd have a set of embeddings for each src:tgt word pair that also carries context for how it fits at the sentence and paragraph level, with the respective nuances of each language.
Once you have that dataset, you can augment your data with prompts like the ones you're using, but also include contextual references to word pairs and sentence pairs in the prompt, which should steer the LLM onto the right path.
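Something like this rough sketch is what I have in mind, using sentence-transformers (the model name and the example pairs below are placeholders, not recommendations, and I don't know how well a multilingual model covers a niche language):

```python
# Rough sketch: embed src:tgt pairs at the word and sentence level and stash them.
from sentence_transformers import SentenceTransformer
import json

model = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")

word_pairs = [("water", "<tgt_word>")]                        # hypothetical src:tgt word pairs
sentence_pairs = [("The water is cold.", "<tgt_sentence>")]   # hypothetical sentence pairs

def embed_pairs(pairs, level):
    records = []
    for src, tgt in pairs:
        src_vec, tgt_vec = model.encode([src, tgt])
        records.append({
            "level": level,
            "src": src, "tgt": tgt,
            "src_embedding": src_vec.tolist(),
            "tgt_embedding": tgt_vec.tolist(),
        })
    return records

dataset = embed_pairs(word_pairs, "word") + embed_pairs(sentence_pairs, "sentence")
with open("pair_embeddings.jsonl", "w") as f:
    for rec in dataset:
        f.write(json.dumps(rec) + "\n")
```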
Edit: not an expert, so I'll defer if someone smarter comes along.
I want to try fine-tuning to machine translate to and from a fairly niche language (https://en.wikipedia.org/wiki/S'gaw_Karen_language). How much text would I need, and what format would be ideal?
I have a number of book-length texts, most only in the target language, and a few bilingual or multilingual. For the bilingual and multilingual texts, I can probably script out several thousand pairs of "translate the following text from <source_lang> to <target_lang>: <source_lang_text> <target_lang_text>". Do I need to vary the prompt and format, or can I expect the LLM to generalize to different translation requests? Is there value in repeating the material at different lengths: one set at sentence length, another at paragraph length, and another at page or chapter length? Also, what should be done with the monolingual texts? Just ignore them?
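For concreteness, here's roughly what I'd script for the pair generation (the prompt templates are made up and would presumably need tuning, and `aligned_pairs` stands in for whatever my alignment step produces):

```python
# Generate JSONL training pairs with varied prompt templates.
import json
import random

templates = [
    "Translate the following text from {src} to {tgt}: {text}",
    "How would you say this in {tgt}? The {src} text is: {text}",
    "{src} -> {tgt}: {text}",
]

aligned_pairs = [("Hello.", "<karen_text>")]  # placeholder bilingual pairs

with open("train.jsonl", "w") as f:
    for en, karen in aligned_pairs:
        prompt = random.choice(templates).format(
            src="English", tgt="S'gaw Karen", text=en)
        f.write(json.dumps({"prompt": prompt, "completion": karen}) + "\n")
```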
https://www.youtube.com/watch?v=WWRniMqhr00
https://en.wikipedia.org/wiki/Inedia
To me it seems uncalled for to accuse go_elmo of lying about knowing some of these people.
I tried YouTube Kids because I wanted the content to be pre-approved only, I liked the toddler-friendly controls, and the app can go into single-app mode on the iPad. It's subscription-based, which is fine, but the main problem is they won't let me approve the content I want. I can see some great kids' stories and learning videos in our language on regular YouTube, but can't seem to find a way to get them into the YouTube Kids app. (And I certainly am not going to turn her loose on regular YouTube.)
There don't seem to be any other apps that do what I want, so I ended up setting up a Plex media server and using yt-dlp to download the videos for her. This works pretty well, but it's a lot more work. And the app is not great.
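For anyone curious, the download side is just a small script around yt-dlp's Python API, roughly like this (the video URL and the Plex library path below are placeholders):

```python
# Pull approved videos into the Plex library folder via yt-dlp.
from yt_dlp import YoutubeDL

APPROVED = [
    "https://www.youtube.com/watch?v=EXAMPLE",  # hypothetical approved video
]

opts = {
    "outtmpl": "/plex/media/kids/%(title)s.%(ext)s",  # placeholder Plex path
    "format": "mp4",
}

with YoutubeDL(opts) as ydl:
    ydl.download(APPROVED)
```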