bnjmn commented on Benjie's Humanoid Olympic Games   generalrobots.substack.co... · Posted by u/robobenjie
trhway · 2 months ago
> a baby some robot neglects to prevent from falling off the changing table

That is only a problem if we limit ourselves to two-handed robots. A six-handed robot can easily assign two or three hands to holding the baby securely. Humanoid robots are handicapped by their similarity to humans, which is really an artificial constraint. After all, we aren't building airplanes using birds as the blueprint.

On a similar note - while not about a baby, I was just rewatching an early Big Bang Theory season with the episode where Howard "falls right into the mechanical hand"

bnjmn · 2 months ago
> Humanoid robots are handicapped by their similarity to humans which is really an artificial constraint.

YES, and I wish people would stop pretending we've unlocked some new generality by promoting generic humanoid robots over task-specific ones.

You can probably Rube-Goldberg your way to a diaper-changing robotic enclosure with a 3D baby bidet that uses many low-force robot arms to subdue (most) babies, but a humanoid robot is a very poor substitute for a human here.

Plus, a human can take personal responsibility for the baby's safety, which is not something a robot can ever do, unless we somehow make the robot fear for its life/freedom/employment the way the overarching social/legal system makes humans fear for theirs when they sign contracts or accept highly accountable roles.

bnjmn commented on Benjie's Humanoid Olympic Games   generalrobots.substack.co... · Posted by u/robobenjie
bnjmn · 2 months ago
Here's a use case that seems more science fictional to me (as the parent of a 2yo) than warp drive: a robot that can gently restrain an uncooperative human baby while changing its diaper, with everything that entails: identifying and eliminating all traces of waste from all crevices, applying diaper cream as necessary, unfolding and positioning the new diaper correctly and quickly, always using enough but never too much force... not to mention the nightmare of providing any guarantees about safety at mass-market scale. Even one maimed baby, or even just a baby some robot neglects to prevent from falling off the changing table, is game over for that line of robots.

Is there any research program that could claim to tackle this? It's so far beyond folding laundry and doing dishes, which are already quite difficult.

I wouldn't bet my life on this tech _never_ materializing, but I would mistrust anyone who claimed it was feasible with today's tech. It calls for an entirely different kind of robotic perception, feedback, and control.

bnjmn commented on Claude can sometimes prove it   galois.com/articles/claud... · Posted by u/lairv
bnjmn · 3 months ago
Among other general advice, my CLAUDE.md insists that Claude prove to me that each unit of change works as expected; usually I'm just hoping for it to write tests and convince me they're actually running and passing. A proof assistant seems like overkill here, and yet Claude often struggles to assemble even these informal proofs. I can see the benefit of a more formal proof language, along with a source of programmatic feedback, compared to open-ended verbal proof.

"Overkill" of course is an editorial word, and if you know about https://en.wikipedia.org/wiki/Curry%E2%80%93Howard_correspon... then you know many statically typed programming languages are essentially proof assistants, where the proof goal is producing well-typed programs. LLMs are already quite good at interacting with these programming language proof assistants, as you can see any time a competent LLM interacts with the Rust borrow checker, for example.

bnjmn commented on Palma 2   shop.boox.com/products/pa... · Posted by u/tosh
bnjmn · 9 months ago
They say the e-ink display has "Unmatched Speed, Never Seen on ePaper", so it would be nice to know the actual refresh rate.

This is not an endorsement, but https://daylightcomputer.com/ claims 60fps, so that's the bar to meet in my opinion. Caveat: the daylight display is not true e-ink, but an e-ink-like LCD, IIUC.

bnjmn commented on Tokenisation Is NP-Complete   arxiv.org/abs/2412.15210... · Posted by u/belter
mcyc · a year ago
NB: Can't edit my original reply.

Sorry, actually I misread part of your comment in relation to the paper and confused δ with another parameter, K.

To clarify, δ is the number of tokens in the tokenized corpus and K is the size of the vocabulary.

So, if you are asking why they would limit _K_, then my answer still applies (after swapping δ for K). But if you still mean "why do they pick some arbitrary δ as the limit of the size of the tokenized corpus", then I think the answer is just "because that makes it a decision problem".

bnjmn · a year ago
Thanks for these detailed replies! Now I really want to read your paper.
bnjmn commented on Tokenisation Is NP-Complete   arxiv.org/abs/2412.15210... · Posted by u/belter
immibis · a year ago
NP is a category of decision problems - problems with boolean answers. Saying that it's NP-complete to find the tokeniser that produces the fewest symbols is meaningless. You have to convert it to the form "is there a tokenizer that produces fewer than N symbols?" before it even makes sense to ask whether it's NP-complete.
bnjmn · a year ago
I fully agree with your final statement, but needing to constrain the problem in an artificial way to prove it's NP-complete doesn't mean the constraint was justified or realistic, because then you've only proved that the constrained version of the decision problem is NP-hard.

There might be plenty of perfectly "good" tokenizers (whatever that ends up meaning) that can be found or generated without formulating their design as an NP-complete decision problem. Claiming "tokenization is NP-complete" (paper title) in general seems like an overstatement.

bnjmn commented on Tokenisation Is NP-Complete   arxiv.org/abs/2412.15210... · Posted by u/belter
bnjmn · a year ago
> We still do not know, for instance, what makes a good tokeniser (Gowda and May, 2020; Cognetta et al., 2024): which characteristics should its produced subwords `s` have to be a good starting point for language modelling? If we knew this, then we could define an objective function which we could evaluate tokenisers with.

I don't see how the authors get past this true general statement from the first paragraph of the introduction. Finding a good tokenizer is not just NP-hard; we have no idea how hard it might be because we don't have theoretical agreement on what "good" means.

In order to have something to prove, the authors decide (somewhat arbitrarily):

> Specifically, we focus on finding tokenisers that maximise the compression of a text. Given this objective, we then define the tokenisation problem as the task of finding a tokeniser which compresses a dataset to at most δ symbols.
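
To make that definition concrete, here is a minimal sketch (my own construction, not the paper's; the vocabulary and δ below are made-up illustrative values): a greedy longest-match tokenizer, with "compresses the dataset to at most δ symbols" phrased as a yes/no check.

  // Hedged sketch: greedy longest-match tokenisation, used to phrase
  // compression as a decision problem ("at most delta symbols?").
  fn token_count(text: &str, vocab: &[&str]) -> usize {
      let mut rest = text;
      let mut count = 0;
      while !rest.is_empty() {
          // Take the longest vocabulary entry that prefixes the remainder,
          // falling back to a single character (the base alphabet).
          let step = vocab
              .iter()
              .filter(|tok| rest.starts_with(**tok))
              .map(|tok| tok.len())
              .max()
              .unwrap_or_else(|| rest.chars().next().unwrap().len_utf8());
          rest = &rest[step..];
          count += 1;
      }
      count
  }

  fn main() {
      let vocab = ["ban", "ana", "na"]; // K = 3 learned subwords (illustrative)
      let delta = 4; // made-up budget on the tokenised length
      let n = token_count("bananana", &vocab);
      println!("{n} tokens; within budget: {}", n <= delta);
  }

Running the tokenizer is of course the easy part; the hardness the paper studies is in choosing the vocabulary so the budget is met.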

Is a tokenizer that maximizes the compression of text (e.g. by identifying longer tokens that tend to be used whole) necessarily a better tokenizer, in terms of overall model performance? Compression might be a useful property for an objective function to consider... but then again maybe not, if it makes the problem NP-hard.

I'm also not sure how realistic the limitation to "at most δ symbols" is. I mean, that limit is undeniably useful for making the proof of NP-completeness go through, because it's a similar mechanism to the minimum number of satisfied clauses in the MAX-2-SAT definition. But why not just keep adding tokens as needed, rather than imposing any preordained limit? IIRC OpenAI's tokenizer has a vocabulary of around 52k subword strings. When that tokenizer was being designed, I don't imagine anyone would have worried much if the final number had ended up being 60k or even 100k. How could you possibly choose a meaningful δ from first principles?

To put that point a different way, imagine the authors had proven NP-completeness by reduction from the Knapsack Problem, where the knapsack you're packing has some maximum capacity. If you can easily swap your knapsack out for a larger knapsack whenever it gets (close to) full, then the problem becomes trivial.

If the authors had managed to prove that any arbitrary objective function leads to an NP-hard tokenizer optimization problem, then their result would be more general. If the paper proves that somehow, I missed it.

I suppose this paper suggests "here be dragons" in an interesting if incomplete way, but I would also say there's no need to hurt yourself with an expensive optimization problem when you're not even sure it delivers the results you want.

bnjmn commented on The longest word you can type on the first row   rubenerd.com/the-longest-... · Posted by u/nafnlj
bnjmn · 2 years ago
On any macOS computer (or replace /usr/share/dict/words with your own word list):

  grep '^[qwertyuiop]*$' /usr/share/dict/words | \
  awk '{ print length(), $0 }' | \
  sort -n
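
(Since sort -n sorts ascending, the longest qualifying words end up at the bottom; append | tail -1 if you only want the single longest. Note the pattern is case-sensitive, so capitalized dictionary entries are excluded; add -i to grep if you want them too.)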

bnjmn commented on How the most popular cars in the US track drivers   wired.com/story/car-data-... · Posted by u/arkadiyt
bnjmn · 2 years ago
Are there any new cars / car brands that credibly promise not to track their drivers?

Any car with a network connection for software updates seems likely to be harvesting driver data, or is at least capable of doing so.

u/bnjmn

Karma: 477 · Cake day: July 6, 2011

About: [ my public key: https://keybase.io/benjamn; my proof: https://keybase.io/benjamn/sigs/94-u377AG0oTnRb3A88rmP2WOk88eq9DrT9K2RGfsJk ]