Readit News
dhon_ commented on Gemini 2.5 Computer Use model   blog.google/technology/go... · Posted by u/mfiguiere
MrToadMan · 2 months ago
Is this as impressive as it initially seems though? A Bing search for the text shows up some Web results for Dvorak to QWERTY conversion, I think because the word ‘t.fxrape’ (keyboard) hits. So there’s a lot of good luck happening there.
dhon_ · 2 months ago
Here's the chat session - you can expand the thought process and see that it tried a few things (hands misaligned with the keyboard for example) before testing the Dvorak keyboard layout idea.

https://chatgpt.com/share/68e5e68e-00c4-8011-b806-c936ac657a...

I also found it interesting that despite me suggesting it might be a password generator or API key, ChatGPT doesn't appear to have given that much consideration.

dhon_ commented on Gemini 2.5 Computer Use model   blog.google/technology/go... · Posted by u/mfiguiere
simonw · 2 months ago
Post edited: I was wrong about this. Gemini tried to solve the Google CAPTCHA but it was actually Browserbase that did the solve, notes here: https://simonwillison.net/2025/Oct/7/gemini-25-computer-use-...
dhon_ · 2 months ago
I was concerned there might be sensitive info leaked in the browserbase video at 0:58 as it shows a string of characters in the browser history:

    nricy.jd t.fxrape oruy,ap. majro

That's three groups of 8 characters followed by one group of 5, space-separated, for a total of 32 characters including the spaces. It looked like text from a password generator or maybe an API key. Perhaps it was accidentally pasted into the URL bar at some point and preserved in the browser history?

I asked ChatGPT about it and it replied:

    Not a password or key — it’s a garbled search query typed with the wrong keyboard layout.
    
    If you map the text from Dvorak → QWERTY,
    nricy.jd t.fxrape oruy,ap. majro → “logitech keyboard software macos”.
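
The layout mapping in that reply can be checked mechanically. Below is a minimal Python sketch, assuming the standard US Dvorak and US QWERTY layouts and covering only the unshifted letter and punctuation keys. The table constants and the function name `dvorak_to_qwerty` are illustrative and not taken from the chat session.

```python
# Unshifted keys listed in physical order (top, home, and bottom rows).
# Assumption: standard US QWERTY and US Dvorak layouts.
QWERTY = "qwertyuiop[]asdfghjkl;'zxcvbnm,./"
DVORAK = "',.pyfgcrl/=aoeuidhtns-;qjkxbmwvz"

# Map each character Dvorak produced back to the QWERTY label
# printed on the same physical key.
_DVORAK_TO_QWERTY = str.maketrans(DVORAK, QWERTY)


def dvorak_to_qwerty(text: str) -> str:
    """Recover what was meant when QWERTY key positions were typed with Dvorak active."""
    return text.translate(_DVORAK_TO_QWERTY)


print(dvorak_to_qwerty("nricy.jd t.fxrape oruy,ap. majro"))
# prints: logitech keyboard software macos
```

Characters outside the table, such as spaces, pass through unchanged, which is why the word boundaries survive.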

dhon_ commented on iPhone dumbphone   stopa.io/post/297... · Posted by u/joshmanders
jaysonelliot · 3 months ago
Before installing all those apps the author listed, I'd recommend this exercise:

Let the battery die on your phone, and live one week without it. Cold turkey. Tell people in advance if you need to, give them an alternate way to reach you. Replace your phone for that week with a small notebook that fits in your pocket.

During that week, every time you want to do something that requires a smartphone, jot it down in your notebook. Then, fifteen minutes later or so, write down what you did instead.

After a week, you're ready to start using your smartphone again and turn it into a so-called "dumb phone." Read your notebook and think honestly about which things you really needed to do, and which ones weren't such a big deal after all.

dhon_ · 3 months ago
I switched to a candy-bar style dumb phone for a month and did something similar. My list was pretty much the same as the one in the article with a few small changes.

The most jarring was probably maps - other things like email, messaging etc could be delayed until I could reach a computer but not knowing how to get somewhere right now was problematic and required planning in advance.

I usually kept my smartphone in my car and did a SIM swap on the occasions when I really needed it.

dhon_ commented on Claude Sonnet will ship in Xcode   developer.apple.com/docum... · Posted by u/zora_goron
not_your_vase · 4 months ago
Three days ago I saw another Claude-praising submission on HN, and I finally signed up for it to compare it with Copilot.

I asked 2 things.

1. Create a boilerplate Zephyr project skeleton for the Pi Pico with ST7789 SPI display drivers configured. It generated a garbage devicetree which didn't even compile. When I pointed it out, it apologized and generated another one that didn't compile. It also configured non-existent drivers, and for some reason it enabled monkey test support (but not test support).

2. I asked it to create 7x10 monochromatic pixelmaps, as C integer arrays, for the numeric characters 0-9. I also gave an example. It generated them, but the number eight looked like zero. (There was no cross in either the 0 or the 8, so it wasn't that. Both were just a ring.)

What am I doing wrong? Or is this really the state of the art?

dhon_ · 4 months ago
LLMs struggle more with embedded software due to the relative lack of examples in the training data compared to JavaScript etc. They also struggle more with visual reasoning tasks like the character example you provided.

For your first task, give it smaller steps along the way that you can validate. Provide context where possible (like docs for the ST7789, or examples of other Zephyr projects). Use Opus instead of Sonnet for tasks like this that are on the edge of its capabilities. It will still make mistakes, so be prepared to iterate on the design and provide feedback.

For your font example, you always need to validate the output of the LLM, but did it still save you time? If so, then I consider that a win. If not, give it the task that it's good at (i.e. generating the surrounding code and definitions) and then fill in the font data yourself.
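
One cheap way to do that validation is to render each generated glyph as ASCII and look at it. Here is a minimal Python sketch, assuming each glyph is stored as ten integers, one per pixel row, with the most significant of the 7 bits as the leftmost pixel. That encoding, the `render_glyph` helper, and the sample `EIGHT` data are all hypothetical, since the thread doesn't show the actual arrays.

```python
def render_glyph(rows, width=7):
    """Render a monochrome glyph as text, one line per pixel row.

    Assumption: each int is one row, and bit (width - 1) is the leftmost pixel.
    """
    return "\n".join(
        "".join("#" if (row >> (width - 1 - col)) & 1 else "." for col in range(width))
        for row in rows
    )


# Hand-written 7x10 "8" for illustration only (not LLM output from the thread).
EIGHT = [
    0b0111110,
    0b1000001,
    0b1000001,
    0b1000001,
    0b0111110,  # the middle bar is what distinguishes 8 from 0
    0b1000001,
    0b1000001,
    0b1000001,
    0b1000001,
    0b0111110,
]

print(render_glyph(EIGHT))
```

Printing all ten digits this way makes a missing middle bar obvious before the arrays go anywhere near the display driver.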

dhon_ commented on Show HN: Let’s Bend – Open-Source Harmonica Bending Trainer   letsbend.de... · Posted by u/egdels
dhon_ · 6 months ago
This is the best explanation I've found of the mechanics of pitch bending: https://youtu.be/Fp-GxEaChr0?si=-E9uDTQx51gtnd9C

Essentially, match the size of the resonance chamber in your mouth to the note you're trying to bend to. This is different for every note you bend. You can find the right size by making "hissing" noises while breathing in (without harmonica) and matching the pitch.

dhon_ commented on Show HN: Shelgon: A Framework for Building Interactive REPL Shells in Rust   github.com/NishantJoshi00... · Posted by u/cat-whisperer
cmrdporcupine · 9 months ago
Why would I want to add tokio as a dependency if I don't use it?
dhon_ · 9 months ago
Yes, you're right. I'd assumed that tokio was used internally in the UI, but from a cursory read that doesn't seem to be the case.
dhon_ commented on Show HN: Shelgon: A Framework for Building Interactive REPL Shells in Rust   github.com/NishantJoshi00... · Posted by u/cat-whisperer
cmrdporcupine · 9 months ago
+1 for #1. In general, I would recommend providing non-async alternative APIs, with the async runtime as an option rather than assumed default. Not all of us drink the kool-aid there. And no, I don't mean just providing a "sync" API that uses "block_on" behind the scenes but still uses tokio...

Also, for async don't mandate tokio, use the "agnostic" crate to abstract it so people can use alternative runtimes.

And yes, don't use anyhow in a library like this. Anyhow is for your internal code, not public libraries/crates. Define a set of error enums, or use thiserror for it if you have to.

dhon_ · 9 months ago
Could you elaborate on why using block_on wouldn't be an acceptable solution for users that want a blocking API?
dhon_ commented on I helped fix sleep-wake hangs on Linux with AMD GPUs   nyanpasu64.gitlab.io/blog... · Posted by u/fanf2
OvbiousError · 10 months ago
My colleague showed me his Windows machine recently. The rubber on the back around the fans has melted from the times he forgot to shut it down and sleep didn't trigger when he packed it away in his backpack.
dhon_ · 10 months ago
Linus Tech Tips on YouTube did a video about a Windows bug where sleeping while charging would allow the laptop to wake up to check for updates etc., which often caused this issue of it turning on in a bag.
dhon_ commented on OCR4all   ocr4all.org/... · Posted by u/LorenDB
modeless · 10 months ago
VLMs seem to render traditional OCR systems obsolete. I'm hearing lately that Gemini does a really good job on tasks involving OCR. https://news.ycombinator.com/item?id=42952605

Of course there are new models coming out every month. It's feeling like the 90s when you could just wait a year and your computer got twice as fast. Now you can wait a year and whatever problem you have will be better solved by a generally capable AI.

dhon_ · 10 months ago
I've seen Gemini Flash 2 mention "in the OCR text" when responding to VQA tasks, which makes me question whether they have a traditional OCR process mixed into the pipeline.
dhon_ commented on Nix – Death by a Thousand Cuts   dgt.is/blog/2025-01-10-ni... · Posted by u/jonotime
jonotime · a year ago
Author here. Your TLDR is spot on. Yes, my intent was to focus on desktop use, since most things I read don't consider that specifically. I did talk about how I would keep this running on some simple home servers, since I think that's where Nix shines. But some of my servers are Raspberry Pis, which I mentioned I am worried about running Nix on due to resource limitations. I should probably just try it.
dhon_ · a year ago
I wish remote build/deploy for Raspberry Pi was in a better state - it seems like a perfect fit for NixOS.

I've got x86 servers running NixOS that are deployed using Colmena, but it seems to fall apart when I add cross compilation into the mix.

u/dhon_

Karma: 278 · Cake day: January 20, 2014