RIP to a legend
- Buy a small robot kit from Amazon or a local reseller. Yahboom makes some good robot toy car kits, and Hugging Face has the open-source SO-ARM101, which plenty of companies now manufacture and sell. Expect to spend about $250 USD (including a Jetson Nano) for a good kit, or up to $1000 USD if you want more sensors
- If you can’t afford a real robot, play around with simulators like Isaac Sim and Mujoco
- Check out LeRobot, an excellent framework for ML robotics from Hugging Face
- Learn the basics of ROS (pub/sub). Even if you don’t end up using it, a lot of the industry jargon and design patterns come from ROS, so it helps to understand it (see the sketch after this list). Think of ROS like Ruby on Rails: it’s a heavyweight, batteries-included framework with lots of opinions.
- ROS does have some nice libraries for manipulation (MoveIt) and navigation (Nav2) using more classical (non-ML) methods
- Leverage AI tools such as ChatGPT and Cursor when you get stuck; it’s a lot faster than Googling when you’re just getting started and don’t even know the right term to search for.
- (Shameless plug) Check out two tools I’m working on: mcap.dev for logging and foxglove.dev for visualization
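Since pub/sub is the core ROS idea mentioned above, here's roughly what it looks like in ROS 2 with rclpy. This is a minimal sketch along the lines of the standard talker/listener tutorial; the node and topic names and the message contents are arbitrary choices for illustration.

```python
# Minimal ROS 2 publisher/subscriber sketch (rclpy).
# Node/topic names and message text are arbitrary, for illustration only.
import rclpy
from rclpy.node import Node
from rclpy.executors import SingleThreadedExecutor
from std_msgs.msg import String


class Talker(Node):
    def __init__(self):
        super().__init__('talker')
        self.pub = self.create_publisher(String, 'chatter', 10)
        self.timer = self.create_timer(0.5, self.tick)  # publish at 2 Hz

    def tick(self):
        msg = String()
        msg.data = 'hello from the talker'
        self.pub.publish(msg)


class Listener(Node):
    def __init__(self):
        super().__init__('listener')
        self.sub = self.create_subscription(String, 'chatter', self.on_msg, 10)

    def on_msg(self, msg):
        self.get_logger().info(f'heard: {msg.data}')


def main():
    rclpy.init()
    talker, listener = Talker(), Listener()
    # Spin both nodes in one process to keep the example self-contained;
    # normally each node runs as its own process.
    executor = SingleThreadedExecutor()
    executor.add_node(talker)
    executor.add_node(listener)
    try:
        executor.spin()
    finally:
        rclpy.shutdown()


if __name__ == '__main__':
    main()
```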
Don't bother with a Jetson Nano; you don't need one to get started, and by the time you do need one you'll already know a lot. You can just drive the robot from your laptop!
Getting to the point of training your own fine-tuned VLA model is quick and easy. You can see examples of other people completing the tutorial and uploading their training/evaluation datasets here (shameless plug for my thing): https://app.destroyrobots.com
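For context, that kind of tutorial mostly boils down to behaviour cloning: record (image, action) pairs by teleoperating the arm, then fine-tune a policy to regress the demonstrated actions. The sketch below is not the LeRobot pipeline, just plain PyTorch showing the core loop; the ResNet backbone, 6-dim action space, and random stand-in data are all assumptions for illustration.

```python
# Generic behaviour-cloning sketch in plain PyTorch (not the LeRobot API).
# Assumes a dataset of (camera image, joint action) pairs recorded via teleop;
# the 6-dim action size and ResNet backbone are illustrative assumptions.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18, ResNet18_Weights

ACTION_DIM = 6  # e.g. joint deltas for a 6-DoF arm (assumption)

# Pretrained vision backbone with the classifier replaced by a small policy head.
policy = resnet18(weights=ResNet18_Weights.DEFAULT)
policy.fc = nn.Sequential(
    nn.Linear(policy.fc.in_features, 256), nn.ReLU(),
    nn.Linear(256, ACTION_DIM),
)

# Stand-in data: swap in your recorded teleop episodes.
images = torch.randn(64, 3, 224, 224)
actions = torch.randn(64, ACTION_DIM)
loader = DataLoader(TensorDataset(images, actions), batch_size=16, shuffle=True)

opt = torch.optim.AdamW(policy.parameters(), lr=1e-4)
for epoch in range(5):
    for img, act in loader:
        pred = policy(img)
        loss = nn.functional.mse_loss(pred, act)  # regress demonstrated actions
        opt.zero_grad()
        loss.backward()
        opt.step()
    print(f"epoch {epoch}: loss {loss.item():.4f}")
```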
I wouldn't bother much with ROS at first, tbh. It'll bog you down, and startups are moving toward more developer-friendly approaches, like Rust-based embedded stacks.
You can go far with a robot connected to USB though!
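Concretely, "driving the robot from your laptop" can be as simple as opening the USB serial port the kit exposes and writing motor commands to it. Here's a minimal sketch with pyserial; the port path, baud rate, and command format are placeholders for whatever protocol your kit's firmware actually speaks.

```python
# Minimal sketch of driving a hobby robot over USB serial with pyserial.
# The port, baud rate, and command strings are placeholders -- check your
# kit's firmware docs for the real protocol.
import time
import serial

PORT = "/dev/ttyUSB0"   # on Windows this would be something like "COM3"
BAUD = 115200

with serial.Serial(PORT, BAUD, timeout=1) as ser:
    time.sleep(2)  # many boards reset when the port opens; give them a moment
    # Hypothetical command format: "M,<left_speed>,<right_speed>\n"
    ser.write(b"M,0.3,0.3\n")   # drive forward slowly
    time.sleep(1.0)
    ser.write(b"M,0,0\n")       # stop
    reply = ser.readline()      # read any acknowledgement the firmware sends
    print(reply.decode(errors="replace"))
```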
All that aside, the article doesn't really make much of an argument as to why 3 billion current users shouldn't be worth a lot of money to someone wanting to monetize them (even if the author doesn't see a good monetization opportunity). Instead, it focuses on why the Google integrations Chrome had are what made it popular. One of the biggest differences between Google selling Chrome and any old Chromium fork is precisely that the "other" browsers no longer have to compete with Google's own browser to get users to monetize.
I worked at Mozilla when this deal was struck. The deal with Yahoo did require Yahoo be the default for Firefox, I'm not sure what you mean by "absence of any requirement"?
Mozilla broke that contract with Yahoo less than 3 years later because users hated Yahoo so much, and went back to Google. (There was a clause allowing them to do so without repercussions and keep the money if they deemed it better for users; wild contract.)
Google is dominant because it just _is_ the best search engine.
> One of the biggest differences between Google selling Chrome and any old chromium fork is precisely that the "other" browsers no longer have to try to compete with Google's own browser to get users to monetize.
Isn't that literally anti-competitive? The DoJ is saying Google search is dominant partially because of Chrome pushing users to Google.
You're saying Chrome is dominant because users like it too much, and other browsers can't compete? Tough, that's the users' choice, though.
For example, so that you don't crush a human when giving a massage (while still pressing hard enough), or so you apply the right amount of force (and finesse?) to skin a fish fillet without cutting into the skin itself.
Practically, in the near term, it's hard to sample failure examples from YouTube videos, such as the moments when food accidentally spills out of the pot. Studying simple tasks only through the happy path makes it hard for the robot to figure out how to keep trying until it succeeds, a problem that shows up even in relatively simple jobs like shuffling garbage.
With that said, I suppose a robot can be made to practice in real life after learning something from vision.
I'm not sure that's necessarily true for a lot of tasks.
A good way to gauge this in your head is to ask:
"If you were given remote control of two robot arms, and just one camera to look through, how many different tasks do you think you could complete successfully?"
When you start thinking about it, you realize there are a lot of things you could do with just the arms and one camera, because you as a human have really good intuition about the world.
It therefore follows that robots should be able to learn with just RGB images too! Counterexamples would be things like grabbing an egg without crushing it, perhaps, though I suspect that could also be done with just vision.