pedrovhb (u/pedrovhb)

pedrovhb commented on Show HN: OverType – A Markdown WYSIWYG editor that's just a textarea · Posted by u/panphora

pedrovhb · 10 days ago

Nice! Seems very useful if you can drop in and have everything work.

Nitpicking a bit: it's not so much _rendering_ markdown as _syntax highlighting_ it. Another interesting approach could be to use the CSS Custom Highlight API [0]. Then it wouldn't need the preview div, and perhaps it'd even be possible to have non-mono fonts and varying text sizes for headers.

[0] https://developer.mozilla.org/en-US/docs/Web/API/CSS_Custom_...

pedrovhb commented on Checking Out CPython 3.14's remote debugging protocol rtpg.co/2025/06/28/checki... · Posted by u/ingve

orbisvicis · a month ago

The problem is running the injected code at a specific location... line number or definition.

pedrovhb · a month ago

You can presumably inject code that calls `sys.settrace` for that. It's somewhat underwhelming to realize that you pretty much could have done that before, too, but it's convenient that you no longer need the foresight to set it up beforehand.
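A rough sketch of what I mean, in case it helps. The `run_at_line` helper and the `demo` function are made up for illustration; only `sys.settrace` and the frame attributes are real. One caveat I'm fairly sure of: `sys.settrace` only affects frames entered after it's installed (and only in the current thread), so for a function that's already running you'd probably have to set `f_trace` on the existing frame as well.

```python
import sys


def run_at_line(func_name, lineno, payload):
    """Call payload(frame) just before `lineno` executes inside `func_name`.

    Hypothetical helper: matches by function name and absolute line number.
    """
    def local_trace(frame, event, arg):
        # "line" events fire before the line is executed.
        if event == "line" and frame.f_lineno == lineno:
            payload(frame)
        return local_trace

    def global_trace(frame, event, arg):
        # Only trace line-by-line inside the function we care about.
        if event == "call" and frame.f_code.co_name == func_name:
            return local_trace
        return None

    sys.settrace(global_trace)


def demo():
    x = 1
    y = x + 1  # the payload runs right before this line
    return y


seen = []
target = demo.__code__.co_firstlineno + 2  # line of `y = x + 1`
run_at_line("demo", target, lambda frame: seen.append(dict(frame.f_locals)))
result = demo()
sys.settrace(None)  # stop tracing
```

Here the payload just snapshots the locals, but it could be any code you want to run at that point.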

pedrovhb commented on Brazil's government-run payments system has become dominant economist.com/the-america... · Posted by u/jcartw

pedrovhb · 5 months ago

As a Brazilian - Pix was a pleasant surprise, especially in that for once it feels like we're not lagging behind. It offers convenient, free, instant transfers across banks. You can also easily create or programmatically generate QR codes or pastable codes with preset receivers and amounts. Great UX all around, and it quickly became the de facto standard for how people send money.

It's technically quite impressive - it's a large scale thing and it works really well. I can think of maybe one or two times in these years where I saw downtime, and in both cases it was working again after a few minutes. The usual experience with the government building technical solutions is to have something that makes little sense, is slow, and goes down frequently with even the most predictable usage peaks, but with Pix they really seem to have nailed it.

It does feel a bit weird to have so many payments go through the government's systems, and it definitely feels like it puts them in a position of having more information than they should. There's a lot of Orwellian surveillance potential there, as any transfers are necessarily tied to both users' real identities. I don't think there's a realistic way around this, though.

Another concern is that people can expose some of their information without necessarily being aware of it. You can register e.g. emails and phone numbers as Pix "keys", and then anyone can initiate a transfer to those keys and your full name will pop up so they can confirm or cancel the transfer. I've seen some clever advice around this - "When using a carpooling app (often details are arranged off the platform using WhatsApp), put the driver's phone number on Pix. If a name comes up and it doesn't match the name or gender of the driver's profile, something is up". Obviously, though, there's potential for misuse, and I'm sure the vast majority of people don't think about this when registering their Pix keys. You can, however, just use randomly generated UUIDs as keys as well, a different one for each transaction if you so desire, so this one can be a non-issue with more awareness.

Overall though it's a very convenient thing which works surprisingly well, and the downsides are theoretical at this point. IMO it's a rare case of our government nailing something.

pedrovhb commented on Exposed DeepSeek database leaking sensitive information, including chat history wiz.io/blog/wiz-research-... · Posted by u/talhof8

caust1c · 7 months ago

Interesting to note:

- Dev infra, observability database (open telemetry spans)

- Logs of course contain chat data, because that's what happens with logging inevitably

The startling rocket building prompt screenshot that was shared is meant to be shocking of course, but most probably was training data to prevent deepseek from completing such prompts, evidenced by the `"finish_reason":"stop"` included in the span attributes.

Still pretty bad obviously and could have easily led to further compromise but I'm guessing Wiz wanted to ride the current media wave with this post instead of seeing how far they could take it. Glad to see it was disclosed and patched quickly.

pedrovhb · 7 months ago

> but most probably was training data to prevent deepseek from completing such prompts, evidenced by the `"finish_reason":"stop"` included in the span attributes

As I understand it, the finish reason being "stop" in API responses usually means the AI ended the output normally. In any case, I don't see how training data could end up in production logs, nor why they'd want to prevent such data (a prompt you'd expect a normal user to write) from being responded to.

> [...] I'm guessing Wiz wanted to ride the current media wave with this post instead of seeing how far they could take it.

Security researchers are often asked to not pursue findings further than confirming their existence. It can be unhelpful or mess things up accidentally. Since these researchers probably weren't invited to deeply test their systems, I think it's the polite way to go about it.

This mistake was totally amateur hour by DeepSeek, though. I'm not too into security stuff but if I were looking for something, the first thing I'd think to do is nmap the servers and see what's up with any interesting open ports. Wouldn't be surprised at all if others had found this too.

pedrovhb commented on Exposed DeepSeek database leaking sensitive information, including chat history wiz.io/blog/wiz-research-... · Posted by u/talhof8

jazzyjackson · 7 months ago

I don't have personal experience, but from a quick Google search it looks like the default setup is to accept connections on localhost only [0], and there's a default user without the capability to run SQL statements. They would have had to open remote connections and enable SQL capability for the default user (it looks like this is the first step to creating other users; the third step is removing SQL capability for the default user) [1]:

  1. Enable SQL-driven access control and account management for the default user.
  2. Log in to the default user account and create all the required users. Don’t forget to create an administrator account (GRANT ALL ON *.* TO admin_user_account WITH GRANT OPTION).
  3. Restrict permissions for the default user and disable SQL-driven access control and account management for it.

[0] https://chistadata.com/knowledge-base/allow-clickhouse-to-ac...

[1] https://clickhouse.com/docs/en/operations/access-rights

pedrovhb · 7 months ago

I imagine it wouldn't necessarily require their opening of remote connections, just a misconfigured reverse proxy.

pedrovhb commented on Open Heart Protocol openheart.fyi/... · Posted by u/thunderbong

xrd · 7 months ago

Is this a decentralized like button? It's an interesting alternative to webmention (as is mentioned).

pedrovhb · 7 months ago

This seems centralized, though you can self-host it.

pedrovhb commented on Trusting clients is probably a security flaw liberda.nl/weblog/trust-n... · Posted by u/aquastorm

pedrovhb · 7 months ago

> [the extensive anti-reverse engineering measures are] more annoying than any financial app I've had, and I have 5 of them on my phone

Ah, this reminds me of the Tuya app.

I've done some SSL unpinning and MITM to see requests going in and out of my phone. It's pretty fun, and there are often really nice, easy-to-use RESTful APIs underneath. Among them I've also done a couple of banking apps, and they weren't particularly defensive either. That's great; as a user I'm empowered by it and, like TFA says, it's totally fine from a security standpoint if you just don't trust the client to do anything they shouldn't be able to do. It shouldn't be your form validation that stops me from transferring a trillion dollars, and though I haven't tried, I'm sure that's not the case for those apps. All it does is allow me to get my monthly statements with a for loop rather than waiting for a laggy UI and clicking through each month.

Now, Tuya is a Chinese company offering a bunch of cheap IoT devices like smart power switches and IR motion detectors. You can interact with everything through their app. That app for some reason has spent by far the most resources on anti-RE of any app I've seen. I already bought your hardware, mate. Please let me use it on my local network. My smart home infrared motion sensors were meant to turn lights on when I enter a room. But they don't feel very smart when I'm standing in the dark for 4 seconds while they check with a server in China. I don't even need a clean API; just let me see what you do, and I'll do something similar, no support or documentation necessary. But they go to extensive lengths to prevent you from interacting with the hardware you bought and which is sitting in your home.

This was a while ago, but I think for the motion sensing in particular, I managed to just put them in a subnetwork with blocked internet access, and snooped on the network to catch their DHCP requests when they tried to call home. This would happen every once in a while presumably for settings/update checks, but crucially also when there was motion detected, and I didn't mind a few false positives. So in the end they were very quick, locally functioning, privacy-friendly little devices!

pedrovhb commented on Pigment Mixing into Digital Painting scrtwpns.com/mixbox/... · Posted by u/tlarkworthy

pedrovhb · 8 months ago

That's very interesting!

My first thought, looking at the webpage: "Huh, that's neat. I didn't know that painting software didn't even attempt to do color mixing beyond naive interpolation, though I guess it figures; the physics behind all the light stuff must be fairly gnarly, and there's a lot of information lost in RGB that probably can't be just reconstructed."

Scrolling down a bit: "Huh, there's some snippets for using it as a library. Wait, it does operations in RGB? What's going on here?"

Finally, clicking the paper link, I found the interesting bit: "We achieve this by establishing a latent color space, where RGB colors are represented as mixtures of primary pigments together with additive residuals. The latents can be manipulated with linear operations, leading to expected, plausible results."

That's very clever, and seems like a great use for modern machine learning techniques outside the fashionable realm of language models. It uses perceptual color spaces internally too, and physics-based priors. All around, a very technically impressive and beautiful piece of work.

It rhymes with an idea that's been floating in my head for a bit - would generative image models, or image encoder models, work better if rather than RGB, we fed them wavelength data, or at least a perceptually uniform color space? Seems it'd be closer to the truth than arbitrarily using the wavelengths our cone cells happen to respond to (and only roughly, at that).
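For reference, the "naive interpolation" from my first thought is just a per-channel blend of the RGB values. A minimal sketch, with channels as floats in [0, 1]; `lerp_rgb` is my own illustrative name, not anything from Mixbox:

```python
def lerp_rgb(a, b, t):
    """Linearly interpolate two RGB triples channel by channel."""
    return tuple((1 - t) * x + t * y for x, y in zip(a, b))


BLUE = (0.0, 0.0, 1.0)
YELLOW = (1.0, 1.0, 0.0)

# Equal parts blue and yellow, mixed the naive way.
mixed = lerp_rgb(BLUE, YELLOW, 0.5)
```

The result is a flat mid-gray, whereas mixing blue and yellow paint gives you green. That gap is what the latent pigment space is there to close.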

pedrovhb commented on How we made our AI code review bot stop leaving nitpicky comments greptile.com/blog/make-ll... · Posted by u/dakshgupta

pedrovhb · 8 months ago

Here's an idea: have the LLM output each comment with a "severity" score from 0 to 100, or maybe one of a fixed set of values ("trivial", "small", "high"). Let it get everything off its chest, outputting the nitpicks but recognizing that they're minor. Then filter the output to only contain comments above a given threshold.

It's hard to avoid thinking of a pink elephant, but easy enough to consciously recognize it's not relevant to the task at hand.
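Something like the following, as a sketch. It assumes the model is asked to return a JSON list of objects with `severity` and `body` fields; that shape, the field names, and the threshold of 60 are all made up for illustration.

```python
import json
from typing import Dict, List


def filter_comments(raw_llm_output: str, threshold: int = 60) -> List[Dict]:
    """Keep only review comments whose severity meets the threshold.

    Assumes (hypothetically) that the LLM returned a JSON list of
    {"severity": int, "body": str} objects.
    """
    comments = json.loads(raw_llm_output)
    # Comments with a missing severity are treated as nitpicks and dropped.
    return [c for c in comments if c.get("severity", 0) >= threshold]


raw = json.dumps([
    {"severity": 10, "body": "Consider renaming `tmp` to something clearer."},
    {"severity": 85, "body": "This query is built by string concatenation."},
    {"severity": 40, "body": "Docstring is missing a period."},
])
kept = filter_comments(raw)
```

The model still gets to write the nitpicks down; they just never reach the PR.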

pedrovhb commented on Training LLMs to Reason in a Continuous Latent Space arxiv.org/abs/2412.06769... · Posted by u/omarsar

ttul · 9 months ago

Indeed, I would not be surprised if OpenAI one day admits that the `o1` model uses the last hidden layer (or some other intermediate layer) to feed the "thought process" that you can watch as it "thinks" about the answer. I suspect that they may take the last hidden layer and feed it back into the front of the `o1` model while also feeding a separate, likely much smaller LLM that generates the "thought process" as language tokens.

In this manner, the model makes use of the rich semantic information encoded at the last hidden layer while informing the user via an extraction of that hidden layer specifically tuned to generate human-legible concepts such as, "I'm considering the impact of converting the units from kilograms to pounds," or whatever.

pedrovhb · 9 months ago

That's certainly possible, but it reminds me a bit of a similar thing I've seen in their UI that rhymes in a way that makes me think otherwise. In the code interpreter tool, you have a little preview of the "steps" it's following as it writes code. This turns out to just be the contents of the last written/streamed comment line. It's a neat UI idea I think - pretty simple and works well. I wouldn't be surprised if that's what's going on with o1 too - the thought process is structured in some way, and they take the headings or section names and just display that.
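To illustrate the code interpreter behavior I'm describing (this is a guess at the idea, not OpenAI's actual implementation, and `current_step` is a hypothetical name): take whatever code has been streamed so far and show the most recent comment line as the "step".

```python
from typing import Optional


def current_step(streamed_code: str) -> Optional[str]:
    """Return the text of the last comment line streamed so far, if any."""
    for line in reversed(streamed_code.splitlines()):
        stripped = line.strip()
        if stripped.startswith("#"):
            # Drop the leading '#' characters and surrounding whitespace.
            return stripped.lstrip("#").strip()
    return None


code = (
    "import statistics\n"
    "# Load the data\n"
    "values = [1, 2, 3]\n"
    "# Compute the mean\n"
    "print(statistics.mean(values))\n"
)

# Simulate the preview at two points in the stream.
early_step = current_step(code[: code.index("# Compute")])
final_step = current_step(code)
```

The same trick would work on a structured chain of thought by swapping the comment check for a heading check.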