markden commented on Launch HN: Bitrig (YC S25) – Build Swift apps on your iPhone    · Posted by u/kylemacomber
kylemacomber · 7 months ago
Don't tell anyone, but once you're on the paid plan we're not actually enforcing the 100 message limit right now ;)

We're still in the learning phase, and are going to adjust the plans based on exactly the kind of considerations you're raising

markden · 7 months ago
Your secret is safe, hah! It's almost like you need an intermediate 'cheap/dumb' AI as a proxy to flesh everything out ahead of sending it off to be coded ... something that crisps up all the requirements and ultimately crafts a more cost-effective and likely all-around-better prompt. Even being semi-technical, I'm always surprised how much the details matter when describing something to AI (e.g., the fireworks should only explode _outward_, not inward, lol). Thanks!
markden commented on Launch HN: Bitrig (YC S25) – Build Swift apps on your iPhone    · Posted by u/kylemacomber
markden · 7 months ago
First, as a non-developer geek -- super fun.

What happens if you go over 100 messages/month?

I just burned my 5 free messages to get a simple toggle button working that just says "win" (with animated fireworks!) and "lose". I'm sure I'm not an efficient prompter, but it seems I'd knock out 100 messages easily in an afternoon, which looks to be the monthly limit at $20/mo.

(This is coming from someone who has no idea how expensive it would be to 'vibe code' using something like Claude ... so it may be an entirely unfair assumption that you could chat with this 'unlimited' for $20/mo ... that's what I have in my head as 'reasonable' only because that's what I pay for Gemini or ChatGPT and, for all intents and purposes, it feels 'unlimited'.)

markden commented on Ask HN: Best practice to protect from back end data exfiltration via website?    · Posted by u/markden
solardev · a year ago
That's a good analogy. If you have the first mover advantage and can earn user loyalty through good UX or whatever, it might not really matter that much even if someone does steal your data. Worth a shot?
markden · a year ago
Haha still thinking through that. But potentially!
markden commented on Ask HN: Best practice to protect from back end data exfiltration via website?    · Posted by u/markden
solardev · a year ago
You're welcome, but also keep in mind that it's just my opinion :) Someone else might come along and tell you all the ways I'm wrong.

Also, it's not a black & white situation. If your dataset isn't super valuable, or if it's just niche enough, it's possible that adding Cloudflare by itself would be "good enough" protection. It's a LOT better than nothing, and also much better protection than what most people can DIY on their own.

markden · a year ago
Yeah, and that’s kind of exactly what I am looking for. This is niche enough that I am likely overly concerned someone would do real work to “steal” it. But I also always lock my car, even if someone can still smash the window. :)
markden commented on Ask HN: Best practice to protect from back end data exfiltration via website?    · Posted by u/markden
solardev · a year ago
Web dev here, but not cybersec focused... if I'm wrong, someone will be along to correct me shortly :)

That said, I'm reasonably confident that what you want isn't doable/practical, unfortunately :(

While there are certainly companies that make valuable datasets available over the web, the usual way they prevent mass scraping is by enforcing account limits, making retrieval expensive and also limited to only one tiny slice of data at a time. One example is the mass data harvesting/targeting industry: companies like Meta, Alphabet, or political companies (NGPVan, Actblue, etc.). They cross-reference a lot of PII floating around the internet, and/or harvest their own and then sell that to advertisers or political campaigns, but only a slice at a time, and at prices that they determine. You can of course pay to scrape any one slice of it, but if you wanted the whole dataset, you'd probably end up paying more than the entire company's worth.
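
To make the "account limits" idea concrete, here is a minimal sketch of a per-account daily quota. Everything in it is an assumption for illustration (the `AccountQuota` name, the 500-record default, the in-memory dictionary); a real service would persist the counters and tie accounts to payment or verification.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class AccountQuota:
    """Hypothetical per-account cap on records served per day."""

    daily_limit: int = 500  # assumed value, not from the thread
    _used: Dict[Tuple[str, str], int] = field(default_factory=dict)

    def allow(self, account_id: str, day: str, records: int) -> bool:
        """Return True and record usage if the request fits today's quota."""
        key = (account_id, day)
        used = self._used.get(key, 0)
        if used + records > self.daily_limit:
            return False  # over quota: refuse, or bill for the extra slice
        self._used[key] = used + records
        return True


quota = AccountQuota(daily_limit=100)
first = quota.allow("acct-1", "2024-01-01", 60)   # fits within the cap
second = quota.allow("acct-1", "2024-01-01", 50)  # would exceed the cap
```

A cap like this only limits how much one account can pull; it does nothing against someone who registers many accounts, which is why the limit is usually paired with pricing.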

That, or their data is inherently time-sensitive, such that older copies of it aren't as valuable. Stocks, real estate sites, news tickers, etc. come to mind, where sure, you can scrape their stuff, but unless you perform some sort of value-added collation/analysis on top of it, it's going to be stale by the time you serve it to your own users. The data originators are always one step ahead of you.

If your data isn't proprietary to begin with (i.e. you're not the one making it and adding updates) AND you want it to be publicly accessible without an account... it's only a matter of time before some botnet or another scrapes all of it.

You can do things to slow down the scraping, such as adding Cloudflare, but realistically, bots and labor are very cheap in much of the world, and if someone really wants your data, they'll get it. It's essentially free to them, especially if you've done all the hard work of collecting it and putting it all on a single website.

It will always take more time for you to manually add filter permutations than it takes a script & botnet to enumerate through them. They can just tweak parameters and send them through thousands of headless browsers running in dispersed instances across the world.
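
As a rough illustration of why filter permutations offer little protection, the sketch below counts every combination of a small, made-up set of filters. The filter names and values are hypothetical; the point is only that the full list is a few lines of standard-library code.

```python
from itertools import product

# Hypothetical filters a site might expose; not taken from the thread.
filters = {
    "region": ["north", "south", "east", "west"],
    "category": ["a", "b", "c"],
    "year": [2021, 2022, 2023],
}


def enumerate_queries(filters):
    """Build one parameter dict per combination of filter values."""
    keys = list(filters)
    return [
        dict(zip(keys, combo))
        for combo in product(*(filters[k] for k in keys))
    ]


queries = enumerate_queries(filters)
print(len(queries))  # 4 * 3 * 3 = 36 combinations
```

Adding another filter multiplies the count, but it multiplies the site owner's manual work too, while the script stays the same length.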

You can require account signup and verification before accessing the data, but that's also trivially faked unless you're requiring real payments.

Identifying real users vs bots is anything BUT trivial. Google and Cloudflare and hCaptcha have spent decades trying to solve that with huge teams and world-class researchers. And even they only have limited success rates, especially since anybody can spend pennies to hire real humans to run through your captchas. And that problem is only going to get harder, much harder, with all the advancements in machine learning, natural language processing, and machine vision.

Sorry for the bad news =/ I hope I'm wrong, but I'm fairly confident you can't really accomplish this.

markden · a year ago
While you are right, this isn’t what I was hoping to hear :), I do really appreciate the helpful response. Thank you!
markden commented on Learn Something New Everyday by Email   nowiknow.com/... · Posted by u/aniketpant
duck · 13 years ago
Dan Lewis started this newsletter about the same time I started my Hacker News related one so we have collaborated a good bit the past 2+ years. It has been awesome seeing how fast he has been able to grow it. He does a great job with the writing and finding things that you never have heard of... I can't recommend it enough!
markden · 13 years ago
Also agreed. Enjoy getting these as one of the first emails in the morning. They are always well written, informative, and fun.

u/markden

Karma: 3 · Cake day: August 10, 2012