Aside: At the end of the article they have a "Bonus: Speeding things up" section where they automate adding ~300 coupons via 300 HTTP connections in 5 seconds (instead of ~60 seconds).
In my opinion if you're going to automate stuff like this, you should do so with the goal of minimizing disruption (and frankly, detection). They run the script automatically at midnight in the background via cron, so why was going slower problematic? 300 requests in a span of 5 seconds seems much more likely to trigger an IDS[0], get flagged as unusual in the logs, or similar than 300 over 60 seconds. Particularly at midnight.
I'd be trying to look as human as possible and not set off automated security systems. Heck, you could add a randomized delay (e.g. 1~2 seconds) between requests and it would still be completed inside of 10 minutes. Plus then nobody can reasonably accuse you of trying to "DoS" them/violate the CFAA.
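A minimal sketch of that pacing idea, assuming each request is wrapped in a zero-argument callable. The name `run_paced` and its parameters are made up for illustration and are not from the article; the `sleep` argument is injectable only so the timing can be checked without actually waiting.

```python
import random
import time
from typing import Callable, List, Sequence


def run_paced(
    tasks: Sequence[Callable[[], object]],
    min_delay: float = 1.0,
    max_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[object]:
    """Run tasks one at a time with a random pause between consecutive tasks."""
    results = []
    for index, task in enumerate(tasks):
        if index > 0:
            # Jittered gap so the requests do not arrive at a fixed rhythm.
            sleep(random.uniform(min_delay, max_delay))
        results.append(task())
    return results


# Dry run: stand-in tasks and a fake sleep that only records the delays.
recorded_delays: List[float] = []
outcomes = run_paced(
    [lambda i=i: i for i in range(300)],  # hypothetical stand-ins for 300 requests
    sleep=recorded_delays.append,
)
total_wait_seconds = sum(recorded_delays)
```

With 300 tasks there are 299 gaps of at most 2 seconds each, so the total wait stays under 598 seconds, which matches the "inside of 10 minutes" estimate.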
Yeah, from my experience they'll run into problems real fast making 300 requests in 5 seconds. It's so "noisy" and potentially disruptive.
I'm honestly surprised they weren't rate limited. I mean, your SPA would have to be really messed up to make more than a handful of requests a second (even then why aren't you using sockets?) - So it's super reasonable to say that if anyone is making over 100req/s then maybe they should take an hour timeout.
If you don't want to get caught doing this then you'll want a randomized delay like you said, and preferably a pool of IP addresses you proxy through. This is to prevent the automated stuff from catching you though - a human can still look at your account and wonder why you have every coupon activated 24/7.
I agree - I actually built the thread pool stuff for work while I was testing, since I had the code from a previous project.
If I was more worried about Safeway catching on I'd probably do something as you suggested (at the very least I'd add user agent headers and the other cookies expected from a real session, as right now it's trivial to detect my requests).
Sneaker bots do this exceedingly well - it's a constant cat and mouse game to make the requests look as human as possible, very interesting space right now.
While I agree with spanning it over six+ hours, you might be surprised at the activity level in shopping centers at midnight. Once upon a time, I worked night shift at Walmart stocking shelves. On my days off, I'd still have to be awake all night to maintain a sleep schedule. To retain my sanity, I'd shop at other grocery stores so it didn't feel like I lived at Walmart, and I was pretty surprised by how many shoppers were out and about at 1 and 2 am at stores like Ridley's or Smith's.
I did something similar with Safeway's API back in the day. But rather than chew up their API with unsupported usage--CFAA case territory--I now just log into Safeway's website and issue a one-liner on the coupon page:
I did that for a while too, but realized the mental overhead wasn't really worth it - occasionally I'd get the super generic coupons of "Spend $20, get $2 back" and those were worth it, but most of the time I was adding coupons I'd never use.
The mental effort of going to the site at all was what I was trying to circumvent - now I don't even need to think about it (the clearer abstraction is that going from "2 to 1" is a much less drastic jump than going from "1 to 0" - completely removing the overhead is what makes this worthwhile, IMO, not the actual speed of the actions)
I used to use a similar bookmarklet that I would run on the Safeway Coupons page. But it stopped working after they moved away from Angular and I was too lazy to update it.
This is what I was thinking the whole time reading and looking at the site images... why script all that, when you could simply click the buttons in the browser...
Rube Goldberg machines are not good in software projects...
This reminds me of a very similar case a decade ago I never figured out.
Anyone remember Coupon Guy from 4chan in the 2009/2010 timeframe? You could make your own "Buy 1 Get 1 Free" coupon or a variant (either in N, like "buy 1 get 10", or in kind, like "buy 1 get half off") for TVs, Xbox 360s, nearly anything, at Walmart, Best Buy, and other chains. The tools were passed around steganographically in the instruction images. Since coupons weren't properly accounted for until they hit a place in Texas, there were whole threads full of Anons fabricating working coupons over the course of weeks.
Until they stopped working, and of course rumors of "the FBI" apparently grabbing the guy.
I never did figure out what happened tech wise under the hood there.
If I had to guess it would be something such as concatenating a bar code.
In the UK, we have items where, when they are reduced, all the store does is add the new price to the end of the barcode.
I.e. if some product has a barcode of 3035555074225 and it's then reduced in price, the reduced price plus a checksum is appended.
So, if the new price of 3035555074225 is 50p, the code becomes 303555507422500502 (where we add 0050 for the price, and 2 is the checksum).
Next time you are in the supermarket just look at the barcode and you'll spot the pattern.
So for your example, maybe the guy had inside knowledge of how some EPOS system handled barcodes on coupons and could easily pair them.
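To make the layout in that example concrete, here is a rough sketch that only splits and joins the three fields exactly as described: a 13-digit product code, a 4-digit price in pence, and one trailing check digit. The names are invented for illustration, and the checksum algorithm is not given in the thread, so the code treats the check digit as opaque and does not try to compute or verify it.

```python
from typing import NamedTuple


class ReducedBarcode(NamedTuple):
    base_code: str    # original 13-digit product barcode
    price_pence: int  # reduced price, stored as 4 digits
    check_digit: str  # trailing digit; algorithm unknown, kept as-is


def parse_reduced_barcode(code: str) -> ReducedBarcode:
    """Split an 18-digit reduced-price code into its three fields."""
    if len(code) != 18 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected exactly 18 ASCII digits")
    return ReducedBarcode(code[:13], int(code[13:17]), code[17])


def build_reduced_barcode(base_code: str, price_pence: int, check_digit: str) -> str:
    """Concatenate the fields; the caller must supply the check digit."""
    return f"{base_code}{price_pence:04d}{check_digit}"


parsed = parse_reduced_barcode("303555507422500502")
rebuilt = build_reduced_barcode(*parsed)
```

Run against the code from the example, this gives a base of 3035555074225, a price of 50 pence, and a check digit of 2.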
It's easy to make a file that is both a valid image and a valid zip file, because zip readers start from the end-of-central-directory record at EOF and most other things start from a header at the start of the file. Many such images included a WinRAR icon.
There might have also been some steganographic images, but I know for sure I saw several of the image+archive kind around then.
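A small sketch of the image+archive trick using only the Python standard library. The helper names are hypothetical, and the "image" is a hand-built 1x1 grayscale PNG so the example has no external files. This only demonstrates the file-format property described above, not whatever tooling was actually passed around.

```python
import io
import struct
import zipfile
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(kind: bytes, data: bytes) -> bytes:
    """Length + type + data + CRC, as the PNG chunk layout requires."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png() -> bytes:
    """Build a 1x1, 8-bit grayscale PNG in memory."""
    header = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixels = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", header)
        + png_chunk(b"IDAT", pixels)
        + png_chunk(b"IEND", b"")
    )


def make_polyglot(image: bytes, files: dict) -> bytes:
    """Append a zip archive to the image bytes."""
    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w") as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return image + archive.getvalue()


blob = make_polyglot(tiny_png(), {"readme.txt": "hello"})

# Image viewers read from the front and see a PNG signature; the zipfile
# module locates the end-of-central-directory record from the back.
with zipfile.ZipFile(io.BytesIO(blob)) as zf:
    names = zf.namelist()
    text = zf.read("readme.txt")
```

The same file can be renamed between an image extension and .zip, and each kind of reader only looks at the end it cares about.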
This is the kind of stuff where I tell people that coding can be a real practical skill (that gives you an unfair advantage these days).
I did something similar (read-only) for Home Depot Truck Rentals. To check if the truck was available at my local store, each time, you had to put in your zip code, and click a couple of times. Once I found that was an API, I rebuilt the call in Postman and kept hitting that endpoint until a truck was available.
That way I could check really fast.
The twist: None of it mattered because their data itself didn't update accurately (I saw one in the parking lot and they had one available and never updated their site). :)
I did the same thing to find available camping spots in Hawaii because they are super difficult to get. Wrote a script that would query their "API" every 5 minutes and alert me if a spot became available anywhere.
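For what it's worth, the polling loop itself is only a few lines. A sketch, assuming a hypothetical `check` callable that returns the currently available spots and a hypothetical `notify` callable that sends the alert; neither corresponds to any real reservation API, and `sleep` is injectable so the loop can be exercised without waiting.

```python
import time
from typing import Callable, List, Optional, Sequence


def poll_until_available(
    check: Callable[[], Sequence[str]],
    notify: Callable[[List[str]], None],
    interval_seconds: float = 300.0,  # every 5 minutes, as in the comment
    max_checks: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call check() at a fixed interval until it reports something."""
    checks_done = 0
    while max_checks is None or checks_done < max_checks:
        spots = list(check())
        checks_done += 1
        if spots:
            notify(spots)
            return spots
        # Only wait if another check is still going to happen.
        if max_checks is None or checks_done < max_checks:
            sleep(interval_seconds)
    return []


# Dry run with canned responses instead of a real endpoint.
responses = iter([[], [], ["Site 12"]])
alerts: List[List[str]] = []
waits: List[float] = []
found = poll_until_available(
    check=lambda: next(responses),
    notify=alerts.append,
    max_checks=5,
    sleep=waits.append,
)
```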
Have you considered whether doing this might be...wrong?
Presumably those campsites are permitted by some government agency (NPS, BLM, the state of Hawaii, etc.), and presumably that agency designed the permitting system with the assumption that people with limited time and attention would be vying for the permits by having to visit the site themselves to get one.
This encodes a particular definition of fairness: that those who register early, or are very motivated, or simply those with a lot of free time to refresh the site, will get permits.
I can also whip up a quick script to replace refreshing an unprotected HTTP API with a notification email. Does that make me more deserving of the camping spot?
When working in a kitchen in California, I had to take the ServSafe Food Handler course online, which consisted of long, patronizing, unskippable videos explaining basic food safety concepts (which I already knew from previous experience) and little "check your knowledge" multiple-choice questions peppered throughout. The only thing that actually affected whether you got the certification was a quiz at the end.
I just poked at the main JS file for a couple of minutes until I found the statement I needed:
mainController.loadNextScreen()
This turned what would have been a 3+-hour slog into 5 minutes, and I passed the quiz just fine.
Never underestimate the error introduced by a lack of incentive for humans to comply with the API (or the API itself being human-hostile).
My favorite example is the Domino's "Your pizza is ready" signal. Since the data feeding the signal also feeds the store's performance analysis (i.e. they track how fast employees are getting pizzas ready), there's significant incentive for employees to lie to the algorithm and hit "It's ready" before it's physically ready, on the assumption that customers will take nonzero time to wander over and show up for pickup.
There are companies[0] that provide realtime (I think) satellite imagery APIs. One use case I saw was stock traders tracking parking lots of large retailers to gain early insights into sales metrics.
Another tip: (area code) 867-5309 works for Safeway club card discounts in most area codes while piping tracking of your purchases to essentially null.
At risk of y'all taking "my" Safeway gas rewards, this is almost always good for whatever the maximum discount off per gallon is at any Safeway that also has a gas station.
It's amazing to me the degree to which people will go to save 3% or less on gas once a month. If you fill up 4 times a month you're saving about $4, and all it costs is a profile of everything you buy attached to your phone number.
Similar trick works at AMC movie theaters with their rewards program. You can use it to get discount tickets on Tuesdays and popcorn discounts.
Similarly for all these other rewards programs you see at restaurants nowadays. I will never understand the idea of using a phone number as authentication without any additional PIN or text message or anything. If I have an acquaintance that I know goes to a lot of movies, and I either know or can find their phone number, I can drain their rewards account. Or you can drain their Safeway rewards account, etc. I wonder how much longer the situation will last?
Works at my local Safeway, as of yesterday. There was a period of time over a year ago during which Jenny's number didn't work at a different Safeway but it's been reliable at this one, so far.
Is there a private API for a particular store's prices?
It seems like coupled with couponing, you can build a decent price tracker that can tell you if you're actually getting a good deal. (like https://camelcamelcamel.com/ for amazon, https://steamdb.info/sales/ for steam games)
Otherwise I've noticed that many (though not all) coupons are for items which recently had their base price increased to make the coupon seem like a better deal than it actually is.
When I see access to private APIs like this, I wonder how judges would interpret these actions as they relate to the CFAA. By accessing "private" APIs like this, are you knowingly accessing a computer without authorization? Are you exceeding your authorized access?
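The check such a tracker would need is simple once you have a price history. A sketch, assuming prices are recorded in integer cents with the most recent observation last; the function name and the choice of the median of earlier prices as the baseline are illustrative assumptions, not something from the thread.

```python
from statistics import median
from typing import Sequence


def coupon_beats_baseline(price_history_cents: Sequence[int], coupon_cents: int) -> bool:
    """True if today's price minus the coupon is below the usual price.

    The last entry is the current shelf price; earlier entries form the
    baseline, taken here as their median.
    """
    if len(price_history_cents) < 2:
        raise ValueError("need at least one earlier price plus the current one")
    *earlier, current = price_history_cents
    baseline = median(earlier)
    return (current - coupon_cents) < baseline


# Base price was raised from $3.99 to $4.99 just before a $1.00 coupon appeared.
inflated = coupon_beats_baseline([399, 399, 399, 499], 100)
# Price stayed at $3.99 and the same coupon appeared.
genuine = coupon_beats_baseline([399, 399, 399, 399], 100)
```

In the first case the coupon only brings the price back to where it was, so it is not flagged as a deal; in the second it is.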
The fact that Aaron's Law never went through has disturbed me...
In the EU you are allowed to automate everything you can legally do manually, even if they have a large sign on their page saying that automation is not allowed.
The only thing that is limiting you from selling is copyright, but by-products from ordinary business is always legal.
The US Gov't has tried to apply CFAA to someone they claimed was breaking a website's TOS[0]. The EFF filed an amicus brief in support of the defendant about this[1]. Who knows what the TOS is for grabbing coupons?
Sure, the judge who heard this case said this would be an "overly broad" interpretation of the law at the time, but the question has come up in subsequent criminal cases as well. I'd feel better if that was actually codified and not left up for interpretation by other judges or courts.
There's an ongoing case, LinkedIn v. HiQ, where LinkedIn said HiQ was scraping publicly available LinkedIn profiles even though there was a robots.txt that told them not to. HiQ kept doing it until LinkedIn threatened them under the CFAA. HiQ just won a preliminary injunction allowing it to continue, where the court said it was unlikely that they were violating the CFAA but that it might change its mind as the case progresses:
The Safeway API in question requires a user to be logged-in, authorized, and have explicitly agreed to a TOS (clickwrap). The HiQ v. LinkedIn case was centered around publicly accessible content which did not require the user to be authenticated nor have explicitly agreed to a TOS (browsewrap is not enforceable).
On a slightly unrelated note, I was trying to figure out why this guy is using a .ca domain when I realized his name is Jon Luca. At first I thought his name was Jon Lu (and therefore Asian). Good use of a domain name.
I'm in Canada and shop at Sobeys, which owns the Safeway brand up here, thought there might be some relevance to me thanks to the .ca domain name. Unfortunately, there was no relevance but it was still an interesting read.
On a side note, Safeway and Sobeys in Canada don't have a loyalty program; instead they piggyback off of Air Miles. All of the special offers available just amount to bonus Air Miles, so they're not actually that worthwhile (IMO).
What a blast from the past. I am actually Canadian but have been living in the Bay Area for over 10 years so I've already forgotten that you can collect Air Miles through Safeway (which was why I clicked on this article in the first place thinking of the .ca)
I don't think Air Miles are completely useless. I do remember redeeming air miles at least once in the past for a one-way trip somewhere...
Not really a good use since the .ca TLD is supposed to be reserved for Canadian entities (individuals and businesses) and the site would seem to break policy.
If it's indeed the case that he has no residency, he's one report to CIRA away from having the domain cancelled.
I saw someone use this for their email address a few years back. Say their name was Jane Smith (anonymised for obvious reasons), their email address was j@nesmi.th (actually using a .ch domain).
[0] https://en.wikipedia.org/wiki/Intrusion_detection_system
These projects are fun IMO, but it's best to not hammer anyone's servers if it isn't necessary.
Q: If you're worried about detection, why would one blog about it/post it on HN?
Isn't the fastest way to shut down a loophole to make it public?
$(".grid-coupon-clip-button button").click();
https://gist.github.com/pbojinov/d572b5494a4f26390aeb5136d70...
https://www.safeway.com/justforu/coupons-deals.html
for(let i=0;i<30;++i)setTimeout(function(){const btn=document.querySelector("#coupon-grid_0 > div.coupon-grid-container > div.load-more-container > button");btn&&btn.click()},1e3*i);setTimeout(function(){const elems=document.querySelectorAll(".grid-coupon-clip-button button");for(let i=0;i<elems.length;++i)setTimeout(function(){elems[i].click()},500*i)},30e3);
> what does this do exactly?
The command clicks every matching button on the page at once:
• The dollar sign is an alias for jQuery [1];
• The text between double quotes is a CSS selector;
• The selector matches every “button” element nested inside an element with the “grid-coupon-clip-button” CSS class;
• The thing at the end is a jQuery method call which triggers a “click” event on each matched element [2];
[1] https://jquery.com/
[2] https://api.jquery.com/click/
They did get him: http://www.thesmokinggun.com/documents/internet/fbi-busts-4c...
This story about a similar arrest has a good explanation about how to fake coupons: https://www.wired.com/2015/05/inside-a-million-dollar-dark-w...
Whoa, really? I remember the coupons flying around 4chan of course. Had no idea the tools were in the images.
[0]: http://www.digitalglobe.com/products/satellite-imagery
Yes, Jenny’s number.
I thought that number was drilled into everyone's brain, but I guess you have to be a certain age.
According to my last receipt, my YTD "savings" has been $959.68
So I am not the only Jenny out there!
At the self checkout machines at Giant there is a "Forgot my card" option. It gives you all the discounts without entering anything.
https://en.wikipedia.org/wiki/867-5309/Jenny
Too bad? I don’t have a car
0 - https://en.wikipedia.org/wiki/United_States_v._Drew#Indictme...
1 - https://www.eff.org/cases/united-states-v-drew
https://www.eff.org/deeplinks/2019/09/victory-ruling-hiq-v-l...
I'd argue the above case does not apply here.