I'm trying to deal with a very interesting (to me) case. Someone is proxy-mirroring all content of my website under a different domain name.
- Original: https://www.saashub.com
- Abuser/Proxy-mirror: https://sukuns.us.to
My ideas of resolution:
1) Block them by IP - That doesn't work as they are rotating the IP from which the request is coming.
2) Block them by User Agent - They are duplicating the user-agent of the person making the request to sukuns.us.to
3) Add some JavaScript to redirect to the original domain-name - They are stripping all JS.
4) Use absolute URLs everywhere - they are rewriting everything www.saashub.com to their domain name.
i.e. I'm out of ideas. Any suggestions would be highly appreciated.
p.s. what is more, Bing is indexing all of SaaSHub's content under sukuns.us.to ¯\_(ツ)_/¯. I've reported a copyright infringement, but I have a feeling that it could take ages to get resolved.
I wrote a HN post about it as well: https://news.ycombinator.com/item?id=26105890, but to spare you all the irrelevant details and digging in the comments for updates - here is what worked for me - you can block all their IPs, even though they may have A LOT and can change them on each call:
1) I prepared a fake URL that no legitimate user will ever visit (like website_proxying_mine.com/search?search=proxy_mirroring_hacker_tag)
2) I loaded that URL like 30 thousand times
3) from my logs, I extracted all IPs that searched for "proxy_mirroring_hacker_tag" (which, from memory, was something like 4 or 5k unique IPs)
4) I blocked all of them
After doing the above, the offending domains were showing errors for 2-3 days and then they switched to something else and left me alone.
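For anyone who wants to repeat step 3, here is a rough sketch of the log extraction. It assumes an access log in the common/combined format (client IP as the first field); the `honeypot_ips` helper and the sample lines are made up for illustration, only the tag comes from the steps above.

```python
import re
from typing import Iterable, Set

# The marker from the steps above; any string no legitimate visitor requests works.
HONEYPOT_TAG = "proxy_mirroring_hacker_tag"

# Assumes common/combined log format: IP first, then the quoted request line.
_LOG_LINE = re.compile(r'^(\S+) .*?"(?:GET|POST|HEAD) (\S+)')


def honeypot_ips(log_lines: Iterable[str], tag: str = HONEYPOT_TAG) -> Set[str]:
    """Return the unique client IPs that requested a URL containing the tag."""
    ips = set()
    for line in log_lines:
        match = _LOG_LINE.match(line)
        if match and tag in match.group(2):
            ips.add(match.group(1))
    return ips


if __name__ == "__main__":
    sample = [
        '203.0.113.7 - - [12/Feb/2021:10:00:00 +0000] '
        '"GET /search?search=proxy_mirroring_hacker_tag HTTP/1.1" 200 512',
        '198.51.100.9 - - [12/Feb/2021:10:00:01 +0000] "GET /about HTTP/1.1" 200 512',
    ]
    for ip in sorted(honeypot_ips(sample)):
        print(ip)
```

The resulting set can then be fed to whatever firewall or deny list you use for step 4.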
I still go back and check them every few months or so ...
P.S. My advice is to remove their URL from your post here. Leaving it in will only help search engines pick up their domain and rank it with your content ...
You could make a page that shames their domain name for stealing content. You could make a redirect page that redirects people to your website. Or you could make a page with absolutely disgusting content. I think it would discourage them from playing the cat and mouse game with you and fixing it by getting new IPs.
Not if you value the people who might move to the real domain.
In most countries in the western world, there are 3-4 major ISPs and this is where 99% of your legit traffic comes from. Regular people don't browse the web proxying via hosting centres as Cloudflare will treat them with suspicion on all the websites they protect.
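A rough sketch of what that check could look like, assuming you have a list of hosting-centre CIDR ranges from somewhere. The ranges below are reserved documentation networks used as placeholders, not real providers, and `is_datacenter_ip` is a made-up helper name.

```python
import ipaddress
from typing import Iterable

# Placeholder ranges (reserved documentation networks), NOT real hosting providers.
DATACENTER_RANGES = ["198.51.100.0/24", "203.0.113.0/24"]


def is_datacenter_ip(ip: str, ranges: Iterable[str] = DATACENTER_RANGES) -> bool:
    """Return True if the address falls inside any of the listed CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)


if __name__ == "__main__":
    for candidate in ("203.0.113.50", "192.0.2.1"):
        print(candidate, is_datacenter_ip(candidate))
```

Requests that match could get a CAPTCHA or a stricter rate limit rather than a hard block, since some legitimate users do come through VPNs.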
https://www.ovh.com/abuse/
Found the hosting information from here: https://host.io/us.to
Don't block them. Show dicks instead
For starters: copyright-infringing material.
there were a bunch of sites mirroring 8chan to steal content
these were useful because they had both a simpler / lighter / better user interface (aside from images being missing), and posts / threads that were deleted would stay on the mirrors. being able to see deleted posts / threads was highly useful as the moderation on such sites tends to be utterly useless and the output of a random number generator. it was hilarious reading "zigforum" instead of "8chan" in all the posts as the mirror replaced certain words to thinly veil their operation. they even had a reply button that didn't seem to work or was just fake.
tl;dr the web is broken and only is good when "abused" by proxy/mirrors
Those variables are populated by the browser, so unless the proxying server is rewriting them, your web server will be able to detect the imposter and serve him/her a redirect. If rewrites are indeed in place, then check in the frontend. Blocking by IP is the last option if nothing else works.
2. Create fake HTML elements and put unique strings inside. You can then search for those strings in search engines to find similar fake sites on other domains.
3. Create a fake HTML element and put all request details in it in encrypted form. Visit the adversary's website, look for that element, and flag that IP OR flag the headers.
4. Buy proxy databases, and when any user requests your webpage, check if it's a proxy.
5. Instead of banning them, return fake content (fake titles, fake images, etc.) if a proxy is detected OR the IP is flagged.
6. Don't ban the flagged IPs. She/he's going to find another one. Make them angry and their users angry so they give up on you.
7. Maybe write some bad words to the user in random places in the HTML when you detect flagged IPs :D So the users will leave the site and this will reduce the adversary's SEO score. They will be downranked.
8. Enable image hotlinking protection. Increase the cost of proxying for them.
9. Use @document CSS to hide the stuff when the URL is different.
10. Send abuse mail request to the hosting site.
11. Send abuse mail request to the domain provider.
12. Look for the flagged IPs and try to find the proxy provider. If you find, send mail to them too.
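A minimal sketch of idea 3, with one swap: the request details are HMAC-signed and base64-encoded rather than encrypted, because the Python standard library has no symmetric cipher. That means the proxy operator could read the payload but not forge it. The key, function names, and the `data-ref` attribute are all made up for illustration.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical key, keep it server-side


def make_token(client_ip: str, user_agent: str, timestamp: int) -> str:
    """Pack the request details into a signed, URL-safe token."""
    payload = json.dumps(
        {"ip": client_ip, "ua": user_agent, "ts": timestamp}, sort_keys=True
    ).encode()
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def beacon_html(token: str) -> str:
    """Wrap the token in an invisible element to embed in every served page."""
    return f'<span hidden data-ref="{token}"></span>'


def read_token(token: str) -> Optional[dict]:
    """Return the request details if the signature checks out, else None."""
    try:
        encoded, signature = token.rsplit(".", 1)
        payload = base64.urlsafe_b64decode(encoded.encode())
    except ValueError:  # missing separator or invalid base64
        return None
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(payload)
```

Load a page through the mirror, pull the token out of the hidden element, and `read_token` gives you the IP and headers their proxy used for that request.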
Edit: More ideas sparked in my mind when I was in the toilet:
1. Create fake big css files (10MB etc). And repeatedly download that from the adversary's website. This should cost them too much money on proxies.
2. When you detect proxy, return too big fake HTML files (10GB) etc. That could crash their server if they load the HTML into the memory when parsing.
Reminds me of a time some real estate website hotlinked a ton of images from my website. After I asked them to stop and they ignored me I added an nginx rewrite rule to send them a bunch of pictures of houses that were on fire.
For some reason they stopped using my website as their image host after that.
I'm curious if they are stealing anything else, e.g. are they selling ads/tracking, do they replace order forms with their own...
Additionally, if they decide to blackhole the fake/honeypot URL, since you mentioned they pass along the user agent, you could mix in some token in a randomized user agent string that your scraper uses, so that you could duck-type the request on your end to signal when to capture the egress IP.
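Something like this, as a sketch. The secret, the base user agents, and the `Build/...` suffix format are all invented here; the point is only that the tag is verifiable on your end without a lookup table, and looks like noise to them.

```python
import hashlib
import hmac
import random
import re
import secrets

SECRET = b"replace-with-a-real-secret"  # hypothetical key shared by scraper and server

# Hypothetical base strings; use whatever looks plausible for your traffic.
BASE_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
]

_TAG = re.compile(r"Build/([0-9a-f]+)-([0-9a-f]{12})$")


def _sign(nonce: str) -> str:
    return hmac.new(SECRET, nonce.encode(), hashlib.sha256).hexdigest()[:12]


def tagged_user_agent(nonce: str = "") -> str:
    """Build a randomized user agent that carries a verifiable token."""
    nonce = nonce or secrets.token_hex(4)
    return f"{random.choice(BASE_AGENTS)} Build/{nonce}-{_sign(nonce)}"


def is_tagged(user_agent: str) -> bool:
    """True if the user agent ends with a token that we generated."""
    match = _TAG.search(user_agent)
    if not match:
        return False
    nonce, tag = match.groups()
    return hmac.compare_digest(tag, _sign(nonce))
```

On the server, any request where `is_tagged` returns True came from your own scraper via their proxy, so the source IP of that request is one of their egress IPs.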
[0]: https://caniuse.com/mdn-css_at-rules_document
SRI is for the situation where a CDN has been poisoned, not this.
For example, I had an app developer start stealing API content, so once I had worked out which data points to key on to identify them, instead of blocking them I simply randomized the API content details returned to their users' apps.
Hey, API calls look good, the app looks like it is working, no problem right? Well, the users of the app were pissed and the negative reviews rolled in. It was glorious.
As a side note, their domain is linked in this thread so they are seeing HN in their access logs and probably reading this. It should make for an interesting arms race. Or red/blue team event.
Then, write a little script that repeatedly hits that honeypot URL. I quite like this idea.
> 6. Don't ban the flagged IPs. She/he's going to find another one. Make them angry and their users angry so they give up on you.
There's a popular blog that no longer gets linked on HN.
The author didn't like the discussions HN had around his writing, so any visitors with HN as the referer are shown goatse, a notorious upsetting image, instead of the blog content.
> Create fake big css files (10MB etc). And repeatedly download that from the adversary's website. This should cost them too much money on proxies.
Be careful when doing things like this, including the shock image option mentioned in other comments, as then it could become an arsehole race with them trying to DoS your site in retribution. Then again, going through more official channels could also get the same reaction, so…
> When you detect proxy, return too big fake HTML files (10GB) etc. That could crash their server if they load the HTML into the memory when parsing.
Make sure you are set up to always compress outgoing content, so that you can send GBs of mostly single-token content with MBs of bandwidth.
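To get a feel for the ratio, here is a small sketch that builds highly repetitive filler and gzips it. The `compressed_filler` name and the filler token are made up, and it builds the whole body in memory, so it is only sensible for MB-sized tests; for GB-sized responses you would want to stream the compression instead.

```python
import gzip


def compressed_filler(size_bytes: int, token: bytes = b"<p>lorem</p>") -> bytes:
    """Return gzip-compressed filler that expands to exactly size_bytes."""
    repeats = size_bytes // len(token) + 1
    body = (token * repeats)[:size_bytes]
    return gzip.compress(body)


if __name__ == "__main__":
    size = 10 * 1024 * 1024  # 10 MiB of filler
    data = compressed_filler(size)
    print(f"uncompressed: {size} bytes, compressed: {len(data)} bytes")
```

The compressed bytes would be served with `Content-Encoding: gzip`, so your bandwidth cost is the small number while whoever decompresses it pays the large one.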
Doesn't that also cost you an equal amount? You'll be serving them an equal amount that they proxy to the end user.
It's not even necessarily a cost for them; you're assuming that the host is owned and paid for by the abuser. If it's simply been hijacked (quite possible), you're just racking up costs for another victim.
Not sure how you actually do it and if it serves your purpose but sounded neat.
[1] https://www.youtube.com/watch?v=jnDk8BcqoR0
Nope, since anybody doing this who has at least minimal intelligence is using residential botnets as proxies.
You can also write some obfuscated inline JavaScript that checks the current hostname and compares to the expected one and redirects when not aligned.
https://webmasters.stackexchange.com/questions/56326/canonic...
I noticed that the other domain is hotlinking your images. So you can disable image hotlinking, by only allowing certain domains as the referers. If you block hotlinked images then the other domain will not look as good. Remember to do it for SVGs too.
https://ubiq.co/tech-blog/prevent-image-hotlinking-nginx/
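If you would rather do the referer check in application code than in nginx, the logic is roughly the following. This is a sketch: the allowed host list (including the bare `saashub.com`) is an assumption, and letting empty referers through is a design choice, since direct visits and privacy tools often omit the header.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed list of hosts allowed to embed the images.
ALLOWED_REFERER_HOSTS = {"www.saashub.com", "saashub.com"}


def allow_image_request(referer: Optional[str]) -> bool:
    """Decide whether to serve an image based on the Referer header."""
    if not referer:
        # No Referer at all: allow, to avoid breaking direct and privacy-minded visits.
        return True
    host = urlparse(referer).hostname
    return host is not None and host.lower() in ALLOWED_REFERER_HOSTS


if __name__ == "__main__":
    for ref in ("https://www.saashub.com/page", "https://sukuns.us.to/page", None):
        print(ref, allow_image_request(ref))
```

Apply the same check to every image type you serve, SVGs included.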
Finally I also see they are using a CDN called Statically to host some assets off your domain. You can block their scrapers by user agent listed here:
http://statically.io/docs/whitelisting-statically/
If the TLS ciphers the client proposes for negotiation don’t align with the client’s User-Agent they get a CAPTCHA.
I would suspect that whoever is doing this proxy-mirroring isn’t smart enough to ensure the TLS ciphers align with the User-Agent they’re passing through.
By the way, I've also reported the abuser as a phishing/fraud website through https://safebrowsing.google.com/safebrowsing/report_phish/?u...
> 4) Use absolute URLs everywhere - they are rewriting everything www.saashub.com to their domain name.
Instead, plot a few different changes and throw them in all at once. Preferably in a way where they will have to solve all of the changes at the same time to figure out what happened and get things working again. Also, favor changes that are harder to detect. E.g., pure IP blocks are easier to detect than tarpitting and returning fake/corrupted content. The longer their feedback loops, the more likely it is that they'll just give up and go be a parasite somewhere else.
I recently had to employ such a strategy against some extremely aggressive card testers (criminals with lists of stolen credit cards who automate stuffing card info into a donation form to test which cards are still working). Instead of blocking their IPs, I started feeding them randomly generated false responses with a statistically accurate "success" rate. They ran tens of thousands of card tests over many days, and 99% of the data they collected was bogus. It amuses me to know that I polluted their data and wasted so much of their time and effort. Jerks.
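The core of that trick is tiny. Here is a sketch, where the 10% success rate and the response shape are placeholders I picked for the example, not the numbers from my actual setup:

```python
import random

# Placeholder rate; in practice match whatever rate real traffic shows.
FAKE_SUCCESS_RATE = 0.10


def fake_card_result(rng: random.Random, success_rate: float = FAKE_SUCCESS_RATE) -> dict:
    """Return a made-up result that succeeds at roughly the given rate."""
    approved = rng.random() < success_rate
    return {"status": "approved" if approved else "declined"}


if __name__ == "__main__":
    rng = random.Random()
    results = [fake_card_result(rng)["status"] for _ in range(1000)]
    print("approved:", results.count("approved"), "of", len(results))
```

Flagged requests get `fake_card_result` and never touch the real payment processor; everyone else goes through the normal path.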
FIND THE IP FOR THE DOMAIN
REVERSE DNS TO FIND HOST
Apparently it's "Dedipath".
And that WHOIS lookup gives an abuse email address:
So you could try emailing that address. They may take the site down, or hopefully more than that...
It's not the same thing, but I'm reminded now of email in the past, when you would usually get an undeliverable message if something went wrong. But later that was almost entirely stopped - because of spam. Massive volumes of spam were sent from forged addresses, and much of it led to those replies. So that made things worse by doubling the volume, plus the innocents whose addresses had been forged got deluges of confusing undeliverable messages!
I think you're right in that changing IPs would be easy for them. But, changing hosts would be significantly more work and hassle. So if the abuse reporting worked, that could have much more of an impact...
[1] https://github.com/brianhama/bad-asn-list