edent · 7 months ago
About 60k academic citations about to die - https://scholar.google.com/scholar?start=90&q=%22https://goo...

Countless books with irrevocably broken references - https://www.google.com/search?q=%22://goo.gl%22&sca_upv=1&sc...

And for what? The cost of keeping a few TB online and a little bit of CPU power?

An absolute act of cultural vandalism.

toomuchtodo · 7 months ago
https://wiki.archiveteam.org/index.php/Goo.gl

https://tracker.archiveteam.org/goo-gl/ (1.66B work items remaining as of this comment)

How to run an ArchiveTeam warrior: https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior

(edit: I see jaydenmilne commented about this further down the thread, mea culpa)

progbits · 7 months ago
They appear to be doing ~37k items per minute; with 1.6B remaining, that is roughly 30 days left. So that's just barely enough to do it in time.
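Back-of-the-envelope check, just plugging in the two numbers above and assuming the rate holds:

    # Rough estimate of the time left, assuming a constant processing rate.
    items_remaining = 1.66e9        # from the tracker
    items_per_minute = 37_000       # observed rate
    days_left = items_remaining / items_per_minute / 60 / 24
    print(f"{days_left:.1f} days")  # ~31 days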

Going to run the warrior over the weekend to help out a bit.

pentagrama · 7 months ago
Thank you for that information!

I wanted to help and did that using VMware.

For curious people, here is what the UI looks like: you get a list of projects to choose from (I chose the goo.gl project) and a "Current project" tab which shows the project's activity.

Project list: https://imgur.com/a/peTVzyw

Current project: https://imgur.com/a/QVuWWIj

xingped · 7 months ago
For those in the know, is this heavy on disk usage? Should I install this on my hard drive or my SSD? Just want to avoid tons of disk writes on an SSD if it's unnecessary.
jlarocco · 7 months ago
IMO it's less Google's fault and more a crappy tech education problem.

It wasn't a good idea to use shortened links in a citation in the first place, and somebody should have explained that to the authors. They didn't publish a book or write an academic paper in a vacuum - somebody around them should have known better and said something.

And really it's not much different than anything else online - it can disappear on a whim. How many of those shortened links even go to valid pages any more?

And no company is going to maintain a "free" service forever. It's easy to say, "It's only ...", but you're not the one doing the work or paying for it.

justin66 · 7 months ago
> It wasn't a good idea to use shortened links in a citation in the first place, and somebody should have explained that to the authors. They didn't publish a book or write an academic paper in a vacuum - somebody around them should have known better and said something.

It's a great idea, and today in 2025, papers are pretty much the only place where using these shortened URLs makes a lot of sense. In almost any other context you could just use a QR code or something, but that wouldn't fit an academic paper.

Their specific choice of shortened URL provider was obviously unfortunate. The real failure is that of DOI to provide an alternative to goo.gl or tinyurl or whatever that is easy to reach for. It's a big failure, since preserving references to things like academic papers is part of their stated purpose.

gmerc · 7 months ago
Ahh classic free market cop out.
HaZeust · 6 months ago
>"It wasn't a good idea to use shortened links in a citation in the first place, and somebody should have explained that to the authors"

???

DOI and ORCID sponsored link-shortening with Goo.gl. Authors did what they were told would be optimal, and ORCID was probably told by Google that it'd maintain its link-shortening service for long-term reliability. What a crazy victim-blame.

epolanski · 7 months ago
Jm2c, but if your reference is a link to an online resource, that's borderline already (at any point the content can be changed or can disappear).

Even worse if your reference is a shortened link from some other service: you've just added yet another layer of unreliable indirection.

whatevaa · 7 months ago
Citations are citations, if it's a link, you link to it. But using shorteners for that is silly.
zffr · 7 months ago
For people wanting to include URL references in things like books, what’s the right approach to take today?

I’m genuinely asking. It seems like it's hard to trust that any service will remain running for decades.

toomuchtodo · 7 months ago
https://perma.cc/

It is built for the task, and assuming the worst-case scenario of a sunset, it would be ingested into the Wayback Machine. Note that both the Internet Archive and Cloudflare are supporting partners (bottom of page).

(https://doi.org/ is also an option, but not as accessible to a casual user; the DOI Foundation pointed me to https://www.crossref.org/ for ad hoc DOI registration, although I have not had time to research further)

edent · 7 months ago
The full URL to the original page.

You aren't responsible if things go offline. No more than if a publisher stops reprinting books and the library copies all get eaten by rats.

A reader can assess the URL for trustworthiness (is it scam.biz or legitimate_news.com), look at the path to hazard a guess at the metadata and contents, and, finally, look it up in an archive.

danelski · 7 months ago
Real URL and save the website in the Internet Archive as it was on the date of access?
AbstractH24 · 7 months ago
What's the right approach to take for referencing anything that isn't preserved in an institution like the Library of Congress?

Say the interview of a person, a niche publication, a local pamphlet?

Maybe to certify that your article is of a certain level of credibility you need to manually preserve all the cited works yourself in an approved way.

kazinator · 7 months ago
The act of vandalism occurs when someone creates a shortened URL, not when the links stop working.
djfivyvusn · 7 months ago
The vandalism was relying on Google.
toomuchtodo · 7 months ago
You'd think people would learn. Ah, well. Hopefully we can do better from lessons learned.
api · 7 months ago
The web is a crap architecture for permanent references anyway. A link points to a server, not e.g. a content hash.

The simplicity of the web is one of its virtues but also leaves a lot on the table.
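To make the distinction concrete, here's a minimal sketch of what a content-addressed reference looks like next to a location-based one (the URL is just a placeholder):

    import hashlib
    import urllib.request

    # Location-based reference: whatever the server at this address returns today.
    url = "https://example.com/"
    with urllib.request.urlopen(url) as resp:
        body = resp.read()

    # Content-based reference: names the bytes themselves, so it can be
    # re-verified against any copy (a mirror, an archive, a local file).
    print("sha256:" + hashlib.sha256(body).hexdigest())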

jeffbee · 7 months ago
While an interesting attempt at an impact statement, 90% of the results on the first two pages for me are not references to goo.gl shorteners, but are instead OCR errors or just gibberish. One of the papers is from 1981.
justinmayer · 7 months ago
In the first segment of the very first episode of the Abstractions podcast, we talked about Google killing its goo.gl URL obfuscation service and why it is such a craven abdication of responsibility. Have a listen, if you’re curious:

Overcast link to relevant chapter: https://overcast.fm/+BOOFexNLJ8/02:33

Original episode link: https://shows.arrowloop.com/@abstractions/episodes/001-the-r...

SirMaster · 7 months ago
Can't someone just go through programmatically right now and build a list of all these links and where they point to? And then put the list up somewhere everyone can look things up if they need to?
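Resolving a known short link is just reading the redirect it returns; a rough sketch using only the standard library (and it only works while the service is still answering):

    import http.client

    def resolve_googl(key):
        """Return (status, Location) for a goo.gl short key, without following it."""
        conn = http.client.HTTPSConnection("goo.gl", timeout=10)
        conn.request("HEAD", "/" + key)
        resp = conn.getresponse()
        location = resp.getheader("Location")
        conn.close()
        return resp.status, location

    # Hypothetical key, just for illustration.
    print(resolve_googl("abc123"))

The hard part isn't the lookup; it's that nobody outside Google has the list of keys, which is why ArchiveTeam is brute forcing the key space (see elsewhere in the thread).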
QuantumGood · 7 months ago
When they began offering this, their reputation for ending services was already so bad that I refused to consider goo.gl. It's amazing for how many years now they have introduced and then ended services with large user bases. Gmail being in "beta" for five years was, weirdly, a sign to me that they might stick with it.
crossroadsguy · 7 months ago
I have always struggled with this. If I buy a book I don’t want an online/URL reference in it. Cite the book/author/ISBN/page etc. Or refer to the magazine/newspaper/journal/issue/page/author/etc.
BobaFloutist · 7 months ago
I mean preferably do both, right? The URL is better for however long it works.
nikanj · 7 months ago
The cost of dealing with and supporting an old codebase, instead of burning it all down and releasing a written-from-scratch replacement next year.
eviks · 7 months ago
> And for what? The cost of keeping a few TB online and a little bit of CPU power?

For the immeasurable benefits of educating the public.

lubujackson · 7 months ago
Truly, the most Googly of sunsets.
asdll · 7 months ago
> An absolute act of cultural vandalism.

It makes me mad also, but something we have to learn the hard way is that nothing in this world is permanent. Never, ever depend on any technology to persist. Not even URLs to original hosts should be required. Inline everything.

mrcslws · 7 months ago
From the blog post: "more than 99% of them had no activity in the last month" https://developers.googleblog.com/en/google-url-shortener-li...

This is a classic product data decision-making fallacy. The right question is "how much total value do all of the links provide", not "what percent are used".

bayindirh · 7 months ago
> The right question is "how much total value do all of the links provide", not "what percent are used".

Yes, but it doesn't bring the sweet promotion home, unfortunately. Ironically, if 99% of them don't see any traffic, you can scale back the infra, run it on 2 VMs, and make sure a single person can keep it up as a side quest, just for fun (but, of course, pay them for their work).

This beancounting really makes me sad.

quesera · 7 months ago
Configuring a static set of redirects would take a couple hours to set up, and literally zero maintenance forever.

Amazon should volunteer a free-tier EC2 instance to help Google in their time of economic struggles.
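For a sense of how little is involved, a frozen, read-only redirect service is roughly this (a minimal sketch; the mapping is made up, and a real deployment would load the full exported table instead):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # Hypothetical frozen mapping; a real deployment would load the full
    # exported short-key -> target-URL table once at startup.
    REDIRECTS = {
        "/abc123": "https://example.org/some/long/target",
        "/xyz789": "https://example.net/another/page",
    }

    class RedirectHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            target = REDIRECTS.get(self.path)
            if target:
                self.send_response(301)
                self.send_header("Location", target)
                self.end_headers()
            else:
                self.send_error(404, "Unknown short link")

    if __name__ == "__main__":
        HTTPServer(("", 8080), RedirectHandler).serve_forever()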

socalgal2 · 7 months ago
If they wanted the sweet promotion they could add an interstitial. Yes, people would complain, but at least the old links would not stop working.
ahstilde · 7 months ago
> just for fun (but, of course, pay them for their work).

Doing things for fun isn't in Google's remit

HPsquared · 7 months ago
Indeed. I've probably looked at less than 1% of my family photos this month but I still want to keep them.

sltkr · 7 months ago
I bet 99% of URLs that exist on the public web had no activity last month. Might as well delete the entire WWW because it's obviously worthless.
chneu · 7 months ago
Where'd all my porn go!?
SoftTalker · 7 months ago
From Google's perspective, the question is "How many ads are we selling on these links" and if it's near zero, that's the value to them.
fizx · 7 months ago
Don't be confused! That's not how they made the decision; it's how they're selling it.
esafak · 7 months ago
So how did they decide?
firefax · 7 months ago
> "more than 99% of them had no activity in the last month"

Better to have a short URL and not need it, than need a short URL and not have it IMO.

esafak · 7 months ago
What fraction of indexed Google sites, Youtube videos, or Google Photos were retrieved in the last month? Think of the cost savings!
nomel · 7 months ago
YouTube already does this, to some extent, by slowly reducing the quality of your videos if they're not accessed frequently enough.

Many videos I uploaded in 4k are now only available in 480p, after about a decade.

handsclean · 7 months ago
I don’t think they’re actually that dumb. I think the dirty secret behind “data driven decision making” is managers don’t want data to tell them what to do, they want “data” to make even the idea of disagreeing with them look objectively wrong and stupid.
HPsquared · 7 months ago
It's a bit like the difference between "rule of law" and "rule by law" (aka legalism).

It's less "data-driven decisions", more "how to lie with statistics".

FredPret · 7 months ago
"Data-driven decision making"
JimDabell · 7 months ago
Cloudflare offered to keep it running and were turned away:

https://x.com/elithrar/status/1948451254780526609

Remember this next time you are thinking of depending upon a Google service. They could have kept this going easily but are intentionally breaking it.

fourseventy · 7 months ago
Google killing their domains service was the last straw for me. I started moving all of my stuff off of Google since then.
nomel · 7 months ago
I'm still shocked that my Google Voice number still functions after all these years. It makes me assume its main purpose is actually to be a honeypot of some sort, maybe for spam call detection.
thebruce87m · 7 months ago
> Remember this next time you are thinking of depending upon a Google service.

Next time? I guess there's a wave of new people who haven't learned that lesson yet.

jaydenmilne · 7 months ago
ArchiveTeam is trying to brute force the entire URL space before it's too late. You can run a VirtualBox VM/Docker image (ArchiveTeam Warrior) to help (unique IPs are needed). I've been running it for a couple of months and have found a million.

https://wiki.archiveteam.org/index.php/ArchiveTeam_Warrior
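Conceptually, "brute forcing the URL space" just means walking the short-key alphabet and recording where each key redirects. A toy sketch of the idea (not the Warrior's actual code, which coordinates key ranges across volunteers and handles rate limiting):

    import itertools
    import string
    import http.client

    ALPHABET = string.ascii_letters + string.digits  # base62 key alphabet

    def resolve(key):
        """HEAD a goo.gl key and return (status, Location) without following it."""
        conn = http.client.HTTPSConnection("goo.gl", timeout=10)
        conn.request("HEAD", "/" + key)
        resp = conn.getresponse()
        location = resp.getheader("Location")
        conn.close()
        return resp.status, location

    # Walk every possible 6-character key (62**6 is roughly 5.7e10, hence the
    # need for many volunteers and many IP addresses).
    for chars in itertools.product(ALPHABET, repeat=6):
        key = "".join(chars)
        status, target = resolve(key)
        if status in (301, 302) and target:
            print(key, target)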

pimlottc · 7 months ago
Looks like they have saved 8000+ volumes of data to the Internet Archive so far [0]. The project page for this effort is here [1].

0: https://archive.org/details/archiveteam_googl

1: https://wiki.archiveteam.org/index.php/Goo.gl

localtoast · 7 months ago
Docker container FTW. Thanks for the heads-up - this is a project I will happily throw a Hetzner server at.
chneu · 7 months ago
I'm about to go set up my spare N100 just for this project. If all it uses is a little bandwidth, then that's perfect for my 10 Gbps fiber and N100.
wobfan · 7 months ago
Same here. I am genuinely asking myself what for, though. I mean, they'll receive a list of the linked domains, but what will they do with that?
hadrien01 · 7 months ago
After a while I started to get "Google asks for a login" errors. Should I just keep going? There's no indication of what I should do on the ArchiveTeam wiki.
ojo-rojo · 7 months ago
Thanks for sharing this. I've often felt that the ease with which we can erase digital content makes our time period susceptible to becoming a digital dark age for archaeologists studying history a few thousand years from now.

Us preserving digital archives is a good step. I guess making hard copies would be the next step.

AstroBen · 7 months ago
Just started, super easy to set up
cedws · 7 months ago
Why wouldn’t Google just publish a database of URLs? Even just a CSV file? Infuriating.
devrandoom · 7 months ago
I suspect there are links to some really bad shit in there. Google is probably in damage control mode.
cpeterso · 7 months ago
Google’s own services generate goo.gl short URLs (Google Maps generates https://maps.app.goo.gl/ URLs for sharing links to map locations), so I assume this shutdown only affects user-generated short URLs. Google’s original announcement doesn’t say so explicitly, but it is carefully worded to specify that short URLs of the “https://goo.gl/* format” will be shut down.

Google’s probably trying to stop goo.gl URLs from being used for phishing, but doesn’t want to admit that publicly.

growthwtf · 7 months ago
This actually makes the most logical sense to me, thank you for the idea. I don't agree with the way they're doing it of course but this probably is risk mitigation for them.
cedws · 7 months ago
That could be an explanation but even so, they could continue to serve the redirects on some other domain so that at the very least people can just change goo.gl to something else and still access whatever the link was to.
jedberg · 7 months ago
I have only given this a moment's thought, but why not just publish the URL map as a text file or SQLite DB? So at least we know where they went? I don't think it would be a privacy issue since the links are all public?
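Something as simple as this would do it (a sketch of what such a dump might look like; the table and column names are made up):

    import sqlite3

    con = sqlite3.connect("googl_dump.sqlite")
    con.execute(
        "CREATE TABLE IF NOT EXISTS links "
        "(short_key TEXT PRIMARY KEY, target_url TEXT NOT NULL)"
    )

    # Hypothetical rows, just for illustration.
    con.executemany(
        "INSERT OR REPLACE INTO links VALUES (?, ?)",
        [
            ("abc123", "https://example.org/paper.pdf"),
            ("xyz789", "https://example.net/page"),
        ],
    )
    con.commit()

    # Lookup: where did goo.gl/abc123 point?
    row = con.execute(
        "SELECT target_url FROM links WHERE short_key = ?", ("abc123",)
    ).fetchone()
    print(row[0])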
DominikPeters · 7 months ago
It will include many URLs that are semi-private, like Google Docs that are shared via link.
ryandrake · 7 months ago
If some URL is accessible via the open web, without authentication, then it is not really private.
chneu · 7 months ago
That's not any better than what ArchiveTeam is doing. They're brute forcing the URLs to capture all of them, so privacy won't really matter here.
charcircuit · 7 months ago
Then use something like argon2 on the keys, so you'd have to spend a long time to brute force them all, similar to how it is today.
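Something like this, presumably: publish slow_hash(key) -> URL with a public salt, so a lookup with a known key is cheap but enumerating the whole key space is not. A sketch using scrypt from the standard library as a stand-in for argon2 (names and parameters are illustrative only):

    import hashlib

    # Public, fixed salt so lookups are deterministic; the cost parameters are
    # what make bulk enumeration expensive.
    SALT = b"googl-dump-v1"

    def slow_key(short_key):
        return hashlib.scrypt(
            short_key.encode(), salt=SALT, n=2**14, r=8, p=1, dklen=32
        ).hex()

    # Published table: slow_key(short key) -> target URL (rows are hypothetical).
    TABLE = {
        slow_key("abc123"): "https://example.org/paper.pdf",
    }

    # Someone holding a known key pays the cost once per lookup...
    print(TABLE.get(slow_key("abc123")))
    # ...while brute forcing all 62**6 possible keys now costs 62**6 slow hashes.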
high_na_euv · 7 months ago
So exclude them
Nifty3929 · 7 months ago
I'd rather see it as a searchable database, which I would think is super cheap and maintenance-free for Google, and avoids these privacy issues. You can input a known goo.gl and get its real URL, but can't just list everything out.
growt · 7 months ago
And then output the search results as a 302 redirect and it would just be continuing the service.

devrandoom · 7 months ago
Are they all public? Where can I see them?
jedberg · 7 months ago
You can brute force them. They don't have passwords. The point is the only "security" is knowing the short URL.
Alifatisk · 7 months ago
I don't think so, but you can find the indexed URLs here: https://www.google.com/search?q=site%3A"goo.gl" (about 9.6 million links). And those are just the ones that got indexed; there should be way more out there.
spankalee · 7 months ago
As an ex-Googler, the problem here is clear and common, and it's not the infrastructure cost: it's ownership.

No one wants to own this product.

- The code could be partially frozen, but large scale changes are constantly being made throughout the google3 codebase, and someone needs to be on the hook for approving certain changes or helping core teams when something goes wrong. If a service it uses is deprecated, then lots of work might need to be done.

- Every production service needs someone responsible for keeping it running. Maybe an SRE, though many smaller teams don't have their own SREs, so they manage the service themselves.

So you'd need some team, some full reporting chain all the way up, to take responsibility for this. No SWE is going to want to work on a dead product where no changes are happening, no manager is going to care about it. No director is going to want to put staff there rather than a project that's alive. No VP sees any benefit here - there's only costs and risks.

This is kind of the Reader situation all over again (except for the fact that a PM with decent vision could have drastically improved and grown Reader, IMO).

This is obviously bad for the internet as a whole, and I personally think that Google has a moral obligation to not rug pull infrastructure like this. Someone there knows that critical links will be broken, but it's to no one's advantage to stop that from happening.

I think Google needs some kind of "attic" or archive team that can take on projects like this and make them as efficiently maintainable in read-only mode as possible. Count it as good-will marketing, or spin it off to google.org and claim it's a non-profit and write it off.

Side note: a similar, but even worse situation for the company is the Google Domains situation. Apparently what happened was that a new VP came into the org that owned it and just didn't understand the product. There wasn't enough direct revenue for them, even though the imputed revenue to Workspace and Cloud was significant. They proposed selling it off and no other VPs showed up to the meeting about it with Sundar so this VP got to make their case to Sundar unchallenged. The contract to sell to Squarespace was signed before other VPs who might have objected realized what happened, and Google had to buy back parts of it for Cloud.

gsnedders · 7 months ago
To some extent, it's cases like this which show the real fragility of everything existing as a unified whole in google3.

While maintenance and ownership clearly remain a major problem, one could easily imagine that deploying something similar, especially read-only, on GCP's Cloud Run and Bigtable products would be less work to maintain, as you're not chasing anywhere near such a moving target.

rs186 · 7 months ago
Many good points, but if you don't mind me asking: if you were at Google, would you be willing to be the lead of that archive team, knowing that you'll be stuck at this position for the next 10 years, with the possibility of your team being downsized/eliminated when the wind blows slightly in the other direction?
olejorgenb · 6 months ago
Does maintaining a frozen service like this[1] really require a team with a leader? I get that someone needs to know the service and do maintenance when necessary, but surely that wouldn't be much more than a 20% position or something? At least if some groundwork is done to make the now-simplified[2] service simpler to run.

[1] Almost the simplest possible service you can imagine (sans the scale, I guess), short of simple static webpages.

[2] The original product included some sort of traffic counter, etc. IIRC

spankalee · 7 months ago
Definitely a valid question!

Myself, no, for a few reasons: I mainly work on developer tools, I'm too senior for that, and I'm not that interested.

But some people are motivated to work on internet infrastructure and would be interested. First, you wouldn't be stuck for 10 years. That's not how Google works (and you could of course quit): you're supposed to stay with a team a minimum of 18 months, and after that you can transfer away. A lot of junior devs don't care that much where they land, and the archive team would have to be responsible for more than just the link shortener, so it might be interesting to care for several services from top to bottom. SWEs could be compensated for rotating onto the archive team, and/or it could be part-time.

I think the harder thing is getting management buy-in, even from the front-line managers.

ElijahLynn · 7 months ago
OMFG - Google should keep these up forever. What a hit to trust. Trust in Google was already low because of everything they've killed; this is another dagger.
phyzix5761 · 7 months ago
People still trust Google?