Readit News
xbmcuser · 2 years ago
I think a bigger problem than 38% of webpages being dead is that a lot of entities/groups/businesses now use Facebook pages almost exclusively and have no other web presence outside of Facebook. In other words, a Facebook account becomes a requirement to interact with them.
nicbou · 2 years ago
The same happened with forums. They're all subreddits, Facebook groups or Discord chats now. A lot of valuable information is kept hidden in those groups now, and it makes me really sad.
daniel_reetz · 2 years ago
I love forums. I've kept the DIY Book Scanner forum online since... 2009? Recently (last two years) these damn AI scrapers have killed phpBB over and over again. They got me kicked off my shared web hosting plan by abusing search and other forum features.

I upgraded to a VPS for $500. The other admin spent 15-20 hours fixing/troubleshooting/transferring. And you know what? At the end of all this, I paid to give my data to these jerks, to keep it online for them to harvest. The forums are dead quiet.

Now I think, Discord is fine. They'll just sell the data to AI companies directly, the burden won't fall on me.

matwood · 2 years ago
Reddit at least shows up in searches. I also think it's important not to look at the past with rose-colored glasses. I think some random forum is much more likely to disappear than a subreddit.
wil421 · 2 years ago
Car forums are still alive, but yeah, the shift from thread discussions to comment and/or video discussions really kills a lot of knowledge. It’s great to find old forum posts showing you how to work on your car. It’s tiresome skipping through videos to find what you need, or even searching Reddit.

The big thing about Discord is that you can chat with people right now, but the knowledge is not in a good format to come back to later.

Barrin92 · 2 years ago
I'm probably in the minority of people who appreciate that trend. Valuable information being hidden means that community comes before information. If you want to gain access to other people's knowledge you have to opt-in and interact with and understand the people who made it, and that creates an incentive to contribute back and use knowledge in an appropriate way.

The open internet seems increasingly predatory and a place where some gigantic ML company just vacuums up your stuff or resells your content for ad revenue, parasitic.

I don't mind the fact and think it's honestly a natural reaction to this that people guard their information. It's sort of like a medieval monastery version of the internet where people recognize that information is cultivated rather than just some commodity you scrape off the web.

lolive · 2 years ago
AI will make all that even worse. Data staying hidden behind nice UX is VERY bad news. Maybe all that will lead to an equivalent of open source, but for data.
Iulioh · 2 years ago
Man, I hate it.

Not only are a lot of communities hidden because of Discord (at least with Reddit they were more discoverable), the worst part is that they are unsearchable or behind a paywall.

Something like "join my Discord if you pay at least $3/mo!" is pretty innocent, but you are gatekeeping a community that was public before.

If we are talking about something like a content creator focused on a hobby or PC problems, you can see how Google will become even more useless.

Reddit was the least bad choice between it and Discord, but it has failed at the "I want to be a social network" part.

every · 2 years ago
I only use Facebook to stay in touch with widely dispersed family members. Nothing else. One peek a day to see what's up. Assuming you have an account, I find this makes the task much easier:

https://www.facebook.com/?filter=friends

KORraN · 2 years ago
Thanks for this tip! I've added this to the parameter that I use to have most recent posts at the top: https://www.facebook.com/?sk=h_chr&filter=friends
Log_out_ · 2 years ago
And Meta keeps things endlessly. Not just a hyper-compressed picture and a set of references to local files. That part of the siloed web vanishes too, just less dangly and obvious.
rchaud · 2 years ago
Are there any businesses of any notable size that are using Facebook alone? Local businesses near me have plenty of info on Google Maps. The website, if they have one, is usually out of date, but calling them directly answers my questions.
skeeter2020 · 2 years ago
Also 38% of a web filled with diversity, no hidden agenda, and amateurs (in the first best of ways). This number is probably now .00001% of a much bigger, far more homogeneous web. A web 1.0 site > today's walled garden "group page".
pier25 · 2 years ago
I've been to restaurants where they only have the menu in digital and uploaded to FB. And they looked at me as if I was a weirdo when I told them I don't use FB.
Brosper · 2 years ago
Many times I have recommended that my clients use Facebook instead of their own website, because a website would have been overkill. Often having your own website is a waste of money.
coffeebeqn · 2 years ago
I’ve had multiple pages and blogs since 2013 that I just didn’t feel like maintaining or paying the hosting and domain fees for anymore.
carlosjobim · 2 years ago
You can see business pages and their info on Facebook without an account. If they publish their email you can also contact them.
dzhiurgis · 2 years ago
I get your sentiment, but Facebook also acts as a spam filter, which is not an entirely bad thing for a business owner.
spurgu · 2 years ago
From a user perspective Facebook's feed is spam.

You used to be able to see a custom feed for a selected friend list, but since they removed that option the site has been completely unusable, unless perhaps you do something like remove 90% of your "friends" and groups, but that would hurt usability in different ways.

soulofmischief · 2 years ago
Works both ways, too!

If a business is only on Facebook, I don't do business with them as I don't use Facebook.

A win-win in my book, as I prefer doing business with people whose ethics overlap with my own.

Zambyte · 2 years ago
Funnily enough I specifically don't use Facebook or other Facebook owned services because of all of the spam.
elorant · 2 years ago
If you advertise through Facebook you get a lot of fraudulent traffic. So I don’t see how they fight spam.
detourdog · 2 years ago
Also acts as a customer filter for us old fogeys.
iamacyborg · 2 years ago
You’re not really running a business if all your content is on someone else’s platform and they don’t pay you for it.
philistine · 2 years ago
I’m doing my part. The non-profit I steer only had a Facebook page. I made them a website.

amanzi · 2 years ago
Some of the better websites at least make an effort to archive old content. E.g. here are the CNN and BBC websites with coverage of the 9/11 attacks:

http://news.bbc.co.uk/hi/english/static/in_depth/americas/20...

http://edition.cnn.com/SPECIALS/2001/trade.center/index.html

Don't expect many of the links to work properly, but it's still interesting to see what the web used to look like.

mhh__ · 2 years ago
Some of the interactive stuff in old BBC election coverage still almost works to this day.

Hard to imagine that with many sites now, 20 years on. It's not even that it's impossible with the technology; it's probably closer to how writing got worse after the invention of the word processor. Everything is managed and structured now, so the freedom / bubble needed to make things good in a way that can't be easily explained is gone.

squarefoot · 2 years ago
Be sure to donate some quid to the Internet Archive (archive.org) to support their efforts to preserve (not just) old content, then do your best to make local copies of anything you find of value, just in case they disappear one day. A good number of mostly technical pages I have in my bookmarks file, that grew steadily and has been moved during installations for over 20 years, now point to their latest complete backup before the said page went silent. The Internet Archive is a huge boon to everyone.
massysett · 2 years ago
I realized I was overusing bookmarks. I now save a webpage (perhaps as PDF) if it contains information I want to refer to later, such as an insightful article, technical information, a humorous bit, or the like.

Bookmarks are good only for links to things for which only the most current version is worth accessing. That’s my banking websites, a shopping site, my employer’s remote desktop system, etc.

dewey · 2 years ago
There's also https://archivebox.io which can take your bookmarks and archive them in many ways. Unfortunately, when I last tried it, it was a bit buggy. I wish there was a better solution to build a nice archive of the sites I visit more often, just in case.
ByThyGrace · 2 years ago
In that same vein, shoutout to Epub Press[0]: a browser extension/web service that packs selected open tabs into a neat conformant .epub file.

0: https://epub.press

rchaud · 2 years ago
I save webpages as PDF because they retain the images and fonts of the original page. One issue I run into is that sticky headers/footers used on websites often obscure top/bottom text of the page when exported to PDF. This can be addressed by using UBO to remove the sticky DOM elements before saving, but it's a bit of a hassle.
toomuchtodo · 2 years ago
Others have recommended ArchiveBox, I will recommend using any bookmarking tool that fires off a web request to the Wayback Machine to archive a page when you create the bookmark.
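A minimal sketch of that idea in Python, assuming the Wayback Machine's Save Page Now endpoint at `https://web.archive.org/save/<url>`. The function names and the `opener` parameter are hypothetical, introduced only for illustration; a real bookmarking tool would call something like this from its own "bookmark created" hook.

```python
import urllib.request

SAVE_PREFIX = "https://web.archive.org/save/"  # Save Page Now endpoint


def wayback_save_url(url: str) -> str:
    """Build the Save Page Now request URL for a bookmarked page."""
    return SAVE_PREFIX + url


def archive_on_bookmark(url: str, opener=urllib.request.urlopen) -> bool:
    """Ask the Wayback Machine to capture `url`; return True on success.

    `opener` is injectable (hypothetical parameter) so the function can be
    exercised without network access.
    """
    try:
        with opener(wayback_save_url(url), timeout=30) as response:
            return 200 <= response.status < 400
    except OSError:
        # URLError, HTTPError and timeouts are all OSError subclasses;
        # treat any of them as "not archived" and let the caller retry later.
        return False
```

Captures can be slow or rate-limited, so a tool built this way would probably want to run the request in the background rather than block the bookmark action.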
amelius · 2 years ago
This is something a browser could in theory do for you.
dotancohen · 2 years ago
How much disk space did that consume?

I like the idea that in addition to saving the page, you can annotate it as well.

astrostl · 2 years ago
I wish the Internet Archive would split itself into two entities: one that simply archives web sites, and the other that does everything else (e.g., edgy IP testing of ebooks and video games). That way if the "other" entity gets sued into oblivion, the web sites remain. I think what the former is doing is a critical service for humankind, and I do donate, but I worry about their future.
toomuchtodo · 2 years ago
Don't worry.
earthboundkid · 2 years ago
I have run a news website since 2019. Every hour, I have a crawler look for dead links. I replace about one link a day with a link to archive.org. The funniest ones are the day after an election when all the candidate websites go blank. The saddest are the government websites that go offline from 3am to 5am every week.
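A rough sketch of what such a check might look like in Python. This is not the commenter's crawler: the function names, the "dead" criterion (HTTP status 400 or above, or no response at all), and the snapshot timestamp are all assumptions. It relies on Wayback Machine URLs of the form `https://web.archive.org/web/<timestamp>/<url>`, which resolve to the capture closest to the given timestamp.

```python
import urllib.error
import urllib.request
from typing import Optional


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for `url`, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error status
    except OSError:
        return None  # DNS failure, refused connection, timeout, ...


def wayback_url(url: str, timestamp: str) -> str:
    """Link to the Wayback Machine capture closest to `timestamp`."""
    return f"https://web.archive.org/web/{timestamp}/{url}"


def replacements_for_dead_links(links, timestamp, status_of=fetch_status):
    """Map each dead link to an archive.org link; live links are left out.

    `status_of` is a hypothetical hook so the check can be swapped or faked.
    """
    replacements = {}
    for link in links:
        status = status_of(link)
        if status is None or status >= 400:  # assumed definition of "dead"
            replacements[link] = wayback_url(link, timestamp)
    return replacements
```

Given the government sites mentioned above that go offline for a couple of hours every week, a real crawler would probably want to see a link fail on several separate runs before swapping it out. Some servers also reject `HEAD` requests, so falling back to `GET` before declaring a link dead would be a reasonable refinement.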
notRobot · 2 years ago
Interesting, does your crawler check every link every hour, or does it go through them a batch at a time?
earthboundkid · 2 years ago
onion2k · 2 years ago
I'm surprised it's not more. 2013 was long after the days of hobbyist websites of the early net, and into the time when most new sites were business driven. Given how long businesses last, I'd expect many more sites to be long gone 11 years later. I guess maybe the death of a lot of community-building spaces (Angelfire, Geocities, etc.) probably accounts for a lot of them going.

What would be particularly interesting would be to graph how long websites last for. I suspect quite a lot of the content from the early days is still around, and this period (2008 - 2018) is the peak of sites vanishing.

rchaud · 2 years ago
A lot of the content from the early days was on platforms which are long dead:

- Geocities

- University-provided FTP folder (deleted after you graduate)

- ISP-provided FTP folder (all those Earthlink, Juno, Comcast sites: probably deleted)

lagniappe · 2 years ago
I hope not all things last forever. A while back I stumbled upon my first .com, from the 90s, which was hosted on Angelfire and dutifully rehosted by archive.org and it went about how you'd imagine.

Despite being in 4th grade when my little friend and I made the webpage, things on there (while fine for the era) are just not okay by today's standards even if I understand the context for what led to it being there. It was nothing terrible, but just distasteful in a blissfully unaware way a 4th grader in the 90's would be. I realize that stuff will probably never be off my conscience and I just have to deal with it and hope nobody sees it.

otachack · 2 years ago
I have similar material. If it's reassuring, we all were just kids/teens and learning of the world. I feel a lot for the youth after us that made the Internet more accessible and, at times, more permanent.
BirAdam · 2 years ago
I feel your pain.

Thankfully, even the archive occasionally takes stuff off.

zokier · 2 years ago
Everything on the internet is intrinsically ephemeral. Embrace that instead of fighting against it. If you want to archive stuff then make offline copies. PDF/A (especially the -1 and -2 versions) is a format explicitly designed for archiving and works well for static content.

I think it is a bit of a shame that mirroring is not more readily built into the web stack (=http/html); if you could trivially make links that included a local copy (as fallback?), this linkrot would be a far lesser concern. The way that, for example, Wikipedia links everything through archive.org is a bit of a hack imho.

badgersnake · 2 years ago
I’m surprised it’s that low to be honest. Most of the web seems to be SEO crap these days.
brabel · 2 years ago
Agree. Sometimes you just experiment with something, put up a tiny website somewhere... forget about it until you decide it's no longer relevant for whatever reason and you pull the plug on it... it's not a bad thing. But it's great to have stuff like web archives though, to keep our collective memory for worthwhile content. I especially hope that accurate accounts of events get preserved, as they were originally written, somewhere they can't be changed. That's because rewriting history seems to be a favourite these days, and preserving the original accounts as things were happening can help combat this. Even if the accounts were not completely accurate, they can help us understand the actions of contemporary actors, i.e. you may be able to understand what they thought was true at the time, even if that was later revealed to be incorrect.
nicbou · 2 years ago
Some things still exist but are just no longer surfaced by Google.