Readit News
whatever_dude · 9 years ago
I've had a similar problem. In updating my portfolio site recently, I noticed that the vast majority of links were dead. Not just links to live projects published three or more years ago (I expect those to die), but also links to articles and mentions from barely a year ago, links to award sites, and the like. With a site listing projects going back ~15 years, one can imagine how bad things were.

I ended up creating a link component that automatically points to the archive.org version of any URL I mark as "dead". Rot was so prevalent it had to be automated like that.
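The fallback logic is simple enough to sketch (a hypothetical helper, not the actual component; the `https://web.archive.org/web/2/<url>` form is a commonly used way to ask the Wayback Machine to redirect to a nearby capture):

```python
def portfolio_link(url: str, dead: bool = False) -> str:
    """Return the href for a portfolio link, falling back to the
    Wayback Machine when the original URL is marked dead."""
    if dead:
        # The short "2/" timestamp prefix makes the Wayback Machine
        # redirect to the capture it considers closest.
        return "https://web.archive.org/web/2/" + url
    return url
```

With that in place, marking a link dead is a one-flag change instead of hunting down an archive URL by hand.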

Another reason why I've been contributing $100/year to the Internet Archive for the past 3 years and will continue to do so. They're doing some often unsung but important work.

cheapsteak · 9 years ago
For portfolios, you should really also look into https://webrecorder.io/

It's _not_ a video recording service. It saves, and can replay, all network requests made during a session (including authenticated requests). It's open source and you can self-host. I'm not affiliated; I'm just very happy that it exists.

no_news_is · 9 years ago
Thanks for mentioning this site. I've done this with mitmproxy, but that was complicated; this was super easy. I'd recommend it to anyone.
whatever_dude · 9 years ago
That's impressive. I'll take a look, thanks for the tip.
nfriedly · 9 years ago
I've updated my portfolio before and noticed that as well. I usually include a screenshot or two when I first add a project, so at least that remains.

If the site goes down later, I just remove the link and don't worry about it. My code from 15 years ago is probably atrocious, so I'll consider it a small blessing :P

jedberg · 9 years ago
I actually downloaded a copy of the NYT article I was quoted in in 1996 specifically because I feared it would fall off the internet at some point.

It's behind a paywall now, but at least I have a digital copy!

Asooka · 9 years ago
Isn't that illegal, since you don't own the copyright? Or are you not distributing it and keeping it for archive purposes?
LoSboccacc · 9 years ago
beware: robots.txt can retroactively clear archive.org data
viddi · 9 years ago
To clear things up: robots.txt can retroactively hide content from the archive. If it's changed back to allowing the archive's crawler, content from before the ban can be accessed again.
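To make the mechanism concrete: the Internet Archive's crawler identifies itself as ia_archiver, so a robots.txt along these lines is what triggers the retroactive hiding (and removing it makes the old snapshots accessible again):

```text
User-agent: ia_archiver
Disallow: /
```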
deadalus · 9 years ago
another reason to use archive.is

Deleted Comment

CharlesDodgson · 9 years ago
I miss the optimism of the early web, when you could create a simple web page, join a web ring and going online was an event. It's richer and deeper now, but the rawness and simpleness of it all was enjoyable and novel.
addicted · 9 years ago
Is it richer and deeper now?

Maybe at the fringes, but I feel that the internet today, with my emphasis being on the "inter" (different) "net" (networks) part of it, is far less deep or rich than before. What we've basically been reduced to is a bunch of siloed networks such as Facebook.

When I searched for something when Google first came out I got a mix of results from a variety of sites I had never heard about. Today it's basically Wikipedia at the top, with results from the same list of about 3-4 sites depending on the topic of what I searched for.

hawski · 9 years ago
I'm beginning to think that there is a niche for a peculiar kind of search engine: a search engine for static pages with little to no JavaScript. It would penalize pages for ad-network usage.

I would really like my search results to exclude most sites that try to monetize my attention. I want raw facts and opinions. No click-bait to grab my attention or feed my internal cave man with rage. No ad networks or data-extraction operations. Just pages put there by people who want to share knowledge and ideas. I mostly find this on pages that lack ads and are often pure HTML: no CSS and no JS. At least in the areas that interest me.

Maybe there is a place for a search engine that would index only pages like that? It would certainly be easier than competing with Google on indexing the whole of the attention-whoring Internet.

userbinator · 9 years ago
> When I searched for something when Google first came out I got a mix of results from a variety of sites I had never heard about. Today it's basically Wikipedia at the top, with results from the same list of about 3-4 sites depending on the topic of what I searched for.

...and if you actually try to search for more obscure/"fringe" subjects/phrases with Google, you either get no results (despite knowing that there are still active sites with those phrases), or it starts thinking you're a bot sending "automated queries" and blocking you for a while (not even giving you the option of completing a CAPTCHA.)

The first time that happened to me, which was within this year, was my realisation that Google had truly changed, and not in a good way.

krapp · 9 years ago
Don't confuse people's tendency to no longer bother looking past the top of the first page of Google with the internet somehow shrinking into whatever fits those slots. Of course the most popular sites now dominate the top of Google's search results, but Google isn't the internet any more than Facebook is.

The breadth and depth of information on the web now vastly surpasses what was available in 1994. Youtube and other video and music streaming sites have provided a media revolution to compare with the transition from radio to television. Social media, whatever its drawbacks are, allows people to communicate and collaborate far more personally than email or basic chatrooms would have.

And let's not even get into the ways that JavaScript, HTML5, and WebAssembly have transformed, and will continue to transform, the web into a platform in which virtual machines converge into just another content type. I know people here like to rend their garments, scream Javascript Delenda Est[0] into the void, and hope everything that happened to the web in the last 20 years just goes away, but the day is coming when all archived and obsolete code will have a URL endpoint that bootstraps a VM and runs it. The best the web of 1994 could do was file downloads, maybe Java applets and Flash.

Sometimes the way people here seem to dismiss the modern web is baffling. I get it, but look at it from the point of view of the mainstream web user. The web offers access to so much more than would even have been possible in 1994, and lets people interact with one another on a much more direct and complex level.

Yes, the added richness and depth come with a lot of baggage, but they're undeniably there.

[0]https://news.ycombinator.com/item?id=11447851

[1]https://news.ycombinator.com/item?id=14697520

honestoHeminway · 9 years ago
The real bonus of the internet of old was that text was most of its content. Today you have video, pictures, emojis, and music, and in my opinion that reduces the experience.
dep_b · 9 years ago
I remember when browsing the internet was much more of a networked thing. If you want to know what it was like, you could take a look at Wikipedia, where you can still get lost in a never-ending web of links. However, Wikipedia is a very cleaned-up version of the early web. It lacks animated .gifs, for sure.

But the difference with Wikipedia was that people would maintain a Links section full of interesting stuff and people would join web rings for various subjects, interlinking vastly different sites. Finding information often happened through Yahoo! (AltaVista was there too but it was lacking the quality of handpicked results) through a tree based discovery system, to continue through whatever you could find through links on an interesting page. Exchanging links was something that really frequently happened.

It resulted in an internet where you just kept clicking and discovering and digging. It was sometimes also frustrating, as browsers lacked tabs and I would navigate all links one by one, loading each and going back. Sometimes I would forget how I arrived at a certain page because it was so deep and I never found the breadcrumbs again.

neonnoodle · 9 years ago
> It lacks animated .gifs for sure

Um excuse me what do you call this masterpiece https://upload.wikimedia.org/wikipedia/commons/e/ea/Ellipses...

awj · 9 years ago
> If you want to know what it was like you could take a look at Wikipedia, where you still can get lost in a never ending deeper web of links.

I would have said TVTropes, but the core point is the same.

I remember having to restart my computer, because IE lacked tabs and Windows would let you open so many instances of it that the whole OS ground to a halt.

It's weird to think that now, in the absence of Google, I couldn't find my way from anything to anything else.

Dead Comment

lkrubner · 9 years ago
I especially miss the blogosphere of the era 1999-2006, before the emergence of Facebook and Twitter. I miss the era when tech people could debate a new technical protocol by posting thoughtful essays on their blogs, and then other technical people would post rebuttals on their own blogs, and the conversation would go back and forth, among the various blogs, but out in the open, and very democratic. Nowadays a lot of the new protocols are, for all practical purposes, designed inside of Google or Facebook or Apple, and then announced to the world, without much debate.

For a close look at the earlier era, see this very long essay I wrote in 2006 (which was popular back in 2006) in which I summarized the tech world's blog debate about RSS:

http://www.smashcompany.com/technology/rss-has-been-damaged-...

"RSS has been damaged by in-fighting among those who advocate for it"

77pt77 · 9 years ago
I just miss that it was mainly text and could be used in a terminal without missing much.

More content density, better S/N ratio.

Mateon1 · 9 years ago
Recently I came across the NASA Astronomy Picture of the Day website[1], it's old-school HTML, started in 1995 and still updated today.

All the HTML takes 2-4 kB on average, but you might not get much use from the site in a terminal :)

[1]: https://apod.nasa.gov/apod/archivepix.html

Frondo · 9 years ago
Not specifically directed at you, but sort of directed at you:

I wish everyone who pined for the 25-years-ago days of mostly text pages would, instead of pining, just go out there and produce that content they want to see.

Instead of pining, start writing. Hosting is cheap or free, browsers still parse simple HTML, there's nothing stopping anyone from creating a return to that simpler form.

Apocryphon · 9 years ago
Maybe Neocities should add a webring feature.
kyledrake · 9 years ago
Hi, founder of Neocities.

I get this request a lot actually. The reason I decided to not do it was because webrings, though nice, had a lot of problems. The main issue was that people's sites would go away, and then the ring would break. I also didn't want to introduce any functionality that would make sites depend on Neocities backend APIs to function. Web sites are more long-term and durable if they remain (mostly) static.

I tried using "tags" that could bind sites together on Neocities, but to be honest the idea has largely been a failure. People will tag their site "anime" when it has nothing to do with anime... it's a popular tag, so they add it just to show up under it. Geocities had this problem to a certain extent too (a tech site being in a non-tech neighborhood). You can get a flavor of the problem here: https://neocities.org/browse?tag=anime

One idea I'm considering is to only allow a site to have one tag, rather than 3 like I do right now. Maybe that will stop people from adding tags that are irrelevant to the content of their site. Or it may compound this problem. I'm on the fence about it.

Another idea I'm considering is allowing people to create curated lists of their favorite sites on Neocities, similar to playlists on YouTube. The "follow site" functionality kind of does this, but in a generic way, and it tends to be a bit... I guess nepotistic (hey, you're popular, follow me so my site can get more popular too!)

I'm always happy to hear ideas on how to improve this. I do like the idea of related sites being able to clump together, but in practice it doesn't work as well as I would like it to. But maybe it works well enough and I'm overthinking it.

wcummings · 9 years ago
Is this a joke?
qrbLPHiKpiux · 9 years ago
I miss the dial-up BBS boards.
ChrisSD · 9 years ago
Related to this, I had trouble finding examples of pre-1996 web design. The Internet Archive has a lot from 1997 onwards. The oldest live examples of sites from that era that I know of are:

http://www.w3.org/History/19921103-hypertext/hypertext/WWW/T...

http://oreilly.com/gnn/gnnhome.html

http://www.trincoll.edu/zines/tj/tj12.02.93/tjcontents.html

The BBC also donated its Networking Club to the Internet Archive: https://archive.org/details/bbcnc.org.uk-19950301

rsync · 9 years ago
The Space Jam (film) website is a golden example and I find it amazing that it is still functional:

https://www.warnerbros.com/archive/spacejam/movie/jam.htm

(1996)

gitgud · 9 years ago
The trailers still work!

https://www.warnerbros.com/archive/spacejam/movie/cmp/jamcen...

A good demonstration of how far video on the internet has come.

160 × 120 px, 8 frames per second

koz1000 · 9 years ago
You could try my site for Pinball Expo 1994.

http://www.lysator.liu.se/pinball/expo/

Did it by hand with MS Notepad, MS Paint, Apple QuickTake 100 camera, Chameleon TCP/IP, and Mosaic for PC.

I did, however, drop the webmaster email address about 20 years ago.

1k2ka · 9 years ago
As a student at LiU who had no prior knowledge of Lysator, it's always interesting to see Lysator links in the wild; they seem to pop up when least expected.

How come the site is hosted at Lysator, and how come it's still up?

Sidenote: The man hosting the site has a very on-topic profile page[0].

[0] https://www.ida.liu.se/~davby02/

sdrothrock · 9 years ago
Lysator is incredibly nostalgic for me; when I was just starting on the internet around 1994-1995, a lot of my favorite websites were on lysator, including the gigantic Wheel of Time Index.
Sleeep · 9 years ago
Some good screenshots here: http://www.telegraph.co.uk/technology/0/how-25-popular-websi...

CNN's original online coverage of the OJ Simpson murder trial (1995) is still online and mostly intact - http://www.cnn.com/US/OJ/

Welcome to Netscape (94) - http://home.mcom.com/home/welcome.html

It's later than 96 but I don't know who is still paying for hosting for a stadium that was demolished over 16 years ago http://www.3riversstadium.com/index2.html

adventured · 9 years ago
There's always the famous Dole Kemp '96 campaign site (a fair representation of mid 1990s web design):

http://www.dolekemp96.org/main.htm

And this one from CNN (still 1996, but appropriate representation of that 1995/96 era when design had changed a bit from the earlier plain white backgrounds & basic text layouts):

http://edition.cnn.com/EVENTS/1996/year.in.review/

mynameishere · 9 years ago
I actually watched the debates and couldn't believe Dole closed out his first debate with Clinton imploring youngsters to "tap into" his "homepage" (the above link)

https://www.youtube.com/watch?v=lZhyS5OtPto&t=89m55s

macintux · 9 years ago
Oh, man. Dole/Kemp was definitely designed in the "Make sure everything fits on a 640 × 480 display, and downloads reasonably fast on a 14.4k modem" era.
davehtaylor · 9 years ago
I absolutely adore that CNN 1996 site. Man, designs like that were great.
davehtaylor · 9 years ago
I think the change in how we view site navigation is fascinating. Another commenter gave a link to an old Microsoft site that had links all over the place. But for the most part, it seemed like sites started to standardize on navigation vertically on the left side. Now we generally see it horizontally on the top, or in hamburger menus. It's interesting how that paradigm shifted. It seems like vertical side navigation would be more prevalent now, given how much wider monitors are.
SwellJoe · 9 years ago
With wider monitors, my browser is actually narrower. I split the screen in half and devote one half to the browser and the other half to a text editor and terminal. I used to need two monitors to do that, but now just one is fine; the side effect is that I browse in a pretty narrow window. A narrow window also makes reading somewhat nicer on some sites, since it's harder to read very long lines of text.
darpa_escapee · 9 years ago
> given how much wider monitors are.

Monitors might be wider, but screens in general are much narrower.

rayiner · 9 years ago
God I miss when the web was designed for people who could read.
ChrisSD · 9 years ago
Then you'll "love" Microsoft's 1994 page: https://www.microsoft.com/en-us/discover/1994/
Sleeep · 9 years ago
I don't miss Flash intros though. Or frames.

Good thing they were usually optional.

bahjoite · 9 years ago
A research project presenting pages from Geocities: http://oneterabyteofkilobyteage.tumblr.com/

Pages from 1996 start 102 pages prior to the last page (at present: http://oneterabyteofkilobyteage.tumblr.com/page/10743 )

wiremine · 9 years ago
Ironically, I think you'll have much better luck looking at web design books from that era instead of links on the web. I just pitched a bunch of my design books from that era, and they were full of "cutting edge" examples.
hmhrex · 9 years ago
That O'Reilly GNN site is beautiful. I miss this simplicity.
WrtCdEvrydy · 9 years ago
Need 3MB worth of JQuery and some swooping in animations.
liveoneggs · 9 years ago
I genuinely like the graphic + nav
observation · 9 years ago
It loaded so fast!
cJ0th · 9 years ago
orbitur · 9 years ago
There's even a Twitter account (created in 2013) that checks if it's still up, every 3 hours.

https://twitter.com/SpaceJamCheck

0134340 · 9 years ago
/r/abandonedweb. It's also ironically now a victim of time.

Deleted Comment

lbhnact · 9 years ago
If you want to go all the way back, UNC still hosts ibiblio.org, which has links to the first website at CERN http://info.cern.ch/ and TBL's first page.
culot · 9 years ago
Hard to believe Softhome.net is still around, and has changed little since 1996:

https://web.archive.org/web/19961226121720/http://www.softho...

Also funny that the email address I signed up for with them in 1996 is still active, even though I don't check it for years at a time.

keithnz · 9 years ago
My web page is still around: http://homepages.ihug.co.nz/~keithn/ It was mostly done pre-'96. Not that it was really well designed; I just spammed bezels and had a play with this cool new Java thing.

It's an embarrassingly amusing slice of life :)

akira2501 · 9 years ago
The US Presidential office has archived the many iterations of its 'White House' webpage.

https://www.archives.gov/presidential-libraries/archived-web...

zafka · 9 years ago
See zafka.com. While a small amount has been added, it was designed around 1994.
kutkloon7 · 9 years ago
This is a very important reason why books, in general, contain better information than websites. On websites, people care a lot less about the correctness of the information, since you can just update things later (though of course that doesn't always happen).

Also, sites are a very volatile medium. I often bookmark pages with interesting information to read later, and it inevitably happens once in a while that a site went down and I just can't find the information anymore.

rahiel · 9 years ago
> Also, sites are a very volatile medium. I often bookmark pages with interesting information to read later, and it inevitably happens once in a while that a site went down and I just can't find the information anymore.

I had the same experience and that's why I made a browser extension that archives pages when you bookmark them. (https://github.com/rahiel/archiveror)

kbenson · 9 years ago
Maybe something that archives to IPFS would be interesting. As things are marked as interesting, they are both archived and distributed based on interest.
jandrese · 9 years ago
I still have my bookmarks.html file I started building in 1995, but almost everything in it has rotted away. It's a shame too because a lot of the stuff in there would still be useful or interesting, but nobody wants to pay even a nominal fee to keep it online.
j_s · 9 years ago
I collected a list of ~15+ archival tools on a discussion of Wallabag last month: https://news.ycombinator.com/item?id=14686882

Happy to discover yours!

Crontab · 9 years ago
> I often bookmark pages with interesting information to read later, and it inevitably happens once in a while that a site went down and I just can't find the information anymore.

I've recently had this problem with some online fiction that I had bookmarked. Now, I was able to recover thanks to the Wayback Machine, but I really shouldn't depend on that.

I should really put some thought into archiving pages I like or getting a Pinboard account.

concernedctzn · 9 years ago
I have this problem too, thankfully archive.org has been able to resurrect most of the text based sites I bookmarked ages ago. Such an invaluable resource.
jacquesm · 9 years ago
Linkrot is a real problem. Especially for those sites that disappear before the archive can get to them.

On another note, the more dynamic the web becomes, the harder it will be to archive. So if you think the 1994 content is a problem, wait until you're living in 2040 and want to read some pages from 2017.

TheAnimus · 9 years ago
Turns out the solution to every Stack Overflow post will be

"JavaScript is required"

rahiel · 9 years ago
Content from Stack Overflow has better odds of surviving than this; they've uploaded a data dump of all user-contributed data to archive.org: https://archive.org/details/stackexchange. It's all plaintext. This is really generous of Stack Exchange and shows they care about the long term.
stordoff · 9 years ago
That's actually one of the reasons all my personal stuff gets built as HTML/CSS, with JavaScript used only for quality-of-life stuff (image lightboxes that work without putting #target in the browser history, auto-loading a higher-res image, that sort of thing).

I know I won't be maintaining it forever, but I want it to be accessible through the archive.

TazeTSchnitzel · 9 years ago
Server-side rendering please save us.
sah2ed · 9 years ago
Well, there's now Chrome headless which is slowly edging out PhantomJS for such use cases.
notgood · 9 years ago
It's actually fairly easy to record web sites despite how dynamic they are; all you have to do is save the response data of each XHR (and similar requests) and the rest of the state (cookies, URLs, date/time, localStorage, etc.).

For even more accuracy save a Chromium binary of the version at the time so it'll look exactly as intended.
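A minimal sketch of that record/replay idea (hypothetical class and field names, assuming you already have a way to intercept the requests, e.g. a proxy or the devtools protocol):

```python
import json
import time

class SessionRecorder:
    """Record responses keyed by (method, URL) so a browsing session
    can be replayed later without the origin servers."""

    def __init__(self):
        self.responses = {}   # (method, url) -> recorded response
        self.state = {}       # cookies, localStorage, timestamps, ...

    def record(self, method, url, body, headers=None):
        # Called once per intercepted XHR (or similar) response.
        self.responses[(method, url)] = {
            "body": body,
            "headers": headers or {},
            "recorded_at": time.time(),
        }

    def snapshot_state(self, cookies, local_storage):
        # Capture the rest of the session state alongside the traffic.
        self.state = {"cookies": cookies, "localStorage": local_storage}

    def replay(self, method, url):
        # On replay, serve the stored body instead of hitting the network.
        entry = self.responses.get((method, url))
        return entry["body"] if entry else None

    def dump(self):
        # Everything is plain data, so it serializes to JSON for archiving.
        return json.dumps({
            "state": self.state,
            "responses": {f"{m} {u}": r
                          for (m, u), r in self.responses.items()},
        })
```

Pairing a dump like this with the exact browser binary, as suggested above, is what makes the replay pixel-faithful rather than merely functional.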

amelius · 9 years ago
Solution: https://ipfs.io

> The average lifespan of a web page is 100 days. Remember GeoCities? The web doesn't anymore. It's not good enough for the primary medium of our era to be so fragile.

> IPFS provides historic versioning (like git) and makes it simple to set up resilient networks for mirroring of data.

Jtsummers · 9 years ago
IPFS is good and useful, but it only retains what people choose to retain.

If geocities.com/aoeu isn't popular, then IPFS won't store it unless someone bothered to pin it. And as soon as they stop, it'll disappear.

You need a dedicated host (like archive.org) to retain it, or volunteers willing to coordinate and commit their resources. Otherwise it's just more resilient (a good thing), but not permanent.

ric2b · 9 years ago
> Otherwise it's just more resilient (a good thing), but not permanent.

It's not "just" more resilient; it's also much more elegant and convenient. With the current web you need to go find some archived version of the dead link you found, while with IPFS the link can simply keep working, even after the creator stops hosting it.

ric2b · 9 years ago
If literally no one in the world bothers to keep a piece of content, not even archive organizations, what do you suppose could possibly work?
miguelrochefort · 9 years ago
Hello fellow dvorakist!
pmlnr · 9 years ago
IPFS is not a real solution at the moment. It's hard to use, the default daemon is so aggressive that Hetzner nearly blocked my server due to its scanning, and your site needs to be relative-URL based to be put on IPFS.

On the other hand, nobody is talking about the problem of domains: yes, linkrot is a thing, but much of it is due to dead domains and dead blogging/content silos.

lgierth · 9 years ago
I've had to deal with Hetzner and IPFS too; my conclusion is that it's Hetzner who are aggressive here. In one of the cases I had fixed the dialing-local-networks behaviour, and Hetzner still continued to block the server for about a week. They blocked it on 25 Dec and released it on 31 Dec.
jacquesm · 9 years ago
> Remember GeoCities? The web doesn't anymore.

Bad example:

http://reocities.com/

adventured · 9 years ago
Mixed example. Only a small fraction of Geocities pages / content have been preserved. Most of it is lost permanently.
mfoy_ · 9 years ago
A similarly annoying thing is when you find old TechNet articles, Stack Overflow questions, or blog posts that seem potentially really useful, but that have broken images, broken links, etc., so the content (possibly extremely useful at the time) is completely useless now.

It really stresses the importance of directly quoting / paraphrasing the content you want in your plain text, and not relying on external resources for posterity.

c22 · 9 years ago
The one I hate is when I find old forum posts explaining how to do something in the physical world and all the embedded photos are broken. Not because the image host went out of business or the user deleted them, but just because they didn't log in for a year and the host deactivated their account. This is why whenever I link a photo I upload it to my own server and I never change the URL.
CM30 · 9 years ago
Or when they're broken because the image host discontinued third party image hosting/started charging for it, despite said feature being the only reason their site caught on in the first place.

Looking at you Photobucket. And all those useful images now replaced with a meaningless Photobucket placeholder.

Sleeep · 9 years ago
I apologize in advance as there's no non-morbid way to ask this but... what happens to the images on your server if you die tomorrow? It would be exactly the same situation, right? They would exist until your bill is due in a couple years then your account will be deactivated and your images will linkrot.
mod · 9 years ago
Also 500px (I think, or a similar image host) has recently banned all 3rd-party images, at least at the free level, which has broken a TON of the old forum posts I want to see. It was the de facto image host, kind of like imgur is now.
tonto · 9 years ago
Wikipedia is also hit pretty hard by link rot; the nice thing there, at least, is that volunteers can try to fix it.
jake-low · 9 years ago
For anyone curious, you can help fix dead reference links on Wikipedia in just a few seconds. If you find a page that has a dead link (or several), click the "View History" tab at the top, then click "Fix Dead Links" to run the InternetArchiveBot on the page.

More info: https://en.wikipedia.org/wiki/Wikipedia:Link_rot

toomuchtodo · 9 years ago
Wikipedia is working with the Internet Archive to automate the prevention of link rot.
D-Coder · 9 years ago
I'm reading Raymond Chen's Old New Thing blog articles from 2006. Most of the links that I try (75%?) are dead.
indescions_2017 · 9 years ago
See also: Best of the Web '94 Awards. Presented at the First International Conference on the World-Wide Web, Geneva, Switzerland, May 1994.

https://en.wikipedia.org/wiki/First_International_Conference...

What's cool isn't how fast some of these technologies become obsolete, such as various Java applets and cgi-bin connected webcams. It's the static content that can survive until the end of time.

Like Nicolas Pioch's Web Museum. Bienvenue!

http://www.ibiblio.org/wm/paint/

jerven · 9 years ago
I must say the Swiss-Prot links from then still work. You're redirected to the uniprot.org website, but the links work.