I use Add to Reading List in Safari and, like you, for the most part don't look at it again... except when I'm on an airplane without wifi, because in the meantime it will have downloaded the page and synced with my phone. So I have a collection of offline reading material always there.
Great writeup. I too have a long reading list - currently at 133.
I use my own little side project (Savory) to track the list. When I come across a page that I don't have the time or energy to finish right now, I save it and add the "reading" tag to it. When I have free time, I can open the reading tag and pick up something new to read.
The best part is that I usually add a couple more tags (e.g. "security" or "economics") when I save a link. This way, the reading list lets me filter by topic. It has turned out to be an unexpected hack for attacking the growing list: I'm usually able to finish multiple articles in a single run, all on the same topic, because there is usually a link between them even when I saved them days or weeks apart.
Anyway, I like how OP actually has a reading history. I really need to add something similar in Savory. Right now, when I finish reading something, I just remove the "reading" tag, so I don't get a neat history.
I do the same thing, but in one big text file. I store all my general notes there. If it's an article I want to come back to, I write "article"; if it's a video, then "video"; etc. I also write any other keywords and/or information that I might find useful when coming back to it. Then it's a text search away.
Here's something I saved the other day:
22-09-05
article Tactical Decision Game (TDG)
- https://www.shadowboxtraining.com/news/2022/04/29/film-at-eleven/
- title: Film at Eleven
- scenario of the month
- make decisions
- make notes about your thinking
- compare with experts
It's mumbly and low effort, but it holds all the information I need both to find it and to see why I was initially interested.
I would love to see every social bookmarking service and RSS reader have an interoperable comment section.
Why should I manage my bookmarks outside of the browser if not for the benefit of receiving additional information about the articles?
Yes, del.icio.us, and something similar still exists at pinboard.in, but sadly without the full social sharing features that were available at del.icio.us.
I'm still waiting for a web extension that sends a copy of the webpage I'm looking at (for more than 1 minute) to an endpoint I specify along with some metadata like the URL and user-agent. Obviously, block certain domains like banks or email.
I'd like something to build up a searchable index of everything I've read recently so I can easily find content again, yet this is NOT something I want a 3rd party to do. I want to self-host something like a tiny Go or Rust server that only uses 10 MB of RAM to index all the pages into an embedded RocksDB/LevelDB/Badger-style database.
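A minimal sketch of the capture half, assuming a browser extension content script with host permission for the endpoint; the endpoint URL, the one-minute timer, and the blocklist below are all placeholders, and the self-hosted indexer behind the endpoint is a separate piece:

    // content-script.js (sketch only; endpoint, timer, and blocklist are placeholders)
    const ENDPOINT = 'http://localhost:8080/archive';
    const BLOCKED = ['bank.example.com', 'mail.example.com'];

    // Skip sensitive domains entirely.
    if (!BLOCKED.some(d => location.hostname.endsWith(d))) {
      // After a minute on the page, POST a copy of it plus some metadata to the endpoint.
      setTimeout(() => {
        fetch(ENDPOINT, {
          method: 'POST',
          headers: { 'Content-Type': 'application/json' },
          body: JSON.stringify({
            url: location.href,
            title: document.title,
            userAgent: navigator.userAgent,
            visitedAt: new Date().toISOString(),
            html: document.documentElement.outerHTML
          })
        }).catch(() => {}); // archiving is best-effort; ignore failures
      }, 60 * 1000);
    }

Whatever sits behind the endpoint can then index the posted JSON however it likes, e.g. into an embedded key-value store.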
If you're on a Mac, have a look at DevonThink. They offer a server version so you can access your stuff from anywhere, but you wouldn't even necessarily need it if you're just using all your own devices. With the normal version, all your databases are just stored with whatever cloud provider you want to use, or I think you can do WebDAV if you want to use your own. I absolutely love it.
That app was the reason I bought my first Mac, after a lifetime of Windows. Must have been in 2007 or something. I had read Steven Berlin Johnson’s account of how he uses DevonThink to research his books. Nice to hear people still use it in times of Roam, Notion and Obsidian.
This is a neat writeup. It's fun to think about how to potentially automate this kind of tracking.
> I wish there was an easy way to filter by "independent websites"
This side comment from the post is intriguing. Other than manual curation, I wonder if there is a way to identify commercial vs independent domains? This would make a really good entry point for a specialty search engine for indie sites only.
> It's fun to think about how to potentially automate this kind of tracking
I download my entire history regularly and use https://yacy.net/ to index it all. It's essentially a local search engine. It also works on the local file system and across machines.
I have a bookmarklet that saves the current page in some kind of weblog (tumblelog), where I can modify/tag/unlock entries for (public) visibility on a webpage. It's pretty easy to save the current page somewhere with JS via a bookmarklet.
I print-to-PDF everything I've found interesting enough to linger on for more than 2 minutes... and as a result, I've got 20+ years of Internet articles to go through and read offline, any time.
It's very interesting to see the change in the quality of technical writing over the last two decades. There's a definite, observable increase in click-bait style writing.
I have read them all. And I often go looking for articles in the archive. It's really quite handy to have every single interesting thing I've ever read accessible this way. I suggest you try it out for a year and see for yourself!
Nope, I just print-to-PDF onto my Desktop and then dump all my .pdf files into a "PDFArchive" folder once or twice a week.
I've got over 70,000 .pdf files in that archive now. It's very easy to query for data in this archive - ls is a surprisingly useful search tool when combined with a few |'s and grep's... and pdf2text is highly useful as well!
One of these days I'll get around to making some word cloud tools or so, maybe scanning the archive to find my favourite 3rd party sites, etc.
I have started using Obsidian (at work).
And I copy/paste into it any web content, text, or image I find useful from the intranet, emails, or Meet.
I try my best to organise things. But for the most part, I use the search engine and the [autogenerated] links.
The only requirement when adding content is to figure out whether it should be added to an existing note or whether a new dedicated note should be created.
[btw, note nesting does exist in Obsidian]
With this simple workflow, you completely eliminate the notion of provenance of the knowledge.
The knowledge is just there, and its arrangement is up to your own organisational habits.
After some time doing that, you end up with VERY dense notes (in terms of knowledge-per-line ratio) and very little useless (distracting) content.
For the moment I like that A LOT!
I've been logging all my web activity since 2018. On the server side, I filter out ad spam and other extraneous URLs, and then run a cronjob that converts all new HTML documents it sees to PDFs with wkhtmltopdf. It's been a great tool for finding stuff in those moments where I go "hm, I remember seeing something about this months ago..."
The PDFs are a couple of GB every year - I could limit that further by putting more aggressive filtering in place. There is also a rule that copies the contents of image and PDF URLs verbatim without converting them.
Over time I'm going to extend all of this to a more fully-fledged "sousveillance" dataset. I would love to add health and location data, but alas Apple is such a walled garden. But I did add a note-taking system, Zettelkasten-style.
The author doesn't appear to have documented the bookmarklet itself. If they're here (or anyone else knows), can you suggest what it might look like to have a bookmarklet collect the URL, page title, meta description, and image, and then set window.location.href?
I use such a bookmarklet to paste things into my own pastebin-like site. It logs the selected text, page title, and URL. Clips are then grouped by URL and domain.
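A minimal sketch of what such a bookmarklet could look like, assuming a hypothetical save endpoint at https://example.com/save (point it at your own service; the parameter names are made up as well). Collapse it to a single line before saving it as a bookmark:

    javascript:(function () {
      /* Read a <meta> tag's content, checking both name= and property= (for og: tags). */
      var meta = function (n) {
        var el = document.querySelector('meta[name="' + n + '"], meta[property="' + n + '"]');
        return el ? el.getAttribute('content') : '';
      };
      var params = new URLSearchParams({
        url: location.href,
        title: document.title,
        description: meta('description') || meta('og:description'),
        image: meta('og:image'),
        selection: String(window.getSelection())
      });
      /* Hand everything to the save endpoint; it can store the clip and redirect back. */
      window.location.href = 'https://example.com/save?' + params.toString();
    })();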
- Send to Feedbin (https://feedbin.com/blog/2019/08/20/save-webpages-to-read-la...)
- Never look at it again
Nice, read things asynchronously
That sounds good.
This sounds like that.
I want a cross between Delicious and Workflowy.
“Just one more click.” Addictive stuff xD
Anyone who remembers StumbleUpon will likely relate to (and probably also remember) this image https://beta-techcrunch-com.cdn.ampproject.org/ii/w820/s/bet...
https://en.wikipedia.org/wiki/StumbleUpon
https://www.devontechnologies.com/apps/devonthink
With a little time and experience, maybe one could develop a set of heuristics. Or maybe site selection would just have to be case by case.
Curation is a long-standing challenge, and one that is perhaps growing in importance given recent advances in generative technology.
The PDF idea is something a few people have mentioned - I might steal it, since I've been thinking of archiving that list for a little while now, too.
I can imagine that's ended up being a pretty large repository of PDFs, though!
It didn't even cross my mind that people might be interested in seeing that.
- author
My profile has the link