Readit News
tfederman commented on Newspapers Are Recommending AI-Hallucinated Novels   countercraft.substack.com... · Posted by u/greenie_beans
tfederman · 3 months ago
In the GPT-2 era I created CouldReads, a big data set of generated book titles/synopses trained on thousands of e-books. It was a fun project in the naivete of 2020 but it's less amusing now.
tfederman commented on Crawlers impact the operations of the Wikimedia projects   diff.wikimedia.org/2025/0... · Posted by u/edward
tfederman · 4 months ago
A while back I wrote up a way to turn the big Wikipedia XML dump into a database. Not a generic table with articles but thousands of tables, one for each article "type". I'm not sure if this is still the best way to go about it.

https://feder001.com/exploring-wikipedia-as-a-database-part-...
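A minimal sketch of the grouping step described above, assuming an article's "type" is the name of its infobox template (an assumption; the write-up may define it differently). The names `article_type`, `titles_by_type`, and `SAMPLE` are hypothetical. It streams `<page>` elements so the whole dump never has to sit in memory, and it stops short of creating the per-type tables.

```python
import io
import re
import xml.etree.ElementTree as ET
from collections import defaultdict

# Captures the template name after "{{Infobox", e.g. "{{Infobox settlement".
INFOBOX_RE = re.compile(r"\{\{\s*Infobox\s+([^|}\n]+)", re.IGNORECASE)

# Tiny stand-in for the real dump (hypothetical data, for illustration only).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
<page><title>Paris</title><revision><text>{{Infobox settlement
| name = Paris
}}</text></revision></page>
<page><title>Foo</title><revision><text>No infobox here</text></revision></page>
<page><title>Dune</title><revision><text>{{Infobox book | name = Dune}}</text></revision></page>
</mediawiki>"""


def local_name(tag):
    """Strip the XML namespace that export files put on every tag."""
    return tag.rsplit("}", 1)[-1]


def article_type(wikitext):
    """Return a normalized infobox name, or None if there is no infobox."""
    match = INFOBOX_RE.search(wikitext or "")
    if not match:
        return None
    return match.group(1).strip().lower().replace(" ", "_")


def titles_by_type(xml_file):
    """Stream <page> elements and group article titles by infobox type."""
    groups = defaultdict(list)
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, None
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                text = child.text
        kind = article_type(text)
        if kind is not None:
            groups[kind].append(title)
        elem.clear()  # release the finished page before reading the next one
    return dict(groups)


if __name__ == "__main__":
    print(titles_by_type(io.BytesIO(SAMPLE)))
```

Each key in the result would correspond to one table, with the infobox fields as candidate columns.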

tfederman commented on Show HN: Fountain of RSS – 30k active feeds and code for sourcing more   github.com/tfederman/foun... · Posted by u/tfederman
DIYgod · 4 months ago
Great work, but in fact, you can get countless RSS feeds through [RSSHub](https://docs.rsshub.app/).
tfederman · 4 months ago
That looks like a crowdsourced project for turning arbitrary sites into RSS, which is very cool, but I don't see a way to get a large RSS data set out of it. And with about 5000 sources (I think) it's not as large as what I was hoping for, but it could be a good complementary source.
tfederman commented on Ask HN: What are you working on? (March 2025)    · Posted by u/david927
tfederman · 5 months ago
RSS reader through Bluesky custom feeds: https://github.com/tfederman/stroma-news

Bluesky API library spun off from the other project: https://github.com/tfederman/pysky

Haven't really started it yet, but a master list of RSS feeds and the code I used to source them: https://github.com/tfederman/huge-rss-list

And also a new project to fetch all links seen in the Bluesky firehose and gather metadata to build a database of sites and pages at a more granular level than the domain. For example, is account X posting video links from one YT channel or many?
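As a rough illustration of "more granular than the domain", here is a sketch that keys each link by host plus first path segment and collects the keys per account. The function names and the choice of key are assumptions, not the project's actual design. A plain `watch?v=` link would only yield `("youtube.com", "watch")`, so telling channels apart in that case needs the fetched page metadata the comment mentions.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def site_key(url):
    """Return (host, first path segment), a key finer than the bare domain."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    segments = [s for s in parts.path.split("/") if s]
    return host, (segments[0] if segments else "")


def keys_per_account(posts):
    """Map each account to the set of site keys it has linked to.

    `posts` is an iterable of (account, url) pairs (hypothetical input shape).
    """
    seen = defaultdict(set)
    for account, url in posts:
        seen[account].add(site_key(url))
    return dict(seen)
```

An account whose set holds a single key is linking to one section of one site; many keys under the same host suggest many channels or sections.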

tfederman commented on Show HN: AI-Less Hacker News   save-buffer.github.io/no_... · Posted by u/sakras
tfederman · 2 years ago
Just for fun I wanted to do a simple server-side version of this where the submissions would be truly hidden on my account, so it would take effect on mobile too. It also avoids client-side artifacts like messed-up numbering.

https://github.com/tfederman/hacker-news-topic-hider

tfederman commented on From hell to HTML: releasing a Python package to easily work with Wikimedia HTM   techblog.wikimedia.org/20... · Posted by u/todsacerdoti
tfederman · 2 years ago
If anyone's interested in an approach to processing the data set quickly, I got something working and wrote it up when I was curious about turning the content into structured data for database tables.

https://feder001.com/exploring-wikipedia-as-a-database-part-...

https://feder001.com/exploring-wikipedia-as-a-database-part-...

tfederman commented on “Just remove the duck” (2013)   rachelbythebay.com/w/2013... · Posted by u/tarikozket
bgroins · 10 years ago
As someone who is responsible for 10-15 concurrent IT projects across numerous teams, many of whom I don't manage, this sounds terrible. I think the "no PMO" approach may work for homogeneous groups whose focus is one project, but for a large organization, effective project management is a must.
tfederman · 10 years ago
That's maybe true, and also a good reason I wouldn't work for a large company again.
tfederman commented on “Just remove the duck” (2013)   rachelbythebay.com/w/2013... · Posted by u/tarikozket
tfederman · 10 years ago
My company just doesn't have product or project managers and it's wonderful. Fewer meetings, no intermediaries, more agility. It could only work in practice, it could never work in theory.

u/tfederman

Karma: 28 · Cake day: May 27, 2014
About
tfederman at outlook dot com
