Atom feed format was born 20 years ago

Atom vs RSS is a great example of how technical correctness is trumped by social factors, in this case namely support from makers of popular software and content as well as social influence and documentation skills of creators.

The person who pushed RSS to success (IMO), Dave Winer, was superb at communicating and evangelizing his goals, connecting partners like Netscape and NYT, and documenting his work, including the RSS-related tools he built.

His spec was “worse” in the sense that it was underspecified but better in the sense that it achieved wide support (both in text and podcast form) among people who made content. This is partly because Dave had first an influential email newsletter and Wired column (DaveNet) and second an influential, very early blog, Scripting News. He had also been working with news companies for years at prior startups. He could write well. He showed up for and arranged meetings with people who did not at first understand the need for something like rss. He was clear and relentless in his promotion, which was born out of what seemed to be a genuine desire for open standards in this area rather than greed / trying to do lock-in.

People with technical backgrounds in places like this tend to fixate on the technical aspects of Atom vs RSS. There is no question Atom is more technically correct. There is also no question (IMO) it came too late and focused on the wrong things — being correct and complete at the expense of being complicated and hard to understand — and more importantly was led and promoted by people who lacked the social skills to make it popular outside of technical circles. (These folks could be brutal about rss's flaws without seeming to have awareness of this shortcoming in their own effort.)

throw0101c · 2 years ago

> Atom vs RSS is a great example of how technical correctness is trumped by social factors

Or perhaps having a six-year head start (RSS 0.90, 1999; Atom, 2005) and having it compound.

This includes podcasting, as the term was coined in 2004, but which was happening even before that (and before Atom was finalized):

* https://en.wikipedia.org/wiki/Podcast#Etymology

First mover advantage, leading to network effects, can be a thing.

acdha · 2 years ago

I’d also add Google dropping Reader and trying to force open web content into various proprietary schemes. Atom’s many improvements over RSS didn’t matter as much when the oxygen was getting sucked out of the room and few people were investing more time in feeds.

samstave · 2 years ago

So is this as BetaMax was to VHS?

EDIT on 'podcast':

I was literally wondering this AM where 'pod'cast came from - thank you for the link (I love in-sects, which is why I love Etymology!)

westurner · 2 years ago

RSS > RSS compared with Atom https://en.wikipedia.org/wiki/RSS#RSS_compared_with_Atom

CDATA > CDATA in XML: https://en.m.wikipedia.org/wiki/CDATA#CDATA_sections_in_XML

IIRC RSS was not originally an XML document, so CDATA tags (to prevent XSS) didn't work; and the issue remains with content syndication: feed elements should somehow HTML escape their content to prevent XSS (arbitrary JS on a different Origin)
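A minimal sketch of the escaping step described above, assuming the reader treats feed titles and bodies as untrusted plain text and uses Python's standard `html.escape`. The `render_item` helper and the sample strings are made up for illustration; feeds that carry real HTML would need a sanitizer instead of blanket escaping.

```python
import html


def render_item(title: str, body_text: str) -> str:
    """Render untrusted feed text as an HTML fragment.

    Hypothetical helper: every markup character is escaped, so nothing
    from the feed can turn into a live tag or script on the reader's page.
    """
    safe_title = html.escape(title)
    safe_body = html.escape(body_text)
    return f"<article><h2>{safe_title}</h2><p>{safe_body}</p></article>"


rendered = render_item('Hi <script>alert("x")</script>', "Tom & Jerry")
```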

XSS: Cross-site Scripting: https://en.wikipedia.org/wiki/Cross-site_scripting

Same-origin policy: https://en.wikipedia.org/wiki/Same-origin_policy https://developer.mozilla.org/en-US/docs/Web/Security/Same-o...

Content Security Policy (CSP) https://en.wikipedia.org/wiki/Content_Security_Policy https://developer.mozilla.org/en-US/docs/Web/HTTP/CSP

CSRF: Cross-site request forgery: https://en.wikipedia.org/wiki/Cross-site_request_forgery

JSONP > CSRF: https://en.wikipedia.org/wiki/JSONP#Cross-site_request_forge...

The whole internet was broken, and RSS helped us realize it: the one-way, one-time syndication advantage.

These days it's all about https://schema.org/CreativeWork JSON-LD instead of RDFa, which you can try to sanitize with Mozilla/bleach like arbitrary HTML in comments on the page.

DonHopkins · 2 years ago

https://news.ycombinator.com/item?id=7728020

DonHopkins on May 11, 2014 | on: The Unix Haters Handbook (1994) [pdf]

[...]

The worst use of the <BLINK> tag ever was the discussion held in the early days of RSS about escaping HTML in titles, whose attention-grabbing title went something like this: "Hey, what happens when you put a <BLINK> tag in the title???!!!"

The content of that notorious discussion went on and off and on and off for weeks, giving all the netizens of the RSS community blogosphere terrible headaches, with people's entire blogs disappearing and reappearing every second, until it finally reached a flashing point, when Dave Winer humbly conceded that it wasn't the user's fault for being an idiot, and maybe just maybe there was a tiny teeny little design flaw in RSS, and it wasn't actually such a great idea to allow HTML tags in RSS titles.

eduction · 2 years ago

Ya I remember this period. A lot of people got caught up in these types of questions. “What if you want to put an unescaped greater than symbol in your post title??” People spent years on this sort of pedantry.

Meanwhile Dave added enclosures and popularized podcasting making RSS even more important. He knew where to focus and what mattered.

talideon · 2 years ago

The craziest thing I see though is people importing the Atom namespace into RSS feeds to get some of the elements. At that point, I can't fathom what advantage there is for the producer in not just producing an Atom feed.

RSS 1.0 was basically the first attempt to produce something like Atom without reinventing too much, but Dave threw his toys out of the pram, so the only option was to reinvent it with a different name and clean up the remaining sharp corners.

It's tragically funny. Atom is both more correct and easier to produce and parse than any RSS variant.

Were I to join Automattic in the morning, the first thing I'd try is to attempt to get them to abandon their weird RSS mashup as their default feed format. There's no good reason why Wordpress still generates RSS feeds.

Yes, I'm a bit bitter, but I was in the feed wars, and it still stings.

ttepasse · 2 years ago

> The craziest thing I see though is people importing the Atom namespace into RSS feeds to get some of the elements.

Interestingly enough you can’t do the reverse: because RSS never got a proper XML namespace, its elements can’t be embedded into other XML. I can understand that the early RSS 0.9x efforts didn’t do XML namespaces since those were rather new then, but the publication of RSS 2.0 could have been the right moment to introduce one, I think.

rcade · 2 years ago

Atom is a great feed format with a rock-solid spec.

I wouldn't call it a weird mashup for WordPress to use atom:link in their RSS feeds. It does things no RSS element can do, such as allow a feed to identify its own URL.

For those who don't know, every WordPress site with an RSS feed also has an Atom feed available by adding "/atom" to the end of the RSS feed URL, such as this:

RSS feed: https://wordpress.com/blog/feed/

Atom feed: https://wordpress.com/blog/feed/atom/

toyg · 2 years ago

> I can't fathom what advantages

The advantage is that the imported Atom bits will be namespaced, so an old client can decide to ignore them and just treat the feed as RSS.
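A rough sketch of what that looks like to a parser, assuming Python's `xml.etree.ElementTree` and a made-up feed at example.com. The "old client" here is just a loop that only recognises plain RSS element names; the namespaced `atom:link` never matches them, so it is skipped without any special handling.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical RSS 2.0 feed that borrows atom:link to state its own URL.
RSS_WITH_ATOM = """<?xml version="1.0"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Example blog</title>
    <link>https://example.com/</link>
    <description>A hypothetical feed</description>
    <atom:link href="https://example.com/feed/" rel="self" type="application/rss+xml"/>
  </channel>
</rss>"""

channel = ET.fromstring(RSS_WITH_ATOM).find("channel")

# "Old" client: only knows un-namespaced RSS element names.
known = {"title", "link", "description"}
old_view = {child.tag: child.text for child in channel if child.tag in known}

# Newer client: additionally looks up the namespaced element.
self_link = channel.find(f"{{{ATOM_NS}}}link[@rel='self']")
self_url = self_link.get("href") if self_link is not None else None
```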

In format wars, wider compatibility tends to win, even if the overall standard is just worse. Developers are lazy.

treve · 2 years ago

Almost any reader and platform supports both so I don't even know why the discussion is relevant.

If you have a preference, use it and feed readers will likely just work.

thaumasiotes · 2 years ago

> Atom vs RSS is a great example of how technical correctness is trumped by social factors

I don't follow. Out there in the world, RSS feeds provide their feeds in Atom format. The technical format is called "Atom" and the functionality that the Atom format implements is called "RSS".

In what sense did technically-correct Atom get trumped by anything? This is like complaining that social factors caused "the SAT" to get trumped by "standardized testing".

eloisant · 2 years ago

RSS and Atom are 2 different formats.

Your confusion probably comes from the fact that RSS is older, so it's sometimes used as the name of the functionality but it's improper. They're 2 different formats.

majormajor · 2 years ago

> Out there in the world, RSS feeds provide their feeds in Atom format.

I just checked a few of the ones I follow, and ... turns out I don't immediately know how to distinguish when there's not a specific xml namespace reference in the doc or such.

But according to Wikipedia the RFC822 timestamps I'm seeing suggest they're RSS2 instead of Atom?

still dunno what's the difference between rss1.0, 2.0 and atom
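One way to tell them apart, sketched with Python's `xml.etree.ElementTree`: look at the root element rather than the dates. Atom uses a namespaced `feed` root, RSS 0.91 to 2.0 use a bare `rss` root, and the RDF-based versions (0.90, 1.0) use `rdf:RDF`. The function name and return labels are made up for illustration, and real-world feeds may need more forgiving handling.

```python
import xml.etree.ElementTree as ET

ATOM_FEED = "{http://www.w3.org/2005/Atom}feed"
RDF_ROOT = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}RDF"


def detect_feed_format(xml_text: str) -> str:
    """Guess the feed family from the document's root element."""
    root = ET.fromstring(xml_text)
    if root.tag == ATOM_FEED:
        return "atom"
    if root.tag == RDF_ROOT:
        return "rss-rdf"  # RSS 0.90 / 1.0 branch
    if root.tag == "rss":
        return f"rss-{root.get('version', 'unknown')}"
    return "unknown"
```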

x2rj · 2 years ago

Atom is an order of magnitude more complex and strict, a standard by people who really love xml, in contrast to the really simple and less strict rss 2.0. For example, almost everything is optional in rss 2.0, so you can have a reasonable feed for stuff like tweets or linkblogs where there is no obvious title. In contrast, atom enforces a title for every item, which makes this a messy experience.

I have implemented an rss 2.0 parser faster than I could understand the atom specification. Atom can do stuff like encode html inline in the xml instead of as a CDATA string. In theory this sounds great, but it ends up in a big mess of complexity (e.g. a blogpost with handwritten invalid html).

These days there is also JSONFeed which is really easy to parse, simple and flexible, but it is not supported everywhere yet.
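For a sense of how little is needed on the parsing side, here is a small sketch using only Python's `json` module. The feed content is invented; the field names (`version`, `title`, `items`, `id`, `url`, `content_text`, `content_html`) follow the JSON Feed 1.1 spec, where an item's `title` is optional.

```python
import json

# Hypothetical JSON Feed document; the first item has no title, linkblog-style.
JSON_FEED = """{
  "version": "https://jsonfeed.org/version/1.1",
  "title": "Example linkblog",
  "items": [
    {"id": "1", "url": "https://example.com/1", "content_text": "A post with no title"},
    {"id": "2", "url": "https://example.com/2", "title": "Titled post", "content_html": "<p>Hello</p>"}
  ]
}"""

feed = json.loads(JSON_FEED)

# Fall back to a placeholder when an item carries no title.
summaries = [(item["id"], item.get("title", "(untitled)")) for item in feed["items"]]
```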

simonw · 2 years ago

The RSS 2.0 spec has a horrible flaw:

https://www.rssboard.org/rss-2-0-1-rv-6#hrelementsOfLtitemgt

> An item may also be complete in itself, if so, the description contains the text (entity-encoded HTML is allowed; see examples)

Note "is allowed", not "is required". This caused SO MANY problems back in the day, because the spec didn't clarify if you should or should not include HTML in that element - and there was no way of telling, when parsing a feed, if the author was in the "entity-encoded HTML" or "YOLO and just stick plain text in there" camp.
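To make the ambiguity concrete, a small sketch with Python's standard library and an invented item: after XML decoding, the same description string reads differently depending on which camp the reader assumes the author was in. The helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical item: is the author talking about the <b> tag, or using it?
ITEM = "<item><description>Use &lt;b&gt; for bold text</description></item>"
raw = ET.fromstring(ITEM).findtext("description")  # XML entities already decoded


class _TextOnly(HTMLParser):
    """Collects only the text a browser would show, dropping tags."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def visible_if_html(value: str) -> str:
    parser = _TextOnly()
    parser.feed(value)
    parser.close()
    return "".join(parser.parts)


def visible_if_plain(value: str) -> str:
    return value  # every character is shown literally


as_html = visible_if_html(raw)
as_plain = visible_if_plain(raw)
```

Nothing in the RSS 2.0 item tells the parser which of the two readings was intended.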

IIRC, Atom came about precisely because the RSS specifications didn't provide the level of detail needed for a spec to be truly interoperable.

talideon · 2 years ago

Atom is not a magnitude more complex or strict. It has _two_ places where it requires something even slightly onerous, and that's in the summary and content elements, where, shockingly, it allows you to specify if the content is XHTML, HTML, or text, and for HTML, it's just a matter of escaping the contents or putting it in CDATA. That's it.
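A rough sketch of what honouring that `type` attribute might look like on the consuming side, using Python's standard library. It covers only `text` and `html` (a missing `type` means `text` per RFC 4287); `xhtml` is left out for brevity, and the function name and sample entry are made up. HTML returned this way would still need sanitizing before display.

```python
import html
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical entry with one plain-text and one HTML construct.
ENTRY = """<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <summary type="text">1 &lt; 2 &amp; so on</summary>
  <content type="html"><![CDATA[<p>1 &lt; 2</p>]]></content>
</entry>"""


def content_as_html(element: ET.Element) -> str:
    """Turn an Atom text construct into an HTML string based on its type."""
    kind = element.get("type", "text")
    text = element.text or ""
    if kind == "text":
        return html.escape(text)  # plain text: escape before embedding in a page
    if kind == "html":
        return text  # already HTML markup once the XML layer is decoded
    raise NotImplementedError(f"type={kind!r} is not handled in this sketch")


entry = ET.fromstring(ENTRY)
summary_html = content_as_html(entry.find(ATOM + "summary"))
body_html = content_as_html(entry.find(ATOM + "content"))
```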

I don't know what you're doing that RSS 2.0 is somehow faster to parse than Atom. I've written parsers for both over the past twenty years with a negligible difference between the two besides the fact that the RSS feeds often need hacks. I've also written a whole bunch of blog and linkblog backends that produce Atom feeds, and have never had an issue with any. Let's look at the required elements of an entry: updated, title, id. Nothing remotely onerous there. In fact, it's purposely minimal, more minimal than RSS. And in RSS 2.0, title is a required element (because if something isn't explicitly noted as optional in the RSS 2.0 spec, it's assumed to be required).
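As a sketch of how small that checklist is, here is a presence check for the three elements named above, using Python's `xml.etree.ElementTree`. The helper name is made up, it only tests that the elements exist (so an empty title passes), and it ignores Atom's other rules, such as the conditions under which an author is needed.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
REQUIRED = ("id", "title", "updated")


def missing_required(entry: ET.Element) -> list:
    """Return the names of required entry elements that are absent."""
    return [name for name in REQUIRED if entry.find(ATOM + name) is None]


# Hypothetical entries: one complete (with an empty title), one incomplete.
complete = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    "<id>tag:example.com,2023:1</id><title/>"
    "<updated>2023-06-18T12:00:00Z</updated></entry>"
)
incomplete = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>Only a title</title></entry>'
)
```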

In my personal linklog, I use the title of the target page of the link as the title, because it's the sensible option. With tweets, you have half a point. Only half a point, because title is required, but Twitter also post-dates the early 2000s considerably. But here's the thing: 'title' is required in RSS and Atom, but there's nothing saying it can't be empty. I know, I've blown your mind!

And then there's JSONFeed, which, of course, can somehow gracefully cope with people dropping '"' in random parts of the file because people generate JSON files like that by hand, right?

Right?

Just like they write RSS and Atom feeds, right?

Right?

ttepasse · 2 years ago

> I have implemented an rss 2.0 parser faster than I could understand the atom specification. Atom can do stuff like encode html inline in the xml instead of as a CDATA string. In theory this sounds great, but it ends up in a big mess of complexity (e.g. a blogpost with handwritten invalid html).

The same thing can also happen in RSS feeds (and JSON Feeds): Entity-encoded HTML strings or CDATA HTML strings do not have any guarantee of well-formed-ness. The direct embedding of XHTML into Atom as namespaced elements just surfaces potential invalid markup higher up.
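A small sketch of that difference, assuming Python's `xml.etree.ElementTree` and an invented snippet of sloppy markup: when the markup is entity-encoded, the feed itself parses fine and the problem stays hidden inside the string; when the same markup is embedded as XHTML, the XML parser rejects the whole document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SLOPPY_HTML = "<p>unclosed <b>bold</p>"  # hypothetical hand-written markup

# Same markup, carried as an escaped string ...
escaped_feed = (
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<content type="html">' + escape(SLOPPY_HTML) + "</content>"
    "</entry>"
)
# ... and embedded directly as namespaced XHTML elements.
inline_feed = (
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<content type="xhtml"><div xmlns="http://www.w3.org/1999/xhtml">'
    + SLOPPY_HTML
    + "</div></content></entry>"
)


def parses(xml_text: str) -> bool:
    """Report whether the document is well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


escaped_ok = parses(escaped_feed)  # broken markup is just an opaque string here
inline_ok = parses(inline_feed)    # mismatched tags break the feed itself
```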

masklinn · 2 years ago

The TLDR is RSS is messy shit while Atom is well specified and works for anything you need outside of podcasts.

As long as you don’t need to consume feeds, just use atom.

The less TL is that RSS1 and RSS2 are basically two different branches of the original:

- Netscape released RSS 0.90 as an RDF application (RSS literally stood for RDF Site Summary)

- RSS 1.0 was an update / direct evolution of RSS 0.90 by a dedicated working group using final RDF 1.0 semantics (as RSS 0.90 had been based on an earlier working draft)

- RSS 1.1 an evolution of RSS 1.0 by unrelated people

This is called the RDF branch, for obvious reasons.

However a few months after RSS 0.90 Netscape also released RSS 0.91, which dropped RDF entirely, rebranded to “Really Simple Syndication”, and added some elements from Userland’s own syndication format.

This is the start of the “Harvard” (formerly “Userland”) branch, Userland / Dave Winer released his own variant of 0.91 (timeline with netscape has never been super clear to me), then went on to release 0.92 with an <enclosure> element, followed by 0.93 and 0.94. He then released RSS 2.0 to mark a bit of a compatibility break, as RSS 2.0 adds namespace support and removes some elements from his 0.9, and also to fuck with the RSS WG’s 1.0 release.

Because the Harvard branch was the first to support enclosures (embedding audio) and Userland had built support for that, it became the de-facto format for podcast feeds. Atom also supports enclosures, but I’m not sure any podcast client (or podcasting source) supports them.
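For comparison, a sketch of how the two formats carry an enclosure and how a client might read each, using Python's `xml.etree.ElementTree`. The episode URL and helper names are invented: RSS uses a dedicated `<enclosure>` element with `url`, `length` and `type`, while Atom uses a `<link rel="enclosure">` with `href`.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical podcast episode in both formats.
RSS_ITEM = (
    "<item><title>Ep 1</title>"
    '<enclosure url="https://example.com/ep1.mp3" length="12345" type="audio/mpeg"/>'
    "</item>"
)
ATOM_ENTRY = (
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>Ep 1</title>'
    '<link rel="enclosure" href="https://example.com/ep1.mp3" length="12345" type="audio/mpeg"/>'
    "</entry>"
)


def rss_enclosures(item: ET.Element) -> list:
    """Collect (url, media type) pairs from RSS <enclosure> elements."""
    return [(e.get("url"), e.get("type")) for e in item.findall("enclosure")]


def atom_enclosures(entry: ET.Element) -> list:
    """Collect (url, media type) pairs from Atom enclosure links."""
    return [
        (link.get("href"), link.get("type"))
        for link in entry.findall(ATOM + "link")
        if link.get("rel") == "enclosure"
    ]


from_rss = rss_enclosures(ET.fromstring(RSS_ITEM))
from_atom = atom_enclosures(ET.fromstring(ATOM_ENTRY))
```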

pferde · 2 years ago

While RSS is, as you say, quite messy, Atom has also brought lots of headaches to this poor feed aggregator developer over the years. Its specification is a lot tighter than RSS', but there is still enough wiggle room for feed generators to get creative and make you want to strangle somebody.

ttepasse · 2 years ago

I believe iTunes/Podcasts since its beginning also supported Atom podcasts until they deprecated it some years ago. But I believe they were some of the very few.

talideon · 2 years ago

> As long as you don’t need to consume feeds, just use atom.

Nah, you should just use Atom.

The only vaguely good reason to use RSS is spotty Atom support in podcast apps.

So, better formulation: only bother with RSS if you're a podcast.

gerikson · 2 years ago

Please read this comment that summarizes the situation:

https://news.ycombinator.com/item?id=36378817

benoliver999 · 2 years ago

I pick one at random, paste it into my feed reader, the feeds appear, hey presto!

lapcat · 2 years ago

I prefer Atom to RSS.

1) Atom has separate <updated> and <published> fields, while RSS just has <pubDate>. Moreover, RSS wants you to add a redundant day of the week in the date, i.e., "Sat, 07 Sep 2002 0:00:01 GMT", which is dumb.

2) Atom allows you to use <content type="html"><![CDATA[]]></content> where you can just stick in HTML, whereas RSS <description> just specifies "entity-encoded HTML is allowed".

3) RSS has redundant <guid isPermaLink="true"> vs. <link>. Which one is a feed reader supposed to use?

Atom is slightly better than RSS, but didn't bring enough improvements to kill RSS.
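On point 3, one possible reader policy, sketched with Python's `xml.etree.ElementTree`: treat `<guid>` as the identity for de-duplication and `<link>` as the URL to open, falling back from one to the other. This is an assumption about reasonable behaviour, not something the RSS spec mandates; the only spec detail relied on is that `isPermaLink` defaults to true when omitted.

```python
import xml.etree.ElementTree as ET


def item_identity(item: ET.Element) -> tuple:
    """Return (stable_id, url) for an RSS 2.0 item under a hypothetical policy."""
    guid = item.find("guid")
    link = item.findtext("link")
    if guid is not None and guid.text:
        # RSS 2.0: isPermaLink is optional and defaults to "true".
        is_permalink = guid.get("isPermaLink", "true") == "true"
        url = link or (guid.text if is_permalink else None)
        return guid.text, url
    return link, link  # no guid: the link doubles as the identity


both = ET.fromstring(
    '<item><guid isPermaLink="false">tag:example.com,2023:1</guid>'
    "<link>https://example.com/1</link></item>"
)
guid_only = ET.fromstring("<item><guid>https://example.com/2</guid></item>")
link_only = ET.fromstring("<item><link>https://example.com/3</link></item>")
```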

justinator · 2 years ago

But enough where my app supports both and will support both for the remainder of time!

Hamuko · 2 years ago

As a consumer, I don't really have any preference. Both work fine with my software. My personal blog uses Atom only because my static site generator had an Atom plugin.

The rss pubDate format is the date-time spec from RFC 822 for email, so it is very widely supported. CDATA is supported in rss too; it’s in the xml spec, so it would be a bit redundant for rss to explicitly support it. You’ll find it in description elements very, very often. I never understood why a link would not be a guid, but the difference seems clear enough: use guid if you need a guaranteed unique ID and use link if you need a url to the list.
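Both date styles can be read with Python's standard library, which is a fair illustration of the "widely supported" point. This is a sketch with hypothetical helper names: the RSS side uses the email date parser, and the Atom side uses `datetime.fromisoformat` after rewriting a trailing `Z`, which older Python versions do not accept. It does not cover every form RFC 3339 allows.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_rss_date(value: str) -> datetime:
    """Parse an RFC 822 style date as used in RSS <pubDate>."""
    return parsedate_to_datetime(value)


def parse_atom_date(value: str) -> datetime:
    """Parse a common RFC 3339 timestamp as used in Atom <updated>."""
    if value.endswith(("Z", "z")):
        value = value[:-1] + "+00:00"  # fromisoformat rejects "Z" before Python 3.11
    return datetime.fromisoformat(value)


rss_dt = parse_rss_date("Sat, 07 Sep 2002 00:00:01 GMT")
atom_dt = parse_atom_date("2002-09-07T00:00:01Z")
```

The weekday in the RSS form carries no extra information: it can be recomputed from the rest of the date (`rss_dt.weekday()` gives Saturday here).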

xefer · 2 years ago

There is the Atom Syndication Format and there is the Atom Publishing Protocol:

  https://datatracker.ietf.org/doc/html/rfc4287
  https://datatracker.ietf.org/doc/html/rfc5023

These specs and the discussion about them at the time really are from a different era of the web. The Publishing Protocol fully embraced REST, which was also white hot then. There was a real feeling that a good format and a standardized way to consume and interact with the resources would allow for easier sharing of not just blog posts but other data as well.

As intense as the discussion was around the development of RFC-5023, it was basically ignored from the moment it was released and even the main spec author declared it basically dead not very long afterward:

https://web.archive.org/web/20090421042741/http://bitworking...

Needless to say, the web took an entirely different direction and while these specs exist, there isn't much interest in them any longer.

There are also some extensions for richer data models and working with changing feeds:

RFC 4685: Atom Threading Extensions

https://www.rfc-editor.org/rfc/rfc4685.html

RFC 4946: Atom License Extension

https://www.rfc-editor.org/rfc/rfc4946.html

RFC 5005: Feed Paging and Archiving

https://www.rfc-editor.org/rfc/rfc5005.html

RFC 6721: The Atom "deleted-entry" Element

https://www.rfc-editor.org/rfc/rfc6721.html

There was more stuff thought of, as far as I recall, and I bookmarked a lot of it. But on Delicious. Somewhere in some backup there must be my archive.

msla · 2 years ago

https://datatracker.ietf.org/doc/html/rfc4287

https://datatracker.ietf.org/doc/html/rfc5023

robobro · 2 years ago

Whenever I need to provide a feed, I always choose Atom because (a) the spec is better and (b) anything that can handle RSS will generally also handle Atom.

zokier · 2 years ago

Does ActivityPub these days replace the traditional RSS/Atom feeds? Feels like it would be the natural successor. Is there anything missing besides people publishing and consuming?

miki123211 · 2 years ago

ActivityPub, despite using JSON rather than XML, is a much more complex protocol with a lot more moving parts. You can technically write an RSS feed by hand, drop it on a virtual hosting provider via FTP, and you have a feed. On the server side, all you need is static file hosting, which is ubiquitous, available for free and amenable to things like CDNs. On the client side, all you need is a single device, which may be behind NAT, and is capable of occasional network connections to pull the updated feeds.
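To show how little is involved on the publishing side, here is a sketch that builds a minimal RSS 2.0 document as a string with Python's standard library; saving it to a file and uploading it is all the "server" work there is. The `build_rss` helper, the sample post and the URLs are invented for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title: str, link: str, description: str, items: list) -> str:
    """Build a minimal RSS 2.0 document (hypothetical helper)."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        # usegmt=True needs a datetime whose tzinfo is timezone.utc
        ET.SubElement(item, "pubDate").text = format_datetime(entry["published"], usegmt=True)
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_rss(
    "Tom & Jerry's blog",
    "https://example.com/",
    "A hypothetical static feed",
    [{
        "title": "First post",
        "link": "https://example.com/first",
        "published": datetime(2002, 9, 7, 0, 0, 1, tzinfo=timezone.utc),
    }],
)
# Writing it out is one more line, e.g.:
# pathlib.Path("feed.xml").write_text(feed_xml, encoding="utf-8")
```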

With ActivityPub, you're supposed to have an account at an instance and receive content through that instance. This means that the server needs to keep track of followers and automatically broadcast all new content to all of them, which requires a database, some kind of content-publishing API, and probably a job queue and Redis to boot. On the client side, you need a box that is online 24/7 and can be connected to, so that you can receive your content. You get much faster delivery, but at much higher operational cost and with many more scalability issues. Hosting static content at scale is a solved problem, and you can reuse all that existing knowledge for RSS. Sending AP activities at scale is technically solvable, but the infrastructure just isn't there yet.

kevincox · 2 years ago

IMHO no. I thought about this as I run a feed-reader and it would be cool to support ActivityPub as a feed protocol.

The main issue is that subscribing shows up on follower lists. Maybe for individual users this is fine but as I ran a service I didn't want to do this. I ended up with a number of reasons why ActivityPub push wouldn't work well.

1. I didn't want to appear to be advertising the service with a generic account subscribed to many feeds.

2. I didn't want a generic account to provide access to "followers only" toots to unintended users. To properly allow access approval I would need to put some subscriber info into the account. It would also be important to make sure that this can't be used as a form of spam (for example if I allowed them to put whatever name/message they wanted).

3. I didn't want to reveal who was subscribed to what.

4. I didn't want to have dozens of different accounts subscribed to popular feeds.

5. If the user also wants to comment on a post themselves they will need a separate account.

You can still poll a user's outbox like any other feed, but now you are back to an equivalent to Atom/RSS with no WebSub support. (I mean you could use WebSub, it works for any URL but no one does, and why would you when you already have ActivityPub for push?). So it seems that the anonymity of the older feed formats can be useful in some scenarios.

So in the end if I was just going to poll as any other feed format, and most services that support ActivityPub also have other feeds there was really no point to doing this. Feature requests welcome if there is a use case that I missed.

cxr · 2 years ago

> So it seems that the anonymity of the older feed formats can be useful in some scenarios.

Huge understatement; it's more than just "some". None of the blogs in my feed reader are written by people that I have a public (reified) follower/followee connection with over social media. Nor do I have that kind of relationship with the authors of the books I check out from the library, for that matter.

matsimitsu · 2 years ago

I'm in the same boat, and came to the same conclusion.

One more point I'd add:

6. Not every ActivityPub-enabled service allows for federation. (Some of the more popular ones block all federation unless you get allowlisted.) So you're stuck polling the outbox regardless.

rakoo · 2 years ago

> Is there anything missing

Simplicity. Atom/RSS can perfectly be a static file (it doesn't even need internet connectivity, an Atom file can be passed around on a usb key with 100% of functionality). ActivityPub requires a live server with computation. Think about the step it requires for every website and every account and every service to have an atom feed vs an activitypub actor.

> Does ActivityPub these days replace the traditional RSS/Atom feeds?

No. In fact, Mastodon has RSS feed support.

Downloading an XML file is dead simple. ActivityPub is vastly more complex.

unshavedyak · 2 years ago

I'm writing some ActivityPub stuff, and wanted to make sure to get RSS in, done right, etc. However i never use RSS so i have little insight to good/bad/irrelevant RSS impls.

Any wants or must-haves to an RSS implementation over a Fediverse instance?

My rough plan is to just include RSS in any places i expose JSON endpoints for data. Though i'm a bit undecided if there's any value in user activity stuff, like comments on an article, etc.

goffi · 2 years ago

XMPP does use Atom as its (micro)blogging format (XEP-0277), and making followers/following lists (subscribers/subscribed in XMPP terms) public is opt-in with XEP-0465.

Note: I'm very involved in XMPP, and the author of the latter XEP.

Edit: forgot to mention that it's also available to ActivityPub thanks to the XMPP <=> AP gateway (that I've authored too)

olzhas · 2 years ago

Initially, my thought was "wow, that thing was invented in the 80s".

jzawodn · 2 years ago

Sigh. That happens to me too.

ilyt · 2 years ago

bullen · 2 years ago

For my blog tool I made feeds in 3 flavours; RSS, Atom and my own:

http://sprout.rupy.se/feed?rss

http://sprout.rupy.se/feed?atom

http://sprout.rupy.se/feed

Now that I look at them they are equally bad.

That said ActivityPub with JSON does not strike me as better.

Seriously considering getting into the fray...

If you want to expand even more, there are even more options:

• RSS 1.0 – the RDF/XML serialisation.

• JSON Feed

• h-feed/hAtom – embedding the “feed” as Microformats markup inside the HTML.

• schema.org/Blog – embedding the “feed” inside the HTML, either with RDFa or with JSON-LD or with Microdata.

• If you want to annoy the most people at once, there is a great solution: The data model from RSS 1.0 is of course RDF-based. The modern serialisation of RDF is JSON-LD – simply use the RSS 1.0 vocabulary in a “JSON-LD-Feed”.

It's interesting you made your own feed format, but... why? Does anything actually support it?

May be worth making a jsonfeed while you're at it, if you just want to make more options.

I wanted something simpler and was just playing around.

JSON I use in my distributed DB (http://root.rupy.se), it's less verbose but XML to me is also a development tool that gains with human readability.

The bottleneck is almost never bandwidth on small solutions.

What I'm considering now is making my own ActivityPub, adding Reddit like features and Email.

I would like to make the format pipe (|) separated instead of XML/JSON...

The social protocol to end all social protocols. XD

Now remembered I already did JSON: http://sprout.rupy.se/feed?json