Readit News
_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
_mattt · 3 days ago
Not yet, but I plan to open source it soon. Just gotta tidy up a little bit, you know?
_mattt · 10 hours ago
Update — Source is now available here: https://github.com/NSHipster/sosumi.ai
_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
_mattt · 3 days ago
I agree, it'd be great if Apple provided accessible documentation in the first place. Time was, Apple published self-contained docsets that you could download and read offline.

Apple's ToS pretty explicitly forbid the kind of automation required to download everything. But even if someone did that, it'd only be a snapshot in time. And a lot can change between OS releases.

As for the hosted web app, I wanted to provide this as a public service. I plan to open source it, so anyone can self-host instead, if they're inclined.

_mattt · 10 hours ago
Update — Source is now available here: https://github.com/NSHipster/sosumi.ai
_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
_mattt · 3 days ago
Thanks for pointing that out. That’s most likely a mistake in how I’m translating into Markdown. I’ll look into this.
_mattt · 2 days ago
Following up — I just pushed a fix for this. This latest version significantly improves how references like protocol conformances and default implementations are rendered.
_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
miki123211 · 3 days ago
Just saying, sites like these are also pretty great for accessibility, screen reader users in particular.

I think this one would be slightly better if it rendered that Markdown as simple HTML if accessed through a real browser, but I can imagine even this version being pretty useful.

I think it could also make the "Small web" crowd pretty happy too.

_mattt · 2 days ago
Amen to that. It's funny how virtues like accessibility and good API design matter just as much as ever in this new era. Good ideas don't go out of style.
_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
saagarjha · 3 days ago
Is it possible to download an archive of the data so I can run searches against it locally (without AI)?
_mattt · 2 days ago
Hi Saagar! One could indeed use this method to export this information en masse. However, that would require the kind of automated access that the site's ToS explicitly forbids.

For the Swift standard library and other open-source frameworks, you could probably extract this from documentation comments using DocC, Jazzy, or the like.
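
As a rough illustration of what extracting from documentation comments involves, here is a minimal Python sketch that pairs each `///` block in a Swift source string with the line that follows it. This is not how DocC or Jazzy work internally; the function name `extract_doc_comments` and the "next non-blank line is the declaration" rule are assumptions made for the example.

```python
import re

# Matches a Swift documentation-comment line and captures its text.
DOC_LINE = re.compile(r"^\s*///\s?(.*)$")


def extract_doc_comments(swift_source: str) -> list[tuple[str, str]]:
    """Return (declaration, markdown) pairs for each `///` block.

    Simplification: the first non-blank line after a doc block is treated
    as the declaration, so attributes such as `@inlinable` on their own
    line would be misreported. A real tool would parse the source.
    """
    results = []
    pending = []  # doc-comment lines waiting for a declaration
    for line in swift_source.splitlines():
        match = DOC_LINE.match(line)
        if match:
            pending.append(match.group(1))
        elif pending and line.strip():
            declaration = line.strip().rstrip("{").strip()
            results.append((declaration, "\n".join(pending)))
            pending = []
        elif not line.strip():
            pending = []  # a blank line detaches the comment
    return results
```

The comment text comes out as Markdown already, which is why this route is attractive for open-source frameworks. It only works where the source is available.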

_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
danielfalbo · 3 days ago
How do you reliably convert HTML to MD for any page on the internet? I remember struggling with this in the past.

How hard would it be to build an MCP that's basically a proxy for web search except it always tries to build the markdown version of the web pages instead of passing HTML?

Basically Sosumi.ai, but instead of working only for Apple docs, it works for any web page (including every doc on the internet).

_mattt · 2 days ago
I think HTML -> Markdown is a bit of a red herring.

In many cases, a Markdown distillation of HTML can improve the signal-to-noise ratio — especially for sites mired in <div> tag soup (intentionally or not). But that's an optimization for token efficiency; LLMs can usually figure things out.

The motivation behind Sosumi is better understood as a matter of accessibility. The way AI assistants typically fetch content from the web precluded them from getting any useful information from developer.apple.com.

You could start to solve the generalized problem with an MCP that 1) used a headless browser to access content for sites that require JS, and 2) used sampling (i.e. having a tool use the host LLM) to summarize / distill HTML.
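
To make the signal-to-noise point concrete, here is a minimal sketch of the distillation step using only Python's standard `html.parser`. It is a deterministic stand-in for the LLM sampling idea above, not an implementation of it, and it does not cover the headless-browser step. The class name `MarkdownDistiller`, the helper `html_to_markdown`, and the choice of which tags to keep are all assumptions for illustration.

```python
from html.parser import HTMLParser


class MarkdownDistiller(HTMLParser):
    """Keep headings, paragraphs, list items, and inline code; drop layout tags."""

    HEADINGS = {"h1": "# ", "h2": "## ", "h3": "### "}
    BLOCKS = {"p", "li", "h1", "h2", "h3"}
    SKIPPED = {"script", "style", "nav"}

    def __init__(self):
        super().__init__()
        self.blocks = []     # finished Markdown blocks
        self.current = []    # text fragments of the block being built
        self.prefix = ""     # Markdown prefix for the current block
        self.skip_depth = 0  # > 0 while inside a SKIPPED element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1
        elif tag in self.BLOCKS:
            self._flush()
            self.prefix = self.HEADINGS.get(tag, "- " if tag == "li" else "")
        elif tag == "code":
            self.current.append("`")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif tag in self.BLOCKS:
            self._flush()
        elif tag == "code":
            self.current.append("`")

    def handle_data(self, data):
        if self.skip_depth == 0:
            self.current.append(data)

    def _flush(self):
        # Collapse runs of whitespace and emit the block if it has any text.
        text = " ".join("".join(self.current).split())
        if text:
            self.blocks.append(self.prefix + text)
        self.current = []
        self.prefix = ""


def html_to_markdown(html: str) -> str:
    parser = MarkdownDistiller()
    parser.feed(html)
    parser.close()
    parser._flush()  # emit any trailing text outside a block element
    return "\n\n".join(parser.blocks)
```

Nested `<div>` wrappers contribute nothing to the output, so only the heading and paragraph text survive. None of this helps when the server returns an empty shell that needs JavaScript to fill in, which is the accessibility problem described above.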

_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
Someone · 3 days ago
Hm, I would have extracted the markdown from the Swift source code. That’s what Apple uses to generate their pages, using https://www.swift.org/documentation/docc/.

For example, AFAIK, https://github.com/swiftlang/swift/blob/main/stdlib/public/c... is used to generate https://developer.apple.com/documentation/swift/array.

_mattt · 3 days ago
This is only possible for some of Apple’s open source Swift code, including the Swift standard library. This is not the case for hundreds of other SDK frameworks, such as SwiftUI.

Even for those open source projects, there is still some value added in the generated documentation that isn’t directly available from documentation comments, such as type members and protocol conformances (though an LLM could certainly suss that out with the right context).

_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
qazxcvbnmlp · 3 days ago
Great promise; I sometimes need to reference docs to build context.

I took a quick look at the examples you posted. For example:

'''init?(exactly: Float80)'''

the tool converted it to

'''- [initexactly-63925](/documentation/Swift/Double/init(exactly:)-63925)'''

Given the tool's goal, I would be worried that it dropped the verbatim function signature. Claude still figured it out, but for more obscure stuff that could be an issue.

_mattt · 3 days ago
Thanks for pointing that out. That’s most likely a mistake in how I’m translating into Markdown. I’ll look into this.
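
For illustration, one way to keep the verbatim signature as the link text is sketched below in Python. This is not the fix that shipped in Sosumi; `render_reference` is a hypothetical helper, and wrapping the destination in angle brackets is just one option CommonMark allows for destinations that contain parentheses.

```python
def render_reference(signature: str, path: str) -> str:
    """Render a Markdown list item whose link text is the verbatim signature."""
    # Escape only the characters that would end the link text early.
    label = (
        signature.replace("\\", "\\\\")
        .replace("[", "\\[")
        .replace("]", "\\]")
    )
    # Angle brackets let the destination contain parentheses,
    # as in `init(exactly:)`, without further escaping.
    return f"- [{label}](<{path}>)"


print(render_reference(
    "init?(exactly: Float80)",
    "/documentation/Swift/Double/init(exactly:)-63925",
))
```

With the signature in the link text, a reader (or a model) sees the parameter type without having to follow the link.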
_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
jcoletti · 3 days ago
This is awesome and timely for me...going to give it a whirl. Thanks for building. Also, there should totally be an easter egg where clicking something somewhere plays the sound!
_mattt · 3 days ago
Great idea! I just added that. Try clicking the icon in the header.
_mattt commented on Show HN: Sosumi.ai – Convert Apple Developer docs to AI-readable Markdown   sosumi.ai/... · Posted by u/_mattt
ChrisMarshallNY · 3 days ago
I don't even bother trying to render docc catalogs into JS. It's a royal pain that breaks easily.

If GitHub could support .docc files, that would be great. Otherwise, I still use Jazzy Docs.

_mattt · 3 days ago
Once upon a time, I built a project called `swift-doc`, which eventually got Sherlocked. I think what I was most upset about was their decision to call their thing "DocC". Like, adding redundant consonants to avoid name collisions is my shtick.

Long live Jazzy.

u/_mattt

Karma: 54 · Cake day: June 25, 2017
About
https://github.com/mattt