Recently I have ventured into technical writing. At the company I work for, documentation is scattered around ~4 different tools.
1. Google Docs
2. Confluence
3. GitHub (READMEs)
4. Slack
Each of these serves a purpose, of course: Google Docs is very collaborative, Confluence is our source of truth, GitHub is mostly for engineering, and Slack usually has threads you can find if you run into certain issues.
I am not suggesting we should put all of this into a single tool, but I am wondering if there is a methodology for organizing documentation. I am aware of Diataxis, and want us to use it for certain services/products. What I am looking for in this Ask HN post, though, is an overarching methodology for organizing all documentation.
I prefer two types of documentation:
1. Executable documentation - tests, asserts, even things like Jupyter notebooks that can be tested and executed
2. Timestamped documentation - documentation with a clear date indicating when it was valid, so the reader has the expectation: "This was true at date X, but may not be true now". This includes detailed pull requests and git commit messages.
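A minimal sketch of what "executable documentation" can look like in Python: the usage examples live in the docstring and run under `python -m doctest`, so they fail loudly when the behavior drifts. (The `slugify` function here is a made-up example, not anything from the original post.)

```python
def slugify(title):
    """Convert a page title into a URL slug.

    These examples double as executable documentation: run
    `python -m doctest thisfile.py` and each one is checked
    against the real behavior.

    >>> slugify("Hello World")
    'hello-world'
    >>> slugify("  Mixed  CASE  input ")
    'mixed-case-input'
    """
    # Lowercase, split on any whitespace, rejoin with hyphens.
    return "-".join(title.lower().split())


if __name__ == "__main__":
    import doctest
    doctest.testmod()
```

The same idea scales up to notebooks executed in CI: if the examples stop matching reality, the build tells you.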
https://softwaredoug.com/blog/2023/10/13/fight-undead-docume...
The documentation lives in the same git repository as the code that it documents.
Inaccurate or out-of-date documentation is treated as a severe bug: issues are filed, and the documentation gets fixed.
This is crucial, because if zombie documentation is allowed to persist it causes people to lose trust in the documentation, which means they won't refer to it and they won't contribute to it.
Once the documentation is in a trustworthy state, keeping it that way gets a LOT easier. It becomes part of the code review process - a PR won't be accepted unless it updates the relevant documentation that accompanies the code change.
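One way to make the "PR must update the docs" rule mechanical is a small CI gate on the diff. A hedged sketch, not the commenter's actual setup: the `src/` and `docs/` layout and the `origin/main` base are assumptions, so adapt the prefixes to your repo.

```python
import subprocess

# Assumed layout: code under src/, documentation under docs/ or README.md.
CODE_PREFIXES = ("src/",)
DOC_PREFIXES = ("docs/", "README.md")


def changed_files(base="origin/main"):
    """List files changed between the PR branch and its base."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout
    return [f for f in out.splitlines() if f]


def docs_updated(files):
    """True if the change set touches no code, or touches docs as well."""
    code_changed = any(f.startswith(CODE_PREFIXES) for f in files)
    docs_changed = any(f.startswith(DOC_PREFIXES) for f in files)
    return (not code_changed) or docs_changed

# In CI: fail the build when docs_updated(changed_files()) is False.
```

A human reviewer still judges whether the doc change is *sufficient*; the gate only catches PRs that forgot docs entirely.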
I've been using this policy for my own projects for quite a few years now, and the result is that I'm proud of the state of my documentation for almost the first time in my entire career.
I think it basically came down to incentives. Those who were developing the code every day had all the relevant details in their heads. So ensuring that docs matched the code was basically a distraction from their primary goals.
It makes me a little angry because it screws over users of the software. The docs don't even mention that they might be out of date.
Why does this work with Git but not without it? Being in Git doesn't seem to be relevant to raising documentation bugs as bugs.
1) LLM scans code updates and approximates output
2) Developer reviews and updates if needed
3) Feed the tagged code and updated output back to the LLM
4) ?
5) Profit
My response is almost "too bad". :) Keeping things documented is part of the work, and if it isn't done, then the work is not completed.
Your suggestions of (1) and (2) are great suggestions to have as components to the documentation system. Notebooks are really fantastic for this.
Absolutely. Part of reviewing is checking that the eg. README is up to date and works as expected.
If I need to do a thing and I don't know how to do it, I search Confluence for the most obvious sequence of words I can think of that is vaguely like my problem. I do this maybe 3 to 5 times.
If I find something, I open it in edit mode and start reading through it. The instant I hit upon anything not obvious to me, I add whatever obvious thing is missing.
If I don't find anything in there, I create a page in the Diataxis format (usually a HOWTO) and write it myself. I use short sentences, plenty of screenshots, and plenty of code blocks, to make it as copy-and-paste friendly as possible.
I never ask just how basic this thing actually is - most of my most-viewed articles in any organization turn out to be the most basic ones. "How to map a network drive in Windows." "How to set up your Git credentials." These are very often much more popular than "How to build a custom VM image with QEMU and Ansible." I take my own confusion as an existence proof that the topic is obscure enough to confuse one generally competent but non-expert person, and take it on faith that most people in my org are not experts in most things.
I trust other people to be able to look at the timestamps and the history of the docs and to figure out whether what they're reading is too outdated to be useful. I pretend, despite evidence to the contrary, that other people will follow roughly the same algorithm as me, and read pages and make updates on the fly as they work. If they don't, well, that's them ceding their cultural power, which they probably don't want anyway (and that is entirely fair).
For example, a development setup at a company or group may require a certain setup that depends upon the operating system and IDE/editor an individual developer uses. This type of information is perfect to put on wikis. It is effectively "global" information, whereas repositories contain local information. Putting information like this into a repo can increase the barrier to keeping the documentation updated and also requires source-code control access to view, which not everyone has or should have to view documentation.
In my opinion, a combination of wiki documentation plus documentation within the repository are very good. In addition to that, I often use Google's office suite or Microsoft 365 for working documents, that is documents that need to just be written, get collaborative feedback, shared between external and internal people, etc. Then, once they start to solidify and start to get more atomic updates, it makes sense to move them to the wiki or a Markdown document in a repository.
Then each team has their own wiki page with a list of "things we care about", each of which links to a separate page written from a template of headings, to get minimal (and QUICK TO CREATE) documentation. The template includes a dozen or fewer items like: where is the source code, who are the key stakeholders, how do you build it, how and where is it deployed, how are backups done and who is responsible for them, what are the key high-level inputs and outputs, and what else is essential to know, if anything. It is OK to put "N/A" as an answer (if that is true), but all sections are to be completed before it is released.
Those things are separate from the code which is why they are not documented inside the code. They can change even when the code does not, and might sometimes be maintained by non-coders. Code doc comments are more about saying why something was done, in the code, the way it was done.
Then have new people start with the wiki. It should include a section on what to tell new people. This is a potential way to learn, and the new person can have their first task be to update portions of it as they work on new things and work with existing team members. Every attempt to change the culture should be included in the wiki (for example: "we have a rule: no new technical debt. How will we preserve that rule going forward and not just forget like we did in the past? It goes in the wiki and we review it periodically and evaluate how we are doing."). Existing team members should subscribe to change emails so they can verify non-trivial changes!
If there is a QA function, they periodically evaluate (maybe just ask the team) how well the team does at maintaining the wiki pages and following the code review checklist, and reports that to management.
If you don't have people who can or are willing to do that, might be good to ask why, and we all start by looking at ourselves.
If you destroy wikis without an alternative, they'll be replaced with one of the worse options.
But wikis are useful for things that are shared, or that are not tied to a particular product, or don't exist in the product yet. E.g.:
- "What's the temporary workaround for this bug?"
- "How do I get started as a new employee?"
- "What information do we need on customer requests?"
- "What's the team process for handling escalations?"
- "Here's the preliminary design for this new feature"
You still need someone to update those docs, but it's nice if, eg, the manager (or product/project manager) who isn't in Git all day can do it easily instead of asking a dev to do it.
When there's a very limited set of curators, a specific topic, and a self-incentive to keep things up to date, wikis can be a great choice... mostly as a knowledge base or internal glossary/dictionary.
For true "documentation" (if you don't take the wiki as a whole), and in the corporate world, I agree with the issues against using wikis.
Especially if making changes to the overall architecture or introducing a new dependency! The team shouldn't let the PR merge until the docs are sufficient.
It bakes into your existing workflows, with a direct line of accountability to both the code reviewer and the dev.
The editing software can be anything really.
Wikis do work for very large user bases, say documenting Stardew Valley mechanics. I think for very large teams, say 50, they start to make sense, especially if you're spending more time in meetings updating everyone than doing work.
Dokuwiki for example has a sane, plain text format. It can be extended relatively easily compared to Confluence (I have tried both).
Unlike certain other wikis it has access control and unlike Confluence you can edit a single paragraph or section at a time.
And finally, it is actually a wiki: "wiki" originally meant "quick", I believe, and calling Confluence a wiki in that context is somewhat ironic.
I spun up a search engine that covered as many of those systems as possible (just SQLite FTS with Datasette, cron tasks that indexed various things, and a simple custom search UI) and it helped a lot, because people at least had a fighting chance of finding stuff.
I believe there are off-the-shelf solutions for this kind of thing now, though I don't have experience with any of them myself.
I've since recreated aspects of the search system I built there as https://github.com/dogsheep/beta - you can see a working example of that system on the Datasette site here: https://datasette.io/-/beta?q=geojson
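The core of a cross-tool search index like that is quite small. A minimal sketch using SQLite's FTS5 extension (included in standard Python builds; the source names and rows here are invented, and in practice the cron indexers would do the inserting):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One full-text table covering every source system, tagged by origin.
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(source, title, body)")
conn.executemany(
    "INSERT INTO docs VALUES (?, ?, ?)",
    [
        ("confluence", "Escalation process", "How the team handles customer escalations"),
        ("github", "README", "Build instructions and deployment notes"),
        ("slack", "Thread: VPN timeouts", "Temporary workaround for the VPN timeout bug"),
    ],
)


def search(query):
    """Return (source, title) pairs, best match first."""
    return conn.execute(
        "SELECT source, title FROM docs WHERE docs MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
```

A thin web UI over `search()` is essentially what tools like Datasette give you for free.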
Documents in a file system, Confluence, a Wiki, docs in project repositories and a special documentation repo.
We have decided that the best place for the "single source of truth" for that is right next to the appropriate code in git, with the various build/deployment scripts ensuring that copies (explicitly unmaintained, unmaintainable, read-only) of that get packaged with the actual systems, with the packaged libraries, linked in their web backends, etc. We don't care much about the format of the document, whatever fits the particular needs best - e.g. sometimes it's markdown, sometimes it's Excel.
The key factor here is to ensure that (a) there's a single source of truth; (b) the same atomic commit/pull request/whatever can alter both the system and the documentation at the same time; (c) every artifact carries the appropriate version of the documentation, so instead of going to some internal site or document which might have a different, newer version, you know what is supposed to be true for the release that is actually on this particular server.
One process that ends up being really valuable for documentation purposes is our "Architecture Review Documents". This is a standard document that team leads fill out before starting work on a new Saga/Epic/Feature/whatever. It includes the scope and business value of a new feature or large block of work, high level technical architecture of implementation, the impact on existing database schemas and service APIs, etc. This document is presented in a meeting with technical leadership in our organization who deep dive on the topic and explore potential pitfalls in the plan.
The document and recording of that meeting live on forever, and this information is very useful when getting acquainted with a certain part of our product/codebase. You are able to read and hear clearly the intention of a certain service or module, and you can identify several relevant points of contact to ask questions to.