I learned about UML in a course, but have never used it or seen it in practice as a junior developer. Sometimes I'll see a flowchart, but that's not too common. Is this the same in other companies?
With some of our code, the designers are either gone or sometimes unavailable, and it can be tricky to see how all the pieces fit together. A good IDE makes the job a little easier (finding references, ctrl+click to go to declarations, etc.), but it'd be nice to have a diagram or something for visualization.
So, is it a good idea to try documenting the code design through some sort of visualization? If so, using UML or something else? I suppose there might be tools for doing this automatically in some languages? Otherwise I think if it was valuable enough, it could be something we make sure to review and update along with code changes.
Any thoughts would be appreciated!
I don't know if an automated visualization system is possible, but either way you'll have to understand the whole thing first. Pen and paper was the most expedient solution for me at the time.
https://www.youtube.com/watch?v=_nTpsv9PNqo
I remember a whole bunch of light bulb moments when I showed other developers the "big picture". It's an awesome technique when you're forced to work on spaghetti!
Could do similar with bash text mangling tools, but language native would probably be best.
I dunno, just a thought in an EOD fog. I don’t own a printer these days, so I guess I’d need an alternative.
Tell computer to observe self and report back.
https://plantuml.com/
I'd say the big problem in visualizing big systems is that you can't usefully do it in one graph. For instance, I worked on a system that had 2000+ database tables. A diagram of that which shows everything is going to take up a long wall. (This can be useful, but it is a big commitment.)
A useful tool is going to let you make meaningful diagrams that show the subset of entities that are part of a story. I went to an art show of Mark Lombardi's works
https://en.wikipedia.org/wiki/Mark_Lombardi
who (before he was murdered) drew elaborate diagrams of conspiracies. One thing they showed was drafts that he made in the process of creating his visualizations, and he would sometimes make 40 or more of them. He would start out with a "hairball" that was disorganized and gradually figure out how to lay the diagram out in a way that made the meaning obvious.
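To make the "subset of entities that are part of a story" idea concrete, here is a minimal sketch that cuts a small neighbourhood out of a larger graph. The table names and the `story_subgraph` helper are hypothetical, made up for illustration. It assumes you already have the relationships as a plain list of edges.

```python
from collections import deque

def story_subgraph(edges, start, max_hops=1):
    """Return the edges whose endpoints are both within max_hops of start."""
    # Build an undirected adjacency map so we follow links in both directions.
    neighbours = {}
    for src, dst in edges:
        neighbours.setdefault(src, set()).add(dst)
        neighbours.setdefault(dst, set()).add(src)

    # Breadth-first search, recording the hop distance of each node reached.
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)

    return [(s, d) for s, d in edges if s in seen and d in seen]

# Hypothetical foreign-key relationships, not taken from any real schema.
EDGES = [
    ("orders", "customers"),
    ("orders", "order_items"),
    ("order_items", "products"),
    ("products", "suppliers"),
    ("audit_log", "users"),
]

for src, dst in story_subgraph(EDGES, "orders", max_hops=1):
    print(f"{src} -> {dst}")
```

Raising `max_hops` widens the story. The result can then be handed to whatever drawing tool you prefer, instead of the full 2000-table hairball.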
I was blown away by the idea behind C4 when I first saw the presentation. I think what's missing is the tooling. I use C4 PlantUML to document my architecture designs. What I'd really love, though, is a Google Maps style interface where I can zoom in or out of the current level I'm at. That'd be a game changer. Then you can really describe and understand the system.
The original presentation, in fact, used the Google Maps interface to illustrate the idea: you're first looking at a continent, then you zoom in to the city and finally the street level.
If you are using C4 right now, how do you compose the various levels of architecture and navigate around them?
[0] https://news.ycombinator.com/item?id=31370268
That’s because it’s flat
Imo this changes the game
https://noda.io/
Imagine being able to make a fully 3D graph inside a space the size of a large building
At the moment, codeatlas is only a static gallery, but we're currently about 1-2 weekends away from releasing a GitHub Action that deploys this diagram on GitHub Pages for your own repos. If you're interested, feel free to watch this repo: https://github.com/codeatlasHQ/codebase-visualizer-action
Here's my take: https://github.com/johntellsall/shotglass#demo-flask-a-small...
Complexity is an interesting measure too. I'm currently not sure how we'd model this, but it could definitely help codeowners understand which parts of their codebase are currently difficult for people to wrap their heads around, or whether there are any complex parts with only a single contributor, without whom the project would be left with a serious knowledge gap.
Once this can run as part of a CI pipeline and thus lives directly in the repo, I'd also love to add an overlay with the output of the testsuite to see which parts of the codebase aren't covered by tests! Or the output of a profiler, to see which functions are actually called the most.
For some reason, tools related to program understanding are not widely adopted by IDEs.
A while ago I used a tool for Java that was based on the Object-Oriented Metrics[0] book by Michele Lanza. But, that tool was discontinued and it doesn't exist anymore[1].
If you are interested in that topic, take a look at Moose[2], and dig a little into the research papers. (Honestly, I tried Moose a few times, but I wasn't very comfortable with it.)
For TypeScript projects, the TS compiler API is extremely powerful and easy to use. You can use that to extract information and analyze the code relationships (Graphviz is your friend here :) ).
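As a rough illustration of the "extract relationships, then hand them to Graphviz" workflow, here is a sketch in Python using the standard-library `ast` module. It is an analogue of the idea, not the TypeScript compiler API itself. The module names and sources in `SOURCES` are made up, and `imports_of` and `to_dot` are hypothetical helper names.

```python
import ast

def imports_of(source):
    """Return the sorted root package names imported anywhere in source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are skipped in this sketch.
            found.add(node.module.split(".")[0])
    return sorted(found)

def to_dot(sources):
    """Turn {module name: source text} into Graphviz DOT text."""
    lines = ["digraph imports {"]
    for name in sorted(sources):
        for dep in imports_of(sources[name]):
            lines.append(f'    "{name}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)

# Hypothetical in-memory "files"; a real run would read them from disk.
SOURCES = {
    "app": "import db\nfrom handlers import route\n",
    "handlers": "import db\nimport json\n",
    "db": "import sqlite3\n",
}

print(to_dot(SOURCES))
```

The printed text can be saved to a file and rendered with Graphviz's `dot` command. For a TypeScript codebase the extraction step would use the TS compiler API instead, but the output stage would look much the same.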
[0]: https://link.springer.com/book/10.1007/3-540-39538-5
[1]: https://web.archive.org/web/20150428173717/http://www.intooi...
[2]: https://moosetechnology.org/
You could map the way programs fit into machines, and the networks between them. This would be the topology.
You can map the way services call upon one another with requests. This is the service graph.
You can map how systems interact over events or shared resources. You could say this is the logical graph.
The problem happens when you try and graph them all at once. It's the same as trying to draw a real map, with all the services, bus routes, railways, shops and administrative regions superimposed on one image. It's very busy.
So I use separate maps.
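One way to keep the maps separate while still storing everything in one place is to tag each edge with the layer it belongs to and only ever draw one layer at a time. A minimal sketch follows. The node names and the three layer labels are hypothetical examples modelled on the topology / service / logical split above.

```python
from collections import defaultdict

# Hypothetical edges: (source, target, layer). None of these are real systems.
EDGES = [
    ("web", "host-1", "topology"),
    ("api", "host-2", "topology"),
    ("web", "api", "service"),
    ("api", "billing", "service"),
    ("billing", "invoice-queue", "logical"),
    ("mailer", "invoice-queue", "logical"),
]

def split_by_layer(edges):
    """Group (source, target) pairs by their layer tag."""
    layers = defaultdict(list)
    for src, dst, layer in edges:
        layers[layer].append((src, dst))
    return dict(layers)

# Print each map on its own rather than superimposing them.
for layer, pairs in split_by_layer(EDGES).items():
    print(layer)
    for src, dst in pairs:
        print(f"  {src} -> {dst}")
```

Each layer's edge list can then go to a diagram tool independently, so no single picture has to carry the bus routes and the railways at once.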
Tools are another matter. Personally I use Mermaid for graphs. I also have my own tools that create SVG visualisations using DAGre. This can be helpful for interactive visualisations where you can click into different nodes and explore more detail.
My system uses CloudFormation templates and our in-house deployment DSLs to figure out the "topology", then lets users see the different superimposed "graphs" as they see fit.
I have even used it to render infrastructure diagrams of actual production systems (clusters, load balancers, etc)
I use https://mermaid-js.github.io/mermaid/#/ for the diagram itself because GitHub natively supports it in Markdown files, so you can keep the diagram under revision control. I managed to get reasonably close to the C4 diagrams, minus a few features that Mermaid does not support.
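Because a Mermaid diagram is just text, it can also be generated from data you already have. The sketch below builds a basic flowchart from a list of edges. The `to_mermaid` helper and the example system are hypothetical, and it assumes labels contain no double quotes or pipe characters.

```python
def to_mermaid(edges, direction="TD"):
    """Build Mermaid flowchart text from (source, target, note) triples."""
    ids = {}

    def node_id(label):
        # Give every label a short, syntax-safe identifier (n0, n1, ...).
        if label not in ids:
            ids[label] = f"n{len(ids)}"
        return ids[label]

    body = []
    for src, dst, note in edges:
        a, b = node_id(src), node_id(dst)
        arrow = f"-->|{note}|" if note else "-->"
        body.append(f"    {a} {arrow} {b}")

    # Declare each node once, with its human-readable label in quotes.
    decls = [f'    {nid}["{label}"]' for label, nid in ids.items()]
    return "\n".join([f"flowchart {direction}", *decls, *body])

# Hypothetical three-box system, for illustration only.
EDGES = [
    ("Web app", "API", "HTTPS"),
    ("API", "Database", ""),
]

print(to_mermaid(EDGES))
```

The output can be pasted into a `mermaid` code fence in a Markdown file, so a script like this could regenerate the diagram whenever the underlying data changes.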
If that can't be done, there are some interesting things you can try. A lot of the suggestions in the thread are "top down" methods; you can get a lot of value out of "bottom up" visualizations too. Things like:
- Histograms of which lines / functions get called the most, or have the most time spent in them
- Which lines / functions / files get changed the most in the git history
- CPU flamegraphs
- Plain old print-debugging
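The "what changes the most in the git history" item is easy to try. The sketch below counts how many commits touched each path, based on the text that `git log --name-only --pretty=format:` prints (one path per line, blank lines between commits). The function names and the sample log are hypothetical, and renamed files are simply counted under each name they had.

```python
import subprocess
from collections import Counter

def count_churn(log_text):
    """Count how many commits touched each path in --name-only log output."""
    return Counter(line.strip() for line in log_text.splitlines() if line.strip())

def repo_churn(repo_dir=".", top=10):
    """Run git in repo_dir and return the most frequently changed paths."""
    log_text = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    ).stdout
    return count_churn(log_text).most_common(top)

def histogram(counts, width=40):
    """Render (path, count) pairs as a plain-text bar chart."""
    if not counts:
        return ""
    peak = max(n for _, n in counts)
    return "\n".join(
        f"{path:<30} {'#' * max(1, n * width // peak)} {n}" for path, n in counts
    )

# Made-up log text standing in for real git output.
SAMPLE_LOG = "src/a.py\nsrc/b.py\n\nsrc/a.py\n\n\nREADME.md\nsrc/a.py\n"

print(histogram(count_churn(SAMPLE_LOG).most_common()))
```

Against a real repository, `print(histogram(repo_churn("path/to/repo")))` would give the same kind of chart. The files at the top are often a decent first guess at where the "meat" is.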
In over-architected systems it can be difficult to figure out where the real "meat" of the code is, as opposed to the endless layers of configuration and wrappers and interfaces and indirection. UML diagrams may not help, and may even be deceiving, but a stack trace never is.
I like the range of approaches in the Architecture of Open Source Applications books: http://aosabook.org/en/index.html