GraphLover9000 commented on Anyone else witnessing a panic inside NLP orgs of big tech companies?   old.reddit.com/r/MachineL... · Posted by u/georgehill
ashout33 · 3 years ago
I don't really think that book is about building a graph database from scratch
GraphLover9000 · 3 years ago
You're probably right.

One of my initial prompts mentioned graph databases as an example of a scalable system, so I wanted to ask it about the design properties that make it so. I figured that because it was a book about designing systems, it could give me an outline of how a graph database works in practice.

It's pretty annoying how the site erases your prompt once you receive your output. By the time it finishes loading I've half forgotten what my original question was.

GraphLover9000 commented on Anyone else witnessing a panic inside NLP orgs of big tech companies?   old.reddit.com/r/MachineL... · Posted by u/georgehill
org3 · 3 years ago
GraphLover9000 · 3 years ago
I couldn't get "Designing Data-Intensive Applications" to explain to me how to design a graph database (from scratch, without using existing graph frameworks or technologies). It only suggested reasons why graph databases are useful and the properties I have to keep in mind while designing one. I want to know how I can build one in practice.

Using a prompt like "Tell me how to build a graph database from scratch. Specifically, how to design the data model, implement the data storage layer, and design the query language." only gives a very vague answer. Sometimes it suggests using existing technologies.

Anyone know what I'm missing?

GraphLover9000 commented on NetworkX 3.0 - create, manipulate, and study complex networks in Python   networkx.org/documentatio... · Posted by u/anigbrowl
anigbrowl · 3 years ago
It's kind of a mysterious art, and I too mostly rely on scientific papers. It's a surprisingly small field and you need to get into the habit of chasing down citations in papers, because many important ideas got laid out long before the computer power existed to realize them at scale. Sometimes a 20-30 year old paper of only a few pages has the actual algorithm, and it's so well known in the field that it no longer stands out in more recent papers.

Here are a few useful references:

Good overview on big graphs: https://towardsdatascience.com/large-graph-visualization-too...

A gallery of large graphs - horrid user interface, but you can click through and find an absolute wealth of resources. Curated by Yifan Hu, who developed one of the popular layout algorithms: http://yifanhu.net/GALLERY/GRAPHS/

Graphviz is a very well-documented library with a lot of the 'classic' layouts.

Astronomy, physics, and bio people have a lot of useful visualization tools and techniques for huge datasets, but you will have to go looking for them - not because they don't like to share, but because they mostly write to each other so you won't just land on stuff by browsing Github. Absolute must-have literature review: https://arxiv.org/abs/2110.01866

A lot of large graph visualization techniques are about using simple graph visualization techniques but first combing out the hairballs through the application of dimensionality reduction, motif extraction, backbone identification and so on. This is an important paper whose techniques have yet to be fully explored: https://jgaa.info/accepted/2015/NocajOrtmannBrandes2015.19.2...

For a combination of theoretical and practical reasons, most visualization zeroes in on rendering smallish graphs in 2 dimensions. Large graphs are either so densely connected as to be intractable (the brain being the ultimate hairball) or so sparse as to be like digital planetariums - gorgeous, impressive, and looking much the same in every direction.

I could go on at length but as you can maybe guess I'm a consumer of other people's research rather than an expert in implementing the fundamentals. Also I don't have any academic background whatsoever so I apologize for the haphazard infodump. I've been studying/applying stuff from this field for ~15 years but it's too out there for most people. Feel free to email though.

GraphLover9000 · 3 years ago
> I've been studying/applying stuff from this field for ~15 years

Would you mind elaborating on the work you do with graph visualisation?

Thank you for the offer, I'll certainly contact you after I familiarise myself with some more classic research material.

GraphLover9000 commented on NetworkX 3.0 - create, manipulate, and study complex networks in Python   networkx.org/documentatio... · Posted by u/anigbrowl
taubek · 3 years ago
Maybe this will be interesting to you: an open source graph visualization library called Orb. [1] There is a series of blog posts that talk about the development of and reasoning behind this library. [2]

[1] https://github.com/memgraph/orb

[2] https://memgraph.com/blog?topics=Orb#list

GraphLover9000 · 3 years ago
Thank you! I'll take a look.
GraphLover9000 commented on NetworkX 3.0 - create, manipulate, and study complex networks in Python   networkx.org/documentatio... · Posted by u/anigbrowl
taubek · 3 years ago
You mean algos related only to NetworkX or in general? If you are looking for NetworkX related stuff besides the official docs, NetworkX Guide [1] is a good starting point.

[1] https://networkx.guide/

GraphLover9000 · 3 years ago
In general! Graph drawing and interactivity is an interesting problem in my opinion.
GraphLover9000 commented on NetworkX 3.0 - create, manipulate, and study complex networks in Python   networkx.org/documentatio... · Posted by u/anigbrowl
anigbrowl · 3 years ago
By the way if you want to explore network science but don't really know where to start, consider Gephi (currently being refactored, and just updated a few days ago) or Cytoscape (if you're more drawn to bioinformatics).

Both make it easy to load/generate standard datasets, import tabular data, and have a good selection of plugins. It's easy to kick stuff out to either from NetworkX using GEXF or GraphML, and you'll be able to experiment with a wide variety of layout algorithms and metrics. If you don't find the toy/benchmark networks intuitive, consider hitting the HN API for your source material; it's a lot easier to grasp the topic by studying relationships that are already familiar.
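
For what it's worth, the export step mentioned above might look roughly like this. It's only a sketch: `nx.write_gexf` and `nx.write_graphml` are real NetworkX functions, but the helper name `export_for_desktop_tools`, the file names, and the toy graph are made up for illustration.

```python
import os
import tempfile

import networkx as nx


def export_for_desktop_tools(graph, out_dir):
    """Write `graph` to GEXF and GraphML files inside `out_dir`.

    Hypothetical helper: the name and the file names are illustrative only.
    """
    gexf_path = os.path.join(out_dir, "graph.gexf")
    graphml_path = os.path.join(out_dir, "graph.graphml")
    nx.write_gexf(graph, gexf_path)        # GEXF, for Gephi
    nx.write_graphml(graph, graphml_path)  # GraphML, for Cytoscape or Gephi
    return gexf_path, graphml_path


def build_toy_graph():
    """A tiny made-up graph with one node attribute and one edge attribute."""
    g = nx.Graph()
    g.add_node("alice", kind="user")
    g.add_node("bob", kind="user")
    g.add_node("story", kind="post")
    g.add_edge("alice", "story", weight=1.0)
    g.add_edge("bob", "story", weight=2.0)
    return g


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in export_for_desktop_tools(build_toy_graph(), tmp):
            print("wrote", path)
```

From there you'd open the resulting file in Gephi or Cytoscape and try the layouts interactively.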

GraphLover9000 · 3 years ago
Do you have any advice/ideas about ways to learn about graph layout algorithms themselves, including dynamic/real-time algorithms (which allow for user interaction)? I have been skimming through various papers and the first book on this page [0], in particular the chapter on force-directed algorithms, because they seem to be the earliest and most general graph drawing methods.

[0] http://graphdrawing.org/books.html
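
To make the force-directed idea concrete, here is a minimal sketch in the spirit of a Fruchterman-Reingold style spring embedder. It is not taken from the book above and is simplified on purpose: repulsion is computed over all pairs (quadratic in the number of nodes), there is no clamping to a drawing frame, and the function name, the parameters, and the 0.95 cooling factor are my own assumptions.

```python
import math
import random


def force_directed_layout(nodes, edges, iterations=200, seed=0):
    """Return {node: (x, y)} from a simple spring-embedder loop (illustrative sketch)."""
    nodes = list(nodes)
    rng = random.Random(seed)
    # Start from random positions in the unit square.
    pos = {v: [rng.random(), rng.random()] for v in nodes}
    # k is the "ideal" distance between nodes for a unit-area drawing.
    k = math.sqrt(1.0 / max(len(nodes), 1))
    temperature = 0.1  # upper bound on how far a node moves per iteration

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}

        # Every pair of nodes repels with magnitude k^2 / distance.
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                dx = pos[u][0] - pos[v][0]
                dy = pos[u][1] - pos[v][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = (k * k / dist) / dist  # magnitude / dist scales (dx, dy)
                disp[u][0] += dx * push
                disp[u][1] += dy * push
                disp[v][0] -= dx * push
                disp[v][1] -= dy * push

        # Endpoints of each edge attract with magnitude distance^2 / k.
        for u, v in edges:
            dx = pos[u][0] - pos[v][0]
            dy = pos[u][1] - pos[v][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = (dist * dist / k) / dist
            disp[u][0] -= dx * pull
            disp[u][1] -= dy * pull
            disp[v][0] += dx * pull
            disp[v][1] += dy * pull

        # Move each node along its net force, capped by the temperature.
        for v in nodes:
            length = math.hypot(disp[v][0], disp[v][1])
            if length > 0:
                step = min(length, temperature)
                pos[v][0] += disp[v][0] / length * step
                pos[v][1] += disp[v][1] / length * step

        temperature *= 0.95  # cool down so the layout settles

    return {v: (xy[0], xy[1]) for v, xy in pos.items()}


if __name__ == "__main__":
    cycle = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
    for node, (x, y) in force_directed_layout("abcd", cycle).items():
        print(node, round(x, 3), round(y, 3))
```

For interactivity, the same loop could in principle be run a few iterations per frame while the user drags a node (pin that node's position and keep the temperature from reaching zero), though I haven't tried that here.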

GraphLover9000 commented on Show HN: Python library for embedding large graphs (Written in Rust)   github.com/H4kor/graph-fo... · Posted by u/h4kor
GraphLover9000 · 3 years ago
Have you considered just writing a Rust library and also releasing a thin Python wrapper over it as a separate project? That way, other people could write their own thin wrappers in their high level languages of choice and use your fast implementation via FFI.

I have spent some time looking into graph drawing algorithms and it seems to me that writing a good, optimised algorithm is non-trivial!

u/GraphLover9000 · Karma: 3 · Cake day: November 29, 2022