boogheta (u/boogheta)

boogheta commented on Ask HN: How to make my website exist for 100 years? · Posted by u/klgt

boogheta · 2 months ago

Survive that long!

boogheta commented on A graph explorer of the Epstein emails epstein-doc-explorer-1.on... · Posted by u/cratermoon

boogheta · 4 months ago

It's a pity that the network visualisation relies on d3: it is really slow with big networks, and its force-directed algorithm is far from the best. Have you tried JS libraries built specifically to visualise graph networks, such as Sigma.js, Vivagraph or Cytoscape?

boogheta commented on · Posted by u/sanix-darker

boogheta · 6 months ago

Nice! Funny coincidence, @Yomguithereal has been working for the past few weeks pretty much on the same idea of using SIMD for CSV parsing, but in Rust! https://github.com/medialab/simd-csv

boogheta commented on A love letter to the CSV format github.com/medialab/xan/b... · Posted by u/Yomguithereal

lyu07282 · a year ago

It makes me a bit worried to read this thread. I would've thought it's pretty common knowledge why CSV is horrible, and widely agreed upon. I also have a hard time taking anybody seriously who uses "specification" and "CSV" in the same sentence unironically.

I suspect it's 1) people who worked with legacy systems AND LIKED IT, or 2) people who never worked with legacy systems before and need to rediscover painful old lessons for themselves.

It feels like trying, unsuccessfully, to convince someone in 1999 why it's a bad idea to store the year as a CHAR(2).

boogheta · a year ago

Maybe just read the love letter?

boogheta commented on A love letter to the CSV format github.com/medialab/xan/b... · Posted by u/Yomguithereal

seydor · a year ago

Just don't write that love letter in French ... or any language that uses a comma for decimals

boogheta · a year ago

You should look at its author's nationality ;)

boogheta commented on A love letter to the CSV format github.com/medialab/xan/b... · Posted by u/Yomguithereal

afiori · a year ago

Actually even whitespace-separated json would be a valid format and if you forbid json documents to be a single integer or float then even just concatenating json gives a valid format as JSON is a prefix free language.

That is[0], if a string s of length n is valid JSON, then there is no proper prefix s[0..i] with i < n that is valid JSON.

So you could just consume as many bytes as you need to produce a JSON document and then start a new one when that one is complete. To handle malformed data you just need to throw out the partial data on a syntax error and start from the following byte (and likely throw away data a few more times if the error was in the middle of a document)

That is, [][]""[][]""[] is unambiguous to parse[1]

[0] again assuming that we restrict ourselves to string, null, boolean, array and objects at the root

[1] still this is not a good format as a single missing " can destroy the entire document.
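
A minimal sketch of the "consume one document, then start the next" idea described above, in Python. It relies on the standard library's `json.JSONDecoder.raw_decode`, which parses one value starting at a given index and returns the index where it ended. The function name `iter_concatenated_json` is hypothetical, and the sketch does not attempt the error recovery mentioned above: it simply raises on malformed input. As footnote [0] says, bare numbers at the root stay ambiguous (`12` followed by `34` reads as `1234`).

```python
import json


def iter_concatenated_json(text):
    """Yield each JSON value from a string of back-to-back JSON documents.

    Hypothetical helper for illustration; raises json.JSONDecodeError
    on malformed input instead of trying to resynchronise.
    """
    decoder = json.JSONDecoder()
    pos = 0
    n = len(text)
    while pos < n:
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < n and text[pos].isspace():
            pos += 1
        if pos >= n:
            break
        # Parse exactly one value and get the index just past it.
        value, pos = decoder.raw_decode(text, pos)
        yield value


docs = list(iter_concatenated_json('[][]""[][]""[]'))
print(docs)  # [[], [], '', [], [], '', []]
```

The same loop also handles whitespace-separated documents, since whitespace between values is skipped before each parse.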

boogheta · a year ago

« a single missing " can destroy the entire document » This is basically true for any data format, so really the worst argument ever...

boogheta commented on A love letter to the CSV format github.com/medialab/xan/b... · Posted by u/Yomguithereal

IanCal · a year ago

The byte for a capital I is the same as the start of an odd file format, SYLK maybe? For years Excel has (or had, if they finally fixed it) decided this was enough to assume that the file (called .csv) cannot possibly be CSV but must actually be SYLK. It then parses it as such, and is shocked to find your SYLK file is totally broken!

boogheta · a year ago

It sounds to me like, as is often the case, the problem here is Excel, not CSV

boogheta commented on Ask HN: What is the best Covid-19 dashboard? · Posted by u/r0f1

boogheta · 6 years ago

Here's my contribution to allow simple comparisons using series and small multiples, and to automatically shift curves with a calculated delay: https://boogheta.github.io/coronavirus-countries/