Readit News
irridiance commented on Ask HN: How are you cleaning and transforming data before imports/uploads?    · Posted by u/_xsbz
irridiance · 10 months ago
I transform hundreds of tabular sources. For the cleaning / transformation, I found that only a very small number of transformations is required, and that we need to review them as a team, including business owners. So I wrote a simple, very English-like grammar that gets translated into Polars operations under the covers in Python. It covers 98%+ of my ingestion needs, and means that we focus on the logical data transformations as a team. Business users can easily make changes for sources they manage.
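
A minimal sketch of how such a grammar might work, not the actual implementation described above. The rule syntax (`rename`, `trim`, `uppercase`) and the names `Op`, `parse_rules` and `apply_ops` are all invented for illustration, and plain Python dicts stand in for Polars so the sketch has no dependencies:

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical rule syntax, one rule per line:
#   rename "Old Name" to "new_name"
#   trim "column"
#   uppercase "column"
RULE = re.compile(r'^(rename|trim|uppercase)\s+"([^"]+)"(?:\s+to\s+"([^"]+)")?$')


@dataclass(frozen=True)
class Op:
    verb: str
    column: str
    target: Optional[str] = None  # only used by "rename"


def parse_rules(text: str) -> List[Op]:
    """Parse rule text into a list of operations, failing loudly on bad lines."""
    ops = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = RULE.match(line)
        if match is None:
            raise ValueError(f"line {line_no}: cannot parse rule: {line!r}")
        verb, column, target = match.groups()
        # "to" must be present for rename and absent for everything else.
        if (verb == "rename") != (target is not None):
            raise ValueError(f"line {line_no}: 'to' belongs to rename only")
        ops.append(Op(verb, column, target))
    return ops


def apply_ops(rows: List[Dict[str, str]], ops: List[Op]) -> List[Dict[str, str]]:
    """Apply the operations in order to copies of the input rows."""
    out = [dict(row) for row in rows]
    for op in ops:
        for row in out:
            if op.verb == "rename":
                row[op.target] = row.pop(op.column)
            elif op.verb == "trim":
                row[op.column] = row[op.column].strip()
            elif op.verb == "uppercase":
                row[op.column] = row[op.column].upper()
    return out
```

Keeping the parse step separate from the apply step means the rule file can be reviewed and validated on its own, before any data is touched.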

One of the concepts is a “map” from old values to new values. Those we keep in Excel in Git, so that business users can edit / maintain them. Being Excel, we’re careful to validate the import of those rules when we do a run, mainly to flag where a lot has changed, since that may point to an unintended change. Excel makes me nervous in data processing work in general (exploration with Pivots is great, though I’ve moved to Visidata as my first tool of choice). But over years of running this way we’ve worked around Excel’s lax approach to data, such as interpreting numerical ID fields as numbers rather than strings.
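
A rough sketch of the two checks described above, assuming the map has already been read from the spreadsheet into a plain dict. The function names and the 20% threshold are hypothetical, chosen only for illustration:

```python
from typing import Dict, List


def normalize_id(value) -> str:
    """Turn a spreadsheet cell into an ID string.

    Undoes the common case where an ID such as 1001 comes back as the
    float 1001.0. Leading zeros already dropped by the spreadsheet
    cannot be recovered here.
    """
    if isinstance(value, float) and value.is_integer():
        return str(int(value))
    return str(value).strip()


def diff_maps(previous: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """List the keys added, removed, or remapped between two versions."""
    shared = previous.keys() & current.keys()
    return {
        "added": sorted(current.keys() - previous.keys()),
        "removed": sorted(previous.keys() - current.keys()),
        "changed": sorted(k for k in shared if previous[k] != current[k]),
    }


def change_ratio(previous: Dict[str, str], current: Dict[str, str]) -> float:
    """Fraction of the previous map touched by the new version."""
    touched = sum(len(keys) for keys in diff_maps(previous, current).values())
    return touched / max(len(previous), 1)


def needs_review(previous: Dict[str, str], current: Dict[str, str],
                 threshold: float = 0.2) -> bool:
    """Flag the map for human review when a lot has changed (assumed threshold)."""
    return change_ratio(previous, current) > threshold
```

A run could compare the map committed in Git against the newly edited one and stop for review when `needs_review` returns true.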

For output “rendering”, because everything is in Polars, we can most frequently simply output to CSV. We use Jinja for some funky cases.
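
As an illustration only: the sketch below uses the standard library's `csv` module for the common case and `string.Template` as a stand-in for Jinja in the unusual case. The function names and the pipe-delimited layout are made up for the example:

```python
import csv
import io
from string import Template
from typing import Dict, List


def render_csv(rows: List[Dict[str, str]], columns: List[str]) -> str:
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical non-CSV layout that a target system might require.
LINE_TEMPLATE = Template("$id|$name|END")


def render_custom(rows: List[Dict[str, str]], template: Template = LINE_TEMPLATE) -> str:
    """Render one templated line per row (string.Template standing in for Jinja)."""
    return "\n".join(template.substitute(row) for row in rows)
```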

irridiance commented on 1 Dataset. 100 Visualizations   100.datavizproject.com/... · Posted by u/gaws
irridiance · a year ago
I think most of these are extremely poor. In many cases they can only be interpreted if you already understand the data, such as by reading the table first.
irridiance commented on The Documentation Tradeoff   tidyfirst.substack.com/p/... · Posted by u/thunderbong
irridiance · 2 years ago
I write a lot of documentation, knowing that nobody else may ever read it. Why? Because when I take the time to write clearly, I think clearly. It’s for my productivity and effectiveness, first.
irridiance commented on Ask HN: How do you ask users about their pain point?    · Posted by u/yr1337
irridiance · 2 years ago
We’ve been developing niche medical software successfully for some decades.

First, it helps that it’s niche: it avoids the “make healthcare better with electronic healthcare records” space, which can only descend into making a bunch of text boxes available on a screen and promising that AI will do… something…

Second, we will listen to our clients, and probe their needs. But we’re most successful when we observe our clients. When we’re not in the thick of it, we have more space to ask “does it have to be this way?” We work very hard to formulate the problem so that a piece of software is not the default solution.

Few of the pain points are “exciting” or “glamorous”. But anything that means the practitioner is spending more time with the patient is a big win, even if it means applying some very boring technology.

Best of luck.

irridiance commented on ASK HN: What’s a small thing you’ve purchased which has made your life better?    · Posted by u/jjwtieke
irridiance · 3 years ago
My mosquito net. One of the best things I ever bought.
irridiance commented on Ask HN: What is the most memorable game you played?    · Posted by u/Joel_Mckay
irridiance · 3 years ago
Limbo. One of the most moving games I’ve ever played.

King’s Quest III. I think, as a young boy, I related more to the character than in King’s Quests I and II. I really got lost in the world.

Infocom’s Planetfall. Also, lost in the world.

irridiance commented on Show HN: Mystery-o-matic – A daily murder mystery to solve   mystery-o-matic.com/... · Posted by u/galapago
irridiance · 3 years ago
Good fun. I think, though, that precision in language might be a challenge. I concur with some of the previous comments. Over and above those, I was very strict not to infer anything beyond the minimum of what was said. For example, “I was in bedroom from 10:00 to 10:15” does not imply that “I was not in the bedroom before or after that time”. Or, “I didn’t see anyone when I arrived” only means I saw no one in the destination room, not that there wasn’t someone in the kitchen (that I must have walked through) or the source room. It also seems illogical that the murder could have happened up to 11:15, exactly when the police arrive, unless the victim phoned it in. Read this strictly, the rules left ambiguity.

u/irridiance

Karma: 20 · Cake day: July 7, 2023