cdbattags commented on Pricing Changes for GitHub Actions   resources.github.com/acti... · Posted by u/kevin-david
cdbattags · 3 months ago
Use Blacksmith. I promise you won't regret it.

Dead Comment

cdbattags commented on Show HN: Jq-Like Tool for Markdown   github.com/yshavit/mdq... · Posted by u/yshavit
lanstin · a year ago
Ironically, one of the reasons Markdown (and other text-based file formats) became popular is that you could use regular find/grep to analyze them and version control to manage them.
cdbattags · a year ago
Definitely, but it's neat nonetheless because more and more things are "structured Markdown" these days. Extremely useful for AI reasoning and outputs.
cdbattags commented on Ask HN: What's your "it's not stupid if it works" story?    · Posted by u/_bbih
cdbattags · 2 years ago
I worked for an education technology company that made curriculum for K-8. There are long sales cycles in this space and different departments of ed have different rules. Think "vote every 4 years because our books are out of date or just old". The technology wave came fast, and most of the curriculum from incumbent providers was formatted to fit in a book; at best, the most cutting-edge of them had a large InDesign file as the output.

The edtech company I worked for was "web first", meaning students consumed the content from a laptop or tablet instead of reading a book. It made sense because the science curriculum, for example, came with 40+ simulations that helped explain the material. A large metropolitan city was voting on new curriculum and we were in the running to be selected, but their one gripe was that they needed N many books in a classroom. Say, for a class of 30, they wanted 5 books as backup just in case, and for the teachers who always like a hard copy and don't want to read from a device.

The application was entirely Angular 1.x based and read content from a CMS, so we could update it in realtime whenever edits needed to be made. So we set off to find a solution to make some books. The design team started from scratch, going page by page to see how long it would take to make a whole book in InDesign, but the concept of multiple people editing at once doesn't really exist in this software. Meanwhile, my team was brainstorming a code pipeline solution to auto-generate the book directly from the code that was already written for the web app.

We made a route in the Angular app for the entire "book": a stupid simple for loop that fetched each chapter and each lesson in that chapter and rendered it all out on one stupidly long page. That part was more or less straightforward, but then came the hard part of trying to style that content for print. We came across Prince XML which, fun fact, was created by one of the inventors of CSS. We snagged a license and added some print-target custom CSS that did things like "add blank page for padding because we want new chapter to start on the left side of the open book". But then came the devops portion that really messed with my head.
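The "blank page for padding" rule boils down to page-number parity. A minimal sketch, not the actual stylesheet from the project: it assumes page 1 is a right-hand page, so left-hand pages are the even ones, and the function name and page counts are made up for illustration. In CSS paged media the same intent is normally written as `page-break-before: left` on the chapter element.

```python
def pad_chapters_to_left(chapter_page_counts, first_page=1):
    """Return (start_page, blanks_inserted_before) for each chapter.

    Assumption: odd page numbers are right-hand pages and even page
    numbers are left-hand pages, as in a book whose page 1 is on the right.
    """
    layout = []
    next_page = first_page
    for count in chapter_page_counts:
        # A chapter must open on an even (left-hand) page; if the next
        # free page is odd, one blank page is inserted first.
        blanks = 0 if next_page % 2 == 0 else 1
        start = next_page + blanks
        layout.append((start, blanks))
        next_page = start + count
    return layout


# Hypothetical chapter lengths, in pages.
layout = pad_chapters_to_left([3, 4, 5])
for number, (start, blanks) in enumerate(layout, start=1):
    print(f"chapter {number}: starts on page {start}, blanks before: {blanks}")
```

At most one blank page is ever needed per chapter, which is why the rule stays cheap even for a book with many chapters.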

We needed a headless browser to render out all of this, and then we needed the source with all the images, etc. to be downloaded into a folder and passed to Prince XML for rendering. Luckily we had an ECS pipeline, so I tried to get it working in a container. I came up with a hack: when the rendering for loop over the chapters/lessons finished, it printed something to the console, and that was the "hook" for saving the page content to the folder. But then came the mother of all "scratching my head" moments when Chromedriver started randomly failing for no reason. It worked when we did a lesson. It worked when we did a chapter. But it started throwing a nondescript error when I did the whole book. Selenium uses Chromedriver, and Chromedriver comes straight from Google and the Chromium repo. Once I finally found the stack trace, that meant diving into that C++ code to trace it down. Well yeehaw, I found an overflow error in the transport protocol that Chrome devtools uses to talk to the "tab/window" it's reading from. I didn't have the time to get to the bottom of the true bug, so I just cranked the buffer up to like 2 GB and recompiled Chromium with the help of my favorite coworker and BOOM it worked.
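The console "hook" is basically polling for a sentinel message with a timeout. A rough sketch of that idea only, not the code from the pipeline: `read_logs` is a hypothetical callable that returns whatever new console messages the headless browser has produced since the last call, and the sentinel string is made up.

```python
import time


def wait_for_sentinel(read_logs, sentinel, timeout=60.0, poll_interval=0.5):
    """Poll read_logs() until a message containing `sentinel` shows up.

    read_logs is a hypothetical callable returning a list of new console
    messages. Returns True if the sentinel was seen, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        for message in read_logs():
            if sentinel in message:
                return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


# Stand-in for a real browser log source: three batches of messages.
fake_batches = iter([["rendering chapter 1"], [], ["BOOK_RENDER_DONE"]])
done = wait_for_sentinel(
    lambda: next(fake_batches, []),
    "BOOK_RENDER_DONE",
    timeout=5.0,
    poll_interval=0.01,
)
print("safe to save page content:", done)
```

The timeout matters here: if the render loop dies partway through the book, the job fails instead of hanging forever.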

But scaling this thing up was now a nightmare. We had a Java Dropwizard application reading an SQS queue that kicked off the Selenium headless browser (with the patched Chromedriver code), which downloaded the page. The server now needed a whopping 2 GB per book, which made the Dropwizard application a nightmare to memory manage, and I had to do some suuuuper basic multiplication for the memory so that I could parallelize the pipeline.
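The multiplication in question is roughly "how many 2 GB renders fit on one box". A minimal sketch under stated assumptions: the 2 GB per book comes from the story above, while the host size and the memory reserved for the JVM and OS are hypothetical numbers.

```python
def max_parallel_renders(total_mem_gb, per_book_gb=2.0, reserved_gb=0.0):
    """How many book renders fit in memory at once.

    per_book_gb=2.0 is the per-book buffer from the story; total_mem_gb
    and reserved_gb (JVM heap, OS, etc.) are hypothetical inputs.
    """
    if per_book_gb <= 0:
        raise ValueError("per_book_gb must be positive")
    usable = total_mem_gb - reserved_gb
    if usable <= 0:
        return 0
    return int(usable // per_book_gb)


# Hypothetical 16 GB host with 4 GB held back for the app and the OS.
print(max_parallel_renders(16, per_book_gb=2.0, reserved_gb=4))
```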

I was the sole engineer for this entire rendering application, and the rest of the team assisted on the CSS, styling, and content edits for each and every "book". At the end of the day, I calculated that I saved roughly 82,000 hours of work: the pace at which they could make a single chapter of a book by hand, multiplied by all the chapters and lessons for all the different states. Florida is fucked and didn't want to include certain lines about evolution, etc., so a single book for a single grade had to exist for N many states that all have different "editions".

82,000 hours of work is 3,416.6667 days of monotonous, grueling, manual, repetitive design labor. Shit was nasty but it was so fucking awesome.
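The conversion checks out if a "day" means a full 24 hours. A quick sketch of the arithmetic; the 8-hour workday figure is an extra assumption added here for comparison, not a number from the story.

```python
HOURS_SAVED = 82_000  # figure quoted above


def hours_to_days(hours, hours_per_day):
    """Convert a number of hours into days of the given length."""
    return hours / hours_per_day


calendar_days = round(hours_to_days(HOURS_SAVED, 24), 4)  # 24-hour days
working_days = hours_to_days(HOURS_SAVED, 8)  # assumed 8-hour workdays

print(calendar_days)
print(working_days)
```

Counted in 8-hour workdays instead of round-the-clock days, the same 82,000 hours comes to three times as many days.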

Shoutout to John Chen <zhanliang@google.com> for upstreaming the proper fix.

cdbattags commented on Lyft’s plan to take control of its maps and its future   lyft.com/rev/posts/lyfts-... · Posted by u/edward
cdbattags · 3 years ago
So which open source routing project do we believe they're using along with it? Valhalla?

Edit:

I'm co-founder and CTO of a heavy haul/oversize routing SaaS called Triple Axle (https://tripleaxle.com). We use OSM and Valhalla.

cdbattags commented on Faster than the filesystem (2021)   sqlite.org/fasterthanfs.h... · Posted by u/madmax108
reissbaker · 3 years ago
If you're using 16 PCIe 4.0 lanes you max out at 32GB/s, although commercial drives tend to have much lower throughput than that maximum (~7.5GB/s for a good NVMe drive). Cat6a ethernet tops out at 10 gigabits per second, but plenty of earlier versions have lower caps, e.g. 1 gigabit. My guess is you'll most likely be limited by either disk or network hardware before needing CPU parallelism, if all you're doing is copying bytes from one to the other.
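To put those numbers on one scale, here is a small sketch that converts everything to GB/s and picks the slowest hop. The figures are the ones quoted in the comment; the dictionary keys and function names are made up, and the model ignores protocol overhead.

```python
def gbit_to_gbyte(gbit_per_s):
    """Convert gigabits per second to gigabytes per second (8 bits per byte)."""
    return gbit_per_s / 8.0


def bottleneck(links):
    """Return (name, throughput) of the slowest link in a copy path."""
    name = min(links, key=links.get)
    return name, links[name]


# Throughput of each hop in GB/s, using the figures quoted above.
LINKS_GB_PER_S = {
    "pcie4_x16": 32.0,
    "nvme_drive": 7.5,
    "ethernet_10g": gbit_to_gbyte(10),
}

print(bottleneck(LINKS_GB_PER_S))
```

With these inputs the network is the limiting hop by a wide margin, which is consistent with the guess that disk or network hardware caps a plain byte copy before CPU parallelism matters.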
cdbattags · 3 years ago
The other being a network socket in this case? But that socket might be two servers over? Meh, ideally they've optimized that as well.

So absolutely it is a network problem which means custom fiber?

cdbattags commented on Faster than the filesystem (2021)   sqlite.org/fasterthanfs.h... · Posted by u/madmax108
summerlight · 3 years ago
This reminds me of WinFS, which was probably one of the most ambitious architectural projects ever in the history of Windows, yet it failed to materialize. The vision was so attractive: encode all the semantic knowledge of file schemas and metadata into a relational filesystem, so you can programmatically query whatever information you want about the filesystem and its content.

IIRC the problem was its performance. I don't have any insider knowledge so I cannot pinpoint the culprit, but I suppose the performance issue was probably not a fundamental tradeoff (as this article suggests) but more a matter of its immature implementation. Storage technologies have gotten much better since then, so many of its problems could be tackled differently. Of course the question it has to answer is also different: is it still a problem worth solving?

cdbattags · 3 years ago
That's a good take. I wonder if those engineers have hopped around and said similar when Apple or others tried to implement the same?
cdbattags commented on Faster than the filesystem (2021)   sqlite.org/fasterthanfs.h... · Posted by u/madmax108
BartjeD · 3 years ago
In 2017 you didn't have io_uring yet. Though that doesn't explain it for Windows and Android.
cdbattags · 3 years ago
Ahhhh, this is the real answer!

Edit:

Holy shit, IO rings released on Windows Preview in 2021...

u/cdbattags

Karma: 326 · Cake day: October 10, 2013
About
CTO @ https://Slabstack.com

https://cdbattaglia.com

Computer Science alumnus from Georgia Tech
