Hello HN!
We're currently working on a work automation platform. We have a few capabilities that we're building to help you do your work better. For guidance it would be super helpful to hear directly from the HN crowd :)
Please do comment below about one thing you'd like to see automated. It would be super awesome if you could also complete this quick 1-min survey: https://bit.ly/2MjN6fd
Many thanks!
Technically I could write one command like this and go get a coffee:
cd dir1 && task1 && cd ../dir2 && task2 some arg && etc
But then the tasks would only start executing once I'm done typing all the commands... I want it to start task1 right away and then take my time to enter the rest...
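One way this could work, as a rough sketch: a small shell function that reads one command per line and runs each as soon as the previous one has succeeded, so task1 starts while you are still typing the rest. The name `run_queue` is made up for illustration, and the stop-on-first-failure behaviour is an assumption meant to mirror the `&&` chain above.

```shell
# Hypothetical helper: run commands from stdin one at a time, in order.
# Each line starts as soon as the previous one has finished successfully.
run_queue() {
    while IFS= read -r cmd; do
        # Skip empty lines.
        [ -n "$cmd" ] || continue
        # Run in the current shell so that `cd` carries over to later lines.
        # stdin is redirected so a task cannot swallow the queued commands.
        if ! eval "$cmd" </dev/null; then
            echo "stopped: '$cmd' failed" >&2
            return 1
        fi
    done
}
```

Usage would be: type `run_queue`, press Enter, then enter `cd dir1`, `task1`, `cd ../dir2`, `task2 some arg` on separate lines, and finish with Ctrl-D. Lines typed while a task is still running just wait in the terminal until the loop reads them.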
These tools need to be configured in certain ways depending on the business needs.
Having a nice way to look at the dataflow as a whole, configure these tools on a global level within some framework, and be able to nicely distribute the work on our internal server farm would be worth a good bit of money to the company.
I used Apache Airflow some years ago to do exactly this. It's pretty good. You build a workflow of tasks (in Python) and set a schedule for how often you want it to run. It then runs these tasks on any number of machines where you run the Airflow worker, orchestrating whatever it is you are trying to do.
If a task fails, it can notify you, and if you "miss" a run, it can backfill it, provided your toolchain understands the concept of time. This is very useful for hourly/daily feeds: if you miss one, the system can go back and retry just the slots it missed.
Comes with a nice UI, too.
“Airflow is not a data streaming solution. Tasks do not move data from one to the other (though tasks can exchange metadata!)”
Many of our tools take on the order of 5-200 GB of data and perform a transformation (whose output, of similar size, gets passed along to the next tool, possibly after another automated validation step) and/or a validation (at which point this particular branch of the workflow ends).
The automated modules we have are self-contained; each task in our case is “data + config parameters in, data out”, and the “data out” then becomes the “data in” for the next step once its configuration parameters have been chosen.
Would this still be a good use case? Am I misunderstanding what the above quote is about?
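To make the "data in, data out" shape concrete, here is a minimal shell sketch. It is not Airflow and not our actual tooling. `run_pipeline` is a hypothetical name; each stage is assumed to be a command (with its config parameters as arguments) that reads data on stdin and writes data on stdout, and a validation stage is assumed to signal "stop this branch" with a non-zero exit status.

```shell
# Hypothetical sketch: chain self-contained stages, feeding each stage's
# output file to the next stage as input.
# usage: run_pipeline <input-file> <stage-command>...
run_pipeline() {
    current=$1; shift
    workdir=$(mktemp -d) || return 1
    n=0
    for stage in "$@"; do
        n=$((n + 1))
        next="$workdir/step$n.out"
        # "data + config parameters in, data out": the config lives in the
        # stage's own command line, the data flows through stdin/stdout.
        if ! sh -c "$stage" <"$current" >"$next"; then
            echo "branch stopped at stage $n: $stage" >&2
            rm -rf "$workdir"
            return 1
        fi
        current=$next
    done
    # Emit the final stage's output, then clean up the intermediates.
    cat "$current"
    rm -rf "$workdir"
}
```

For example, `run_pipeline input.txt 'sort' 'tr a-z A-Z'` would sort the file and then upper-case the sorted result. At 5-200 GB per step, the intermediate files would need a scratch location with enough space, which this sketch does not handle.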
Many channels on Slack + email threads + conversations in DMs
I spend too much time looking at email and can’t seem to get the hang of the Gmail labels system
In Google Apps Script, fetch all mail with that label, process it as needed (sending replies, moving, deleting), and then remove the label.
I have to feed birds and give them fresh and vitamin water almost every day. I wish I could automate that somehow.
If there were a bot that kept up with my work and just summarised all the progress since my reviews ... that would be really useful, and I’d likely do more reviews