gormanc commented on Ask HN: Who is hiring? (May 2018)    · Posted by u/whoishiring
simoes · 8 years ago
Datawheel (datawheel.us) | Front-End Developer and Back-End Developer and Product Designer | Cambridge MA and Washington DC | Full-time, ONSITE

Datawheel is a small but mighty crew of programmers and designers who are here to make sense of the world’s vast amount of data.

------------ Front-End Developer ------------ We are looking for someone proficient in JavaScript, HTML, and CSS (React is a plus), but also someone who is passionate about what they do and can bring that to the projects assigned to them. You will be expected to communicate closely with back-end developers to connect to the API endpoints they make available, and with product designers to implement their mock-ups.

Requirements
- 3+ years of experience with client-side web languages
- Familiarity with React and Node
- Comfortable with rapid prototyping

------------ Back-End Developer ------------ We are looking for someone proficient in Python (Pandas is a plus), but also someone who is passionate about what they do and can bring that to the projects assigned to them. Main responsibilities would include cleaning, structuring, and ingesting client data into a usable format, to then be delivered to front-end developers through a REST API.

Requirements
- 3+ years of experience with server-side web languages
- Familiarity with Pandas and/or other statistical software
- Familiarity with SQL
- Comfortable with rapid prototyping

Bonuses
- Experience with scikit-learn, TensorFlow, or other machine learning libraries
- Experience working with columnar databases

Apply here: http://www.datawheel.us/apply/

gormanc · 8 years ago
ooo I'm moving back to Boston in a month or so and this is right up my alley :)
gormanc commented on Rent out GPUs to AI researchers and make ~2x more than mining cryptocurrencies   reddit.com/r/gpumining/co... · Posted by u/mparramon
dpwm · 8 years ago
There are a lot of comments on that reddit thread about how awesome this would be as a service.

But there's a big problem of trust with this for ML.

How do I know you actually ran what I paid you for, and didn't just generate random data that looks plausible in the shape I wanted?

You could farm it out to two people and if the results disagree, then payment is decided by a third. But then you've just doubled the (paid) workload, and you've not really solved collusion by a significant portion of the workers.
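The farm-it-out-and-arbitrate scheme described above can be sketched in a few lines. This is a toy illustration under stated assumptions: workers are plain callables standing in for remote compute hosts, the worker labels ("a", "b", "arbiter") are hypothetical, and, as the comment notes, it doubles the paid workload and does nothing about colluding workers.

```python
def verify_by_redundancy(job, worker_a, worker_b, arbiter):
    """Run `job` on two workers; if their results disagree, a third
    (the arbiter) breaks the tie. Returns the accepted result and the
    set of labels for workers who get paid."""
    r_a, r_b = worker_a(job), worker_b(job)
    if r_a == r_b:
        return r_a, {"a", "b"}            # agreement: both paid, arbiter unused
    r_c = arbiter(job)                    # disagreement: third run decides
    if r_c == r_a:
        return r_a, {"a", "arbiter"}
    if r_c == r_b:
        return r_b, {"b", "arbiter"}
    return None, set()                    # all three disagree: nobody is paid

# Toy demo: honest workers square the input; a cheater returns junk.
honest = lambda x: x * x
cheater = lambda x: 42

result, paid = verify_by_redundancy(7, honest, cheater, honest)
# The arbiter sides with the honest worker, so the cheater goes unpaid.
```

Note the scheme's weakness is visible in the last branch: if the two colluding parties return the same wrong answer, the arbiter is never consulted.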

gormanc · 8 years ago
According to the website (https://vectordash.com/hosting/) they use a highly isolated Ubuntu image, so the person hosting the service shouldn't have access to the VM with your model or data on it. It would be nice if there were some third-party audit of the software, though; the models, the code, and even the training data can be pretty sensitive for researchers.
gormanc commented on Ask HN: What tools have most helped your day-to-day productivity?    · Posted by u/cadeljwatson
gormanc · 8 years ago
Visual Studio Code with the LaTeX Workshop extension is definitely my favorite LaTeX editor. The integration with the ChkTeX linter, latexmk, git, and all that jazz just makes it so much easier to focus on writing.

For research management I had been using Mendeley for a while but got a bit frustrated with the way it handled BibTeX. It got really annoying when I had papers that fell into multiple categories and/or were cited in multiple papers. My new setup is to use JabRef to manage individual BibTeX files for specific projects and to use Mendeley just for document management and notes.

Oh also PyCharm is extremely good.

gormanc commented on Eager Execution: An imperative, define-by-run interface to TensorFlow   research.googleblog.com/2... · Posted by u/alextp
josh11b · 8 years ago
I'm on the team that worked on this -- happy to answer questions!
gormanc · 8 years ago
Hot damn, this has got me all giddy. How will this work on single-node multi-GPU systems? For example, with PyTorch you have to use either threading, multiprocessing, or even MPI. Can you think of a not-too-scary way to use eager execution with multiple GPUs?
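For context, the threading route mentioned above follows a scatter/compute/reduce shape. Here is a CPU-only toy sketch of that pattern, with a made-up `train_shard` standing in for a per-GPU training step (a real multi-GPU setup would pin each worker to its own device; nothing here is PyTorch or TensorFlow API):

```python
from concurrent.futures import ThreadPoolExecutor

def train_shard(shard):
    # Stand-in for a per-GPU training step; a real worker would move its
    # model replica and data shard to its own device before computing.
    return sum(x * x for x in shard)  # pretend "loss" over the shard

def data_parallel_step(batch, n_workers=2):
    # Scatter the batch across workers, compute in parallel, then reduce:
    # the same shape as data-parallel training across multiple GPUs.
    size = (len(batch) + n_workers - 1) // n_workers
    shards = [batch[i:i + size] for i in range(0, len(batch), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(train_shard, shards))
    return sum(partial)  # the "all-reduce" over per-worker results
```

For example, `data_parallel_step(list(range(8)))` splits the batch into `[0..3]` and `[4..7]` and sums the two partial results.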
gormanc commented on Tensorflow sucks   nicodjimenez.github.io/20... · Posted by u/nicodjimenez
orf · 8 years ago
Why not pip?
gormanc · 8 years ago
For installing it, yeah, pip is great too, but for building, conda includes third-party tools and libraries and such. E.g., in order to use the MPI backend for PyTorch's distributed processing you need to build PyTorch yourself, and conda just makes it a bit easier. That, and I had a really bad experience trying to build TensorFlow (and Bazel) to run on an HPC cluster.
gormanc commented on Tensorflow sucks   nicodjimenez.github.io/20... · Posted by u/nicodjimenez
vaughngh · 8 years ago
I have tensors in later layers whose shapes depend on the training of the previous layers.

Rad! Do you have any examples (or literature) that explains when this is beneficial?

gormanc · 8 years ago
Not yet! I'm not using convnets or backprop or anything, so I don't think it would be beneficial that way, but you could get something similar to what I'm doing by looking at Fritzke's Growing Neural Gas[1].

[1] http://papers.nips.cc/paper/893-a-growing-neural-gas-network...
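For the curious, the core of Growing Neural Gas fits in a short sketch. This is a rough, pure-Python approximation, not Fritzke's exact procedure: global error decay and removal of isolated units are omitted, and the hyperparameter names and values are illustrative. The relevant point for the thread is that the network's size is a product of training, not a dimension fixed up front.

```python
import random

def gng(data, steps=500, lam=50, eps_b=0.2, eps_n=0.006, max_age=25):
    """Grow a graph of units whose count adapts to the input data."""
    nodes = [list(random.choice(data)), list(random.choice(data))]
    error = [0.0, 0.0]
    edges = {}                                  # (i, j) with i < j -> age
    d2 = lambda a, b: sum((p - q) ** 2 for p, q in zip(a, b))
    for t in range(steps):
        x = random.choice(data)
        # Find the winner s1 and runner-up s2 for this input.
        s1, s2 = sorted(range(len(nodes)), key=lambda i: d2(nodes[i], x))[:2]
        error[s1] += d2(nodes[s1], x)
        # Move the winner (and its topological neighbours) toward x.
        nodes[s1] = [w + eps_b * (p - w) for w, p in zip(nodes[s1], x)]
        for (i, j) in list(edges):
            if s1 in (i, j):
                n = j if i == s1 else i
                nodes[n] = [w + eps_n * (p - w) for w, p in zip(nodes[n], x)]
                edges[(i, j)] += 1              # age every edge of the winner
        edges[tuple(sorted((s1, s2)))] = 0      # refresh winner/runner-up edge
        edges = {e: a for e, a in edges.items() if a <= max_age}
        if t % lam == lam - 1:                  # periodically insert a unit
            q = max(range(len(nodes)), key=lambda i: error[i])
            nbrs = [j if i == q else i for (i, j) in edges if q in (i, j)]
            if nbrs:
                f = max(nbrs, key=lambda i: error[i])
                nodes.append([(a + b) / 2 for a, b in zip(nodes[q], nodes[f])])
                error[q] /= 2; error[f] /= 2
                error.append(error[q])          # new unit inherits q's error
                edges.pop(tuple(sorted((q, f))), None)
                k = len(nodes) - 1
                edges[tuple(sorted((q, k)))] = 0
                edges[tuple(sorted((f, k)))] = 0
    return nodes, edges
```

Feeding it a cloud of 2-D points grows the node set from the initial two units, which is exactly the "shapes depend on training" behavior that define-by-run frameworks make easy and static graphs make painful.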

gormanc commented on Tensorflow sucks   nicodjimenez.github.io/20... · Posted by u/nicodjimenez
gormanc · 8 years ago
Hell, being able to effortlessly switch between PyTorch and Numpy/SciPy/sklearn/skimage has been so helpful for the project I'm working on. That and I have tensors in later layers whose shapes depend on the training of the previous layers.
gormanc commented on Tensorflow sucks   nicodjimenez.github.io/20... · Posted by u/nicodjimenez
throw847333 · 8 years ago
A few more for your selection,

- Bloated build system that is near impossible to get working - who even uses Maven?! PyTorch/Caffe are super simple to build in comparison; with Chainer, it's even simpler: all you need is pip install (even on exotic ARM devices).

- The benefits of all that static analysis simply aren't there. In addition, PyTorch has a JIT compiler, which one can argue lets one have their cake and eat it too.

- Loops are extremely limited. Okay, we know RNNs/LSTMs aren't really TF's thing, but if you venture out to do something out of the ordinary, even making it batch-size invariant is difficult. There isn't even a map-reduce op that works without knowing the dimension at compile time. You can hack something together by fooling one of those low-level while_loop ops, but that just tells you how silly the whole thing is.

gormanc · 8 years ago
I love that PyTorch kind of went all-in with Anaconda. Building it is so much easier than TF! I'm a recent convert, but it's dang good.

u/gormanc

Karma: 9 · Cake day: June 14, 2017