A scoring function could definitely help guide the exploration and/or prune the tree, but only at the action nodes, not at the environment randomness (chance) nodes. Rolling out the full tree more than 1-2 levels deep would be infeasible because of the randomness in the environment. When you take an action, the randomness can put you into any one of a very large number of states, so the branching factor is much larger than in chess. I think in chess you have a factor of 40ish? Here it's more like ~1000 or ~10,000 depending on the item.
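To put rough numbers on that, here is a back-of-the-envelope sketch of how fast a full rollout grows. The function name and the figure of 5 available actions per state are made up for illustration; only the ~40 and ~1000 figures come from the estimates above.

```python
def rollout_leaves(actions_per_state: int, outcomes_per_action: int, depth: int) -> int:
    """Count the leaves of a full rollout that is `depth` levels deep.

    One level = one action node followed by one chance node, so the
    effective branching factor per level is actions * outcomes.
    """
    return (actions_per_state * outcomes_per_action) ** depth


if __name__ == "__main__":
    for depth in (1, 2, 3):
        # Chess-like: ~40 moves per position, deterministic outcome.
        chess_like = rollout_leaves(40, 1, depth)
        # Crafting-like: 5 actions (assumed), ~1000 random outcomes each.
        crafting_like = rollout_leaves(5, 1000, depth)
        print(f"depth={depth} chess-like={chess_like:,} crafting-like={crafting_like:,}")
```

Under these assumptions, two levels already means 25 million leaves versus 1,600 for the chess-like case. A scoring function can shrink the 5, but not the 1000.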
I also wouldn't know how to design a scoring function for this. If you do something simple like counting the number of missing modifiers, you will end up stuck in bad states. Maybe there is something really clever you can do here, but I don't know what it is.
If you have an idea how to make tree search work for this I'd love to try it.
Location: Japan (or East Asia in general)
Willing to Relocate: No
Technologies (reverse-chronological order):
- AI / Deep Learning Research - previously at Google and have published papers. Mostly focused on NLP and RL, but I keep up with other subfields.
- Infra: DevOps, Rust, Go, Kubernetes, microservices, large-scale systems, all kinds of data stores. Have managed large clusters. Used to be an early Apache Spark engineer and was in a database research group in grad school.
- Worked in HFT-style algotrading for a few hedge funds
- Worked at multiple early-stage startups, so I can do other things like full-stack web or app development. I prefer not to do these full-time, but I can help if stuff comes up.
Résumé/CV: https://dennybritz.com/about
https://twitter.com/dennybritz
http://github.com/dennybritz
dennybritz [at] gmail
--
Hi! I have 15+ years of engineering experience and have been through a lot of technology cycles. I'm in an OK place right now, focusing on research and side projects. I'm not actively looking for work, but if there's something at the intersection of my interests I'd love to talk. I'm not sure myself what that would look like, perhaps something around MLOps, infra/automation, reinforcement learning, algorithmic trading, etc.
See you on the shore next Friday :)