Now turn to the machine learning problem we sought to solve with the new synthetic data: estimating P(y | X1, X2, ..., Xn), where y is usually a class like "bird". In other words, given an image, predict its label. Since the synthetic data was generated knowing only the statistics of the original data, it can add no information beyond plausible examples derived from the original data itself.
Will this improve the accuracy of a model by providing additional edge-case examples and filling in gaps? Somewhat. Will it capture data not represented in the original dataset and substitute for more thorough, diverse datasets? Absolutely not.
In terms of model improvement, yes, synthetic data can help. In terms of the arms race? No. True examples provide knowledge that is unique. If one uses a physics engine (GTA is popular for self-driving cars), one can gather truly novel data; this is not the case for GANs.
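To make the point concrete, here is a rough sketch, not a GAN: the "generator" is just a 1-D Gaussian fitted to the collected data, and the unseen mode at 8 is a made-up stand-in for real-world cases nobody recorded. All names and numbers are assumptions for illustration. The same logic applies to any generator that only ever sees the original data.

```python
import random
import statistics


def fit_generator(samples):
    """'Train' a toy generator: keep only the mean and std of the data."""
    return statistics.mean(samples), statistics.stdev(samples)


def sample_synthetic(mean, std, n, rng):
    """Draw n synthetic points from the fitted distribution."""
    return [rng.gauss(mean, std) for _ in range(n)]


def fraction_near(samples, center, radius=1.0):
    """Fraction of samples within `radius` of `center`."""
    return sum(abs(x - center) < radius for x in samples) / len(samples)


rng = random.Random(0)

# Collected data: only the common mode around 0 was ever observed.
original = [rng.gauss(0.0, 1.0) for _ in range(1_000)]

# Hypothetical rare mode that exists in the world but not in the dataset.
UNSEEN_MODE = 8.0

mean, std = fit_generator(original)
synthetic = sample_synthetic(mean, std, 10_000, rng)

print(f"fitted mean={mean:.2f}, std={std:.2f}")
print("synthetic near seen mode:  ", fraction_near(synthetic, 0.0))
print("synthetic near unseen mode:", fraction_near(synthetic, UNSEEN_MODE))
```

The synthetic samples fill in the region the original data already covered, but none of them land near the mode that was never observed. More samples from the same fitted distribution do not change that.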
It's concerning how willing people are to write articles on this subject without understanding the mathematics underlying the technology.
Do your homework and RTFM.
Learning a new language wasn’t too hard when that language was Python, after all.
I guess academics like familiarity, and Lua insistently refuses to be like other languages (arrays and maps in one type, 1-based arrays, nonstandard built-in patterns, etc.).
I'm an AI researcher / practitioner. For me, code accompanying papers is very useful, and usually this code is in Python. Occasionally it's MATLAB, but let's be honest, who cares about those papers :). I'd love to use Julia, but the package support just isn't there. Ironically, people like me are supposed to be writing this code, but with a demanding job and a family it's not likely I will be improving their DataFrame effort anytime soon.
Anyway, the MAIN reason I use open source software is that if it isn't working correctly, I simply fix the code myself. This isn't possible in the proprietary world. Why would you trust your research or production work to code you can't see and edit?
There's been a lot of talk about documentation. Docs are secondary sources, like WIRED; read the code if you're serious about being correct. Even (especially) hired hands make mistakes and fail to write good tests.
This article reminded me of the fictional news headline from The Simpsons, "Old Man Yells at Cloud". It's funny, and he may have a point, but it has no relevance.
But... what else is there, in life? Flipping big matrices around is nearly everything I do, and the Python stuff seems too cumbersome for me to bother.