There was already some of this with radio and early broadcast news, but the landscape has changed so much, and it bothers me a great deal that these platforms are being used this way.
After reading such an announcement on Twitter, the first thing I’d do (if I cared about it) would be to head over to sec.gov, or use a search engine to find the official SEC site, and then navigate from there to the official announcement. Any reputable news source should link directly to the official announcement, saving you this verification step.
At some point there may be so much targeted disinformation/misinformation out there that we need legislation to help protect against it, but I don’t think we’re there yet.
I've found the problems that biologists cause are mostly:
* Not understanding dependencies, public/private interfaces, SCM, or versioning, making their own code uninstallable after a few months
* Writing completely unreadable code, even to themselves, making it impossible to maintain. This means they always restart from zero, and projects grow into folders of a hundred individual scripts with no order, depending on files that no longer exist
* Foregoing any kind of testing or quality control, making real and nasty bugs rampant.
IMO the main issue with the software people in our field (of which I am one, even though I'm formally trained in biology) is that they are less interested in biology than in programming, so they are bad at choosing which scientific problems to solve. They are also less productive when coding than the scientists because they care too much about the quality of their work and not enough about getting shit done.
Ultimately I’d say the core issue here is that research is complex and research environments are often resource-strapped relative to others. As such, this idea of “getting shit done” takes priority over everything. To some degree it’s not much different from startup environments that favor shipping features over writing maintainable and well (or even partially) documented code.
The difference in research that many fail to grasp is that the code is often as ephemeral as the specific exploratory path of research it’s tied to. Sometimes software in research is more general-purpose, but more often it’s tightly coupled to a new idea rooted in some deep-seated piece of theory. Just as exploratory paths into the unknown are rapidly explored and often discarded, much of the work around them is as well, including the software.
When you combine that understanding with an already resource-strapped environment, it shouldn’t be surprising at all that much of the work done around the science, be it a physical apparatus or something virtual like code, is duct-taped together and barely functional. To some degree that’s by design: it’s choosing where to focus your limited resources, which is on exploring and testing an idea.
Software is very rarely the end goal, just like in business. The difference in business is that if the software is viewed as a long-term asset, more time is spent trying to reduce long-term costs. In research and science, if something is very successful and becomes mature enough that it’s expected to stick around for a while, more mature code bases often emerge. Even then there’s not a lot of money out there to fund that work, but it does happen, though only after it’s proven to be worth the time investment.
The example given at the end is interesting:
> So iCloud says the video is 128MB, I download it and the video is actually 48MB, and my free storage increases by ~170MB when I deleted it. Interesting!
This suggests that iCloud isn't simply misrepresenting the size of the example file, as then you'd expect that deleting the 128MB file would clear ~128MB of iCloud space. Instead, the deletion clears roughly the space it reports (128MB) plus the space of the downloaded version (48MB): 128MB + 48MB = 176MB, which might be close enough, allowing for rounding errors, since iCloud reports the free space (in the article's example) to the nearest 10MB.
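The arithmetic can be sketched as a quick check (a minimal sketch of the reconciliation described above, assuming the freed space equals the reported cloud size plus the downloaded copy, and that free space is rounded to the nearest 10MB):

```python
# Hypothetical reconciliation of the iCloud numbers from the quoted example.
reported_cloud_size = 128   # MB, what iCloud claims the video occupies
downloaded_size = 48        # MB, actual size of the downloaded file
observed_freed = 170        # MB, approximate increase in free storage on deletion

# Hypothesis: deletion frees the reported size plus the downloaded copy.
expected_freed = reported_cloud_size + downloaded_size  # 176 MB

# Free space is reported to the nearest 10MB, so allow that much slack.
rounding = 10  # MB
consistent = abs(expected_freed - observed_freed) <= rounding
print(expected_freed, consistent)  # 176 True
```

Under those assumptions the observed ~170MB and the expected 176MB agree within the reporting granularity, which supports the double-counting explanation.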
Ultimately you’re increasingly tethered to some storage service that you pay for periodically based on total storage, yet you have little-to-no information about how to best optimize that storage if you want to stay in a fixed cost bracket or lower your storage-to-cost ratio. So as a consumer, do I just wave my hands and keep throwing more and more money at the problem? Especially now that devices increasingly push everything, including storage, as a subscription service to meet my actual functional needs, needs that realistically could be met by local storage options if manufacturers didn’t have a vested interest in pushing me towards service-based storage.
The modern business strategy in technology is simply hiding behind complexity: the cost is too complex for you to understand, it would give away too much about our internals to competitors, and so on. Yet somehow these metrics get derived internally to assure the business is operating above cost, because when the rubber meets the road it must be done; when the consumer wants to understand, though, it’s suddenly too complex. The problem is that tech in many cases is growing to scales that really are too complex, and business managers know this, so it’s often a valid excuse to hide behind. Conveniently, that’s also where they focus investment and pad margins.
Most fields are still left with piles and piles of potential solutions to sort through. They often select the candidates that are cheapest and most practical to pursue, or those they strongly suspect will succeed. At the end of the day, though, we don’t have full-universe simulators at every scale we’d want; we have simulators for very specific areas within very specific bounds. You have to go out and empirically test these things.
But this has already been going on for decades across most disciplines I’ve interacted with; they just weren’t using DNNs or LLMs at the time. These domains are adopting those too, to leverage where feasible in the search process.
I work with a variety of people interested in leveraging simulation, and everyone wants to take the successes they see in LLMs, or in RL with AlphaStar or AlphaGo, and apply them in their own domain. It’s alluring, I get it. The issue is that we often lack enough real understanding in these domains: the science isn’t as airtight as people think it is, it’s too general or too narrow, or in some cases we have a good idea of how to build better, more accurate simulations but there isn’t enough compute power or energy in the world to make them practical yet. So we have to accept tradeoffs and live with less accurate, less detailed simulation, which leads to inaccurate representations of reality and ultimately inaccurate candidate solutions.
Overall, in any sort of cost/benefit analysis, the cost is now so low that the benefits don’t have to amount to much of anything, if anything at all. Entertainment value alone, boredom, or a passing curiosity to try something are enough to create false or misleading information and push it out to the public, creating noise that must be filtered through. There are plenty of other, far stronger motives that make the problem even worse.
Misinformation and disinformation were already becoming an increasingly large societal issue, IMHO, and that is only going to get worse with wide access to generative AI. We already have a high degree of erosion in social trust, where we pretty much have to consider the motives and driving forces behind every transactional relationship we have these days. At least we could once use costs to help sort that mess out: why would someone bother investing the resources to do this? Does it cost a lot to present me with false information, and if so, is there enough potential motive behind it to make the information more likely to be false or misleading?
Increasingly, that cost heuristic no longer helps. It’s now far more difficult to start from a position of distrust and move to a point of trust, or likelihood of trust, and I think we’re going to see that in even more aspects of daily life. I now have to assume most pieces of information out there are targeting me and attempting to manipulate me in some way (more than before). I fear we’re moving to a model of free speech that puts more weight on “authoritative” sources than in the recent past, in many cases based on the liabilities authorities face when presenting false, misleading, or inaccurate information. Liabilities that in many cases aren’t real, just perceived, granting authoritative information sources far more credit than is due.