I don't have a PhD in machine learning, but I have spent many years using it as a tool to solve problems. While the details here can get you a long way, without understanding feature engineering or feature selection, you will have a hard time building accurate models.
For any engineers looking for more on feature engineering after reading this, I maintain an open source library for automated feature engineering called Featuretools (https://github.com/featuretools/featuretools). We also have demos on our website (https://www.featuretools.com/demos) if you want to see it in action.
and the proprietary and newly launched Driverless AI (from H2O)
I'd argue that the core competency of Dropbox is its easy syncing. Dropbox wanted to get that to market quickly. If they had spent the time building out a data storage solution on their own, it would have meant months or years of work before they had a reliable product. Paying AWS means giving Amazon some premium, but it also means that you don't have to build out that item. It's not only about economies of scale and rapid demand. It's also about time to market.
I think it's a reasonable strategy to calculate out something along the lines of "we can pay Amazon $3N to store our data or store it ourselves for $N. However, it will take a year to build a reliable, distributed data store and we don't even know if customers want our product yet. So, let's build it on Amazon and if we get traction, we'll migrate."
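To make that $3N vs. $N reasoning a bit more concrete, here is a rough back-of-the-envelope sketch in Python. The dollar figures and the `breakeven_months` helper are made up purely for illustration; only the 3N-vs-N ratio comes from the scenario above, and none of this reflects Dropbox's actual numbers.

```python
def breakeven_months(cloud_monthly, inhouse_monthly, build_cost):
    """Months of in-house operation needed to recoup the one-time build cost."""
    monthly_savings = cloud_monthly - inhouse_monthly
    if monthly_savings <= 0:
        return None  # in-house is not cheaper per month, so it never pays off
    return build_cost / monthly_savings


# All dollar amounts are hypothetical; only the 3N-vs-N ratio is from the comment.
N = 100_000                # assumed in-house storage cost per month
cloud_monthly = 3 * N      # "pay Amazon $3N"
build_cost = 2_400_000     # assumed cost of a year spent building the data store

months = breakeven_months(cloud_monthly, N, build_cost)
print(f"Break-even after {months:.0f} months of running in-house")
```

Note that this only covers the storage bill. It leaves out the cost of launching a year later, which is the part of the trade-off that is hardest to put a number on.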
S3 is a value-added service and creating your own S3 means sinking time into it. Even though data storage is very near to Dropbox's core competency, it's really the syncing that was the selling point of Dropbox. To get that syncing product in front of customers as fast as possible, leveraging S3 made a lot of sense. It gave them a much faster time to market.
As time went on, they gained traction and S3 costs mounted, so it made sense for them to start investing in their own data storage.
It's about figuring out what's important (the syncing is the product) and figuring out what will help you go to market fast (S3) and figuring out how to lower costs after you have traction (transitioning to in-house storage).
Yes, a lot of companies use cloud services when they don't need them. However, Google Cloud's compute pricing is reasonably similar to DigitalOcean's (with sustained usage discounts) and from what I hear these companies will often negotiate discounts. AWS can seem a bit pricey compared to alternatives, but I'm guessing that Amazon offers just enough discounts to large customers that, when they weigh the cost of running their own stuff plus the cost of migration, Amazon doesn't look so bad.
Still, when you're trying to go to market, you don't want to be distracted building pieces that customers don't care about when you can rent them from Amazon for reasonable rates. You haven't even proven that someone wants your product yet and your time is better spent on delivering what the customers want rather than infrastructure that saves costs. As you mature as a company, the calculus can change and Dropbox seems to have hit that transition quite well.
Given Dropbox's storage use case, what would the percentage savings have been if Dropbox had gone with Google or DigitalOcean instead?