I like it though. It's very convenient.
To answer some questions here in one go - for the European Sovereign Cloud, EU laws always apply. The only people with operational control or access (physical or logical) are EU people in the EU, and decisions about how lawful orders are handled are also made by EU people in the EU. This is one of the biggest pieces of what it means to be a "Sovereign Cloud" and comes directly from the requirements of our customers. Another is that there are no technical dependencies on non-EU infrastructure.
Of course, another answer is that for data access it's also great to build systems like KMS, Nitro, Wickr, CMK encryption, etc., where we as the operator simply have no access to customer data in the first place. And those protections stand too.
This IS a runtime.
You import bauplan, write your functions, and run them straight in the cloud - you don't need anything more. When you want to make a pipeline you chain the functions together, and the system manages the dependencies, the containerization, and the runtime, and gives you git-like abstractions over runs, tables, and pipelines.
Perhaps surprisingly, we decided to co-design the abstractions and the runtime, which allowed novel optimizations at the intersection of FaaS and data - e.g. rebuilding functions can be 15x faster than the corresponding AWS stack (https://arxiv.org/pdf/2410.17465). All capabilities are available to humans (CLI) and machines (SDK) through simple APIs.
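The chain-functions-into-a-pipeline idea can be sketched in plain Python. To be clear, this is a toy illustration of functions as pipeline steps whose dependencies are resolved by a tiny runtime - the `model` decorator, `run` entry point, and example function names are all invented here, not bauplan's actual API:

```python
# Toy sketch: functions declare upstream steps via their parameter
# names, and a tiny runtime resolves and executes the dependency chain.
# Illustrative only -- NOT bauplan's real API.

import inspect

REGISTRY = {}

def model(fn):
    """Register a pipeline step; its parameter names are its dependencies."""
    REGISTRY[fn.__name__] = fn
    return fn

def run(target):
    """Run `target`, executing its upstream steps first (memoized per run)."""
    results = {}
    def resolve(name):
        if name not in results:
            fn = REGISTRY[name]
            deps = [resolve(p) for p in inspect.signature(fn).parameters]
            results[name] = fn(*deps)
        return results[name]
    return resolve(target)

@model
def raw_numbers():
    return [1, 2, 3, 4]

@model
def doubled(raw_numbers):
    return [x * 2 for x in raw_numbers]

@model
def total(doubled):
    return sum(doubled)

print(run("total"))  # -> 20
```

A real system would add containerization, caching, and table/branch semantics on top, but the core ergonomic win is the same: you only write functions, and the chaining is inferred.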
Would love to hear the community’s thoughts on moving data engineering workflows closer to software abstractions: tables, functions, branches, CI/CD etc.
> (…)
> People who stress over code style, linting rules, or other minutia remain insane weirdos to me. Focus on more important things.
What you call “stressing over minutiae” others might call “caring for the craft”. Revered artisans are precisely the ones who care for the details. “Stressing” is your value judgement, not necessarily the ground truth.
What you’re essentially saying is “cherish the people who care up to the level I personally and subjectively think is right, and dismiss everyone who cares more as insane weirdos who cannot prioritise”.
In conversations like this, we are all too quick to project our own experiences onto package managers without sharing the circumstances in which we use them.
And even that first run is not particularly slow - _unless_ you depend on packages that are not available as wheels, which, last I checked, is not nearly as common nowadays as it was 10 years ago. It can still happen, though: for example, if you are working with Python 3.8 and you use the latest version of some fancy library, its maintainers may have already stopped building wheels for that version of Python. That means the package manager has to fall back to the sdist and actually run the build scripts just to acquire the metadata.
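To make the wheel-fallback scenario concrete, here is a simplified sketch of the compatibility check a resolver performs. Wheel filenames encode Python compatibility tags; real resolvers implement the full PEP 425 tag logic, and the `fancylib` filenames below are made up for illustration:

```python
# Sketch: why a resolver on Python 3.8 may be forced onto the sdist.
# Wheel filenames encode compatibility tags; if no published wheel's
# Python tag matches your interpreter, only the sdist path remains.
# (Simplified -- real resolvers use the full PEP 425 tag rules.)

def wheel_python_tags(wheel_filename):
    # Format: name-version(-build)?-pythontag-abitag-platformtag.whl
    parts = wheel_filename[:-len(".whl")].split("-")
    return parts[-3].split(".")  # e.g. "py2.py3" -> ["py2", "py3"]

def compatible(wheel_filename, interpreter_tag):
    tags = wheel_python_tags(wheel_filename)
    # "py3" wheels run on any Python 3.x; "cp38" only on CPython 3.8, etc.
    return interpreter_tag in tags or "py3" in tags

published = [  # hypothetical release that dropped 3.8 support
    "fancylib-2.0.0-cp311-cp311-manylinux_2_17_x86_64.whl",
    "fancylib-2.0.0-cp312-cp312-manylinux_2_17_x86_64.whl",
]

print(any(compatible(w, "cp38") for w in published))   # -> False (sdist fallback)
print(any(compatible(w, "cp311") for w in published))  # -> True
```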
On top of all this, private package feeds (like the one provided by azure devops) sometimes don't provide a metadata API at all, meaning the package manager has to download every single package just to get the metadata.
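The reason a missing metadata API is so costly is that a wheel's dependency declarations live in a METADATA file *inside* the archive. The sketch below builds a minimal in-memory stand-in for a downloaded wheel (the `fancylib` contents are made up) and reads its `Requires-Dist` lines, which is essentially what the resolver must do once it has the whole file:

```python
# Sketch: dependency info (Requires-Dist) lives inside the wheel archive,
# so a feed with no separate metadata endpoint forces a full download
# just to read a few header lines.

import io
import zipfile

# Minimal in-memory stand-in for a downloaded wheel (contents invented).
metadata = (
    "Metadata-Version: 2.1\n"
    "Name: fancylib\n"
    "Version: 2.0.0\n"
    "Requires-Dist: requests (>=2.0)\n"
    "Requires-Dist: numpy\n"
)
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("fancylib-2.0.0.dist-info/METADATA", metadata)

def requires_dist(wheel_bytes):
    """Pull the dependency declarations out of a wheel archive."""
    with zipfile.ZipFile(io.BytesIO(wheel_bytes)) as zf:
        meta_name = next(n for n in zf.namelist()
                         if n.endswith(".dist-info/METADATA"))
        lines = zf.read(meta_name).decode().splitlines()
    return [l.split(":", 1)[1].strip() for l in lines
            if l.startswith("Requires-Dist:")]

print(requires_dist(buf.getvalue()))  # -> ['requests (>=2.0)', 'numpy']
```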
The important bit of my little wall of text, though, is that all of this is true for every other package manager as well. You can't necessarily attribute slow dependency resolution to a solver being written in C++ versus pure Python, given all these other compounding factors, which are often overlooked.
https://en.wikipedia.org/wiki/Literacy_in_the_United_States