By "operator pushdown", I mean any ability to filter or map over the contents of the object on the server side in some way, sending only the results over the network to the client.
For example, say you have a huge CSV file of customer orders in a bucket. You might want to find the timestamp of all the orders which included a particular product. If all you can do is stream the whole file, then you need to do that, just to pick out a few timestamps. But you could imagine a kind of request where you say "only give me lines where the product ID is P01234, and only send the timestamp column". Perhaps you would express that as a pair of regular expressions, or a sed program, or a Lua script, or maybe the server would understand CSV and let you write something a bit like SQL. There are all sorts of ways it could be done. Providing a fully general way might be tricky, but it wouldn't need to be fully general to be useful.
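To make that concrete, here is a minimal sketch of the filter-plus-projection a server could run on the client's behalf. The column names (`timestamp`, `order_id`, `product_id`) and the sample rows are made up for illustration; the only point is that the rows are streamed and just the matching timestamps would cross the network.

```python
import csv
import io


def select_rows(lines, where_column, where_value, select_column):
    """Yield one column from the CSV rows whose where_column equals where_value.

    `lines` is any iterable of text lines, so the file is streamed, not loaded.
    """
    for row in csv.DictReader(lines):
        if row[where_column] == where_value:
            yield row[select_column]


# Hypothetical sample data standing in for the huge orders file.
ORDERS = io.StringIO(
    "timestamp,order_id,product_id\n"
    "2023-01-05T10:00:00Z,1001,P01234\n"
    "2023-01-05T10:02:00Z,1002,P09999\n"
    "2023-01-06T08:30:00Z,1003,P01234\n"
)

timestamps = list(select_rows(ORDERS, "product_id", "P01234", "timestamp"))
print(timestamps)
```

A real server would run something like `select_rows` next to the storage and send back only what it yields.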
I appreciate that if you want to do this sort of access frequently, you should probably be using a database, not object storage. But it seems like a very useful feature to layer on top of object storage, and one that feels like it should be fairly cheap to execute - the server has to do a small extra amount of computation, but then needs a lot less network bandwidth.
Despite it being an awesome feature I've been itching to use, I've never actually found a use for it beyond messing around. Most places where S3 Select might make sense seem to be subsumed (for my uses) by Athena. Athena has a rather large amount of conceptual and actual boilerplate to get up and running with, though, whereas S3 Select requires no upfront planning beyond building a fancy query string (or using their SDK wrappers).
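For a sense of what the "fancy query string" looks like, here is a rough sketch using boto3's `select_object_content` call. The bucket, key and column names (`order_time`, `product_id`) are hypothetical, and it assumes a CSV object with a header row. boto3 is imported inside `run_query` only so the two helper functions can be tried without the SDK or AWS credentials.

```python
def build_select_request(bucket, key, product_id):
    """Build keyword arguments for boto3's S3 select_object_content call."""
    # Single quotes are doubled so the value stays inside the SQL string literal.
    literal = product_id.replace("'", "''")
    return {
        "Bucket": bucket,
        "Key": key,
        "ExpressionType": "SQL",
        # Hypothetical column names; they must match the CSV header row.
        "Expression": (
            "SELECT s.order_time FROM S3Object s "
            f"WHERE s.product_id = '{literal}'"
        ),
        "InputSerialization": {"CSV": {"FileHeaderInfo": "USE"}},
        "OutputSerialization": {"CSV": {}},
    }


def collect_records(event_stream):
    """Join the byte payloads of the Records events into one string."""
    chunks = []
    for event in event_stream:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


def run_query(bucket, key, product_id):
    """Run the query against S3 (needs boto3 and AWS credentials)."""
    import boto3

    s3 = boto3.client("s3")
    response = s3.select_object_content(
        **build_select_request(bucket, key, product_id)
    )
    return collect_records(response["Payload"])
```

Calling `run_query("my-bucket", "orders.csv", "P01234")` would return the matching timestamps as CSV text, one per line, for that single object only.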
Where S3 Select is likely to become fiddly is anywhere multiple files are involved. Athena makes querying large collections of CSVs (etc) straightforward, and handles all the scheduling and results merging for you.
Given that modern Python means type annotations everywhere, the convenience edge between it and modern C# (which dispenses with much of the Java-esque boilerplate) is surprisingly thin, and the capabilities of the .NET runtime are far superior in many ways, making it quite an appealing alternative, especially for perf-sensitive stuff.
The N100 can be a fair step up compared to the RPi 5, but even the RK3588 already has 8 cores. It would be a shame if many of the current generation of exciting hackable x86 mini-platforms lock in at the N100, as it will feel obsolete years earlier than the N305.
I run/ran stuff on both, as well as various ARM SBCs and previous generations like J4125/N5XXX. Considering the core-count, RK3588 is still a better pick for many use-cases unless single-thread performance is that important. Benchmark comparison: https://bret.dk/intel-n100-a-challenge-to-arm/
The only thing I'd reserve judgement on is the tendency to throttle. I haven't got far enough to characterize it, but it's not clear how much value those extra cores will add over the N100 with TDP settings tweaked down in the BIOS, and if leaving the N305 to run at max TDP, heat/noise/cost/temperature-related instability may start to become an issue, especially when packing other hot components like a decent SSD into the tiny cases they come in.
Most recent example - converting a huge number of XML files to Parquet. I started very fast with Python + pyarrow, but when I realized that parallelizing execution would help enormously, I hit the GIL or pickling/unpickling/multiprocessing costs.
It did work in Python in the end, but I feel that writing it in Rust/C# (even though I don't know Rust beyond tutorials) would have been much more performant.
> pickling
Sounds like if this is the tooling and the task at hand, about the most complex things that should be passing through the pickler are partitioned lists of filenames rather than raw data. E.g. you can have each partition generate a Parquet file for combining in a final step (pyarrow.concat_tables() looks useful), or if it were some other format you were working with, potentially send flat arrays back to the parent process as giant bytestrings or similar.
This is not to say the limitations don't suck, just that very often there are simple approaches to avoid most of the pain.