``` df.with_column( map_multiple( |columns| { let col1 = columns[0].i32()?; let col2 = columns[1].str()?; let col3 = columns[3].f64()?; col1.into_iter() .zip(col2) .zip(col3) .map(|((x1, x2), x3)| { let (x1, x2, x3) = (x1?, x2?, x3?); Some(func(x1, x2, x3)) }) .collect::<StringChunked>() .into_column() }, [col("a"), col("b"), col("c")], GetOutput::from_type(DataType::String), ) .alias("new_col"), ); ```
Not much polars can do about that in Rust, that's just what the language requires. But in Python it would look something like
``` df.with_columns( pl.struct("a", "b", "c") .map_elements( lambda row: func(row["a"], row["b"], row["c"]), return_dtype=pl.String ) .alias("new_col") ) ```
Obviously the performance is nowhere close to comparable because you're calling a python function for each row, but this should give a sense of how much cleaner Python tends to be.
I'm ignorant about the exact situation in Polars, but it seems like this is the same problem that web frameworks have to handle to enable registering arbitrary functions, and they generally do it with a FromRequest trait and macros that implement it for functions of up to N arguments. I'm curious if there are were attempts that failed for something like FromDataframe to enable at least |c: Col<i32>("a"), c2: Col<f64>("b")| {...}
https://github.com/tokio-rs/axum/blob/86868de80e0b3716d9ef39...
https://github.com/tokio-rs/axum/blob/86868de80e0b3716d9ef39...
1. There are no variadic functions so you need to take a tuple: `|(Col<i32>("a"), Col<f64>("b"))|`
2. Turbofish! `|(Col::<i32>("a"), Col::<f64>("b"))|`. This is already getting quite verbose.
3. This needs to be general over all expressions (such as `col("a").str.to_lowercase()`, `col("b") * 2`, etc), so while you could pass a type such as Col if it were IntoExpr, its conversion into an expression would immediately drop the generic type information because Expr doesn't store that (at least not in a generic parameter; the type of the underlying series is always discovered at runtime). So you can't really skip those `.i32()?` calls.
Polars definitely made the right choice here — if Expr had a generic parameter, then you couldn't store Expr of different output types in arrays because they wouldn't all have the same type. You'd have to use tuples, which would lead to abysmal ergonomics compared to a Vec (can't append or remove without a macro; need a macro to implement functions for tuples up to length N for some gargantuan N). In addition to the ergonomics, Rust’s monomorphization would make compile times absolutely explode if every combination of input Exprs’ dtypes required compiling a separate version of each function, such as `with_columns()`, which currently is only compiled separately for different container types.
The reason web frameworks can do this is because of `$( $ty: FromRequestParts<S> + Send, )*`. All of the tuple elements share the generic parameter `S`, which would not be the case in Polars — or, if it were, would make `map` too limited to be useful.