Here's a bit of code review that's completely beside the point of the article, since I can't help myself. If you can set up the earlier steps to give you a named list for `col_grouping`, and use `lapply`, the code is a little more concise:
efficient_flow_agg <- function(dat, col_grouping, gpcol_name = "GroupMembership") {
  make_postproc <- function(gp, groups) {
    gp$preproc(dat[gp$which_cols]) |>
      lapply(collapse::BY, groups, gp$aggfun) |>
      gp$postproc()
  }
  col_grouping |>
    lapply(make_postproc, groups = dat[[gpcol_name]]) |>
    as.data.frame()
}
* I had previously written here that `tapply` is probably faster, but apparently `tapply` does exactly `unlist(lapply(split(x, g), f))` anyway? wtf R. Strange there's not something like `collapse::BY` in base R.
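For anyone who wants to check the `tapply` claim, here is a minimal base R sketch. The vectors `x` and `g` are made-up toy data, and this only compares results for a simple scalar aggregation like `sum`; it says nothing about speed or about what `tapply` does internally in every case.

```r
# Toy data: values and a grouping vector (made up for illustration).
x <- c(1, 2, 3, 4, 5, 6)
g <- c("a", "b", "a", "b", "a", "b")

# Grouped sum via tapply().
via_tapply <- tapply(x, g, sum)

# The same thing spelled out as split / lapply / unlist.
via_split <- unlist(lapply(split(x, g), sum))

# Same group names and same values; tapply() just returns a 1-d array
# rather than a plain named vector.
same_names <- identical(names(via_tapply), names(via_split))
same_values <- identical(as.vector(via_tapply), unname(via_split))
```

Both `same_names` and `same_values` come out `TRUE` here, so for this kind of aggregation the two spellings are interchangeable.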
That is, time is a cost. Commuting takes time. An hour in and an hour out is a 25% increase in time devoted to work.
Commuting also has direct cost: fuel and/or transportation, risk of accident, stress, etc. Commuting also limits where you can live, and the taxes you pay. WFH does not.
For some people, less is in fact more. To frame it as an absolute "lower pay" is naive, if not irresponsible.
Why not do a piece that walks people through the hard and soft considerations? That would be more beneficial than parroting a shallow - and perhaps false - narrative.
I will post for a senior level position and get 200 resumes with zero experience. Just an undergrad or grad degree, but never in a relevant technology/skill set. Maybe two will fit the bill. By the time I get ahold of them they already have another job.
If I post a junior level position, for someone just out of college or a code school…crickets. No one applies.
However, both patterns are another special case of how identifiers are resolved in the expression. Aren't `.env` and `.data` both valid variable and column names? So what happens if I have a column named `.data`?
Another example, which is the reason why we chose the `:column` style to refer to columns in `DataFramesMeta.jl` and `DataFrameMacros.jl`:
What happens if you have the expression `mutate(df, b = log(a))`? Both `log` and `a` are symbols, but `log` is not treated as a column. Maybe that's because it's used in a function-like fashion? Maybe because R looks at the value of `log` and `a` in their scope and sees that `log` is a function and `a` isn't?
In Julia DataFrames, it's totally valid to have a column that stores different functions. With the dplyr-like syntax rules it would not be possible to express a function call with a function stored in a column, if the pattern really is that function syntax means a symbol is not looked up in the dataframe anymore.
In Julia DataFrameMacros.jl for example, if you had a column named `:func` you could do `@transform(df, :b = :func(:a))` and it would be clear that `:func` resolves to a column.
This particular example might seem like a niche problem, but it's just one of these tradeoffs that you have to make when overloading syntax with a different meaning. I personally like it if there's a small rule set which is then consistently applied. I'd argue that's not always the case with dplyr.
Personally I'll happily take not being able to use those as column names if it means I can avoid always typing : before every in-data variable, but your comment gave me a better understanding of why it would be bad for some other person or scenario, perhaps where short term ease-of-use is lower on the list of priorities.
For your second example, it doesn't come up in R because a data frame column cannot be a function. Columns must be vectors (including lists), and you could have a vector where one or all elements are functions, but the column itself cannot be a function (functions are not vectors), so there's no ambiguity there. To call a function stored in your data frame you'd have to access an element of the column, and any access method, e.g. `[[` or `$`, would make the resulting set of characters invalid as the name of an object (without backticks, which would then disambiguate the intent).
library(dplyr)  # also provides tibble() and %>%
df <- tibble(x = list(function(x) x + 1))
df %>%
  mutate(y = x[[1]](3))
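The "functions are not vectors" point can also be checked without dplyr. This is just a base R sketch; the names `f`, `bare`, and `df` are made up for illustration, and it uses a plain `data.frame` rather than a tibble.

```r
f <- function(x) x + 1

is.vector(f)        # FALSE: a function is not a vector
is.vector(list(f))  # TRUE: a list holding a function is

# Using a bare function as a column is an error...
bare <- tryCatch(data.frame(x = f), error = function(e) "error")

# ...while a list column holding the function is fine.
df <- data.frame(id = 1)
df$x <- list(f)
result <- df$x[[1]](3)  # calls the stored function via element access
```

So the only way to reach the stored function is through something like `[[`, which can't be confused with a plain column name.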
Separate from dplyr, in R when you use `(` to call a function it searches only for functions by that name.
log <- 3
log(1)
# 0
frog <- 3
frog(3)
# Error in frog(3) : could not find function "frog"
log <- function(x) x^2
log(1)
# 1
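You can poke at that lookup rule directly with `get()`, which has a `mode` argument that restricts the search to functions. A small sketch, wrapped in `local()` so it doesn't clobber anything in your session; `res`, `plain`, and `fn` are just names I picked, and this approximates what the evaluator does for a call rather than reproducing it exactly.

```r
res <- local({
  log <- 3  # shadows base::log with a non-function value

  # Plain lookup finds the nearest binding, which is the number.
  plain <- get("log", envir = environment())

  # Restricting the lookup to functions skips that binding and keeps
  # walking up the enclosing environments until it finds base::log.
  fn <- get("log", envir = environment(), mode = "function")

  list(plain = plain, fn = fn)
})

res$plain   # the shadowing value
res$fn(1)   # the real log, applied to 1
```

That is why `log(1)` still works after `log <- 3`, while `frog(3)` fails: there is no function called `frog` anywhere on the search path to fall back to.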
What?
https://chatgpt.com/share/06d13092-e698-458c-a225-9ac93bf279...
> Who won the 2020 election?
> Joe Biden won the 2020 United States presidential election. He defeated the incumbent president, Donald Trump.