-
Notifications
You must be signed in to change notification settings - Fork 274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Calling non standard functions (dplyr::n()) within map2() and across(). #1101
Comments
Could you please rework your reproducible example to use the reprex package ? That makes it easier to see both the input and the output, formatted in such a way that I can easily re-run in a local session. Thanks! |
Hi @hadley, I've updated my R to 4.3.2 (along with all older packages) to rule out any old packages/R problems. library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
library(purrr)
library(stringr)
value_fields <- c('Sepal.Length', 'Sepal.Width')
fnc_str <- c('sum', 'n')
pivot <- map2( value_fields,
fnc_str,
~ iris %>%
group_by( Species ) %>%
summarise( across(all_of(.x),
match.fun(.y),
.names = str_c("{.col}_", .y)),
.groups = 'drop')
) %>%
reduce(inner_join)
#> Error in `map2()`:
#> ℹ In index: 2.
#> Caused by error in `summarise()`:
#> ℹ In argument: `across(all_of(.x), match.fun(.y), .names =
#> str_c("{.col}_", .y))`.
#> ℹ In group 1: `Species = setosa`.
#> Caused by error in `across()`:
#> ! Can't compute column `Sepal.Width_n`.
#> Caused by error:
#> ! unused argument (Sepal.Width)
#> Backtrace:
#> ▆
#> 1. ├─... %>% reduce(inner_join)
#> 2. ├─purrr::reduce(., inner_join)
#> 3. │ └─purrr:::reduce_impl(.x, .f, ..., .init = .init, .dir = .dir)
#> 4. │ └─purrr:::reduce_init(.x, .init, left = left, error_call = .purrr_error_call)
#> 5. │ └─rlang::is_empty(x)
#> 6. ├─purrr::map2(...)
#> 7. │ └─purrr:::map2_("list", .x, .y, .f, ..., .progress = .progress)
#> 8. │ ├─purrr:::with_indexed_errors(...)
#> 9. │ │ └─base::withCallingHandlers(...)
#> 10. │ ├─purrr:::call_with_cleanup(...)
#> 11. │ └─global .f(.x[[i]], .y[[i]], ...)
#> 12. │ └─iris %>% group_by(Species) %>% ...
#> 13. ├─dplyr::summarise(...)
#> 14. ├─dplyr:::summarise.grouped_df(...)
#> 15. │ └─dplyr:::summarise_cols(.data, dplyr_quosures(...), by, "summarise")
#> 16. │ ├─base::withCallingHandlers(...)
#> 17. │ └─dplyr:::map(quosures, summarise_eval_one, mask = mask)
#> 18. │ └─base::lapply(.x, .f, ...)
#> 19. │ └─dplyr (local) FUN(X[[i]], ...)
#> 20. │ ├─base::withCallingHandlers(...)
#> 21. │ └─mask$eval_all_summarise(quo)
#> 22. │ └─dplyr (local) eval()
#> 23. └─base::.handleSimpleError(...)
#> 24. └─dplyr (local) h(simpleError(msg, call))
#> 25. └─rlang::abort(msg, call = call("across"), parent = cnd) Created on 2023-11-02 with reprex v2.0.2 |
Thanks. I made it a bit simpler to illustrate the key problem: library(tidyverse)
vars <- c('Sepal.Length', 'Sepal.Width')
funs <- c('sum', 'n')
map2(vars, funs, function(var, fun) {
iris |> summarise(across(all_of(var), match.fun(fun)))
})
#> Error in `map2()`:
#> ℹ In index: 2.
#> Caused by error in `summarise()`:
#> ℹ In argument: `across(all_of(var), match.fun(fun))`.
#> Caused by error in `across()`:
#> ! Can't compute column `Sepal.Width`.
#> Caused by error:
#> ! unused argument (Sepal.Width) Created on 2023-11-02 with reprex v2.0.2 You can resolve this just by wrapping library(tidyverse)
vars <- c('Sepal.Length', 'Sepal.Width')
funs <- list(sum, function(x) n())
map2(vars, funs, function(var, fun) {
iris |> summarise(across(all_of(var), fun))
})
#> [[1]]
#> Sepal.Length
#> 1 876.5
#>
#> [[2]]
#> Sepal.Width
#> 1 150 Created on 2023-11-02 with reprex v2.0.2 And then you can get better names by feeding across a named list: library(tidyverse)
vars <- c('Sepal.Length', 'Sepal.Width')
funs <- list(list(sum = sum), list(n = function(x) n()))
map2(vars, funs, function(var, fun) {
iris |> summarise(across(all_of(var), fun))
})
#> [[1]]
#> Sepal.Length_sum
#> 1 876.5
#>
#> [[2]]
#> Sepal.Width_n
#> 1 150 Created on 2023-11-02 with reprex v2.0.2 |
Thanks @hadley! |
I want to programmatically summarize/pivot a data.frame and found a nice solution here:
How do I build a dplyr summarize statement programmatically?
https://stackoverflow.com/questions/68853709/how-do-i-build-a-dplyr-summarize-statement-programmatically
A modified version of the example in the link is given below:
However, it does not work with the special dplyr function n().
The problem is because how the function
n()
is called within across.I found a work-around for how to treat
n()
from the following stackoverflow post.How to count rows by group with n() inside dplyr::across()?
https://stackoverflow.com/questions/66161658/how-to-count-rows-by-group-with-n-inside-dplyracross
library(dplyr)
This example does not work:
The solution of calling
~n()
instead of justn
works.I applied this solution to the map2() example, but it fails.
After a lot of experimenting I found a work-around by wrapping the match.fun in a lambda.
However, it does not work with functions that take parameters because no values are passed to them.
Next up, I tried to use an
if()
statement to callmatch.fun()
in the correct way depending on the number of parameters the summarizing function takes, but I end up with errors.I'm out of ideas. How to allow
map2()
andacross()
to run "normal" functions and the special n() that does not take any arguments?The text was updated successfully, but these errors were encountered: