tidyfst package - Operations within groups - R language

group_dt

Description

Carry out data manipulation within the specified groups.

Usage

group_dt(.data, by = NULL, ...)

rowwise_dt(.data, ...)

Arguments

.data	A data.frame
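by	Variables to group by: the unquoted name of one grouping variable, or a list of unquoted names of grouping variables.
...	Any data manipulation arguments that could be implemented on a data.frame.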

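Details

If you want to use summarise_dt or mutate_dt inside group_dt, it is better to use the "by" parameter of those functions instead. That is much faster, because it avoids going through .SD (which takes extra time to copy).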
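The cost described above can be illustrated with plain data.table, on which tidyfst is built. The following is a minimal sketch, not the package's actual implementation: it assumes data.table is installed, and the names `dt`, `direct`, and `via_sd` are made up for illustration. Both calls return the same group means, but the second one has to build the per-group subset .SD before aggregating.

```r
library(data.table)

dt <- as.data.table(mtcars)

# Grouping passed straight to `by`: the columns are aggregated directly.
direct <- dt[, .(disp = mean(disp), hp = mean(hp)), by = cyl]

# Same aggregation routed through .SD: the subset of rows for each
# group is materialised first, then summarised.
via_sd <- dt[, .SD[, .(disp = mean(disp), hp = mean(hp))], by = cyl]

print(direct)
print(via_sd)
```

On a table as small as mtcars the difference is negligible; it is expected to matter mainly when there are many groups.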

Examples

library(tidyfst)
iris %>% group_dt(by = Species, slice_dt(1:2)) # equivalent to
iris %>% group_by_dt(Species) %>% group_exe_dt(head(2))
iris %>% group_dt(Species,filter_dt(Sepal.Length == max(Sepal.Length)))
iris %>% group_dt(Species,summarise_dt(new = max(Sepal.Length)))
# you can chain several operations with the pipe inside `group_dt`
iris %>% group_dt(Species,
                  mutate_dt(max= max(Sepal.Length)) %>%
                    summarise_dt(sum=sum(Sepal.Length)))
# users familiar with data.table can work on .SD directly
# the following code gets the first and last row of each group
iris %>%
  group_dt(
    by = Species,
    rbind(.SD[1],.SD[.N])
  )
# for summarise_dt, you can use "by" to calculate within the group
mtcars %>%
  summarise_dt(
    disp = mean(disp),
    hp = mean(hp),
    by = cyl
  )
# equivalent to
mtcars %>%
  group_dt(cyl,
           summarise_dt(
             disp = mean(disp),
             hp = mean(hp))
    )
mtcars %>%
  group_dt(by =.(vs,am),
           summarise_dt(avg = mean(mpg)))
# examples for `rowwise_dt`
df <- data.table(x = 1:2, y = 3:4, z = 4:5)
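# without rowwise_dt, mean() is computed over all values of x, y and z
df %>% mutate_dt(m = mean(c(x, y, z)))
# with rowwise_dt, the mean is computed separately for each row
df %>% rowwise_dt(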
  mutate_dt(m = mean(c(x, y, z)))
)
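The difference between the two `rowwise_dt` examples above can be checked by hand in plain data.table. This is a rough sketch that assumes data.table is installed; grouping by a row index is one common way to get row-wise behaviour and is not necessarily how rowwise_dt is implemented. The names `overall`, `by_row`, and `row` are illustrative only.

```r
library(data.table)

df <- data.table(x = 1:2, y = 3:4, z = 4:5)

# Without row grouping, mean() sees all six values at once:
# (1 + 2 + 3 + 4 + 4 + 5) / 6 = 19/6
overall <- df[, mean(c(x, y, z))]

# Grouping by a row index makes each call see a single row:
# row 1 gives (1 + 3 + 4) / 3 = 8/3, row 2 gives (2 + 4 + 5) / 3 = 11/3
by_row <- df[, .(m = mean(c(x, y, z))), by = .(row = seq_len(nrow(df)))]

print(overall)
print(by_row)
```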