How to find group-wise summary statistics for an R data frame?
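To compare groups, we need summary statistics for each one, since they show how the groups differ. R's summary function reports the minimum, first quartile, median, mean, third quartile, and maximum, so each of these values can be compared across groups. To find group-wise summary statistics for an R data frame, we can use the tapply function.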
Example
Consider the following data frame:
> set.seed(99)
> x1<-sample(1:100,50,replace=TRUE)
> x2<-rep(c("G1","G2","G3","G4","G5"),times=10)
> df<-data.frame(x1,x2)
> head(df,20)
    x1 x2
1   48 G1
2   33 G2
3   44 G3
4   22 G4
5   99 G5
6   62 G1
7   98 G2
8   32 G3
9   13 G4
10  20 G5
11 100 G1
12  31 G2
13  68 G3
14   9 G4
15  82 G5
16  88 G1
17  30 G2
18  86 G3
19  84 G4
20  32 G5
Finding the summary statistics of x1 for each group:
> tapply(df$x1, df$x2, summary)
$G1
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   14.0    55.0    72.0    67.8    86.5   100.0

$G2
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    4.0    31.5    60.5    52.4    69.5    98.0

$G3
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   14.0    33.5    41.0    46.9    64.5    86.0

$G4
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   9.00   23.75   53.00   53.30   82.75   97.00

$G5
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   7.00   31.25   32.00   42.40   44.75   99.00
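Because tapply returns one summary per group as a list, comparing groups side by side can be easier once the summaries are stacked into a single matrix. The following is a minimal sketch of one way to do that with base R's do.call and rbind, assuming the same df as above; the names group_summaries and summary_table are illustrative only and do not come from the original example.

```r
# Rebuild the example data frame used above
set.seed(99)
x1 <- sample(1:100, 50, replace = TRUE)
x2 <- rep(c("G1", "G2", "G3", "G4", "G5"), times = 10)
df <- data.frame(x1, x2)

# One summary object per group (a list-like array named by group)
group_summaries <- tapply(df$x1, df$x2, summary)

# Stack the summaries: one row per group, one column per statistic
# (summary_table is a hypothetical name chosen for this sketch)
summary_table <- do.call(rbind, group_summaries)
print(summary_table)
```

Each row of the resulting matrix corresponds to a group and each column to a statistic, so a single column such as summary_table[, "Median"] can be read off directly.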
Let's look at another example:
> y1<-rep(c(letters[1:5]),times=5)
> y2<-rep(c(14,25,13,12,41,52,44,28,17,30),times=c(2,5,3,3,1,5,1,2,2,1))
> df_y<-data.frame(y1,y2)
> head(df_y,20)
   y1 y2
1   a 14
2   b 14
3   c 25
4   d 25
5   e 25
6   a 25
7   b 25
8   c 13
9   d 13
10  e 13
11  a 12
12  b 12
13  c 12
14  d 41
15  e 52
16  a 52
17  b 52
18  c 52
19  d 52
20  e 44
> tapply(df_y$y2, df_y$y1, summary)
$a
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   12.0    14.0    25.0    26.2    28.0    52.0

$b
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   12.0    14.0    25.0    26.2    28.0    52.0

$c
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   12.0    13.0    17.0    23.8    25.0    52.0

$d
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   13.0    17.0    25.0    29.6    41.0    52.0

$e
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   13.0    25.0    30.0    32.8    44.0    52.0
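When only one statistic is needed, the same tapply call can take a different function in place of summary. As a small sketch, assuming the df_y data from the example above, passing mean returns a named vector of group means rather than a list; group_means is an illustrative name, not part of the original example.

```r
# Rebuild the second example data frame
y1 <- rep(c(letters[1:5]), times = 5)
y2 <- rep(c(14, 25, 13, 12, 41, 52, 44, 28, 17, 30),
          times = c(2, 5, 3, 3, 1, 5, 1, 2, 2, 1))
df_y <- data.frame(y1, y2)

# Apply mean to y2 within each level of y1
# (group_means is a hypothetical name chosen for this sketch)
group_means <- tapply(df_y$y2, df_y$y1, mean)
print(group_means)
#    a    b    c    d    e
# 26.2 26.2 23.8 29.6 32.8
```

These values agree with the Mean column of the per-group summaries shown above.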