如何找到R数据帧的分组汇总统计信息?
为了比较不同的组,我们需要每个组的摘要统计信息。它有助于我们观察两组之间的差异。摘要统计信息提供最小值,第一四分位数,中位数,第三四分位数和最大值。因此,我们可以比较每个组的这些值。要找到R数据帧的逐组汇总统计信息,我们可以使用tapply函数。
示例
请看以下数据帧-
> set.seed(99)
> x1<-sample(1:100,50,replace=TRUE)
> x2<-rep(c("G1","G2","G3","G4","G5"),times=10)
> df<-data.frame(x1,x2)
> head(df,20)
x1 x2
1 48 G1
2 33 G2
3 44 G3
4 22 G4
5 99 G5
6 62 G1
7 98 G2
8 32 G3
9 13 G4
10 20 G5
11 100 G1
12 31 G2
13 68 G3
14 9 G4
15 82 G5
16 88 G1
17 30 G2
18 86 G3
19 84 G4
20 32 G5查找每个组的x1的摘要统计量-
> tapply(df$x1, df$x2, summary) $G1 Min. 1st Qu. Median Mean 3rd Qu. Max. 14.0 55.0 72.0 67.8 86.5 100.0 $G2 Min. 1st Qu. Median Mean 3rd Qu. Max. 4.0 31.5 60.5 52.4 69.5 98.0 $G3 Min. 1st Qu. Median Mean 3rd Qu. Max. 14.0 33.5 41.0 46.9 64.5 86.0 $G4 Min. 1st Qu. Median Mean 3rd Qu. Max. 9.00 23.75 53.00 53.30 82.75 97.00 $G5 Min. 1st Qu. Median Mean 3rd Qu. Max. 7.00 31.25 32.00 42.40 44.75 99.00
让我们再看一个例子-
> y1<-rep(c(letters[1:5]),times=5) > y2<-rep(c(14,25,13,12,41,52,44,28,17,30),times=c(2,5,3,3,1,5,1,2,2,1)) > df_y<-data.frame(y1,y2) > head(df_y,20) y1 y2 1 a 14 2 b 14 3 c 25 4 d 25 5 e 25 6 a 25 7 b 25 8 c 13 9 d 13 10 e 13 11 a 12 12 b 12 13 c 12 14 d 41 15 e 52 16 a 52 17 b 52 18 c 52 19 d 52 20 e 44 > tapply(df_y$y2, df_y$y1, summary) $a Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 14.0 25.0 26.2 28.0 52.0 $b Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 14.0 25.0 26.2 28.0 52.0 $c Min. 1st Qu. Median Mean 3rd Qu. Max. 12.0 13.0 17.0 23.8 25.0 52.0 $d Min. 1st Qu. Median Mean 3rd Qu. Max. 13.0 17.0 25.0 29.6 41.0 52.0 $e Min. 1st Qu. Median Mean 3rd Qu. Max. 13.0 25.0 30.0 32.8 44.0 52.0
热门推荐
10 祝福语对联文案简短大气
11 简短祝福语中考女孩的话
12 项目建设春节祝福语简短
13 祝福语女友文案简短霸气
14 薛之谦祝福语简短
15 员工对同事祝福语简短
16 邻居女儿嫁人祝福语简短
17 新婚抖音祝福语简短
18 长辈体检怎么祝福语简短