Creating vectors

Combine individual elements with c()

    > c(1,2,3)
    [1] 1 2 3
    > c("a","b","c","d","e")
    [1] "a" "b" "c" "d" "e"

Use the colon operator ":" for consecutive integers

    > 1:5
    [1] 1 2 3 4 5

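Use rep() for repeated values

Note the difference between the times and each arguments: times repeats the whole vector, while each repeats every element in turn.

    > rep(c("a","b"), times = 3)
    [1] "a" "b" "a" "b" "a" "b"
    > rep(c("a","b"), each = 3)
    [1] "a" "a" "a" "b" "b" "b"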
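
The two arguments can also be combined, and times accepts a vector. The sketch below is only an illustration; the variable names v and w are made up for this example.

```r
# each is applied first, then the resulting pattern is repeated `times` times.
v <- rep(c("a", "b"), each = 2, times = 3)
print(v)   # "a" "a" "b" "b" repeated three times, 12 elements in total

# A vector passed to `times` gives each element its own repeat count.
w <- rep(c("a", "b"), times = c(1, 3))
print(w)   # "a" "b" "b" "b"
```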

Use seq() for arithmetic sequences

The argument names from, to, and by can be omitted when the values are passed in that order.

    > seq(from=1,to=100,by=5)
    [1] 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96
    > seq(1,100,5)
    [1] 1 6 11 16 21 26 31 36 41 46 51 56 61 66 71 76 81 86 91 96
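
seq() can also be told how many values to produce instead of the step size, and it can count downwards. A minimal sketch, with the variable names s and d chosen only for illustration:

```r
# Five evenly spaced values from 0 to 1; R works out the step (0.25) itself.
s <- seq(from = 0, to = 1, length.out = 5)
print(s)   # 0.00 0.25 0.50 0.75 1.00

# A decreasing sequence needs a negative step.
d <- seq(10, 1, by = -3)
print(d)   # 10 7 4 1
```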

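Use rnorm() for normally distributed random numbers

rnorm() draws random values from a normal distribution. mean defaults to 0 and sd defaults to 1; change them as needed. The values differ on every run.

    > rnorm(n=5)
    [1] -0.1902086 -0.3159106 0.3200577 -1.3902890 0.8065010
    > rnorm(n=5,mean=100,sd=10)
    [1] 101.56500 99.52128 105.64515 77.00335 96.67940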
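
Because the output changes on every call, it can help to fix the random seed when results need to be reproducible. A small sketch using set.seed(); the seed value 42 and the names a and b are arbitrary choices for this example.

```r
set.seed(42)                              # any integer works; 42 is arbitrary
a <- rnorm(n = 5, mean = 100, sd = 10)

set.seed(42)                              # resetting the seed replays the same draws
b <- rnorm(n = 5, mean = 100, sd = 10)

print(identical(a, b))                    # TRUE: both calls produced the same numbers
```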

Building more complex vectors by combining functions

Use paste() and paste0(). The difference is that paste() separates the pieces with a space by default, while paste0() joins them with no separator.

    > paste(rep("sample",times=5),1:5)
    [1] "sample 1" "sample 2" "sample 3" "sample 4" "sample 5"
    > paste0(rep("sample",times=5),1:5)
    [1] "sample1" "sample2" "sample3" "sample4" "sample5"
    > # The separator used by paste() can be changed with sep:
    > paste(rep("sample",times=5),1:5,sep="-")
    [1] "sample-1" "sample-2" "sample-3" "sample-4" "sample-5"
    > paste(rep("sample",times=5),1:5,sep="")
    [1] "sample1" "sample2" "sample3" "sample4" "sample5"
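
Two related points are worth a quick sketch: paste() recycles a single string across the other argument, so rep() is optional here, and the collapse argument merges the result into one string. The names labels and joined are made up for this example.

```r
# "sample" is recycled against 1:3, so rep() is not strictly needed.
labels <- paste("sample", 1:3, sep = "_")
print(labels)   # "sample_1" "sample_2" "sample_3"

# collapse joins all elements of the vector into a single string.
joined <- paste(labels, collapse = ", ")
print(joined)   # "sample_1, sample_2, sample_3"
```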

Operations on a single vector

Simple statistical functions

    > x = c(1,3,5,1)
    > x
    [1] 1 3 5 1
    > max(x) # maximum
    [1] 5
    > min(x) # minimum
    [1] 1
    > mean(x) # mean
    [1] 2.5
    > median(x) # median
    [1] 2
    > var(x) # sample variance
    [1] 3.666667
    > sd(x) # sample standard deviation
    [1] 1.914854
    > sum(x) # sum
    [1] 10

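Other important, frequently used functions

    > length(x) # length: the number of elements in the vector
    [1] 4
    > unique(x) # remove duplicates
    [1] 1 3 5
    > duplicated(x) # has this element already appeared earlier in the vector?
    [1] FALSE FALSE FALSE TRUE
    > table(x) # count how often each value occurs
    x
    1 3 5
    2 1 1
    > sort(x) # sort in increasing order
    [1] 1 1 3 5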
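Operations on two vectors

Logical comparison

A logical comparison produces a logical vector of the same length.

    > x = c(1,3,5,1)
    > y = c(3,2,5,6)
    > x == y # is each element of x equal to the element of y at the same position?
    [1] FALSE FALSE TRUE FALSE
    > x %in% y # does each element of x occur anywhere in y?
    [1] FALSE TRUE TRUE FALSE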

Arithmetic

Addition, subtraction, multiplication, and division work element by element.

    > x + y
    [1] 4 5 10 7

Joining

Use paste() / paste0()

    > paste(x,y,sep=",")
    [1] "1,3" "3,2" "5,5" "1,6"

Intersection, union, and set difference

    > intersect(x,y)
    [1] 3 5
    > union(x,y)
    [1] 1 3 5 2 6
    > setdiff(x,y) # elements of x that are not in y
    [1] 1
    > setdiff(y,x) # elements of y that are not in x
    [1] 2 6

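When the two vectors differ in length

Recycling: the shorter vector is repeated until it matches the length of the longer one, and the comparison is then made element by element.

    > x = c(1,3,5,6,2)
    > y = c(3,2,5)
    > x == y # this triggers a warning
    [1] FALSE FALSE TRUE FALSE TRUE
    Warning message:
    In x == y : longer object length is not a multiple of shorter object length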
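
The warning only appears when the longer length is not a multiple of the shorter one. When it is a multiple, recycling happens silently, which is easy to overlook. A small sketch with made-up values:

```r
x <- c(1, 3, 5, 6)
y <- c(1, 6)

# y is recycled to 1 6 1 6 before the comparison; no warning is raised.
res <- x == y
print(res)        # TRUE FALSE FALSE TRUE

# A length-1 vector is recycled the same way, which is why x + 10 works.
shifted <- x + 10
print(shifted)    # 11 13 15 16
```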

Subsetting vectors

Subsetting by logical value

    > x <- 8:12
    > # keep the elements for which the condition is TRUE
    > x[x==10]
    [1] 10
    > x[x<12]
    [1] 8 9 10 11
    > x[x %in% c(9,13)]
    [1] 9

Subsetting by position

    > # positive indices select elements, negative indices drop them
    > x[4]
    [1] 11
    > x[2:4]
    [1] 9 10 11
    > x[c(1,5)]
    [1] 8 12
    > x[-4]
    [1] 8 9 10 12
    > x[-(2:4)]
    [1] 8 12
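
The two styles connect: conditions can be combined with & and |, and which() turns a logical vector into the positions that are TRUE. A brief sketch; the names middle and pos are only for illustration.

```r
x <- 8:12

# Combine two conditions with & (element-wise "and").
middle <- x[x > 8 & x < 12]
print(middle)   # 9 10 11

# which() returns the positions where the condition holds.
pos <- which(x > 10)
print(pos)      # 4 5
print(x[pos])   # 11 12, the same result as x[x > 10]
```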

Modifying one or more elements of a vector

The principle is: subset, then assign.

Changing one element

    > x <- 8:12
    > x
    [1] 8 9 10 11 12
    > x[4] <- 40
    > x
    [1] 8 9 10 40 12

Changing several elements

    > x
    [1] 8 9 10 40 12
    > x[c(1,5)] <- c(80,20)
    > x
    [1] 80 9 10 40 20
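
The same subset-then-assign idea works with a logical condition instead of positions. A minimal sketch that caps every value above 30; the threshold 30 is an arbitrary choice for this example.

```r
x <- c(80, 9, 10, 40, 20)

# Select the elements greater than 30 and overwrite them with 30.
# The single value 30 is recycled across all selected positions.
x[x > 30] <- 30
print(x)   # 30 9 10 30 20
```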

Simple plots from vectors

    > k1 = rnorm(12);k1
    [1] -1.1376829 1.2146663 -1.5720444 0.4155029 0.1068474
    [6] 0.1332813 -0.1999525 0.3122200 0.4607502 0.5994738
    [11] 0.1071601 -0.5560571
    > k2 = rep(c("a","b","c","d"),each = 3);k2
    [1] "a" "a" "a" "b" "b" "b" "c" "c" "c" "d" "d" "d"
    > plot(k1) # scatter plot of the values against their index
    > boxplot(k1~k2) # box plot of k1 grouped by k2

[Figures: scatter plot from plot(k1); box plot from boxplot(k1~k2)]

The difference between order() and match()

order() returns the indices that put the vector in sorted order, that is, the original positions of the elements that sort() returns.

    > kids = c("jimmy","nicker","lucy","doodle","tony")
    > scores = c(100,59,73,95,45)
    > sort(scores)
    [1] 45 59 73 95 100
    > order(scores)
    [1] 5 2 3 4 1
    > kids[order(scores)]
    [1] "tony" "nicker" "lucy" "doodle" "jimmy"
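
To rank from highest to lowest, order() accepts decreasing = TRUE. A short sketch reusing the same data; the name ranked is made up for this example.

```r
kids   <- c("jimmy", "nicker", "lucy", "doodle", "tony")
scores <- c(100, 59, 73, 95, 45)

# Indices of scores from largest to smallest, used to reorder kids.
ranked <- kids[order(scores, decreasing = TRUE)]
print(ranked)   # "jimmy" "doodle" "lucy" "nicker" "tony"
```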

match(y, x) returns, for each element of y, its position in x. Indexing x with that result rearranges x into the order of y.

    > x <- c("A","B","C","D","E")
    > x[c(2, 4, 5, 1, 3)]
    [1] "B" "D" "E" "A" "C"
    > y <- c("B","D","E","A","C")
    > match(y,x)
    [1] 2 4 5 1 3
    > x[match(y,x)]
    [1] "B" "D" "E" "A" "C"
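
A common use of match() is looking up values in one vector by keys from another. The sketch below reuses the kids and scores data; the vector wanted and the name "anna" are hypothetical and added only to show what happens when there is no match.

```r
kids   <- c("jimmy", "nicker", "lucy", "doodle", "tony")
scores <- c(100, 59, 73, 95, 45)

# "anna" is a hypothetical name that does not occur in kids.
wanted <- c("lucy", "tony", "anna")

# Position of each wanted name in kids; NA where there is no match.
idx <- match(wanted, kids)
print(idx)           # 3 5 NA

# Indexing with NA yields NA, so missing names stay visible in the result.
print(scores[idx])   # 73 45 NA
```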