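Understanding the distinction between an object and its name helps you:

  • More accurately predict the performance and memory usage of your code.
  • Write faster code by avoiding accidental copies, a major source of slow code.
  • Better understand R’s functional programming tools.

```r
x <- c(1, 2, 3)
```

Take the code above as an example. What does it do?

The obvious reading is that it creates an object named `x` that contains 1, 2, and 3.

In fact it does two things: it creates an object, the vector `c(1, 2, 3)`, and then it binds the name `x` to that object.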
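The binding idea can be checked directly. Below is a minimal sketch, assuming the lobstr package is installed (the notes already use its `obj_size()`); the second name `y` and the helper variables are introduced only for illustration. It compares the addresses the two names point to, before and after a modification.

```r
library(lobstr)

x <- c(1, 2, 3)
y <- x  # binds a second name to the same object; nothing is copied yet

# Both names point at the same object, so the addresses match.
same_object <- obj_addr(x) == obj_addr(y)

# Modifying through y creates a copy; x keeps pointing at the original.
y[[3]] <- 4
still_same <- obj_addr(x) == obj_addr(y)
```

Here `same_object` should be `TRUE` and `still_same` should be `FALSE`, while `x` keeps its original values.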

![image.png](https://cdn.nlark.com/yuque/0/2022/png/21391719/1655724559485-1ef1d75c-57e0-418b-8d5a-a617560949ae.png#clientId=u34e16cd8-cc84-4&crop=0&crop=0&crop=1&crop=1&from=paste&id=ubd63d0ef&margin=%5Bobject%20Object%5D&name=image.png&originHeight=153&originWidth=401&originalType=url&ratio=1&rotation=0&showTitle=false&size=9453&status=done&style=none&taskId=ud198fa29-7de9-4c18-96c5-3a93e9f364a&title=)

The figure above shows an example.

When a data frame is modified:

![image.png](https://cdn.nlark.com/yuque/0/2022/png/21391719/1655728628675-8142d557-e042-4eae-b92d-acf10ec37e50.png#clientId=u34e16cd8-cc84-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=332&id=ue5fb18b9&margin=%5Bobject%20Object%5D&name=image.png&originHeight=590&originWidth=531&originalType=url&ratio=1&rotation=0&showTitle=false&size=48694&status=done&style=none&taskId=u6871377e-27ea-4c6b-aa7f-0dcf273ba8b&title=&width=299)

This shows that modifying a column changes only that column; the other columns still point to their original objects.

![image.png](https://cdn.nlark.com/yuque/0/2022/png/21391719/1655728668066-9076da12-e641-42e1-95e9-f43a3e034f2e.png#clientId=u34e16cd8-cc84-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=219&id=uab145897&margin=%5Bobject%20Object%5D&name=image.png&originHeight=437&originWidth=814&originalType=url&ratio=1&rotation=0&showTitle=false&size=48595&status=done&style=none&taskId=u1cb2c14b-7d36-4b64-9500-2192f0030b5&title=&width=407)

Modifying a row, by contrast, changes every column, so every column is copied.
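The column versus row behaviour can be sketched as follows. This is an illustration only, assuming lobstr is installed; the data frames `d1`, `d2`, `d3` and the result variables are made-up names. `obj_addrs()` returns the address of each column, so comparing addresses shows which columns are still shared.

```r
library(lobstr)

d1 <- data.frame(x = c(1, 5, 6), y = c(2, 4, 3))

# Modify one column: only that column is replaced.
d2 <- d1
d2[, 2] <- d2[, 2] * 2
shared_after_column_edit <- unname(obj_addrs(d1) == obj_addrs(d2))

# Modify one row: every column changes, so every column is copied.
d3 <- d1
d3[1, ] <- d3[1, ] * 3
shared_after_row_edit <- unname(obj_addrs(d1) == obj_addrs(d3))
```

After the column edit, the first column should still be shared and the second should not; after the row edit, no column should be shared.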
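<a name="qZHMg"></a>
## Object size

The elements of a list are references to values, so a list can be much smaller than you might expect:

```r
library(lobstr)

x <- runif(1e6)
obj_size(x)
#> 8,000,048 B

y <- list(x, x, x)
obj_size(y)
#> 8,000,128 B

# y is only 80 bytes bigger than x: that is the size of an empty list with three elements
obj_size(list(NULL, NULL, NULL))
#> 80 B
```

Because R uses a global string pool, character vectors take up less memory than you might expect: repeating a string 100 times does not make it take up 100 times as much memory.

```r
banana <- "bananas bananas bananas"
obj_size(banana)
#> 136 B
obj_size(rep(banana, 100))
#> 928 B
```

For a sequence created with `:`, R does not store every number in the sequence; it stores only the first and last number. This means that every such sequence, no matter how large, is the same size:

```r
obj_size(1:3)
#> 680 B
obj_size(1:1e3)
#> 680 B
obj_size(1:1e6)
#> 680 B
obj_size(1:1e9)
#> 680 B
```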
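The sizes above follow from sharing, and that sharing can be inspected directly. Below is a minimal sketch, assuming lobstr is installed; the variable names are illustrative and not from the original notes. It checks that the list elements are references to the same vector, that `obj_size()` counts shared data only once, and that a repeated string resolves to a single entry in the string pool.

```r
library(lobstr)

x <- runif(1e6)
y <- list(x, x, x)

# Each list element is a reference to the very same vector that x names.
list_shares_x <- all(obj_addrs(y) == obj_addr(x))

# obj_size() counts shared data once, so x and y together are no bigger than y.
combined_equals_y <- as.numeric(obj_size(x, y)) == as.numeric(obj_size(y))

# All 100 elements of a repeated string point at one entry in the string pool.
banana <- "bananas bananas bananas"
bananas <- rep(banana, 100)
pool_entries <- length(unique(obj_addrs(bananas)))
```

This is why `obj_size(x) + obj_size(y)` overstates the memory actually used whenever values are shared.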

## Modify-in-place

Modifying an R object usually creates a copy. There are two exceptions:

  • Objects with a single binding get a special performance optimisation.
  • Environments, a special type of object, are always modified in place.

```r
v <- c(1, 2, 3)
v[[3]] <- 4
```

![image.png](https://cdn.nlark.com/yuque/0/2022/png/21391719/1655731297331-ba745aee-3eff-4b79-9d55-13ee4ac6792f.png#clientId=u34e16cd8-cc84-4&crop=0&crop=0&crop=1&crop=1&from=paste&id=uc05fc43a&margin=%5Bobject%20Object%5D&name=image.png&originHeight=153&originWidth=401&originalType=url&ratio=1&rotation=0&showTitle=false&size=9222&status=done&style=none&taskId=u21afc11a-c55a-41d2-a162-bd9dec4921d&title=)![image.png](https://cdn.nlark.com/yuque/0/2022/png/21391719/1655731301418-2738c02a-8280-474e-8285-c5f51b1cc1d0.png#clientId=u34e16cd8-cc84-4&crop=0&crop=0&crop=1&crop=1&from=paste&id=u6f99b2ea&margin=%5Bobject%20Object%5D&name=image.png&originHeight=153&originWidth=401&originalType=url&ratio=1&rotation=0&showTitle=false&size=8993&status=done&style=none&taskId=u94b3829e-e09b-4677-9230-f580cf9ddae&title=)

As the figures show, `v` is still bound to the same object, `0x207`.
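The environment exception is easy to observe with base R alone. The sketch below uses made-up names (`e1`, `e2`, `count`, `w`) and contrasts an environment, which is modified in place, with a vector that has two bindings and is therefore copied on modification.

```r
# Two names bound to the same environment: a change made through one
# name is visible through the other, because no copy is made.
e1 <- new.env()
e1$count <- 0
e2 <- e1
e2$count <- e2$count + 1
seen_through_e1 <- e1$count

# A vector with two bindings behaves differently: modifying through one
# name copies the vector, so the other name keeps the old values.
v <- c(1, 2, 3)
w <- v
w[[3]] <- 4
```

After this runs, `e1$count` reflects the update made through `e2`, whereas `v` is unchanged by the assignment to `w`.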
There is a question here about the speed of `for` loops that I did not fully understand at first. Subtracting the median from each column of a data frame in a loop is slow:

```r
x <- data.frame(matrix(runif(5 * 1e4), ncol = 5))
medians <- vapply(x, median, numeric(1))

# Trace copies of x; the trace lines below were produced by this call.
cat(tracemem(x), "\n")
#> <0x7f80c429e020>
for (i in seq_along(medians)) {
  x[[i]] <- x[[i]] - medians[[i]]
}
#> tracemem[0x7f80c429e020 -> 0x7f80c0c144d8]:
#> tracemem[0x7f80c0c144d8 -> 0x7f80c0c14540]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c14540 -> 0x7f80c0c145a8]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c145a8 -> 0x7f80c0c14610]:
#> tracemem[0x7f80c0c14610 -> 0x7f80c0c14678]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c14678 -> 0x7f80c0c146e0]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c146e0 -> 0x7f80c0c14748]:
#> tracemem[0x7f80c0c14748 -> 0x7f80c0c147b0]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c147b0 -> 0x7f80c0c14818]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c14818 -> 0x7f80c0c14880]:
#> tracemem[0x7f80c0c14880 -> 0x7f80c0c148e8]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c148e8 -> 0x7f80c0c14950]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c14950 -> 0x7f80c0c149b8]:
#> tracemem[0x7f80c0c149b8 -> 0x7f80c0c14a20]: [[<-.data.frame [[<-
#> tracemem[0x7f80c0c14a20 -> 0x7f80c0c14a88]: [[<-.data.frame [[<-
untracemem(x)

# Looping over the data frame as above is slow:
# the trace shows three copies per iteration.
y <- as.list(x)
cat(tracemem(y), "\n")
#> <0x7f80c5c3de20>
for (i in 1:5) {
  y[[i]] <- y[[i]] - medians[[i]]
}
#> tracemem[0x7f80c5c3de20 -> 0x7f80c48de210]:
# Looping over a list like this is fast: the trace shows a single copy.
```

Why the difference? In the R version that produced this trace, `[[<-.data.frame` is an ordinary R function, so calling it increments the reference count of `x` and each iteration ends up copying the data frame three times. Modifying a list uses internal C code that does not increment the references, so after the first copy the list is updated in place.
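In practice the faster pattern can be wrapped up as below. This is a minimal sketch in base R; `center_df()` and `center_via_list()` are hypothetical helper names, not functions from any package. It loops over a plain list and converts back to a data frame once at the end, then checks that both routes give the same result.

```r
# Hypothetical helper: centre each column by looping over the data frame itself.
center_df <- function(df) {
  medians <- vapply(df, median, numeric(1))
  for (i in seq_along(medians)) {
    df[[i]] <- df[[i]] - medians[[i]]
  }
  df
}

# Hypothetical helper: same computation, but loop over a plain list
# and convert back to a data frame once at the end.
center_via_list <- function(df) {
  medians <- vapply(df, median, numeric(1))
  cols <- as.list(df)
  for (i in seq_along(cols)) {
    cols[[i]] <- cols[[i]] - medians[[i]]
  }
  as.data.frame(cols)
}

x <- data.frame(matrix(runif(5 * 1e4), ncol = 5))
same_result <- isTRUE(all.equal(center_df(x), center_via_list(x)))
```

How much time this saves depends on the R version and the size of the data, so it is worth measuring with `system.time()` or a benchmarking package before relying on it.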

## Unbinding and the garbage collector