See: https://mp.weixin.qq.com/s/67rjY7w-Uh0AfnaxNoik8Q

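    We previously covered running R scripts in the background. For long-running code, or for installing complex packages, that approach keeps the foreground console free:

    011. Running R scripts in the background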

    Install the package directly:

    remotes::install_github("lindeloev/job")

    PS: installing on Windows produced the following error:

    > remotes::install_github("lindeloev/job")
    Error: Failed to install 'unknown package' from GitHub:
      Line starting 'Config/testthat/edit ...' is malformed!

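    There is now a more convenient way: simply wrap the code in a function from the job package, and it runs in the background:

    job::job({
      # Build a 100 x 10 character matrix of random letters
      tmp <- matrix(sample(letters, 1000, replace = TRUE), ncol = 10)
    })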

    The general usage is:

    job::job({<your code>})

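    In effect, the manual steps have simply been turned into code:

    06. Running commands in the background from the command line with the job package - Figure 1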

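    To keep the results of the background run separate from those of the foreground session, so that neither contaminates the other, the job's variables can be stored in a new environment: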

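    # brm(), add_criterion() and hypothesis() come from the brms package;
    # `model` and `data` are assumed to exist in the current session.
    job::job(brm_result = {
      fit = brm(model, data)
      fit = add_criterion(fit, "loo")
      print(summary(fit))  # Show a summary in the job
      the_test = hypothesis(fit, "hp > 0")
    })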
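    A smaller sketch of the same pattern, which does not need brms, may make the returned environment easier to see. The name lm_result and the lm() model on the built-in mtcars data are illustrative choices, not part of the original example. The call is guarded because job::job() only works inside RStudio.

```r
# Sketch only: job::job() needs RStudio, so the call is guarded.
in_rstudio <- requireNamespace("job", quietly = TRUE) &&
  rstudioapi::isAvailable()

if (in_rstudio) {
  job::job(lm_result = {
    # Runs in a separate RStudio job; `lm_result` is a hypothetical name.
    fit <- lm(mpg ~ hp, data = mtcars)
    print(summary(fit))  # Printed in the job's own output pane
  })

  # Run these only after the job has finished. At that point `lm_result`
  # exists in the calling session as an environment holding the variables:
  # ls(lm_result)
  # coef(lm_result$fit)
}
```

    Because the variables live in lm_result rather than in the global environment, a variable called fit in the foreground session is left untouched.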

    For example, with several jobs:

    06. Running commands in the background from the command line with the job package - Figure 2

    Some further useful information from the package documentation:

    Finer control

    RStudio jobs spin up a new session, i.e., a new environment. By default, job::job() will make this environment identical to your current one. But you can fine-tune this:

    - import: the default "auto" setting imports all objects that are referenced by the code into the job. Control this using job::job({}, import = c(model, data)). You can also import everything (import = "all") or nothing (import = NULL).
    - packages: by default, all attached packages are attached in the job. Control this using job::job({}, packages = c("brms")) or set packages = NULL to load nothing. If brms is not loaded in your current session, adding library("brms") to the job code may be more readable.
    - options: by default, all options are overwritten/inserted to the job. Control this using, e.g., job::job({}, opts = list(mc.cores = 2)) or set opts = NULL to use default options. If you want to set job-specific options, adding options(mc.cores = 2) to the job code may be more readable.
    - export: in the example above, we assigned the job environment to brm_result upon completion. Naturally, you can choose any name, e.g., job::job(fancy_name = {a = 5}). To return nothing, use an unnamed code chunk (which inserts results into globalenv()) and remove everything before returning: job::job({a = 5; rm(list=ls())}). Returning nothing is useful when
      - your main result is text output or a file on disk, or
      - the return value is a very large object. The underlying rstudioapi::jobRunScript() is slow in the back-transfer, so it is usually faster to saveRDS(obj, filename) in the job and readRDS(filename) into your current session.
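    The save-to-disk advice for large results could look roughly like the sketch below. The file path, the heavy_task() function, and the matrix it builds are hypothetical stand-ins. The job::job() call is guarded because it needs RStudio; outside RStudio the sketch falls back to doing the same round trip in the current session so that it still runs.

```r
# Hypothetical output path; the job receives this string via the default
# import = "auto" because the job code references it.
result_file <- file.path(tempdir(), "big_result.rds")

# Hypothetical stand-in for an expensive computation with a large result.
heavy_task <- function() {
  matrix(seq_len(1e4), ncol = 10)
}

in_rstudio <- requireNamespace("job", quietly = TRUE) &&
  rstudioapi::isAvailable()

if (in_rstudio) {
  job::job({
    saveRDS(heavy_task(), result_file)  # Write the result to disk in the job
    rm(list = ls())                     # Return nothing to the main session
  })
  # Once the job has finished, load the result in the main session:
  # big <- readRDS(result_file)
} else {
  # Fallback so the sketch runs outside RStudio: same round trip, no job.
  saveRDS(heavy_task(), result_file)
  big <- readRDS(result_file)
}
```

    The job itself returns nothing, so only the small path string crosses the session boundary and the large object travels through the file.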
    Some use cases

    - Model training, cross validation, or hyperparameter tuning: train multiple models simultaneously, each in their own job. If one fails, the others continue.
    - Heavy I/O tasks, like processing large files. Save the results to disk and return nothing.
    - Run unit tests and other code in an empty environment. By default, devtools::test() runs in the current environment, including manually defined variables (e.g., from the last test-run) and attached packages. Call job::job({devtools::test()}, import = NULL, packages = NULL, opts = NULL) to run the test in complete isolation.
    - Upgrading packages
    See also

    job::job() is aimed at easing interactive development within RStudio. For larger problems, production code, and solutions that work outside of RStudio, check out:

    - The future package's %<-% operator combined with plan(multisession).
    - The callr package is a general tool to run code in new R sessions.
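    For comparison, the two alternatives might be used as in the sketch below, which runs outside RStudio. The computation (summing 1 to 100) and the variable names are purely illustrative, and each part is skipped if the corresponding package is not installed.

```r
# callr: run a function in a separate background R process.
if (requireNamespace("callr", quietly = TRUE)) {
  bg <- callr::r_bg(function(n) sum(seq_len(n)), args = list(n = 100))
  bg$wait()                      # Block until the background process is done
  callr_res <- bg$get_result()   # Fetch the function's return value
  print(callr_res)
}

# future: evaluate an expression asynchronously in a background session.
if (requireNamespace("future", quietly = TRUE)) {
  library(future)
  plan(multisession)                       # Use background R sessions
  future_res %<-% { sum(seq_len(100)) }    # Starts evaluating without blocking
  print(future_res)                        # Blocks here until the value is ready
  plan(sequential)                         # Restore the default plan
}
```

    Unlike job::job(), neither approach shows progress in the RStudio jobs pane, but both work from a plain R console or a script.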