Checking job creation

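In the previous sections we implemented our first task (the t1.ecf file). The t1.ecf script must be preprocessed to generate a job file. ecflow_server does this automatically when the task is about to run.

We can also check job creation before the suite definition is loaded into ecflow_server.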

Text

Checking job creation is only available through the Python API.

If ecflow_server cannot locate the ecf script, see the ecf file location algorithm.

Python

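Job creation can be checked before the suite definition is loaded into the server. The check involves:

  1. Locating the ecf script file that corresponds to each task in the suite definition
  2. Preprocessing those files

When the suite definition is long and contains many ecf scripts, this check can save a great deal of time.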
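The two steps above can be pictured with a deliberately simplified sketch. It is not ecFlow's implementation: it assumes the script sits at ECF_HOME/<suite>/<task>.ecf, handles only %NAME% variable substitution, and ignores %include directives and the full file location algorithm. The helper names find_script and substitute are hypothetical.

```python
import os
import re
import tempfile


def find_script(ecf_home, task_path):
    """Step 1 (simplified): map a task path such as '/test/t1' to its .ecf file."""
    script = os.path.join(ecf_home, task_path.lstrip("/") + ".ecf")
    if not os.path.isfile(script):
        raise FileNotFoundError(f"no ecf script for {task_path}: {script}")
    return script


def substitute(text, variables):
    """Step 2 (simplified): replace each %NAME% and fail on an undefined variable."""
    def lookup(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined variable {name}")
        return variables[name]
    return re.sub(r"%(\w+)%", lookup, text)


# Demo in a throwaway directory that stands in for ECF_HOME.
with tempfile.TemporaryDirectory() as home:
    os.makedirs(os.path.join(home, "test"))
    with open(os.path.join(home, "test", "t1.ecf"), "w") as f:
        f.write('echo "I am part of a suite that lives in %ECF_HOME%"\n')

    script = find_script(home, "/test/t1")
    with open(script) as f:
        job_text = substitute(f.read(), {"ECF_HOME": home})

print(job_text)
```

A missing script or an undefined variable raises immediately, which is the kind of problem the real check is meant to surface before anything is loaded into the server.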

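Note the following points when checking job creation:

  1. The check is independent of ecflow_server, so ECF_PORT and ECF_NODE in the job file are set to their default values.
  2. The job file has the extension .job0, whereas job files generated by the server have the extension .job<1-n>, because on the server ECF_TRYNO is never 0.
  3. By default, the job file is created in the same directory as the ecf script; see ECF_JOB in the glossary.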
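As a small illustration of points 2 and 3, the sketch below computes where the .job0 file would be expected. It assumes the default location (no ECF_JOB override) and an ecf script stored at ECF_HOME/<suite>/<task>.ecf; the helper name expected_job0_path is hypothetical.

```python
import os


def expected_job0_path(ecf_home, task_path):
    """Return the default .job0 path for a task such as '/test/t1'.

    Assumes the job file is written next to the ecf script under ecf_home.
    """
    return os.path.join(ecf_home, task_path.lstrip("/") + ".job0")


home = os.path.join(os.getenv("HOME", ""), "course")
job0 = expected_job0_path(home, "/test/t1")
status = "exists" if os.path.isfile(job0) else "not generated yet"
print(job0, status)
```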

Use ecflow.Defs.check_job_creation to perform the check. Modify test.py as follows:

  import os
  from ecflow import Defs, Suite, Task, Edit

  print("Creating suite definition")
  home = os.path.join(os.getenv("HOME"), "course")
  defs = Defs(
      Suite('test',
          Edit(ECF_HOME=home),
          Task('t1')))
  print(defs)

  print("Checking job creation: .ecf -> .job0")
  print(defs.check_job_creation())

  # We can assert, so that we only progress once job creation works
  # assert len(defs.check_job_creation()) == 0, "Job generation failed"
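The commented-out assertion suggests that check_job_creation returns a string that is empty when every job was created. Under that assumption, a small guard can stop a script as soon as the check reports a problem. The names require_job_creation and FakeDefs are hypothetical; FakeDefs only stands in for ecflow.Defs so that the sketch runs without ecFlow installed, and its error text is made up.

```python
def require_job_creation(defs):
    """Raise RuntimeError if defs.check_job_creation() reports any error.

    Assumes the method returns an error string that is empty on success.
    """
    errors = defs.check_job_creation()
    if errors:
        raise RuntimeError("Job generation failed:\n" + errors)


class FakeDefs:
    """Stand-in for ecflow.Defs, used only to exercise the guard."""

    def __init__(self, errors):
        self._errors = errors

    def check_job_creation(self):
        return self._errors


require_job_creation(FakeDefs(""))  # no error reported, so nothing is raised

try:
    require_job_creation(FakeDefs("illustrative error text"))
except RuntimeError as exc:
    print(exc)
```

In test.py the same guard would be called with the real defs object in place of FakeDefs.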

Run the script:

  $ python3 test.py
  Creating suite definition
  # 4.8.0
  suite test
    edit ECF_HOME '../../../build/course'
    task t1
  endsuite
  Checking job creation: .ecf -> .job0

After the script has run, the file t1.job0 is generated in the test directory, with the following contents:

  #!/bin/ksh
  set -e # stop the shell on first error
  set -u # fail when using an undefined variable
  set -x # echo script lines as they are executed
  set -o pipefail # fail if last(rightmost) command exits with a non-zero status

  # Defines the variables that are needed for any communication with ECF
  export ECF_PORT=3141 # The server port number
  export ECF_HOST=localhost # The host name where the server is running
  export ECF_NAME=/test/t1 # The name of this current task
  export ECF_PASS=vOW08rvF # A unique password
  export ECF_TRYNO=0 # Current try number of the task
  export ECF_RID=$$ # record the process id. Also used for zombie detection

  # Define the path where to find ecflow_client
  # make sure client and server use the *same* version.
  # Important when there are multiple versions of ecFlow
  export PATH=/usr/local/apps/ecflow/4.8.0/bin:$PATH

  # Tell ecFlow we have started
  ecflow_client --init=$$

  # Define a error handler
  ERROR() {
    set +e # Clear -e flag, so we don't fail
    wait # wait for background process to stop
    ecflow_client --abort=trap # Notify ecFlow that something went wrong, using 'trap' as the reason
    trap 0 # Remove the trap
    exit 0 # End the script
  }

  # Trap any calls to exit and errors caught by the -e flag
  trap ERROR 0

  # Trap any signal that may cause the script to fail
  trap '{ echo "Killed by a signal"; ERROR ; }' 1 2 3 4 5 6 7 8 10 12 13 15

  echo "I am part of a suite that lives in ../../../build/course"

  wait # wait for background process to stop
  ecflow_client --complete # Notify ecFlow of a normal end
  trap 0 # Remove all traps
  exit 0 # End the shell
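To confirm that a generated job file carries the default values described earlier (for example ECF_TRYNO=0), the exported variables can be pulled out with a few lines of Python. This is only a sketch: the helper name exported_variables is hypothetical, and the parser assumes simple lines of the form "export NAME=value # comment" like those in the listing above.

```python
import re

# Matches lines such as: export ECF_PORT=3141 # The server port number
EXPORT_RE = re.compile(r"^export\s+(\w+)=(\S*)")


def exported_variables(job_text):
    """Collect NAME -> value for every simple 'export NAME=value' line."""
    result = {}
    for line in job_text.splitlines():
        match = EXPORT_RE.match(line.strip())
        if match:
            result[match.group(1)] = match.group(2)
    return result


# A few lines copied from the t1.job0 listing.
sample = """\
export ECF_PORT=3141 # The server port number
export ECF_HOST=localhost # The host name where the server is running
export ECF_NAME=/test/t1 # The name of this current task
export ECF_TRYNO=0 # Current try number of the task
"""

variables = exported_variables(sample)
print(variables)
```

To inspect a real file, read t1.job0 into a string and pass it to exported_variables in place of sample.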

We strongly recommend including the job creation check in the scripts for the examples that follow.

Tasks

  1. Add the job creation check

  2. Examine the job file $ECF_HOME/test/t1.job0