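Checking job creation
In the previous sections we implemented our first task (the t1.ecf file). The t1.ecf script has to be pre-processed to generate the job file. ecflow_server does this automatically when the task is about to run.
We can also check job creation before the suite definition is loaded into ecflow_server.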
Text
Checking job creation is only available through Python.
If ecflow_server cannot locate the ecf script, see ecf file location algorithm.
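Python
Job creation can be checked before the suite definition is loaded into the server. The check covers:
- locating the ecf script file that corresponds to each task in the suite definition
- running the pre-processing
When the suite definition is long and contains many ecf scripts, this check can save a lot of time.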
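The first step of the check, finding an ecf script for every task, can be pictured with a short sketch. It assumes only the simplest layout, where the task at node path /test/t1 has its script at ECF_HOME/test/t1.ecf. The function name `find_missing_scripts` and this path rule are illustrative assumptions; the real lookup follows the ecf file location algorithm, which has more steps.

```python
import os
import tempfile

def find_missing_scripts(ecf_home, task_paths):
    """Return the task paths that have no matching .ecf file under ecf_home.

    Simplified assumption: task '/test/t1' maps to '<ecf_home>/test/t1.ecf'.
    """
    missing = []
    for task_path in task_paths:
        script = os.path.join(ecf_home, task_path.lstrip("/") + ".ecf")
        if not os.path.isfile(script):
            missing.append(task_path)
    return missing

if __name__ == "__main__":
    # Build a throwaway ECF_HOME that contains only test/t1.ecf
    with tempfile.TemporaryDirectory() as home:
        os.makedirs(os.path.join(home, "test"))
        with open(os.path.join(home, "test", "t1.ecf"), "w") as f:
            f.write('echo "hello"\n')
        print(find_missing_scripts(home, ["/test/t1", "/test/t2"]))
```

Finding a missing script this way, before anything is loaded, is the kind of early failure the check is meant to surface.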
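Keep the following points in mind when checking job creation:
- The check is independent of ecflow_server, so ECF_PORT and ECF_NODE are set to their default values.
- The job file has the extension .job0. Job files generated by the server have the extension .job<1-n>, because ECF_TRYNO is never 0 there.
- By default the job file is generated in the same directory as the ecf script. See ECF_JOB in the glossary.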
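The naming rule in these notes can be written down as a small helper. This is a sketch that assumes the default location (job file next to the script under ECF_HOME, with no ECF_JOB override); `job_file_path` is a hypothetical name and not part of the ecflow API, and the directory in the usage lines is made up.

```python
import os

def job_file_path(ecf_home, task_path, try_no=0):
    """Build the default job file path for a task.

    try_no=0 corresponds to the job creation check; the server uses 1..n.
    """
    if try_no < 0:
        raise ValueError("try_no must be >= 0")
    return os.path.join(ecf_home, task_path.lstrip("/") + ".job" + str(try_no))

# File written by the check (hypothetical ECF_HOME)
print(job_file_path("/home/user/course", "/test/t1"))
# First job file written by the server for the same task
print(job_file_path("/home/user/course", "/test/t1", 1))
```

Because the check always uses try number 0, its output cannot collide with a job file written by a running server.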
Use ecflow.Defs.check_job_creation to run the check. Modify test.py as follows:
    import os
    from ecflow import Defs, Suite, Task, Edit

    print("Creating suite definition")
    home = os.path.join(os.getenv("HOME"), "course")
    defs = Defs(Suite('test', Edit(ECF_HOME=home), Task('t1')))
    print(defs)

    print("Checking job creation: .ecf -> .job0")
    print(defs.check_job_creation())

    # We can assert, so that we only progress once job creation works
    # assert len(defs.check_job_creation()) == 0, "Job generation failed"
Run the script:
    $ python3 test.py
    Creating suite definition
    # 4.8.0
    suite test
      edit ECF_HOME '../../../build/course'
      task t1
    endsuite

    Checking job creation: .ecf -> .job0
After running the script, the file t1.job0 is generated in the test directory. Its contents are:
    #!/bin/ksh
    set -e # stop the shell on first error
    set -u # fail when using an undefined variable
    set -x # echo script lines as they are executed
    set -o pipefail # fail if last(rightmost) command exits with a non-zero status

    # Defines the variables that are needed for any communication with ECF
    export ECF_PORT=3141 # The server port number
    export ECF_HOST=localhost # The host name where the server is running
    export ECF_NAME=/test/t1 # The name of this current task
    export ECF_PASS=vOW08rvF # A unique password
    export ECF_TRYNO=0 # Current try number of the task
    export ECF_RID=$$ # record the process id. Also used for zombie detection

    # Define the path where to find ecflow_client
    # make sure client and server use the *same* version.
    # Important when there are multiple versions of ecFlow
    export PATH=/usr/local/apps/ecflow/4.8.0/bin:$PATH

    # Tell ecFlow we have started
    ecflow_client --init=$$

    # Define a error handler
    ERROR() {
       set +e # Clear -e flag, so we don't fail
       wait # wait for background process to stop
       ecflow_client --abort=trap # Notify ecFlow that something went wrong, using 'trap' as the reason
       trap 0 # Remove the trap
       exit 0 # End the script
    }

    # Trap any calls to exit and errors caught by the -e flag
    trap ERROR 0

    # Trap any signal that may cause the script to fail
    trap '{ echo "Killed by a signal"; ERROR ; }' 1 2 3 4 5 6 7 8 10 12 13 15

    echo "I am part of a suite that lives in ../../../build/course"

    wait # wait for background process to stop
    ecflow_client --complete # Notify ecFlow of a normal end
    trap 0 # Remove all traps
    exit 0 # End the shell
It is strongly recommended to use the job creation check in the examples that follow.
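For reuse in later examples, the check can be wrapped so that a failure stops the script. The sketch below relies on what the commented-out assertion in test.py suggests: check_job_creation() returns a string that is empty when every job could be created. `require_clean_job_creation` is a hypothetical helper, and the sample error text is invented for illustration.

```python
def require_clean_job_creation(errors):
    """Raise if the string returned by defs.check_job_creation() is non-empty."""
    if errors:
        raise RuntimeError("Job generation failed:\n" + errors)
    return True

# With ecflow installed, the call would be:
#   require_clean_job_creation(defs.check_job_creation())
# Stand-in strings show both outcomes without needing ecflow:
print(require_clean_job_creation(""))
try:
    require_clean_job_creation("hypothetical error: t1.ecf not found")
except RuntimeError as exc:
    print(exc)
```

Raising an exception instead of using a bare assert keeps the error text visible even when Python runs with assertions disabled.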
What to do
- Add the job creation check.
- Examine the job file $ECF_HOME/test/t1.job0.
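One way to examine the job file is to pull out the exported variables and confirm the points noted earlier, for example that ECF_TRYNO is 0. The following is a sketch that assumes the `export NAME=value # comment` layout seen in t1.job0; `read_exports` is a hypothetical helper, and the sample text is a shortened copy of lines from that file.

```python
import re

# Matches lines such as: export ECF_TRYNO=0 # comment
EXPORT_RE = re.compile(r"^export\s+(\w+)=(\S*)")

def read_exports(job_text):
    """Collect NAME -> value pairs from 'export NAME=value' lines."""
    exports = {}
    for line in job_text.splitlines():
        match = EXPORT_RE.match(line.strip())
        if match:
            exports[match.group(1)] = match.group(2)
    return exports

sample = """\
export ECF_PORT=3141 # The server port number
export ECF_NAME=/test/t1 # The name of this current task
export ECF_TRYNO=0 # Current try number of the task
"""
variables = read_exports(sample)
print(variables["ECF_NAME"], variables["ECF_TRYNO"])

# To read the generated file instead (path built the same way as in test.py):
#   import os
#   path = os.path.join(os.getenv("HOME"), "course", "test", "t1.job0")
#   with open(path) as f:
#       print(read_exports(f.read()))
```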
