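For security reasons, the school's servers do not let users install software themselves, but you can:

Use conda to create virtual environments.
Use singularity to run docker containers.

Conda needs little explanation; singularity takes a bit more work. First build the docker image on your own computer, then push it to Docker Hub, and finally point to the image link on the server.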
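The three steps might look roughly like the sketch below. The image name youruser/yourimage:latest is a hypothetical placeholder, and the run helper and DRY_RUN switch are my own additions so that the script only prints the commands unless DRY_RUN=0 is set. The first two commands belong on your own computer, the last one on the server.

```shell
#!/bin/bash
# Sketch of the docker -> Docker Hub -> singularity workflow.
# IMAGE is a hypothetical placeholder; replace it with your own Docker Hub repository.
IMAGE="${IMAGE:-youruser/yourimage:latest}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# On your own computer: build from the Dockerfile in the current directory, then push.
run docker build -t "$IMAGE" .
run docker push "$IMAGE"

# On the server: run a command inside the image fetched from Docker Hub.
# --nv exposes the host's NVIDIA driver to the container.
run singularity exec --nv "docker://$IMAGE" nvidia-smi
```

Running docker push requires a prior docker login on your own computer.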

Logging in to Professor Ye's server:

srun -p xye --mem=32000 --tasks-per-node=2 --cpus-per-task=2 --gres=gpu:1 --pty bash

This requests a single GPU; gpu:4 would request four.
Reference: https://slurm.schedmd.com/tutorials.html
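Only the number after gpu: changes between requests. As a small illustration, the hypothetical helper below (gpu_session_cmd is not a Slurm command, just a name I made up) prints the srun line for a given GPU count, with the partition and other resources copied from the command above.

```shell
# Hypothetical helper: print the srun command for an interactive session with N GPUs.
# Partition and resource values are copied from the command used in these notes.
gpu_session_cmd() {
  local ngpu="${1:-1}"   # number of GPUs, default 1
  echo "srun -p xye --mem=32000 --tasks-per-node=2 --cpus-per-task=2 --gres=gpu:${ngpu} --pty bash"
}

gpu_session_cmd 4   # prints the command with --gres=gpu:4
```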

To leave the session and request resources again, use exit.

Using CUDA

Using singularity (docker)

Create a file named job.dat:

#!/bin/bash -l
#
# Usage: sbatch job.dat
# Change the job name and output file as needed
#
# Job name and output file
#SBATCH -J cyclegan_color
#SBATCH -o cyclegan_color.output
# Partition
#SBATCH -p datasci
# Request one GPU
#SBATCH --gres=gpu:1
###SBATCH --gres=gpu:Rtx2080:TitanRtx:1
# Request 16 GB of RAM for the job
#SBATCH --mem=16G
# Run on 1 node
#SBATCH --nodes=1
# Run 4 tasks on the node
#SBATCH --ntasks-per-node=4
/bin/echo Running on host: `hostname`.
/bin/echo In directory: `pwd`
/bin/echo Starting on: `date`
# Load the Singularity and CUDA modules
module load singularity
module load cuda
# Example command: run run.sh inside the container, with GPU access (--nv)
# and the data directory bound into the container (--bind)
singularity exec --nv --bind /xye_data_nobackup docker://xxx/xx sh run.sh

Then submit it: sbatch job.dat.
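If you want the job ID back for later use, something like the following sketch could work. submit_job is a hypothetical helper name; it relies on sbatch's --parsable option, which prints just the job ID (optionally followed by ";cluster"), and it assumes you are in the directory containing job.dat on a machine where sbatch is available.

```shell
# Sketch: submit a job script and print only its job ID.
# submit_job is a hypothetical helper, not part of Slurm.
submit_job() {
  local script="${1:-job.dat}"
  if ! command -v sbatch >/dev/null 2>&1; then
    echo "sbatch not found; run this on the cluster" >&2
    return 1
  fi
  local job_id
  job_id="$(sbatch --parsable "$script")" || return 1
  job_id="${job_id%%;*}"   # --parsable may append ";cluster"; keep only the ID
  echo "$job_id"
}

# Usage on the cluster:
#   id="$(submit_job job.dat)" && squeue -j "$id"
#   tail -f cyclegan_color.output   # the file named by "#SBATCH -o" in job.dat
```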

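Using conda

What if you don't want to use singularity? Say you just want to activate a conda virtual environment and run some code.

Answer:
1. Log in to Professor Ye's server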

srun -p xye --mem=16G --tasks-per-node=2 --gres=gpu:Rtx2080:1 --pty bash

2. Once you are in, run module load cuda
3. When installing PyTorch with conda, pair it with CUDA 10.2
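Step 3 could look like the sketch below. The environment name torch-cu102 and the Python version are hypothetical choices, and the run helper with its DRY_RUN switch is my own addition so that the commands are only printed unless DRY_RUN=0 is set. The install line follows the form PyTorch's install instructions used for CUDA 10.2 builds (cudatoolkit=10.2 from the pytorch channel).

```shell
#!/bin/bash
# Sketch: create a conda environment with a CUDA 10.2 build of PyTorch.
# ENV_NAME and the Python version are hypothetical choices.
ENV_NAME="${ENV_NAME:-torch-cu102}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = actually run them

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

run conda create -y -n "$ENV_NAME" python=3.8
run conda install -y -n "$ENV_NAME" -c pytorch pytorch torchvision cudatoolkit=10.2

# Check from inside the GPU session that PyTorch can see the GPU.
run conda run -n "$ENV_NAME" python -c "import torch; print(torch.cuda.is_available())"
```

The last command should print True only when run on a node that actually has a GPU allocated, such as inside the srun session from step 1.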

Slurm

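Slurm is a job management tool: you submit jobs through it so that scripts can keep running in the background for a long time.
Common commands: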

sbatch job.dat      # Job submission
squeue              # Job status
sinfo               # Cluster status
scancel [job id]    # Cancel a job
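These commands combine naturally in small scripts. As one possible sketch, the hypothetical helper wait_for_job below polls squeue until a job no longer appears in the queue. It assumes that empty squeue output means the job has finished, and it does not tell you whether the job succeeded or failed.

```shell
# Sketch: block until a job leaves the queue.
# wait_for_job is a hypothetical helper, not part of Slurm.
wait_for_job() {
  local job_id="$1"
  local interval="${2:-30}"   # seconds between polls
  # -h drops the header line, so the output is empty once the job is gone.
  while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
    sleep "$interval"
  done
}

# Usage on the cluster (job ID taken from the sbatch output):
#   wait_for_job 123456 60 && tail -n 20 cyclegan_color.output
```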
