PyTorch - PyTorch Distributed - 《Python》

GPU Settings
torch.distributed.barrier()
Specifying a GPU?
Reference

GPU Settings

torch.cuda.device_count()  # number of GPUs on this machine
torch.distributed.get_world_size()  # number of processes; here there are three workers, each with 2 GPUs
-- Output
torch.cuda.device_count()=2, get_world_size()=3
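
The two calls above can be wrapped in a small helper. This is only a minimal sketch: the function name describe_topology is hypothetical, and the fallback to a world size of 1 when no process group has been initialized is an assumption made so the snippet also runs in a plain single-process session.

```python
import torch
import torch.distributed as dist


def describe_topology():
    """Return (GPUs visible to this process, number of processes in the job)."""
    n_gpus = torch.cuda.device_count()  # 0 on a CPU-only machine
    # get_world_size() fails if no process group has been initialized,
    # so fall back to 1 (assumption: a plain single-process run).
    if dist.is_available() and dist.is_initialized():
        world_size = dist.get_world_size()
    else:
        world_size = 1
    return n_gpus, world_size


n_gpus, world_size = describe_topology()
print(f"torch.cuda.device_count()={n_gpus}, get_world_size()={world_size}")
```

Note that the two numbers are independent: the first counts devices on the local machine, the second counts processes across the whole job.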

torch.distributed.barrier()

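When a process reaches the barrier, it blocks until every other process has also reached it; only then are they all released.
So the example below works as follows: any process whose local_rank is neither -1 nor 0 enters the first barrier and waits for the others. Process 0 is not blocked at that point, so it carries on with the preprocessing until it reaches the second if, where it enters the barrier too. Now every process has entered the barrier, which means synchronization is complete and they can all continue.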

if args.local_rank not in [-1, 0]:
    # Make sure only the first process in distributed training processes the dataset;
    # the others wait here and will use the cache.
    torch.distributed.barrier()
# ... (preprocess the data and save the preprocessed data)
if args.local_rank == 0:
    # The first process is done; entering the barrier releases the waiting processes.
    torch.distributed.barrier()

Reference: https://stackoverflow.com/questions/59760328/how-does-torch-distributed-barrier-work
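
The two-barrier pattern above can also be packaged as a context manager, so the "first process goes first" logic sits in one place. The sketch below is an illustration, not part of the original snippet: the name first_process_first is hypothetical, and skipping the barrier when no process group is initialized is an added assumption so that the code also runs in a single, non-distributed process.

```python
import contextlib

import torch.distributed as dist


@contextlib.contextmanager
def first_process_first(local_rank):
    """Run the body on local rank 0 (or -1, non-distributed) before the others."""
    # Assumption: without an initialized process group there is nothing to sync.
    distributed = dist.is_available() and dist.is_initialized()
    if distributed and local_rank not in (-1, 0):
        dist.barrier()  # other processes wait here until rank 0 arrives
    yield
    if distributed and local_rank == 0:
        dist.barrier()  # rank 0 arrives last, which releases everyone


# Usage in a single, non-distributed process (local_rank == -1):
steps = []
with first_process_first(local_rank=-1):
    steps.append("preprocess and cache")  # stand-in for the real preprocessing
print(steps)
```

Each process calls the barrier exactly once, which is what makes the pattern work: the waiting processes call it before the body, and process 0 calls it after.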

Specifying a GPU?

When do you need to specify a GPU?
torch.cuda.set_device
https://www.cnblogs.com/darkknightzh/p/6836568.html
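
One common case, offered here as an illustration rather than a complete answer to the question above, is running one process per GPU: each process then binds itself to the device that matches its local rank. In the minimal sketch below, the helper name select_device and the CPU fallback are assumptions added so the snippet runs on machines without a GPU.

```python
import torch


def select_device(local_rank):
    """Bind this process to one GPU, or fall back to CPU if none is usable."""
    if torch.cuda.is_available() and 0 <= local_rank < torch.cuda.device_count():
        # After this call, plain "cuda" allocations default to this GPU.
        torch.cuda.set_device(local_rank)
        return torch.device("cuda", local_rank)
    return torch.device("cpu")  # assumption: CPU is an acceptable fallback


device = select_device(0)
x = torch.zeros(2, 3, device=device)
print(x.device)
```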