torch.cuda.comm.scatter#

torch.cuda.comm.scatter(tensor, devices=None, chunk_sizes=None, dim=0, streams=None, *, out=None)[source]#

Scatters tensor across multiple GPUs.

Parameters
  • tensor (Tensor) – tensor to scatter. Can be on CPU or GPU.

  • devices (Iterable[torch.device, str or int], optional) – an iterable of GPU devices, among which to scatter.

  • chunk_sizes (Iterable[int], optional) – sizes of chunks to be placed on each device. It should match devices in length and sum to tensor.size(dim). If not specified, tensor will be divided into equal chunks.

  • dim (int, optional) – A dimension along which to chunk tensor. Default: 0.

  • streams (Iterable[torch.cuda.Stream], optional) – an iterable of Streams, among which to execute the scatter. If not specified, the default stream will be utilized.

  • out (Sequence[Tensor], optional, keyword-only) – the GPU tensors to store output results. Sizes of these tensors must match that of tensor, except for dim, where the total size must sum to tensor.size(dim).

Note

Exactly one of devices and out must be specified. When out is specified, chunk_sizes must not be specified and will be inferred from sizes of out.

Returns

  • If devices is specified,

    a tuple containing chunks of tensor, placed on devices.

  • If out is specified,

    a tuple containing out tensors, each containing a chunk of tensor.
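A minimal sketch of the devices-form call, assuming a machine with at least two CUDA devices; when fewer are available, the same chunking that scatter applies along dim can be illustrated on CPU with torch.split (the tensor values here are illustrative, not from the original):

```python
import torch
from torch.cuda import comm

# A 6x4 tensor to distribute; scatter splits it along dim 0.
t = torch.arange(24, dtype=torch.float32).reshape(6, 4)

if torch.cuda.device_count() >= 2:
    # Equal chunks across two GPUs; chunk_sizes defaults to an even split.
    chunks = comm.scatter(t, devices=[0, 1])
    # Uneven split: 4 rows to device 0, 2 rows to device 1.
    uneven = comm.scatter(t, devices=[0, 1], chunk_sizes=[4, 2], dim=0)
else:
    # Without multiple GPUs, torch.split shows the same chunking logic
    # that scatter performs before placing each piece on its device.
    chunks = t.split([3, 3], dim=0)
    uneven = t.split([4, 2], dim=0)

print([tuple(c.shape) for c in chunks])   # [(3, 4), (3, 4)]
print([tuple(c.shape) for c in uneven])   # [(4, 4), (2, 4)]
```

Each returned chunk is a view-sized slice of tensor along dim, so concatenating the results along that dimension reproduces the original shape.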