All Reduce Synchronizer¶
The AllReduce synchronizer currently uses TensorFlow's collective_device_ops to insert AllReduce ops into the graph.
The AllReduceSynchronizer class supports the following instantiations:
spec=`auto`: single-node multiple-device or cross-node AllReduce based on collective ops
spec=`nccl`: single-node multiple-device or cross-node AllReduce based on NCCL
spec=`ring`/`tree`: AllReduce with different reduction structures (ring, tree, etc.)
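To make the `ring` reduction structure concrete, here is a minimal, framework-free sketch of how a ring AllReduce accumulates a global sum: each "device" passes partial results around a ring until every device holds the total. This is an illustration of the reduction pattern only, not AutoDist or TensorFlow code.

```python
def ring_allreduce(values):
    """Sum per-device scalars by circulating contributions around a ring.

    Device i starts with values[i]; after n - 1 hops every device holds the
    global sum, mirroring the communication pattern of a ring AllReduce.
    """
    n = len(values)
    results = list(values)
    for step in range(n - 1):
        # Each device adds the value that originated step + 1 hops to its left.
        results = [results[i] + values[(i - step - 1) % n] for i in range(n)]
    return results
```

With four devices holding `[1, 2, 3, 4]`, three hops leave every device with the sum 10; the bandwidth advantage of the real algorithm comes from chunking tensors so each link stays busy every step.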
Note, however, that it does not support the following:
shuffle reduce (reduce to CPU or GPU as in PS) + AllReduce across nodes
any other hybrid of PS and AllReduce reduction.
Perform in-graph synchronization based on AllReduce and TensorFlow Collective Ops.
Note that collective ops currently support only dense tensors.
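Because only dense tensors are supported, a sparse gradient (index/value pairs, as in `tf.IndexedSlices`) must be densified before it can participate in an AllReduce. The conversion below is an illustrative sketch in plain Python, not AutoDist's implementation.

```python
def densify(indices, values, size):
    """Scatter (index, value) pairs into a dense vector of length `size`.

    Duplicate indices accumulate, matching scatter-add semantics, so the
    densified gradient sums correctly under a subsequent AllReduce.
    """
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] += v
    return dense
```

The cost of this conversion is why dense-only collectives can be expensive for large embedding variables, where a PS-style sparse update would touch far fewer rows.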
assign_cluster_information(num_workers, num_replicas, worker_device, worker_id, canonical_replica_devices, is_chief=False)¶
Store cluster information in the synchronizer.
The AllReduce synchronizer does nothing during between-graph synchronization.
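The two statements above can be sketched together: cluster information is simply stored on the synchronizer, and the between-graph step is a no-op for AllReduce. The class body below is an assumption-laden illustration, not AutoDist's source; method and attribute names beyond the documented parameters are hypothetical.

```python
class AllReduceSynchronizerSketch:
    """Illustrative stand-in for the documented cluster-information behavior."""

    def assign_cluster_information(self, num_workers, num_replicas,
                                   worker_device, worker_id,
                                   canonical_replica_devices, is_chief=False):
        # Store the cluster topology for later in-graph transformations.
        self.num_workers = num_workers
        self.num_replicas = num_replicas
        self.worker_device = worker_device
        self.worker_id = worker_id
        self.canonical_replica_devices = canonical_replica_devices
        self.is_chief = is_chief

    def between_graph_apply(self, graph_item, var_name):
        # AllReduce needs no between-graph rewriting; return the graph as-is.
        # (Method name is hypothetical, chosen to mirror the prose above.)
        return graph_item
```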
create(name, *args, **kwargs)¶
Create a new Synchronizer instance given a subclass name.
name – Name of the Synchronizer subclass (e.g. PSSynchronizer).
*args – Any args for the subclass constructor.
**kwargs – Any kwargs for the subclass constructor.
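A name-based factory like `create` is commonly implemented by looking the subclass up among the base class's registered subclasses. The sketch below shows one plausible shape of that dispatch; the class bodies and constructor parameters are illustrative stand-ins, not AutoDist's real hierarchy.

```python
class Synchronizer:
    @classmethod
    def create(cls, name, *args, **kwargs):
        # Resolve the subclass by its class name and forward the arguments.
        subclasses = {sub.__name__: sub for sub in cls.__subclasses__()}
        return subclasses[name](*args, **kwargs)


class PSSynchronizer(Synchronizer):
    def __init__(self, local_replication=False):  # hypothetical parameter
        self.local_replication = local_replication


class AllReduceSynchronizer(Synchronizer):
    def __init__(self, spec="auto"):  # mirrors the spec options above
        self.spec = spec


sync = Synchronizer.create("AllReduceSynchronizer", spec="ring")
```

This keeps configuration declarative: a strategy description can name a synchronizer as a string, and the factory turns it into the right object without the caller importing each subclass.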