All Reduce Synchronizer¶
The AllReduce synchronizer currently uses TensorFlow's collective_device_ops to insert AllReduce ops into the graph.
The AllReduceSynchronizer class supports the following instantiations:
spec=`auto`: single-node multiple-device or cross-node AllReduce based on collective ops
spec=`nccl`: single-node multiple-device or cross-node AllReduce based on NCCL
spec=`ring`/`tree`: AllReduce with different reduction structures (ring, tree, etc.)
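To make the `ring` reduction structure concrete, here is a minimal, framework-free sketch of how a ring AllReduce accumulates a global sum: each "device" passes partial results around a ring until every device holds the total. This is an illustration of the reduction pattern only, not AutoDist or TensorFlow code.

```python
def ring_allreduce(values):
    """Sum per-device scalars by circulating contributions around a ring.

    Device i starts with values[i]; after n - 1 hops every device holds the
    global sum, mirroring the communication pattern of a ring AllReduce.
    """
    n = len(values)
    results = list(values)
    for step in range(n - 1):
        # Each device adds the value that originated step + 1 hops to its left.
        results = [results[i] + values[(i - step - 1) % n] for i in range(n)]
    return results
```

With four devices holding `[1, 2, 3, 4]`, three hops leave every device with the sum 10; the bandwidth advantage of the real algorithm comes from chunking tensors so each link stays busy every step.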
Note, however, that it does not support the following:
shuffle reduce (reduce to CPU or GPU as in PS) + AllReduce across nodes
any other hybrid of PS and AllReduce reduction.
Perform in-graph synchronization based on AllReduce and TensorFlow Collective Ops.
Note that collective ops currently support only dense tensors.
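Because only dense tensors are supported, a sparse gradient (index/value pairs, as in `tf.IndexedSlices`) must be densified before it can participate in an AllReduce. The conversion below is an illustrative sketch in plain Python, not AutoDist's implementation.

```python
def densify(indices, values, size):
    """Scatter (index, value) pairs into a dense vector of length `size`.

    Duplicate indices accumulate, matching scatter-add semantics, so the
    densified gradient sums correctly under a subsequent AllReduce.
    """
    dense = [0.0] * size
    for i, v in zip(indices, values):
        dense[i] += v
    return dense
```

The cost of this conversion is why dense-only collectives can be expensive for large embedding variables, where a PS-style sparse update would touch far fewer rows.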
assign_cluster_information(num_workers, num_replicas, worker_device, worker_id, canonical_replica_devices, is_chief=False)¶
Store cluster information in the synchronizer.
The AllReduce synchronizer does nothing during between-graph synchronization.
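The two statements above can be sketched together: cluster information is simply stored on the synchronizer, and the between-graph step is a no-op for AllReduce. The class body below is an assumption-laden illustration, not AutoDist's source; method and attribute names beyond the documented parameters are hypothetical.

```python
class AllReduceSynchronizerSketch:
    """Illustrative stand-in for the documented cluster-information behavior."""

    def assign_cluster_information(self, num_workers, num_replicas,
                                   worker_device, worker_id,
                                   canonical_replica_devices, is_chief=False):
        # Store the cluster topology for later in-graph transformations.
        self.num_workers = num_workers
        self.num_replicas = num_replicas
        self.worker_device = worker_device
        self.worker_id = worker_id
        self.canonical_replica_devices = canonical_replica_devices
        self.is_chief = is_chief

    def between_graph_apply(self, graph_item, var_name):
        # AllReduce needs no between-graph rewriting; return the graph as-is.
        # (Method name is hypothetical, chosen to mirror the prose above.)
        return graph_item
```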
create(name, *args, **kwargs)¶
Create a new Synchronizer instance given a subclass name.
name – Name of the Synchronizer subclass (e.g. PSSynchronizer).
*args – Any args for the subclass constructor.
**kwargs – Any kwargs for the subclass constructor.
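A name-based factory like `create` is commonly implemented by looking the subclass up among the base class's registered subclasses. The sketch below shows one plausible shape of that dispatch; the class bodies and constructor parameters are illustrative stand-ins, not AutoDist's real hierarchy.

```python
class Synchronizer:
    @classmethod
    def create(cls, name, *args, **kwargs):
        # Resolve the subclass by its class name and forward the arguments.
        subclasses = {sub.__name__: sub for sub in cls.__subclasses__()}
        return subclasses[name](*args, **kwargs)


class PSSynchronizer(Synchronizer):
    def __init__(self, local_replication=False):  # hypothetical parameter
        self.local_replication = local_replication


class AllReduceSynchronizer(Synchronizer):
    def __init__(self, spec="auto"):  # mirrors the spec options above
        self.spec = spec


sync = Synchronizer.create("AllReduceSynchronizer", spec="ring")
```

This keeps configuration declarative: a strategy description can name a synchronizer as a string, and the factory turns it into the right object without the caller importing each subclass.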