PS Synchronizer

PS Synchronizer.

class PSSynchronizer(config: autodist.proto.synchronizers_pb2.PSSynchronizer)

Bases: autodist.kernel.synchronization.synchronizer.Synchronizer

Synchronizes gradient updates using a Parameter Server.

For in-graph synchronization, this just aggregates gradients on a worker’s CPU.

For between-graph synchronization, this aggregates gradients on the parameter server specified in the Strategy.

To keep this gradient aggregation in sync, the chief gives each worker one token per variable, which the worker uses to mark that its update to that variable is complete.
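
The token handshake described above can be illustrated with a small simulation. This is a minimal sketch in plain Python threads and queues, not AutoDist's implementation: the function name `run_synchronized_step`, the per-(worker, variable) queue layout, and the token strings are all assumptions made for illustration.

```python
import queue
import threading


def run_synchronized_step(num_workers, var_names):
    """Simulate one synchronous step and return the sorted completion log."""
    # Hypothetical layout: one token queue per (worker, variable) pair.
    token_queues = {
        (w, v): queue.Queue() for w in range(num_workers) for v in var_names
    }
    done = queue.Queue()  # workers report finished updates here

    def worker(worker_id):
        for v in var_names:
            # Block until the chief hands this worker its token for `v`.
            token = token_queues[(worker_id, v)].get()
            # ... a real worker would send its gradient for `v` here ...
            done.put((worker_id, v, token))  # mark the update as complete

    threads = [
        threading.Thread(target=worker, args=(w,)) for w in range(num_workers)
    ]
    for t in threads:
        t.start()

    # Chief: give each worker one token per variable.
    for v in var_names:
        for w in range(num_workers):
            token_queues[(w, v)].put(f"token:{v}")

    for t in threads:
        t.join()

    completed = []
    while not done.empty():
        completed.append(done.get())
    return sorted(completed)
```

Calling `run_synchronized_step(2, ["w", "b"])` yields one completion record per worker per variable, so the chief can tell when every worker has finished updating every variable.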

in_graph_apply(graph_item, var_name)

Apply in-graph PS synchronization.

Parameters

graph_item – the old graph item
var_name – the variable name without the replica prefix

Returns

graph_item.GraphItem
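
As a rough picture of what in-graph aggregation means, the sketch below combines the gradients that several replicas computed for one variable. It assumes averaging (the source does not say whether gradients are summed or averaged), represents each gradient as a flat list of floats, and uses the hypothetical helper name `aggregate_gradients`; none of this is taken from AutoDist itself.

```python
def aggregate_gradients(replica_grads):
    """Average per-replica gradients element-wise (averaging is an assumption)."""
    if not replica_grads:
        raise ValueError("need at least one replica gradient")
    length = len(replica_grads[0])
    if any(len(g) != length for g in replica_grads):
        raise ValueError("all replica gradients must have the same length")
    num_replicas = len(replica_grads)
    # zip(*...) groups the i-th element of every replica's gradient together.
    return [sum(vals) / num_replicas for vals in zip(*replica_grads)]
```

For example, `aggregate_gradients([[1.0, 2.0], [3.0, 4.0]])` returns `[2.0, 3.0]`.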

between_graph_apply(graph_item, var_name)

Apply between-graph synchronization to the target ops in the graph.

Parameters

graph_item – the current graph
var_name – the variable to be synchronized

Returns

updated graph item.

Return type

graph_item.GraphItem

add_sync_op(graph_item, var_update_op, variable_replicator=None)

Add the ops needed for synchronous distributed training to the current graph.

The additional ops serve three main purposes:

1. Initialization
2. Synchronization
3. Gradient aggregation

Parameters

graph_item (graph_item.GraphItem) – the graph
var_update_op – the variable update op
variable_replicator – a dictionary mapping each master variable op name to its list of replicated variables; may be None

Returns

None

assign_cluster_information(num_workers, num_replicas, worker_device, worker_id, canonical_replica_devices, is_chief=False)

Store cluster information in the synchronizer.

classmethod create(name, *args, **kwargs)

Create new Synchronizer instance given subclass name.

Parameters

name – Name of the Synchronizer subclass (e.g. PSSynchronizer).
*args – Any args for the subclass constructor.
**kwargs – Any kwargs for the subclass constructor.

Returns

Synchronizer
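
A name-based factory like `create` can be sketched as follows. The classes here are toy stand-ins that only reuse the names from this page; the lookup via `__subclasses__()` and the `ValueError` on an unknown name are assumptions about one reasonable way to write such a factory, not a description of AutoDist's code.

```python
class Synchronizer:
    """Toy base class standing in for the real Synchronizer."""

    def __init__(self, config=None):
        self.config = config

    @classmethod
    def create(cls, name, *args, **kwargs):
        # Look up a direct subclass by its class name (assumed mechanism).
        subclasses = {sub.__name__: sub for sub in cls.__subclasses__()}
        if name not in subclasses:
            raise ValueError(f"unknown synchronizer: {name}")
        # Forward the remaining arguments to the subclass constructor.
        return subclasses[name](*args, **kwargs)


class PSSynchronizer(Synchronizer):
    """Toy stand-in; the real class takes a protobuf config."""


class AllReduceSynchronizer(Synchronizer):
    """Hypothetical second subclass, included only to show the lookup."""
```

With these definitions, `Synchronizer.create("PSSynchronizer", config={"sync": True})` returns a `PSSynchronizer` instance whose `config` is the dictionary passed in.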

class PSGradientTaskAssigner(op_to_task, agg_grad_ops, apply_grad_ops, ps_device)

Bases: object

Make sure that all corresponding PS gradient ops are assigned to the same task.

SHARED_TASK_ID = -1

assign()

Bi-directionally traverse the graph and assign tasks to ops.
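
The idea of a bi-directional traversal that keeps related ops on one task can be sketched as below. This is an illustration under stated assumptions, not the class's actual algorithm: the function `assign_tasks`, the edge-list graph representation, and the rule that a group with conflicting or unknown tasks falls back to `SHARED_TASK_ID` are all hypothetical.

```python
from collections import deque

SHARED_TASK_ID = -1  # value from this page; its use below is an assumption


def assign_tasks(edges, op_to_task):
    """Give every op in a connected group the same task id."""
    # Build an undirected view so edges are followed in both directions.
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)
    for op in op_to_task:
        neighbors.setdefault(op, set())

    assigned = {}
    for start in sorted(neighbors):
        if start in assigned:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component, frontier = {start}, deque([start])
        while frontier:
            op = frontier.popleft()
            for nxt in neighbors[op]:
                if nxt not in component:
                    component.add(nxt)
                    frontier.append(nxt)
        # Use the group's single known task, else fall back to the shared id.
        tasks = {op_to_task[op] for op in component if op in op_to_task}
        task = tasks.pop() if len(tasks) == 1 else SHARED_TASK_ID
        for op in component:
            assigned[op] = task
    return assigned
```

In this sketch, ops connected to a single known task inherit it, while a group whose ops disagree about the task (or have no known task) is marked with `SHARED_TASK_ID`.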