PS Synchronizer

PS Synchronizer.

class PSSynchronizer(config: autodist.proto.synchronizers_pb2.PSSynchronizer)

Bases: autodist.kernel.synchronization.synchronizer.Synchronizer

PS Synchronizer.

Synchronizes gradient updates using a Parameter Server.

For in-graph synchronization, this just aggregates gradients on a worker’s CPU.

For between-graph synchronization, this aggregates gradients on the parameter server defined in the Strategy.
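Both modes reduce the per-replica gradients of a variable to a single value; they differ only in where that reduction runs. The following is a minimal plain-Python sketch of such a reduction. It assumes the aggregation is an element-wise mean, which this page does not state, and `aggregate_gradients` is a hypothetical helper, not part of AutoDist.

```python
def aggregate_gradients(replica_grads):
    """Element-wise mean of per-replica gradients for one variable.

    Assumption: aggregation is a mean. The real synchronizer builds
    graph ops for this; here gradients are plain lists of floats.
    """
    if not replica_grads:
        raise ValueError("need at least one replica gradient")
    n = len(replica_grads)
    # zip(*...) groups the i-th component of every replica's gradient.
    return [sum(components) / n for components in zip(*replica_grads)]


# Two replicas, each holding a gradient for the same two-element variable.
replica_grads = [[1.0, 2.0], [3.0, 4.0]]
aggregated = aggregate_gradients(replica_grads)
```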

To keep this gradient aggregation in sync, the chief gives each worker one token per variable; a worker uses the token to mark that its update to that variable is complete.
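The token handshake can be illustrated with ordinary Python queues and threads. This is only an analogy under stated assumptions: the variable names, the queue layout, and every function below are hypothetical and are not AutoDist's implementation.

```python
import queue
import threading

NUM_WORKERS = 3
VARIABLES = ["dense/kernel", "dense/bias"]  # hypothetical variable names

# One token queue per (worker, variable); the chief fills them.
token_queues = {
    (w, v): queue.Queue() for w in range(NUM_WORKERS) for v in VARIABLES
}
# Workers hand tokens back here to mark "my update to this variable is done".
done_queues = {v: queue.Queue() for v in VARIABLES}


def chief_hand_out_tokens(step):
    """Give every worker one token per variable for this step."""
    for q in token_queues.values():
        q.put(step)


def worker(worker_id, log):
    for v in VARIABLES:
        token = token_queues[(worker_id, v)].get()  # block until the chief's token arrives
        # ... the worker's update to variable v would happen here ...
        done_queues[v].put((worker_id, token))  # mark the update complete
        log.append((worker_id, v, token))


def chief_wait_for_all():
    """Block until every worker has marked every variable as updated."""
    return {
        v: {done_queues[v].get() for _ in range(NUM_WORKERS)} for v in VARIABLES
    }


def run_step(step):
    log = []
    threads = [
        threading.Thread(target=worker, args=(w, log)) for w in range(NUM_WORKERS)
    ]
    for t in threads:
        t.start()
    chief_hand_out_tokens(step)
    finished = chief_wait_for_all()
    for t in threads:
        t.join()
    return sorted(log), finished
```

Calling `run_step(0)` returns once all three workers have marked both variables, so the chief never moves on while an update is still pending.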

in_graph_apply(graph_item, var_name)

Apply in-graph PS synchronization.

Parameters
  • graph_item – The old graph item.

  • var_name – The variable name without the replica prefix.

Returns

graph_item.GraphItem

between_graph_apply(graph_item, var_name)

Apply between-graph synchronization to the target ops in the graph.

Parameters
  • graph_item – The current graph.

  • var_name – The name of the variable to be synchronized.

Returns

The updated graph item.

Return type

graph_item.GraphItem

add_sync_op(graph_item, var_update_op, variable_replicator=None)

Adds the additional ops needed for synchronous distributed training to the current graph.

The additional ops serve three purposes:
  1. Initialization
  2. Synchronization
  3. Gradient aggregation

Parameters
  • graph_item (graph_item.GraphItem) – The graph.

  • var_update_op – The variable update op.

  • variable_replicator – A dictionary mapping each master variable op name to its list of replicated variables; may be None.

Returns

None
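As an illustration of the shape described for variable_replicator, the sketch below builds such a mapping and looks up the replicas of one master variable while tolerating None. The op names, the `replica_N/` prefix, and the `replicas_for` helper are hypothetical; strings stand in for the replicated variable objects.

```python
# Hypothetical mapping: master variable op name -> list of replicated variables.
# Strings stand in for variable objects; the "replica_N/" prefix is made up.
variable_replicator = {
    "dense/kernel": ["replica_0/dense/kernel", "replica_1/dense/kernel"],
    "dense/bias": ["replica_0/dense/bias", "replica_1/dense/bias"],
}


def replicas_for(master_op_name, replicator=None):
    """Return the replicas of a master variable, or [] if none are known.

    Handles replicator=None, since the parameter is documented as optional.
    """
    if replicator is None:
        return []
    return list(replicator.get(master_op_name, []))


kernel_replicas = replicas_for("dense/kernel", variable_replicator)
no_replicas = replicas_for("dense/kernel", None)
```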

assign_cluster_information(num_workers, num_replicas, worker_device, worker_id, canonical_replica_devices, is_chief=False)

Store cluster information in the synchronizer.

classmethod create(name, *args, **kwargs)

Create a new Synchronizer instance given a subclass name.

Parameters
  • name – Name of the Synchronizer subclass (e.g. PSSynchronizer).

  • *args – Any args for the subclass constructor.

  • **kwargs – Any kwargs for the subclass constructor.

Returns

Synchronizer
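One common way to implement a create-by-name factory is to look the name up among the base class's subclasses. The sketch below shows that pattern with stand-in classes; it is an assumption about how such a factory could work, not AutoDist's code, and the `config` value passed in is a placeholder.

```python
class Synchronizer:
    """Stand-in base class (not the AutoDist class)."""

    def __init__(self, config=None):
        self.config = config

    @classmethod
    def create(cls, name, *args, **kwargs):
        # Map each direct subclass's name to the class itself.
        subclasses = {sub.__name__: sub for sub in cls.__subclasses__()}
        if name not in subclasses:
            raise ValueError(f"unknown Synchronizer subclass: {name!r}")
        # Forward the remaining arguments to the subclass constructor.
        return subclasses[name](*args, **kwargs)


class PSSynchronizer(Synchronizer):
    """Stand-in subclass used only for this illustration."""


class AllReduceSynchronizer(Synchronizer):
    """Second stand-in subclass, to show that the name selects the class."""


sync = Synchronizer.create("PSSynchronizer", config="placeholder-config")
```

Note that `__subclasses__()` only lists direct subclasses that have already been defined, so a real factory built this way depends on the subclass modules being imported first.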

class PSGradientTaskAssigner(op_to_task, agg_grad_ops, apply_grad_ops, ps_device)

Bases: object

Make sure that all corresponding PS gradient ops are assigned to the same task.

SHARED_TASK_ID = -1

assign()

Bi-directionally traverse the graph and assign tasks to ops.
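A bi-directional traversal of this kind can be sketched as a breadth-first search that follows each edge both toward consumers and toward producers, copying the task of an already-assigned op to its unassigned neighbors. Everything below is a simplified assumption: the edge list, the op names, and `assign_tasks` are hypothetical, and the role of SHARED_TASK_ID is not modeled.

```python
from collections import deque


def assign_tasks(edges, seeds):
    """Propagate task ids from seed ops to all connected ops.

    edges: list of (producer_op, consumer_op) pairs.
    seeds: dict mapping some op names to an already-known task id.
    """
    neighbors = {}
    for producer, consumer in edges:
        neighbors.setdefault(producer, set()).add(consumer)  # forward direction
        neighbors.setdefault(consumer, set()).add(producer)  # backward direction

    op_to_task = dict(seeds)
    frontier = deque(seeds)
    while frontier:
        op = frontier.popleft()
        for nxt in sorted(neighbors.get(op, ())):
            if nxt not in op_to_task:  # keep the first assignment an op receives
                op_to_task[nxt] = op_to_task[op]
                frontier.append(nxt)
    return op_to_task


# Hypothetical gradient path: two worker gradients -> aggregation -> apply.
edges = [("grad_w0", "agg_grad"), ("grad_w1", "agg_grad"), ("agg_grad", "apply_grad")]
tasks = assign_tasks(edges, seeds={"agg_grad": 1})
```

Seeding only the aggregation op is enough here: the backward edges reach both worker gradients and the forward edge reaches the apply op, so all four ops end up on the same task.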