Random Axis Partition All Reduce Strategy

Partitioned AR StrategyBuilder with random axis partitioning.

class RandomAxisPartitionAR(chunk_size=128)

Bases: autodist.strategy.base.StrategyBuilder

Partitioned AR StrategyBuilder.

This StrategyBuilder generates a strategy that partitions each variable along its first dimension and synchronizes the resulting shards using AllReduce. This can be advantageous when communicating extremely large messages, where synchronizing a single message is bounded by single-flow bandwidth.
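As a rough illustration of the idea (not AutoDist's implementation), the sketch below splits the gradient held by each worker into shards along its first dimension, sums each shard across workers independently, and concatenates the results. The names `split_first_dim`, `all_reduce_sum`, and `partitioned_all_reduce` are hypothetical, and nested Python lists stand in for tensors.

```python
def split_first_dim(rows, num_shards):
    """Split a list of rows into num_shards contiguous, equally sized shards."""
    if len(rows) % num_shards != 0:
        raise ValueError("num_shards must evenly divide the first dimension")
    size = len(rows) // num_shards
    return [rows[i * size:(i + 1) * size] for i in range(num_shards)]


def all_reduce_sum(shard_on_each_worker):
    """Simulated AllReduce: element-wise sum of one shard across all workers."""
    reduced = []
    for rows in zip(*shard_on_each_worker):  # the same row on every worker
        reduced.append([sum(vals) for vals in zip(*rows)])
    return reduced


def partitioned_all_reduce(grads_per_worker, num_shards):
    """Reduce each shard separately, then reassemble the full tensor."""
    sharded = [split_first_dim(g, num_shards) for g in grads_per_worker]
    result = []
    for shard_idx in range(num_shards):
        shard_on_each_worker = [worker[shard_idx] for worker in sharded]
        result.extend(all_reduce_sum(shard_on_each_worker))
    return result


# Two simulated workers, each holding a 4x2 gradient.
grads = [
    [[1, 2], [3, 4], [5, 6], [7, 8]],
    [[10, 20], [30, 40], [50, 60], [70, 80]],
]
reduced = partitioned_all_reduce(grads, num_shards=2)
```

In a real deployment each shard would travel as its own, smaller collective message, which is where the potential bandwidth benefit comes from; this simulation only shows that the reassembled result matches an unpartitioned sum.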

It also merges collective ops sequentially, placing every chunk_size consecutive ops into a single collective group. This strategy does not support synchronizing sparse updates across more than one node because of a TensorFlow AllGather bug.
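One plausible reading of the chunk_size grouping is that variables are numbered in order and every chunk_size consecutive ones share a group id. The sketch below shows that reading only; `assign_collective_groups` is a hypothetical helper, and the integer-division rule is an assumption rather than something this page confirms.

```python
def assign_collective_groups(var_names, chunk_size=128):
    """Map each variable name to a group id by its position in the sequence.

    Assumption: group id = position // chunk_size.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    return {name: i // chunk_size for i, name in enumerate(var_names)}


groups = assign_collective_groups(["w0", "b0", "w1", "b1", "w2"], chunk_size=2)
```

Under this reading, a larger chunk_size produces fewer and larger collective groups, while chunk_size=1 leaves every op in a group of its own.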

build(graph_item, resource_spec)

Generate the Strategy.

static get_num_shards_and_axis(var, grad)

Get the minimum number of shards for a variable, together with the axis along which to partition it.
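The page does not say how the minimum is computed. A minimal sketch, assuming "minimum number of shards" means the smallest divisor greater than 1 of the size of the partitioned axis (so that shards are equal in size), could look like the following. `min_num_shards` is a hypothetical helper, not part of the documented API.

```python
def min_num_shards(dim_size):
    """Smallest shard count > 1 that evenly divides dim_size.

    Assumption: a dimension of size 0 or 1 is left unpartitioned (1 shard),
    and a prime-sized dimension is split into dim_size shards.
    """
    if dim_size <= 1:
        return 1
    for k in range(2, dim_size):
        if dim_size % k == 0:
            return k
    return dim_size
```

For example, a dimension of size 12 would be split into 2 shards and a dimension of size 9 into 3, while a prime size such as 7 has no smaller even split than 7 single-row shards.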