The ordering starts to matter early in the pipeline. How index buffers are chunked, which FIFOs are broadcast to and read from, and which units are locally selected or presumed by the distributor to be handled by a different GPU, are based on the sequentially equivalent behavior of the distributor blocks and their arbiters. The chunking of the primitive stream and handling of primitives that span screen space boundaries can be affected by what each GPU calculates is its particular chunk or FIFO tag. If a hull shader's output is broadcast by GPU A to a FIFO and tagged with ordering number 1000, it doesn't help if GPU B was expecting it at 1001.

Deciding which primitives can be discarded in an overlapping scenario can cause inconsistencies if different tiles do not agree on the order.

Click to expand...