Back in 1967, Gene Amdahl famously pointed out what seemed like a fundamental limit to how fast you can make your concurrent code: Some amount of a program's processing is fully "O(N)" parallelizable (call this portion p), and only that portion can scale directly on machines having more and more processor cores. The rest of the program's work is "O(1)" sequential (s). [1,2] Assuming perfect use of all available cores and no parallelization overhead, Amdahl's Law says that the best possible speedup of that program workload on a machine with N cores is given by

Note that, as N increases to infinity, the best speedup we can ever get is (s+p)/s. Figure 1 illustrates why a program that is half scalably parallelizable and half not won't scale beyond a factor of two even given infinite cores. Some people find this depressing. They conclude that Amdahl's Law means there's no point in trying to write multicore- and manycore-exploiting applications except for a few "embarrassingly parallel" patterns and problem domains with essentially no sequential work at all.

[Click image to view at full size]

Fortunately, they're wrong. If Amdahl's Game is rigged, well then, to paraphrase a line from the movie WarGames: The only way to win is not to play.