Scott Meyers

Fastware is software that's fast — that gets the job done quickly. Low latency is the name of the game, and achieving it calls for insights from software engineering, computer science, and the effective use of C++. This presentation addresses crucial issues in each of these areas, covering topics as diverse as CPU caches, speed-sensitive use of the STL, data structures supporting concurrency, profile-guided optimization, and more.

Much of the material in "Fastware for C++" is unique to this seminar, i.e., unavailable in Scott's publications or his other training courses. However, as the successor to Scott's acclaimed "High-Performance C++ Programming" seminar, "Fastware for C++" also includes updated discussions of topics from that course as well as from Scott's books, Effective C++, More Effective C++, and Effective STL.

Course Highlights

Participants will gain:

Recognition of the importance and implications of treating performance as a correctness criterion.

Understanding of how effective use of third-party APIs can improve system performance.

Knowledge of specific C++ practices that improve the speed of both the language and the STL.

Familiarity with concurrent data structures and algorithms poised to become de facto standards.

Who Should Attend

Systems designers, programmers, and technical managers involved in the design, implementation, and maintenance of performance-sensitive libraries and applications using C++. Participants should already know the basic features of C++ (e.g., classes, inheritance, virtual functions, templates), but expertise is not required. Knowledge of common threading constructs (e.g., threads, mutexes, condition variables) is helpful. People who have learned C++ recently, as well as people who have been programming in C++ for many years, will come away from this seminar with useful, practical, proven information.

Format

Lecture and question/answer. There are no hands-on exercises, but participants are welcome — encouraged! — to bring computers to experiment with the material as it is presented.

Length

Two full days (six to seven lecture hours per day).

Detailed Topic Outline

Treating speed as a correctness criterion.

Why "first make it right, then make it fast" is misguided.

Latency, initial and total.

Other performance measures.

Designing for speed.

Optimizing systems versus optimizing programs.

Most system components are "foreign."

Exercising indirect control over "foreign" components.

Examples.

CPU caches and why they're important.

Data caches, instruction caches, TLBs.

Cache hierarchies, cache lines, prefetching, and traversal orders.

Cache coherency and false sharing.

Cache associativity.

Guidelines for effective cache usage.
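
To make the traversal-order item above concrete, here is a minimal sketch. It is an illustration only and is not taken from the seminar materials. It assumes a matrix stored row-major in one contiguous vector; the names Matrix, sum_row_major, and sum_column_major are hypothetical. Both functions compute the same sum, but the first walks memory sequentially while the second jumps `cols` elements between accesses, which on large matrices typically touches far more cache lines.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper type: a rows x cols matrix stored row-major,
// so element (r, c) lives at data[r * cols + c].
struct Matrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> data;

    Matrix(std::size_t r, std::size_t c, int fill)
        : rows(r), cols(c), data(r * c, fill) {}
};

// Visits elements in the order they sit in memory, so consecutive
// accesses tend to fall within the same cache line.
long long sum_row_major(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    long long total = 0;
    for (std::size_t r = 0; r < m.rows; ++r) {
        for (std::size_t c = 0; c < m.cols; ++c) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}

// Same result, but each access is `cols` elements away from the
// previous one, so large matrices make poor use of each cache line.
long long sum_column_major(const Matrix& m) {
    assert(m.data.size() == m.rows * m.cols);
    long long total = 0;
    for (std::size_t c = 0; c < m.cols; ++c) {
        for (std::size_t r = 0; r < m.rows; ++r) {
            total += m.data[r * m.cols + c];
        }
    }
    return total;
}
```

Any speed difference depends on matrix size, hardware, and compiler settings, so it should be measured rather than assumed.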

Optimizing C++ usage:

Move semantics.

Avoiding unnecessary object creations.

When custom heap management can make sense.

Optimizing STL usage:

reserve and shrink_to_fit.

Range member functions.

Using function objects instead of function pointers.

Using sorted vectors instead of associative containers.

A comparison of STL sorting-related algorithms.
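
Several of the STL items above (reserve and shrink_to_fit, range member functions, sorted vectors in place of associative containers) can be seen together in one small sketch. It is an illustration under stated assumptions and does not come from the seminar itself: the function names make_sorted_lookup and contains are hypothetical, and the element type is fixed to int for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Merges two inputs into one sorted, duplicate-free vector that can
// stand in for a std::set<int> when lookups greatly outnumber inserts.
std::vector<int> make_sorted_lookup(const std::vector<int>& a,
                                    const std::vector<int>& b) {
    std::vector<int> v;
    v.reserve(a.size() + b.size());         // one allocation up front
    v.insert(v.end(), a.begin(), a.end());  // range insert, not a push_back loop
    v.insert(v.end(), b.begin(), b.end());
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
    v.shrink_to_fit();  // non-binding request to release unused capacity
    return v;
}

// Logarithmic-time lookup over contiguous memory.
// Precondition: `sorted` is in ascending order.
bool contains(const std::vector<int>& sorted, int key) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    return std::binary_search(sorted.begin(), sorted.end(), key);
}
```

The trade-off is that inserting into the middle of a sorted vector is linear, so this layout suits data that is built once and then queried many times.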

An overview of concurrent data structures.

Meaning of "concurrent data structure."

Use cases.

Common offerings in TBB and PPL.

Writing your own.

An overview of concurrent STL-like algorithms.

Thread management and exception-related issues.

Common offerings in TBB and PPL.

OpenMP.

Other TBB and PPL offerings.

Exploiting "free" concurrency.

Meaning of "free."

Multiple-approach problem solving.

Speculative execution.

Making use of PGO (profile-guided optimization) and WPO (whole-program optimization).

Resources for further information.

For more information on this course, contact Scott directly.