Is Parallel Programming Hard, And, If So, What Can You Do About It?

Edited by:

Paul E. McKenney
Linux Technology Center
IBM Beaverton

January 5, 2011


The purpose of this book is to help you understand how to program shared-memory parallel machines without risking your sanity. By describing the algorithms and designs that have worked well in the past, we hope to help you avoid at least some of the pitfalls that have beset parallel projects. But you should think of this book as a foundation on which to build, rather than as a completed cathedral. Your mission, if you choose to accept it, is to help make further progress in the exciting field of parallel programming, progress that should in time render this book obsolete. Parallel programming is not as hard as it is reputed to be, and it is hoped that this book makes it even easier for you.

This book follows a watershed shift in the parallel programming field, from being primarily the domain of science, research, and grand-challenge projects to being primarily an engineering discipline. In presenting this engineering discipline, this book will examine the specific development tasks peculiar to parallel programming, and describe how they may be most effectively handled, and, in some surprisingly common special cases, automated.

This book is written in the hope that presenting the engineering discipline underlying successful parallel-programming projects will free a new generation of parallel hackers from the need to slowly and painstakingly reinvent old wheels, allowing them instead to focus their energy and creativity on new frontiers. Although the book is intended primarily for self-study, it is likely to be more generally useful. It is hoped that this book will be useful to you, and that the experience of parallel programming will bring you as much fun, excitement, and challenge as it has provided the authors over the years.


  1. Introduction
  2. Hardware and its Habits
  3. Tools of the Trade
  4. Counting
  5. Partitioning and Synchronization Design
  6. Locking
  7. Data Ownership
  8. Deferred Processing
  9. Applying RCU
  10. Validation: Debugging and Analysis
  11. Data Structures
  12. Advanced Synchronization
  13. Ease of Use
  14. Time Management
  15. Conflicting Visions of the Future
  • A Important Questions
  • B Synchronization Primitives
  • C Why Memory Barriers?
  • D Read-Copy Update Implementations
  • E Formal Verification
  • F Answers to Quick Quizzes
  • G Glossary
  • H Credits


Download PDF.


This is a good book that explains a number of things integral to parallel programming in the classical task model. But why not consider (relatively) new programming paradigms that are now being introduced to programmers in mainstream languages (e.g., Go), such as CSP (handshake message passing), stream computing (where all the parallelization is visible), synchronous and globally asynchronous locally synchronous programming languages, pi-calculus, etc.? I truly believe that sticking to the classical task model of locks and semaphores is now outdated. It is still the most popular, but by introducing these new paradigms to programmers and hiding the ugly locking mechanisms underneath these design principles, parallel programming can be made much easier than it currently seems.

In fact, I will go ahead and state that the classical task model described in this book is not well suited to safety-critical systems (like fly-by-wire, where I have used type-safe stream and synchronous languages) or to high-performance computing (where again I have used and am using streaming languages on GPU/CPU combinations), because streaming languages are just much easier to parallelize and compile for both throughput and safety requirements. But then, I guess not many programmers are targeting these two very niche areas.


Yes, why not consider modern languages that make parallel programming easier, like Scala?

CPUs are not ideal for parallel programs. GPUs, on the other hand, have been designed for exactly that purpose. I will go with OpenCL...