For session 6 of the EleutherAI ML Scalability & Performance reading group, I gave a presentation covering Zero Bubble Pipeline Parallelism. I also covered two key pieces of prior work to explain the limitations of earlier approaches and put the innovations of Zero Bubble PP in context.

My annotated versions of these papers can be found on my GitHub here.

Papers:

  1. GPipe: Easy Scaling with Micro-Batch Pipeline Parallelism
  2. PipeDream: Fast and Efficient Pipeline Parallel DNN Training
  3. Zero Bubble Pipeline Parallelism

Recording:

ML Scalability & Performance Reading Group Session 6: Zero Bubble Pipeline Parallelism