Call for Papers

Parallel hardware has reshaped machine learning, motivating techniques that parallelize computations once assumed to be “inherently sequential.” Researchers working on recurrent neural networks, diffusion models, Markov chain Monte Carlo, and more have independently converged on similar algorithmic breakthroughs that unlock order-of-magnitude speedups on modern accelerators, yet best practices remain siloed across these communities. This workshop brings these researchers together for concrete discussions of shared techniques, open problems, and future directions.

Topics of interest

Concrete themes include:

Deep sequence modeling

  • Parallelizing nonlinear RNNs via fixed-point iteration or other techniques
  • Block-parallel and chunked recurrent architectures
  • Hardware-efficient kernels and implementations on modern accelerators

Deep generative modeling

  • Parallel sampling from diffusion models (consistency models, distillation, parallel ODE/SDE solvers)
  • Speculative and parallel decoding for autoregressive models
  • Parallel-in-time methods for flow- and score-based generative models

Classical probabilistic modeling

  • Parallel Kalman filtering and smoothing
  • Parallel particle filters and sequential Monte Carlo
  • Parallel Markov chain Monte Carlo

Theory and limits of parallelizability

  • Computational complexity of Transformers, SSMs, and recurrent architectures
  • Tradeoffs between parallelizability and expressivity
  • Hardness results and lower bounds for sequential tasks
  • Toward a rigorous definition of “inherently sequential”

Cross-cutting

  • Cross-pollination between scientific computing (parareal, MGRIT, multigrid-in-time) and ML
  • Profiling, benchmarking, and best practices for parallel-in-time methods on GPUs/TPUs

Scope

In scope: Work that reduces or circumvents sequential dependencies over sequence length or time (via fixed-point reformulations, parallel scans, block-parallel structures, implicit layers, reordered updates, or other algorithmic and architectural innovations) and yields measurable speedups on modern accelerators. Also in scope: theoretical or empirical results on the limits of parallelizability.

Out of scope: Work that solely increases data or model parallelism without reducing sequential dependencies.
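
As a rough illustration of what "reducing sequential dependency" can mean, the sketch below evaluates a scalar linear recurrence h_t = a_t * h_{t-1} + b_t two ways: with a step-by-step loop (T dependent steps) and with a parallel scan over affine maps (about log2(T) dependent rounds). The recurrence, the function names, and the pure-Python simulation of each round are assumptions chosen for illustration; they are not taken from any particular submission or library, and a real implementation would run each round on an accelerator, typically through a primitive such as jax.lax.associative_scan.

```python
def combine(left, right):
    """Compose two affine maps h -> a*h + b, applying `left` first."""
    a1, b1 = left
    a2, b2 = right
    return (a1 * a2, a2 * b1 + b2)


def sequential_recurrence(a, b, h0=0.0):
    """Baseline: T steps, each waiting on the previous one."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return out


def scan_recurrence(a, b, h0=0.0):
    """Inclusive scan (Hillis-Steele style) over the affine maps.

    Returns the states and the number of dependent rounds. Within a
    round, every entry is computed from the previous round only, so
    the entries could be evaluated in parallel on suitable hardware.
    """
    elems = list(zip(a, b))
    rounds, offset = 0, 1
    while offset < len(elems):
        elems = [
            combine(elems[i - offset], elems[i]) if i >= offset else elems[i]
            for i in range(len(elems))
        ]
        offset *= 2
        rounds += 1
    # elems[t] now maps h0 directly to h_{t+1}.
    return [a_t * h0 + b_t for a_t, b_t in elems], rounds


a = [0.5, 2.0, -1.0, 0.25, 3.0]
b = [1.0, 0.0, 2.0, -1.0, 0.5]
states, rounds = scan_recurrence(a, b)
print(states, rounds)
```

Both functions produce the same states; the difference is the length of the dependency chain (five steps versus three rounds for this input). Running the sequential loop on more devices or larger batches would leave that chain unchanged, which is the distinction the scope statement draws.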

Submission format

  • Length: 4 pages maximum, excluding references and appendices.
  • Style: NeurIPS 2026 LaTeX style (template will be linked alongside the submission portal).
  • Review: Double-blind; please anonymize submissions.
  • Non-archival: The workshop is non-archival. Accepted papers will be posted on OpenReview for community visibility, but authors retain the right to publish their work elsewhere. Previously published work is not eligible.

AI policy

AI-written submissions and AI-generated reviews are not accepted. AI assistance (e.g., for editing or coding support) is permitted, in line with standard guidelines used at MIT and NYU. Authors are responsible for the content of their submissions.

Important dates

  • Submission deadline: August 29, 2026 (AoE)
  • Reviewer bidding: August 30 – September 4, 2026
  • Review deadline: September 22, 2026 (AoE)
  • Author notification: September 29, 2026 (AoE)
  • Workshop: December 11–13, 2026 (co-located with NeurIPS 2026)

Submission portal

Submissions will be handled through OpenReview. The submission link will be posted here closer to the deadline.

Contact

For any questions, please contact the organizers at xavier18@stanford.edu or scott.linderman@stanford.edu.