Hiroki Naganuma

Day 1

Opening Remarks / David Kanter / MLCommons

Competition Blog Post

Intro Talk: The AlgoPerf Benchmark / Frank Schneider / University of Tübingen

Q&A

Next TODO:

Spotlight Talk I: External Tuning Track Winner / Scaling Beyond Diagonal Preconditioners for Training Neural Networks At-Scale / Anna Cai & Michael Shi / Meta AI

  1. correcting scaling across blocks
  2. correcting preconditioner staleness

Eigenvalue-corrected Shampoo
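
A minimal sketch of the eigenvalue-correction idea for a single 2-D weight matrix, assuming full-matrix Kronecker factors fit in memory; the function and state names (`ec_shampoo_step`, `d_sq`) are illustrative, not from the talk:

```python
# A sketch of one eigenvalue-corrected Shampoo step for a weight matrix W
# with gradient G. Real implementations block large matrices and refresh
# the eigendecompositions only periodically.
import numpy as np

def ec_shampoo_step(W, G, state=None, lr=1e-3, beta2=0.99, eps=1e-8):
    m, n = G.shape
    if state is None:
        state = {"L": np.zeros((m, m)), "R": np.zeros((n, n)),
                 "d_sq": np.zeros((m, n))}
    # Kronecker-factored second-moment statistics.
    state["L"] = beta2 * state["L"] + (1 - beta2) * (G @ G.T)
    state["R"] = beta2 * state["R"] + (1 - beta2) * (G.T @ G)
    # Eigenbases of the factors; recomputing these only every k steps
    # is the "preconditioner staleness" issue noted above.
    _, Q_L = np.linalg.eigh(state["L"])
    _, Q_R = np.linalg.eigh(state["R"])
    # Rotate the gradient into the Shampoo eigenbasis ...
    G_rot = Q_L.T @ G @ Q_R
    # ... and keep an Adam-style diagonal second moment there:
    # this diagonal is the eigenvalue correction.
    state["d_sq"] = beta2 * state["d_sq"] + (1 - beta2) * G_rot**2
    step = Q_L @ (G_rot / (np.sqrt(state["d_sq"]) + eps)) @ Q_R.T
    return W - lr * step, state
```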

Spotlight Talk II: Self-Tuning Track Winner / Schedules & Schedule-Free Learning / Aaron Defazio / Meta AI

Two theory-practice mismatches:

[1] Folklore: the 1/sqrt(t) schedule is bad; a flat schedule is worse

  1. why doesn't Polyak averaging work well in practice? (see the sketch below)
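
A minimal sketch of the Schedule-Free SGD recursion (after Defazio et al., "The Road Less Scheduled"), with a stand-in stochastic gradient oracle `grad_fn`; gradients are evaluated at an interpolation `y` between the fast iterate and the Polyak-style average:

```python
# A sketch of Schedule-Free SGD: a flat learning rate, with gradients
# taken at an interpolation y between the SGD iterate z and the running
# average x. `grad_fn` is a stand-in gradient oracle.
import numpy as np

def schedule_free_sgd(grad_fn, x0, lr=0.1, beta=0.9, steps=1000):
    z = np.array(x0, dtype=float)  # fast SGD iterate
    x = z.copy()                   # averaged iterate, used for evaluation
    for t in range(1, steps + 1):
        y = (1 - beta) * z + beta * x   # gradient evaluation point
        z = z - lr * grad_fn(y)         # flat-step SGD update
        c = 1.0 / t                     # equal-weight (Polyak) averaging
        x = (1 - c) * x + c * z
    return x
```

The averaging weights, not a decaying step size, do the annealing; taking gradients at `y` rather than at `x` or `z` keeps the average on track during training, which is one reading of why naive Polyak averaging underperforms.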

Lightning Talks / AlgoPerf Submissions & their Follow-Ups

Niccolò Ajroldi: Weight Averaging Techniques on AlgoPerf

  1. speed up training
  2. improve generalization
  3. replace the LR schedule (see the averaging sketch below)
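
A minimal sketch of two generic weight-averaging schemes of the kind the talk evaluates on AlgoPerf, assuming parameters are flat numpy arrays; `ema_update` and `LastKAverage` are illustrative names, not the talk's code:

```python
# A sketch of two weight-averaging schemes: an exponential moving
# average (EMA) of weights, and a uniform average over the last k
# checkpoints (LAWA-style).
from collections import deque
import numpy as np

def ema_update(avg, w, decay=0.999):
    """Exponential moving average of weights."""
    return decay * avg + (1 - decay) * w

class LastKAverage:
    """Uniform average over the last k checkpoints."""
    def __init__(self, k=5):
        self.buf = deque(maxlen=k)

    def update(self, w):
        self.buf.append(np.array(w, dtype=float))

    def average(self):
        return np.mean(np.stack(list(self.buf)), axis=0)

# During training: ema_w = ema_update(ema_w, w); lawa.update(w)
# At evaluation, use ema_w or lawa.average() instead of the raw weights.
```
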
David Tweedle: Applying Randomized Singular Value Decomposition During Training

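A minimal sketch of the standard randomized SVD (Halko, Martinsson & Tropp, 2011), the primitive this talk applies during training, shown here on a plain numpy matrix; the oversampling and power-iteration defaults are hypothetical:

```python
# A sketch of randomized SVD: probe the range of A with a random matrix,
# orthonormalize, then take an exact SVD of the small projected matrix.
import numpy as np

def randomized_svd(A, rank, oversample=10, n_iter=2, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    Omega = rng.standard_normal((n, rank + oversample))  # random probe
    Y = A @ Omega                                        # sample the range
    for _ in range(n_iter):        # power iterations sharpen the spectrum
        Y = A @ (A.T @ Y)          # (re-orthonormalize here for stability
                                   #  on badly conditioned A)
    Q, _ = np.linalg.qr(Y)         # orthonormal basis for the range
    B = Q.T @ A                    # small (rank + oversample) x n matrix
    U_b, s, Vt = np.linalg.svd(B, full_matrices=False)
    return Q @ U_b[:, :rank], s[:rank], Vt[:rank, :]
```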

Sourabh Medapati: Lessons from competing in AlgoPerf v0.5

Roundtable Discussion

A moderated open audience discussion on “The Future of Training Algorithms”

Day 2

Invited Talk I / How does Gradient Descent Work? / Jeremy Cohen - Flatiron Institute

Invited Talk II / What is the best O(n) Hessian query? / Madeleine Udell - Stanford University

Challenges in Training PINNs: A Loss Landscape Perspective

low-rank approximation of curvature
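
A minimal sketch of a randomized Nyström low-rank curvature approximation built from O(n) Hessian-vector products, assuming a PSD curvature matrix (e.g., a Gauss-Newton approximation); `hvp` is a stand-in oracle, and the shift follows the numerically stable Nyström recipe:

```python
# A sketch of a Nystrom approximation H ~= U diag(lam) U^T using only
# `rank` Hessian-vector products. Assumes the curvature is PSD.
import numpy as np

def nystrom_curvature(hvp, n, rank, seed=0):
    rng = np.random.default_rng(seed)
    Omega, _ = np.linalg.qr(rng.standard_normal((n, rank)))  # probe basis
    Y = np.column_stack([hvp(Omega[:, i]) for i in range(rank)])  # H @ Omega
    nu = 1e-8 * np.linalg.norm(Y)          # small shift for stability
    Y_nu = Y + nu * Omega
    M = Omega.T @ Y_nu
    M = (M + M.T) / 2                      # symmetrize before Cholesky
    C = np.linalg.cholesky(M)
    B = np.linalg.solve(C, Y_nu.T).T       # B = Y_nu @ C^{-T}
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    lam = np.maximum(s**2 - nu, 0.0)       # undo the shift
    return U, lam
```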

Invited Talk III / Stochastic-Gradient-based Algorithms for Nonconvex Constrained Optimization and Learning / Frank E. Curtis - Lehigh University

Roundtable Discussion / A moderated open audience discussion on “Training Algorithms in Production” / Michael Rabbat - Meta AI

Panel Discussion / The Future of AlgoPerf / George Dahl - Google DeepMind

Topics:

Other References

Acknowledgements