ASAP Seminar Series

Advances in Sequence Modeling from Algorithmic Perspectives

About The Seminar

ASAP is a fully virtual seminar that, while emphasizing efficiency as its name suggests, takes a distinct approach from traditional ML systems seminars. As foundation models grow increasingly powerful through scaling of data and model size, their capabilities remain inherently limited by their architectures. This seminar explores novel architectural designs and paradigms for both training and inference, examining sequence modeling from an algorithmic perspective. It serves as a bridge between the theoretical, algorithmic, and systems research communities, tackling fundamental challenges in Transformer models, including reasoning capabilities, generalization power, memorization, and expressiveness. By adopting first-principles approaches, the seminar seeks solutions that address these limitations while maintaining computational efficiency on modern hardware, with the ultimate goal of motivating the next generation of foundation models.

Resources

Discord
Don't wait, join our discussion on Discord!
Zoom
Join Zoom for our seminar!
Email
Subscribe to our email list to get notified of each week's speaker and livestream link!
YouTube
Watch recordings of past talks on YouTube!

Want to present?

1. Get in touch

Contact us via email or any other method with the topic you'd like to present and your preferred time slot.

2. We add you

We will add you to an available time slot.

3. Present

Present your paper or a topic of interest.

Event Schedule

Date | Time | Topic
3/27/2025 | 10:00 PM EDT | MoBA: Mixture of Block Attention for Long-Context LLMs
3/20/2025 | 2:00 PM EDT | Forgetting Transformer: Softmax Attention with a Forget Gate
3/18/2025 | 1:00 PM EDT | What's so interesting about models with recurrent depth?
3/12/2025 | 1:30 PM EST | B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
3/5/2025 | 1:30 PM EST | State Tracking in Scalable Linear RNNs
3/3/2025 | 1:30 PM EST | Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study
2/24/2025 | 4:00 PM EST | GSM-Infinite: How Do Your LLMs Behave over Infinitely Increasing Context Length and Reasoning Complexity?
2/19/2025 | 1:30 PM EST | Test-time regression: a unifying framework for designing sequence models with associative memory

Organizers

This seminar is organized by:

Songlin Yang, Massachusetts Institute of Technology
Malachy Yang, Carnegie Mellon University
Han Guo, Massachusetts Institute of Technology
Simran Arora, Stanford University