-
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
International Conference on Learning Representations (ICLR), 2025
We theoretically and empirically reveal and tackle the locality and over-smoothing bottlenecks of state space models.
[Paper]
[Code]
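
A minimal numerical sketch of the recency effect (my own illustration, not the paper's code): in a scalar linear SSM with decay |a| < 1, the influence of a token k steps in the past shrinks geometrically.

```python
import numpy as np

# Minimal sketch of the recency effect (illustration, not the paper's code):
# a scalar linear SSM  h_t = a * h_{t-1} + b * x_t  with |a| < 1 unrolls to
# h_t = sum_k (a ** k) * b * x_{t-k}, so the influence of a token k steps
# back decays geometrically; this is the locality bottleneck in miniature.
a, b, T = 0.9, 1.0, 50          # hypothetical decay, input gain, context length
influence = np.array([abs(a) ** k * b for k in range(T)])
print(influence[:5])            # [1.0, 0.9, 0.81, 0.729, 0.6561]
print(influence[-1])            # ~0.0057: distant tokens are nearly forgotten
```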

-
Polynomial Width is Sufficient for Set Representation with High-dimensional Features
International Conference on Learning Representations (ICLR), 2024
We theoretically show that polynomially many neurons suffice for set representation with the DeepSets architecture.
[Paper]
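
For context, a minimal DeepSets forward pass (an illustrative sketch, not the paper's code; the paper's result concerns how wide the encoder must be, and the width below is an arbitrary placeholder).

```python
import numpy as np

# Minimal DeepSets forward pass (illustrative sketch, not the paper's code):
# f(S) = rho( sum_{x in S} phi(x) ).  The paper's result concerns how wide
# phi must be; the width below is an arbitrary placeholder.
rng = np.random.default_rng(0)
d, width = 4, 16                      # feature dim, hypothetical phi width
W1 = rng.normal(size=(d, width))      # phi: one-layer ReLU encoder
W2 = rng.normal(size=(width, 1))      # rho: linear readout

def deepsets(X):                      # X: (set_size, d)
    phi = np.maximum(X @ W1, 0.0)     # encode each element independently
    pooled = phi.sum(axis=0)          # permutation-invariant sum pooling
    return pooled @ W2                # decode the pooled representation

X = rng.normal(size=(5, d))
assert np.allclose(deepsets(X), deepsets(X[rng.permutation(5)]))
```
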
-
Learning to Grow Pretrained Models for Efficient Transformer Training
International Conference on Learning Representations (ICLR), 2023, Spotlight
This paper accelerates transformer training by reusing pretrained models via a learnable, linear, and sparse model-growth operator.
[MIT News]
[Paper]
[Code]
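
A toy sketch of the width-growth idea (a simplification of mine, not the paper's code; the growth maps below are random stand-ins, whereas in the paper they are learned and kept sparse).

```python
import numpy as np

# Toy sketch of the width-growth idea (a simplification, not the paper's
# code): initialize a wider layer as a linear map of the pretrained smaller
# weight, W_big = G_out @ W_small @ G_in.T.  In the paper the growth maps
# are learned and kept sparse; here they are random stand-ins for shape only.
rng = np.random.default_rng(0)
d_small, d_big = 8, 12                          # hypothetical widths
W_small = rng.normal(size=(d_small, d_small))   # pretrained weight
G_in = rng.normal(size=(d_big, d_small))        # growth map on inputs
G_out = rng.normal(size=(d_big, d_small))       # growth map on outputs
W_big = G_out @ W_small @ G_in.T                # (d_big, d_big) initialization
print(W_big.shape)                              # (12, 12)
```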

-
Is Attention All That NeRF Needs?
International Conference on Learning Representations (ICLR), 2023
We present Generalizable NeRF Transformer (GNT), a pure, unified transformer-based architecture that efficiently reconstructs Neural Radiance Fields (NeRFs) on the fly.
[Project Page]
[Paper]
[Code]
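
A toy sketch of the ray-transformer idea (my illustration only, not GNT itself): attention pooling over features of sample points along a camera ray, standing in for an explicit volume-rendering formula; all dimensions and weights below are hypothetical.

```python
import numpy as np

# Toy sketch of the ray-transformer idea (illustration only, not GNT):
# attention pooling over features of sample points along one camera ray,
# standing in for an explicit volume-rendering formula.  All dimensions
# and weights below are hypothetical placeholders.
rng = np.random.default_rng(0)
n_samples, d = 16, 8                            # samples per ray, feature dim
feats = rng.normal(size=(n_samples, d))         # per-point features
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

q = feats.mean(axis=0) @ Wq                     # one query summarizing the ray
scores = q @ (feats @ Wk).T / np.sqrt(d)        # scaled dot-product scores
attn = np.exp(scores - scores.max())
attn /= attn.sum()                              # softmax over samples
ray_feature = attn @ (feats @ Wv)               # pooled feature; decode to RGB downstream
print(ray_feature.shape)                        # (8,)
```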

-
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
International Conference on Learning Representations (ICLR), 2022
We prove that self-attention acts as no more than a low-pass filter, and propose two simple yet effective methods to counteract excessive smoothing.
[Paper]
[Code]
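
A quick numerical illustration of the low-pass claim (mine, not the repository's code): a softmax attention matrix is row-stochastic, so stacking attention-only layers collapses token representations toward their common mean.

```python
import numpy as np

# Numerical illustration of the low-pass claim (mine, not the repository's
# code): a softmax attention matrix is row-stochastic with positive entries,
# so stacking attention-only layers shrinks the high-frequency part of the
# tokens (their deviation from the mean) toward zero.
rng = np.random.default_rng(0)
n, d = 8, 4                                     # tokens, feature dim
logits = rng.normal(size=(n, n))
A = np.exp(logits)
A /= A.sum(axis=1, keepdims=True)               # row-softmax attention matrix
X = rng.normal(size=(n, d))
for _ in range(30):
    X = A @ X                                   # attention mixing only
high_freq = X - X.mean(axis=0, keepdims=True)   # non-DC component
print(np.linalg.norm(high_freq))                # ~0: representations collapse
```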

Keep thy heart with all diligence; for out of it are the issues of life.
My journey in computer programming began at age ten, when my father introduced me to programming in Scratch. During high school I spent my leisure time developing the database management software myBase 7. In college, I was fortunate to work with Prof. Jingyi Yu on 3D vision and computational imaging, and to study algebra with Prof. Manolis C. Tsakiris. I also worked with Prof. Jianbo Shi on graph learning and spectral graph theory. It is always hard to be a beginner, so I would like to express my sincere gratitude to those who guided me through the novice village of academia.