-
Beyond Test-Time Training: Learning to Reason via Hardware-Efficient Optimal Control
Peihao Wang,
Shan Yang,
Xijun Wang,
Tesi Xiao,
Xin Liu,
Changlong Yu,
Yu Lou,
Pan Li,
Atlas Wang,
Ming Lin,
Rene Vidal
preprint, 2026
We propose Test-Time Control (TTC) for LLM reasoning, which formulates iterative reasoning as an optimal control problem solvable with hardware-efficient algorithms.
[Paper]
[Code]

-
VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
Zhiwen Fan,
Jian Zhang,
Renjie Li,
Junge Zhang,
Runjin Chen,
Hezhen Hu,
Kevin Wang,
Peihao Wang,
Huaizhi Qu,
Shijie Zhou,
Dilin Wang,
Zhicheng Yan,
Hongyu Xu,
Justin Theiss,
Tianlong Chen,
Jiachen Li,
Zhengzhong Tu,
Zhangyang Wang,
Rakesh Ranjan
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2026
We propose a unified Vision-Language Model (VLM) framework integrating 3D reconstructive instruction tuning for deep spatial understanding from monocular video.
[Project Page]
[Paper]
[Code]

-
∇-Reasoner: LLM Reasoning via Test-Time Gradient Descent in Latent Space
International Conference on Learning Representations (ICLR), 2026
We propose ∇-reasoner, an iterative decoding approach that refines the policy via test-time gradient descent on textual representations to improve LLM reasoning.
[Paper]
[Code]

-
Understanding the Mixture-of-Experts with Nadaraya-Watson Kernel
Chuanyang Zheng,
Jiankai Sun,
Yihang Gao,
Enze Xie,
Yuehao Wang,
Peihao Wang,
Ting Xu,
Matthew Chan,
Liliang Ren,
Jingyao Li,
Jing Xiong,
Kashif Rasul,
Mac Schwager,
Anderson Schneider,
Atlas Wang,
Yuriy Nevmyvaka
International Conference on Learning Representations (ICLR), 2026
We rethink Mixture-of-Experts (MoE) through the lens of the Nadaraya-Watson kernel and propose the KERN router, which replaces the softmax router to achieve better performance.
[Paper]

-
Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models
Advances in Neural Information Processing Systems (NeurIPS), 2025
We introduce Graph-KV, a zero-shot method that injects graph structures into large language models to improve long-context, multi-hop, and retrieval-augmented reasoning.
[Paper]
[Code]

-
SAS: Simulated Attention Score
Chuanyang Zheng,
Jiankai Sun,
Yihang Gao,
Yuehao Wang,
Peihao Wang,
Jing Xiong,
Liliang Ren,
Hao Cheng,
Janardhan Kulkarni,
Yelong Shen,
Atlas Wang,
Mac Schwager,
Anderson Schneider,
Xiaodong Liu,
Jianfeng Gao
Advances in Neural Information Processing Systems (NeurIPS), 2025
We introduce Simulated Attention Score (SAS), which boosts LLM performance by "simulating" a larger number of attention heads and hidden features with a compact model.
[Paper]

-
Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights
Zhiyuan Liang,
Dongwen Tang,
Yuhao Zhou,
Xuanlei Zhao,
Mingjia Shi,
Wangbo Zhao,
Zekai Li,
Peihao Wang,
Konstantin Schürholt,
Damian Borth,
Michael Bronstein,
Yang You,
Atlas Wang,
Kai Wang
Advances in Neural Information Processing Systems (NeurIPS), 2025
We develop Drag-and-Drop LLMs, a zero-shot prompt-to-weights approach that enables on-the-fly model adaptation without fine-tuning.
[Project Page]
[Paper]
[Code]

-
CryoFastAR: Fast Cryo-EM Ab Initio Reconstruction Made Easy
IEEE International Conference on Computer Vision (ICCV), 2025
We train an end-to-end reconstruction network that recovers 3D protein structures from low-SNR cryogenic electron microscopy (cryo-EM) images in seconds.
[Paper]

-
Rethinking Addressing in Language Models via Contextualized Equivariant Positional Encoding
International Conference on Machine Learning (ICML), 2025
We enhance the reasoning ability of large language models by contextualizing positional encodings with equivariance constraints.
[Paper]
[Code]

-
Why Neural Network Can Discover Symbolic Structures with Gradient-based Training: An Algebraic and Geometric Foundation for Neurosymbolic Reasoning
International Conference on Neuro-symbolic Systems (NeuS), 2025, Disruptive Idea Awards
We theoretically demonstrate that symbolic structures are inherent in the neural weight space and that gradient descent under geometric constraints can find these solutions.
[Paper]

-
Steepest Descent Density Control for Compact 3D Gaussian Splatting
Peihao Wang*,
Yuehao Wang*,
Dilin Wang,
Sreyas Mohan,
Zhiwen Fan,
Lemeng Wu,
Ruisi Cai,
Yu-Ying Yeh,
Atlas Wang,
Qiang Liu,
Rakesh Ranjan
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025
We demystify the role of densification for Gaussian splatting in escaping saddle points during optimization and propose an optimal density control strategy for the steepest loss descent.
[Project Page]
[Paper]
[Code]

-
FlexGS: Train Once, Deploy Everywhere with Many-in-One Flexible 3D Gaussian Splatting
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2025
We present an elastic inference method for 3DGS that requires one-time training while enabling an adaptive subset selection of point clouds for dynamic inference-time budgets.
[Project Page]
[Paper]
[Code]

-
Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing
International Conference on Learning Representations (ICLR), 2025
We theoretically and empirically reveal and tackle locality and over-smoothing bottlenecks for state space models.
[Paper]
[Code]

-
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
Peihao Wang,
Zhiwen Fan,
Dejia Xu,
Dilin Wang,
Sreyas Mohan,
Forrest Iandola,
Rakesh Ranjan,
Yilei Li,
Qiang Liu,
Atlas Wang,
Vikas Chandra
International Conference on Artificial Intelligence and Statistics (AISTATS), 2025
We adopt a control variate constructed via Stein's identity to reduce the variance of Monte Carlo estimation in text-to-3D score distillation.
[MarkTechPost]
[Project Page]
[Paper]
[Code]

-
READ-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design
Advances in Neural Information Processing Systems (NeurIPS), 2024
We refactorize LLMs into MoE models with a system-friendly, pre-computed routing policy.
[Paper]
[Code]

-
Large Spatial Model: Real-time Unposed Images to Semantic 3D
Zhiwen Fan*,
Jian Zhang*,
Wenyan Cong,
Peihao Wang,
Renjie Li,
Kairun Wen,
Shijie Zhou,
Achuta Kadambi,
Atlas Wang,
Danfei Xu,
Boris Ivanovic,
Marco Pavone,
Yue Wang
Advances in Neural Information Processing Systems (NeurIPS), 2024
We propose Large Spatial Model, an all-in-one pipeline accomplishing uncalibrated 3D reconstruction, understanding, and interaction in real time.
[Project Page]
[Paper]
[Code]

-
Perspective-Aligned AR Mirror with Under-Display Camera
SIGGRAPH Asia (Journal Track), ACM Transactions on Graphics, 2024, Best Paper Award
We propose an efficient learning-based algorithm that restores images formed by under-display cameras.
[Award Press]
[Paper]

-
VersatileGaussian: Real-time Neural Rendering for Versatile Tasks using Gaussian Splatting
European Conference on Computer Vision (ECCV), 2024
We enable multi-task representation learning with 3D Gaussian Splatting, which achieves real-time multi-task decoding for novel views.
[Project Page]
[Paper]

-
Taming Mode Collapse in Score Distillation for Text-to-3D Generation
Peihao Wang,
Dejia Xu,
Zhiwen Fan,
Dilin Wang,
Sreyas Mohan,
Forrest Iandola,
Rakesh Ranjan,
Yilei Li,
Qiang Liu,
Atlas Wang,
Vikas Chandra
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
We derive an entropy-maximizing score distillation rule that fosters view diversity and addresses the multi-face problem for text-to-3D generation.
[Project Page]
[Paper]
[Code]

-
Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
We propose Lift3D, which lifts an arbitrary 2D backbone to generate view-consistent 3D predictions without any retraining.
[Project Page]
[Paper]
[Code]

-
Polynomial Width is Sufficient for Set Representation with High-dimensional Features
International Conference on Learning Representations (ICLR), 2024
We theoretically show that polynomially many neurons are sufficient for set representation with the DeepSets architecture.
[Paper]

-
Patch Diffusion: Faster and More Data-Efficient Training of Diffusion Models
Advances in Neural Information Processing Systems (NeurIPS), 2023
We show that diffusion models can be well trained with merely image patches, which reduces training costs while improving data efficiency.
[Paper]
[Code]

-
Vision HGNN: An Image is More than a Graph of Nodes
Yan Han,
Peihao Wang,
Souvik Kundu,
Ying Ding,
Atlas Wang
IEEE International Conference on Computer Vision (ICCV), 2023, Oral
In this paper, we model an image with a hypergraph to capture high-order interactions, and further employ Hypergraph Neural Networks (HGNN) for learning representations.
[Paper]
[Code]

-
Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts
Wenyan Cong*,
Hanxue Liang*,
Peihao Wang,
Zhiwen Fan,
Tianlong Chen,
Mukund Varma T,
Yi Wang,
Atlas Wang
IEEE International Conference on Computer Vision (ICCV), 2023
We scale up and improve the generalization ability of the Generalizable NeRF Transformer (GNT) by plugging in Mixture-of-Experts (MoE) layers.
[Paper]
[Code]

-
Data Efficient Neural Scaling Law via Model Reusing
Peihao Wang,
Rameswar Panda,
Atlas Wang
International Conference on Machine Learning (ICML), 2023
We empirically reveal that existing power laws underestimate the data inefficiency of large transformers, and leverage model reusing to reproduce the power law in the data-scarcity regime.
[Paper]
[Code]

-
NeuralLift-360: Lifting An In-the-wild 2D Photo to A 3D Object with 360 Views
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023, Highlight
We study how to lift a single image to a 3D object and generate its 360° views that correspond well with the given reference image.
[Project Page]
[Paper]
[Code]

-
Equivariant Hypergraph Diffusion Neural Operators
International Conference on Learning Representations (ICLR), 2023
This work proposes a new hypergraph neural network architecture, which provably represents any continuous equivariant hypergraph diffusion operator.
[Paper]
[Code]

-
Is Attention All That NeRF Needs?
International Conference on Learning Representations (ICLR), 2023
We present Generalizable NeRF Transformer (GNT), a pure, unified transformer-based architecture that efficiently reconstructs Neural Radiance Fields (NeRFs) on the fly.
[Project Page]
[Paper]
[Code]

-
NeRF-SOS: Any-View Self-supervised Object Segmentation on Complex Scenes
International Conference on Learning Representations (ICLR), 2023
We propose a novel collaborative contrastive loss for NeRF to segment objects in complex real-world scenes, without any annotation.
[Project Page]
[Paper]
[Code]

-
Learning to Grow Pretrained Models for Efficient Transformer Training
International Conference on Learning Representations (ICLR), 2023, Spotlight
This paper proposes to accelerate transformer training by reusing pretrained models via a learnable, linear, and sparse model-growth operator.
[MIT News]
[Paper]
[Code]

-
Signal Processing for Implicit Neural Representations
Advances in Neural Information Processing Systems (NeurIPS), 2022
We propose a theoretically grounded signal processing framework for Implicit Neural Representations (INR), which analytically manipulates INRs on the weight space through differential operators.
[Project Page]
[Paper]
[Code]

-
Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again
Advances in Neural Information Processing Systems (NeurIPS), 2022
We derive a topology-aware isometric initialization and a Dirichlet energy-guided architectural rewiring technique that boost vanilla GCNs to state-of-the-art performance.
[Paper]
[Code]

-
Unified Implicit Neural Stylization
European Conference on Computer Vision (ECCV), 2022
This work explores stylizing an implicit neural representation, using a generalized approach that can apply to various 2D and 3D representations.
[Project Page]
[Paper]
[Code]

-
SinNeRF: Training Neural Radiance Fields on Complex Scenes from a Single Image
European Conference on Computer Vision (ECCV), 2022
We present Single View NeRF (SinNeRF), which uses thoughtfully designed semantic and geometric regularizations to train a neural radiance field from only a single view.
[Project Page]
[Paper]
[Code]

-
Neural Implicit Dictionary Learning via Mixture-of-Expert Training
International Conference on Machine Learning (ICML), 2022
We present Neural Implicit Dictionary (NID), which learns to represent implicit neural representations as a sparse mixture of expert networks.
[Paper]
[Code]

-
Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
We present the first fair and reproducible benchmark dedicated to assessing the "tricks" of training deep GNNs.
[Paper]
[Code]

-
Aug-NeRF: Training Stronger Neural Radiance Fields with Triple-Level Augmentations
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022
We propose Aug-NeRF, which augments NeRF with worst-case perturbations at three distinct levels with physical grounding.
[Paper]
[Code]

-
CADTransformer: Panoptic Symbol Spotting Transformer for CAD Drawings
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022, Oral
We present CADTransformer, a transformer-based framework that tackles the panoptic symbol spotting task for computer-aided design (CAD) drawings.
[Paper]
[Code]

-
Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice
International Conference on Learning Representations (ICLR), 2022
We prove that self-attention is no more than a low-pass filter, and propose two simple yet effective methods to counteract excessive smoothing.
[Paper]
[Code]

-
Delayed Propagation Transformer: A Universal Computation Engine towards Practical Control in Cyber-Physical Systems
Advances in Neural Information Processing Systems (NeurIPS), 2021
We present the Delayed Propagation Transformer (DePT), which specializes in global modeling of cyber-physical systems (CPS) while taking into account the immutable constraints of the physical world.
[Paper]
[Code]

-
SoGCN: Second-Order Graph Convolutional Networks
arXiv preprint, 2021
We prove that second-order graph convolution is the maximally localized kernel with full representation power.
[Paper]
[Code]

-
TightCap: 3D Human Shape Capture with Clothing Tightness Field
ACM Transactions on Graphics (TOG), 2021
We propose a data-driven approach to capture both the human shape and dressed garments with only a single 3D human scan, by predicting clothing tightness.
[Project Page]
[Paper]
[Code]

Copyright © 2026 Peihao Wang. All Rights Reserved.
Powered by UIkit