Gongwei Chen


I am currently an associate professor at the School of Information Science and Technology, Harbin Institute of Technology (Shenzhen). Before that, I worked as a postdoc in Prof. Liqiang Nie’s group. I received my PhD degree from the Institute of Computing Technology, Chinese Academy of Sciences in 2023, supervised by Prof. Shuqiang Jiang, and my bachelor's degree from the University of Science and Technology Beijing in 2016. I have also collaborated closely with Prof. Rui Shao, Prof. Miao Zhang, and Prof. Xinhang Song.

My research interests span the broad areas of multimodal learning, AI agents, efficient learning, and scene understanding. Recently, I have been focusing on

  • Multimodal Large Language Models (MLLM)
  • MLLM-based Agent
  • MLLM-based Multimodal Retrieval
  • Dataset Distillation

news

Oct 14, 2025 One paper about GUI Agent Evaluation is accepted by NeurIPS 2025! :sparkles:
Jul 06, 2025 One paper about MLLM-based Unified Multimodal Retrieval is accepted by ACM MM 2025! :sparkles:
Jun 26, 2025 Two papers about MLLM, GUI Agent (Highlight) are accepted by ICCV 2025! :sparkles:
May 16, 2025 One paper about GUI Agent is accepted by ACL main 2025! :sparkles:
Feb 11, 2025 One paper about GUI Agent Benchmark is accepted by ICLR 2025 as Spotlight! :sparkles:
Feb 05, 2025 Two papers about AI Agent, Dataset Distillation are accepted by CVPR 2025! :sparkles:

selected publications

  1. Enhancing GUI Agent with Uncertainty-Aware Self-Trained Evaluator
    Gongwei Chen, Lirong Jie, Lexiao Zou, Weili Guan, Miao Zhang, and Liqiang Nie
    In Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025
  2. Less is More: Empowering GUI Agent with Context-Aware Simplification
    Gongwei Chen, Xurui Zhou, Rui Shao, Yibo Lyu, Kaiwen Zhou, Shuai Wang, Wentao Li, Yinchuan Li, and 2 more authors
    In International Conference on Computer Vision (ICCV), Highlight, 2025
  3. MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models
    Leyang Shen*, Gongwei Chen*, Rui Shao, Weili Guan, and Liqiang Nie
    In Thirty-eighth Annual Conference on Neural Information Processing Systems (NeurIPS), 2024
  4. LION: Empowering multimodal large language model with dual-level visual knowledge
    Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, and Liqiang Nie
    In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024